Which one of the following statements is TRUE?
A. Spark SQL does not support HiveQL
B. Spark SQL does not support ANSI SQL
C. To use Spark with Hive, HiveQL queries have to be rewritten in Scala
D. Spark SQL allows relational queries expressed in SQL, HiveQL, or Scala
In order for an SPSS Modeler stream to be incorporated for use in an InfoSphere Streams application leveraging SPSS Modeler Solution Publisher, you need to:
A. add a Type node
B. insert any Output node
C. add a Table node as the terminal node
D. make the terminal node a scoring branch
Extracting structured data from various databases into a "sandbox" location without writing code can be performed using which tool included with BigInsights?
A. Flume
B. Data Click
C. DataStage
D. Big SQL Load
What is Flume?
A. A distributed filesystem
B. A platform for executing MapReduce jobs
C. A programming language that translates high-level queries into map tasks and reduce tasks
D. A service for moving large amounts of data around a cluster soon after the data is produced
A large bank was planning to offload existing data from a data warehouse into Hadoop and use SQL queries to access historical data. Which one of the following statements is true for using HiveQL?
A. It supports four logical operators in query predicates: IN, NOT IN, EXISTS, and NOT EXISTS
B. It does not support nested sub-queries
C. Hive supports all ANSI SQL 2011 syntax
D. All of the above
What does the acronym "PCI" stand for in the phrase "PCI compliant"?
A. Payment Card Industry
B. Personal Credit and Income
C. Premium Credit Inspection
D. Proactive Controls Implementation
Which of the following statements regarding importing streaming data from InfoSphere Streams into Hadoop is TRUE?
A. InfoSphere Streams can only write to HDFS, not read from HDFS
B. InfoSphere Streams can only write directly to BigInsights, not other Hadoop distributions like Hortonworks or Cloudera
C. A Streams developer needs to account for the fact that BigInsights may not be able to absorb the incoming streams at the rate InfoSphere Streams is sending them
D. Adding a Big Data toolkit operator (for writing to Hadoop) to an InfoSphere Streams Processing Language (SPL) application requires that the SPL application be recompiled
Which source operator detects SPSS Collaboration and Deployment Services notification events for a specific SPSS Modeler file and downloads the indicated file version for the refreshed scoring branch?
A. SPSSPublish operator
B. SPSSScoring operator
C. SPSSModeler operator
D. SPSSRepository operator
A Bloom filter in HBase can be used to determine which of the following?
A. Whether a record exists in a region server
B. Whether a record does not exist in a region server
C. Items in the catalog database
D. None of the above
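For readers unfamiliar with the data structure named in the question above, the following is a minimal, generic sketch of a Bloom filter in Python. It is not HBase's implementation; the class name, the method names, the bit-array size, and the SHA-256 based hashing are all assumptions chosen for illustration. It shows the general property of Bloom filters: a lookup can report "definitely not present" or "possibly present", but never a false negative.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical helper, not an HBase API)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        # Set every bit position derived from the key.
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False: the key was definitely never added.
        # True: the key was possibly added (false positives can occur).
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for row_key in ("row-001", "row-002", "row-003"):
    bf.add(row_key)
```

A key that was added always yields `True` from `might_contain`; a key that was never added usually yields `False`, but may occasionally yield `True` when its bit positions collide with those of other keys.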
Which one of the following file formats is optimal for querying tables with many columns and performing aggregation operations such as SUM() and AVG()?
A. Text
B. Avro
C. JSON
D. Parquet