In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
A. It returns a reference to a different Writable object each time.
B. It returns a reference to a Writable object from an object pool.
C. It returns a reference to the same Writable object each time, but populated with different data.
D. It returns a reference to a Writable object. The API leaves unspecified whether this is a reused object or a new object.
E. It returns a reference to the same Writable object if the next value is the same as the previous value, or a new Writable object otherwise.
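Why the object-reuse question matters can be seen in a minimal, Hadoop-free sketch. Hadoop's reducer reuses the value object between calls to next(), so code that stores the references instead of copying the data ends up with repeated entries. The classes below (ValueReuseSketch, MutableInt, reusingIterator) are hypothetical stand-ins written for illustration, not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class ValueReuseSketch {
    // Hypothetical stand-in for a mutable Writable such as IntWritable.
    static final class MutableInt {
        private int value;
        void set(int v) { value = v; }
        int get() { return value; }
    }

    // Iterator that hands back one shared instance, refilled on every next().
    static Iterator<MutableInt> reusingIterator(final int[] data) {
        final MutableInt shared = new MutableInt();
        return new Iterator<MutableInt>() {
            private int i = 0;
            @Override public boolean hasNext() { return i < data.length; }
            @Override public MutableInt next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.set(data[i++]);
                return shared;
            }
        };
    }

    // Buggy pattern: keeps references, so every entry aliases the same object.
    static List<Integer> collectReferences(int[] data) {
        List<MutableInt> kept = new ArrayList<>();
        Iterator<MutableInt> it = reusingIterator(data);
        while (it.hasNext()) kept.add(it.next());
        List<Integer> seen = new ArrayList<>();
        for (MutableInt m : kept) seen.add(m.get());
        return seen;
    }

    // Safe pattern: copies the data out before asking for the next value.
    static List<Integer> collectCopies(int[] data) {
        List<Integer> seen = new ArrayList<>();
        Iterator<MutableInt> it = reusingIterator(data);
        while (it.hasNext()) seen.add(it.next().get());
        return seen;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(collectReferences(data)); // [3, 3, 3]
        System.out.println(collectCopies(data));     // [1, 2, 3]
    }
}
```

In a real reducer the equivalent precaution is to copy each value (for example, by reading its primitive contents) before caching it across iterations.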
Which process describes the lifecycle of a Mapper?
A. The JobTracker calls the TaskTracker's configure() method, then its map() method and finally its close() method.
B. The TaskTracker spawns a new Mapper to process all records in a single input split.
C. The TaskTracker spawns a new Mapper to process each key-value pair.
D. The JobTracker spawns a new Mapper to process all records in a single file.
Consider the following two relations, A and B.
Which Pig statement combines A by its first field and B by its second field?
A. C = JOIN B BY a1, A by b2;
B. C = JOIN A by a1, B by b2;
C. C = JOIN A a1, B b2;
D. C = JOIN A $0, B $1;
Workflows expressed in Oozie can contain:
A. Sequences of MapReduce and Pig. These sequences can be combined with other actions including forks, decision points, and path joins.
B. Sequences of MapReduce jobs only; no Pig or Hive tasks or jobs. These MapReduce sequences can be combined with forks and path joins.
C. Sequences of MapReduce and Pig jobs. These are limited to linear sequences of actions with exception handlers but no forks.
D. Iterative repetition of MapReduce jobs until a desired answer or state is reached.
Which one of the following statements is FALSE regarding the communication between DataNodes and a federation of NameNodes in Hadoop 2.2?
A. Each DataNode receives commands from one designated master NameNode.
B. DataNodes send periodic heartbeats to all the NameNodes.
C. Each DataNode registers with all the NameNodes.
D. DataNodes send periodic block reports to all the NameNodes.
Assuming the following Hive query executes successfully:
Which one of the following statements describes the result set?
A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the inputdata table.
B. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the inputdata table.
C. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines column of the inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines column of the inputdata table.
Your cluster's HDFS block size is 64MB. You have a directory containing 100 plain text files, each of which is 100MB in size. The InputFormat for your job is TextInputFormat.
How many Mappers will run?
A. 64
B. 100
C. 200
D. 640
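The arithmetic behind this question can be sketched in a few lines. The example assumes the split size equals the 64MB block size (default minimum and maximum split sizes), that the files are uncompressed and therefore splittable, and that one map task runs per input split. The 1.1 slop factor mirrors the rule FileInputFormat uses to avoid creating a tiny trailing split; the class and method names are hypothetical.

```java
final class SplitCountSketch {
    // FileInputFormat lets the final split run up to 10% over the split size.
    static final double SPLIT_SLOP = 1.1;

    // Number of input splits produced for a single splittable file.
    static long splitsForFile(long fileBytes, long splitBytes) {
        long splits = 0;
        long remaining = fileBytes;
        // Carve off full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitBytes > SPLIT_SLOP) {
            splits++;
            remaining -= splitBytes;
        }
        // Whatever is left becomes one final, smaller split.
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long perFile = splitsForFile(100 * mb, 64 * mb);
        long files = 100;
        // prints "2 splits per file, 200 map tasks in total"
        System.out.println(perFile + " splits per file, " + (files * perFile) + " map tasks in total");
    }
}
```

Each 100MB file yields one 64MB split and one 36MB split, and splits never span files, so the per-file count is simply multiplied by the number of files.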
You need to perform statistical analysis in your MapReduce job and would like to call methods in the Apache Commons Math library, which is distributed as a 1.3 megabyte Java archive (JAR) file. Which is the best way to make this library available to your MapReduce job at runtime?
A. Have your system administrator copy the JAR to all nodes in the cluster and set its location in the HADOOP_CLASSPATH environment variable before you submit your job.
B. Have your system administrator place the JAR file on a Web server accessible to all cluster nodes and then set the HTTP_JAR_URL environment variable to its location.
C. When submitting the job on the command line, specify the -libjars option followed by the JAR file path.
D. Package your code and the Apache Commons Math library into a zip file named JobJar.zip.
All keys used for intermediate output from mappers must:
A. Implement a splittable compression algorithm.
B. Be a subclass of FileInputFormat.
C. Implement WritableComparable.
D. Override isSplitable.
E. Implement a comparator for speedy sorting.
Which one of the following is NOT a valid Oozie action?
A. mapreduce
B. pig
C. hive
D. mrunit