Which of the following tool was designed to import data from a relational database into HDFS?
A. HCatalog B. Sqoop
C. Flume
D. Ambari
What does Pig provide to the overall Hadoop solution?
A. Legacy language Integration with MapReduce framework
B. Simple scripting language for writing MapReduce programs
C. Database table and storage management services
D. C++ interface to MapReduce and data warehouse infrastructure
Given the following Pig command:
logevents = LOAD andapos;input/my.logandapos; AS (date:chararray, levehstring, code:int, message:string);
Which one of the following statements is true?
A. The logevents relation represents the data from the my.log file, using a comma as the parsing delimiter
B. The logevents relation represents the data from the my.log file, using a tab as the parsing delimiter
C. The first field of logevents must be a properly-formatted date string or table return an error
D. The statement is not a valid Pig command
You want to Ingest log files Into HDFS, which tool would you use?
A. HCatalog
B. Flume
C. Sqoop
D. Ambari
Which Hadoop component is responsible for managing the distributed file system metadata?
A. NameNode
B. Metanode
C. DataNode
D. NameSpaceManager
Which describes how a client reads a file from HDFS?
A. The client queries the NameNode for the block location(s). The NameNode returns the block location(s) to the client. The client reads the data directory off the DataNode(s).
B. The client queries all DataNodes in parallel. The DataNode that contains the requested data responds directly to the client. The client reads the data directly off the DataNode.
C. The client contacts the NameNode for the block location(s). The NameNode then queries the DataNodes for block locations. The DataNodes respond to the NameNode, and the NameNode redirects the client to the DataNode that holds the requested data block(s). The client then reads the data directly off the DataNode.
D. The client contacts the NameNode for the block location(s). The NameNode contacts the DataNode that holds the requested data block. Data is transferred from the DataNode to the NameNode, and then from the NameNode to the client.
What are the TWO main components of the YARN ResourceManager process? Choose 2 answers
A. Job Tracker
B. Task Tracker
C. Scheduler
D. Applications Manager
Can you use MapReduce to perform a relational join on two large tables sharing a key? Assume that the two tables are formatted as comma-separated files in HDFS.
A. Yes.
B. Yes, but only if one of the tables fits into memory
C. Yes, so long as both tables fit into memory.
D. No, MapReduce cannot perform relational operations.
E. No, but it can be done with either Pig or Hive.
Given the following Hive commands:
Which one of the following statements Is true?
A. The file mydata.txt is copied to a subfolder of /apps/hive/warehouse
B. The file mydata.txt is moved to a subfolder of /apps/hive/warehouse
C. The file mydata.txt is copied into Hive's underlying relational database 0.
D. The file mydata.txt does not move from Its current location in HDFS
When is the earliest point at which the reduce method of a given Reducer can be called?
A. As soon as at least one mapper has finished processing its input split.
B. As soon as a mapper has emitted at least one record.
C. Not until all mappers have finished processing all records.
D. It depends on the InputFormat used for the job.