For each YARN job, the Hadoop framework generates task log files. Where are Hadoop's task log files stored?
A. In HDFS, in the directory of the user who generates the job
B. On the local disk of the slave node running the task
C. Cached in the YARN container running the task, then copied into HDFS on job completion
D. Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode
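Note: as background only (not part of the question), YARN task/container logs are written under the NodeManager's local log directories on each worker node; the yarn-site.xml properties below, with illustrative values, control that location and optional aggregation into HDFS after the job finishes.

  <!-- yarn-site.xml (illustrative values, assumed for this note) -->
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/var/log/hadoop-yarn/containers</value>  <!-- container logs on the worker's local disk -->
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>  <!-- if enabled, completed-application logs are copied into HDFS -->
  </property>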
Which YARN daemon or service negotiates map and reduce containers from the Scheduler, tracking their status and monitoring their progress?
A. ResourceManager
B. ApplicationMaster
C. NodeManager
D. ApplicationManager
On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when they look in the directory?
A. They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster.
B. They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster.
C. They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see the contents of the file up to the last completed block (as each 64 MB block is written, that block becomes available).
D. The directory will appear to be empty until the entire file write is completed on the cluster
Which three basic configuration parameters must you set to migrate your cluster from MapReduce v1 (MRv1) to MapReduce v2 (MRv2)?
A. Configure the NodeManager hostname and enable services on YARN by setting the following property in yarn-site.xml:
B. Configure the number of map tasks per job on YARN by setting the following property in mapred-site.xml:
C. Configure MapReduce as a framework running on YARN by setting the following property in mapred-site.xml:
D. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml:
E. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml:
F. Configure the NodeManager to enable MapReduce services on YARN by adding the following property to yarn-site.xml:
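Note: for reference, a minimal sketch of the kinds of properties such a migration typically involves (values are illustrative and the hostname is hypothetical; the snippets that originally accompanied each answer option are not reproduced here).

  <!-- mapred-site.xml: run MapReduce as a framework on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <!-- yarn-site.xml: ResourceManager hostname and the MapReduce shuffle auxiliary service -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>  <!-- hypothetical hostname -->
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>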
Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Reduce tasks only
You observe that the number of spilled records from Map tasks far exceeds the number of map output records. Your child heap size is 1 GB and your io.sort.mb value is set to 100 MB. How would you tune your io.sort.mb value to achieve the maximum memory-to-disk I/O ratio?
A. Decrease the io.sort.mb value to 0
B. Increase the io.sort.mb to 1GB
C. For a 1 GB child heap size, an io.sort.mb of 128 MB will always maximize the memory-to-disk I/O ratio
D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as close as possible to) the number of map output records
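Note: io.sort.mb (renamed mapreduce.task.io.sort.mb in MRv2) is the size of the map-side in-memory sort buffer and is set in mapred-site.xml; a minimal sketch with an illustrative value:

  <!-- mapred-site.xml: map-side sort buffer size in MB -->
  <property>
    <name>mapreduce.task.io.sort.mb</name>  <!-- io.sort.mb under MRv1 naming -->
    <value>256</value>  <!-- illustrative; tune until spilled records roughly equal map output records -->
  </property>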
Each node in your Hadoop cluster, running YARN, has 64 GB memory and 24 cores. Your yarn-site.xml has the following configuration:
You want YARN to launch no more than 16 containers per node. What should you do?
A. No action is needed: YARN's dynamic resource allocation automatically optimizes the node memory and cores
B. Modify yarn-site.xml with the following property:
C. Modify yarn-site.xml with the following property:
D. Modify yarn-site.xml with the following property:
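Note: as background, the number of containers a node can host is roughly the memory the NodeManager advertises divided by the scheduler's minimum allocation; a sketch with illustrative values (assuming the 64 GB node from the question, i.e. 16 containers of about 4 GB each). This is not the snippet referenced by the question, which is not reproduced here.

  <!-- yarn-site.xml (illustrative values) -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>65536</value>  <!-- 64 GB of node memory advertised to YARN -->
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>4096</value>  <!-- 65536 / 4096 = at most 16 containers per node -->
  </property>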
Your Hadoop cluster contains nodes in three racks. You have NOT configured the dfs.hosts property in the NameNode's configuration file. What results?
A. No new nodes can be added to the cluster until you specify them in the dfs.hosts file
B. Presented with a blank dfs.hosts property, the NameNode will permit DataNodes specified in mapred.hosts to join the cluster
C. Any machine running the DataNode daemon can immediately join the cluster
D. The NameNode will update the dfs.hosts property to include machines running the DataNode daemon on the next NameNode reboot or with the command dfsadmin -refreshNodes
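Note: for reference, when dfs.hosts is set it points the NameNode at an include file listing the hosts allowed to register as DataNodes; a minimal sketch (the file path is hypothetical).

  <!-- hdfs-site.xml -->
  <property>
    <name>dfs.hosts</name>
    <value>/etc/hadoop/conf/dfs.hosts.include</value>  <!-- hypothetical include file of permitted DataNode hosts -->
  </property>
  <!-- after editing the include file, apply it with: hdfs dfsadmin -refreshNodes -->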
Which processes must you perform if you are running a Hadoop cluster with a single NameNode and six DataNodes and you want to change a configuration parameter so that it affects all six DataNodes?
A. You must modify the configuration file on each of the six DataNode machines.
B. You must restart the NameNode daemon to apply the changes to the cluster
C. You must restart all six DataNode daemons to apply the changes to the cluster
D. You don't need to restart any daemon, as they will pick up changes automatically
E. You must modify the configuration files on the NameNode only. DataNodes read their configuration from the master nodes.
Which process instantiates user code and executes map and reduce tasks on a cluster running MapReduce v2 (MRv2) on YARN?
A. NodeManager
B. ApplicationMaster
C. ResourceManager
D. TaskTracker
E. JobTracker
F. DataNode
G. NameNode