
Databricks Certified Associate Developer for Apache Spark: Online Practice Questions and Answers

Question 4

The code block displayed below contains multiple errors. The code block should return a DataFrame that contains only columns transactionId, predError, value and storeId of DataFrame transactionsDf. Find the errors.

Code block:

transactionsDf.select([col(productId), col(f)])

Sample of transactionsDf:

+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
+-------------+---------+-----+-------+---------+----+

A. The column names should be listed directly as arguments to the operator and not as a list.

B. The select operator should be replaced by a drop operator, the column names should be listed directly as arguments to the operator and not as a list, and all column names should be expressed as strings without being wrapped in a col() operator.

C. The select operator should be replaced by a drop operator.

D. The column names should be listed directly as arguments to the operator and not as a list and following the pattern of how column names are expressed in the code block, columns productId and f should be replaced by transactionId, predError, value and storeId.

E. The select operator should be replaced by a drop operator, the column names should be listed directly as arguments to the operator and not as a list, and all col() operators should be removed.
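For reference, a minimal sketch of a select that keeps only the four desired columns, assuming the sample schema above (the variable name result is illustrative):

# Column names may be passed directly as string arguments to select()
result = transactionsDf.select("transactionId", "predError", "value", "storeId")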

Question 5

The code block shown below should return a copy of DataFrame transactionsDf with an added column cos. This column should contain the cosine of the values in column value after they have been converted to degrees, rounded to two decimals. Choose the answer that correctly fills the blanks in the code block to accomplish this.

Code block:

transactionsDf.__1__(__2__, round(__3__(__4__(__5__)),2))

A. 1. withColumn, 2. col("cos"), 3. cos, 4. degrees, 5. transactionsDf.value

B. 1. withColumnRenamed, 2. "cos", 3. cos, 4. degrees, 5. "transactionsDf.value"

C. 1. withColumn, 2. "cos", 3. cos, 4. degrees, 5. transactionsDf.value

D. 1. withColumn, 2. col("cos"), 3. cos, 4. degrees, 5. col("value")

E. 1. withColumn, 2. "cos", 3. degrees, 4. cos, 5. col("value")

Question 6

The code block displayed below contains an error. The code block below is intended to add a column itemNameElements to DataFrame itemsDf that includes an array of all words in column itemName. Find the error.

Sample of DataFrame itemsDf:

+------+----------------------------------+-------------------+
|itemId|itemName                          |supplier           |
+------+----------------------------------+-------------------+
|1     |Thick Coat for Walking in the Snow|Sports Company Inc.|
|2     |Elegant Outdoors Summer Dress     |YetiX              |
|3     |Outdoors Backpack                 |Sports Company Inc.|
+------+----------------------------------+-------------------+

Code block:

itemsDf.withColumnRenamed("itemNameElements", split("itemName"))

A. All column names need to be wrapped in the col() operator.

B. Operator withColumnRenamed needs to be replaced with operator withColumn and a second argument "," needs to be passed to the split method.

C. Operator withColumnRenamed needs to be replaced with operator withColumn and the split method needs to be replaced by the splitString method.

D. Operator withColumnRenamed needs to be replaced with operator withColumn and a second argument " " needs to be passed to the split method.

E. The expressions "itemNameElements" and split("itemName") need to be swapped.
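For reference, a minimal sketch of splitting a string column into an array of words, assuming words are separated by single spaces (the variable name withElements is illustrative):

from pyspark.sql.functions import split

# split(column, pattern) returns an array column; the pattern " " splits on spaces
withElements = itemsDf.withColumn("itemNameElements", split("itemName", " "))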

Question 7

Which of the following code blocks reads in the JSON file stored at filePath as a DataFrame?

A. spark.read.json(filePath)

B. spark.read.path(filePath, source="json")

C. spark.read().path(filePath)

D. spark.read().json(filePath)

E. spark.read.path(filePath)
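For reference: spark.read is a property that returns a DataFrameReader, so it is accessed without parentheses. A minimal sketch (the variable name df is illustrative):

# Read the JSON file at filePath into a DataFrame
df = spark.read.json(filePath)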

Question 8

The code block displayed below contains an error. The code block should write DataFrame transactionsDf as a parquet file to location filePath after partitioning it on column storeId. Find the error.

Code block:

transactionsDf.write.partitionOn("storeId").parquet(filePath)

A. The partitioning column as well as the file path should be passed to the write() method of DataFrame transactionsDf directly and not as appended commands as in the code block.

B. The partitionOn method should be called before the write method.

C. The operator should use the mode() option to configure the DataFrameWriter so that it replaces any existing files at location filePath.

D. Column storeId should be wrapped in a col() operator.

E. No method partitionOn() exists for the DataFrame class, partitionBy() should be used instead.
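For reference, a minimal sketch of writing partitioned parquet output with the DataFrameWriter, assuming the scenario above:

# partitionBy() is the DataFrameWriter method for partitioning output on a column
transactionsDf.write.partitionBy("storeId").parquet(filePath)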

Question 9

Which of the following describes Spark's standalone deployment mode?

A. Standalone mode uses a single JVM to run Spark driver and executor processes.

B. Standalone mode means that the cluster does not contain the driver.

C. Standalone mode is how Spark runs on YARN and Mesos clusters.

D. Standalone mode uses only a single executor per worker per application.

E. Standalone mode is a viable solution for clusters that run multiple frameworks, not only Spark.

Question 10

Which of the following describes Spark's way of managing memory?

A. Spark uses a subset of the reserved system memory.

B. Storage memory is used for caching partitions derived from DataFrames.

C. As a general rule for garbage collection, Spark performs better on many small objects than few big objects.

D. Disabling serialization potentially greatly reduces the memory footprint of a Spark application.

E. Spark's memory usage can be divided into three categories: Execution, transaction, and storage.

Question 11

Which of the following code blocks reads the parquet file stored at filePath into DataFrame itemsDf, using a valid schema for the sample of itemsDf shown below?

Sample of itemsDf:

+------+-----------------------------+-------------------+
|itemId|attributes                   |supplier           |
+------+-----------------------------+-------------------+
|1     |[blue, winter, cozy]         |Sports Company Inc.|
|2     |[red, summer, fresh, cooling]|YetiX              |
|3     |[green, summer, travel]      |Sports Company Inc.|
+------+-----------------------------+-------------------+

A.
itemsDfSchema = StructType([
    StructField("itemId", IntegerType()),
    StructField("attributes", StringType()),
    StructField("supplier", StringType())])

itemsDf = spark.read.schema(itemsDfSchema).parquet(filePath)

B.
itemsDfSchema = StructType([
    StructField("itemId", IntegerType),
    StructField("attributes", ArrayType(StringType)),
    StructField("supplier", StringType)])

itemsDf = spark.read.schema(itemsDfSchema).parquet(filePath)

C.
itemsDf = spark.read.schema('itemId integer, attributes , supplier string').parquet(filePath)

D.
itemsDfSchema = StructType([
    StructField("itemId", IntegerType()),
    StructField("attributes", ArrayType(StringType())),
    StructField("supplier", StringType())])

itemsDf = spark.read.schema(itemsDfSchema).parquet(filePath)

E.
itemsDfSchema = StructType([
    StructField("itemId", IntegerType()),
    StructField("attributes", ArrayType([StringType()])),
    StructField("supplier", StringType())])

itemsDf = spark.read(schema=itemsDfSchema).parquet(filePath)
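For reference, a minimal sketch of declaring a schema with an array column and applying it when reading parquet, assuming the sample above (note that type objects such as IntegerType() must be instantiated):

from pyspark.sql.types import StructType, StructField, IntegerType, StringType, ArrayType

# attributes holds a list of strings, so it needs ArrayType(StringType())
itemsDfSchema = StructType([
    StructField("itemId", IntegerType()),
    StructField("attributes", ArrayType(StringType())),
    StructField("supplier", StringType())])
itemsDf = spark.read.schema(itemsDfSchema).parquet(filePath)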

Question 12

The code block displayed below contains an error. The code block should read the csv file located at path data/transactions.csv into DataFrame transactionsDf, using the first row as column header and casting the columns in the most appropriate type. Find the error.

First 3 rows of transactions.csv:

transactionId;storeId;productId;name
1;23;12;green grass
2;35;31;yellow sun
3;23;12;green grass

Code block:

transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True)

A. The DataFrameReader is not accessed correctly.

B. The transaction is evaluated lazily, so no file will be read.

C. Spark is unable to understand the file type.

D. The code block is unable to capture all columns.

E. The resulting DataFrame will not have the appropriate schema.
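For reference: with csv, Spark only casts columns to the most appropriate types when the inferSchema option is set; otherwise every column is read as a string. A minimal sketch, assuming the file above:

# inferSchema=True asks Spark to sample the file and infer column types
transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True, inferSchema=True)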

Question 13

Which of the elements in the labeled panels represent the operation performed for broadcast variables?

(Exhibit: labeled panels; image not reproduced here.)

A. 2, 5

B. 3

C. 2, 3

D. 1, 2

E. 1, 3, 4

Exam Name: Databricks Certified Associate Developer for Apache Spark 3.0
Last Update: Jan 17, 2025
Questions: 180