PySpark MCQs Solution | TCS Fresco Play | Fresco Play
Disclaimer: The primary purpose of providing these solutions is to assist and support anyone who is unable to complete these courses due to a technical issue or a lack of expertise. The information on this website is solely for knowledge and education.
Make an effort to understand these solutions and apply them to your Hands-On exercises. (Copying and pasting these solutions is not advisable.)
All the MCQ questions are listed below. For ease of use, press Ctrl + F and search for the question text. All the Best!
1. Which among the following programming languages does Spark support?
All the options
2. Spark was first coded as a C project
FALSE
3. PySpark is built on top of Spark's Java API
TRUE
4. SparkContext uses Py4J to launch a JVM and create a
JavaSparkContext
5. Which among the following is a/are feature(s) of DataFrames?
All the options
6. Which among the following is an example of Transformation?
groupByKey([numPartitions])
7. Spark SQL can read and write data from Hive Tables
TRUE
8. DataFrame is data organized into ______ columns
named
9. Parquet stores nested data structures in a flat ________ format
Columnar
10. Spark SQL does not provide support for both reading and writing Parquet files
FALSE
11. Spark SQL brings native support for SQL to Spark
TRUE
12. We cannot pass SQL queries directly to any DataFrame
FALSE
13. How do you create a table in the Hive warehouse programmatically from Spark?
spark.sql("CREATE TABLE IF NOT EXISTS table_name (column_name_1 DataType, column_name_2 DataType, ......, column_name_n DataType) USING hive")
14. External tables are used to store data outside the
Hive
15. Spark SQL supports reading and writing data stored in Hive
TRUE
16. Which among the following is an example of Action?
foreach(func)
17. Co-Variance of two random columns is near to
zero
18. HBase shell is implemented using __________
Java
19. ____________ is a component on top of Spark Core.
Spark SQL
20. Registering a DataFrame as a ________ view allows you to run SQL queries over its data.
Temporary
21. The schema of loaded DataFrame df can be checked using
df.printSchema()
22. HBase is a distributed ________ database built on top of the Hadoop file system.
Column-oriented
23. If the schema of the table does not match with the data types present in the file containing the table, then Hive ________
Reports Null values for mismatched data
24. Spark was first coded as a C project
FALSE
25. Select the correct statement.
For cluster manager, Spark supports standalone and Hadoop YARN
26. Parallelized collections are created by calling SparkContext’s parallelize method on an existing iterable or collection in driver program.
TRUE
27. In PySpark, sorting is in _________ order, by default
ascending
28. For filtering the data, filter command is used
TRUE