PySpark MCQs Solution | TCS Fresco Play

1. Which among the following programming languages does Spark support?

All the options

2. Spark was first coded as a C project


3. PySpark is built on top of Spark's Java API


4. SparkContext uses Py4J to launch a JVM and create a


5. Which among the following is a/are feature(s) of DataFrames?

All the options

6. Which among the following is an example of a Transformation?


7. Spark SQL can read and write data from Hive Tables


8. DataFrame is data organized into ______ columns


9. Parquet stores nested data structures in a flat ________ format


10. Spark SQL does not provide support for both reading and writing Parquet files


11. Spark SQL brings native support for SQL to Spark


12. We cannot pass SQL queries directly to any DataFrame


13. How do you create a table in the Hive warehouse programmatically from Spark?

spark.sql("CREATE TABLE IF NOT EXISTS table_name(column_name_1 DataType,column_name_2 DataType,......,column_name_n DataType) USING hive")

14. External tables are used to store data outside the


15. Spark SQL supports reading and writing data stored in Hive


16. Which among the following is an example of an Action?


17. Covariance of two random columns is close to


18. HBase shell is implemented using __________


19. ____________ is a component on top of Spark Core.

Spark SQL

20. Registering a DataFrame as a ________ view allows you to run SQL queries over its data.


21. The schema of loaded DataFrame df can be checked using


22. HBase is a distributed ________ database built on top of the Hadoop file system.


23. If the schema of the table does not match with the data types present in the file containing the table, then Hive ________

Reports Null values for mismatched data

25. Select the correct statement.

For the cluster manager, Spark supports Standalone and Hadoop YARN

26. Parallelized collections are created by calling SparkContext’s parallelize method on an existing iterable or collection in the driver program.


27. In PySpark, sorting is in _________ order, by default


28. For filtering the data, the filter command is used



