Apache Spark Sample Test: Sample Questions

Question:
Apache Spark supports:

1. Batch processing

2. Stream processing

3. Graph processing

4. All of the above


Question:
Can you combine Apache Spark libraries (for example, MLlib, GraphX, SQL and DataFrames) in the same application?

1. Yes

2. No

3. None

4. None of these


Question:
flatMap transforms an RDD of length N into another RDD of length M. Which of the following is true for N and M?

a. N > M

b. N < M

c. N <= M

1. Either a or b

2. Either b or c

3. Either a or c

4. None of the above
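
As a study aid for the question above, here is a minimal plain-Python sketch of how map and flatMap differ in output length. It does not use the Spark API: the helpers `rdd_map` and `rdd_flat_map` are hypothetical stand-ins that operate on ordinary lists, assuming only that flatMap applies a function returning zero or more elements per input and flattens the results.

```python
from itertools import chain

def rdd_map(func, items):
    """Mimic RDD.map on a plain list: exactly one output per input."""
    return [func(x) for x in items]

def rdd_flat_map(func, items):
    """Mimic RDD.flatMap on a plain list: func returns an iterable per
    input, and all iterables are flattened into a single list."""
    return list(chain.from_iterable(func(x) for x in items))

lines = ["a b c", "", "d"]  # input length N = 3

# map keeps the length: one (possibly empty) list of words per line.
mapped = rdd_map(str.split, lines)

# flatMap can grow the collection: every word becomes its own element.
flat = rdd_flat_map(str.split, lines)

# flatMap can also shrink it: returning no elements drops the input.
shrunk = rdd_flat_map(lambda s: [], lines)

print(len(lines), len(mapped), len(flat), len(shrunk))
```

The same input yields an output of equal length under map, and an output that is longer or shorter under flatMap, depending on how many elements the function returns per input.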


Question:
For a multiclass classification problem, which algorithm is not a solution?

1. Naive Bayes

2. Random Forests

3. Logistic Regression

4. Decision Trees


Question:
For a regression problem, which algorithm is not a solution?

1. Logistic Regression

2. Ridge Regression

3. Decision Trees

4. Gradient-Boosted Trees


Question:
How many SparkContexts can be active per JVM?

1. More than one

2. Only one

3. Not specific

4. None of the above


Question:
How many tasks does Spark run on each partition?

1. Any number of tasks

2. One

3. More than one but fewer than five

4. None of these


Question:
How much faster than MapReduce can Apache Spark potentially run batch-processing programs when the data is processed in memory?

1. 10 times faster

2. 20 times faster

3. 100 times faster

4. 200 times faster


Question:
In how many ways can an RDD be created?

1. 4

2. 3

3. 2

4. 1


Question:
For which of the following actions is the result not returned to the driver?

1. collect()

2. top()

3. countByValue()

4. foreach()


Question:
In which of the following cases do we keep the data in memory?

1. Iterative algorithms

2. Interactive data mining tools

3. Both of the above

4. None of these


Question:
Spark RDD overcame the shortcomings of Hadoop MapReduce through:

1. Lazy evaluation

2. DAG

3. In-memory processing

4. All of the above


Question:
The write operation on an RDD is:

1. Fine-grained

2. Coarse-grained

3. Either fine-grained or coarse-grained

4. Neither fine-grained nor coarse-grained


Question:
What are the features of Spark RDD?

1. In-memory computation

2. Lazy evaluation

3. Fault tolerance

4. All of the above


Question:
What is an action in Spark RDD?

1. A way to send results from the executors to the driver

2. It takes an RDD as input and produces one or more RDDs as output

3. It creates one or many new RDDs

4. All of the above


Question:
When does Apache Spark evaluate an RDD?

1. Upon action

2. Upon transformation

3. On both transformation and action

4. None of the above
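
To make the idea of deferred evaluation concrete, the following plain-Python analogy may help. It is not Spark code: Python's built-in lazy `map` stands in for a transformation, `list(...)` stands in for an action, and the helper `record_and_double` is a hypothetical function that logs each call so the timing of the work is visible.

```python
calls = []  # log of every element the function has actually processed

def record_and_double(x):
    """Double x and record that the computation really ran."""
    calls.append(x)
    return x * 2

data = [1, 2, 3]

# Building the pipeline is lazy: nothing is computed yet.
pipeline = map(record_and_double, data)
calls_before = list(calls)

# Consuming the pipeline forces the computation, like an action would.
result = list(pipeline)
calls_after = list(calls)

print(calls_before, result, calls_after)
```

Only when the pipeline is consumed does the function run over the data, which mirrors how a lazily evaluated pipeline defers work until a result is requested.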


Question:
Which of the following is a tool provided by the Machine Learning Library (MLlib)?

1. Persistence

2. Utilities such as linear algebra and statistics

3. Pipelines

4. All of the above


Question:
Which of the following is a transformation?

1. take(n)

2. top()

3. countByValue()

4. mapPartitionsWithIndex()


Question:
Which of the following is an action?

1. union(otherDataset)

2. intersection(otherDataset)

3. distinct()

4. countByValue()


Question:
Which of the following is false for Apache Spark?

1. It provides high-level APIs in Java, Python, R, and Scala

2. It can be integrated with Hadoop and can process existing Hadoop HDFS data

3. Spark is an open-source framework written in Java

4. Spark can run up to 100 times faster than Hadoop MapReduce for in-memory workloads


Question:
Which of the following is not a function of SparkContext in Apache Spark?

1. Entry point to Spark SQL

2. To access various services

3. To set the configuration

4. To get the current status of the Spark application


Question:
Which of the following is not a transformation?

1. flatMap

2. map

3. reduce

4. filter


Question:
Which of the following is not an action?

1. collect()

2. take(n)

3. top()

4. map()


Question:
Which of the following is not true for the map() operation?

1. map transforms an RDD of length N into another RDD of length N.

2. In the map operation, the developer can define custom business logic.

3. It applies to each element of the RDD and returns the result as a new RDD.

4. map allows returning 0, 1 or more elements from the map function.


Question:
Which of the following is open-source?

1. Apache Spark

2. Apache Hadoop

3. Apache Flink

4. All of the above


Question:
Which of the following is the entry point of a Spark application?

1. SparkSession

2. SparkContext

3. Neither of the two

4. Only 1


Question:
Which of the following is the entry point of Spark SQL?

1. SparkSession

2. SparkContext

3. Both 1 and 2

4. None


Question:
Which of the following is the reason Spark is faster than MapReduce?

1. DAG execution engine and in-memory computation

2. Support for different language APIs like Scala, Java, Python and R

3. RDDs are immutable and fault-tolerant

4. None of the above


Question:
Which of the following is true about DataFrames?

1. DataFrames provide a more user-friendly API than RDDs.

2. The DataFrame API provides compile-time type safety.

3. Both of the above

4. None of the above


Question:
Which of the following is true about a narrow transformation?

1. The data required for the computation resides on multiple partitions.

2. The data required for the computation resides on a single partition.

3. Both of the above

4. None


Question:
Which of the following is true about a wide transformation?

1. The data required for the computation resides on multiple partitions.

2. The data required for the computation resides on a single partition.

3. Both 1 and 2

4. Neither of the two


Question:
Which of the following is true for RDD?

1. RDD is a programming paradigm

2. An RDD in Apache Spark is an immutable collection of objects

3. It is a database

4. None of the above


Question:
Which of the following is true for RDD?

1. We can operate on Spark RDDs in parallel with a low-level API

2. RDDs are similar to a table in a relational database

3. It allows processing of a large amount of structured data

4. It has a built-in optimization engine


Question:
Which of the following is true for Spark Core?

1. It is the kernel of Spark

2. It enables users to run SQL / HQL queries on top of Spark

3. It is the scalable machine learning library which delivers efficiencies

4. It drastically improves the performance of iterative algorithms


Question:
Which of the following is true for Spark MLlib?

1. It provides an execution platform for all Spark applications

2. It is the scalable machine learning library which delivers efficiencies

3. It enables powerful interactive and data analytics applications across live streaming data

4. All of the above


Question:
Which of the following is true for SparkR?

1. It allows data scientists to analyze large datasets and interactively run jobs

2. It is the kernel of Spark

3. It is the scalable machine learning library which delivers efficiencies

4. It enables users to run SQL / HQL queries on top of Spark


Question:
Which of the following is true for Spark Shell?

1. It helps Spark applications run easily from the system command line

2. It runs and tests application code interactively

3. It allows reading from many types of data sources

4. All of the above


Question:
Which of the following is true for Spark SQL?

1. It is the kernel of Spark

2. It provides an execution platform for all Spark applications

3. It enables users to run SQL / HQL queries on top of Spark

4. It enables users to run SQL / HQL queries on top of Spark


Question:
Which of the following leverages Spark Core's fast scheduling capability to perform streaming analytics?

1. RDD

2. GraphX

3. Spark Streaming

4. SparkR


Question:
You can connect an R program to a Spark cluster from:

1. RStudio

2. R shell

3. Rscript

4. All of the above


R4R Team