Cassandra Interview Questions and Answers for Freshers & Experienced

What is Gossip Protocol?

Gossip is the peer-to-peer communication protocol that Cassandra nodes use to exchange state information. Each node periodically picks a few other nodes and exchanges information about itself and about the other nodes it has already gossiped with, so all nodes quickly learn about all other nodes in the cluster.
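
The spreading behaviour described above can be sketched with a small simulation. This is only an illustrative model, not Cassandra's implementation: the node names, the version numbers, and the `fanout` of three peers per round are assumptions chosen for the example.

```python
import random


def gossip_round(views, rng, fanout=3):
    """Run one round: every node merges state with up to `fanout` random peers."""
    nodes = list(views)
    for node in nodes:
        peers = [n for n in nodes if n != node]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            # Both sides end up with the union of what they knew; for each
            # known node the higher (newer) version number wins.
            merged = dict(views[node])
            for name, version in views[peer].items():
                merged[name] = max(merged.get(name, 0), version)
            views[node] = dict(merged)
            views[peer] = dict(merged)


def rounds_until_converged(node_count, seed=0, max_rounds=1000):
    """Return the number of rounds until every node knows about every node."""
    rng = random.Random(seed)
    # Initially each node only knows its own state (hypothetical version 1).
    views = {f"node{i}": {f"node{i}": 1} for i in range(node_count)}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_no
    return None  # did not converge within max_rounds
```

Calling `rounds_until_converged(20)` shows how few rounds are needed for a small cluster, because the number of nodes that know a given piece of state grows quickly with every exchange.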

Differentiate between Drop and Truncate in CQLSH

* The DROP TABLE command removes the specified table from the keyspace, including its schema and all of its data.

* The TRUNCATE command permanently deletes all the rows of a table but keeps the table definition, so the table can still be used afterwards.

Differentiate between Static and Dynamic CQL Tables.

* A static table uses a relatively static set of column names and is similar to a relational database table.

* A dynamic table allows you to pre-compute result sets and store them in a single row for efficient data retrieval.

Differentiate between the various types of Primary Keys in Cassandra.

* In a single primary key, only one column forms the primary key. That column is also the partition key: data is partitioned on it and spread across nodes accordingly.

* In a compound primary key, data is partitioned and then clustered. For example, with a primary key of (race_name, race_position), race_name is the partition key and race_position is the clustering key. Data is partitioned on the basis of race_name and clustered on the basis of race_position. Clustering is the process that sorts data within a partition. Retrieving rows is very efficient when the rows for a partition key are stored in order, based on the clustering column.

* A composite partition key uses more than one column as the partition key, which splits the data into multiple, smaller partitions.
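
As a rough illustration of how a partition key and a clustering key work together, the toy model below groups rows by race_name and keeps each group sorted by race_position. The class name `CompoundKeyTable` and the `cyclist` value are hypothetical, and the in-memory dictionary only mimics the idea; it is not how Cassandra stores data on disk.

```python
import bisect
from collections import defaultdict


class CompoundKeyTable:
    """Toy model of a table with PRIMARY KEY (race_name, race_position)."""

    def __init__(self):
        # partition key -> list of (clustering key, value), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, race_name, race_position, cyclist):
        rows = self._partitions[race_name]
        keys = [position for position, _ in rows]
        index = bisect.bisect_left(keys, race_position)
        if index < len(rows) and rows[index][0] == race_position:
            # Same primary key: the new value replaces the old one (upsert).
            rows[index] = (race_position, cyclist)
        else:
            # Insert at the position that keeps the partition sorted.
            rows.insert(index, (race_position, cyclist))

    def select(self, race_name):
        """Return one partition's rows, already ordered by clustering key."""
        return list(self._partitions.get(race_name, []))
```

Reading a whole partition with `select` needs no sorting at query time, which is the reason retrieval by partition key is efficient.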

What are Cassandra CQL collections?

Cassandra CQL collections let you store multiple values in a single column. You can use CQL collections in the following ways:

* List: used when the order of the data needs to be maintained and a value may be stored multiple times (holds repeating elements)
* Set: used for a group of unique elements that is returned in sorted order (holds unique elements)
* Map: a data type used to store key-value pairs of elements

What needs to be taken care of while adding a column?

While adding a column, make sure that:

* The column name does not conflict with an existing column name
* The table is not defined with the compact storage option

What is a row in Cassandra, and what are its different elements?

A row is a collection of sorted columns. It is the smallest unit that stores related data in Cassandra. Any component of a row can store data or metadata.

The different elements of a row are:

* Row Key
* Column Keys
* Column Values

What is Network Topology Strategy?

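Network Topology Strategy is used when a cluster is deployed across multiple data centers. It lets you define how many replicas are placed in each data center, which is the primary consideration when placing replicas. With it, reads can be satisfied locally without incurring cross-data-center latency, and the cluster can handle failure scenarios.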

What is the main objective of creating Cassandra?

The main objective of Cassandra is to handle large amounts of data. A further objective is to remain fault tolerant while keeping data transfer fast.

Who developed Cassandra and in which language?

Avinash Lakshman and Prashant Malik developed Cassandra using Java. It was later taken over by Apache for further development.

Give some advantages of Cassandra.

These are the advantages of Cassandra:

* Since data can be replicated to several nodes, Cassandra is fault tolerant.
* Cassandra can handle a large set of data.
* Cassandra provides high scalability.

Tell something about the query language used in Cassandra Database.

Cassandra Query Language (CQL) is the query language of the Cassandra database. It is the interface through which a user accesses the database and acts as the communication medium between the two. All operations, such as defining schemas and reading or writing data, are carried out through CQL statements.

What do you mean by replication Strategy?

The replica placement strategy refers to how the replicas are placed in the ring. Cassandra ships with different strategies for determining which nodes get copies of which keys. There are mainly two types of strategies:

* Simple Strategy
* Network Topology Strategy

What do you mean by replication factor?

Cassandra stores copies (called replicas) of each row based on the row key. The replication factor refers to the number of nodes that will act as copies (replicas) of each row of data.
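
To make the idea concrete, here is a minimal sketch of ring-based replica placement in the spirit of the simple strategy: the row key is hashed to a position on the ring, the first node at or after that position owns it, and the remaining replicas are the next nodes clockwise. The function names, the node names, and the use of MD5 as the hash are assumptions made for illustration; they are not Cassandra's actual partitioner or code.

```python
import bisect
import hashlib


def token_for(key):
    """Map a key to a position on the ring (MD5 is only an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas_for(row_key, nodes, replication_factor):
    """Pick the owner of the key's token, then walk clockwise for more replicas."""
    ring = sorted((token_for(node), node) for node in nodes)
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token_for(row_key)) % len(ring)
    # A key cannot have more replicas than there are nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]
```

For example, `replicas_for("user42", ["n1", "n2", "n3", "n4"], 3)` returns three distinct nodes, so losing any single one of them still leaves two copies of the row.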

Define a column family.

A keyspace contains many column families. A column family is essentially a table: it defines the columns of an application-specific table and holds its rows.

What is mandatory while creating a table in Cassandra?

While creating a table, a primary key is mandatory. It is made up of one or more columns of the table.

What do the shell commands “Capture” and “Consistency” do?

There are various cqlsh shell commands in Cassandra. The “Capture” command captures the output of a command and adds it to a file, while the “Consistency” command displays the current consistency level or sets a new one.

What is Cassandra cqlsh?

Cassandra cqlsh is a command-line shell for CQL that enables users to communicate with the database. By using cqlsh, you can do the following things:

* Define a schema
* Insert data
* Execute a query

When can you use ALTER KEYSPACE?

ALTER KEYSPACE can be used to change properties of a keyspace, such as the number of replicas and durable_writes.

What is the syntax to create keyspace in Cassandra?

Syntax for creating keyspace in Cassandra is

CREATE KEYSPACE <identifier> WITH <properties>

What is the data center?

A data center is a group of related Cassandra nodes within a cluster, typically configured together for replication purposes. A cluster is made up of one or more data centers, and the data is distributed across the nodes of the cluster.

What is a node?

A node is the basic unit of Cassandra. It is a single computer or server that is part of a cluster, and it is the place where the data is actually stored.

State the differences between a node, a cluster, and a data center in Cassandra.

A node is a single machine running Cassandra. A cluster is the full collection of nodes that together hold the data. Data centers are groupings of nodes within a cluster and are useful when serving customers in different geographical areas: you can group the nodes of a cluster into different data centers.

Explain CAP Theorem.

The CAP theorem states that a distributed system such as Cassandra can provide at most two of three guarantees: consistency, availability, and partition tolerance. It is therefore a key consideration when deciding how a distributed system should scale.

One of the three has to be sacrificed. Consistency guarantees that a read returns the most recent write; availability means every request receives a reasonable response within a reasonable time; partition tolerance means the system continues operating when network partitions occur. Because partitions cannot be ruled out in a distributed system, the two practical options are AP and CP. Cassandra is generally classified as an AP system whose consistency can be tuned per operation.

Explain the concept of Bloom Filter.

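A Bloom filter is an off-heap data structure (kept in native memory rather than on the Java heap) associated with each SSTable. It is used to check whether an SSTable may contain the requested data before performing any disk I/O. The check is probabilistic: it can report a false positive, but it never reports that data is absent when it is actually present.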
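
The general technique can be sketched in a few lines. This is a generic Bloom filter, not Cassandra's implementation; the class name, the default sizes, and the use of SHA-256 to derive bit positions are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        # One byte per bit keeps the sketch readable (a real filter packs bits).
        self.bits = bytearray(size_bits)

    def _positions(self, key):
        # Derive `hash_count` positions by salting the key with an index.
        for i in range(self.hash_count):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key):
        # If any position is unset, the key was definitely never added.
        return all(self.bits[position] for position in self._positions(key))
```

A read path would consult `might_contain` first and skip the SSTable entirely when it returns `False`, which is how the disk access is avoided.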

What is SSTable? How is it different from other relational tables?

SSTable stands for ‘Sorted String Table’. It is an important data file in Cassandra to which memtables are regularly flushed. SSTables are stored on disk and exist for each Cassandra table. Unlike rows in a relational table, SSTables are immutable: once written, no data items can be added or removed. For each SSTable, Cassandra also creates structures such as a partition index, a partition summary, and a bloom filter.

Define the management tools in Cassandra.

DataStax OpsCenter: It is a web-based management and monitoring solution for Cassandra clusters and DataStax. It is free to download and includes an additional edition of OpsCenter.

SPM: It primarily monitors Cassandra metrics and various OS and JVM metrics. Besides Cassandra, SPM also monitors Hadoop, Spark, Solr, Storm, ZooKeeper, and other Big Data platforms. The main features of SPM include correlation of events and metrics, distributed transaction tracing, real-time graphs with zooming, anomaly detection, and heartbeat alerting.

What is a Keyspace in Cassandra?

A keyspace is the outermost container for data in Cassandra. Like a relational database, a keyspace has a name and a set of attributes that define keyspace-wide behaviour. The keyspace is used to group Column families together.

How does Cassandra write?

Cassandra performs a write in two steps: first it appends the write to a commit log on disk, and then it writes it to an in-memory structure known as the memtable. Once both steps succeed, the write is considered successful. When the memtable fills up, its contents are flushed to disk as an SSTable (sorted string table). Because writes are sequential appends plus in-memory updates, Cassandra offers fast write performance.
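
The sequence above can be sketched as a toy model. Everything here is simplified and hypothetical: the class name `ToyWritePath`, the `memtable_limit` threshold, and the plain Python lists standing in for files on disk are assumptions for illustration only.

```python
class ToyWritePath:
    """Toy model of the write path: commit log, memtable, then SSTable flush."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory structure holding recent writes
        self.sstables = []     # immutable, sorted tables "on disk"
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # step 1: durable append
        self.memtable[key] = value            # step 2: in-memory update
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # An SSTable is sorted by key and never modified after it is written.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed data no longer needs to be replayed from the log.
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for stored_key, value in sstable:
                if stored_key == key:
                    return value
        return None
```

Note that an update to an existing key never rewrites an old SSTable; the newer value simply shadows the older one at read time.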

Explain the concept of tunable consistency in Cassandra.

Tunable consistency is a notable characteristic that makes Cassandra a favored database choice of developers, analysts, and big data architects. Consistency refers to how up to date and synchronized a row of data is on all of its replicas. Cassandra’s tunable consistency allows users to select the consistency level best suited to their use case. It supports two kinds of consistency: eventual consistency and strong consistency.

The former guarantees consistency when no new updates are made on a given data item, i.e., all accesses return the last updated value eventually. Systems with eventual consistency are known to have achieved replica convergence.

For strong consistency, Cassandra supports the following condition:
R + W > N where,
N – Number of replicas
W – Number of nodes that need to agree for a successful write
R – Number of nodes that need to agree for a successful read
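
The condition can be checked with a couple of helper functions. The function names are hypothetical; the arithmetic follows directly from R + W > N, and the quorum helper assumes the usual definition of a majority of replicas.

```python
def is_strongly_consistent(n, w, r):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1
```

With three replicas, `quorum(3)` is 2, and reading and writing at quorum satisfies the condition because 2 + 2 > 3, whereas reading and writing a single replica does not because 1 + 1 is not greater than 3.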

What is a column family in Cassandra?

A column family in Cassandra is a collection of rows.

What are the main components of the Cassandra data model?

The main components of the Cassandra data model are:

* Cluster
* Keyspace
* Column
* Column family

How does Cassandra store data?

* All data is stored as bytes
* When you specify a validator, Cassandra ensures those bytes are encoded as required
* A comparator then orders the columns based on the ordering specific to the encoding
* Composites are just byte arrays with a specific encoding: for each component, Cassandra stores a two-byte length, followed by the byte-encoded component, followed by a termination byte

What is a YAML file in Cassandra?

The cassandra.yaml file is the main configuration file for Cassandra. After changing properties in the cassandra.yaml file, you must restart the node for the changes to take effect.

What is CQLSH? And why is it used?

CQLSH is the command-line shell for CQL. It is used to communicate with the Cassandra database interactively. By using cqlsh, you can do the following things:

* Define a schema
* Insert data
* Execute a query

What are the Different types of Data Model?

There are three main types/stages of data model:

* Conceptual Data Model
* Logical Data Model
* Physical Data Model

What is Graph DB? Explain with an example.

A graph DB is a type of NoSQL database that uses a flexible graph representation of data. Its key purpose is to store the relationships between nodes. Neo4j is a well-known example.

What is Column Store DB? Explain with an example.

Data is stored in cells that are grouped in columns of data rather than in rows of data. Columns are logically grouped into column families.
One row may hold one or multiple data records and is indexed by a partition key. Cassandra and HBase are examples.

What is Key-Value Store DB? Explain with an example.

All of the data within the database consists of an indexed key and a value. A key may correspond to one or multiple values (as in a hash table). This model provides great performance and can be scaled very easily as business needs grow. Redis is an example.

What are the functions of Cassandra?

This database supports two main categories of functions:

Scalar functions: They take a number of values and produce an output from them.

Aggregate functions: They produce a combined result from the multiple rows selected by a query.

What are the main components of Cassandra?

The components of Cassandra include:

* Node
* Data center
* Commit log
* Cluster
* Memtable
* SSTable
* Bloom filter

Name the features of Cassandra.

Cassandra has become famous for its outstanding technical features. Here are some features you must know:

* Elastic scalability
* Always-on architecture
* Fast linear-scale performance
* Flexible data storage
* Easy data distribution
* Transaction support through atomic, isolated, and durable writes with tunable consistency

What are the applications of Cassandra?

Cassandra has become a popular choice for many companies when it comes to app development and data management. Even new start-ups prefer it because of the ease with which an operator can work with it.

Cassandra is a good fit for applications where data is collected at high speed from different kinds of sources, such as Internet of Things applications. It can also be used in product and retail apps, messaging, social media analytics, and recommendation engines.

Describe the benefits of using Cassandra?

Cassandra is easy to work with and offers several beneficial features, including high performance, fault tolerance, predictable scaling, and a distributed architecture. It is also preferred because it is an open-source, distributed NoSQL database management system.

What is Apache Cassandra?

Cassandra is an open-source, distributed, and decentralized database. It is used for managing large amounts of structured data spread across many servers.

What is a composite type in Cassandra?

In Cassandra, a composite type allows you to define a key or a column name as a concatenation of data of different types. You can use a composite type in two places:

* Row Key
* Column Name

What is Cassandra used for, and why use it?

Cassandra was designed to handle big data workloads across multiple nodes without any single point of failure. The main reasons for using Cassandra are:

* It is fault tolerant, with tunable consistency
* It scales from gigabytes to petabytes
* It is a column-oriented database
* No single point of failure
* No need for a separate caching layer
* Flexible schema design
* It has flexible data storage, easy data distribution, and fast writes
* It provides atomic, isolated, and durable writes with tunable consistency, rather than full ACID (Atomicity, Consistency, Isolation, and Durability) transactions as in a relational database
* Multi-data center and cloud capable
* Data compression

Define replication strategy.

These strategies define how the replicas are placed in a cluster. There are mainly two types of replication strategy:

* Simple Strategy
* Network Topology Strategy

Define replication factor.

The data in a node undergoes replication. The data is copied from one node to another to ensure fault tolerance. The replication factor is the number of copies of the data that are sent to different nodes.

Why was Apache Cassandra developed?

Cassandra is a distributed database management system. It was initially developed at Facebook to power the Facebook inbox search feature and improve its performance. Due to its outstanding technical features, Cassandra became very popular and became a top-level Apache project.
