Cassandra Interview Questions

Cassandra Interview Questions | Freshers & Experienced

  • Sharad Jaiswal
  • 08th Dec, 2021

About Cassandra

When you need scalability and high availability without jeopardizing performance, the Cassandra database is the right choice. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data. Cassandra's support for replication across multiple datacenters is best-in-class, giving your users lower latency and giving you the comfort that you can survive regional outages.

Key Features of Cassandra

Below are a few major features of Cassandra:

Distributed

Scalability

Fault-tolerance

MapReduce Support

Practice Best Cassandra Interview Questions & Answers

Q1. What is Cassandra?

Apache Cassandra is an open-source distributed database management system. It is a NoSQL database that offers high scalability and availability, and it is built to handle large amounts of data across multiple data centers and the cloud.

Q2. List a few features of Apache Cassandra.

The features of Apache Cassandra are:

  • High Scalability
  • Always-on Architecture (no single point of failure)
  • Fast Linear-scale Performance
  • Fault-tolerant
  • Flexible Data Storage
  • Easy Data Distribution
  • Lightweight Transaction Support (compare-and-set)

Q3. List the data storage units available in Cassandra?

The data storage units available in Cassandra are Cluster; Node; Keyspace; Column Family; Row; Column.

Q4. What is Cluster in Cassandra?

A cluster is the outermost storage container in Cassandra. It is a collection of nodes that together operate as a single system.

Q5. What is SSTable?

SSTable stands for Sorted String Table. It is an immutable file on disk that stores a set of row fragments in sorted order based on row keys.

Q6. What is memtable?

A memtable is a write-back cache of data rows that can be looked up by key. Unlike a write-through cache, writes are batched up in the memtable until it is full. When the memtable is full, it is flushed to disk as an SSTable.
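The flush behaviour described above can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: the `Memtable` class, its `max_rows` threshold, and the tuples standing in for SSTables are all assumptions made for illustration.

```python
class Memtable:
    """Toy write-back buffer: rows accumulate in memory, then flush in key order."""

    def __init__(self, max_rows=3):
        self.max_rows = max_rows  # hypothetical flush threshold
        self.rows = {}            # key -> value, held in memory
        self.sstables = []        # each flush appends one immutable, sorted table

    def write(self, key, value):
        self.rows[key] = value
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self):
        # An SSTable is immutable and sorted by key, so freeze a sorted copy.
        self.sstables.append(tuple(sorted(self.rows.items())))
        self.rows = {}


table = Memtable(max_rows=3)
for key, value in [("c", 3), ("a", 1), ("b", 2), ("d", 4)]:
    table.write(key, value)

print(table.sstables)  # one flushed table, sorted by key
print(table.rows)      # the fourth write is still only in memory
```

The third write fills the buffer and triggers the flush, so the fourth write starts a fresh memtable.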

Q7. Explain CAP Theorem.

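The CAP theorem, also known as Brewer's theorem, states that a distributed database system can guarantee at most two of the following three properties at the same time: Consistency, Availability, and Partition Tolerance. Cassandra is usually classed as an AP system, favouring availability and partition tolerance while letting you tune consistency per request.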

Q8. How to write a query in Cassandra?

We can write a query in Cassandra by using Cassandra Query Language or CQL. It can be written as described in the following steps:

Step 1 – start the CQL shell (cqlsh)

Step 2 – create and use a keyspace

Step 3 – describe and list keyspaces

Step 4 – create a table and insert records

Step 5 – display the records
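As a rough illustration of steps 2, 4 and 5, the Python sketch below lists the CQL statements one might run. The keyspace `demo_ks`, the table `users`, and the `RecordingSession` stand-in are hypothetical names introduced here so the example runs without a cluster; they are not part of Cassandra.

```python
# Hypothetical keyspace and table names, used only for illustration.
STATEMENTS = [
    # Step 2: create a keyspace (single-node replication settings) and switch to it
    "CREATE KEYSPACE IF NOT EXISTS demo_ks "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}",
    "USE demo_ks",
    # Step 4: create a table and insert a record
    "CREATE TABLE IF NOT EXISTS users (user_id int PRIMARY KEY, name text)",
    "INSERT INTO users (user_id, name) VALUES (1, 'Ada')",
    # Step 5: display the stored rows
    "SELECT * FROM users",
]


def run_statements(session, statements):
    """Execute each CQL statement in order and return the last result."""
    result = None
    for statement in statements:
        result = session.execute(statement)
    return result


class RecordingSession:
    """Stand-in for a driver session so the sketch runs without a cluster."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)
        return []


session = RecordingSession()
run_statements(session, STATEMENTS)
print(len(session.executed))
```

With the DataStax Python driver (`cassandra-driver`), a real session would typically come from `Cluster(["127.0.0.1"]).connect()`, assuming a node is listening locally. Step 3 (`DESCRIBE KEYSPACES`) is normally typed directly in cqlsh, so it is left out of the list.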

Q9. Explain Cqlsh.

cqlsh is a command-line shell for interacting with Cassandra through CQL, the Cassandra Query Language.

Q10. What is Super Column in Cassandra?

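A super column is a special kind of column. Like a regular column it is a key-value pair, but its value is a map of sub-columns. Super columns belong to the legacy Thrift data model and are not used with CQL. Column families are generally stored on disk in individual files.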

Q11. What is the difference between Column and Super Column in Cassandra?

Column - A column is a tuple of name, value, and timestamp.

Super column - A super column is a column whose value is a collection of regular columns. It cannot contain another super column.

Q12. What is Thrift?

Thrift is an interface definition language and binary communication protocol that is used to define and create services across numerous programming languages. Cassandra's original client API was built on Thrift; it has since been superseded by CQL and the native protocol.

Q13. Explain the different Logging levels available in Cassandra.

The different Logging levels available in Cassandra are:

  • TRACE
  • DEBUG
  • INFO
  • WARN
  • ERROR
  • FATAL

Q14. What is Tombstone in Cassandra?

Instead of removing data from disk immediately when it is deleted, Cassandra writes a special marker, known as a tombstone, to show that the data has been deleted. The tombstone prevents the deleted data from being returned during reads. The data is physically removed later, during compaction, once the grace period (gc_grace_seconds) has passed.
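A minimal Python sketch of the idea, assuming a single toy replica that keeps one timestamped cell per key. The `Replica` class and the `TOMBSTONE` sentinel are illustrative names, not Cassandra internals.

```python
TOMBSTONE = object()  # sentinel marking a deleted value


class Replica:
    def __init__(self):
        self.cells = {}  # key -> (timestamp, value or TOMBSTONE)

    def write(self, key, value, timestamp):
        current = self.cells.get(key)
        # The write with the newest timestamp wins.
        if current is None or timestamp >= current[0]:
            self.cells[key] = (timestamp, value)

    def delete(self, key, timestamp):
        # A delete is just another write whose value is the tombstone marker.
        self.write(key, TOMBSTONE, timestamp)

    def read(self, key):
        cell = self.cells.get(key)
        if cell is None or cell[1] is TOMBSTONE:
            return None
        return cell[1]


replica = Replica()
replica.write("user:1", "Ada", timestamp=1)
replica.delete("user:1", timestamp=2)
replica.write("user:1", "Ada", timestamp=1)  # stale write arriving late is ignored

print(replica.read("user:1"))      # None: the tombstone hides the old value
print("user:1" in replica.cells)   # True: the marker is still stored
```

Because the tombstone carries a newer timestamp than the stale write, the deleted value cannot reappear, which is why the marker has to be kept for a while rather than dropped at once.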

Q15. Name the ports used by Cassandra?

The default ports used by Cassandra are:

  • 7000 for cluster (inter-node) communication
  • 7001 for cluster communication if SSL is enabled
  • 9042 for native protocol (CQL) clients
  • 7199 for JMX

Q16. What is Key-Value Store DB?

A key-value database is a type of non-relational database that stores data as a collection of key-value pairs, where each key serves as a unique identifier for its value.

Q17. Describe the different consistency levels for read operation in Cassandra?

The consistency level of a read determines how many replicas must respond before the result is returned to the client. The different consistency levels for read operations in Cassandra are:

ALL - All replicas for the partition must respond. The read fails if any replica does not respond.

EACH_QUORUM - A quorum of replicas in each datacenter must respond. Strong consistency. Older Cassandra versions support this level only for writes.

QUORUM - A quorum of replicas across all datacenters must respond.

LOCAL_QUORUM - A quorum of replicas in the same datacenter as the coordinator must respond. Strong consistency within that datacenter.

ONE - The closest replica must respond.

TWO - The two closest replicas must respond.

THREE - The three closest replicas must respond.

LOCAL_ONE - The closest replica in the local datacenter must respond.

ANY - Applies to writes only (the write must reach at least one node), so it is not available for reads.

Q18. What do you mean by Column Family?

A column family is a database object that contains columns of related data. It consists of key-value pairs, where each key is mapped to a value that is a set of columns. In CQL, a column family is called a table.

Q19. Explain the concept of compaction in Cassandra?

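Compaction is the background process in which Cassandra takes one or more SSTables and merges them into new SSTables. During the merge it keeps the most recent version of each row and discards obsolete data, such as overwritten values and expired tombstones.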
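The merge step can be sketched in Python as follows. This is a simplified model: each SSTable is a plain dictionary of `key -> (timestamp, value)`, `None` stands in for a tombstone, and every tombstone is assumed to be past its grace period so that it can be purged.

```python
TOMBSTONE = None  # in this sketch, None marks a deleted row


def compact(sstables):
    """Merge SSTables, keeping the newest version of each key and dropping tombstones."""
    newest = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)
    # The output is a single table sorted by key; deleted rows are purged.
    return {k: newest[k] for k in sorted(newest) if newest[k][1] is not TOMBSTONE}


old = {"a": (1, "v1"), "b": (1, "v1")}
new = {"a": (2, "v2"), "b": (3, TOMBSTONE), "c": (2, "v1")}

merged = compact([old, new])
print(merged)
```

Here `a` keeps its newer value, `b` disappears because its newest version is a tombstone, and `c` is carried over unchanged.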

Q20. What are partitions and Tokens in Cassandra?

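A partition is the set of rows that share the same partition key. The partition key is used to locate the data for read and write operations, and all rows of a partition are stored together on the same replica nodes.

A token is the hashed value of the partition key, computed by the 'partitioner'. Each node owns one or more ranges of tokens, so the token determines which nodes store a given partition.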
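A minimal sketch of how a token maps a partition key to a node, under stated assumptions: MD5 is used here only as a convenient stand-in hash (Cassandra's default partitioner uses Murmur3), each node owns a single token, and the `TokenRing` class and node names are hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a partition key to an integer token (MD5 stand-in for the partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class TokenRing:
    def __init__(self, node_tokens):
        # node_tokens: {node name: token owned by that node}
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]

    def owner(self, partition_key):
        # The first node whose token is >= the key's token owns the key;
        # past the last token, wrap around to the start of the ring.
        index = bisect.bisect_left(self.tokens, token_for(partition_key))
        return self.ring[index % len(self.ring)][1]


MAX_TOKEN = 2 ** 128  # MD5 produces 128-bit values
ring = TokenRing({
    "node1": MAX_TOKEN // 4,
    "node2": MAX_TOKEN // 2,
    "node3": MAX_TOKEN - 1,
})

print(ring.owner("user:42"))
```

The same key always hashes to the same token, so every coordinator that knows the ring agrees on which node owns it.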

Q21. What do you mean by Tunable Consistency?

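Cassandra gives you a lot of control over data consistency. With tunable consistency you can set the consistency level (CL) for each individual read and write request, trading stronger consistency against lower latency and higher availability.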
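The trade-off can be made concrete with a little arithmetic. The sketch below uses the usual rule of thumb that reads see the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor; the function names are illustrative.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: a strict majority."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)

print(q)                                  # 2
print(is_strongly_consistent(q, q, rf))   # True: QUORUM reads + QUORUM writes
print(is_strongly_consistent(1, 1, rf))   # False: ONE reads + ONE writes
```

With a replication factor of 3, QUORUM reads and writes overlap on at least one replica, while ONE and ONE may touch different replicas and return stale data.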

Q22. Define commit log in Cassandra?

The commit log is an append-only log of all mutations local to a Cassandra node. Any data that is written is first appended to the commit log and only then written to a memtable.

Q23. What is a YAML file in Cassandra?

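The cassandra.yaml file is the main configuration file for Cassandra. It holds settings such as the cluster name, seed nodes, and data directories. In DataStax Enterprise (DSE), cassandra.yaml is still the main configuration file for the database, while dse.yaml is the primary configuration file for DSE features such as security.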

Q24. What are durable writes?

In Cassandra, durable writes are writes to a replica node that are recorded both in memory and in a commit log on disk before they are acknowledged as a success. If a crash or server failure occurs before the memtables are flushed to disk, the commit log is replayed on restart to recover any lost writes.
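The recovery path can be sketched in Python. In this toy model a plain list plays the role of the on-disk commit log and survives the "crash", while the memtable dictionary does not; the `Node` class is a hypothetical stand-in.

```python
class Node:
    def __init__(self, commit_log=None):
        # The commit log stands in for a file on disk that survives a restart.
        self.commit_log = commit_log if commit_log is not None else []
        self.memtable = {}
        # On startup, replay the log to rebuild any unflushed memtable contents.
        for key, value in self.commit_log:
            self.memtable[key] = value

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the durable log first
        self.memtable[key] = value            # 2. then update the in-memory table
        return "ack"                          # 3. only now acknowledge the write


node = Node()
node.write("user:1", "Ada")
node.write("user:2", "Grace")

disk = node.commit_log   # what is left on disk after a crash
del node                 # the in-memory memtable is lost

restarted = Node(commit_log=disk)
print(restarted.memtable)
```

Because the log entry is written before the acknowledgement, every acknowledged write can be recovered by the replay.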

Q25. What is Hinted Handoff?

Hinted handoff is a feature that helps the cluster stay consistent when a replica node is temporarily unavailable, for example because of network issues or other problems. If a write succeeds but a node that owns a replica cannot accept it, the coordinator stores a hint and delivers the write to that node once it is available again.
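A small Python sketch of the mechanism, assuming a toy coordinator that keeps hints in a list. The `Replica` and `Coordinator` classes and their fields are illustrative and leave out details such as hint expiry.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # (target replica, key, value) for writes that were missed

    def write(self, key, value):
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
            else:
                # Remember the write so it can be delivered later.
                self.hints.append((replica, key, value))

    def replay_hints(self):
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining


a, b = Replica("a"), Replica("b")
coordinator = Coordinator([a, b])

b.up = False
coordinator.write("user:1", "Ada")   # b misses the write; a hint is stored
missed = dict(b.data)

b.up = True
coordinator.replay_hints()           # b receives the write it missed
print(missed, b.data)
```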

Q26. What is a Column Family?

A column family is a database object that contains columns of related data. It consists of key-value pairs, where each key is mapped to a value that is a set of columns.

Q27. What is Composite partitioning key?

A composite partition key is a partition key that uses two or more columns to identify where data will reside. It is typically used when the data for a single column value would be too large to reside in one partition, so an additional column splits it into smaller partitions.
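The effect on partition size can be sketched with plain Python. The sensor readings and the `partition_sizes` helper below are made up for illustration; the point is only that adding a second column to the partition key splits one large partition into several smaller ones.

```python
from collections import defaultdict

# Hypothetical rows: (sensor_id, day, time, temperature)
readings = [
    ("sensor-1", "2021-12-07", "09:00", 20.5),
    ("sensor-1", "2021-12-07", "10:00", 21.0),
    ("sensor-1", "2021-12-08", "09:00", 19.8),
    ("sensor-1", "2021-12-08", "10:00", 20.1),
]


def partition_sizes(rows, key_columns):
    """Count rows per partition for a partition key made of the given column indexes."""
    sizes = defaultdict(int)
    for row in rows:
        sizes[tuple(row[i] for i in key_columns)] += 1
    return dict(sizes)


single = partition_sizes(readings, [0])        # partition key: sensor_id
composite = partition_sizes(readings, [0, 1])  # partition key: (sensor_id, day)

print(single)
print(composite)
```

With `sensor_id` alone, every reading from a sensor lands in one ever-growing partition; with `(sensor_id, day)` each day gets its own partition.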

Q28. What is Gossip Protocol?

A gossip protocol is a peer-to-peer communication process, modelled on the way epidemics spread, that ensures data is disseminated to all members of a group. Cassandra nodes use gossip to exchange state information about themselves and about the other nodes they know of.
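A rough Python simulation of the idea, under simplifying assumptions: each node's view is a dictionary of `node -> heartbeat version`, every node talks to one random peer per round, and both sides end up with the merged view. The function names and the five-node setup are illustrative only.

```python
import random


def merge(local, remote):
    """Keep the highest heartbeat version seen for every node."""
    merged = dict(local)
    for node, version in remote.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_round(views, rng):
    """Each node exchanges state with one randomly chosen peer."""
    names = list(views)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        combined = merge(views[name], views[peer])
        views[name] = combined
        views[peer] = dict(combined)


# Each node starts out knowing only its own heartbeat version.
views = {f"node{i}": {f"node{i}": 1} for i in range(5)}
rng = random.Random(0)  # seeded so the run is repeatable

rounds = 0
while any(len(view) < len(views) for view in views.values()) and rounds < 1000:
    gossip_round(views, rng)
    rounds += 1

print(rounds, views["node0"])
```

No node ever contacts the whole cluster directly, yet after a handful of rounds every node has heard about every other node.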

Q29. What are the main components of Cassandra Data Model?

The main components of the Cassandra Data Model are:

Keyspaces - Keyspaces are the containers of data. Replication is configured per keyspace.

Tables - Tables contain a set of columns and a primary key, and store the data in a set of rows.

Columns - Columns define the structure of data in a table.

About Author :

  • Author of Cassandra Interview Questions

    Sharad Jaiswal

    My name is Sharad Jaiswal, and I am the founder of Conax web Solutions. My tech stacks are PHP, NodeJS, Angular, React. I love to write technical articles and programming blogs.
