This document provides an introduction to Apache Cassandra, a distributed NoSQL database. It discusses Cassandra's history and development at Facebook and its key features, including distributed architecture, data replication, fault tolerance, and linear scalability. It also compares relational and NoSQL databases, and lists some major companies that use Cassandra, such as Netflix, Apple, and eBay.
2. Introduction to Apache Cassandra
Today we'll cover:
Brief history of Apache Cassandra
What is NoSQL?
Difference between relational and NoSQL databases
Features of Apache Cassandra
Who is using Apache Cassandra?
3. Introduction to Apache Cassandra
Apache Cassandra is a leading distributed NoSQL data management system, a wide-column store, that drives many of today's business applications by offering continuous availability, high scalability, and high performance.
Cassandra handles huge amounts of data with its distributed architecture.
Cassandra is implemented in Java.
4. Introduction to Apache Cassandra
Apache Cassandra was designed with the understanding that hardware and system failures can occur.
It is a peer-to-peer distributed system.
It partitions the data and controls data replication.
Clients can read from or write to any node.
Nodes use a gossip protocol to communicate with one another across the cluster.
The commit log ensures data durability.
Data is also written to an in-memory structure (the memtable).
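The last two points can be illustrated with a minimal Python sketch of a single node's write path. It assumes a simplified model: the `Node` class and its `crash` and `recover` methods are hypothetical names for illustration, and the commit log is a Python list standing in for an append-only file on disk. It is not Cassandra's actual implementation.

```python
import json


class Node:
    """Toy single-node write path: commit log first, then memtable."""

    def __init__(self):
        self.commit_log = []  # stands in for an append-only file on disk
        self.memtable = {}    # in-memory structure holding recent writes

    def write(self, key, value):
        # 1. Append the mutation to the commit log so it survives a crash.
        self.commit_log.append(json.dumps({"key": key, "value": value}))
        # 2. Apply the same mutation to the memtable.
        self.memtable[key] = value

    def read(self, key):
        return self.memtable.get(key)

    def crash(self):
        # Memory is lost; the commit log (on disk in a real system) is not.
        self.memtable = {}

    def recover(self):
        # Replay the commit log in order to rebuild the memtable.
        for entry in self.commit_log:
            record = json.loads(entry)
            self.memtable[record["key"]] = record["value"]


node = Node()
node.write("user:1", "Ada")
node.write("user:1", "Ada Lovelace")
node.crash()
node.recover()
print(node.read("user:1"))  # prints "Ada Lovelace"
```

Because every write reaches the log before the memtable, replaying the log after a crash restores the in-memory state, which is the sense in which the commit log provides durability.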
7. Structure of a Column Store Database
A keyspace is roughly analogous to a schema in the relational model.
A column family consists of multiple rows. Each row is made up of:
1. Row Key
2. Column
3. Value
4. Timestamp
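As a rough illustration of how these four elements fit together, here is a small Python sketch. The class names `Cell` and `ColumnFamily` are hypothetical, and the rule that the newest timestamp wins is stated here as an assumption about how the timestamp is used; it is a simplified model, not Cassandra's storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cell:
    value: str
    timestamp: int  # write time, used to pick the newest value


@dataclass
class ColumnFamily:
    # row key -> column name -> Cell
    rows: Dict[str, Dict[str, Cell]] = field(default_factory=dict)

    def put(self, row_key, column, value, timestamp):
        row = self.rows.setdefault(row_key, {})
        current = row.get(column)
        # Last write wins: keep the cell with the highest timestamp.
        if current is None or timestamp >= current.timestamp:
            row[column] = Cell(value, timestamp)

    def get(self, row_key, column) -> Optional[str]:
        cell = self.rows.get(row_key, {}).get(column)
        return cell.value if cell else None


# A keyspace groups column families, much as a schema groups tables.
keyspace = {"users": ColumnFamily()}
users = keyspace["users"]
users.put("user:1", "email", "old@example.com", timestamp=100)
users.put("user:1", "email", "new@example.com", timestamp=200)
users.put("user:1", "email", "stale@example.com", timestamp=150)  # late, older write
print(users.get("user:1", "email"))  # prints "new@example.com"
```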
Benefits of Column Store Databases
Compression, aggregation queries, scalability, and fast loading and querying.
8. Brief history of Apache Cassandra
Avinash Lakshman and Prashant Malik initially developed Cassandra at Facebook to power the Facebook inbox search feature.
Facebook open-sourced it in July 2008.
The Apache Incubator accepted Cassandra in March 2009.
As of this presentation, the latest version of Apache Cassandra is 3.11.4.
9. What is NoSQL?
A NoSQL database, also known as a "Not Only SQL" or "non-relational" database, does not require a fixed table schema, unlike a SQL database.
NoSQL databases are increasingly used in Big Data and real-time web applications.
The name "Not Only SQL" reflects that they may support SQL-like query languages.
NoSQL systems are sometimes referred to as structured storage, a term that includes relational databases as a subset.
10. Difference between Relational Database and NoSQL Database
Relational Database | NoSQL Database
Handles data coming in at low velocity | Handles data coming in at high velocity
Data arrive from one or a few locations | Data arrive from many locations
Manages structured data | Manages structured, unstructured, and semi-structured data
Handles data in moderate volume | Handles data in very high volume
Deployed in a vertical fashion | Deployed in a horizontal fashion
12. Distributed
Every node in the cluster has the same role. Data is distributed across the cluster (so each node holds different data), but there is no master, as every node can service any request.
Linear Scale Performance
As more nodes are added, Cassandra's throughput increases roughly in proportion.
No Single Point of Failure
Cassandra replicates data across different nodes, which ensures there is no single point of failure.
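A minimal Python sketch of how data can be spread over peer nodes and replicated is shown below. It uses a hash ring in which each key is stored on the next few nodes clockwise from its hash. The `Ring` class, the node names, the use of MD5, and the replication factor of 3 are all assumptions chosen for illustration, not a description of Cassandra's actual partitioner or replication strategies.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Hash a key onto the ring. MD5 is used only for a stable, even spread.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Each node owns one position (token) on the ring, sorted by token.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key):
        # Walk clockwise from the key's token, taking the next RF nodes.
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.entries)
        count = min(self.replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], replication_factor=3)
owners = ring.replicas("user:1")
print(owners)  # three distinct nodes; which ones depends on the hash

# If one replica fails, the remaining replicas can still serve the key.
survivors = [n for n in owners if n != owners[0]]
print(survivors)
```

No node is special in this scheme: any node can compute the replica list for a key, which is why there is no master, and losing one replica still leaves other copies of the data.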
13. Fault Detection and Recovery
Failed nodes can easily be restored and
recovered.
Data Compression
Cassandra can compress data by up to 80% with little overhead. It supports commonly used data types, with fast writes and reads.
Cassandra Query Language (CQL)
Cassandra provides a query language that is similar to SQL, which makes it easier for relational database developers to move to Cassandra.
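To give a feel for how close CQL is to SQL, the Python sketch below holds a few CQL statements as strings and passes them to a session object. The keyspace `demo`, the table `users`, and the `PrintingSession` stand-in are hypothetical and exist only so the sketch runs without a cluster. With a real cluster, a session from the DataStax Python driver could be passed to `run_all` instead, as noted in the comment.

```python
# CQL statements, kept as plain strings so they can be read on their own.
STATEMENTS = [
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}",
    "CREATE TABLE IF NOT EXISTS demo.users ("
    "user_id int PRIMARY KEY, name text, email text)",
    "INSERT INTO demo.users (user_id, name, email) "
    "VALUES (1, 'Ada', 'ada@example.com')",
    "SELECT name, email FROM demo.users WHERE user_id = 1",
]


def run_all(session):
    # `session` is anything with an execute(statement) method, for example
    # a session from the DataStax Python driver:
    #   from cassandra.cluster import Cluster
    #   session = Cluster(["127.0.0.1"]).connect()
    return [session.execute(statement) for statement in STATEMENTS]


class PrintingSession:
    """Hypothetical stand-in so the sketch runs without a cluster."""

    def __init__(self):
        self.seen = []

    def execute(self, statement):
        self.seen.append(statement)
        print(statement)


session = PrintingSession()
run_all(session)
```

The table definition, insert, and select read much like their SQL counterparts. The main visible difference here is the keyspace definition, which carries replication settings.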