APACHE CASSANDRA
TODAY WE'LL COVER
• Introduction to Apache Cassandra
• Brief history of Apache Cassandra
• What is NoSQL?
• Difference between Relational and NoSQL Databases
• Features of Apache Cassandra
• Who is using Apache Cassandra?
Introduction to Apache Cassandra
Apache Cassandra is a leading NoSQL distributed data management system. It is a wide-column store that drives many of today's business applications by offering continuous availability, high scalability, and high performance.
Cassandra handles very large amounts of data through its distributed architecture.
Cassandra is implemented in Java.
• Apache Cassandra was designed with the understanding that hardware and system failures can and do occur.
• It is a peer-to-peer distributed system.
• It partitions the data and controls data replication.
• Reads and writes can go to any node.
• Nodes in the cluster communicate with each other through a gossip protocol.
• The commit log ensures data durability.
• Data is also written to an in-memory structure (the memtable).
[Figure: a client sends data to a node of the distributed database.]
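The write path described above (commit log for durability, memtable in memory) can be sketched roughly as follows. This is a toy model, not Cassandra's actual implementation: the class `TinyNode`, its methods, the JSON log format, and the `demo` helper are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


class TinyNode:
    """Toy stand-in for one node's write path; not Cassandra's real code."""

    def __init__(self, commit_log_path):
        self.commit_log_path = commit_log_path
        self.memtable = {}  # in-memory structure: row key -> {column: value}

    def write(self, row_key, column, value):
        # 1. Append the mutation to the commit log and force it to disk.
        record = {"row_key": row_key, "column": column, "value": value}
        with open(self.commit_log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Apply the same mutation to the memtable.
        self.memtable.setdefault(row_key, {})[column] = value

    def recover(self):
        # Rebuild the memtable by replaying the commit log after a crash.
        self.memtable = {}
        if not os.path.exists(self.commit_log_path):
            return
        with open(self.commit_log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                row = self.memtable.setdefault(record["row_key"], {})
                row[record["column"]] = record["value"]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")
        node = TinyNode(path)
        node.write("user:1", "name", "Ada")
        node.write("user:1", "city", "London")
        restarted = TinyNode(path)  # simulates a restart with empty memory
        restarted.recover()
        return restarted.memtable
```

Calling `demo()` writes two columns, discards the in-memory state, and rebuilds it from the log alone, which is the sense in which the commit log provides durability.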
Structure of a Column Store Database
A keyspace is roughly the equivalent of a schema in the relational model.
A column family consists of multiple rows. Its building blocks are:
1. Row Key
2. Column
3. Value
4. Timestamp
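As a rough illustration of how these four pieces fit together, the sketch below models a keyspace as a dictionary of column families, each mapping row keys to named columns that carry a value and a timestamp. The class names, the sample data, and the rule that the newest timestamp wins are assumptions made for this example and are not taken from the slides.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    value: str
    timestamp: int  # microseconds since the epoch in this sketch


@dataclass
class ColumnFamily:
    name: str
    rows: dict = field(default_factory=dict)  # row key -> {column name: Column}

    def put(self, row_key, column_name, value, timestamp=None):
        if timestamp is None:
            timestamp = time.time_ns() // 1000
        row = self.rows.setdefault(row_key, {})
        current = row.get(column_name)
        # Assumed rule: the write with the newest timestamp wins.
        if current is None or timestamp >= current.timestamp:
            row[column_name] = Column(column_name, value, timestamp)

    def get(self, row_key, column_name):
        return self.rows[row_key][column_name].value


# A keyspace groups column families, much as a schema groups tables.
keyspace = {"users": ColumnFamily("users")}
users = keyspace["users"]
users.put("user:1", "name", "Ada", timestamp=100)
users.put("user:1", "email", "ada@example.com", timestamp=100)
users.put("user:2", "name", "Grace", timestamp=100)  # no email column
users.put("user:1", "name", "Ada L.", timestamp=200)  # newer write replaces
users.put("user:1", "name", "stale", timestamp=150)   # older write is ignored
```

Rows in the same column family do not need the same set of columns: here `user:2` simply has no `email` entry.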
Benefits of Column Store Databases
Compression, aggregation queries, scalability, and fast loading and querying.
Brief history of Apache Cassandra
• Avinash Lakshman and Prashant Malik initially developed Cassandra at Facebook to power the Facebook inbox search feature.
• Facebook open-sourced it in July 2008.
• The Apache Incubator accepted Cassandra in March 2009.
• At the time of this presentation, the latest version of Apache Cassandra was 3.11.4.
What is NoSQL?
A NoSQL database, also known as a "Not Only SQL" or "non-relational" database, does not require a fixed table schema, unlike a SQL database.
NoSQL databases are increasingly used in Big Data and real-time web applications.
The name "Not Only SQL" reflects that they may support SQL-like query languages.
They are also referred to as structured storage, a broader term that includes relational databases.
Difference between Relational and NoSQL Databases
Relational Database | NoSQL Database
Handles data arriving at low velocity | Handles data arriving at high velocity
Data arrives from one or a few locations | Data arrives from many locations
Manages structured data | Manages structured, unstructured, and semi-structured data
Handles moderate data volumes | Handles very high data volumes
Deployed vertically (scale up) | Deployed horizontally (scale out)
FEATURES OF APACHE
CASSANDRA
Distributed
Every node in the cluster has the same role. Data is distributed across the cluster, so each node holds different data, but there is no master: every node can service any request.
Linear Scale Performance
As more nodes are added, Cassandra's throughput increases roughly in proportion.
No Single Point of Failure
Cassandra replicates data across different nodes, so no single node is a point of failure.
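One way to picture partitioning and replication together is a hash ring: each row key hashes to a position on the ring, and the next few nodes clockwise hold its replicas. The sketch below is a simplified illustration under that assumption. The `Ring` class, the node names, the use of MD5, and the replication factor of 3 are all hypothetical, and Cassandra's actual partitioners and replication strategies differ in detail.

```python
import bisect
import hashlib


def token(value):
    # Map a string onto the ring; MD5 is used only as a stable hash here.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor=3):
        if replication_factor > len(nodes):
            raise ValueError("replication factor exceeds node count")
        self.replication_factor = replication_factor
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, row_key):
        # First node clockwise from the key's token, then its successors.
        start = bisect.bisect_left(self.tokens, token(row_key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replication_factor)]

    def live_replicas(self, row_key, down):
        # Replicas that can still serve the key when some nodes are down.
        return [n for n in self.replicas(row_key) if n not in down]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"],
            replication_factor=3)
owners = ring.replicas("user:1")
survivors = ring.live_replicas("user:1", down={owners[0]})
```

Because every key lives on three distinct nodes in this setup, losing any one of them still leaves two copies available, which is the idea behind having no single point of failure.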
Fault Detection and Recovery
Failed nodes can easily be restored and recovered.
Data Compression
Cassandra can compress data by up to 80% without noticeable overhead. It supports a range of data types with fast writes and reads.
Cassandra Query Language
Cassandra provides a query language (CQL) that is similar to SQL, which makes the move to Cassandra easier for relational database developers.
Who is using Apache Cassandra?
