Cassandra for Barcodes, Products and Scans:
The Backend Infrastructure at Scandit

Christof Roduner
Co-founder and COO
christof@scandit.com
@scandit
www.scandit.com

February 1, 2012
AGENDA
2

- About Scandit
- Requirements
- Apache Cassandra
- Scandit backend
WHAT IS SCANDIT?
3

Scandit provides developers best-in-class tools to
build, analyze and monetize product-centric apps.

IDENTIFY: Products    ANALYZE: User Interest    MONETIZE: Apps
ANALYZE: THE SCANALYTICS PLATFORM
4

- Tool for app publishers
- App-specific usage statistics

- Insights into consumer behavior:
  - What do users scan?
    - Product categories? Groceries, electronics, books, cosmetics, ...?
  - Where do users scan?
    - At home? Or while in a retail store?
  - Top products and brands

- Identify new opportunities:
  - Customer engagement
  - Product interest
  - Cross-selling and up-selling
BACKEND REQUIREMENTS
5

- Product database
  - Many millions of products
  - Many different data sources
  - Curation of product data (filtering, etc.)

- Analysis of scans
  - Accept and store high volumes of scans
  - Generate statistics over extended time periods
  - Correlate with product data

- Provide reports to developers
BACKEND DESIGN GOALS
6

- Scalability
  - High-volume storage
  - High-volume throughput
  - Support a large number of concurrent client requests (apps)

- Availability

- Low maintenance
WHICH DATABASE?
7

Apache Cassandra
- Large, distributed key-value store (DHT)
- «NoSQL», polyglot persistence
- Inspired by:
  - Amazon's Dynamo distributed storage system
  - Google's BigTable data model
- Originally developed at Facebook
  - Inbox search
WHY DID WE CHOOSE IT?
8

- Looked very fast
  - Even when data is much larger than RAM

- Performs well in a write-heavy environment

- Proven scalability
  - Without downtime

- Tunable replication

- Easy to run and maintain
  - No sharding
  - All nodes are the same: no coordinators, masters, slaves, ...

- Data model
  - YMMV
WHAT YOU HAVE TO GIVE UP
9

- Joins
- Referential integrity
- Transactions
- Expressive query language
- Consistency (tunable, but...)

- Limited support for:
  - Schema
  - Secondary indices
CASSANDRA DATA MODEL
10

- Column families
- Rows
- Columns
- (Supercolumns)
  - We'll skip them; Cassandra developers don't like them

Disclaimer: I tend to say «hash» when I mean «dictionary, map, associative array». (Can you tell my favorite language?)
COLUMNS AND ROWS
11

- Column:
  - Is a name-value pair
  - Fully addressed by row key, CF and column name; carries a timestamp and a value
- Row:
  - Has exactly one key
  - Contains any number of columns
  - Columns are always automatically sorted by their name
- Column family:
  - A collection of any number of rows (!)
  - Has a name
  - «Like a table»
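To make these terms concrete, here is a minimal, hypothetical Python model of a column, a row and a column family. It only illustrates the nesting and the per-column timestamps; it is not Cassandra's storage format or client API.

    from dataclasses import dataclass

    @dataclass
    class Column:
        name: str        # column names also define the sort order within a row
        value: bytes     # treated as an opaque BLOB unless a type is declared
        timestamp: int   # supplied by the client, used later for conflict resolution

    # A row maps column names to columns; a column family maps row keys to rows
    # («like a table», except every row may have a different set of columns).
    Row = dict[str, Column]
    ColumnFamily = dict[str, Row]

    users: ColumnFamily = {
        "christof": {
            "email": Column("email", b"christof@scandit.com", 1328086800000),
            "phone": Column("phone", b"123-456-7890", 1328086800000),
        }
    }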
EXAMPLE COLUMN FAMILY
12

    "users": {

        "christof": {                          <- row with key «christof»
            "email": "christof@scandit.com",
            "phone": "123-456-7890"
        }

        "moritz": {
            "email": "moritz@scandit.com",     <- two columns, automatically
            "web": "www.example.com"              sorted by their names
        }                                         («email», «web»)

    }

- A column family «users» containing two rows
- Columns can be different in every row
  - First row has a column named «phone», second row does not
- Rows can have many columns
  - You can add millions of them
DATA IN COLUMN NAMES
13

- Column names can be used to store data
- Frequent pattern in Cassandra
- Takes advantage of column sorting

    "logins": {

        "christof": {
            "2012-01-29 16:22:30 +0100": "208.115.113.86",
            "2012-01-30 07:48:03 +0100": "66.249.66.183",
            "2012-01-30 18:06:55 +0100": "208.115.111.70",
            "2012-01-31 12:37:26 +0100": "66.249.66.183"
        }

        "moritz": {
            "2012-01-23 01:12:49 +0100": "205.209.190.116"
        }

    }
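Why this works, shown as a rough Python sketch (a model of the pattern only, not a Cassandra client call): because column names are kept sorted, a wide row behaves like a sorted map keyed by timestamp, and a contiguous slice of it answers «all logins between X and Y».

    import bisect

    # A wide row modeled as a plain dict: column name (timestamp string) -> value (IP).
    logins_christof = {
        "2012-01-29 16:22:30 +0100": "208.115.113.86",
        "2012-01-30 07:48:03 +0100": "66.249.66.183",
        "2012-01-30 18:06:55 +0100": "208.115.111.70",
        "2012-01-31 12:37:26 +0100": "66.249.66.183",
    }

    def column_slice(row: dict, start: str, end: str) -> dict:
        """Return the columns whose names fall in [start, end], mimicking a slice query."""
        names = sorted(row)                     # Cassandra keeps this order on disk
        lo = bisect.bisect_left(names, start)   # first column name >= start
        hi = bisect.bisect_right(names, end)    # first column name > end
        return {name: row[name] for name in names[lo:hi]}

    # All logins on 2012-01-30:
    print(column_slice(logins_christof,
                       "2012-01-30 00:00:00 +0100",
                       "2012-01-30 23:59:59 +0100"))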
SCHEMA AND DATA TYPES
14

- Schema is optional
- Data type can be defined for:
  - Keys
  - The values of all columns with a given name
  - The column names in a CF
- By default, data type BLOB is used

- Data types: BLOB (default), ASCII text, UTF8 text, Timestamp, Boolean,
  UUID, Integer (arbitrary length), Float, Double, Decimal
CLUSTER ORGANIZATION
15

[Diagram: a ring of four nodes with tokens 0, 64, 128 and 192. The range 1-64
is stored on node 2 (token 64); the range 65-128 is stored on node 3 (token 128).]
STORING A ROW
16

1. Calculate md5 hash for row key
   Example: md5("foobar") = 48

2. Determine data range for hash
   Example: 48 lies within range 1-64

3. Store row on node responsible for range
   Example: store on node 2

[Diagram: the four-node ring again; range 1-64 is stored on node 2,
range 65-128 on node 3.]
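The same placement rule, sketched in Python against the toy ring from the slide (tokens 0, 64, 128, 192). The md5 value is scaled down to 0-255 here only so that it fits the slide's numbers; the real RandomPartitioner works over a token space of 0 to 2^127.

    import hashlib
    from bisect import bisect_left

    # Toy ring from the slide: token -> node. Real tokens span 0..2**127.
    RING = {0: "node1", 64: "node2", 128: "node3", 192: "node4"}
    TOKENS = sorted(RING)

    def token_for(row_key: str, ring_size: int = 256) -> int:
        """md5-hash the row key and scale it onto the toy ring (illustration only)."""
        digest = hashlib.md5(row_key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % ring_size

    def node_for(row_key: str) -> str:
        """A node owns the range ending at its token; hashes past the last token wrap around."""
        t = token_for(row_key)
        idx = bisect_left(TOKENS, t)            # first node whose token is >= t
        return RING[TOKENS[idx % len(TOKENS)]]

    print(node_for("foobar"))   # prints the node owning «foobar» on this toy ring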
IMPLICATIONS
17

- Cluster automatically balanced
  - Load is shared equally between nodes
  - No hotspots

- Scaling out?
  - Easy
  - Divide data ranges by adding more nodes
  - Cluster rebalances itself automatically

- Range queries not possible
  - You can't retrieve «all rows from A-C»
  - Rows are not stored in their «natural» order
  - Rows are stored in order of their md5 hashes
IF YOU NEED RANGE QUERIES
18

Option 1: «Order Preserving Partitioner» (OPP)
- OPP determines the node based on a row's key instead of its hash
- Don't use it:
  - Manually balancing a cluster is hard
  - Hotspots
  - Balancing the cluster for one column family creates hotspots for another

Option 2: Use columns instead of rows
- Columns are always sorted
- Rows can store millions of columns
REPLICATION
19

- Tunable replication factor (RF)
- RF > 1: rows are automatically replicated to the next RF-1 nodes
- Tunable replication strategy
  - «Ensure two replicas in different data centers, racks, etc.»

[Diagram: on the four-node ring, replica 1 of row «foobar» sits on node 1
(token 0) and replica 2 on node 3 (token 128).]
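Continuing the toy ring, a sketch of the simplest placement rule (what Cassandra calls SimpleStrategy): the first replica goes to the node that owns the range, the remaining RF-1 replicas to the next nodes clockwise. Rack- and data-center-aware strategies like the one hinted at above may skip nodes to spread replicas further apart.

    from bisect import bisect_left

    TOKENS = [0, 64, 128, 192]                                   # toy ring again
    NODES = {0: "node1", 64: "node2", 128: "node3", 192: "node4"}

    def replicas(token: int, rf: int) -> list[str]:
        """Owner of the token's range plus the next rf-1 nodes clockwise."""
        start = bisect_left(TOKENS, token) % len(TOKENS)
        return [NODES[TOKENS[(start + i) % len(TOKENS)]] for i in range(rf)]

    print(replicas(48, rf=2))   # ['node2', 'node3']: the row survives the loss of either node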
CLIENT ACCESS
20

- Clients can send read and write requests to any node
  - This node will act as coordinator
- Coordinator forwards request to nodes where data resides

Request:

    insert(
        "foobar": { "email": "fb@example.com" }
    )

[Diagram: the client sends the request to node 4 (token 192), which acts as
coordinator and forwards it to the replicas of row «foobar» on node 1 (token 0)
and node 3 (token 128).]
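For completeness, a hedged sketch of what «send the request to any node» looks like from application code today, using the DataStax Python driver rather than the Thrift-based clients that were current in 2012; the keyspace and table names are made up for illustration.

    # Sketch only: pip install cassandra-driver. Keyspace/table are hypothetical.
    from cassandra.cluster import Cluster

    # Any of these contact points can coordinate a request; the driver discovers
    # the rest of the ring from whichever node it reaches first.
    cluster = Cluster(["10.0.0.1", "10.0.0.2"])
    session = cluster.connect("demo")            # hypothetical keyspace

    # The node that receives this statement acts as coordinator and forwards
    # the write to the replicas that own the row key "foobar".
    session.execute(
        "INSERT INTO users (id, email) VALUES (%s, %s)",
        ("foobar", "fb@example.com"),
    )
    cluster.shutdown()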
CONSISTENCY LEVELS
21

- For all requests, clients can set a consistency level (CL)

- For writes:
  - CL defines how many replicas must be written before «success» is returned to the client

- For reads:
  - CL defines how many replicas must respond before the result is returned to the client

- Consistency levels:
  - ONE
  - QUORUM
  - ALL
  - ... (data center-aware levels)
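A small worked example of how these levels interact with the replication factor (standard Cassandra arithmetic, shown here as a sketch): QUORUM means floor(RF/2) + 1 replicas, and a read is guaranteed to see the latest write whenever the read and write levels together cover more than RF replicas.

    def replicas_required(level: str, rf: int) -> int:
        """How many replicas a given consistency level needs to acknowledge."""
        return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[level]

    def read_sees_latest_write(read_cl: str, write_cl: str, rf: int) -> bool:
        """Strong consistency whenever the read and write sets must overlap: R + W > RF."""
        return replicas_required(read_cl, rf) + replicas_required(write_cl, rf) > rf

    rf = 3
    print(replicas_required("QUORUM", rf))                    # 2
    print(read_sees_latest_write("QUORUM", "QUORUM", rf))     # True
    print(read_sees_latest_write("ONE", "ONE", rf))           # False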
INCONSISTENT DATA
22

- Example scenario:
  - Replication factor 2
  - Two existing replicas for row «foobar»
  - Client overwrites existing columns in «foobar»
  - Replica 2 is down

- What happens:
  - Column is updated in replica 1, but not in replica 2 (even with CL=ALL!)

- Timestamps to the rescue
  - Every column has a timestamp
  - Timestamps are supplied by clients
  - Upon read, the column with the latest timestamp wins

- Use NTP
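A minimal sketch of that read-time reconciliation (last write wins per column, keyed on the client-supplied timestamp). This models the rule; it is not Cassandra's internal code.

    # Each replica's copy of a row: column name -> (timestamp, value).
    replica1 = {"email": (1328000000, "old@example.com"),
                "phone": (1328090000, "123-456-7890")}      # received the newer update
    replica2 = {"email": (1328000000, "old@example.com")}   # was down, missed the update

    def merge(*copies: dict) -> dict:
        """For every column, keep the version with the latest timestamp."""
        merged: dict = {}
        for copy in copies:
            for name, (ts, value) in copy.items():
                if name not in merged or ts > merged[name][0]:
                    merged[name] = (ts, value)
        return merged

    # What a read touching both replicas (or a read repair) would hand back:
    print(merge(replica1, replica2))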
PREVENTING INCONSISTENCIES
23

- Read repair: stale replicas detected during a read are updated in the background
- Hinted handoff: a coordinator stores writes for an unreachable node and replays them once it is back
- Anti-entropy: nodes periodically compare their data and repair differences
RETRIEVING DATA (API)
24

- At the row level, you can:
  - Get all rows
  - Get a single row by specifying its key
  - Get a number of rows by specifying their keys
  - Get a range of rows
    - Only with OPP, strongly discouraged

- At the column level, you can:
  - Get all columns
  - Get a single column by specifying its name
  - Get a number of columns by specifying their names
  - Get a range of columns by specifying the names of the first and last column

- Again: no ranges of rows
CASSANDRA QUERY LANGUAGE (CQL)
25

    UPDATE users SET
        "email" = "christof@scandit.com",
        "phone" = "123-456-7890"
        WHERE KEY = "christof";

    "users": {

        "christof": {
            "email": "christof@scandit.com",
            "phone": "123-456-7890"
        }

        "moritz": {
            "email": "moritz@scandit.com",
            "web": "www.example.com"
        }

    }
CASSANDRA QUERY LANGUAGE (CQL)
26

    SELECT * FROM users WHERE KEY = "christof";

    "users": {

        "christof": {
            "email": "christof@scandit.com",
            "phone": "123-456-7890"
        }

        "moritz": {
            "email": "moritz@scandit.com",
            "web": "www.example.com"
        }

    }
CASSANDRA QUERY LANGUAGE (CQL)
27

    SELECT FIRST 3 "2012-01-30 00:00:00 +0100" .. "2012-01-31 23:59:59 +0100"
        FROM logins
        WHERE KEY = "christof";

    "logins": {

        "christof": {
            "2012-01-29 16:22:30 +0100": "208.115.113.86",
            "2012-01-30 07:48:03 +0100": "66.249.66.183",
            "2012-01-30 18:06:55 +0100": "208.115.111.70",
            "2012-01-31 12:37:26 +0100": "66.249.66.183",
            "2012-02-09 10:27:28 +0100": "218.315.37.21"
        }

        "moritz": {
            "2012-01-23 01:12:49 +0100": "205.209.190.116"
        }

    }
SECONDARY INDICES
28

- Secondary indices can be defined for (single) columns
- Secondary indices only support the equality predicate (=) in queries
- Each node maintains an index for the data it owns
  - When an indexed column is queried, the request must be forwarded to all nodes
  - Sometimes better to manually maintain your own index (see the sketch below)
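A sketch of the «maintain your own index» pattern, using the same nested-hash view of column families as the earlier slides; the CF names and the city attribute are made up for illustration. The index is simply a second column family whose row key is the indexed value and whose column names are the matching row keys.

    # Data CF and hand-maintained index CF, both modeled as nested dicts
    # (CF names and the "city" attribute are hypothetical).
    users: dict[str, dict[str, str]] = {}            # user key -> columns
    users_by_city: dict[str, dict[str, str]] = {}    # city -> {user key: ""}

    def put_user(key: str, columns: dict) -> None:
        """Write the row and update the index in the same application-level step.
        The index row's *column names* carry the data; the values stay empty."""
        users[key] = columns
        users_by_city.setdefault(columns["city"], {})[key] = ""

    put_user("christof", {"email": "christof@scandit.com", "city": "Zurich"})
    put_user("moritz",   {"email": "moritz@scandit.com",   "city": "Zurich"})

    # «Query» by city: read one index row, then fetch the referenced rows by key.
    print([users[k] for k in users_by_city.get("Zurich", {})])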
PRODUCTION EXPERIENCE
29

- No stability issues
- Very fast
- Language bindings don't have the same quality
  - Out of sync, bugs
- Data model is a mental twist
- Design-time decisions sometimes hard to change
- Rudimentary access control
TRYING OUT CASSANDRA
30

- DataStax website
  - Company founded by Cassandra developers
  - Provides:
    - Documentation
    - Amazon Machine Image

- Apache website

- Mailing lists
CLUSTER AT SCANDIT
31

- Several nodes in two data centers
- Linux machines
- Identical setup on every node
  - Allows for easy failover
NODE ARCHITECTURE
32

[Diagram: requests from mobile apps and web browsers are handled by Phusion
Passenger (mod_passenger), which serves the website & REST API built with
Ruby on Rails and Rack; arrows lead to the other nodes of the cluster.]
THANK YOU!

www.scandit.com
