RDBMS vs. Other Data Stores for Scalability
ramki.g@directi.com
TechTalk 2009, IIIT Hyderabad
Scalability
  • Increase Resources → Increase Performance (Linearly)
  • Performance? Latency, Capacity, Throughput
  • Vertical Scalability (Scaling Up): Divide the functionality
  • Horizontal Scalability (Scaling Out): Divide the data
Relational Database
  • Table, Row, Column
  • Set, Item, Property
Relational Theory
  • Selection: SELECT
  • Filter: WHERE
  • Join: JOIN, LEFT JOIN, RIGHT JOIN
  • Correlation: SELECT a FROM A WHERE A.b IN (SELECT b FROM B WHERE B.a > A.a)
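The operators on this slide can be tried out with a small sketch using Python's built-in sqlite3 module. The tables A and B and their rows are made up for illustration, and the correlated subquery writes its inner predicate as B.a > A.a so that the reference to the outer row is explicit. RIGHT JOIN is left out because older SQLite versions do not support it.

```python
import sqlite3


def run_relational_examples():
    """Build two tiny in-memory tables and run one query per operator."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE A (a INTEGER, b INTEGER);
        CREATE TABLE B (a INTEGER, b INTEGER);
        INSERT INTO A VALUES (1, 10), (2, 20), (3, 30);
        INSERT INTO B VALUES (0, 10), (5, 20), (9, 40);
    """)
    results = {}
    # SELECT: choose which columns to return
    results["select"] = conn.execute(
        "SELECT a FROM A ORDER BY a").fetchall()
    # WHERE: keep only the rows that satisfy a predicate
    results["where"] = conn.execute(
        "SELECT a FROM A WHERE b > 10 ORDER BY a").fetchall()
    # JOIN: pair rows of A and B that share the same b
    results["join"] = conn.execute(
        "SELECT A.a, B.a FROM A JOIN B ON A.b = B.b ORDER BY A.a").fetchall()
    # LEFT JOIN: keep every row of A, with NULL where B has no match
    results["left_join"] = conn.execute(
        "SELECT A.a, B.a FROM A LEFT JOIN B ON A.b = B.b ORDER BY A.a").fetchall()
    # Correlated subquery: the inner query refers to the outer row (A.a)
    results["correlated"] = conn.execute(
        "SELECT a FROM A WHERE A.b IN "
        "(SELECT b FROM B WHERE B.a > A.a) ORDER BY a").fetchall()
    conn.close()
    return results
```

The correlated query is the expensive one: the inner SELECT has to be evaluated, at least logically, once per outer row.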
Relational Theory
  • Aggregation
  • Set Operators: Union, Intersection, Minus
  • Group By: MAX, MIN, SUM, AVG
Transactions: Atomicity
  • Transaction level (multiple statements): An entire logical operation is a transaction
  • Statement level (multiple records): Each statement either succeeds or fails, with no partial success
  • Record level: All modifications to a record succeed or none do
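Transaction-level atomicity can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, the CHECK constraint and the function names are assumptions made for this example. The credit is deliberately issued before the debit so that a failing debit shows the earlier, already successful statement being undone.

```python
import sqlite3


def make_accounts_db():
    """Create an in-memory database with two accounts (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst and debit src; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            # Violates the CHECK constraint if src would go negative
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

Calling transfer(conn, 1, 2, 500) on the sample data returns False and leaves both balances untouched, even though the first UPDATE had already succeeded inside the transaction.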
Transactions: Consistency
  • Integrity Constraints
  • Referential Integrity
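As a rough illustration of referential integrity, the sketch below uses Python's built-in sqlite3 module. The orders and order_item tables are hypothetical. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3


def make_orders_db():
    """Create a parent table and a child table that references it."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is enabled
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE order_item ("
        "order_id INTEGER NOT NULL REFERENCES orders(id), "
        "item TEXT NOT NULL)")
    conn.execute("INSERT INTO orders (id) VALUES (1)")
    conn.commit()
    return conn


def insert_order_item(conn, order_id, item):
    """Insert a child row; return False if the parent order does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO order_item (order_id, item) VALUES (?, ?)",
                (order_id, item))
        return True
    except sqlite3.IntegrityError:
        return False
```

Every such insert forces the database to look up the parent row, which is part of why referential integrity constraints are costly to keep once data is spread across machines.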
Transactions: Isolation Levels
  • Serializable: The outcome is equivalent to running the transactions one at a time in some definite order, taking the database from state A to state B
  • Repeatable Read: Any data read by a transaction stays unchanged until the transaction completes
  • Non Repeatable Read (aka Read Committed): Two reads within a transaction may give different results
  • Dirty Read (aka Read Uncommitted): A transaction might read data that is later rolled back
RDBMS Luxuries
  • Multiple Indexes
  • Auto Increments/Sequences
  • Triggers
Scalability in RDBMS
  • Replication: Read Replication (Master-Slave), Read Write Replication (Master-Master)
  • Cluster: Distributed Transaction, Two-phase commits
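The two-phase commit mentioned on this slide can be sketched in a few lines of Python. This is a toy, in-process model under simplifying assumptions: the Participant class, its can_commit switch and the state names are hypothetical, and real coordinators also need durable logs, timeouts and recovery, none of which are shown.

```python
class Participant:
    """One resource manager (for example, one database node)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # hypothetical switch to simulate a failing node
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1 (voting): ask every participant to prepare
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (completion): unanimous yes, so everyone commits
        for p in participants:
            p.commit()
        return True
    # A single no vote aborts the transaction on every participant
    for p in participants:
        p.rollback()
    return False
```

The sketch makes the cost visible: every participant has to answer twice before the transaction completes, so one slow or unreachable node holds up all the others.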
Scalability Impediments
  • Performance: Sub-Queries/Correlation, Joins, Aggregates, Referential Integrity constraints
  • Basic Guarantee: Consistency, Availability
CAP?
Conjecture: Distributed systems cannot ensure all three of the following properties at once
  • Consistency: The client perceives that a set of operations has occurred all at once.
  • Availability: Every operation must terminate in an intended response.
  • Partition tolerance: Operations will complete, even if individual components are unavailable.
ACID to BASE
  • Basically Available: system seems to work all the time
  • Soft State: it doesn't have to be consistent all the time
  • Eventually Consistent: becomes consistent at some later time
BASE: An Example
BEGIN Transaction
    INSERT INTO ORDER (oid, timestamp, customer)
    FOREACH item IN itemList
        INSERT INTO ORDER_ITEM (oid, item.id, item.quantity, item.unitprice)
        // UPDATE INVENTORY SET quantity = quantity - item.quantity WHERE item = item.id
    COMMIT
END Transaction
  • Assume each statement is queued for execution
  • You will get COMMIT success
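One way to read this example, together with the editor's note on queuing systems and idempotency, is that the inventory update is handed to a queue and applied later. The Python sketch below is only an interpretation under that assumption: the in-memory deque stands in for a real queuing system, and InventoryWorker, place_order, drain and the message id format are hypothetical names.

```python
from collections import deque


class InventoryWorker:
    """Applies queued inventory updates; safe to receive a message twice."""

    def __init__(self, stock):
        self.stock = dict(stock)
        self.applied = set()  # ids of messages already processed

    def handle(self, message):
        msg_id, item_id, quantity = message
        if msg_id in self.applied:
            return False  # duplicate delivery, ignore it
        self.stock[item_id] -= quantity
        self.applied.add(msg_id)
        return True


def place_order(queue, order_id, items):
    """Accept the order at once and queue the inventory updates for later."""
    for item_id, quantity in items:
        queue.append((f"{order_id}:{item_id}", item_id, quantity))
    return "COMMIT success"


def drain(queue, worker):
    """Deliver every queued message to the worker."""
    while queue:
        worker.handle(queue.popleft())
```

Between place_order and drain the stock figures are stale (soft state). After drain they are correct (eventually consistent), and a message delivered twice is applied only once.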
Alternate Implementations
  • BigTable (Google): CP
  • HBase (Apache): CP
  • HyperTable (Community): CP
  • Dynamo (Amazon): AP
  • SimpleDB (Amazon): AP
  • Voldemort (LinkedIn): AP
  • Cassandra (Facebook): AP
  • MemcacheDB (Community): CP/AP
Data Models
  • Key/Value Pairs: Dynamo, MemcacheDB, Voldemort
  • Row-Column: BigTable, Cassandra, SimpleDB, Hypertable, HBase
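The difference between the two data models can be pictured with plain Python dictionaries. This is a conceptual sketch only and not the API of any of the stores listed: the helper names are hypothetical, and the row key and column names are borrowed from the Programming Models slide.

```python
# Key/value model: the store maps a key to one opaque value
kv_store = {}


def kv_put(key, value):
    kv_store[key] = value


def kv_get(key):
    return kv_store.get(key)


# Row-column model: row key -> column name -> value
row_store = {}


def put_cell(row_key, column, value):
    """Set one column of one row, creating the row if needed."""
    row_store.setdefault(row_key, {})[column] = value


def delete_cell(row_key, column):
    """Remove one column from a row; missing cells are ignored."""
    row_store.get(row_key, {}).pop(column, None)


def get_row(row_key):
    """Return a copy of all columns stored for a row."""
    return dict(row_store.get(row_key, {}))
```

In the key/value model an application that wants to change one field has to read and rewrite the whole value. In the row-column model a single column of a row can be written or deleted on its own.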
Programming Models
// Open the table
Table *T = OpenOrDie("/bigtable/web/webtable");
// Write a new anchor and delete an old anchor
RowMutation r1(T, "com.cnn.www");
r1.Set("anchor:www.c-span.org", "CNN");
r1.Delete("anchor:www.abc.com");
Operation op;
Apply(&op, &r1);
BigTable: Consistent yet Infinitely Scalable
  • Single Master
  • B+ tree based data distribution
BigTable: Transactions
  • Entities and Entity Groups
  • Invoice
  • Invoice Item
  • Delivery Note
Dynamo: Highly available and Infinitely Scalable
  • Consistent Hashing
  • Peer to Peer Distributed
  • Gossip based member discovery
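Consistent hashing can be sketched as a sorted ring of hash positions. The version below is a minimal illustration, not Dynamo's actual implementation: the HashRing class, the choice of SHA-256 and the default of 64 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to nodes so that adding a node moves only a few keys."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # virtual positions per node, smooths the spread
        self._positions = []  # sorted hash positions on the ring
        self._owner = {}      # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first position clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

When a node joins, the only keys that change owner are the ones the new node takes over. With a plain hash(key) % N scheme, changing N would remap most keys.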
RDBMS or Other?
  • Nature of Business
  • Maturity of the Product
  • Cost of Adoption
  • Maturity of the alternative Datastores
Q&A

Editor's Notes

  • #15: May have to discuss Queuing Systems, Idempotency and so on