Jurģis Orups, CTO of Clusterpoint, gave the talk "Clusterpoint Inside-Out" about Clusterpoint, a NoSQL database born in Latvia. Jurģis went into detail on the story behind the database, its technology, and the global market.
Jurģis Orups is CTO and a cofounder of Clusterpoint. He has extensive experience building large-scale distributed systems and spends his time building this emerging NoSQL database.
"Clusterpoint Inside-Out" by Jurģis Orups at NoSQL focused XXVIII DevClub.lv event
3. Inspiration (2001)
FTS for Sybase & FoxPro
First distributed design & implementation
– trying to bite Google (:
Folk song search portal www.dainuskapis.lv
5. Talk is cheap.
Show me the code.
(c) Linus Torvalds
Inverted Index
Problem – real time updates to index
Pierpaolo Basile, Information Access with Lucene, slideshare.net
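As a rough illustration of the inverted index named on this slide, here is a minimal Python sketch. The function names `build_inverted_index` and `search` and the sample documents are made up for this example and are not Clusterpoint or Lucene APIs. Each term maps to a sorted list of document ids; because those lists are built in one batch, adding a single document later means touching the list of every term it contains, which is one way to see the real-time update problem.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, used only for illustration.
docs = {
    1: "js developer Dublin",
    2: "java developer Riga",
    3: "js designer Dublin",
}
index = build_inverted_index(docs)
```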
8. Infant (2006)
Clusterpoint (2006) – first startup in LV
Seeded by Imprimatur Capital
Team of 2.5 developers and 0.5 CEO
6 months wicked C/C++ coding
biting Google again – search appliance vertical
- “didn't go well”
9. Inverted index
Two types of FTS indices:
− Memory (mutable)
− Disk based (immutable)
Dump memory index when full
Merge dumps
Problem solved – real time updates!
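The memory/disk split above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Clusterpoint's implementation: the class name `TwoTierIndex`, the `memory_limit` parameter, and the use of in-process dictionaries as stand-ins for on-disk files are all invented for the example.

```python
from collections import defaultdict


class TwoTierIndex:
    """Mutable in-memory index plus immutable dumped segments (hypothetical sketch)."""

    def __init__(self, memory_limit=2):
        self.memory_limit = memory_limit  # documents held in memory before a dump
        self.memory = defaultdict(set)    # mutable part: term -> doc ids
        self.memory_docs = 0
        self.segments = []                # immutable dumps; stand-ins for disk files

    def add(self, doc_id, text):
        """New documents only touch the memory index, so they are searchable at once."""
        for term in text.lower().split():
            self.memory[term].add(doc_id)
        self.memory_docs += 1
        if self.memory_docs >= self.memory_limit:
            self.dump()

    def dump(self):
        """Freeze the memory index into an immutable segment and start a fresh one."""
        if not self.memory:
            return
        self.segments.append({t: frozenset(ids) for t, ids in self.memory.items()})
        self.memory = defaultdict(set)
        self.memory_docs = 0

    def merge(self):
        """Combine all dumped segments into a single segment."""
        merged = defaultdict(set)
        for segment in self.segments:
            for term, ids in segment.items():
                merged[term] |= ids
        self.segments = [{t: frozenset(i) for t, i in merged.items()}] if merged else []

    def search(self, term):
        """Look up one term in the memory index and in every dumped segment."""
        term = term.lower()
        hits = set(self.memory.get(term, ()))
        for segment in self.segments:
            hits |= segment.get(term, frozenset())
        return sorted(hits)
```

A query has to consult the memory index and every segment, which is why merging dumps matters: fewer segments means fewer lookups per query.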
10. Query language
Simple query
js developer Dublin
Advanced query
js developer
<sex>="female"</sex>
<salary>2000 .. 5000</salary>
<place>="Dublin"</place>
Aggregation (SQL like)
SELECT sex, count(sex) GROUP BY sex
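To make the semantics of the three query forms concrete, here is a small Python sketch that evaluates them over a list of dictionaries. It does not use the Clusterpoint query engine; the helper names `matches`, `advanced_query`, and `count_by`, the sample records, and the reading of `2000 .. 5000` as an inclusive range are assumptions made for illustration.

```python
from collections import Counter

# Made-up sample records, for illustration only.
people = [
    {"text": "js developer", "sex": "female", "salary": 3000, "place": "Dublin"},
    {"text": "js developer", "sex": "male", "salary": 2500, "place": "Dublin"},
    {"text": "js developer", "sex": "female", "salary": 6000, "place": "Dublin"},
    {"text": "java developer", "sex": "female", "salary": 4000, "place": "Riga"},
]


def matches(doc, terms, equals=None, ranges=None):
    """True if doc contains all terms and satisfies the exact and range constraints."""
    words = doc["text"].lower().split()
    if any(term.lower() not in words for term in terms):
        return False
    for field, value in (equals or {}).items():
        if doc.get(field) != value:
            return False
    for field, (low, high) in (ranges or {}).items():
        value = doc.get(field)
        if value is None or not (low <= value <= high):  # inclusive range (assumed)
            return False
    return True


def advanced_query(docs):
    """Equivalent of the advanced query on the slide, expressed as Python filters."""
    return [
        d for d in docs
        if matches(d, ["js", "developer"],
                   equals={"sex": "female", "place": "Dublin"},
                   ranges={"salary": (2000, 5000)})
    ]


def count_by(docs, field):
    """Equivalent of SELECT field, count(field) GROUP BY field."""
    return dict(Counter(d[field] for d in docs))
```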
11. Lookup tables (column-stores)
Associative array/hash map
Constant access/modify time
Memory mapped
Append only
Perfect when accessing data by column
i.e. aggregation, faceting, filtering
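A lookup table with these properties might look roughly like the following Python sketch. The class `LookupTable` and the function `facet_counts` are hypothetical names, and a plain list stands in for the memory-mapped, append-only file; the point is only to show how a hash map over an append-only log gives constant-time access while a per-column layout keeps aggregation cheap.

```python
from collections import Counter


class LookupTable:
    """One column: an append-only log of values plus a hash map to the latest entry."""

    def __init__(self):
        self.log = []     # append-only storage; stand-in for a memory-mapped file
        self.latest = {}  # doc_id -> position of that document's current value

    def set(self, doc_id, value):
        """Updates never overwrite: append a new entry and repoint the hash map."""
        self.log.append((doc_id, value))
        self.latest[doc_id] = len(self.log) - 1

    def get(self, doc_id):
        """Constant-time lookup through the hash map."""
        return self.log[self.latest[doc_id]][1]

    def values(self):
        """Current value of every document in this column."""
        for position in self.latest.values():
            yield self.log[position][1]


def facet_counts(column):
    """Aggregate over a single column without touching whole documents."""
    return dict(Counter(column.values()))
```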
12. Child (2008)
Trust in enterprise sales model
First commercial customers
(directories, portals, e-shops, public sector)
Positioning as a database is challenging
NoSQL – heard nothing about it
... mhm maybe we are NoSQL ?!
The San Francisco NOSQL Meetup on June 11, 2009 was important to the trend's development.
(Wikipedia)
15. Teenager (these days)
Less trust in enterprise model
Shift to free software & Cloud
Grow customer base
Innovate
Develop for developers
16. Transactions – for what?
ATM cash withdrawal
Checkout
Transfer of goods (monies, credits, lives :)
Booking
17. Transactions – example
Begin
Retrieve value for A1
Retrieve value for A2
Check
Update value for A1
Update value for A2
Commit
18. Transactions – behind the scenes
Begin – fix the “view of the world”
Retrieve A1 (version v1)
Retrieve A2 (version v2)
Check
Update A1: if v1' != v1 then rollback, else continue
Update A2: if v2' != v2 then rollback, else continue
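The version check in these steps is a form of optimistic locking, and a single-process Python sketch of it follows. All names (`VersionedStore`, `Transaction`, `ConflictError`, `transfer`) are invented for the example. As a simplification, the sketch validates all read versions at commit time rather than at each update, and it ignores the atomicity and distribution concerns a real database has to handle.

```python
class ConflictError(Exception):
    """Raised when a value read by the transaction changed before commit."""


class VersionedStore:
    """Key-value store that bumps a version number on every write."""

    def __init__(self):
        self.data = {}  # key -> (value, version)

    def put(self, key, value):
        _, version = self.data.get(key, (None, 0))
        self.data[key] = (value, version + 1)

    def read(self, key):
        return self.data[key]


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen when the value was retrieved
        self.writes = {}         # buffered updates, applied only on commit

    def get(self, key):
        value, version = self.store.read(key)
        self.read_versions[key] = version
        return value

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # If v' != v for anything we read, roll back by discarding buffered writes.
        for key, seen in self.read_versions.items():
            if self.store.read(key)[1] != seen:
                raise ConflictError(key)
        for key, value in self.writes.items():
            self.store.put(key, value)


def transfer(store, src, dst, amount):
    """Begin, retrieve A1 and A2, check, update both, commit."""
    tx = Transaction(store)
    a1 = tx.get(src)
    a2 = tx.get(dst)
    if a1 < amount:
        raise ValueError("insufficient funds")
    tx.set(src, a1 - amount)
    tx.set(dst, a2 + amount)
    tx.commit()
```

No locks are held while the transaction runs; a conflicting writer is only detected at the version check, which is what allows high concurrency when conflicts are rare.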
21. Transactions - distributed
Tough because of sharding & replication
Transaction log – no SPOF and it scales via sharding & replication
Optimistic locking – high concurrency
Isolation – phantom reads
26. How does it work?
Once a database is stored in Clusterpoint Cloud, it is broken up into many shards and distributed among many servers.
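The slides do not say how documents are assigned to shards or shards to servers. Purely as an illustration of the general idea, the Python sketch below assumes hash-based sharding with round-robin placement of shards on servers; the function names `shard_for` and `distribute` and the server names used with them are hypothetical.

```python
import hashlib


def shard_for(doc_id, shard_count):
    """Pick a shard from a stable hash of the document id (assumed scheme)."""
    digest = hashlib.sha256(str(doc_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(doc_ids, servers, shard_count):
    """Place each document on the server that hosts its shard (round-robin shards)."""
    placement = {server: [] for server in servers}
    for doc_id in doc_ids:
        shard = shard_for(doc_id, shard_count)
        server = servers[shard % len(servers)]
        placement[server].append(doc_id)
    return placement
```

A stable hash is used instead of Python's built-in `hash()` so that the same document id maps to the same shard across processes and restarts.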
27. Try it
Sign up for Cloud
http://cloud.clusterpoint.com
Attendees get 3 months of free access with up to 100 GB of storage
Be part of community
Have fun!