Talk I gave at http://www.kingsofcode.nl about what we learned over the last year about making http://www.netlog.com scalable and delivering high performance to our users.
Netlog: What we learned about scalability & high availability
1. 27 May 2008
Folke Lemaitre
Director of Development
http://nl.netlog.com/folke
What we learned about
scalability & high availability
2. Overview
What is Netlog?
Translations
Network topology
Scaling Databases
Caching
Search
Q&A
6. What: it's personal
You rule: it's yours
[Diagram labels: YOU, ANOTHER, Music, Photos, Games, Videos, People, Blogs, Relations]
7. Friend Activity
Share & discover friends' activity
[Screenshot: friend activity feed, with entries such as:]
Jaak Noukens and Jo are now friends
Stijn Symons uploads a new photo
Kenny Gryp signs the guestbook of Lorenz Bogaert
14. Applications
OpenSocial
sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1
Officially announced tomorrow @ Google I/O
Stay tuned!
Public launch planned for June
16. It's going pretty good
More than 35,000,000 unique members
More than 4,000,000,000 pageviews/Month
19 languages and more coming up
More than 20 countries
Current Alexa Top-100 ranking
(most visited web sites in the world)
Current ComScore Europe Top-10 ranking
17. It's going pretty good
[Chart: Monthly Visits, January 2007 to April 2008, axis 0 to 200,000,000]
[Chart: regional breakdown over Western Europe, Southern Europe, Northern Europe, Eastern Europe, Western Asia and the Americas]
[Chart: Monthly Unique Visitors, January 2007 to April 2008, axis 0 to 40,000,000]
[Chart: Monthly Page Requests, January 2007 to April 2008, axis 0 to 5,000,000,000]
20. 19 languages and a lot more coming!
Slovenčina
Español Català
Svenska
suomi Česky
slovenščina Deutsch Magyar
Nederlands
français
Русский Italiano Afrikaans
English
Dansk Türkçe
Polski Hrvatski
Lietuvių kalba
Eesti Latviešu valoda
Português
Română Български
Norsk (bokmål)
29. Overview
Netlog Datacenters
[Architecture diagram with the following components:]
Internet
Firewall
Web Load Balancer
Web Cluster
Static Load Balancer
CDN
Storage Servers
Database Pools (each pool is a master with slaves): User Pool, Activity Pool, Friendships Pool, ...
Primary Pool (master with slaves)
Memcache Pools: Session Cache, General Cache, Html Cache
30. Web Servers
Software
Apache 2
PHP 5.2.6
eAccelerator 0.9.5.2 for bytecode caching
Keepalived for high availability
200 servers
450 000 requests per second
31. Database Servers
MySQL Enterprise 4.1.22
200 database servers
40 thousand tables
70 billion records
60 thousand queries per second
37. Database Pools
Different data on different database pools:
messaging
friendships
blogs
music
videos
...
38. Replication
write to one master
read from multiple slaves (and master)
pros
easy to implement
read intensive applications scale very well
cons
write intensive applications don't scale
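To make the read/write split concrete, here is a minimal sketch in Python (the site itself runs PHP). The class name `ReplicatedPool` and the server names are hypothetical, and treating every statement that does not start with SELECT as a write is a simplifying assumption.

```python
import random


class ReplicatedPool:
    """Routes writes to the single master and reads to any slave or the master."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def server_for(self, sql):
        # Assumption: anything that is not a plain SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read:
            # Reads can be spread over every server in the pool.
            return random.choice(self.slaves + [self.master])
        # Writes always go to the one master, which is why they do not scale.
        return self.master


pool = ReplicatedPool("db-master", ["db-slave1", "db-slave2"])
```

Adding slaves increases read capacity, but every write still lands on the same master, which is the limitation listed under cons.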
39. Partitioning (sharding)
Divide data on primary key:
all user data for users with id 1 - 10 in database1
all user data for users with id 11 - 20 in database2
...
Best scaling possible
How?
managed in code
MySQL partitioning (available from version 5.1)
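The "managed in code" option could look roughly like the sketch below. It follows the slide's example ranges (ids 1 to 10 in database1, 11 to 20 in database2); the function name and the shard size are illustrative assumptions, not the actual Netlog code.

```python
USERS_PER_SHARD = 10  # matches the slide's example; real shards hold far more users


def shard_for_user(user_id, users_per_shard=USERS_PER_SHARD):
    """Return the name of the database holding all data for this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    # ids 1..10 -> database1, ids 11..20 -> database2, and so on
    return "database%d" % ((user_id - 1) // users_per_shard + 1)
```

Because all data for one user lives in one database, both reads and writes for that user touch a single shard.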
40. Analyse, analyse, analyse!
Tag your queries
SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */
Analyse mysql slow logs
Analyse process lists
Analyse based on tags
1023 User::getUser():230
512 User::isOnline():124
10 Activities::getActivity():320
cron job that runs every minute and checks for too many connections
if there are too many connections, it logs the process list
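As an illustration of the tag-based analysis, the sketch below counts trailing comment tags in a list of query strings, such as the queries from a logged process list. The regular expression and the sample queries are assumptions; only the tag format comes from the slide.

```python
import re
from collections import Counter

# Matches a trailing tag such as /*User::getUser():11 */ at the end of a query.
TAG_PATTERN = re.compile(r"/\*\s*(.+?)\s*\*/\s*$")


def count_tags(queries):
    """Count how often each query tag occurs; untagged queries are grouped."""
    counts = Counter()
    for query in queries:
        match = TAG_PATTERN.search(query)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts


process_list = [
    "SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */",
    "SELECT * FROM USER WHERE userid = 456 /*User::getUser():11 */",
    "SELECT online FROM USER WHERE userid = 123 /*User::isOnline():124 */",
    "SELECT 1",
]
tag_counts = count_tags(process_list)
```

Sorting the counts with `tag_counts.most_common()` gives a ranking like the one on the slide, pointing straight at the code location that issues the most queries.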
42. Introduction to memcached
Developed by Danga Interactive:
http://www.danga.com/
Initially developed for LiveJournal:
http://www.livejournal.com/
OpenSource
43. Introduction to memcached
Least Recently Used
Fast!
Distributed
Automatic failover
Big Hash table: set/add/get/delete
44. What to cache?
sessions
query caching
processed data
generated html
45. Session Cache
99% hit ratio
Time to live is 20 minutes
Faster than session database
46. Query Cache
Why memcache and not MySQL query cache?
MySQL invalidates cached queries on a table on
every update
different query cache for different replicated
databases
Add to generic database classes
Cache key is query
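A generic database class with query caching might be sketched as follows. A plain dict stands in for memcached, and `CachedDatabase`, `run_query` and the key prefix are hypothetical names. Hashing the query is an assumption made here because memcached keys are limited to 250 bytes and may not contain whitespace.

```python
import hashlib


class CachedDatabase:
    """Wraps a query function and caches result sets, keyed on the query text."""

    def __init__(self, run_query, cache=None):
        self.run_query = run_query                   # callable: sql -> rows
        self.cache = {} if cache is None else cache  # dict stand-in for memcached

    @staticmethod
    def cache_key(sql):
        # The cache key is the query itself, hashed to fit memcached key rules.
        return "query:" + hashlib.md5(sql.encode("utf-8")).hexdigest()

    def select(self, sql):
        key = self.cache_key(sql)
        if key in self.cache:
            return self.cache[key]     # cache hit: the database is not touched
        rows = self.run_query(sql)     # cache miss: run the query and store it
        self.cache[key] = rows
        return rows
```

Since the cache lives outside MySQL, every replicated database server shares the same cached results.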
49. HTML Caching
Profile blocks are fully cached
Data needed to generate html is also cached
When data changes, html is invalidated and cached data is updated
High cache hit rate on profile pages
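A rough sketch of this two-level scheme, with a dict standing in for memcached. The key names, the `load_profile` callback and the HTML template are made up for illustration.

```python
from html import escape

cache = {}  # stand-in for memcached


def render_profile_block(user_id, load_profile):
    """Return the cached HTML block, rebuilding it from cached data if needed."""
    html_key = "html:profile:%d" % user_id
    data_key = "data:profile:%d" % user_id
    if html_key in cache:
        return cache[html_key]
    data = cache.get(data_key)
    if data is None:
        data = load_profile(user_id)  # only hit the database when both caches miss
        cache[data_key] = data
    html = '<div class="profile">%s</div>' % escape(data["name"])
    cache[html_key] = html
    return html


def profile_changed(user_id, new_data):
    """On a write: update the cached data and invalidate the cached HTML."""
    cache["data:profile:%d" % user_id] = new_data
    cache.pop("html:profile:%d" % user_id, None)
```

After a change the next page view regenerates the HTML from the already updated cached data, so the database is not queried again.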
50. 3 ways of caching
Cache with TTL
Cache forever with invalidate
Cache forever with update
51. Cache with TTL
The good:
Quickly achieve better performance on existing code
The bad:
Users see outdated information
TTL cannot be high
Caching efficiency is minimal
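The trade-off can be seen in a small TTL cache sketch. The class is an in-process stand-in, not the memcached client; the injectable `clock` parameter is an assumption added to make the behaviour easy to demonstrate.

```python
import time


class TTLCache:
    """Stores values with an expiry time; expired entries behave like misses."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl_seconds)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # expired: caller has to reload from the source
            return None
        return value  # may be outdated if the source changed within the TTL
```

Until the entry expires, readers keep getting the old value even if the underlying data changed, which is why the TTL has to stay low and the hit rate suffers.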
60. Global Locking: Chat Example
Example: add new message to cached shared
chat thread
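The slide does not show how the lock is taken. One common approach, assumed here, relies on the fact that memcached's add only succeeds when the key does not exist yet. `FakeMemcache` is a tiny in-process stand-in with memcached-like add/get/set/delete, and all key names are hypothetical.

```python
import time


class FakeMemcache:
    """In-process stand-in exposing memcached-like add/get/set/delete."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        if key in self.data:
            return False  # like memcached: add fails if the key already exists
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def append_chat_message(mc, thread_id, message, retries=50, wait=0.01):
    """Append a message to the cached thread while holding a global lock."""
    lock_key = "lock:chat:%s" % thread_id
    thread_key = "chat:%s" % thread_id
    for _ in range(retries):
        if mc.add(lock_key, 1):  # only one client can create the lock key
            try:
                messages = mc.get(thread_key) or []
                mc.set(thread_key, messages + [message])
                return True
            finally:
                mc.delete(lock_key)  # release the lock
        time.sleep(wait)  # someone else holds the lock: wait and retry
    return False
```

Without the lock, two web servers could both read the thread, each append their own message, and the later set would overwrite the earlier one. With a real memcached server the lock key should also get a short expiry so that a crashed client cannot hold it forever.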
61. Flooding detection
User can only redo action A after a timeout
a guestbook message can only be posted once every
2 minutes
User cannot do action A more than X times in T
minutes
only 12 failed login attempts per hour are allowed
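Both rules can be sketched with a small key/value store. A dict stands in for memcached here, the class and method names are hypothetical, and the second rule uses a simple fixed window, which is only one possible way to count "X times in T minutes".

```python
import time


class FloodGuard:
    """Two flood checks: a per-action timeout and a count limit per time window."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.store = {}  # stand-in for memcached

    def allow_after_timeout(self, user_id, action, timeout_seconds):
        """Rule 1: the action can only be repeated after a timeout."""
        key = ("timeout", user_id, action)
        now = self.clock()
        last = self.store.get(key)
        if last is not None and now - last < timeout_seconds:
            return False
        self.store[key] = now
        return True

    def allow_within_limit(self, user_id, action, max_times, window_seconds):
        """Rule 2: at most max_times actions per window (fixed-window counter)."""
        key = ("count", user_id, action)
        now = self.clock()
        window_start, count = self.store.get(key, (now, 0))
        if now - window_start >= window_seconds:
            window_start, count = now, 0  # window elapsed: start counting again
        if count >= max_times:
            return False
        self.store[key] = (window_start, count + 1)
        return True
```

For the slide's examples, a guestbook post would call `allow_after_timeout(user_id, "guestbook", 120)` and a failed login would call `allow_within_limit(user_id, "failed_login", 12, 3600)`.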
65. MySQL full-text search
Initially used for our search
can be very slow
extra load on most of our databases, since most
content is searchable
Better search engine needed
Sphinx!
OpenSource search engine developed by Andrew
Aksyonoff (http://sphinxsearch.com/)
66. Sphinx Features
very fast indexing
very fast searching
0.04 seconds average
5 million searches / day
60 searches / second
distributed
document fields
stopwords
API available in many languages
PHP, Java, Python, Ruby, Perl, C++, ...
67. Sphinx Indexer
Index is read-only (except for attributes)
Build new index while searching old one
How we index:
rebuild full index from data once in a while (daily,
weekly)
generate delta indexes often (every minute, 5
minutes)
contains changes for search index since last full index merge
full index merge of previous index and delta (every
hour)
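The main plus delta scheme can be illustrated with a toy in-memory inverted index. This is not the Sphinx API: the class below is a hypothetical model of the schedule, and it only handles added documents, not updates or deletions.

```python
class MainPlusDeltaIndex:
    """Toy model: a large main index plus a small, frequently rebuilt delta."""

    def __init__(self, documents):
        self.main = self._build(documents)  # full rebuild (daily or weekly)
        self.delta = {}

    @staticmethod
    def _build(documents):
        index = {}
        for doc_id, text in documents.items():
            for word in text.lower().split():
                index.setdefault(word, set()).add(doc_id)
        return index

    def rebuild_delta(self, changed_documents):
        # Runs often: indexes only documents changed since the last merge.
        self.delta = self._build(changed_documents)

    def merge(self):
        # Runs hourly: build the merged index first, then swap it in,
        # so searches keep using the old index while the new one is built.
        merged = {word: set(ids) for word, ids in self.main.items()}
        for word, ids in self.delta.items():
            merged.setdefault(word, set()).update(ids)
        self.main = merged
        self.delta = {}

    def search(self, word):
        word = word.lower()
        return sorted(self.main.get(word, set()) | self.delta.get(word, set()))
```

The delta stays small, so rebuilding it every few minutes is cheap, while the expensive full rebuild happens rarely.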
68. Sphinx Search
Search query returns list of ids
For every result page shown, we fetch data
associated with ids
data is cached with memcache for every id
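Turning a page of result ids into records could look like the sketch below. A dict stands in for memcached, and `load_from_db`, the `profile:` key prefix and the record shape are assumptions for illustration.

```python
def cache_key(doc_id):
    return "profile:%d" % doc_id


def fetch_results(ids, cache, load_from_db):
    """Return records for the ids of one result page, in search order.

    Records come from the cache where possible; only missing ids are loaded
    from the database (in one call) and then stored in the cache.
    """
    records = {}
    missing = []
    for doc_id in ids:
        cached = cache.get(cache_key(doc_id))
        if cached is None:
            missing.append(doc_id)
        else:
            records[doc_id] = cached
    if missing:
        for doc_id, record in load_from_db(missing).items():
            cache[cache_key(doc_id)] = record
            records[doc_id] = record
    # Keep the ranking returned by the search engine; skip ids that no longer exist.
    return [records[doc_id] for doc_id in ids if doc_id in records]
```

Because each record is cached per id, popular results that show up in many different searches are loaded from the database only once.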