Moving to a service-oriented architecture solves problems like fate sharing, coupling, and scaling. It also brings the complexity of a partially up/partially down system, with nodes regularly getting sick or otherwise partitioned.
Learn what the Prisoner's Dilemma teaches us about building a partition-tolerant system. Hear how the CAP theorem (consistency, availability, and partition tolerance) helps the architect make intelligent trade-offs. See real examples from Google, Amazon, and others of building highly available distributed systems.
Prisoner's Dilemma and Service-oriented Architectures
10. I can't remember if that getter function takes 100ns or 100ms.
- no one ever
Should I try to abstract away this service request as a remote procedure call?
6 orders of magnitude difference!
11. My front-side bus only fails for 1 second every 17 minutes!
- no one ever
99.9% availability
12. Our internet only supports .NET.
- no one ever
Do your clients rely on an SDK?
23. GET /profiles/123
GET /users/123
Calculate something
GET /users/123/permissions
If user can't view profile
send 403
POST /eventFeed {new profile view}
GET /users/123/friends
GET /bookmarks?userId=123
GET /catalog/books?ids=1,3,10
Calculate something else
GET /bookmarks/trending
Send HTML
29. "A distributed system is at best a
necessary evil, evil because of the extra
complexity...
or perhaps better put, a sensible
engineering decision given the trade-offs
involved."
-David Cheriton, Distributed Systems Lecture Notes, ch. 1
32. The CAP Theorem [1]
Safety: nothing bad ever happens
Liveness: good things happen
Unreliability: network disconnection, crash failures, message loss, Byzantine failures, slowdown, etc.
Consistency: every response sent to a client is correct
Availability: every request gets a response
Partition tolerance: operating in the face of arbitrary failures
36. GET /profiles/123
GET /users/123
Calculate something
GET /users/123/permissions
If user can't view profile
send 403
POST /eventFeed {new profile view}
GET /users/123/friends
GET /bookmarks?userId=123
GET /catalog/books?ids=1,3,10
Calculate something else
GET /bookmarks/trending
Send response
37. ResponseHandler<User> handler = new ResponseHandler<User>() {
    public User handleResponse(final HttpResponse response)
            throws IOException {
        int status = response.getStatusLine().getStatusCode();
        if (status >= 200 && status < 300) {
            HttpEntity entity = response.getEntity();
            return entity != null ? Parser.parse(entity) : null;
        } else {
            // Non-2xx status: fail the call rather than fabricate a User
            throw new ClientProtocolException(
                    "Unexpected response status: " + status);
        }
    }
};
HttpGet userGet = new HttpGet("http://example.com/users/123");
User user = httpclient.execute(userGet, handler);
https://hc.apache.org/httpcomponents-client-4.3.x/examples.html
Works great to calculate a user!
57. Max I/O wait time = # of threads * (CONNECT_TIMEOUT + READ_TIMEOUT)
9 front-end servers received 1,900 requests in 60 seconds, 300 of which were for Flickr resources (16%).
That's about 35 Flickr requests per server per minute.
Max 100 threads => 6,000 thread-seconds in one minute
Goal: ensure < 10% of thread-seconds are spent blocked on Flickr I/O
35 requests * (CONNECT_TIMEOUT + READ_TIMEOUT) < 600 thread-seconds
CONNECT_TIMEOUT + READ_TIMEOUT < 17 seconds
[Request timeline: TCP connect (bounded by CONNECT_TIMEOUT), send request, block on socket read, read response (bounded by READ_TIMEOUT)]
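A minimal sketch (mine, not from the deck) of enforcing that < 17-second budget with the Apache HttpClient 4.3.x API used on slide 37; the 5 s / 12 s split between connect and read is an assumed allocation, not a value from the talk:

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class TimeoutSetup {
    public static CloseableHttpClient build() {
        // Split the < 17 s budget between the two timeouts (split assumed)
        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(5000)   // CONNECT_TIMEOUT: 5 s
                .setSocketTimeout(12000)   // READ_TIMEOUT (socket reads): 12 s
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
    }
}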
65. References
1. Perspectives on the CAP Theorem
2. Bacon Ipsum
3. Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
4. The Google File System
5. Bigtable
6. Amazon Architecture References
7. Apache HttpComponents
8. Apache HttpClient Cache
9. Ehcache
Editor's Notes
#2: Effective SOA: Lessons from Amazon, Google, and Lucidchart
It has been observed that "A distributed system is at best a necessary evil, evil because of the extra complexity." Multiple nodes computing on inconsistent state with regular communication failures present entirely different challenges than those computer science students face in the classroom writing DFS algorithms. The past 30 years have seen some interesting theories and architectures to deal with these complexities in what we now call "cloud computing". Some researchers worked on "distributed memory" and others built "remote procedure calls". More commercially successful architectures of late have popularized ideas like the CAP theorem, distributed caches, and REST.
Using examples from companies like Amazon and Google, this presentation walks through some practical tips to evolve your service-oriented architecture. Google's Chubby service demonstrates how you can take advantage of CAP's "best effort availability" options, and Amazon's "best effort consistency" services show the other end of the spectrum. Practical lessons learned from Lucidchart's forays into SOA share insight through quantitative analyses on how to make your system highly available.
Bio:
Derrick Isaacson is the Director of Engineering for Lucid Software Inc (lucidchart.com). He has a BS in EE from BYU and an MS in CS from Stanford. He's developed big services at Amazon, web platforms for Microsoft, and graphical apps at Lucidchart. Derrick has two patent applications at Microsoft and Domo. For fun he cycles, backpacks, and takes his son out in their truck.
#11: Multiple nodes computing on inconsistent state with regular communication failures present entirely different challenges than those computer science students face in the classroom writing DFS algorithms.
#16: Idea:
Nothing's more familiar to programmers than reading from and writing to memory. We access variables all day long. Why not make distributed state access look like simple memory access? We can use modern operating systems' support for virtual memory to swap in memory that is located on another machine.
Problems:
How often do you go to access a variable and can't because a section of memory is down?
How do you provide a mutex to parallel threads of execution?
How can the distributed memory layer be efficient when it has no knowledge of the application?
#17: Idea: Next to memory access, nothing's more familiar to programmers than function calls. Can we make distributed state transfer look like a simple procedure call? SOAP!
Problems (see the sketch after this list):
How often do you retry a method call because the JVM failed to invoke it the first time?
Why does incrementing a value take 100 milliseconds?
Why does your internet only support .NET and PHP (stub compiler/SDK)?
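A hedged sketch of the leak (Counter and RpcCounterStub are hypothetical stand-ins, not from the talk): the local-looking signature hides both the ~100 ms round trip and the possibility of partition-induced failure.

// The interface promises nothing about time or failure...
interface Counter {
    long increment();
}

// ...but a hypothetical RPC stub behind it inherits the network:
// counter.increment() may take ~100 ms and may fail outright, unlike
// the local n++ it imitates.
Counter counter = new RpcCounterStub("http://counter-service/increment");
long n = counter.increment();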
#18: Idea:
Easy network file sharing.
NFS, AFS, GFS
Works great for files.
#20: Idea:
How could you steal bandwidth from universities and avoid infringement lawsuits at the same time?
Problems:
Mooching resources is a great business model but a terrible architecture if that's not what you're going for.
#21: Idea:
I have so much state I don't want to transfer it all in a single response.
http://www.therufus.com/wp-content/uploads/2014/03/221-Sherlock-Holmes-Poster.jpg
#26: What's the availability of the overall system if a single response for service A is calculated by making 4 total requests to services B, C, and D?
If the average availabilities of those 3 components are as given, and the random values are modeled as IID, what is the maximum percentage of requests service A can calculate correctly?
.995 * .998 * .998 * .996 = 0.987
IID is a bad assumption for nearly any distributed system, but it illustrates the effect of naively distributing computation.
When crash failures originating at service A are included, the total availability is < 98.7%!
That's an average of 19 minutes of downtime per day (reproduced in the sketch below)!
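A minimal sketch reproducing the arithmetic; the four factors are the slide's per-call availabilities, everything else is assumed:

public class Availability {
    public static void main(String[] args) {
        // Availabilities of the 4 dependency calls to B, C, and D
        double[] deps = {0.995, 0.998, 0.998, 0.996};
        double total = 1.0;
        for (double a : deps) total *= a;       // serial composition, IID
        System.out.printf("availability = %.3f%n", total);   // 0.987
        System.out.printf("downtime = %.0f min/day%n",
                (1 - total) * 24 * 60);                       // ~19 minutes
    }
}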
#31: Conjecture made by UC Berkeley computer scientist Eric Brewer in 2000
Gilbert & Lynch published a formal proof
#33: Conjecture made by UC Berkeley computer scientist in 2000
Gilbert & Lynch published a formal proof
#35: We want the user to 1) get a response (available) and 2) have it be consistent with the view of other nodes.
By that end-user definition, a slow response, an error status, or no response at all is incorrect.
#36: "In order to model partition tolerance, the network will be allowed to lose arbitrarily many messages sent from one node to another." - Gilbert & Lynch
http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
It becomes a fundamental tradeoff between availability and consistency.
#37: It turns out the usual approach to implementing a computation like this errs on the side of consistency: if a single service request fails, the calculation hangs or returns an error to the user. (The availability-leaning alternative is sketched below.)
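A hedged sketch of that alternative, reusing the slide 37 client (CachedUsers is a hypothetical stand-in): catch the dependency failure and serve a possibly stale value instead of hanging or erroring.

User user;
try {
    user = httpclient.execute(userGet, handler);  // consistent path
} catch (IOException e) {
    // Dependency slow or partitioned: degrade gracefully rather than
    // failing the whole page; the value may be stale (inconsistent).
    user = CachedUsers.lastKnown("123");
}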
#38: It drops below the SLA for a consistent, available response perhaps 5 of every 1000 requests.
#41: For the checkout process you always want to honor requests to add items to a shopping cart because it's revenue-producing. In this case you choose high availability. Errors are hidden from the customer and sorted out later.
http://highscalability.com/amazon-architecture
#43: Web crawler, Bigtable
"our access patterns are highly stylized"
"GFS has a relaxed consistency model that supports our highly distributed applications well but remains relatively simple and efficient to implement."
Record append's append-at-least-once semantics preserves each writer's output. Readers deal with the occasional padding and duplicates as follows. Each record prepared by the writer contains extra information like checksums so that its validity can be verified. A reader can identify and discard extra padding and record fragments using the checksums. If it cannot tolerate the occasional duplicates (e.g., if they would trigger non-idempotent operations), it can filter them out using unique identifiers in the records, which are often needed anyway to name corresponding application entities such as web documents. (A reader-side filter is sketched below.)
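A rough sketch (mine, not the paper's code) of that reader-side filtering: recompute each record's checksum to discard padding and fragments, then drop duplicates by the writer-assigned unique ID. AppendedRecord is a hypothetical layout.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.CRC32;

class AppendedRecord {
    String id;        // writer-assigned unique identifier
    byte[] payload;
    long checksum;    // checksum the writer stored with the record
}

class RecordReader {
    static List<AppendedRecord> filter(List<AppendedRecord> raw) {
        Set<String> seen = new HashSet<>();
        List<AppendedRecord> clean = new ArrayList<>();
        for (AppendedRecord r : raw) {
            CRC32 crc = new CRC32();
            crc.update(r.payload);
            if (crc.getValue() != r.checksum) continue; // padding/fragment
            if (!seen.add(r.id)) continue;              // duplicate append
            clean.add(r);
        }
        return clean;
    }
}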
#52: The Amazon Dynamo and Yahoo PNUTS data stores support high read availability while limiting write availability in the face of partitions. For example, Dynamo has a configurable number of replicas on which the data must be stored before a write is confirmed to the client.
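A loose sketch (not Dynamo's actual API) of such a configurable write quorum: the client is told the write succeeded only once w of the replicas have stored it, so a partition that strands too many replicas fails writes while reads can still be served elsewhere. Replica is a hypothetical interface.

import java.util.List;

interface Replica {
    // Hypothetical: returns false if the replica is unreachable
    boolean tryStore(String key, byte[] value);
}

class QuorumWriter {
    // Confirm the write only after w of the replicas have acked it.
    static boolean write(List<Replica> replicas, String key,
                         byte[] value, int w) {
        int acks = 0;
        for (Replica r : replicas) {
            if (r.tryStore(key, value)) acks++;
            if (acks >= w) return true;  // quorum reached: confirm
        }
        return false;  // partition left us short of quorum: reject write
    }
}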
#54: Network partitions are less frequent at the leaves of a geographically hierarchical system.
#55: The CAP theorem appears to have implications for scalability.
"Intuitively, we think of a system as scalable if it can grow efficiently, using new resources efficiently to handle more load. In order to efficiently use new resources, there must be coordination among those resources." - Gilbert & Lynch