Chord
A Scalable Peer-to-peer Lookup
Service for Internet Applications
CS294-4: Peer-to-peer Systems
Markus Böhning
bohning@uclink.berkeley.edu
2
What is Chord? What does it do?
 In short: a peer-to-peer lookup service
 Solves problem of locating a data item in a collection
of distributed nodes, considering frequent node
arrivals and departures
 Core operation in most p2p systems is efficient
location of data items
 Supports just one operation: given a key, it maps
the key onto a node
3
Chord Characteristics
 Simplicity, provable correctness, and provable
performance
 Each Chord node needs routing information about
only a few other nodes
 Resolves lookups via messages to other nodes
(iteratively or recursively)
 Maintains routing information as nodes join and
leave the system
4
Mapping onto Nodes vs. Values
 Traditional name and location services provide a
direct mapping between keys and values
 What are examples of values? A value can be an
address, a document, or an arbitrary data item
 Chord can easily implement a mapping onto values
by storing each key/value pair at the node to which that
key maps
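A minimal sketch of that layering, assuming a toy in-process ring: `ToyStore`, its `m=8` identifier size, and the node identifiers are all hypothetical names chosen for illustration, and the lookup uses global knowledge of the ring rather than Chord messages.

```python
import hashlib

class ToyStore:
    """Key/value storage layered on a key-to-node lookup (illustrative only)."""

    def __init__(self, node_ids, m=8):
        self.m = m
        self.ring = sorted(node_ids)
        self.data = {n: {} for n in self.ring}  # per-node local storage

    def lookup(self, key):
        """The single lookup operation: map a key onto a node."""
        digest = hashlib.sha1(key.encode()).digest()
        ident = int.from_bytes(digest, "big") % (2 ** self.m)
        # First node at or after the identifier, wrapping around the circle.
        return next((n for n in self.ring if n >= ident), self.ring[0])

    def put(self, key, value):
        self.data[self.lookup(key)][key] = value

    def get(self, key):
        return self.data[self.lookup(key)].get(key)

store = ToyStore([10, 80, 150, 220])
store.put("example.org", "192.0.2.7")
print(store.get("example.org"))  # 192.0.2.7
```

The store itself never decides placement; it only asks the lookup which node is responsible and keeps the pair there.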
5
Napster, Gnutella etc. vs. Chord
 Compared to Napster and its centralized servers,
Chord avoids single points of control or failure
through decentralization
 Compared to Gnutella and its widespread use of
broadcasts, Chord scales well because each node keeps
only a small amount of routing information
6
DNS vs. Chord
DNS
 provides a host name to
IP address mapping
 relies on a set of special
root servers
 names reflect
administrative boundaries
 is specialized to finding
named hosts or services
Chord
 can provide same service:
Name = key, value = IP
 requires no special
servers
 imposes no naming
structure
 can also be used to find
data objects that are not
tied to certain machines
7
Freenet vs. Chord
 both decentralized and symmetric
 both automatically adapt when hosts leave and join
 Freenet
   does not assign responsibility for documents to specific
   servers; instead, lookups are searches for cached copies
   + allows Freenet to provide anonymity
   - prevents guaranteed retrieval of existing documents
 Chord
   - does not provide anonymity
   + but its lookup operation runs in predictable time and always
   results in success or definitive failure
8
Addressed Difficult Problems (1)
 Load balance: distributed hash function, spreading
keys evenly over nodes
 Decentralization: Chord is fully distributed, no
node is more important than any other, which improves
robustness
 Scalability: logarithmic growth of lookup costs with
number of nodes in network, even very large
systems are feasible
9
Addressed Difficult Problems (2)
 Availability: Chord automatically adjusts its internal
tables to ensure that the node responsible for a key
can always be found
 Flexible naming: no constraints on the structure of
the keys; the key space is flat, which gives flexibility in
how to map names to Chord keys
10
Example Application using Chord:
Cooperative Mirroring
 Highest layer provides a file-like interface to user
including user-friendly naming and authentication
 This file system maps operations to lower-level block
operations
 Block storage uses Chord to identify the node responsible
for storing a block and then talks to the block storage
server on that node
[Figure: three layered stacks. Client: File System over Block Store over Chord. Each of the two servers: Block Store over Chord.]
11
The Base Chord Protocol (1)
 Specifies how to find the locations of keys
 How new nodes join the system
 How to recover from the failure or planned
departure of existing nodes
12
Consistent Hashing
 Hash function assigns each node and key an m-bit
identifier using a base hash function such as SHA-1
 ID(node) = hash(IP, Port)
 ID(key) = hash(key)
 Properties of consistent hashing:
 Function balances load: all nodes receive roughly the
same number of keys  good?
 When an Nth node joins (or leaves) the network, only
an O(1/N) fraction of the keys are moved to a different
location
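The identifier assignment above can be sketched as follows. The `"ip:port"` string encoding and the truncation by `mod 2^m` are assumptions made for illustration; the slides only state that a base hash such as SHA-1 is used.

```python
import hashlib

def chord_id(data: bytes, m: int = 160) -> int:
    """Map arbitrary bytes to an m-bit identifier using SHA-1 (160-bit digest)."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest % (2 ** m)  # assumption: keep the low m bits when m < 160

def node_id(ip: str, port: int, m: int = 160) -> int:
    # ID(node) = hash(IP, Port); the "ip:port" encoding is hypothetical.
    return chord_id(f"{ip}:{port}".encode(), m)

def key_id(key: str, m: int = 160) -> int:
    # ID(key) = hash(key)
    return chord_id(key.encode(), m)

print(node_id("192.0.2.1", 4000, m=8), key_id("song.mp3", m=8))
```

Nodes and keys share one identifier space, which is what lets a key be assigned to the node that follows it on the circle.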
13
Successor Nodes
[Figure: identifier circle (m = 3, identifiers 0-7) with nodes 0, 1, and 3 and keys 1, 2, and 6]
successor(1) = 1
successor(2) = 3
successor(6) = 0
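As a rough sketch, the successor rule on this example circle (m = 3, nodes 0, 1, and 3) can be written with a sorted list and a binary search. The function name and the use of `bisect` are illustration choices, not part of Chord.

```python
from bisect import bisect_left

def successor(ident: int, nodes: list[int], m: int = 3) -> int:
    """First node whose identifier equals or follows ident on the circle."""
    ring = sorted(nodes)
    i = bisect_left(ring, ident % (2 ** m))
    return ring[i % len(ring)]  # wrap around past the largest node identifier

nodes = [0, 1, 3]
for key in (1, 2, 6):
    print(f"successor({key}) = {successor(key, nodes)}")
# successor(1) = 1
# successor(2) = 3
# successor(6) = 0
```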
14
Node Joins and Departures
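[Figure: identifier circle (identifiers 0-7) showing a node joining with identifier 7 and node 1 departing]
successor(6) = 7 (node 7 joins)
successor(1) = 3 (node 1 leaves)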
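A small sketch of which keys change hands, assuming the starting ring has nodes 0, 1, and 3 with keys 1, 2, and 6, that node 7 joins first, and that node 1 then leaves. The ordering of the two events is an assumption for illustration.

```python
def successor(ident, nodes):
    """First node at or after ident, wrapping to the smallest node."""
    ring = sorted(nodes)
    return next((n for n in ring if n >= ident), ring[0])

def assignment(keys, nodes):
    """Map every key to the node currently responsible for it."""
    return {k: successor(k, nodes) for k in keys}

keys = [1, 2, 6]
before = assignment(keys, [0, 1, 3])
after_join = assignment(keys, [0, 1, 3, 7])   # node 7 joins
after_leave = assignment(keys, [0, 3, 7])     # node 1 leaves

moved_on_join = {k for k in keys if before[k] != after_join[k]}
moved_on_leave = {k for k in keys if after_join[k] != after_leave[k]}
print(moved_on_join, moved_on_leave)  # {6} {1}
```

Only the keys adjacent to the changed node move: key 6 goes from node 0 to node 7, and key 1 goes from node 1 to node 3.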
15
Scalable Key Location
 A very small amount of routing information suffices
to implement consistent hashing in a distributed
environment
 Each node need only be aware of its successor node
on the circle
 Queries for a given identifier can be passed around
the circle via these successor pointers
 Resolution scheme correct, BUT inefficient: it may
require traversing all N nodes!
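A minimal sketch of this successor-only resolution, assuming each node's successor pointer is stored in a plain dictionary (`succ`, a hypothetical stand-in for per-node state). The hop counter shows why the cost is linear in the worst case.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero (a == b: whole circle)

def lookup_linear(start, key, succ):
    """Pass the query around the circle until key falls in (node, successor]."""
    node, hops = start, 0
    while not in_half_open(key, node, succ[node]):
        node = succ[node]
        hops += 1
    return succ[node], hops

succ = {0: 1, 1: 3, 3: 0}  # successor pointers for nodes 0, 1, and 3
print(lookup_linear(0, 6, succ))  # (0, 2)
```

With N nodes the loop can visit every node before it reaches the one that precedes the key.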
16
Acceleration of Lookups
 Lookups are accelerated by maintaining additional
routing information
 Each node maintains a routing table with (at most)
m entries (where N = 2^m), called the finger table
 The i-th entry in the table at node n contains the identity
of the first node, s, that succeeds n by at least 2^{i-1}
on the identifier circle (clarification on next slide)
 s = successor(n + 2^{i-1}) (all arithmetic mod 2^m)
 s is called the i-th finger of node n, denoted by
n.finger(i).node
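The finger definition can be sketched directly from the formula. This version computes successors from a global node list, which a real node would not have; it is only meant to show the arithmetic, using the example ring with nodes 0, 1, and 3 and m = 3.

```python
def successor(ident, nodes, m):
    """First node at or after ident (mod 2^m), wrapping around the circle."""
    ring = sorted(nodes)
    ident %= 2 ** m
    return next((n for n in ring if n >= ident), ring[0])

def finger_table(n, nodes, m):
    """Rows (start, (interval_start, interval_end), successor) for i = 1..m."""
    size = 2 ** m
    rows = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % size   # n + 2^(i-1) mod 2^m
        end = (n + 2 ** i) % size           # start of the next finger
        rows.append((start, (start, end), successor(start, nodes, m)))
    return rows

for row in finger_table(0, [0, 1, 3], m=3):
    print(row)
# (1, (1, 2), 1)
# (2, (2, 4), 3)
# (4, (4, 0), 0)
```

Each interval is half-open, so `(4, 0)` stands for [4,0) and wraps past identifier 7.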
17
Finger Tables (1)
[Figure: identifier circle (m = 3, identifiers 0-7) with nodes 0, 1, and 3; each node is shown with its finger table and keys]

Node 0 (keys: 6)
start  int.   succ.
1      [1,2)  1
2      [2,4)  3
4      [4,0)  0

Node 1 (keys: 1)
start  int.   succ.
2      [2,3)  3
3      [3,5)  3
5      [5,1)  0

Node 3 (keys: 2)
start  int.   succ.
4      [4,5)  0
5      [5,7)  0
7      [7,3)  0
18
Finger Tables (2) - characteristics
 Each node stores information about only a small
number of other nodes, and knows more about
nodes closely following it than about nodes farther
away
 A node's finger table generally does not contain
enough information to determine the successor of an
arbitrary key k
 Repeated queries to nodes that immediately
precede the given key will eventually lead to the key's
successor
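One way to sketch the repeated-query idea: each node forwards the query to the farthest finger that still precedes the key, until the key falls between a node and its successor. The finger tables here are built from global knowledge purely for the demonstration, and all function names are illustrative.

```python
def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero; a == b covers all but a

def build_fingers(nodes, m):
    """Finger successors for every node, computed from the full node list."""
    ring = sorted(nodes)
    def succ(ident):
        ident %= 2 ** m
        return next((n for n in ring if n >= ident), ring[0])
    return {n: [succ(n + 2 ** (i - 1)) for i in range(1, m + 1)] for n in ring}

def find_successor(start, key, fingers):
    """Return (successor of key, number of forwarding hops)."""
    node, hops = start, 0
    # Stop once key is in (node, node.successor]; finger 1 is the successor.
    while not (key == fingers[node][0] or between(key, node, fingers[node][0])):
        # Forward to the farthest finger that still precedes the key.
        node = next(f for f in reversed(fingers[node]) if between(f, node, key))
        hops += 1
    return fingers[node][0], hops

fingers = build_fingers([0, 1, 3], m=3)
print(find_successor(3, 2, fingers))  # (3, 2)
```

Node 3 does not know the successor of key 2 itself; it forwards to node 0, which forwards to node 1, whose successor (node 3) is the answer.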
19
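Node Joins with Finger Tables
[Figure: identifier circle (m = 3, identifiers 0-7); node 6 joins the ring of nodes 0, 1, and 3. Entries that change are marked with their previous value.]

Node 0 (keys: none; key 6 moves to node 6)
start  int.   succ.
1      [1,2)  1
2      [2,4)  3
4      [4,0)  6 (was 0)

Node 1 (keys: 1)
start  int.   succ.
2      [2,3)  3
3      [3,5)  3
5      [5,1)  6 (was 0)

Node 3 (keys: 2)
start  int.   succ.
4      [4,5)  6 (was 0)
5      [5,7)  6 (was 0)
7      [7,3)  0

Node 6 (keys: 6)
start  int.   succ.
7      [7,0)  0
0      [0,2)  0
2      [2,6)  3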
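To see which finger entries a join affects, one can recompute the tables before and after node 6 joins the ring of nodes 0, 1, and 3. This sketch recomputes everything from the full node list; the real protocol updates entries through messages, so treat it as a check on the end state only.

```python
def successor(ident, nodes, m=3):
    """First node at or after ident (mod 2^m), wrapping around the circle."""
    ring = sorted(nodes)
    ident %= 2 ** m
    return next((n for n in ring if n >= ident), ring[0])

def fingers(n, nodes, m=3):
    """Successor column of node n's finger table."""
    return [successor(n + 2 ** (i - 1), nodes, m) for i in range(1, m + 1)]

old_nodes, new_nodes = [0, 1, 3], [0, 1, 3, 6]
for n in old_nodes:
    print(n, fingers(n, old_nodes), "->", fingers(n, new_nodes))
print(6, fingers(6, new_nodes))
# 0 [1, 3, 0] -> [1, 3, 6]
# 1 [3, 3, 0] -> [3, 3, 6]
# 3 [0, 0, 0] -> [6, 6, 0]
# 6 [0, 0, 3]
```

Four existing entries change, all of them fingers whose start falls in the range that node 6 now covers.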
20
Node Departures with Finger Tables
[Figure: identifier circle (identifiers 0-7) with finger tables (start, int., succ., keys) for nodes 0, 1, 3, and 6, showing the entries that are updated after a node departs]
21
Source of Inconsistencies:
Concurrent Operations and Failures
 Basic stabilization protocol is used to keep nodes'
successor pointers up to date, which is sufficient to
guarantee correctness of lookups
 Those successor pointers can then be used to verify
the finger table entries
 Every node runs stabilize periodically to find newly
joined nodes
22
Stabilization after Join
[Figure: nodes np, n, and ns on the ring. Before the join: succ(np) = ns and pred(ns) = np. After stabilization: succ(np) = n and pred(ns) = n. The predecessor of n is initially nil.]
 n joins
   predecessor = nil
   n acquires ns as successor via some existing node n'
   n notifies ns that it is the new predecessor
   ns acquires n as its predecessor
 np runs stabilize
   np asks ns for its predecessor (now n)
   np acquires n as its successor
   np notifies n
   n will acquire np as its predecessor
 all predecessor and successor pointers are now correct
 fingers still need to be fixed, but old fingers will still work
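The join sequence can be sketched with three in-memory objects. The `Node` class, the identifiers 10, 15, and 20, and the direct method calls standing in for network messages are all assumptions for illustration; the lookup that finds ns through some existing node is omitted.

```python
def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, succ):
        """Join with a successor already found via some existing node."""
        self.predecessor = None
        self.successor = succ

    def stabilize(self):
        """Ask the successor for its predecessor and adopt it if it is closer."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

n_p, n_s, n = Node(10), Node(20), Node(15)
n_p.successor, n_s.predecessor = n_s, n_p   # state before the join
n_s.successor, n_p.predecessor = n_p, n_s   # two-node ring closes on itself

n.join(n_s)       # predecessor = nil, successor = ns
n.stabilize()     # n notifies ns; ns acquires n as its predecessor
n_p.stabilize()   # np learns about n from ns, adopts it, and notifies n
print(n_p.successor.id, n.successor.id, n_s.predecessor.id, n.predecessor.id)
# 15 20 15 10
```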
23
Failure Recovery
 Key step in failure recovery is maintaining correct
successor pointers
 To help achieve this, each node maintains a successor-list
of its r nearest successors on the ring
 If node n notices that its successor has failed, it replaces
it with the first live entry in the list
 stabilize will correct finger table entries and successor-list
entries pointing to failed node
 Performance is sensitive to the frequency of node joins
and leaves versus the frequency at which the stabilization
protocol is invoked
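A small sketch of the successor-list fallback, assuming a node can test liveness through a set of live identifiers (`alive`, a hypothetical stand-in for timeouts or pings). The list is built from global knowledge here only to set up the example.

```python
def successor_list(n, ring, r):
    """The r nearest successors of node n (fewer if the ring is small)."""
    ordered = sorted(ring)
    i = ordered.index(n)
    count = min(r, len(ordered) - 1)
    return [ordered[(i + k) % len(ordered)] for k in range(1, count + 1)]

def repair_successor(succ_list, alive):
    """First live entry in the successor list, or None if all entries failed."""
    return next((s for s in succ_list if s in alive), None)

ring = [0, 1, 3, 6]
slist = successor_list(0, ring, r=2)              # [1, 3]
print(repair_successor(slist, alive={0, 3, 6}))   # node 1 failed: 3
print(repair_successor(slist, alive={0, 6}))      # nodes 1 and 3 failed: None
```

The second call shows the role of r: the node loses its place on the ring only if all r listed successors fail before stabilization refreshes the list.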
24
Chord: The Math
 Every node is responsible for about K/N keys (N
nodes, K keys)
 When a node joins or leaves an N-node network,
only O(K/N) keys change hands (and only to and
from joining or leaving node)
 Lookups need O(log N) messages
 To reestablish routing invariants and finger tables
after node joining or leaving, only O(log^2 N)
messages are required
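The key-movement claim can be checked with a small simulation. The ring size, key count, identifier width, and random seed below are arbitrary choices; the count of moved keys varies from run to run with a different seed, so only the structural property (moved keys all go to the joining node) is fixed.

```python
import random
from bisect import bisect_left

def owner(key, ring):
    """Successor of key on a sorted ring of node identifiers."""
    return ring[bisect_left(ring, key) % len(ring)]

rng = random.Random(0)            # fixed seed so the run is repeatable
m, n_nodes, n_keys = 16, 50, 5000
ids = rng.sample(range(2 ** m), n_nodes + 1)
ring, newcomer = sorted(ids[:-1]), ids[-1]
keys = [rng.randrange(2 ** m) for _ in range(n_keys)]

ring_after = sorted(ring + [newcomer])
before = [owner(k, ring) for k in keys]
after = [owner(k, ring_after) for k in keys]
moved = [i for i in range(n_keys) if before[i] != after[i]]

print(f"moved {len(moved)} of {n_keys} keys; K/N is about {n_keys / n_nodes:.0f}")
print("all moved keys now belong to the newcomer:",
      all(after[i] == newcomer for i in moved))
```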
25
Experimental Results
 Latency grows slowly with
the total number of nodes
 Path length for lookups is
about (1/2) log_2 N
 Chord is robust in the face
of multiple node failures
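A rough way to explore the path-length result is to simulate finger-table lookups on a random ring and average the hop count. This is only a sketch: the ring size, identifier width, seed, and trial count are arbitrary, hops are counted as forwarding steps before the final successor is returned, and the finger tables are built from global knowledge rather than by the protocol.

```python
import math
import random
from bisect import bisect_left

def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def build_fingers(ring, m):
    """Finger successors for every node of a sorted ring."""
    size = 2 ** m
    def succ(ident):
        return ring[bisect_left(ring, ident % size) % len(ring)]
    return {n: [succ(n + 2 ** (i - 1)) for i in range(1, m + 1)] for n in ring}

def hops_to_successor(start, key, fingers):
    """Number of forwarding steps until key is in (node, node.successor]."""
    node, hops = start, 0
    while not (key == fingers[node][0] or between(key, node, fingers[node][0])):
        node = next(f for f in reversed(fingers[node]) if between(f, node, key))
        hops += 1
    return hops

rng = random.Random(1)
m, n_nodes, trials = 20, 256, 2000
ring = sorted(rng.sample(range(2 ** m), n_nodes))
fingers = build_fingers(ring, m)

total = sum(hops_to_successor(rng.choice(ring), rng.randrange(2 ** m), fingers)
            for _ in range(trials))
avg = total / trials
print(f"average hops: {avg:.2f}, (1/2) log_2 N = {0.5 * math.log2(n_nodes):.2f}")
```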
