際際滷

際際滷Share a Scribd company logo
13-05-2016
1
Ankita Dubey
13-05-2016
2
Overview
 Introduction
 Framework
 Supporting atomicity
 Concurrency control
 Summarization
 References
2
13-05-2016
3
Distributed Database System
3
 A distributed database system consists of loosely coupled
sites that share no physical component
 Database systems that run on each site are independent
of each other
 Transactions may access data at one or more sites
13-05-2016
4
Introduction
 The management of distributed transactions require
dealing with several problems which are strictly
interconnected, like-
a. Reliability
b. Concurrency control
c. Efficient utilization of the resources of the whole
system.
4
13-05-2016
5
Framework for Transaction
Management
1. Properties of transactions
2. Goals of transaction management
3. Distributed transactions
5
13-05-2016
6
1. Properties of transactions
 Atomicity
 Durability
 Serializability
 Isolation
6
13-05-2016
7
2. Goals of transaction
management
 CPU and main memory utilization
 Control messages
 Response time
 Availability
7
13-05-2016
8
Lets summarize!
 The goal of transaction management in a distributed
database is to control the execution of transactions so
that:
1. Transactions have atomicity, durability, serializability
and isolation properties.
2. Their cost in terms of main memory, CPU and number
of transmitted control messages and their response time
are minimized.
3. The availability of the system is maximized.
8
13-05-2016
9
3. Distributed transactions
 Agent: An agent is a local process which performs
some actions on behalf of an application.
9
13-05-2016
10
Example
10
13-05-2016
11
An updating transaction
11
An updating transaction Updating a master tape is fault
tolerant: If a run fails for any reason, all the tape could be
rewound and the job restarted with no harm done.
13-05-2016
12
Basic Transaction Primitives
12
13-05-2016
13
Transaction execution
13
13-05-2016
14
Nested versus distributed
transaction
14
13-05-2016
15
Flat and nested transactions
15
13-05-2016
16
Flat and nested transactions
 Flat transaction send out requests to different servers
and each request is completed before client goes to the
next one.
 Nested transaction allows sub-transactions at the same
level to execute concurrently.
16
13-05-2016
17
Supporting Atomicity of Distributed
Transactions
 Recovery in centralized databases
 Communication failures in distributed databases.
 Recovery of distributed transactions.
 The 2-Phase-commitment protocol
17
13-05-2016
18
1-phase atomic commit protocol
 A transaction comes to an end when the client requests
that a transaction be committed or aborted.
 Simple way is: coordinator to communicate the commit or
abort request to all of the participants in the transaction
and to keep on repeating the request until all of them have
acknowledged that they had carried it out.
 Inadequate because when the client requests a commit, it
does not allow a server to make a unilateral decision to
abort a transaction. E.g. deadlock avoidance may force a
transaction to abort at a server when locking is used. So any
server may fail or abort and client is not aware.
18
13-05-2016
19
2-phase commit protocol
 Allow any participant to abort its part of a transaction.
Due to atomicity, the whole transaction must also be
aborted.
 In the first phase, each participant votes for the
transaction to be committed or aborted. Once voted to
commit, not allowed to abort it. So before votes to
commit, it must ensure that it will eventually be able
to carry out its part, even if it fails and is replaced.
 A participant is said to be in a prepared state if it will
eventually be able to commit it. So each participant
needs to save the altered objects in the permanent
storage device together with its status prepared.
19
13-05-2016
20
2-phase commit protocol
 In the second phase, every participant in the
transaction carries out the joint decision. If any one
participant votes to abort, the decision must be to
abort. If all the participants vote to commit, then the
decision is to commit the transaction.
 The problem is to ensure that all of the participants
vote and that they all reach the same decision. It is an
example of consensus. It is simple if no error occurs.
However, it should work when servers fail, message
lost or servers are temporarily unable to communicate
with one another.
20
13-05-2016
21
2-phase commit protocol
 If the client requests abort, or if the transaction is
aborted by one of the participants, the coordinator
informs the participants immediately.
 It is when the client asks the coordinator to commit
the transaction that two-phase commit protocol comes
into use.
 In the first phase, the coordinator asks all the
participants if they are prepared to commit; and in the
second, it tells them to commit or abort the
transaction.
21
13-05-2016
22
Operations for 2-phase commit protocol
 canCommit?(trans)-> Yes / No
 Call from coordinator to participant to ask whether it can commit a
transaction. Participant replies with its vote.
 doCommit(trans)
 Call from coordinator to participant to tell participant to commit its
part of a transaction.
 doAbort(trans)
 Call from coordinator to participant to tell participant to abort its
part of a transaction.
 haveCommitted(trans, participant) 
 Call from participant to coordinator to confirm that it has
committed the transaction.
 getDecision(trans) -> Yes / No
 Call from participant to coordinator to ask for the decision on a
transaction after it has voted Yes but has still had no reply after
some delay. Used to recover from server crash or delayed messages.
22
13-05-2016
23
Communication in 2-phase commit protocol
23
13-05-2016
24
Concurrency Control for
Distributed Transactions
 Concurrency control based on locking in centralized
databases.
 Concurrency control based on locking in distributed
databases.
24
13-05-2016
25
Concurrency control in distributed
transactions
 Concurrency control for distributed transactions: each
server applies local concurrency control to its own
objects, which ensure transactions serializability
locally.
 However, the members of a collection of servers of
distributed transactions are jointly responsible for
ensuring that they are performed in a serially
equivalent manner. Thus global serializability is
required.
25
13-05-2016
26
locks
 Lock manager at each server decide whether to grant a
lock or make the requesting transaction wait.
 However, it cannot release any locks until it knows that
the transaction has been committed or aborted at all
the servers involved in the transaction.
 A lock managers in different servers set their locks
independently of one another. It is possible that
different servers may impose different orderings on
transactions.
26
13-05-2016
27
Timestamp ordering concurrency control
 In a single server transaction, the coordinator issues a
unique timestamp to each transaction when it starts. Serial
equivalence is enforced by committing the versions of
objects in the order of the timestamps of transactions that
accessed them.
 In distributed transactions, we require that each
coordinator issue globally unique time stamps. The
coordinators must agree as to the ordering of their
timestamps. , the agreed ordering of pairs of timestamps is
based on a comparison in which the server-id is less
significant.
 The timestamp is passed to each server whose objects
perform an operation in the transaction.
27
13-05-2016
28
Timestamp ordering concurrency
control
 To achieve the same ordering at all the servers, The
servers of distributed transactions are jointly
responsible for ensuring that they are performed in a
serially equivalent manner. E.g. If T commits after U at
server X, T must commits after U at server Y.
 Conflicts are resolved as each operation is performed.
If the resolution of a conflict requires a transaction to
be aborted, the coordinator will be informed and it
will abort the transaction at all the participants.
28
13-05-2016
29
locking
29
13-05-2016
30
Distributed deadlock
 Deadlocks can arise within a single server when
locking is used for concurrency control. Servers must
either prevent or detect and resolve deadlocks.
 Using timeout to resolve deadlock is a clumsy
approach. Why? Another way is to detect deadlock by
detecting cycles in a wait for graph.
30
13-05-2016
31
Summarization
 Distributed transaction managers must ensure that all transactions have the
atomicity, durability, seriability and isolation properties. In most systems, this is
obtained by implementing on top of existing local transaction managers the 2-
phasecommitment protocol for reliability,2-phase-locking for concurrency
control, and timeouts for deadlock detection.
 The 2-phase-commitment protocol ensures that the subtransactions of the same
transaction will either all commit or all abort, in spite of the possible failures. 2-
phasecommitment is resilient to any failure in which no log information is lost.
The 2- phase-locking mechanism requires that all subtransactions acquire locks
in the growing phase and release locks in the shrinking phase. Timeout
mechanisms for deadlock detection simply abort those transactions which are in
wait, possibly for a deadlock.
 Several computation and communication structures are possible for distributed
transaction managers. The computation can use processes permanently assigned
to transactions, or servers dynamically bound to them. Processes can have a
centralized structure, in which one agent activates all other agents, or a
hierarchical structure, in which each agent can in turn activate other agents. The
communication can use sessions or datagrams. The communication structure of
the commitment protocol can be centralized, hierarchical, linear, or distributed.
31
13-05-2016
32
References
 Distributed Database Principles and Systems by
Stefano Ceri and Giuseppe Pelagatti.
32
13-05-2016
33
33
Any questions?

More Related Content

Management of Distributed Transactions

  • 2. 13-05-2016 2 Overview Introduction Framework Supporting atomicity Concurrency control Summarization References 2
  • 3. 13-05-2016 3 Distributed Database System 3 A distributed database system consists of loosely coupled sites that share no physical component Database systems that run on each site are independent of each other Transactions may access data at one or more sites
  • 4. 13-05-2016 4 Introduction The management of distributed transactions require dealing with several problems which are strictly interconnected, like- a. Reliability b. Concurrency control c. Efficient utilization of the resources of the whole system. 4
  • 5. 13-05-2016 5 Framework for Transaction Management 1. Properties of transactions 2. Goals of transaction management 3. Distributed transactions 5
  • 6. 13-05-2016 6 1. Properties of transactions Atomicity Durability Serializability Isolation 6
  • 7. 13-05-2016 7 2. Goals of transaction management CPU and main memory utilization Control messages Response time Availability 7
  • 8. 13-05-2016 8 Lets summarize! The goal of transaction management in a distributed database is to control the execution of transactions so that: 1. Transactions have atomicity, durability, serializability and isolation properties. 2. Their cost in terms of main memory, CPU and number of transmitted control messages and their response time are minimized. 3. The availability of the system is maximized. 8
  • 9. 13-05-2016 9 3. Distributed transactions Agent: An agent is a local process which performs some actions on behalf of an application. 9
  • 11. 13-05-2016 11 An updating transaction 11 An updating transaction Updating a master tape is fault tolerant: If a run fails for any reason, all the tape could be rewound and the job restarted with no harm done.
  • 15. 13-05-2016 15 Flat and nested transactions 15
  • 16. 13-05-2016 16 Flat and nested transactions Flat transaction send out requests to different servers and each request is completed before client goes to the next one. Nested transaction allows sub-transactions at the same level to execute concurrently. 16
  • 17. 13-05-2016 17 Supporting Atomicity of Distributed Transactions Recovery in centralized databases Communication failures in distributed databases. Recovery of distributed transactions. The 2-Phase-commitment protocol 17
  • 18. 13-05-2016 18 1-phase atomic commit protocol A transaction comes to an end when the client requests that a transaction be committed or aborted. Simple way is: coordinator to communicate the commit or abort request to all of the participants in the transaction and to keep on repeating the request until all of them have acknowledged that they had carried it out. Inadequate because when the client requests a commit, it does not allow a server to make a unilateral decision to abort a transaction. E.g. deadlock avoidance may force a transaction to abort at a server when locking is used. So any server may fail or abort and client is not aware. 18
  • 19. 13-05-2016 19 2-phase commit protocol Allow any participant to abort its part of a transaction. Due to atomicity, the whole transaction must also be aborted. In the first phase, each participant votes for the transaction to be committed or aborted. Once voted to commit, not allowed to abort it. So before votes to commit, it must ensure that it will eventually be able to carry out its part, even if it fails and is replaced. A participant is said to be in a prepared state if it will eventually be able to commit it. So each participant needs to save the altered objects in the permanent storage device together with its status prepared. 19
  • 20. 13-05-2016 20 2-phase commit protocol In the second phase, every participant in the transaction carries out the joint decision. If any one participant votes to abort, the decision must be to abort. If all the participants vote to commit, then the decision is to commit the transaction. The problem is to ensure that all of the participants vote and that they all reach the same decision. It is an example of consensus. It is simple if no error occurs. However, it should work when servers fail, message lost or servers are temporarily unable to communicate with one another. 20
  • 21. 13-05-2016 21 2-phase commit protocol If the client requests abort, or if the transaction is aborted by one of the participants, the coordinator informs the participants immediately. It is when the client asks the coordinator to commit the transaction that two-phase commit protocol comes into use. In the first phase, the coordinator asks all the participants if they are prepared to commit; and in the second, it tells them to commit or abort the transaction. 21
  • 22. 13-05-2016 22 Operations for 2-phase commit protocol canCommit?(trans)-> Yes / No Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote. doCommit(trans) Call from coordinator to participant to tell participant to commit its part of a transaction. doAbort(trans) Call from coordinator to participant to tell participant to abort its part of a transaction. haveCommitted(trans, participant) Call from participant to coordinator to confirm that it has committed the transaction. getDecision(trans) -> Yes / No Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages. 22
  • 24. 13-05-2016 24 Concurrency Control for Distributed Transactions Concurrency control based on locking in centralized databases. Concurrency control based on locking in distributed databases. 24
  • 25. 13-05-2016 25 Concurrency control in distributed transactions Concurrency control for distributed transactions: each server applies local concurrency control to its own objects, which ensure transactions serializability locally. However, the members of a collection of servers of distributed transactions are jointly responsible for ensuring that they are performed in a serially equivalent manner. Thus global serializability is required. 25
  • 26. 13-05-2016 26 locks Lock manager at each server decide whether to grant a lock or make the requesting transaction wait. However, it cannot release any locks until it knows that the transaction has been committed or aborted at all the servers involved in the transaction. A lock managers in different servers set their locks independently of one another. It is possible that different servers may impose different orderings on transactions. 26
  • 27. 13-05-2016 27 Timestamp ordering concurrency control In a single server transaction, the coordinator issues a unique timestamp to each transaction when it starts. Serial equivalence is enforced by committing the versions of objects in the order of the timestamps of transactions that accessed them. In distributed transactions, we require that each coordinator issue globally unique time stamps. The coordinators must agree as to the ordering of their timestamps. , the agreed ordering of pairs of timestamps is based on a comparison in which the server-id is less significant. The timestamp is passed to each server whose objects perform an operation in the transaction. 27
  • 28. 13-05-2016 28 Timestamp ordering concurrency control To achieve the same ordering at all the servers, The servers of distributed transactions are jointly responsible for ensuring that they are performed in a serially equivalent manner. E.g. If T commits after U at server X, T must commits after U at server Y. Conflicts are resolved as each operation is performed. If the resolution of a conflict requires a transaction to be aborted, the coordinator will be informed and it will abort the transaction at all the participants. 28
  • 30. 13-05-2016 30 Distributed deadlock Deadlocks can arise within a single server when locking is used for concurrency control. Servers must either prevent or detect and resolve deadlocks. Using timeout to resolve deadlock is a clumsy approach. Why? Another way is to detect deadlock by detecting cycles in a wait for graph. 30
  • 31. 13-05-2016 31 Summarization Distributed transaction managers must ensure that all transactions have the atomicity, durability, seriability and isolation properties. In most systems, this is obtained by implementing on top of existing local transaction managers the 2- phasecommitment protocol for reliability,2-phase-locking for concurrency control, and timeouts for deadlock detection. The 2-phase-commitment protocol ensures that the subtransactions of the same transaction will either all commit or all abort, in spite of the possible failures. 2- phasecommitment is resilient to any failure in which no log information is lost. The 2- phase-locking mechanism requires that all subtransactions acquire locks in the growing phase and release locks in the shrinking phase. Timeout mechanisms for deadlock detection simply abort those transactions which are in wait, possibly for a deadlock. Several computation and communication structures are possible for distributed transaction managers. The computation can use processes permanently assigned to transactions, or servers dynamically bound to them. Processes can have a centralized structure, in which one agent activates all other agents, or a hierarchical structure, in which each agent can in turn activate other agents. The communication can use sessions or datagrams. The communication structure of the commitment protocol can be centralized, hierarchical, linear, or distributed. 31
  • 32. 13-05-2016 32 References Distributed Database Principles and Systems by Stefano Ceri and Giuseppe Pelagatti. 32