CS542: Topics in
Distributed Systems
Diganta Goswami
Distributed System
 A collection of independent computers
that appears to its users as a single
coherent system.
 Autonomous computers
 Many components, connected by a network,
sharing resources.
Distributed System
 A System of networked components that
communicate and coordinate their actions only by
passing messages
 concurrent execution of programs
 no global clock
 components fail independently of one another
Another definition
 You know you have a distributed system when
the crash of a computer you've never heard of
stops you from getting any work done.
 inter-dependencies
 shared state
 independent failure of components
A working definition for us
A distributed system is a collection of entities, each
of which is autonomous, programmable,
asynchronous and failure-prone, and which
communicate through an unreliable communication
medium using message passing.
 Entity=a process on a device (PC, PDA)
 Communication Medium=Wired or wireless network
 Our interest in distributed systems involves
 design and implementation, maintenance, algorithmics
Important Distributed Systems Issues
 No global clock: no single global notion of the correct
time (asynchrony)
 Unpredictable failures of components: lack of
response may be due to either failure of a network
component, network path being down, or a computer
crash (failure-prone, unreliable)
 Highly variable bandwidth: from 16Kbps (slow
modems or Google Balloon) to Gbps (Internet2) to
Tbps (in between DCs of same big company)
 Possibly large and variable latency: few ms to
several seconds
 Large numbers of hosts: 2 to several million
There is a range of interesting problems for
Distributed System designers


 Real distributed systems
 Cloud computing, peer-to-peer systems, Hadoop, distributed file
systems, sensor networks, graph processing, ...
 Classical Problems
 Failure detection, Asynchrony, Snapshots, Multicast, Consensus,
Mutual Exclusion, Election, ...
 Concurrency
 RPCs, Concurrency Control, Replication Control, ...
 Security
 Byzantine Faults, ...
 Others
Typical Distributed Systems Design Goals
 Common Goals:
 Heterogeneity: can the system handle a large variety of
types of PCs and devices?
 Robustness: is the system resilient to host crashes
and failures, and to the network dropping messages?
 Availability: are data and services always there for clients?
 Transparency: can the system hide its internal
workings from the users?
 Concurrency: can the server handle multiple clients
simultaneously?
 Efficiency: is the service fast enough? Does it utilize
100% of all resources?
 Scalability: can it handle 100 million nodes without
degrading service? (nodes = clients and/or servers)
 Security: can the system withstand hacker attacks?
 Openness: is the system extensible?
Challenges and Goals of Distributed Systems
 Heterogeneity
 Openness
 Security
 Scalability
 Failure handling
 Concurrency
 Transparency
Challenges
 Heterogeneity (variety and difference) of the underlying
network infrastructure
 The Internet consists of many different sorts of networks;
their differences are masked by the fact that all of the
computers attached to them use the Internet protocols for
communication.
 e.g., a computer attached to an Ethernet has an implementation of the
Internet protocols over the Ethernet, whereas a computer on a different sort
of network will need an implementation of the Internet protocols for that
network.
Heterogeneity
 Computer hardware and software
 e.g., operating systems: compare UNIX socket and Winsock
calls
 Programming languages: in particular, data
representations
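Differences in data representation can be illustrated with byte order. The sketch below is only an assumption-laden example (the helper names `encode_u32` and `decode_u32` are hypothetical): two hosts lay out the same 32-bit integer differently in memory, so both sides agree on network (big-endian) byte order for anything sent on the wire.

```python
import struct


def encode_u32(value: int) -> bytes:
    """Marshal an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def decode_u32(data: bytes) -> int:
    """Unmarshal an unsigned 32-bit integer that was sent in network byte order."""
    return struct.unpack("!I", data)[0]


# The same value as it would be laid out in memory on two kinds of host.
little_endian_layout = struct.pack("<I", 1)
big_endian_layout = struct.pack(">I", 1)

# Reading big-endian bytes as if they were little-endian gives a wrong value,
# which is why an agreed wire format is needed.
misread = struct.unpack("<I", big_endian_layout)[0]
```

Middleware typically hides this marshalling step from the application programmer.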
Some approaches: Middleware
 A software layer that provides a programming
abstraction as well as masking the heterogeneity
of the underlying networks, hardware, operating systems and
programming languages.
 Middleware (e.g., CORBA, Java RMI): transparency of network,
hardware, software and programming-language
heterogeneity.
 In addition to solving the problems of heterogeneity,
middleware provides a uniform computational model for
use by the programmers of servers and distributed
applications.
Positioning Middleware
[Figure 1-22: General structure of a distributed system as middleware.]
Openness
 The characteristic that determines whether the system
can be extended and re-implemented in various
ways.
 Determined primarily by the degree to which new resource-
sharing services can be added and made available for use
by a variety of client programs.
 Cannot be achieved unless the specification and
documentation of the key software interfaces are made available to
software developers (i.e., key interfaces are published)
Openness
 Designers of the Internet protocols
introduced a series of documents called RFCs
 Specifications of the Internet communication
protocols
 Specifications for applications run over them
  e.g., email, telnet, file transfer, etc. (by the mid-1980s)
 RFCs are not the only means --- e.g. CORBA is
published through a series of documents, including a
complete specification of the interfaces of its services
(www.omg.org)
Openness
 Offering services according to standard rules that
describe the syntax and semantics of those
services
 e.g., Network protocol rules (RFCs)
 Services specified through interfaces
 Interface definition languages (IDLs)
 specify the names of the available functions as well as their
parameters, return values, exceptions, etc.
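As a rough illustration of what an interface definition captures, the sketch below uses Python's `typing.Protocol` in place of a real IDL. The `FileService` interface, its operations and the `FileNotFound` exception are hypothetical names chosen for this example; they are not taken from CORBA or any other standard.

```python
from typing import Protocol


class FileNotFound(Exception):
    """Hypothetical exception declared as part of the published interface."""


class FileService(Protocol):
    """Published interface: names, parameters, return values, exceptions."""

    def read(self, name: str, offset: int, length: int) -> bytes:
        """Return up to `length` bytes; raises FileNotFound."""
        ...

    def write(self, name: str, offset: int, data: bytes) -> int:
        """Store `data` at `offset`; returns the number of bytes written."""
        ...


class InMemoryFileService:
    """One possible implementation; clients depend only on FileService."""

    def __init__(self) -> None:
        self._files = {}

    def read(self, name: str, offset: int, length: int) -> bytes:
        if name not in self._files:
            raise FileNotFound(name)
        return bytes(self._files[name][offset:offset + length])

    def write(self, name: str, offset: int, data: bytes) -> int:
        buf = self._files.setdefault(name, bytearray())
        if len(buf) < offset:
            buf.extend(b"\x00" * (offset - len(buf)))  # pad any gap with zeros
        buf[offset:offset + len(data)] = data
        return len(data)


def copy_prefix(service: FileService, src: str, dst: str, length: int) -> int:
    """Client code written against the interface, not the implementation."""
    return service.write(dst, 0, service.read(src, 0, length))
```

Because `copy_prefix` relies only on the published interface, a different implementation could be substituted without changing the client, which is the point of openness.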
Security
 Distributed systems must protect the shared
information and resources
 The openness of DS makes them vulnerable to
security threats
 Providing security is a significant challenge for
DS
Security
Privacy / Confidentiality: protection against
disclosure to unauthorized individuals
Integrity: protection against alteration or corruption
Availability: protection against interference with the
means to access the resources
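Of the three properties above, integrity is the easiest to show in a few lines. The following is a minimal sketch, assuming the two parties already share a secret key (how they obtain it is out of scope); `protect` and `verify` are hypothetical helper names. It uses an HMAC so that any alteration of the message is detected. It does not provide confidentiality.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # length in bytes of an HMAC-SHA256 tag


def protect(key: bytes, message: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can detect alteration."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag


def verify(key: bytes, packet: bytes) -> bytes:
    """Return the message if its tag is valid, otherwise raise ValueError."""
    message, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("integrity check failed")
    return message
```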
Scalability
 Scalable system: a system that can handle additional
users/resources without suffering a
noticeable loss of performance
 Three metrics of a scalable system
 Number of users/resources
 Distance between the farthest nodes in the system (network radius)
 Number of organizations exerting control over the pieces of the
system
Challenges in designing scalable DS
 Controlling the cost of physical resources:
 As the demand for a resource grows, it should be
possible to extend the system, at reasonable cost,
to meet it.
  e.g., it must be possible to add server computers to avoid
the performance bottleneck that would arise if a single file
server had to handle all file access requests as the frequency
of file access requests grows in an intranet with the
increase in users and computers.
www.amazon.com is more than one computer
Challenges in designing scalable DS
 Controlling the performance loss:
 Management of a set of data whose size is
proportional to the number of users or resources in
the system
  e.g., the Domain Name System holds the table with the
correspondence between domain names of computers
and their Internet addresses
  Hierarchic structures scale better than linear structures.
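The benefit of a hierarchic structure can be sketched with a toy name tree. This is not how DNS is implemented; it is a simplified illustration with a hypothetical `Zone` class, in which each zone knows only its direct children, so a lookup touches one zone per label instead of scanning one table that grows with the whole system.

```python
class Zone:
    """A node in a toy name hierarchy; each zone stores only its children."""

    def __init__(self):
        self.children = {}   # label -> Zone
        self.address = None  # set only on nodes that name a host

    def register(self, name, address):
        """Insert a dotted name, creating intermediate zones as needed."""
        node = self
        for label in reversed(name.split(".")):
            node = node.children.setdefault(label, Zone())
        node.address = address

    def resolve(self, name):
        """Return (address, zones_visited); address is None if unknown."""
        node = self
        visited = 0
        for label in reversed(name.split(".")):
            node = node.children.get(label)
            visited += 1
            if node is None:
                return None, visited
        return node.address, visited


root = Zone()
root.register("www.example.org", "192.0.2.10")
root.register("mail.example.org", "192.0.2.20")
```

The work per lookup depends on the depth of the name, not on the total number of registered hosts, and each zone can be managed by a different organization.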
Scaling Techniques
[Figure 1.5: An example of dividing the DNS name space into zones.]
Challenges in designing scalable DS
 Preventing software resources from running out:
 Numbers used as Internet addresses: 32 bits were chosen
in the late 1970s, but that address space is running out.
 Change from 32 bits to 128 bits?
 It is difficult to predict the demand.
 Over-compensating for future growth may be worse than
adapting to a change when we are forced to: larger Internet
addresses occupy extra space in messages and in computer
storage.
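The trade-off can be made concrete with a little arithmetic. The sketch below uses Python's standard `ipaddress` module; the two addresses come from documentation ranges and the helper names are hypothetical. It compares the size of each address space with the space a source and destination address pair occupies in every packet.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # 32-bit address (documentation range)
v6 = ipaddress.ip_address("2001:db8::1")  # 128-bit address (documentation range)


def address_space(bits: int) -> int:
    """Number of distinct addresses representable in `bits` bits."""
    return 2 ** bits


def per_packet_address_bytes(addr) -> int:
    """Bytes used by one source plus one destination address."""
    return 2 * len(addr.packed)


v4_space = address_space(v4.max_prefixlen)  # 2**32
v6_space = address_space(v6.max_prefixlen)  # 2**128
```

Going from 32 to 128 bits multiplies the address space by 2**96 while quadrupling the address bytes carried in each message.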
Failure Handling
 Failure in a DS is partial
 Some components fail while others continue to
function
 This makes handling of failures difficult.
Techniques for dealing with
failures
 Detecting failures
 Detecting some failures may be impossible: has the remote
site crashed, or is a message merely delayed?
 Some failures can be detected.
 e.g., checksums can be used to detect corrupted data
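A checksum-based check might look like the following sketch, which appends a CRC-32 (from Python's `zlib` module) to each payload. The framing format and the names `frame` and `unframe` are assumptions made for this example. A CRC detects accidental corruption; it offers no protection against deliberate tampering.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(data: bytes) -> bytes:
    """Return the payload, or raise ValueError if the CRC does not match."""
    payload, received = data[:-4], int.from_bytes(data[-4:], "big")
    if zlib.crc32(payload) != received:
        raise ValueError("checksum mismatch: data corrupted in transit")
    return payload


def flip_bit(data: bytes, index: int) -> bytes:
    """Simulate corruption by flipping the lowest bit of one byte."""
    corrupted = bytearray(data)
    corrupted[index] ^= 0x01
    return bytes(corrupted)
```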
Techniques for dealing with
failures
 Masking failure
 Some failures can be hidden or made less severe
 e.g., retransmission when messages fail to arrive
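Retransmission can be sketched with a simulated channel. Everything here is hypothetical: `LossyChannel` stands in for a network that silently drops the first few messages, and `send` returning `True` stands in for receiving an acknowledgement. The sender simply retries until it gets one, so the caller never sees the loss.

```python
class LossyChannel:
    """Hypothetical channel that silently drops the first `drops` messages."""

    def __init__(self, drops: int):
        self.drops = drops
        self.delivered = []

    def send(self, message) -> bool:
        """Return True if the message was delivered and acknowledged."""
        if self.drops > 0:
            self.drops -= 1
            return False  # message lost, no acknowledgement arrives
        self.delivered.append(message)
        return True


def send_reliably(channel, message, max_attempts: int = 5) -> int:
    """Retransmit until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")
```

A real protocol would also have to discard duplicates at the receiver when only the acknowledgement is lost; this sketch leaves that out.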
Techniques for dealing with
failures
 Tolerating failures
 It would not be practical to detect and hide all failures;
a system can be designed to tolerate some of them
 e.g., timeouts when waiting for a web resource: clients give
up after a predetermined number of attempts, take other
actions and inform the user.
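Tolerating a failure, as opposed to masking it, means the client gives up and degrades gracefully. In the sketch below, `fetch` is a hypothetical callable that raises `TimeoutError` when the server does not answer in time, and the cached copy is the assumed "other action"; the returned notice is what would be shown to the user.

```python
def fetch_or_degrade(fetch, cached=None, max_attempts: int = 3):
    """Try `fetch` a bounded number of times, then fall back and say so.

    Returns a (value, notice) pair; `notice` tells the user what happened.
    """
    for _ in range(max_attempts):
        try:
            return fetch(), "fresh"
        except TimeoutError:
            continue  # the attempt timed out; try again
    if cached is not None:
        return cached, f"server unreachable after {max_attempts} attempts; showing cached copy"
    return None, f"server unreachable after {max_attempts} attempts"
```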
Failure Handling
 Recovery from failures
 Rollback
 Undo/Redo in transactions
 Redundancy
 Makes the system more available through replication of
resources/data
 Redundant routes in the network
 Replication of name tables in multiple domain name servers
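Rollback is often implemented with an undo log. The following is a minimal single-process sketch, assuming the state is a plain dictionary; the `Transaction` class is hypothetical and ignores durability and concurrency. Every write records the previous value so that a failed transaction can restore the state it started from.

```python
class Transaction:
    """Toy transaction over a dict, using an undo log for rollback."""

    def __init__(self, store: dict):
        self.store = store
        self.undo_log = []  # entries are (key, existed_before, old_value)

    def write(self, key, value):
        """Record the old value, then apply the update."""
        self.undo_log.append((key, key in self.store, self.store.get(key)))
        self.store[key] = value

    def rollback(self):
        """Undo the writes in reverse order."""
        while self.undo_log:
            key, existed, old_value = self.undo_log.pop()
            if existed:
                self.store[key] = old_value
            else:
                self.store.pop(key, None)

    def commit(self):
        """Make the updates permanent by discarding the undo information."""
        self.undo_log.clear()
```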
Concurrency
 In a distributed system it is possible that
multiple machines/processes/users may try to
access shared data/resource concurrently
 This can lead to incorrect results and/or deadlocks
 The operations must be synchronized/serialized so
that the end result is correct
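Within a single process, the need for serialization can be sketched with threads and a lock. The `Counter` class and `run_workers` helper are hypothetical; the same idea applies to processes on different machines, where a distributed mutual-exclusion protocol takes the place of the local lock.

```python
import threading


class Counter:
    """Shared resource whose read-modify-write must not interleave."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same `current`
        # and one of the two updates would be lost.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_workers(counter, threads: int = 4, per_thread: int = 1000) -> int:
    """Start `threads` workers that each increment `per_thread` times."""
    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value
```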
Transparency
 Concealing the heterogeneous and
distributed nature of the system so that it
appears to the user like one system
 Making the user believe that there is only a
single, undivided system i.e., to hide the notion
of distribution completely
 What are the challenges of transparency?
Transparency Categories
 Access transparency - access local and remote
resources using identical operations
 e.g., users of UNIX NFS can use the same commands
and parameters for file system operations regardless of
whether the accessed files are on a local or remote disk.
Transparency categories
 Location Transparency: Access without
knowledge of location of a resource
 e.g., URLs, email addresses (hostname, IP addresses, etc.
not required --- the part of the URL that identifies a web
server domain name refers to a computer name in a
domain, rather than to an Internet address)
Transparency Categories
 Concurrency transparency: Allow several
processes to operate concurrently using shared
resources in a consistent fashion w/o interference
between them.
 That is, users and programmers are unaware that
components request services concurrently.
 Replication transparency
 Use replicated resource as if there was just one
instance.
  Increases reliability and performance without users or
application programmers needing to know about the replicas.
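Replication transparency can be sketched as a facade that looks like a single store. The `Replica` and `ReplicatedStore` classes below are hypothetical, and the sketch deliberately ignores the hard part, namely bringing a recovered replica back up to date; it only shows that the client uses one `put`/`get` interface and is unaware of how many copies exist.

```python
class Replica:
    """Hypothetical storage node that can be switched off to simulate a crash."""

    def __init__(self):
        self.data = {}
        self.up = True

    def put(self, key, value):
        if not self.up:
            raise ConnectionError("replica down")
        self.data[key] = value

    def get(self, key):
        if not self.up:
            raise ConnectionError("replica down")
        return self.data[key]


class ReplicatedStore:
    """Looks like one store; writes go to every replica, reads to any live one."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def put(self, key, value):
        acks = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                acks += 1
            except ConnectionError:
                pass  # skip replicas that are currently down
        if acks == 0:
            raise ConnectionError("no replica available")

    def get(self, key):
        for replica in self.replicas:
            try:
                return replica.get(key)
            except ConnectionError:
                continue  # try the next replica
        raise ConnectionError("no replica available")
```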
Failure transparency
 Enables the concealment of faults, allowing
users and application programs to complete
their task despite failures of h/w or s/w
components.
 e.g., retransmission of email messages: they are eventually
delivered even when servers or communication links fail,
although it may take several days.
Failure transparency
 Failure transparency depends on concurrency
and replication transparency.
 Replication can be employed to achieve failure
transparency
 Message transmission governed by TCP is a
mechanism for providing failure transparency
Mobility Transparency
 Mobility transparency: allow resources to move
around w/o affecting the operation of users or
programs
 e.g., a 700 phone number; URLs, by contrast, are not mobility-transparent, because
someone's personal web page cannot move to their new
place of work in a different domain: all of the links in other
pages will still point to the original page!
Transparency Categories
 Performance transparency: adaptation of the
system to varying load situations without the user
noticing it.
 Scaling transparency: allow system and
applications to expand without need to change
structure or application algorithms
Degree of transparency
 There are systems in which attempting to blindly hide
all distribution aspects from users is not always a
good idea
 e.g., requesting that your electronic newspaper appear in your mailbox before 7 am
local time while you are at the other end of the world, living in a
different time zone
 (Your morning paper will not be the morning paper you are used to)
Degree of transparency
 There is a trade-off between a high degree of
transparency and the performance of a system
 Masking a transient server failure by retransmitting the request
may slow down the system
 If several replicas must be kept consistent all the time,
a single update may take a long time,
something that cannot be hidden from the user.

Intorduction Distributed and Parallel Computing.ppt

  • 1. CS542: Topics in Distributed Systems Diganta Goswami
  • 2. Distributed System A collection of independent computers that appears to its users as a single coherent system. Autonomous computers Many components connected by a network sharing resources.
  • 3. Distributed System A System of networked components that communicate and coordinate their actions only by passing messages concurrent execution of programs no global clock components fail independently of one another
  • 4. Another definition You know you have a distributed system when the crash of a computer youve never heard of stops you from getting any work done. inter-dependencies shared state independent failure of components
  • 5. A working definition for us A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium using message passing. Entity=a process on a device (PC, PDA) Communication Medium=Wired or wireless network Our interest in distributed systems involves design and implementation, maintenance, algorithmics
  • 6. Important Distributed Systems Issues No global clock: no single global notion of the correct time (asynchrony) Unpredictable failures of components: lack of response may be due to either failure of a network component, network path being down, or a computer crash (failure-prone, unreliable) Highly variable bandwidth: from 16Kbps (slow modems or Google Balloon) to Gbps (Internet2) to Tbps (in between DCs of same big company) Possibly large and variable latency: few ms to several seconds Large numbers of hosts: 2 to several million
  • 7. There are a range of interesting problems for Distributed System designers Real distributed systems Cloud Computing, Peer to peer systems, Hadoop, distributed file systems, sensor networks, graph processing, Classical Problems Failure detection, Asynchrony, Snapshots, Multicast, Consensus, Mutual Exclusion, Election, Concurrency RPCs, Concurrency Control, Replication Control, Security Byzantine Faults, Others
  • 8. Typical Distributed Systems Design Goals Common Goals: Heterogeneity can the system handle a large variety of types of PCs and devices? Robustness is the system resilient to host crashes and failures, and to the network dropping messages? Availability are data+services always there for clients? Transparency can the system hide its internal workings from the users? Concurrency can the server handle multiple clients simultaneously? Efficiency is the service fast enough? Does it utilize 100% of all resources? Scalability can it handle 100 million nodes without degrading service? (nodes=clients and/or servers) Security can the system withstand hacker attacks? Openness is the system extensible?
  • 9. Challenges and Goals of Distributed Systems Heterogeneity Openness Security Scalability Failure handling Concurrency Transparency
  • 10. Challenges Heterogeneity (variety and difference) ofunderlying network infrastructure, Internet consists of many different sorts of network their differences are masked by the fact that all of the computers attached to them use the Internet Protocols for communication. e.g. a computer attached to an Ethernet has an implementation of the Internet Protocols over the Ethernet, whereas a computer on a different sort of network will need an implementation of the Internet Protocols for that network.
  • 11. Heterogeneity Computer hardware and software e.g., operating systems, compare UNIX socket and Winsock calls Programming languages : in particular, data representations
  • 12. Some approaches: Middleware A S/W layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, H/W, O/S and programming languages. Middleware (e.g., CORBA): transparency of network, hard- and software and programming language heterogeneity. JAVA RMI In addition to solving the problems of heterogeneity, middleware provides a uniform computational model for use by the programmers of servers and distributed applications.
  • 13. Positioning Middleware General structure of a distributed system as middleware. 1-22
  • 14. Openness Characteristic that determine whether the system can be extended and re-implemented in various ways. Determined primarily by the degree to which new resource sharing services can be added and be made available for use by a variety of client programs. Cannot be achieved unless the specification and documentation of the key s/w interfaces are made available to s/w developers (i.e. key interfaces are published)
  • 15. Openness Designers of the Internet protocols introduced a series of documents called RFCs Specifications of the Internet communication protocols Specifications for applications run over them 損 e.g., email, telnet, file transfer, etc. (by the mid 80s) RFCs are not the only means --- e.g. CORBA is published through a series of documents, including a complete specification of the interfaces of its services (www.omg.org)
  • 16. Openness Offering services according to standard rules that describe the syntax and semantics of those services e.g., Network protocol rules (RFCs) Services specified through interfaces Interface definition languages (IDLs) specifies names and available functions as well as parameters, return values, exceptions etc.
  • 17. Security Distributed systems must protect the shared information and resources The openness of DS makes them vulnerable to security threats Providing security is a significant challenge for DS
  • 18. Security. Privacy / Confidentiality: protection against disclosure to unauthorized individuals Integrity: protection against alteration or corruption Availability: protection against interference with the means to access the resources
  • 19. Scalability. Scalable system: a system that can handle additional users/resources without suffering a noticeable loss of performance. Three metrics of a scalable system: the number of users/resources; the distance between the farthest nodes in the system (network radius); the number of organizations exerting control over the pieces of the system.
  • 20. Challenges in designing scalable DS. Controlling the cost of physical resources: as the demand for a resource grows, it should be possible to extend the system, at reasonable cost, to meet it. E.g., as users and computers are added to an intranet and file access requests become more frequent, it must be possible to add server computers to avoid the performance bottleneck that would arise if a single file server had to handle every request. www.amazon.com is more than one computer.
  • 21. Challenges in designing scalable DS. Controlling the performance loss: management of a set of data whose size is proportional to the number of users or resources in the system. E.g., the Domain Name System holds the table of correspondences between the domain names of computers and their Internet addresses. Hierarchic structures scale better than linear structures.
  • 22. Scaling Techniques. (Figure 1.5: An example of dividing the DNS name space into zones.)
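A rough sketch of why a hierarchic name space scales: each zone only needs to know its own children, so a lookup touches one zone per label instead of scanning one flat table. The zone data, host names and addresses below are made up for illustration (documentation-range addresses), and real DNS resolution involves much more than this.

```python
# Hypothetical zone data: each nested dict is a zone that knows only its children.
ROOT = {
    "org": {"example": {"www": "192.0.2.10", "mail": "192.0.2.20"}},
    "net": {"example": {"ftp": "192.0.2.30"}},
}


def resolve(name, root=ROOT):
    """Walk the labels right to left, descending one zone per step."""
    node = root
    for label in reversed(name.split(".")):
        if not isinstance(node, dict) or label not in node:
            raise KeyError(name)
        node = node[label]
    if isinstance(node, dict):
        raise KeyError(name)  # the name refers to a zone, not a host
    return node
```

For instance, `resolve("www.example.org")` visits the `org` zone, then `example`, then finds the host `www`; no zone ever holds the full table.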
  • 23. Challenges in designing scalable DS. Preventing S/W resources from running out: the numbers used as Internet addresses. 32 bits were chosen in the late 70s but may run out soon. Change from 32 bits to 128 bits? It is difficult to predict the demand. Over-compensating for future growth may be worse than adapting to a change when we are forced to: larger Internet addresses occupy extra space in messages and in computer storage.
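The trade-off can be made concrete with a little arithmetic. The sketch below uses Python's standard `ipaddress` module; the two sample addresses are documentation addresses chosen here for illustration and do not come from the slides.

```python
import ipaddress

IPV4_BITS = 32
IPV6_BITS = 128

# Number of distinct addresses each width can express.
ipv4_space = 2 ** IPV4_BITS
ipv6_space = 2 ** IPV6_BITS

# Bytes each address occupies when carried in a message or stored.
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

print(ipv4_space)                      # 4294967296
print(len(v4.packed), len(v6.packed))  # 4 16
```

Going from 32 to 128 bits multiplies the address space by 2^96, at the cost of four times as many bytes per address in every message and table that stores one.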
  • 24. Failure Handling. Failure in a DS is partial: some components fail while others continue to function. This makes handling of failures difficult.
  • 25. Techniques for dealing with failures. Detecting failures: this may be impossible (has the remote site crashed, or is the message merely delayed?). Some failures can be detected, e.g., checksums can be used to detect corrupted data.
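As a minimal sketch of checksum-based detection, the sender below appends a CRC-32 of the payload and the receiver recomputes it. The helper names `frame` and `check` are hypothetical, and CRC-32 is just one convenient checksum available in Python's standard `zlib` module.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(message: bytes) -> bool:
    """Recompute the CRC-32 and compare it with the received one."""
    payload, received = message[:-4], message[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received


msg = frame(b"hello")
# Simulate corruption in transit by flipping one bit of the first byte.
corrupted = bytes([msg[0] ^ 0x01]) + msg[1:]

print(check(msg))        # True
print(check(corrupted))  # False
```

A checksum detects corruption of data that did arrive; it says nothing about whether a silent remote site has crashed or is merely slow.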
  • 26. Techniques for dealing with failures. Masking failures: some failures can be hidden or made less severe, e.g., retransmission when messages fail to arrive.
  • 27. Techniques for dealing with failures. Tolerating failures: it would not be practical to detect and hide all failures, but a system can be designed to tolerate some of them. E.g., timeouts when waiting for a web resource: clients give up after a predetermined number of attempts, take other actions and inform the user.
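Retransmission (masking) and giving up after a fixed number of attempts (tolerating) can be sketched together. Everything here is simulated: `send` is a hypothetical callable that either returns a reply or raises `TimeoutError`, and `make_flaky_sender` stands in for an unreliable channel.

```python
def request_with_retries(send, max_attempts=3):
    """Call send() until it replies or max_attempts is reached.

    Returns (reply, attempts_used); raises TimeoutError after giving up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(), attempt
        except TimeoutError:
            continue  # mask the lost message by retransmitting
    # Tolerate the failure: stop trying and let the caller inform the user.
    raise TimeoutError(f"gave up after {max_attempts} attempts")


def make_flaky_sender(failures):
    """Simulated channel: times out `failures` times, then replies."""
    state = {"remaining": failures}

    def send():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TimeoutError("no reply")
        return "OK"

    return send
```

With two simulated losses, `request_with_retries(make_flaky_sender(2))` succeeds on the third attempt; with more losses than attempts, the caller sees the `TimeoutError` and can take other action.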
  • 28. Failure Handling. Recovery from failures: rollback, undo/redo in transactions. Redundancy: makes the system more available through replication of resources/data, e.g., redundant routes in the network, replication of name tables in multiple domain name servers.
  • 29. Concurrency. In a distributed system, multiple machines/processes/users may try to access shared data/resources concurrently. This can lead to incorrect results and/or deadlocks. The operations must be synchronized/serialized so that the end result is correct.
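A single-machine sketch of serializing access to shared data, using threads and a lock from Python's standard `threading` module. In a real distributed system the same goal needs distributed mutual exclusion or transactions; `SharedCounter` and `run_workers` are hypothetical names used only for this illustration.

```python
import threading


class SharedCounter:
    """Shared resource whose updates are serialized by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below must not interleave between threads.
        with self._lock:
            self.value += 1


def run_workers(workers=4, increments=1000):
    """Start `workers` threads that each increment the counter `increments` times."""
    counter = SharedCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final value is always `workers * increments`; without it, concurrent read-modify-write steps could overwrite one another and lose updates.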
  • 30. Transparency. Concealing the heterogeneous and distributed nature of the system so that it appears to the user like one system. Making the user believe that there is only a single, undivided system, i.e., hiding the notion of distribution completely. What are the challenges of transparency?
  • 31. Transparency Categories. Access transparency: access local and remote resources using identical operations, e.g., users of UNIX NFS can use the same commands and parameters for file system operations regardless of whether the accessed files are on a local or remote disk.
  • 32. Transparency Categories. Location transparency: access without knowledge of the location of a resource, e.g., URLs and email addresses (hostnames, IP addresses, etc. are not required: the part of the URL that identifies a web server's domain name refers to a computer name in a domain, rather than to an Internet address).
  • 33. Transparency Categories. Concurrency transparency: allow several processes to operate concurrently using shared resources in a consistent fashion w/o interference between them. That is, users and programmers are unaware that components request services concurrently. Replication transparency: use a replicated resource as if there were just one instance, increasing reliability and performance w/o knowledge of the replicas by users or application programmers.
  • 34. Failure transparency. Enables the concealment of faults, allowing users and application programs to complete their tasks despite failures of H/W or S/W components. E.g., email is retransmitted and eventually delivered even when servers or communication links fail, although it may take several days.
  • 35. Failure transparency. Failure transparency depends on concurrency and replication transparency. Replication can be employed to achieve failure transparency. Message transmission governed by TCP is a mechanism for providing failure transparency.
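How replication can hide a failure from the client can be sketched with an in-memory store. The classes `Replica` and `ReplicatedStore` are hypothetical, failures are simulated with an `up` flag, and the sketch ignores the consistency problem of a replica that missed writes while it was down.

```python
class Replica:
    """Simulated replica; setting `up = False` models a crash."""

    def __init__(self):
        self.data = {}
        self.up = True

    def put(self, key, value):
        if not self.up:
            raise ConnectionError("replica down")
        self.data[key] = value

    def get(self, key):
        if not self.up:
            raise ConnectionError("replica down")
        return self.data[key]


class ReplicatedStore:
    """Clients use put/get as if there were a single instance."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def put(self, key, value):
        written = 0
        for replica in self._replicas:
            try:
                replica.put(key, value)
                written += 1
            except ConnectionError:
                pass  # skip crashed replicas
        if written == 0:
            raise ConnectionError("no replica available")

    def get(self, key):
        for replica in self._replicas:
            try:
                return replica.get(key)
            except ConnectionError:
                continue  # fail over to the next replica
        raise ConnectionError("no replica available")
```

A client that wrote a value before one replica crashed still reads it back through `ReplicatedStore.get`, which is the failure being concealed; only when every replica is down does the failure become visible.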
  • 36. Mobility Transparency. Mobility transparency: allow resources to move around w/o affecting the operation of users or programs, e.g., a 700 phone number. URLs are not mobility-transparent: someone's personal web page cannot move to their new place of work in a different domain, because all of the links in other pages will still point to the original page!
  • 37. Transparency Categories. Performance transparency: adaptation of the system to varying load situations without the user noticing it. Scaling transparency: allow the system and applications to expand without the need to change the system structure or the application algorithms.
  • 38. Degree of transparency. Attempting to blindly hide all distribution aspects from users is not always a good idea. E.g., requesting that your electronic newspaper arrive in your mailbox before 7 am local time while you are at the other end of the world, in a different time zone (your morning paper will not be the morning paper you are used to).
  • 39. Degree of transparency. There is a trade-off between a high degree of transparency and the performance of a system. Masking a transient server failure by retransmitting the request may slow down the system. If several replicas must be kept consistent all the time, a single update may take a long time, something that cannot be hidden from the user.

Editor's Notes

  • #5: Designers and programmers are interested in: algorithmics, maintenance. (Informal definition, for us programmers of distributed systems.) OK: peer-to-peer systems. Contradiction: a computer without ROM or disk drives that needs to boot over the network. Is a collection of these computers a distributed system?