INTRODUCTION TO DISTRIBUTED SYSTEMS
GBETNKOM NJIFON Jeff, M.Sc. Computer Science
ADAMOU HAMZA, PhD Computer Science
INFO 4218 — Faculté des Sciences - UYI
Course Presentation I
General objectives
• Understand design goals and challenges associated with building distributed
applications.
• Analyse different communication mechanisms and their suitability for various
scenarios.
• Explore techniques for coordinating processes and maintaining consistency in a
distributed environment.
• Identify and understand the purpose of naming services in distributed systems.
INFO 4218 - Distributed Computing UYI 2/30
Course Presentation II
Pre-requisites
• Operating systems and networking fundamentals
• System Programming (Shell, C, C++)
• Virtualisation (Virtualbox, Docker)
Requirements
• Commitment
• A working laptop with virtualisation enabled
• Internet access
Lecture objectives
1. Define: distributed system, networked system, decentralized system.
2. Recall the key design goals for distributed systems.
3. Identify common pitfalls associated with designing and deploying distributed
systems.
4. Describe the role of networked systems in enabling distributed functionalities.
5. Analyse the trade-offs between different design goals for distributed systems.
6. Explain the potential consequences of each pitfall associated with distributed
systems.
Lecture Plan
Generalities
Design goals
Pitfalls
Assignment: interprocess communication
Generalities
Networked systems I
Basics
• Communication between devices
• What could be a device?
Properties
• Topologies, architectures
• LAN, MAN, WAN
• Wired vs Wireless
Networked systems II
Needs before distributed systems
Integrative view
• Several networked computer systems, each providing different services
• Giving access to entities that were not originally considered
• Example: in organisations
Expansive view
• Existing system in need of additional computers
• Expanding a system with computers to hold resources close to where those
resources are needed
Definitions
Centralized
• Processes and resources are concentrated at a single point
Decentralized
• Processes and resources are necessarily spread across multiple computers
• "Necessarily" means the spread is required
Distributed
• Processes and resources are sufficiently spread across multiple computers
• "Sufficiently" means the spread is just enough for the system's needs
Design goals
Resource sharing
Principle
• Easy access to remote resources
• Peripherals, storage facilities
• Data, files, services, and networks
Motivations
• Economic
• Exchange of information
• Collaboration
Transparency I
Principle
• A distributed system hides the fact that its processes and resources are
physically distributed across multiple computers, possibly separated by large
distances.
Transparency II
Transparency III
Types of transparency (type: description)
• Access: hide differences in data representation and how an object is accessed
• Location: hide where an object is located
• Relocation: hide that an object may be moved to another location while in use
• Migration: hide that an object may move to another location
• Replication: hide that an object is replicated
• Concurrency: hide that an object may be shared by several independent users
• Failure: hide the failure and recovery of an object
Openness I
Principle
• An open distributed system is essentially a system that offers components that
can easily be used by, or integrated into, other systems
Components
• Services
• Interface Definition Language for syntax and semantics
• Completeness and neutrality of specifications
Openness II
Interoperability
• Two implementations of systems or components can co-exist and work together
• They do so by merely relying on each other’s services
Portability
• A and B are two distributed systems
• An application developed for A runs on B
• The application needs no modification to run on B
Openness III
Extensibility
• It should be easy to add parts that run on a different operating system
• It should even be possible to replace an entire file system
Dependability I
Principle
• The degree to which a computer system can be relied upon to operate as expected
Fault tolerance
• Availability: a system is ready to be used immediately
• Reliability: a system can run continuously without failure
• Safety: no catastrophic event happens
• Maintainability: how easily a failed system can be repaired
Dependability II
Metrics
• Mean Time To Failure (MTTF): The average time until a component fails
• Mean Time To Repair (MTTR): The average time needed to repair a component
• Mean Time Between Failures (MTBF): Simply MTTF + MTTR.
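The three metrics above can be combined in a few lines of C. This is only a sketch: the function names `mtbf` and `availability` are hypothetical, and the availability formula MTTF / (MTTF + MTTR) is the usual long-run definition, which the slide itself does not state. All values are assumed to use the same time unit.

```c
/* MTBF as defined on the slide: MTTF + MTTR (same time unit for both). */
double mtbf(double mttf, double mttr)
{
    return mttf + mttr;
}

/* Long-run availability (assumed definition, not on the slide):
 * the fraction of time the component is operational. */
double availability(double mttf, double mttr)
{
    return mttf / (mttf + mttr);
}
```

For example, a component that runs 999 hours on average before failing and takes 1 hour to repair has an MTBF of 1000 hours and is available about 99.9% of the time.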
Vocabulary
• Failures
• Errors
• Faults: transient, intermittent, permanent
Security I
Principle
• Confidentiality
• Integrity
• Authorization
• Authentication
• Trust
Key elements
Security II
• Encryption
• Decryption
Scalability I
Principle
Measurement
• Size: we can easily add more users and resources to the system without any
noticeable loss of performance
• Geographical: the users and resources may lie far apart, but the fact that
communication delays may be significant is hardly noticed
• Administrative: the system can still be easily managed even if it spans many
independent administrative organizations
Scalability II
Methods
• Scaling up: more capacity by replacing a machine with a more powerful one
• Scaling out: more capacity by adding more machines
Techniques
• Hiding communication latencies
• Partitioning and distribution
• Replication
• Caching
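As an illustration of partitioning and distribution, the sketch below assigns each key to one of several nodes by hashing it. It is one possible approach, not the method prescribed by the course: the helper names `fnv1a` and `partition_for` are hypothetical, and the FNV-1a hash is used here only because it is short and well known.

```c
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated string. */
uint32_t fnv1a(const char *key)
{
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 16777619u;                       /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: index of the node responsible for `key`.
 * node_count must be greater than zero. */
unsigned partition_for(const char *key, unsigned node_count)
{
    return fnv1a(key) % node_count;
}
```

The same key always maps to the same node, so any client can locate data without a central directory. A known limitation of plain modulo placement is that changing `node_count` moves most keys to a different node.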
Pitfalls
False assumptions
Network state
• The network is reliable
• The network is secure
• The network is homogeneous
Network properties
• Latency is zero
• Topology does not change
• Bandwidth is infinite
• Transport cost is zero
• There is one administrator
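To see why assuming zero latency and infinite bandwidth is misleading, consider a deliberately simplified cost model. The function name `estimated_transfer_seconds` and the model itself (round trips multiplied by round-trip time, plus payload size divided by bandwidth) are assumptions made for illustration; real networks add queueing, loss and protocol overhead.

```c
/* Back-of-the-envelope estimate (assumed model, not from the slides):
 * time = round_trips * round-trip time + bytes / bandwidth.
 * rtt_seconds: round-trip time in seconds
 * bandwidth_bytes_per_s: must be greater than zero */
double estimated_transfer_seconds(double rtt_seconds, unsigned round_trips,
                                  double bytes, double bandwidth_bytes_per_s)
{
    return rtt_seconds * round_trips + bytes / bandwidth_bytes_per_s;
}
```

With a 100 ms round-trip time, 10 request/reply exchanges and 1 MB sent over a 1 MB/s link, the estimate is about 2 seconds, half of which is pure latency that a "latency is zero" design would never account for.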
Assignment: interprocess communication
Using the command line...
1. Write a C program that takes as input an integer n and creates n processes.
2. Each process has a number (from 1 to n)
3. All processes should display their identity (PID) and their parent’s identity (PPID)
4. In addition to the previous requirement, even-numbered processes should
initialise a large array and create p threads, each counting the number of
occurrences of each value of the array.
5. The generated array should be written to a file.
6. The work of each thread should be saved in a distinct file that specifies the
portion of the file the thread was working on.
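Steps 1 to 3 can be approached along the following lines. This is a minimal sketch rather than a reference solution: the function name `spawn_children` is hypothetical, it assumes all n processes are direct children of the same parent, and it leaves out the array, thread and file parts of the assignment.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks n children from the calling process and
 * returns how many of them exited normally with status 0. */
int spawn_children(int n)
{
    int created = 0;
    for (int i = 1; i <= n; i++) {
        fflush(stdout);          /* avoid duplicating buffered output in the child */
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;               /* stop creating, but still reap existing children */
        }
        if (pid == 0) {
            /* Child: print its number, PID and PPID, then exit so that it
             * does not continue the loop and fork further processes. */
            printf("process %d: pid=%ld ppid=%ld\n",
                   i, (long)getpid(), (long)getppid());
            fflush(stdout);
            _exit(EXIT_SUCCESS);
        }
        created++;               /* parent: one more child to wait for */
    }

    int ok = 0;
    for (int k = 0; k < created; k++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

A `main` function would typically read n from `argv[1]` (for example with `strtol`) and call `spawn_children(n)`. Waiting for every child in the parent prevents zombie processes and keeps the PPID printed by each child equal to the parent's PID.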
Useful resources
• https://www.geeksforgeeks.org/fork-system-call/
• https://www.geeksforgeeks.org/basics-file-handling-c/
• https://www.geeksforgeeks.org/multithreading-in-c/