Lecture objectives
1. Define: distributed system, networked system, decentralized system.
2. Recall the key design goals for distributed systems.
3. Identify common pitfalls associated with designing and deploying distributed
systems.
4. Describe the role of networked systems in enabling distributed functionalities.
5. Analyse the trade-offs between different design goals for distributed systems.
6. Explain the potential consequences of each pitfall associated with distributed
systems.
2. Course Presentation I
General objectives
• Understand design goals and challenges associated with building distributed
applications.
• Analyse different communication mechanisms and their suitability for various
scenarios.
• Explore techniques for coordinating processes and maintaining consistency in a
distributed environment.
• Identify and understand the purpose of naming services in distributed systems.
INFO 4218 - Distributed Computing UYI 2/30
3. Course Presentation II
Pre-requisites
• Operating systems and networking fundamentals
• System Programming (Shell, C, C++)
• Virtualisation (Virtualbox, Docker)
Requirements
• Commitment
• A working laptop with virtualisation enabled
• Internet access
7. Generalities
Networked systems I
Basics
• Communication between devices
• What counts as a device?
Properties
• Topologies, architectures
• LAN, MAN, WAN
• Wired vs Wireless
9. Generalities
Needs before distributed systems
Integrative view
• Several networked computer systems providing different services
• Giving access to entities that were not originally anticipated
• Example: in organisations
Expansive view
• Existing system in need of additional computers
• Expanding a system with computers to hold resources close to where those
resources are needed
10. Generalities
Definitions
Centralized
• Single point of processes and resources
Decentralized
• Processes and resources are necessarily spread across multiple computers
• Necessary means required
Distributed
• Processes and resources are sufficiently spread across multiple computers
• Sufficient means enough
12. Design goals
Resource sharing
Principle
• Easy access to remote resources
• Peripherals, storage facilities
• Data, files, services, and networks
Motivations
• Economic
• Exchange of information
• Collaboration
13. Design goals
Transparency I
Principle
• Hide the fact that the system's processes and resources are physically distributed
across multiple computers, possibly separated by large distances.
15. Design goals
Transparency III
Type of transparency   Description
Access                 Hide differences in data representation and how an object is accessed
Location               Hide where an object is located
Relocation             Hide that an object may be moved to another location while in use
Migration              Hide that an object may move to another location
Replication            Hide that an object is replicated
Concurrency            Hide that an object may be shared by several independent users
Failure                Hide the failure and recovery of an object
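Location and migration transparency can be illustrated with a small Python sketch in which a client reaches a resource by its logical name only. The `NameService` class, the `send` helper, the name `printer`, and the addresses are all hypothetical and exist only for this illustration; a real system would rely on an actual naming service.

```python
class NameService:
    """Hypothetical registry mapping logical names to current addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, name, address):
        # Registering an existing name again models a resource that has moved.
        self._addresses[name] = address

    def resolve(self, name):
        # Raises KeyError if the name was never registered.
        return self._addresses[name]


def send(names, name, message):
    # The caller supplies only the logical name; the address is looked up here.
    address = names.resolve(name)
    return f"{message} -> {address}"


names = NameService()
names.register("printer", "10.0.0.5:9100")
first = send(names, "printer", "job-1")

# The resource moves; the client call stays exactly the same.
names.register("printer", "10.0.1.7:9100")
second = send(names, "printer", "job-2")
```

The client never handles an address directly, so moving the resource requires no change on the client side.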
16. Design goals
Openness I
Principle
• An open distributed system is essentially a system that offers components that
can easily be used by, or integrated into other systems
Components
• Services
• Interface Definition Language for syntax and semantics
• Completeness and neutrality of specifications
17. Design goals
Openness II
Interoperability
• Two implementations of systems or components can co-exist and work together
• They do so by merely relying on each other’s services
Portability
• A and B are two distributed systems
• An app for A runs on B
• No modifications should be applied for the app to run
19. Design goals
Dependability I
Principle
• The degree to which a computer system can be relied upon to operate as expected
Fault tolerance
• Availability: a system is ready to be used immediately
• Reliability: a system can run continuously without failure
• Safety: no catastrophic event happens
• Maintainability: how easily a failed system can be repaired
20. Design goals
Dependability II
Metrics
• Mean Time To Failure (MTTF): The average time until a component fails
• Mean Time To Repair (MTTR): The average time needed to repair a component
• Mean Time Between Failures (MTBF): Simply MTTF + MTTR.
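As a worked example of the metrics above, the following Python sketch computes MTBF from MTTF and MTTR. It also computes steady-state availability as MTTF / (MTTF + MTTR); that formula is a standard definition and is not stated on the slide. The function names and the sample figures (990 hours and 10 hours) are assumptions chosen for illustration.

```python
def mtbf(mttf_hours, mttr_hours):
    # Mean Time Between Failures, as defined above: MTTF + MTTR.
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    # Fraction of time the component is operational (standard definition,
    # assumed here): time running divided by the full failure/repair cycle.
    return mttf_hours / (mttf_hours + mttr_hours)


# Illustrative figures only: runs 990 hours on average, takes 10 hours to repair.
cycle = mtbf(990, 10)
uptime_fraction = availability(990, 10)
```

With these sample figures the full cycle is 1000 hours and the component is operational about 99% of the time.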
Vocabulary
• Failures
• Errors
• Faults: transient, intermittent, permanent
21. Design goals
Security I
Principle
• Confidentiality
• Integrity
• Authorization
• Authentication
• Trust
Key elements
23. Design goals
Scalability I
Principle
Measurement
• Size: we can easily add more users and resources to the system without any
noticeable loss of performance
• Geographical: the users and resources may lie far apart, but the fact that
communication delays may be significant is hardly noticed
• Administrative: can still be easily managed even if it spans many independent
administrative organizations
24. Design goals
Scalability II
Methods
• Scaling up: more capacity by replacement
• Scaling out: more devices
Techniques
• Hiding communication latencies
• Partitioning and distribution
• Replication
• Caching
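Partitioning and distribution, one of the techniques listed above, can be sketched in Python as hash-based partitioning of keys across nodes. This is only one possible scheme; the function names, the use of SHA-256, and the sample keys are assumptions made for illustration.

```python
import hashlib


def partition_for(key, num_nodes):
    # A stable hash is used so that every process maps a key to the same node
    # (Python's built-in hash() is randomised per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribute(keys, num_nodes):
    # Group keys by the node responsible for them.
    shards = {node: [] for node in range(num_nodes)}
    for key in keys:
        shards[partition_for(key, num_nodes)].append(key)
    return shards


# Hypothetical keys spread over four nodes.
shards = distribute(["alice", "bob", "carol", "dave", "erin"], 4)
```

A known limitation of this simple modulo scheme is that changing `num_nodes` reassigns most keys, which is why production systems often prefer schemes such as consistent hashing.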
26. Pitfalls
False assumptions
Network state
• The network is reliable
• The network is secure
• The network is homogeneous
Network properties
• Latency is zero
• Topology is static
• Bandwidth is infinite
• Transport cost is zero
• There is one administrator
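To show what dropping the "reliable network" assumption looks like in code, here is a minimal Python sketch of retrying a remote call with exponential backoff. `NetworkError`, `call_with_retries`, and `flaky_call` are hypothetical names, and the failure is simulated rather than coming from a real network.

```python
import time


class NetworkError(Exception):
    """Hypothetical error standing in for a lost message or a timeout."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    # Try the operation up to max_attempts times, doubling the wait each time.
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except NetworkError:
            if attempt == max_attempts:
                raise  # Give up and let the caller handle the failure.
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated remote call that fails twice before succeeding.
attempts_seen = []


def flaky_call():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise NetworkError("message lost")
    return "ok"


result = call_with_retries(flaky_call, max_attempts=5, base_delay=0.01)
```

Retrying is only safe when the operation is idempotent, since a request may have been executed even though its reply was lost.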
28. Assignment: interprocess communication
Using the command line...
1. Write a C program that takes as input an integer n and creates n processes.
2. Each process has a number (from 1 to n).
3. Every process should display its identity (PID) and its parent’s identity (PPID).
4. In addition, each even-numbered process should initialise a large array and
create p threads, each counting the number of occurrences of each value in the
array.
5. The generated array should be written to a file.
6. The work of each thread should be saved in a distinct file that specifies the
portion of the file the thread was working on.
29. Assignment: interprocess communication
Useful resources
• https://www.geeksforgeeks.org/fork-system-call/
• https://www.geeksforgeeks.org/basics-file-handling-c/
• https://www.geeksforgeeks.org/multithreading-in-c/