1. A Seminar Presentation on FAULT TOLERANCE IN DISTRIBUTED SYSTEM. Coordinator: Mr. Jitendra Yadav, Lecturer. Submitted By: Pankaj Mehra, Final Yr. I.T.
2. Distributed System: A distributed system consists of autonomous computing modules that interact with each other using messages. Physical separation and the use of heterogeneous computers complicate interprocessor communication, management of resources, synchronization of cooperating activities, and maintenance of consistency among multiple copies of information.
3. Why Distributed System: A distributed system is a program that runs on several processing units at the same time. This partitioning across several processors and hosts may be necessary because of the following reasons: processing throughput; CPU specialization; fault tolerance; repartition of the application across various sites.
4. Distributed System: A distributed system is modeled as a graph with directed edges. Vertices are called processes. Directed edges are called communication channels (or simply channels), as shown in the figure. Examples of systems that have fault-tolerant distributed implementations are databases, operating systems, communication buses, file systems, and server groups.
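The graph model can be sketched in a few lines of Python. This is a minimal illustration and not part of the presentation: the class name `DistributedSystem`, its methods, and the process names `p1` to `p3` are assumptions made for the example, and each channel is assumed to deliver messages in FIFO order.

```python
from collections import deque


class DistributedSystem:
    """Directed graph: vertices are processes, directed edges are channels."""

    def __init__(self):
        self.processes = set()
        # channels[(src, dst)] holds the messages in transit from src to dst
        self.channels = {}

    def add_channel(self, src, dst):
        # Adding a directed edge also registers both endpoints as processes.
        self.processes.update((src, dst))
        self.channels[(src, dst)] = deque()

    def send(self, src, dst, msg):
        # A process can only send along an existing directed channel.
        if (src, dst) not in self.channels:
            raise ValueError(f"no channel from {src} to {dst}")
        self.channels[(src, dst)].append(msg)

    def receive(self, src, dst):
        # Return the oldest message on the channel, or None if it is empty.
        queue = self.channels[(src, dst)]
        return queue.popleft() if queue else None


system = DistributedSystem()
system.add_channel("p1", "p2")
system.add_channel("p2", "p3")
system.send("p1", "p2", "hello")
received = system.receive("p1", "p2")
```

Because the edges are directed, `p2` cannot reply to `p1` here unless a separate channel `("p2", "p1")` is added.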
7. Definition of Faults: A fault is a violation of a system’s underlying assumptions. An error is an internal data state that reflects a fault. A failure is an externally visible deviation from specifications.
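A small hypothetical example may help keep the three terms apart. Everything in it is invented for illustration: the `Averager` class, its methods, the assumption that every reading is non-negative, and the specification that the reported average is therefore non-negative.

```python
class Averager:
    """Assumes every reading is non-negative (the underlying assumption)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def record(self, reading):
        # A negative reading violates the assumption: that is the fault.
        # Storing it leaves a wrong internal state: that is the error.
        self.total += reading
        self.count += 1

    def report(self):
        # Once the wrong state reaches the caller, the deviation from the
        # specification is externally visible: that is the failure.
        return self.total / self.count


sensor = Averager()
sensor.record(10.0)
sensor.record(-40.0)             # fault: reading outside the assumed range
internal_total = sensor.total    # error: internal state reflects the fault
average = sensor.report()        # failure: a negative average is reported
```

A fault-tolerant variant would detect the bad reading in `record` and reject or correct it, so that the fault never turns into an error or a failure.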
8. Experienced Faults in Distributed Systems: data corruption; hanging processes; misleading return values; misbehaving machines; hardware, software, and network outages; overcommitment of resources; insufficient disk space.
14. REPLICATION OF DATA: The main goal of data replication in distributed systems is to maintain copies of data on multiple computers. The main benefits can be classified as follows: performance enhancement (data closer to the client, shared workload) and reliability enhancement (increased availability, increased fault tolerance). The constraints are: how to keep data consistent (clients need a satisfactorily consistent image); where to place replicas and how updates are propagated; scalability.
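The trade-off between availability and consistency can be seen in a minimal sketch. The presentation does not name a specific replication protocol. The code assumes a naive scheme chosen only for illustration: write to every live replica, read from the first live one. The names `Replica` and `ReplicatedStore` are hypothetical.

```python
class Replica:
    """One copy of the data, held on one (simulated) computer."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Naive scheme: write to all live replicas, read from the first live one."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("no live replica")
        for replica in live:
            replica.data[key] = value

    def read(self, key):
        for replica in self.replicas:
            if replica.alive:
                return replica.data[key]
        raise RuntimeError("no live replica")


store = ReplicatedStore([Replica("a"), Replica("b"), Replica("c")])
store.write("x", 1)
store.replicas[0].alive = False      # simulate a crash of replica "a"
value_after_crash = store.read("x")  # still served by replica "b"
store.write("x", 2)                  # replica "a" misses this update
store.replicas[0].alive = True       # "a" recovers with a stale copy
stale_value = store.read("x")        # read is served by "a" again
```

The read after the crash succeeds, which is the availability and fault-tolerance benefit. The read after recovery returns the old value, because nothing propagated the missed update to replica `a`. That is the consistency and update-propagation constraint listed on the slide.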