I cannot cover Distributed Systems in 30 minutes!

But I can tell you why you might want to learn Distributed Systems in 30 minutes!
        http://www.flickr.com/photos/uwehermann/82753155/sizes/m/in/photostream/ and
            http://www.flickr.com/photos/peterpearson/5921765552, licensed under CC
What is a Distributed System?
"A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed."
-- Leslie Lamport
What is a Distributed System?




“A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.” - [Coulouris]

“A distributed system is a collection of independent computers that appear to the users of the system as a single coherent system.” - [Tanenbaum]
Characteristics and Challenges
• No Global Clock
• Communication only by Message Passing
• No Global State
• Independent Failures
• Fault Tolerance
• Scale
• Transparency




                                             Photo by John Trainor on Flickr
                         http://www.flickr.com/photos/trainor/2902023575/, Licensed u
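With no global clock, nodes can still agree on an order of events by stamping messages with logical time. Below is a minimal sketch of a Lamport-style logical clock, in the spirit of the "Time, Clocks, and the Ordering of Events" paper listed later in this deck. The class and method names are illustrative assumptions, not taken from any library.

```python
class LamportClock:
    """Logical clock: orders events without any shared physical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new time."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending counts as an event)."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp, then count the receive event."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

Each node keeps its own clock, and the only coordination is the timestamp carried on each message, which matches the "communication only by message passing" characteristic.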
Fallacies of Distributed Systems




•   The network is reliable.
•   Latency is zero.
•   Bandwidth is infinite.
•   The network is secure.
•   Topology doesn't change.
•   There is one administrator.
•   Transport cost is zero.
•   The network is homogeneous.
                      http://www.flickr.com/photos/12587661@N06/2300406685, @Michael Gwyther-Jones, L
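One practical consequence of the first two fallacies is that remote calls need timeouts and retries. The sketch below retries a failing call with exponential backoff. The function name, the default values, and the injectable `sleep` parameter are assumptions made for illustration.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on ConnectionError or TimeoutError, back off and retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # wait base_delay, then 2x, then 4x, ... before the next try
            sleep(base_delay * (2 ** attempt))
```

Retrying is only safe when the operation is idempotent, since a lost reply does not mean the request was lost.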
Why Distributed Systems
•   Need to build bigger systems
•   Many use cases are inherently distributed
•   To avoid failures
•   Omnipresence
    –   If you buy food from a supermarket
    –   If you buy a book from a bookshop chain
    –   If you search the Web
    –   If you use a GPS navigator
    –   If you turn on your My 10 list
    –   If you pay a bill
    –   If you use your mobile app
A System Use Case Classification
• Processing data (moving vs. stored data)
• Servers: receive, process, and respond
• Running user-provided jobs
• Data storage and provenance

                          http://www.flickr.com/photos/kelsea-groves/5535666329/
Use case: Processing Data: React to Sensors




 • Many sensors: weather, travel, traffic, surveillance, stock exchange, smart grid, production line
 • Monitor, understand, and react to events
 • Usually handled with CEP (e.g. Esper, StreamBase, Siddhi) or stream processing (S4, Twitter Storm)
http://www.flickr.com/photos/imuttoo/4257813689/ by Ian Muttoo, http://www.flickr.com/photos/eastcapital/4554220770/, http://www.flickr.com/photos/patdavid/4619331472/ by Pat David, copyright CC
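To make "monitor, understand, and react to events" concrete, here is a minimal sketch of a sliding-window rule of the kind a CEP engine evaluates. It is plain Python and does not use the Esper, StreamBase, or Siddhi APIs. The class name, window size, and threshold are assumptions.

```python
from collections import deque


class SlidingWindowAlert:
    """Fire when the average of the last `size` readings exceeds `threshold`."""

    def __init__(self, size, threshold):
        self.window = deque(maxlen=size)  # old readings drop off automatically
        self.threshold = threshold

    def on_event(self, reading):
        """Feed one sensor reading; return True if the alert condition holds."""
        self.window.append(reading)
        window_full = len(self.window) == self.window.maxlen
        average = sum(self.window) / len(self.window)
        return window_full and average > self.threshold
```

For example, with a window of 3 and a threshold of 30, the readings 20, 25, 40, 45, 50 raise the alert only on the last two events.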
Use case: Processing Data: Target Marketing

• Receive data about users continuously: e.g. web clicks, what they bought, what they like and do not like, what their friends like and bought
• Build models and index information in the background
• Send each user the advertisements that best match their preferences
   – have to do this quickly
   – in a few (say 50) milliseconds
• Could be the next billion-dollar problem
Use case: Receive, Process, and Respond: Online Store (e.g. Amazon)
• Many sellers selling many items, and many buyers
• List of all items, with their specs
• Index items by many dimensions and support search
• Support checkout, delivery tracking, returns, ratings, and complaints
• Supported by partitioning sellers/items across many nodes
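Partitioning sellers or items across nodes can be as simple as hashing a key to a node index. The sketch below assumes string keys and a fixed node count. The function names are illustrative, and a real store would also need rebalancing when nodes are added or removed.

```python
import hashlib


def node_for(key, node_count):
    """Map a seller or item key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(keys, node_count):
    """Group keys by the node that owns them."""
    nodes = {i: [] for i in range(node_count)}
    for key in keys:
        nodes[node_for(key, node_count)].append(key)
    return nodes
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings and would send the same key to different nodes on different machines.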
Use case: Running User-Provided Jobs: SETI@Home
• Many people volunteer their computing power
• Scientists submit computing jobs to the system
• Broker and match resources with jobs, run them, and return results. Handle failures. Avoid free riding.
• Considered the biggest computer on Earth (505 TFLOPS, 150k active computers)
            http://www.elfwood.com/~axthony/Staring-Aliens.2552052.html, Licensed CC
Use case: Data Storage and Provenance (Sky Server)
• Telescopes (Square Kilometer Array) keep collecting data from the sky (terabytes per day)
• Sky Server lets scientists see the sky at a given location, as seen at a given time
• Moving data takes a long time. Moving 1 TB takes about
   – 100 Mbps network: 30 hrs
   – 1 Gbps network: 3 hrs
   – 10 Gbps network: 20 minutes
• Given a data item, we need to track how it was created, equipment accuracy, transformations used, etc.
http://www.fotopedia.com/items/flickr-518876976 and http://www.geograph.org.uk/photo/103069, Licensed CC
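The transfer times on this slide can be checked with a few lines of arithmetic. The sketch below assumes a decimal terabyte (10^12 bytes). The `efficiency` parameter is my own assumption, added to model protocol overhead. At full line rate the results are about 22 hours, 2.2 hours, and 13 minutes, so the slide's rounder figures presumably leave some headroom for overhead.

```python
TERABYTE = 10**12  # decimal terabyte in bytes (an assumption)


def transfer_hours(size_bytes, link_bits_per_second, efficiency=1.0):
    """Hours to move size_bytes over a link running at the given efficiency."""
    seconds = (size_bytes * 8) / (link_bits_per_second * efficiency)
    return seconds / 3600
```

Calling `transfer_hours(TERABYTE, 100e6, efficiency=0.75)` gives a value close to the 30 hours quoted for the 100 Mbps link.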
Mobile Sensor Crowdsourcing
• Mobile phones are now like a weather center. A phone has
   – a barometer
   – a temperature sensor
   – a proximity sensor
   – GPS
   – a moisture sensor
• Get volunteer phones to send sensor data (crowdsourcing)
   – report on weather
   – crop diseases (agriculture officials)
   – epidemics (from hospitals, doctors)
• Use that data to predict weather, crop disease, and epidemic spread
• Moving sensors (Polar Grid)
http://www.fotopedia.com/items/flickr-2548697541, http://www.geograph.org.uk/photo/1534209, and http://www.yourbdnews.com/2011/10/17/samsung-files-to-halt-iphone-4s-in-japan-australia/iphone-4s, Licensed CC
Great! Let's see what distributed system technologies have made these use cases possible!
Distributed Systems Timeline/History
Period          Topics
1965-late 70s   Parallel Programming, Self-Stabilization, Fault Tolerance, ER Model / Transactions, Time and Clocks
1980s           Consensus and Impossibility, SQL, Distributed Snapshots, Replication, Group Communication
Early 90s       Linearizability, Parallel DB, Transactional Memory, RAID, MPI
Late 90s        Volunteer Computing, P2P File Sharing, Complex Event Processing
Early 2000s     OceanStore, Web Services, Semantic Web, REST, DHT, Pub/Sub, Grid, Autonomic Computing, Google File System, Virtualization, SOA, MapReduce
2005-2010       Cloud, NoSQL, Mobile Apps, Data Provenance
Theoretical Computer Science
• Concerned with
   – Coordination algorithms: leader election, multicast, distributed locks, barriers, snapshot algorithms
   – Impossibility results, upper and lower bounds
   – Distributed versions of some centralized algorithms (e.g. shortest path)
   – A lot of this work was done in the 70s and laid the groundwork for distributed systems
          http://www.flickr.com/photos/lodz_na_nowo/5690492370/
                             http://xkcd.com/384/
 http://www.flickr.com/photos/quinnanya/4990131194/sizes/z/in/photostream/
                                 , Licensed CC
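As one example of a coordination algorithm, here is a small single-process simulation of leader election on a unidirectional ring, in the style of the Chang-Roberts algorithm. It assumes unique node IDs and reliable delivery. The function name and the message counting are illustrative choices.

```python
def elect_leader(ring_ids):
    """Simulate a Chang-Roberts style election on a unidirectional ring.

    Each node sends its ID to its neighbor; a node forwards only IDs larger
    than its own, so only the largest ID travels the whole ring.
    Returns (leader_id, messages_sent).
    """
    if not ring_ids:
        raise ValueError("ring must contain at least one node")
    n = len(ring_ids)
    # in_flight holds (candidate_id, position of the node about to receive it)
    in_flight = [(node_id, (pos + 1) % n) for pos, node_id in enumerate(ring_ids)]
    messages = n
    leader = None
    while leader is None:
        next_round = []
        for candidate, pos in in_flight:
            receiver = ring_ids[pos]
            if candidate == receiver:
                leader = candidate  # own ID came back: this node wins
            elif candidate > receiver:
                next_round.append((candidate, (pos + 1) % n))
                messages += 1
            # smaller candidates are dropped by the receiver
        in_flight = next_round
    return leader, messages
```

On the ring `[3, 7, 2, 5]` the node with ID 7 wins after 8 messages. A real deployment would also have to handle lost messages and nodes that crash during the election.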
Communication Protocols
• Request/Response
  – RMI, CORBA, REST/HTTP,
    WS, Thrift
• Publish/Subscribe
• Distributed Queues
• DHT (Distributed Hash
  Tables)
• Gossip/ Epidemic
  Protocols
• Whiteboards
http://www.flickr.com/photos/novecentino/2596898279/, Licensed CC
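Gossip (epidemic) protocols spread information by having each informed node push it to a few random peers per round. The simulation below is a rough sketch under simplifying assumptions: synchronous rounds, no failures, and uniformly random peer selection. The function name, `fanout`, and `seed` are illustrative.

```python
import random


def gossip_rounds(node_count, fanout=2, seed=0):
    """Count push-gossip rounds until every node has heard the rumor."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    informed = {0}  # node 0 starts with the rumor
    rounds = 0
    while len(informed) < node_count:
        newly_informed = set()
        for _sender in informed:
            # each informed node pushes the rumor to `fanout` random peers
            newly_informed.update(rng.randrange(node_count) for _ in range(fanout))
        informed |= newly_informed
        rounds += 1
    return rounds
```

Because the informed set can at most triple per round with a fanout of 2, the number of rounds grows roughly with the logarithm of the cluster size, which is why gossip scales well.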
Request/Response and Architectural Styles
• Message formats
  – RMI, CORBA, REST/HTTP, Web Service, Thrift
• Architectural Styles
  – Remote Procedure Calls (RPC)
  – Distributed Objects
  – Service Oriented Architecture (SOA)
  – Resource Oriented Architecture (ROA)
Known Distributed Architecture Patterns
• LB + Shared Nothing Nodes
• LB + Stateless Nodes + Scalable Storage
• DHT (Distributed Hash Table)
• Distributed queues
• Publish/Subscribe Broker Network
• Gossip architectures + biology-inspired algorithms
• MapReduce / data flows
• Stream processing
• Tree of responsibility
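The DHT pattern usually rests on consistent hashing, as in the Chord paper listed later in this deck. The sketch below shows only the key-to-node mapping on a hash ring. It leaves out routing, replication, and virtual nodes, and all class and method names are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring: a key belongs to the first node clockwise of it."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(node), node) for node in nodes)

    def add(self, node):
        """Place a new node on the ring at its hash position."""
        bisect.insort(self._ring, (_hash(node), node))

    def lookup(self, key):
        """Return the node that owns the key."""
        points = [point for point, _node in self._ring]
        index = bisect.bisect_right(points, _hash(key)) % len(self._ring)
        return self._ring[index][1]
```

The useful property is that adding a node moves only the keys that now fall to the new node. Every other key stays where it was, unlike the modulo scheme where most keys move.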
LB + Shared Nothing and 3-Tier




• Most common scaling pattern
• Most architectures follow this model
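A load balancer in front of shared-nothing nodes can be sketched in a few lines. Here the nodes are modeled as plain callables and requests are handed out round-robin. The class name and the callable-node model are assumptions for illustration, and a real balancer would add health checks.

```python
import itertools


class RoundRobinBalancer:
    """Hand each request to the next node in turn."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)  # endless iterator over the nodes

    def route(self, request):
        """Pick the next node and let it handle the request."""
        node = next(self._cycle)
        return node(request)
```

Because the nodes share nothing, any node can serve any request, and capacity grows by adding nodes behind the balancer.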
Storage
• Single Database
• Replicated Databases
• Parallel Databases (Sharding)
• NewSQL (in-memory, sharded, highly optimized)
• NoSQL (column family, key-value, document)
Building Scalable Systems
• Single Machine
• Shared Memory Model
• Clustering (state replication through group communication)
• Shared Nothing
• Loose Consistency with Shared Nothing
http://www.fotopedia.com/items/louromig-8P4w6xtSgbY, Licensed CC
Publish Subscribe and EDA




• Many publishers send events
• Subscribers register interest in events, and a publish/subscribe network matches events and routes them to subscribers
• Scalable implementations exist
• Basis for event-driven architectures
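The matching step can be illustrated with an in-process, topic-based broker. This sketch keeps everything in one process and delivers synchronously, whereas a real publish/subscribe network spreads this work across many brokers. The names are illustrative assumptions.

```python
from collections import defaultdict


class Broker:
    """In-process topic-based publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback to receive every event published on topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver event to every subscriber of topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

Publishers and subscribers never refer to each other, only to topics, which is the decoupling that event-driven architectures build on.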
Cloud Computing
• Ability to buy computing power, storage, or execution services as a utility, on demand.
• The best way to explain it is by comparing it to electricity.
• The idea is a big, shared pool of servers.
  – Economies of scale through optimized large-scale operations.
  – Resource pooling.
  – No need for capacity planning: start small and grow as needed.
  – Outsourcing enables specialization.
                                             photo by LoopZilla on Flickr,
                       http://www.flickr.com/photos/loopzilla/2328231843/sizes/m/in/photostre
Where to go from here?
If You Plan to Learn about Distributed Systems
• It is a field you learn by doing
• You have to be a good programmer
   – a patient one (debugging)
   – a lazy one (but intelligent)
• Start by writing some Web Services and other request/response code
• Stop reinventing the wheel; start using tools (middleware)
• Learn ZooKeeper
• Take a class: read, write code, debug, ...
                                      http://www.flickr.com/photos/mariachily/5250487136,
                                                           Licensed CC
Distributed System Community
•   Based around ACM, IEEE, and USENIX
•   Well-known journals
     – IBM Systems Journal, ACM Operating Systems Review, ACM Transactions on Computer Systems, IEEE Distributed Systems Online, IEEE Transactions on Parallel and Distributed Systems
•   Conferences
     – Theory: ICDCS, SPDC
     – SOA/Cloud: ICWS
     – E-Science, Parallel Programming: HPDC, SC, E-Science, CCGrid
     – Systems: USENIX, Middleware, ACM Symposium on Operating Systems Principles, FAST, LISA, OSDI
     – DB: SIGMOD Record, VLDB
•   Awards
     – Turing Award
     – Edsger W. Dijkstra Prize in Distributed Computing
http://www.flickr.com/photos/dullhunk/4187914071, http://www.fotopedia.com/items/flickr-1544709148, Licensed CC
A Few Must-Read Papers
•   System Structure for Software Fault Tolerance (1975)
•   Time, Clocks, and the Ordering of Events in a Distributed System (1978)
•   Reaching Agreement in the Presence of Faults (1980) and The Byzantine Generals Problem (1982)
•   End-to-End Arguments in System Design (1984)
•   A Note on Distributed Computing (1994)
•   Scale in Distributed Systems (1994)
•   Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001)
•   The Google File System (2003)
•   Xen and the Art of Virtualization (2003)
•   MapReduce: Simplified Data Processing on Large Clusters (2004)
Some Open Challenges
• Everything data: analytics, AI, data mining (distributed versions of many algorithms)
• Complex Event Processing (CEP)
• How to scale?
• Middleware for the cloud
• Scalable storage
• Provenance
• Workflows
• Guarding against DDoS and other distributed security issues
http://www.flickr.com/photos/brianscott/5474210001, Licensed CC
Questions?




Copyright by romainguy, and licensed for reuse under CC License
    http://www.flickr.com/photos/romainguy/249370084

Keynote for CSE conference 2011: Distributed Systems: What? Why? And bit of How? by Srinath Perera

 

Keynote for CSE conference 2011: Distributed Systems: What? Why? And bit of How?

  • 1.
  • 2. I cannot cover Distributed Systems in 30 minutes! But, I can tell why you might want to learn Distributed Systems in 30 minutes! http://www.flickr.com/photos/uwehermann/82753155/sizes/m/in/photostream/ and http://www.flickr.com/photos/peterpearson/5921765552, licensed under CC
  • 3. What is a Distributed System? "A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed." --Leslie Lamport
  • 4. What is a Distributed System? “A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.” - [Coulouris] “A distributed system is a collection of independent computers that appear to the users of the system as a single coherent system.” - [Tanenbaum]
  • 5. Characteristics and Challenges • No Global Clock • Communication only by message passing • No Global State • Independent Failures • Fault Tolerance • Scale • Transparency Photo by John Trainor on Flickr http://www.flickr.com/photos/trainor/2902023575/, Licensed u
  • 6. Fallacies of Distributed Systems • The network is reliable. • Latency is zero. • Bandwidth is infinite. • The network is secure. • Topology doesn't change. • There is one administrator. • Transport cost is zero. • The network is homogeneous. http://www.flickr.com/photos/12587661@N06/2300406685, @Michael Gwyther-Jones, L
  • 7. Why Distributed Systems • Need to build bigger systems • Many use cases are inherently distributed • To avoid failures • Omnipresence – If you buy food from a supermarket – If you buy a book from a bookshop chain – If you search the Web – If you use a GPS navigator – If you turn on your My 10 list – If you pay a bill – If you use your mobile app
  • 8. A System Use Case Classification • Processing Data (Moving vs. Stored Data) • Servers: Receive, Process, and Respond • Running User-Provided Jobs • Data Storage and Provenance http://www.flickr.com/photos/kelsea-groves/5535666329/
  • 9. Use case: Processing Data: React to Sensors • Many sensors: Weather, Travel, Traffic, Surveillance, Stock exchange, Smart Grid, Production line • Monitor, understand, and react to events • Usually handled with CEP (e.g. Esper, StreamBase, Siddhi) or Stream Processing (S4, Twitter Stream) http://www.flickr.com/photos/imuttoo/4257813689/ by Ian Muttoo, http://www.flickr.com/photos/eastcapital/4554220770/, http://www.flickr.com/photos/patdavid/4619331472/ by Pat David copyright CC
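The monitor-and-react loop on this slide can be sketched in plain Python. This is a minimal illustration, not how Esper, Siddhi, or S4 work internally; the `SlidingWindowAlert` class, the window size, and the threshold are hypothetical names and values chosen for the example.

```python
from collections import deque


class SlidingWindowAlert:
    """Fire an alert when the average of the last `size` readings exceeds `threshold`."""

    def __init__(self, size, threshold):
        self.window = deque(maxlen=size)  # oldest reading drops off automatically
        self.threshold = threshold

    def on_event(self, reading):
        """Consume one sensor reading; return True if an alert should fire."""
        self.window.append(reading)
        window_full = len(self.window) == self.window.maxlen
        average = sum(self.window) / len(self.window)
        return window_full and average > self.threshold


def detect(readings, size=3, threshold=30.0):
    """Return the indexes of the readings at which an alert fires."""
    monitor = SlidingWindowAlert(size, threshold)
    return [i for i, reading in enumerate(readings) if monitor.on_event(reading)]
```

A real engine would evaluate many such rules over many streams at once; the point here is only that a rule is state plus a per-event check.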
  • 10. Use case: Processing Data: Target Marketing • Receive data about users continuously: e.g. web clicks, what they bought, what they like and do not like, what their friends like and bought • Build models, index information in the background • Send each user the advertisements that best match their preferences – have to do this quickly – within a few (say 50) milliseconds • Could be the next billion dollar problem
  • 11. Use case: Receive, Process, and Respond: Online Store (e.g. Amazon) • Many sellers selling many items and many buyers • List of all items, with their specs • Index items by many dimensions and support search • Support checkout, delivery tracking, returns, ratings, and complaints • Supported by partitioning sellers/items across many nodes
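Partitioning sellers or items across nodes can be sketched with simple hash partitioning. This is one possible scheme, assumed for illustration; the slide does not say which scheme a real store uses, and the node names and the `node_for` and `partition` helpers are hypothetical.

```python
import hashlib


def node_for(key, nodes):
    """Pick the node that owns `key` using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def partition(keys, nodes):
    """Group keys by the node that owns them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    items = [f"item-{i}" for i in range(10)]
    for node, owned in partition(items, nodes).items():
        print(node, owned)
```

Modulo placement moves most keys when the number of nodes changes, which is the problem consistent hashing and DHTs are designed to reduce.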
  • 12. Use case: Running User-Provided Jobs: SETI@Home • Many people volunteer their computing power • Scientists submit computing jobs to the system • Broker and match resources with jobs, run them, and return results. Handle failures. Avoid free riding. • Considered the biggest computer on Earth (505 TFLOPS, 150k active computers) http://www.elfwood.com/~axthony/Staring-Aliens.2552052.html, Licensed CC
  • 13. Use case: Data Storage and Provenance (Sky Server) • Telescopes (Square Kilometer Array) keep collecting data from the sky (terabytes per day) • Sky Server lets scientists see the sky at a given location, as seen at a given time. • Moving data takes a long time. Moving 1 TB takes roughly – 100 Mbps network: 30 hrs – 1 Gbps network: 3 hrs – 10 Gbps network: 20 minutes • Given a data item, need to track how it was created, equipment accuracy, transformations used, etc. http://www.fotopedia.com/items/flickr-518876976 and http://www.geograph.org.uk/photo/103069, Licensed CC
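The transfer times on this slide can be checked with a little arithmetic. The sketch below computes the ideal time for a decimal terabyte (10^12 bytes, an assumption) on a fully used link with no protocol overhead; `transfer_hours` is a hypothetical helper name.

```python
def transfer_hours(size_bytes, link_bits_per_second):
    """Ideal transfer time in hours, ignoring protocol overhead and congestion."""
    return size_bytes * 8 / link_bits_per_second / 3600


ONE_TB = 10**12  # decimal terabyte; an assumption, the slide does not say which

for label, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{label}: {transfer_hours(ONE_TB, bps):.2f} h")
```

The ideal figures come out at about 22 hours, 2.2 hours, and 13 minutes, somewhat below the slide's 30 hours, 3 hours, and 20 minutes. The slide's numbers presumably allow for overhead and links that are not fully used, though the source does not say so.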
  • 14. Mobile Sensor Crowdsourcing • Mobile phones are now like a weather center; they have – a barometer – temperature sensor – proximity sensor – GPS – moisture sensor • Get volunteer phones to send sensor data (crowdsourcing) – report on weather – crop diseases (agriculture officials) – epidemics (from hospitals, doctors) • Use that to predict weather, crop disease, and epidemic spread • Moving Sensors (Polar Grid) http://www.fotopedia.com/items/flickr-2548697541, http://www.geograph.org.uk/photo/1534209, and http://www.yourbdnews.com/2011/10/17/samsung-files-to-halt-iphone-4s-in-japan-australia/iphone-4s, Licensed CC
  • 15. Great! Let's see which Distributed Systems technologies have made these use cases possible!
  • 16. Distributed Systems Timeline/History (Period: Topics) – 1965-late 70s: Parallel Programming, Self Stabilization, Fault Tolerance, ER Model/Transactions, Time and Clocks – 1980s: Consensus and impossibility, SQL, Distributed Snapshots, Replication, Group Communication – Early 90s: Linearizability, Parallel DB, Transactional Memory, RAID, MPI – Late 90s: Volunteer Computing, P2P file sharing, Complex Event Processing – Early 2000s: OceanStore, Web Services, Semantic Web, REST, DHT, Pub/Sub, Grid, Autonomic Computing, Google File System, Virtualization, SOA, MapReduce – 2005-2010: Cloud, NoSQL, Mobile Apps, Data Provenance
  • 17. Theoretical Computer Science • Concerned with – Coordination algorithms: leader election, multicast, distributed locks, barriers, snapshot algorithms – Impossibility results, upper and lower bounds – Distributed versions of some centralized algorithms (e.g. shortest path) – A lot of this work was done in the 70s, and it laid the groundwork for Distributed Systems http://www.flickr.com/photos/lodz_na_nowo/5690492370/ http://xkcd.com/384/ http://www.flickr.com/photos/quinnanya/4990131194/sizes/z/in/photostream/ , Licensed CC
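As one concrete example of a coordination algorithm, leader election on a ring can be simulated in a few lines. The sketch follows the Chang-Roberts idea (forward larger ids, swallow smaller ones) in a single process; the slide does not name a specific algorithm, so this choice and the `ring_election` helper are assumptions made for illustration.

```python
from collections import deque


def ring_election(ids):
    """Simulate Chang-Roberts leader election on a unidirectional ring.

    `ids` lists the unique node ids in ring order; a message travels from
    position i to position (i + 1) % n. Returns (leader_id, messages_sent).
    """
    n = len(ids)
    # Every node starts by sending its own id to its neighbour.
    in_flight = deque(((i + 1) % n, ids[i]) for i in range(n))
    messages = n
    while in_flight:
        position, candidate = in_flight.popleft()
        if candidate == ids[position]:
            return candidate, messages  # own id came back around: elected
        if candidate > ids[position]:
            in_flight.append(((position + 1) % n, candidate))  # forward larger id
            messages += 1
        # Smaller ids are swallowed, so only the maximum survives a full lap.
    raise ValueError("ids must be non-empty")


if __name__ == "__main__":
    leader, sent = ring_election([3, 7, 2, 5])
    print("leader:", leader, "messages:", sent)
```

A real deployment also has to cope with lost messages and crashed nodes, which this simulation leaves out.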
  • 18. Communication Protocols • Request/Response – RMI, CORBA, REST/HTTP, WS, Thrift • Publish/Subscribe • Distributed Queues • DHT (Distributed Hash Tables) • Gossip/ Epidemic Protocols • Whiteboards http://www.flickr.com/photos/novecentino/2596898279/, Licensed CC
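Of the protocols listed, gossip is perhaps the easiest to simulate. The sketch below models push gossip, in which each informed node tells one randomly chosen peer per round; `gossip_rounds`, its parameters, and the single-rumour setup are hypothetical choices for the example, not a description of any particular system.

```python
import random


def gossip_rounds(n, seed=0, max_rounds=1000):
    """Simulate push gossip among n nodes; return rounds until all are informed."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    informed = {0}             # node 0 starts with the rumour
    rounds = 0
    while len(informed) < n:
        if rounds == max_rounds:
            raise RuntimeError("rumour did not spread within max_rounds")
        # Each informed node pushes the rumour to one peer chosen at random.
        targets = {rng.randrange(n) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, "nodes:", gossip_rounds(size), "rounds")
```

The informed set can at most double per round, so spreading to n nodes needs at least log2(n) rounds.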
  • 19. Request/Response and Architectural Styles • Message formats • RMI, CORBA, REST/HTTP, Web Service, Thrift • Architectural Styles – Remote Procedure Calls (RPC) – Distributed Objects – Service Oriented Architecture (SOA) – Resource Oriented Architecture (ROA)
  • 20. Known Distributed Architecture Patterns • LB + Shared Nothing Nodes • LB + Stateless Nodes + Scalable Storage • DHT (Distributed Hash Table) • Distributed queues • Publish/Subscribe Broker Network • Gossip architectures + biology-inspired algorithms • MapReduce/data flows • Stream processing • Tree of responsibility
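The MapReduce pattern from this list can be sketched in a single process with the classic word count. In a real framework the map and reduce calls run on different machines and the shuffle moves data over the network; the function names below are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each mapper call and each reducer call is independent of the others, the work can be spread across nodes without coordination inside a phase.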
  • 21. LB + Shared Nothing and 3-Tier • Most common scaling pattern • Most architectures follow this model
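A load balancer in front of interchangeable nodes can be sketched as follows. Round robin is only one possible policy, picked here for simplicity; `RoundRobinBalancer` and `make_node` are hypothetical names, and the "nodes" are plain functions rather than servers.

```python
import itertools


class RoundRobinBalancer:
    """Hand each request to the next backend in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(list(backends))

    def route(self, request):
        """Forward the request to the next backend and return its reply."""
        backend = next(self._cycle)
        return backend(request)


def make_node(name):
    """A stateless handler: the reply depends only on the request."""
    return lambda request: f"{name} handled {request}"


if __name__ == "__main__":
    balancer = RoundRobinBalancer([make_node("n1"), make_node("n2")])
    for i in range(4):
        print(balancer.route(f"r{i}"))
```

Because the nodes hold no state of their own, adding capacity is a matter of adding another entry to the backend list.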
  • 22. Storage • Single Database • Replicated Databases • Parallel Databases (Sharding) • NewSQL (in-memory, sharding, highly optimized) • NoSQL (Column Family, Key-Value pair, Document)
  • 23. Building Scalable Systems • Single Machine • Shared Memory Model • Clustering (State Replication through group communication) • Shared Nothing • Loose Consistency with Shared Nothing http://www.fotopedia.com/items/louromig-8P4w6xtSgbY, Licensed CC
  • 24. Publish/Subscribe and EDA • Many publishers send events • Subscribers register for events, and a publish/subscribe network matches events and routes them to the subscribers • Scalable implementations exist • Basis for event-driven architectures (EDA)
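The matching step on this slide can be sketched with an in-process, topic-based broker. A real publish/subscribe network spreads this table over many broker nodes; the `Broker` class and the topic names here are hypothetical.

```python
from collections import defaultdict


class Broker:
    """In-process topic-based publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register `callback` to receive every event published on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver `event` to every subscriber of `topic`; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("weather", lambda event: print("got", event))
    broker.publish("weather", "rain at 14:00")
```

Publishers and subscribers never reference each other, only the topic, which is what lets either side scale or change independently.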
  • 25. Cloud Computing • Ability to buy computation power, storage, or execution services as a utility, on demand. • Best way to explain it is by comparing it to electricity • The idea is to share a big pool of servers. • Economies of scale through optimized large-scale operations. • Resource pooling. • No need for capacity planning; start small and grow as needed. • Outsourcing enables specialization. photo by LoopZilla on Flickr, http://www.flickr.com/photos/loopzilla/2328231843/sizes/m/in/photostre
  • 26. Where do we go from here?
  • 27. If You Plan to Learn about Distributed Systems • One of the fields to learn by doing • You have to be a good programmer – a patient one (debugging) – a lazy one (but intelligent) • Start by writing some Web Services, request/response stuff • Stop reinventing the wheel, start using tools (middleware) • Learn ZooKeeper • Take a class – read, write code, debug, .. http://www.flickr.com/photos/mariachily/5250487136, Licensed CC
  • 28. Distributed Systems Community • Based around ACM, IEEE, and USENIX • Well-known journals – IBM Systems Journal, ACM Operating Systems Review, ACM Transactions on Computer Systems, IEEE Distributed Systems Online, IEEE Transactions on Parallel and Distributed Systems • Conferences – Theory: ICDCS, SPDC – SOA/Cloud: ICWS – E-Science, Parallel Programming: HPDC, SC, e-Science, CCGrid – Systems: USENIX, Middleware, ACM Symposium on Operating Systems Principles, FAST, LISA, OSDI – DB: SIGMOD Record, VLDB • Awards – Turing Award – Edsger W. Dijkstra Prize in Distributed Computing http://www.flickr.com/photos/dullhunk/4187914071, http://www.fotopedia.com/items/flickr-1544709148, Licensed CC
  • 29. A Few Must-Read Papers • System Structure for Software Fault Tolerance (1975) • Time, Clocks, and the Ordering of Events in a Distributed System (1978) • Reaching Agreement in the Presence of Faults (1980) • The Byzantine Generals Problem (1982) • End-to-End Arguments in System Design (1984) • A Note on Distributed Computing (1994) • Scale in Distributed Systems (1994) • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001) • The Google File System (2003) • Xen and the Art of Virtualization (2003) • MapReduce: Simplified Data Processing on Large Clusters (2004)
  • 30. Some Open Challenges • Everything Data: Analytics, AI, Data Mining (distributed versions of many algorithms) • Complex Event Processing (CEP) • How to Scale? • Middleware for the Cloud • Scalable Storage • Provenance • Workflows • Guard against DDoS and other Distributed Security Issues http://www.flickr.com/photos/brianscott/5474210001, Licensed CC
  • 31. Questions? Copyright by romainguy, and licensed for reuse under CC License http://www.flickr.com/photos/romainguy/249370084