CS 425 / ECE 428
Distributed Systems
Fall 2019
Indranil Gupta (Indy)
Lecture 2-3: Introduction to Cloud
Computing
All slides © IG
1
2
The Hype!
• Forrester in 2010 – Cloud computing will go from
$40.7 billion in 2010 to $241 billion in 2020.
• Goldman Sachs says cloud computing will grow
at annual rate of 30% from 2013-2018
• Hadoop market to reach $20.8 B by 2018:
Transparency Market Research
• Companies and even Federal/state governments
using cloud computing now: fbo.gov
3
Many Cloud Providers
• AWS: Amazon Web Services
– EC2: Elastic Compute Cloud
– S3: Simple Storage Service
– EBS: Elastic Block Storage
• Microsoft Azure
• Google Cloud/Compute Engine/AppEngine
• Rightscale, Salesforce, EMC, Gigaspaces, 10gen, Datastax,
Oracle, VMWare, Yahoo, Cloudera
• And many many more!
4
Two Categories of Clouds
• Can be either a (i) public cloud, or (ii) private cloud
• Private clouds are accessible only to company employees
• Public clouds provide service to any paying customer:
– Amazon S3 (Simple Storage Service): store arbitrary datasets, pay per GB-month
stored
• As of 2019: 0.4¢ to 3¢ per GB-month
– Amazon EC2 (Elastic Compute Cloud): upload and run arbitrary OS images, pay
per CPU hour used
• As of 2019: 0.2¢ per CPU hour to $7.20 per CPU hour (depending on strength)
– Google cloud: similar pricing as above
– Google AppEngine/Compute Engine: develop applications within their appengine
framework, upload data that will be imported into their format, and run
5
Customers Save Time and $$$
• Dave Powers, Associate Information Consultant at Eli Lilly and
Company: “With AWS, Powers said, a new server can be up and
running in three minutes (it used to take Eli Lilly seven and a half
weeks to deploy a server internally) and a 64-node Linux cluster can
be online in five minutes (compared with three months internally). …
It's just shy of instantaneous.”
• Ingo Elfering, Vice President of Information Technology Strategy,
GlaxoSmithKline: “With Online Services, we are able to reduce our IT
operational costs by roughly 30% of what we’re spending”
• Jim Swartz, CIO, Sybase: “At Sybase, a private cloud of virtual servers
inside its datacenter has saved nearly $US2 million annually since
2006, Swartz says, because the company can share computing power
and storage resources across servers.”
• 100s of startups in Silicon Valley can harness large computing resources
without buying their own machines.
6
But what exactly IS a cloud?
7
What is a Cloud?
• It’s a cluster!
• It’s a supercomputer!
• It’s a datastore!
• It’s superman!
• None of the above
• All of the above
• Cloud = Lots of storage + compute cycles nearby
8
What is a Cloud?
• A single-site cloud (aka “Datacenter”) consists
of
– Compute nodes (grouped into racks) (2)
– Switches, connecting the racks
– A network topology, e.g., hierarchical
– Storage (backend) nodes connected to the network
(3)
– Front-end for submitting jobs and receiving client
requests (1)
– (1-3: Often called “three-tier architecture”)
– Software Services
• A geographically distributed cloud consists of
– Multiple such sites
– Each site perhaps with a different structure and
services
9
A Sample Cloud Topology
So then, what is a cluster?
10
“A Cloudy History of Time”
[Timeline figure, 1940 to 2012. Labels: the first datacenters; timesharing companies & data processing industry; PCs (not distributed!); clusters; grids; peer to peer systems; clouds and datacenters (2012)]
11
“A Cloudy History of Time”
[Timeline figure, 1940 to 2012 (clouds), annotated with the following:]
Grids (1980s-2000s):
•GriPhyN (1970s-80s)
•Open Science Grid and Lambda Rail (2000s)
•Globus & other standards (1990s-2000s)
Timesharing Industry (1975):
•Market share: Honeywell 34%, IBM 15%, Xerox 10%, CDC 10%, DEC 10%, UNIVAC 10%
•Machines: Honeywell 6000 & 635, IBM 370/168, Xerox 940 & Sigma 9, DEC PDP-10, UNIVAC 1108
Data Processing Industry
- 1968: $70 M. 1978: $3.15 Billion
First large datacenters: ENIAC, ORDVAC, ILLIAC
Many used vacuum tubes and mechanical relays
Berkeley NOW Project
Supercomputers
Server Farms (e.g., Oceano)
P2P Systems (90s-00s)
•Many Millions of users
•Many GB per day
12
Trends: Technology
• Doubling Periods – storage: 12 mos, bandwidth: 9 mos,
and (what law is this?) cpu compute capacity: 18 mos
• Then and Now
– Bandwidth
• 1985: mostly 56Kbps links nationwide
• 2015: Tbps links widespread
– Disk capacity
• Today’s PCs have TBs, far more than a 1990 supercomputer
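The doubling periods above compound quickly. A minimal sketch, assuming idealized steady exponential growth; the function name `growth_factor` and the 3-year window are illustrative choices, not from the slides:

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Factor by which capacity grows if it doubles every `doubling_months`."""
    return 2.0 ** (elapsed_months / doubling_months)


# Doubling periods quoted on the slide, in months.
DOUBLING_MONTHS = {"storage": 12, "bandwidth": 9, "cpu": 18}

if __name__ == "__main__":
    elapsed = 36  # illustrative window: 3 years
    for resource, period in DOUBLING_MONTHS.items():
        factor = growth_factor(period, elapsed)
        print(f"{resource}: x{factor:.0f} over {elapsed} months")
```

Under these assumptions, bandwidth outgrows storage, and storage outgrows CPU capacity, over any fixed window.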
13
Trends: Users
• Then and Now
Biologists:
– 1990: were running small single-molecule
simulations
– Today: CERN’s Large Hadron Collider producing
many PB/year
14
Prophecies
• In 1965, MIT's Fernando Corbató and the other designers of
the Multics operating system envisioned a computer facility
operating “like a power company or water company”.
• Plug your thin client into the computing Utility and Play
your favorite Intensive Compute & Communicate
Application
– Have today’s clouds brought us closer to this reality? Think
about it.
15
Four Features New in Today’s
Clouds
I. Massive scale.
II. On-demand access: Pay-as-you-go, no upfront commitment.
– And anyone can access it
III. Data-intensive Nature: What was MBs has now become TBs, PBs and
EBs (exabytes).
– Daily logs, forensics, Web data, etc.
– Humans have data numbness: Wikipedia (large) compressed is only about 10 GB!
IV. New Cloud Programming Paradigms: MapReduce/Hadoop,
NoSQL/Cassandra/MongoDB and many others.
– High in accessibility and ease of programmability
– Lots of open-source
Combination of one or more of these gives rise to novel and unsolved
distributed computing problems in cloud computing.
16
I. Massive Scale
• Facebook [GigaOm, 2012]
– 30K in 2009 -> 60K in 2010 -> 180K in 2012
• Microsoft [NYTimes, 2008]
– 150K machines
– Growth rate of 10K per month
– 80K total running Bing
– In 2013, Microsoft Cosmos had 110K machines (4 sites)
• Yahoo! [2009]:
– 100K
– Split into clusters of 4000
• AWS EC2 [Randy Bias, 2009]
– 40K machines
– 8 cores/machine
• eBay [2012]: 50K machines
• HP [2012]: 380K in 180 DCs
• Google [2011, Data Center Knowledge] : 900K
17
Quiz: Where is the World’s
Largest Datacenter?
18
Quiz: Where is the World’s
Largest Datacenter?
• (2018) China Telecom. 10.7 Million sq. ft.
• (2017) “The Citadel” Nevada. 7.2 Million sq. ft.
• (2015) In Chicago!
• 350 East Cermak, Chicago, 1.1 MILLION sq. ft.
• Shared by many different “carriers”
• Critical to Chicago Mercantile Exchange
• See:
– https://www.gigabitmagazine.com/top10/top-10-biggest-data-centres-world
– https://www.racksolutions.com/news/data-center-news/top-10-largest-data-centers-world/
19
What does a datacenter look like
from inside?
• A virtual walk through a datacenter
• Reference: http://gigaom.com/cleantech/a-rare-look-inside-facebooks-oregon-data-center-photos-video/
20
Servers
Front
Back
In
Some highly secure (e.g., financial info)
21
Power
Off-site
On-site
•WUE = Annual Water Usage / IT Equipment Energy (L/kWh) – low is good
•PUE = Total facility Power / IT Equipment Power – low is good
(e.g., Google~1.1)
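The two efficiency metrics above are plain ratios. A minimal sketch; the function names and the sample readings in the usage lines are hypothetical, chosen only to reproduce a PUE near the quoted 1.1:

```python
def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw


def wue(annual_water_liters: float, it_equipment_energy_kwh: float) -> float:
    """Water Usage Effectiveness = annual water usage / IT equipment energy (L/kWh)."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return annual_water_liters / it_equipment_energy_kwh


if __name__ == "__main__":
    # Hypothetical readings: 1100 kW drawn by the facility, 1000 kW by IT gear.
    print(f"PUE = {pue(1100.0, 1000.0):.2f}")
    # Hypothetical readings: 1800 L of water per 1000 kWh of IT energy.
    print(f"WUE = {wue(1800.0, 1000.0):.2f} L/kWh")
```

A PUE of exactly 1.0 would mean every watt entering the facility reaches IT equipment, so values can only approach 1 from above.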
22
Cooling
• Air sucked in from top (also, Bugzappers)
• Water purified
• Water sprayed into air
• 15 motors per server bank
23
Extra - Fun Videos to Watch
• Microsoft GFS Datacenter Tour (Youtube)
– http://www.youtube.com/watch?v=hOxA1l1pQIw
• Timelapse of a Datacenter Construction on the Inside
(Fortune 500 company)
– http://www.youtube.com/watch?v=ujO-xNvXj3g
24
II. On-demand access: *aaS
Classification
On-demand: renting a cab vs. (previously) renting a car, or buying one. E.g.:
– AWS Elastic Compute Cloud (EC2): a few cents to a few $ per CPU hour
– AWS Simple Storage Service (S3): a few cents per GB-month
• HaaS: Hardware as a Service
– You get access to barebones hardware machines, do whatever you want with them,
Ex: Your own cluster
– Not always a good idea because of security risks
• IaaS: Infrastructure as a Service
– You get access to flexible computing and storage infrastructure. Virtualization is one
way of achieving this (cgroups, Kubernetes, Docker, VMs,…). Often said to
subsume HaaS.
– Ex: Amazon Web Services (AWS: EC2 and S3), OpenStack, Eucalyptus, Rightscale,
Microsoft Azure, Google Cloud.
25
II. On-demand access: *aaS
Classification
• PaaS: Platform as a Service
– You get access to flexible computing and storage infrastructure,
coupled with a software platform (often tightly coupled)
– Ex: Google’s AppEngine (Python, Java, Go)
• SaaS: Software as a Service
– You get access to software services, when you need them. Often
said to subsume SOA (Service Oriented Architectures).
– Ex: Google docs, MS Office 365 Online
26
III. Data-intensive Computing
• Computation-Intensive Computing
– Example areas: MPI-based, High-performance computing, Grids
– Typically run on supercomputers (e.g., NCSA Blue Waters)
• Data-Intensive
– Typically store data at datacenters
– Use compute nodes nearby
– Compute nodes run computation services
• In data-intensive computing, the focus shifts from computation to the data:
CPU utilization no longer the most important resource metric, instead I/O is
(disk and/or network)
27
IV. New Cloud Programming
Paradigms
• Easy to write and run highly parallel programs in new cloud programming
paradigms:
– Google: MapReduce and Sawzall
– Amazon: Elastic MapReduce service (pay-as-you-go)
– Google (MapReduce)
• Indexing: a chain of 24 MapReduce jobs
• ~200K jobs processing 50PB/month (in 2006)
– Yahoo! (Hadoop + Pig)
• WebMap: a chain of several MapReduce jobs
• 300 TB of data, 10K cores, many tens of hours (~2008)
– Facebook (Hadoop + Hive)
• ~300TB total, adding 2TB/day (in 2008)
• 3K jobs processing 55TB/day
– Similar numbers from other companies, e.g., Yieldex, eharmony.com, etc.
– NoSQL: MySQL is an industry standard, but Cassandra is 2400 times faster!
28
Two Categories of Clouds
• Can be either a (i) public cloud, or (ii) private cloud
• Private clouds are accessible only to company employees
• Public clouds provide service to any paying customer
• You’re starting a new service/company: should you use a public cloud
or purchase your own private cloud?
29
Single site Cloud: to Outsource
or Own?
• Medium-sized organization: wishes to run a service for M months
– Service requires 128 servers (1024 cores) and 524 TB
– Same as UIUC CCT (Cloud Computing Testbed) cloud site (bought in 2009, now
decommissioned)
• Outsource (e.g., via AWS): monthly cost
– S3 costs: $0.12 per GB month. EC2 costs: $0.10 per CPU hour (costs from 2009)
– Storage = $ 0.12 X 524 X 1000 ~ $62 K
– Total = Storage + CPUs = $62 K + $0.10 X 1024 X 24 X 30 ~ $136 K
• Own: monthly cost
– Storage ~ $349 K / M
– Total ~ $ 1555 K / M + 7.5 K (includes 1 sysadmin / 100 nodes)
• using 0.45:0.4:0.15 split for hardware:power:network and 3 year lifetime of hardware
30
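The monthly costs above can be reproduced with a few lines of arithmetic. A minimal sketch using only the slide's 2009 figures; the function names are illustrative, and reading "$1555 K / M + 7.5 K" as an upfront cost spread over M months plus a monthly recurring cost is an assumption:

```python
from typing import Tuple

# Prices and sizes quoted on the slide (2009 figures).
S3_USD_PER_GB_MONTH = 0.12
EC2_USD_PER_CPU_HOUR = 0.10
STORAGE_TB = 524
CORES = 1024
HOURS_PER_MONTH = 24 * 30  # the slide uses a 30-day month

# Ownership figures quoted on the slide, in USD.
OWN_STORAGE_UPFRONT = 349_000
OWN_TOTAL_UPFRONT = 1_555_000
OWN_TOTAL_RECURRING = 7_500  # assumed to be a per-month recurring cost


def outsource_monthly() -> Tuple[float, float]:
    """Return (storage, total) monthly cost in USD when renting."""
    storage = S3_USD_PER_GB_MONTH * STORAGE_TB * 1000  # 1 TB taken as 1000 GB
    compute = EC2_USD_PER_CPU_HOUR * CORES * HOURS_PER_MONTH
    return storage, storage + compute


def own_monthly(months: float) -> Tuple[float, float]:
    """Return (storage, total) monthly cost in USD when owning for `months`."""
    storage = OWN_STORAGE_UPFRONT / months
    total = OWN_TOTAL_UPFRONT / months + OWN_TOTAL_RECURRING
    return storage, total


if __name__ == "__main__":
    rent_storage, rent_total = outsource_monthly()
    print(f"Outsource: storage ${rent_storage:,.0f}, total ${rent_total:,.0f}")
    for m in (6, 12, 24, 36):
        own_storage, own_total = own_monthly(m)
        print(f"Own, M={m}: storage ${own_storage:,.0f}, total ${own_total:,.0f}")
```

The unrounded outsourcing figures are $62,880 for storage and $136,608 in total, which the slide rounds to $62 K and $136 K.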
Single site Cloud: to Outsource
or Own?
• Breakeven analysis: preferable to own if:
- $349 K / M < $62 K (storage)
- $ 1555 K / M + 7.5 K < $136 K (overall)
• Breakeven points
- M > 5.55 months (storage)
- M > 12 months (overall)
• As a result
- Startups use clouds a lot
- Cloud providers benefit monetarily most from storage
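The breakeven points follow from solving each inequality for M. A minimal sketch, assuming the ownership cost is an upfront amount spread over M months plus a fixed monthly recurring cost; `breakeven_months` is an illustrative name, and the rental costs are recomputed from the 2009 prices rather than taken from the rounded figures:

```python
def breakeven_months(upfront: float, rent_per_month: float,
                     own_recurring_per_month: float = 0.0) -> float:
    """Value of M beyond which owning is cheaper per month than renting.

    Solves upfront / M + own_recurring_per_month < rent_per_month for M.
    """
    saving = rent_per_month - own_recurring_per_month
    if saving <= 0:
        raise ValueError("owning never breaks even: recurring cost >= rent")
    return upfront / saving


# Monthly rental costs from the 2009 prices on the previous slide.
storage_rent = 0.12 * 524 * 1000                    # S3: $/GB-month x GB
total_rent = storage_rent + 0.10 * 1024 * 24 * 30   # plus EC2: $/CPU-hour x core-hours

storage_m = breakeven_months(349_000, storage_rent)
overall_m = breakeven_months(1_555_000, total_rent, 7_500)

if __name__ == "__main__":
    print(f"Storage breakeven: M > {storage_m:.2f} months")
    print(f"Overall breakeven: M > {overall_m:.2f} months")
```

With unrounded rental costs this gives roughly 5.55 months for storage and about 12.0 months overall, in line with the slide.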
31
Academic Clouds: Emulab
• A community resource open to researchers in academia and industry. Very widely
used by researchers everywhere today.
• https://www.emulab.net/
• A cluster, with currently ~500 servers
• Founded and owned by University of Utah (led by Late Prof. Jay Lepreau)
• As a user, you can:
– Grab a set of machines for your experiment
– You get root-level (sudo) access to these machines
– You can specify a network topology for your cluster
– You can emulate any topology
All images © Emulab
32
Academic Clouds: PlanetLab
• A community resource open to researchers in academia and
industry
• http://www.planet-lab.org/
• Currently, ~ 1077 nodes at ~500 sites across the world
• Founded at Princeton University (led by Prof. Larry Peterson), but owned in a federated manner by the sites
• Node: Dedicated server that runs components of PlanetLab services.
• Site: A location, e.g., UIUC, that hosts a number of nodes.
• Sliver: Virtual division of each node. Currently, uses VMs, but it could also use other technology. Needed for
timesharing across users.
• Slice: A spatial cut-up of the PL nodes. Per user. A slice is a way of giving each user (Unix-shell like)
access to a subset of PL machines, selected by the user. A slice consists of multiple slivers, one at each
component node.
• Thus, PlanetLab allows you to run real world-wide experiments.
• Many services have been deployed atop it, used by millions (not just researchers): Application-level DNS
services, Monitoring services, CoralCDN, etc.
• PlanetLab is the basis for NSF GENI https://www.geni.net/
All images © PlanetLab
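The node, sliver and slice terms above map onto a small data model. A hypothetical sketch for illustration only; the class and function names are invented here and are not PlanetLab's actual API:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Sliver:
    """A virtual division of one node, allocated to one user."""
    node: str  # hostname of the node that hosts this sliver
    user: str


@dataclass
class Slice:
    """A user's view of the testbed: one sliver on each selected node."""
    user: str
    slivers: List[Sliver] = field(default_factory=list)


def make_slice(user: str, nodes: List[str]) -> Slice:
    """Build a slice with exactly one sliver per node chosen by the user."""
    return Slice(user, [Sliver(node, user) for node in nodes])


if __name__ == "__main__":
    # Hypothetical hostnames, used only to show the shape of the model.
    s = make_slice("alice", ["node1.example.edu", "node2.example.org"])
    for sliver in s.slivers:
        print(f"{s.user} has a sliver on {sliver.node}")
```

Several users can hold slivers on the same node at once, which is why slivers are needed for timesharing.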
33
Public Research Clouds
• Accessible to researchers with a qualifying grant
• Chameleon Cloud: https://www.chameleoncloud.org/
• HaaS
• OpenStack (~AWS)
• CloudLab: https://www.cloudlab.us/
• Build your own cloud on their hardware
34
Summary
• Clouds build on many previous generations of distributed
systems
• Especially the timesharing and data processing industry of
the 1960-70s.
• Need to identify unique aspects of a problem to classify it as
a new cloud computing problem
– Scale, On-demand access, data-intensive, new programming
• Otherwise, the solutions to your problem may already exist!
• Next: MapReduce!
35

Weitere ähnliche Inhalte

Was ist angesagt?

A Tour of Google Cloud Platform
A Tour of Google Cloud PlatformA Tour of Google Cloud Platform
A Tour of Google Cloud PlatformColin Su
 
Getting started with Google Cloud Training Material - 2018
Getting started with Google Cloud Training Material - 2018Getting started with Google Cloud Training Material - 2018
Getting started with Google Cloud Training Material - 2018JK Baseer
 
Tom Grey - Google Cloud Platform
Tom Grey - Google Cloud PlatformTom Grey - Google Cloud Platform
Tom Grey - Google Cloud PlatformFondazione CUOA
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsNilanchal
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platformdhruv_chaudhari
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014Puppet
 
Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies Andrés Leonardo Martinez Ortiz
 
MongoDB Days UK: Run MongoDB on Google Cloud Platform
MongoDB Days UK: Run MongoDB on Google Cloud PlatformMongoDB Days UK: Run MongoDB on Google Cloud Platform
MongoDB Days UK: Run MongoDB on Google Cloud PlatformMongoDB
 
Build with all of Google Cloud
Build with all of Google CloudBuild with all of Google Cloud
Build with all of Google Cloudwesley chun
 
Google Cloud Technologies Overview
Google Cloud Technologies OverviewGoogle Cloud Technologies Overview
Google Cloud Technologies OverviewChris Schalk
 
Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platformrajdeep
 
Google cloud Platform
Google cloud PlatformGoogle cloud Platform
Google cloud PlatformJanu Jahnavi
 
 Introduction google cloud platform
 Introduction google cloud platform Introduction google cloud platform
 Introduction google cloud platformmarwa Ayad Mohamed
 
Google Compute Engine Starter Guide
Google Compute Engine Starter GuideGoogle Compute Engine Starter Guide
Google Compute Engine Starter GuideSimon Su
 
Google Cloud Platform Update
Google Cloud Platform UpdateGoogle Cloud Platform Update
Google Cloud Platform UpdateIdo Green
 
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic Training
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic TrainingGCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic Training
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic TrainingSimon Su
 
Cloud-Native Roadshow Google Cloud Platform - Los Angeles
Cloud-Native Roadshow Google Cloud Platform - Los AngelesCloud-Native Roadshow Google Cloud Platform - Los Angeles
Cloud-Native Roadshow Google Cloud Platform - Los AngelesVMware Tanzu
 
Google Cloud Connect Korea - Sep 2017
Google Cloud Connect Korea - Sep 2017Google Cloud Connect Korea - Sep 2017
Google Cloud Connect Korea - Sep 2017Google Cloud Korea
 

Was ist angesagt? (20)

A Tour of Google Cloud Platform
A Tour of Google Cloud PlatformA Tour of Google Cloud Platform
A Tour of Google Cloud Platform
 
Google Cloud Platform
Google Cloud PlatformGoogle Cloud Platform
Google Cloud Platform
 
Getting started with Google Cloud Training Material - 2018
Getting started with Google Cloud Training Material - 2018Getting started with Google Cloud Training Material - 2018
Getting started with Google Cloud Training Material - 2018
 
Tom Grey - Google Cloud Platform
Tom Grey - Google Cloud PlatformTom Grey - Google Cloud Platform
Tom Grey - Google Cloud Platform
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / Platforms
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
 
Google Cloud Platform
Google Cloud Platform Google Cloud Platform
Google Cloud Platform
 
Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies
 
MongoDB Days UK: Run MongoDB on Google Cloud Platform
MongoDB Days UK: Run MongoDB on Google Cloud PlatformMongoDB Days UK: Run MongoDB on Google Cloud Platform
MongoDB Days UK: Run MongoDB on Google Cloud Platform
 
Build with all of Google Cloud
Build with all of Google CloudBuild with all of Google Cloud
Build with all of Google Cloud
 
Google Cloud Technologies Overview
Google Cloud Technologies OverviewGoogle Cloud Technologies Overview
Google Cloud Technologies Overview
 
Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platform
 
Google cloud Platform
Google cloud PlatformGoogle cloud Platform
Google cloud Platform
 
 Introduction google cloud platform
 Introduction google cloud platform Introduction google cloud platform
 Introduction google cloud platform
 
Google Compute Engine Starter Guide
Google Compute Engine Starter GuideGoogle Compute Engine Starter Guide
Google Compute Engine Starter Guide
 
Google Cloud Platform Update
Google Cloud Platform UpdateGoogle Cloud Platform Update
Google Cloud Platform Update
 
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic Training
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic TrainingGCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic Training
GCP - GCE, Cloud SQL, Cloud Storage, BigQuery Basic Training
 
Cloud-Native Roadshow Google Cloud Platform - Los Angeles
Cloud-Native Roadshow Google Cloud Platform - Los AngelesCloud-Native Roadshow Google Cloud Platform - Los Angeles
Cloud-Native Roadshow Google Cloud Platform - Los Angeles
 
Google Cloud Connect Korea - Sep 2017
Google Cloud Connect Korea - Sep 2017Google Cloud Connect Korea - Sep 2017
Google Cloud Connect Korea - Sep 2017
 

Ähnlich wie L2 3.fa19

AWS res 2024 key points for better research.ppt
AWS res 2024 key points for better research.pptAWS res 2024 key points for better research.ppt
AWS res 2024 key points for better research.pptfodod37142
 
Cloud introduction2.ppt
Cloud introduction2.pptCloud introduction2.ppt
Cloud introduction2.pptBala Anand
 
An Introduction to Data Intensive Computing
An Introduction to Data Intensive ComputingAn Introduction to Data Intensive Computing
An Introduction to Data Intensive ComputingCollin Bennett
 
Kb12012011 amitava cloud_computing
Kb12012011 amitava cloud_computingKb12012011 amitava cloud_computing
Kb12012011 amitava cloud_computingAmitava Kumar
 
Introduction to Cloud Computing and Big Data
Introduction to Cloud Computing and Big DataIntroduction to Cloud Computing and Big Data
Introduction to Cloud Computing and Big Datawaheed751
 
Course 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos DeligiannisCourse 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos DeligiannisBetacowork
 
Google not all clouds are created equal - sap sapphire 2014 (1)
Google not all clouds are created equal - sap sapphire 2014 (1)Google not all clouds are created equal - sap sapphire 2014 (1)
Google not all clouds are created equal - sap sapphire 2014 (1)David Torres
 
Trends in recent technology
Trends in recent technologyTrends in recent technology
Trends in recent technologysai krishna
 
Cloud Computing Berkeley.pdf
Cloud Computing Berkeley.pdfCloud Computing Berkeley.pdf
Cloud Computing Berkeley.pdfAtaulAzizIkram
 
The Cloud Revolution - Philippines Cloud Summit
The Cloud Revolution - Philippines Cloud SummitThe Cloud Revolution - Philippines Cloud Summit
The Cloud Revolution - Philippines Cloud SummitRandy Bias
 
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838e
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838eCC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838e
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838eRamzanShareefPrivate
 
Cloud Computing .ppt
Cloud Computing .pptCloud Computing .ppt
Cloud Computing .pptPrukaBay
 
Introduction to Cloud computing
Introduction to Cloud computingIntroduction to Cloud computing
Introduction to Cloud computingMathews Job
 

Ähnlich wie L2 3.fa19 (20)

AWS res 2024 key points for better research.ppt
AWS res 2024 key points for better research.pptAWS res 2024 key points for better research.ppt
AWS res 2024 key points for better research.ppt
 
cloud.ppt
cloud.pptcloud.ppt
cloud.ppt
 
Cloud introduction2.ppt
Cloud introduction2.pptCloud introduction2.ppt
Cloud introduction2.ppt
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
An Introduction to Data Intensive Computing
An Introduction to Data Intensive ComputingAn Introduction to Data Intensive Computing
An Introduction to Data Intensive Computing
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Kb12012011 amitava cloud_computing
Kb12012011 amitava cloud_computingKb12012011 amitava cloud_computing
Kb12012011 amitava cloud_computing
 
Introduction to Cloud Computing and Big Data
Introduction to Cloud Computing and Big DataIntroduction to Cloud Computing and Big Data
Introduction to Cloud Computing and Big Data
 
Course 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos DeligiannisCourse 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos Deligiannis
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Google not all clouds are created equal - sap sapphire 2014 (1)
Google not all clouds are created equal - sap sapphire 2014 (1)Google not all clouds are created equal - sap sapphire 2014 (1)
Google not all clouds are created equal - sap sapphire 2014 (1)
 
cloud.ppt
cloud.pptcloud.ppt
cloud.ppt
 
Trends in recent technology
Trends in recent technologyTrends in recent technology
Trends in recent technology
 
Cloud Computing Berkeley.pdf
Cloud Computing Berkeley.pdfCloud Computing Berkeley.pdf
Cloud Computing Berkeley.pdf
 
The Cloud Revolution - Philippines Cloud Summit
The Cloud Revolution - Philippines Cloud SummitThe Cloud Revolution - Philippines Cloud Summit
The Cloud Revolution - Philippines Cloud Summit
 
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838e
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838eCC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838e
CC Notes.pdf of jdjejwiwu22u28938ehdh3y2u2838e
 
ppt2.pdf
ppt2.pdfppt2.pdf
ppt2.pdf
 
Cloud Computing .ppt
Cloud Computing .pptCloud Computing .ppt
Cloud Computing .ppt
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
Introduction to Cloud computing
Introduction to Cloud computingIntroduction to Cloud computing
Introduction to Cloud computing
 

Kürzlich hochgeladen

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 

Kürzlich hochgeladen (20)

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx

L2 3.fa19

  • 1. CS 425 / ECE 428 Distributed Systems Fall 2019 Indranil Gupta (Indy) Lecture 2-3: Introduction to Cloud Computing All slides © IG 1
  • 2. 2
  • 3. The Hype! • Forrester in 2010 – Cloud computing will go from $40.7 billion in 2010 to $241 billion in 2020. • Goldman Sachs says cloud computing will grow at an annual rate of 30% from 2013-2018 • Hadoop market to reach $20.8 B by 2018: Transparency Market Research • Companies and even Federal/state governments using cloud computing now: fbo.gov
  • 4. Many Cloud Providers • AWS: Amazon Web Services – EC2: Elastic Compute Cloud – S3: Simple Storage Service – EBS: Elastic Block Storage • Microsoft Azure • Google Cloud/Compute Engine/AppEngine • Rightscale, Salesforce, EMC, Gigaspaces, 10gen, Datastax, Oracle, VMWare, Yahoo, Cloudera • And many many more! 4
  • 5. Two Categories of Clouds • Can be either a (i) public cloud, or (ii) private cloud • Private clouds are accessible only to company employees • Public clouds provide service to any paying customer: – Amazon S3 (Simple Storage Service): store arbitrary datasets, pay per GB-month stored • As of 2019: 0.4c-3 c per GB month – Amazon EC2 (Elastic Compute Cloud): upload and run arbitrary OS images, pay per CPU hour used • As of 2019: 0.2 c per CPU hr to $7.2 per CPU hr (depending on strength) – Google cloud: similar pricing as above – Google AppEngine/Compute Engine: develop applications within their appengine framework, upload data that will be imported into their format, and run 5
  • 6. Customers Save Time and $$$ • Dave Powers, Associate Information Consultant at Eli Lilly and Company: “With AWS, Powers said, a new server can be up and running in three minutes (it used to take Eli Lilly seven and a half weeks to deploy a server internally) and a 64-node Linux cluster can be online in five minutes (compared with three months internally). … It's just shy of instantaneous.” • Ingo Elfering, Vice President of Information Technology Strategy, GlaxoSmithKline: “With Online Services, we are able to reduce our IT operational costs by roughly 30% of what we’re spending” • Jim Swartz, CIO, Sybase: “At Sybase, a private cloud of virtual servers inside its datacenter has saved nearly $US2 million annually since 2006, Swartz says, because the company can share computing power and storage resources across servers.” • 100s of startups in Silicon Valley can harness large computing resources without buying their own machines.
  • 7. But what exactly IS a cloud? 7
  • 8. What is a Cloud? • It’s a cluster! • It’s a supercomputer! • It’s a datastore! • It’s superman! • None of the above • All of the above • Cloud = Lots of storage + compute cycles nearby 8
  • 9. What is a Cloud? • A single-site cloud (aka “Datacenter”) consists of – Compute nodes (grouped into racks) (2) – Switches, connecting the racks – A network topology, e.g., hierarchical – Storage (backend) nodes connected to the network (3) – Front-end for submitting jobs and receiving client requests (1) – (1-3: Often called “three-tier architecture”) – Software Services • A geographically distributed cloud consists of – Multiple such sites – Each site perhaps with a different structure and services 9
  • 10. A Sample Cloud Topology So then, what is a cluster? 10
  • 11. “A Cloudy History of Time” (timeline figure, 1940 to 2012; labels: The first datacenters! · Timesharing Companies & Data Processing Industry · PCs (not distributed!) · Clusters · Grids · Peer to peer systems · Clouds and datacenters)
  • 12. “A Cloudy History of Time” (timeline figure, 1940 to 2012, annotated) • First large datacenters: ENIAC, ORDVAC, ILLIAC – Many used vacuum tubes and mechanical relays • Data Processing Industry – 1968: $70 M. 1978: $3.15 Billion • Timesharing Industry (1975): – Market Share: Honeywell 34%, IBM 15%, Xerox 10%, CDC 10%, DEC 10%, UNIVAC 10% – Honeywell 6000 & 635, IBM 370/168, Xerox 940 & Sigma 9, DEC PDP-10, UNIVAC 1108 • Berkeley NOW Project • Supercomputers • Server Farms (e.g., Oceano) • Grids (1980s-2000s): – GriPhyN (1970s-80s) – Open Science Grid and Lambda Rail (2000s) – Globus & other standards (1990s-2000s) • P2P Systems (90s-00s) – Many Millions of users – Many GB per day • Clouds
  • 13. Trends: Technology • Doubling Periods – storage: 12 mos, bandwidth: 9 mos, and (what law is this?) cpu compute capacity: 18 mos • Then and Now – Bandwidth • 1985: mostly 56Kbps links nationwide • 2015: Tbps links widespread – Disk capacity • Today’s PCs have TBs, far more than a 1990 supercomputer 13
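The doubling periods on slide 13 imply quite different growth over the same window. A minimal sketch, assuming steady exponential growth; the function name and the 36-month window are illustrative choices, not from the slides:

```python
# Doubling periods in months, as listed on the slide.
DOUBLING_MONTHS = {"storage": 12, "bandwidth": 9, "cpu": 18}


def growth_factor(doubling_months, elapsed_months):
    """Multiplicative growth after elapsed_months, assuming steady doubling."""
    return 2 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # 36 months is an arbitrary window chosen for illustration.
    for resource, period in DOUBLING_MONTHS.items():
        factor = growth_factor(period, 36)
        print(f"{resource}: x{factor:.0f} after 36 months")
```

Under that assumption, bandwidth grows fastest and CPU capacity slowest, which is one way to read the slide's later point that I/O rather than CPU becomes the resource to watch.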
  • 14. Trends: Users • Then and Now Biologists: – 1990: were running small single-molecule simulations – Today: CERN’s Large Hadron Collider producing many PB/year 14
  • 15. Prophecies • In 1965, MIT's Fernando Corbató and the other designers of the Multics operating system envisioned a computer facility operating “like a power company or water company”. • Plug your thin client into the computing Utility and Play your favorite Intensive Compute & Communicate Application – Have today’s clouds brought us closer to this reality? Think about it. 15
  • 16. Four Features New in Today’s Clouds I. Massive scale. II. On-demand access: Pay-as-you-go, no upfront commitment. – And anyone can access it III. Data-intensive Nature: What was MBs has now become TBs, PBs and EBs. – Daily logs, forensics, Web data, etc. – Humans have data numbness: Wikipedia (large) compressed is only about 10 GB! IV. New Cloud Programming Paradigms: MapReduce/Hadoop, NoSQL/Cassandra/MongoDB and many others. – High in accessibility and ease of programmability – Lots of open-source A combination of one or more of these gives rise to novel and unsolved distributed computing problems in cloud computing.
  • 17. I. Massive Scale • Facebook [GigaOm, 2012] – 30K in 2009 -> 60K in 2010 -> 180K in 2012 • Microsoft [NYTimes, 2008] – 150K machines – Growth rate of 10K per month – 80K total running Bing – In 2013, Microsoft Cosmos had 110K machines (4 sites) • Yahoo! [2009]: – 100K – Split into clusters of 4000 • AWS EC2 [Randy Bias, 2009] – 40K machines – 8 cores/machine • eBay [2012]: 50K machines • HP [2012]: 380K in 180 DCs • Google [2011, Data Center Knowledge] : 900K 17
  • 18. Quiz: Where is the World’s Largest Datacenter? 18
  • 19. Quiz: Where is the World’s Largest Datacenter? • (2018) China Telecom. 10.7 Million sq. ft. • (2017) “The Citadel” Nevada. 7.2 Million sq. ft. • (2015) In Chicago! • 350 East Cermak, Chicago, 1.1 MILLION sq. ft. • Shared by many different “carriers” • Critical to Chicago Mercantile Exchange • See: – https://www.gigabitmagazine.com/top10/top-10-biggest-data-centres-world – https://www.racksolutions.com/news/data-center-news/top-10-largest-data-centers-world/ 19
  • 20. What does a datacenter look like from inside? • A virtual walk through a datacenter • Reference: http://gigaom.com/cleantech/a-rare-look-inside-facebooks-oregon-data-center-photos-video/
  • 21. Servers (photo labels: Front, Back, In) • Some highly secure (e.g., financial info)
  • 22. Power Off-site On-site •WUE = Annual Water Usage / IT Equipment Energy (L/kWh) – low is good •PUE = Total facility Power / IT Equipment Power – low is good (e.g., Google~1.1) 22
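The two efficiency ratios on slide 22 are simple quotients. A minimal sketch of both, using made-up meter readings (the input numbers below are hypothetical, chosen only so that the PUE comes out near the ~1.1 figure the slide cites for Google):

```python
def pue(total_facility_power_kw, it_equipment_power_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_power_kw / it_equipment_power_kw


def wue(annual_water_liters, it_equipment_energy_kwh):
    """Water Usage Effectiveness in L/kWh: annual water use / IT equipment energy."""
    return annual_water_liters / it_equipment_energy_kwh


if __name__ == "__main__":
    # Hypothetical readings, not from the slides.
    print(f"PUE = {pue(1100.0, 1000.0):.2f}")          # lower is better; 1.0 is the floor
    print(f"WUE = {wue(9_000_000.0, 5_000_000.0):.2f} L/kWh")
```

A PUE of 1.0 would mean every watt entering the facility reaches IT equipment, so anything above 1.0 is overhead such as cooling and power conversion.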
  • 23. Cooling Air sucked in from top (also, Bugzappers) Water purified Water sprayed into air 15 motors per server bank 23
  • 24. Extra - Fun Videos to Watch • Microsoft GFS Datacenter Tour (Youtube) – http://www.youtube.com/watch?v=hOxA1l1pQIw • Timelapse of a Datacenter Construction on the Inside (Fortune 500 company) – http://www.youtube.com/watch?v=ujO-xNvXj3g 24
  • 25. II. On-demand access: *aaS Classification On-demand: renting a cab vs. (previously) renting a car, or buying one. E.g.: – AWS Elastic Compute Cloud (EC2): a few cents to a few $ per CPU hour – AWS Simple Storage Service (S3): a few cents per GB-month • HaaS: Hardware as a Service – You get access to barebones hardware machines and can do whatever you want with them. Ex: Your own cluster – Not always a good idea because of security risks • IaaS: Infrastructure as a Service – You get access to flexible computing and storage infrastructure. Virtualization is one way of achieving this (cgroups, Kubernetes, Docker, VMs,…). Often said to subsume HaaS. – Ex: Amazon Web Services (AWS: EC2 and S3), OpenStack, Eucalyptus, Rightscale, Microsoft Azure, Google Cloud.
  • 26. II. On-demand access: *aaS Classification • PaaS: Platform as a Service – You get access to flexible computing and storage infrastructure, coupled with a software platform (often tightly coupled) – Ex: Google’s AppEngine (Python, Java, Go) • SaaS: Software as a Service – You get access to software services, when you need them. Often said to subsume SOA (Service Oriented Architectures). – Ex: Google docs, MS Office 365 Online 26
  • 27. III. Data-intensive Computing • Computation-Intensive Computing – Example areas: MPI-based, High-performance computing, Grids – Typically run on supercomputers (e.g., NCSA Blue Waters) • Data-Intensive – Typically store data at datacenters – Use compute nodes nearby – Compute nodes run computation services • In data-intensive computing, the focus shifts from computation to the data: CPU utilization no longer the most important resource metric, instead I/O is (disk and/or network) 27
  • 28. IV. New Cloud Programming Paradigms • Easy to write and run highly parallel programs in new cloud programming paradigms: – Google: MapReduce and Sawzall – Amazon: Elastic MapReduce service (pay-as-you-go) – Google (MapReduce) • Indexing: a chain of 24 MapReduce jobs • ~200K jobs processing 50PB/month (in 2006) – Yahoo! (Hadoop + Pig) • WebMap: a chain of several MapReduce jobs • 300 TB of data, 10K cores, many tens of hours (~2008) – Facebook (Hadoop + Hive) • ~300TB total, adding 2TB/day (in 2008) • 3K jobs processing 55TB/day – Similar numbers from other companies, e.g., Yieldex, eharmony.com, etc. – NoSQL: MySQL is an industry standard, but Cassandra is 2400 times faster! 28
  • 29. Two Categories of Clouds • Can be either a (i) public cloud, or (ii) private cloud • Private clouds are accessible only to company employees • Public clouds provide service to any paying customer • You’re starting a new service/company: should you use a public cloud or purchase your own private cloud? 29
  • 30. Single site Cloud: to Outsource or Own? • Medium-sized organization: wishes to run a service for M months – Service requires 128 servers (1024 cores) and 524 TB – Same as UIUC CCT (Cloud Computing Testbed) cloud site (bought in 2009, now decommissioned) • Outsource (e.g., via AWS): monthly cost – S3 costs: $0.12 per GB month. EC2 costs: $0.10 per CPU hour (costs from 2009) – Storage = $ 0.12 X 524 X 1000 ~ $62 K – Total = Storage + CPUs = $62 K + $0.10 X 1024 X 24 X 30 ~ $136 K • Own: monthly cost – Storage ~ $349 K / M – Total ~ $ 1555 K / M + 7.5 K (includes 1 sysadmin / 100 nodes) • using 0.45:0.4:0.15 split for hardware:power:network and 3 year lifetime of hardware 30
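The monthly cost arithmetic on slide 30 can be reproduced directly. A minimal sketch, assuming the slide's 2009 prices, 1 TB = 1000 GB, and a 30-day month; the function and parameter names are illustrative, not from the slides:

```python
def outsource_monthly_usd(storage_tb=524, cores=1024,
                          s3_usd_per_gb_month=0.12,
                          ec2_usd_per_cpu_hour=0.10):
    """Monthly cost of renting: S3 storage plus EC2 compute running 24x7."""
    storage = s3_usd_per_gb_month * storage_tb * 1000   # TB -> GB as on the slide
    compute = ec2_usd_per_cpu_hour * cores * 24 * 30    # hours in a 30-day month
    return storage + compute


def own_monthly_usd(months, upfront_usd=1_555_000, recurring_usd=7_500):
    """Monthly cost of owning: upfront cost amortized over M months plus upkeep."""
    return upfront_usd / months + recurring_usd


if __name__ == "__main__":
    rent = outsource_monthly_usd()
    print(f"Outsource: ${rent:,.0f} per month")
    for m in (6, 12, 36):
        print(f"Own for M={m:2d} months: ${own_monthly_usd(m):,.0f} per month")
```

The unrounded outsourcing total is about $136.6 K per month, which the slide rounds to $136 K; owning only wins once the upfront cost is spread over enough months.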
  • 31. Single site Cloud: to Outsource or Own? • Breakeven analysis: owning is preferable if: – $349 K / M < $62 K (storage) – $1555 K / M + $7.5 K < $136 K (overall) • Breakeven points (computed from the unrounded monthly costs of about $62.9 K and $136.6 K): – M > 5.55 months (storage) – M > 12 months (overall) • As a result: – Startups use clouds a lot – Cloud providers benefit monetarily most from storage
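The breakeven points follow from solving upfront / M + recurring < outsource for M. A minimal sketch in thousands of dollars, assuming the unrounded outsourcing costs derived from the previous slide's prices (62.88 for storage, 136.608 overall); the function name is illustrative:

```python
def breakeven_months(own_upfront_k, own_recurring_k, outsource_monthly_k):
    """Return the M beyond which owning is cheaper per month than renting.

    Solves own_upfront_k / M + own_recurring_k < outsource_monthly_k for M.
    """
    margin = outsource_monthly_k - own_recurring_k
    if margin <= 0:
        # Owning never breaks even if upkeep alone exceeds the rental price.
        raise ValueError("owning never breaks even at these prices")
    return own_upfront_k / margin


if __name__ == "__main__":
    storage_m = breakeven_months(349, 0, 62.88)
    overall_m = breakeven_months(1555, 7.5, 136.608)
    print(f"Storage breakeven: M > {storage_m:.2f} months")
    print(f"Overall breakeven: M > {overall_m:.1f} months")
```

Plugging in the rounded $62 K instead of $62.88 K would give about 5.63 months for storage, so the slide's 5.55 reflects the unrounded figure.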
  • 32. Academic Clouds: Emulab • A community resource open to researchers in academia and industry. Very widely used by researchers everywhere today. • https://www.emulab.net/ • A cluster, with currently ~500 servers • Founded and owned by University of Utah (led by Late Prof. Jay Lepreau) • As a user, you can: – Grab a set of machines for your experiment – You get root-level (sudo) access to these machines – You can specify a network topology for your cluster – You can emulate any topology All images © Emulab 32
  • 33. PlanetLab • A community resource open to researchers in academia and industry • http://www.planet-lab.org/ • Currently, ~1077 nodes at ~500 sites across the world • Founded at Princeton University (led by Prof. Larry Peterson), but owned in a federated manner by the sites • Node: Dedicated server that runs components of PlanetLab services. • Site: A location, e.g., UIUC, that hosts a number of nodes. • Sliver: Virtual division of each node. Currently uses VMs, but it could also use other technology. Needed for timesharing across users. • Slice: A spatial cut-up of the PL nodes. Per user. A slice is a way of giving each user (Unix-shell like) access to a subset of PL machines, selected by the user. A slice consists of multiple slivers, one at each component node. • Thus, PlanetLab allows you to run real world-wide experiments. • Many services have been deployed atop it, used by millions (not just researchers): Application-level DNS services, Monitoring services, CoralCDN, etc. • PlanetLab is the basis for NSF GENI https://www.geni.net/ All images © PlanetLab
  • 34. Public Research Clouds • Accessible to researchers with a qualifying grant • Chameleon Cloud: https://www.chameleoncloud.org/ • HaaS • OpenStack (~AWS) • CloudLab: https://www.cloudlab.us/ • Build your own cloud on their hardware 34
  • 35. Summary • Clouds build on many previous generations of distributed systems • Especially the timesharing and data processing industry of the 1960s-70s. • Need to identify unique aspects of a problem to classify it as a new cloud computing problem – Scale, on-demand access, data-intensive nature, new programming paradigms • Otherwise, the solutions to your problem may already exist! • Next: MapReduce!