Infrastructure for cloud_computing

Infrastructure for Cloud Computing
Dahai Li

2008/06/12

Agenda

• About Cloud Computing
• Tools for Cloud Computing in Google
• Google’s partnerships with universities

Advantages

• Data safety and reliability
• Data synchronization between different devices
• Low requirements for end-user devices
• Unlimited potential of the cloud

Cloud for end user

Google Cloud

Cloud for web developer

Google Cloud

APIs

Example: Earthquake map based on Map API

google.stanford.edu (circa 1997)

Google Data Center (circa 2000)

Google File System (GFS)

Why GFS?

• Google has unusual requirements
• Unfair advantage
• Fun and challenging to build large-scale systems

GFS Architecture

[Figure: GFS architecture. Clients contact the GFS master (backed by replica masters) and chunkservers 1 … N; each chunk (C0, C1, C2, C3, C5, …) is stored on several chunkservers.]

Master

• Maintains metadata:
– File namespace
– Access control info
– Maps files to chunks
• Controls system activities:
– Monitors state of chunkservers
– Allocates and places chunks
– Initiates chunk recovery and rebalancing
– Garbage-collects dead chunks
– Collects and displays stats; admin functions
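
The bookkeeping listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `MasterState`, `heartbeat`, `live_servers` and `under_replicated`, the replication target of 3 and the 60-second timeout are all hypothetical and are not taken from Google's implementation.

```python
import time
from dataclasses import dataclass, field

REPLICATION_TARGET = 3      # assumed number of replicas per chunk
HEARTBEAT_TIMEOUT_S = 60.0  # assumed liveness timeout, in seconds


@dataclass
class MasterState:
    # File namespace: file name -> ordered list of chunk handles.
    file_chunks: dict = field(default_factory=dict)
    # Chunk handle -> set of chunkserver ids that reported a replica.
    chunk_locations: dict = field(default_factory=dict)
    # Chunkserver id -> time of the last heartbeat received.
    last_heartbeat: dict = field(default_factory=dict)

    def heartbeat(self, server, chunks, now=None):
        """Record that `server` is alive and which chunks it reports."""
        self.last_heartbeat[server] = time.time() if now is None else now
        for handle in chunks:
            self.chunk_locations.setdefault(handle, set()).add(server)

    def live_servers(self, now=None):
        """Chunkservers whose last heartbeat is recent enough."""
        now = time.time() if now is None else now
        return {s for s, t in self.last_heartbeat.items()
                if now - t <= HEARTBEAT_TIMEOUT_S}

    def under_replicated(self, now=None):
        """Chunks with fewer live replicas than the target (recovery candidates)."""
        live = self.live_servers(now)
        return sorted(h for h, servers in self.chunk_locations.items()
                      if len(servers & live) < REPLICATION_TARGET)
```

A real master would also persist this state and schedule the re-replication itself; the sketch stops at identifying which chunks need attention.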

Client

• Protocol implemented by client library
• Read protocol
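
A minimal sketch of the read path as described in the published GFS design: the client turns a byte offset into a chunk index, asks the master for the chunk handle and replica locations, then reads the bytes directly from a chunkserver. The class and function names (`Master`, `Chunkserver`, `client_read`) and the tiny chunk size are assumptions made for illustration; this is not the real client library.

```python
CHUNK_SIZE = 8  # bytes; tiny for the demo (the published GFS design uses 64 MB chunks)


class Master:
    """Holds metadata only: file -> chunk handles, handle -> replica locations."""

    def __init__(self):
        self.file_chunks = {}
        self.locations = {}

    def lookup(self, filename, chunk_index):
        handle = self.file_chunks[filename][chunk_index]
        return handle, self.locations[handle]


class Chunkserver:
    """Stores chunk data keyed by chunk handle."""

    def __init__(self):
        self.chunks = {}

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


def client_read(master, servers, filename, offset, length):
    """Read `length` bytes starting at `offset`, one chunk at a time."""
    out = bytearray()
    end = offset + length
    while offset < end:
        index, within = divmod(offset, CHUNK_SIZE)
        n = min(CHUNK_SIZE - within, end - offset)
        # 1. Ask the master which chunk covers this offset and where it lives.
        handle, replicas = master.lookup(filename, index)
        # 2. Fetch the bytes from one replica; file data never flows via the master.
        out += servers[replicas[0]].read(handle, within, n)
        offset += n
    return bytes(out)
```

Keeping the master out of the data path is the design choice this sketch is meant to show: the master answers small metadata queries, while bulk traffic goes straight to chunkservers.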

GFS Usage in Google Cloud

• 50+ clusters
• Filesystem clusters of up to 1000+ machines
• Pools of 1000+ clients
• 10+ GB/s read/write load
– in the presence of frequent hardware failures

What is MapReduce?

• A simple programming model that applies to many large-scale computing problems
• Hides messy details in the MapReduce runtime library

Typical problem solved by MapReduce

• Read a lot of data
• Map: extract something you care about from each record
• Shuffle and Sort
• Reduce: aggregate, summarize, filter, or transform
• Write the results

More specifically…

• Programmer specifies two primary methods:
– map(k, v) → <k', v'>*
– reduce(k', <v'>*) → <k', v'>*
• All v' with the same k' are reduced together, in order.
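
The two signatures above can be exercised with a single-process stand-in for the runtime. This is a sketch under assumptions, not the Google library: `run_mapreduce`, `map_fn` and `reduce_fn` are hypothetical names, everything runs in memory, and the odd/even example is invented purely to show the data flow.

```python
from itertools import groupby
from operator import itemgetter


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for the MapReduce runtime.

    map_fn(k, v)        -> iterable of (k2, v2) pairs
    reduce_fn(k2, vals) -> iterable of (k2, v3) pairs
    """
    # Map: apply map_fn to every input record.
    intermediate = [pair for k, v in records for pair in map_fn(k, v)]
    # Shuffle and sort: order by key so equal keys become adjacent.
    intermediate.sort(key=itemgetter(0))
    # Reduce: one call per distinct key, keys visited in sorted order.
    output = []
    for key, group in groupby(intermediate, key=itemgetter(0)):
        output.extend(reduce_fn(key, (v for _, v in group)))
    return output


def map_fn(name, numbers):
    # Emit each number under the key "even" or "odd".
    for n in numbers:
        yield ("even" if n % 2 == 0 else "odd", n)


def reduce_fn(key, values):
    # Sum all values that share a key.
    yield (key, sum(values))


result = run_mapreduce([("a", [1, 2, 3]), ("b", [4, 5])], map_fn, reduce_fn)
```

Here `result` is `[("even", 6), ("odd", 9)]`. The real runtime distributes the map and reduce calls across machines and handles failures; only the user-visible contract is reproduced here.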

Example: Word Frequencies in Web Pages

• Input is files with one document per record
• Specify a map function that takes a key/value pair
– key = document URL
– value = document contents
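• Output of map function is (potentially many) key/value pairs.
– In our case, output (word, “1”) once per word in the document
Input: <“网页1”, “是也不是”> (the key “网页1” means “web page 1”; each character of the value is treated as one word)
Output:
<“是”, “1”>
<“也”, “1”>
<“不”, “1”>
…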
Continued: word frequencies in web pages

• MapReduce library gathers together all pairs with the same key (shuffle/sort)
• The reduce function combines the values for a key
– In our case, compute the sum
key = “是”, values = “1”, “1” → “2”
key = “也”, values = “1” → “1”
key = “不”, values = “1” → “1”
• Output of reduce (usually 0 or 1 value) paired with key and saved
“是”, “2”
“也”, “1”
“不”, “1”

Example: Pseudo-code

Map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
        EmitIntermediate(w, "1");

Reduce(String key, Iterator intermediate_values):
    // key: a word, same for input and output
    // intermediate_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
        result += ParseInt(v);
    Emit(AsString(result));
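
The pseudo-code above translates almost line for line into runnable Python. The sketch below makes two assumptions: the shuffle/sort step is simulated in memory by the hypothetical helper `word_count`, and each non-space character counts as one word, which matches the Chinese example used earlier.

```python
from collections import defaultdict


def map_fn(input_key, input_value):
    # input_key: document name; input_value: document contents.
    # Assumption: each non-space character is one word.
    for w in input_value:
        if not w.isspace():
            yield (w, "1")


def reduce_fn(key, intermediate_values):
    # key: a word; intermediate_values: an iterable of counts as strings.
    result = 0
    for v in intermediate_values:
        result += int(v)
    yield (key, str(result))


def word_count(documents):
    # Shuffle: gather all intermediate values by key.
    groups = defaultdict(list)
    for name, contents in documents:
        for k, v in map_fn(name, contents):
            groups[k].append(v)
    # Sort and reduce: visit keys in order, one reduce call per key.
    output = {}
    for k in sorted(groups):
        for key, total in reduce_fn(k, groups[k]):
            output[key] = total
    return output


counts = word_count([("网页1", "是也不是")])
```

For the example document, `counts` maps “是” to "2" and both “也” and “不” to "1", the same result as the walk-through.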

Conclusion to MapReduce

• MapReduce has proven to be a remarkably useful abstraction
• Greatly simplifies large-scale computations at Google
• Fun to use: focus on problem, let library deal with messy details
• Many thousands of parallel programs written by hundreds of different programmers in the last few years
– Many had no prior parallel or distributed programming experience

BigTable Overview

• Structured data storage, not a database
• Wide applicability
• Scalability
• High performance
• High availability

Basic Data Model

• Distributed multi-dimensional sparse map:
(row, column, timestamp) → cell contents
[Figure: the row “www.cnn.com” and the column “contents” select a cell that holds versions of “<html>…” at timestamps t1, t2, t3]

• Good match for most of our applications
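
As a rough illustration of the (row, column, timestamp) → contents mapping, the sketch below uses a plain Python dictionary. The helper names `put` and `get` and the integer timestamps are assumptions for the example; the real system is distributed and persistent.

```python
# Sparse map: only cells that were actually written take up space.
table = {}  # (row, column) -> {timestamp: contents}


def put(row, column, timestamp, contents):
    """Store one version of a cell."""
    table.setdefault((row, column), {})[timestamp] = contents


def get(row, column, timestamp=None):
    """Return the newest version at or before `timestamp` (newest overall if None)."""
    versions = table.get((row, column), {})
    candidates = [t for t in versions if timestamp is None or t <= timestamp]
    return versions[max(candidates)] if candidates else None


put("www.cnn.com", "contents", 1, "<html>v1")
put("www.cnn.com", "contents", 3, "<html>v3")
latest = get("www.cnn.com", "contents")       # newest version
as_of_2 = get("www.cnn.com", "contents", 2)   # version visible at timestamp 2
```

Here `latest` is the version written at timestamp 3 and `as_of_2` is the one written at timestamp 1; a cell that was never written simply returns `None`.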

BigTable API

• Metadata operations
– Create/delete tables, column families, change metadata
• Writes (atomic)
– Set(): write cells in a row
– DeleteCells(): delete cells in a row
– DeleteRow(): delete all cells in a row
• Reads
– Scanner: read arbitrary cells in a bigtable
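
To make the shape of these operations concrete, here is a toy in-memory table in Python. It is a sketch under assumptions: `RowMutation`, `ToyTable`, `apply` and `scan` are hypothetical names that loosely mirror Set(), DeleteCells(), DeleteRow() and the Scanner, timestamps are omitted, and a single coarse lock stands in for row-level atomicity.

```python
import threading


class RowMutation:
    """Collects writes and deletes for ONE row; applied as a unit."""

    def __init__(self, row):
        self.row = row
        self.ops = []

    def set(self, column, value):
        self.ops.append(("set", column, value))

    def delete_cells(self, column):
        self.ops.append(("delete_cells", column, None))

    def delete_row(self):
        self.ops.append(("delete_row", None, None))


class ToyTable:
    def __init__(self):
        self._rows = {}                # row key -> {column: value}
        self._lock = threading.Lock()  # one coarse lock keeps the sketch simple

    def apply(self, mutation):
        """Apply every op in the mutation atomically with respect to other callers."""
        with self._lock:
            cells = dict(self._rows.get(mutation.row, {}))
            for op, column, value in mutation.ops:
                if op == "set":
                    cells[column] = value
                elif op == "delete_cells":
                    cells.pop(column, None)
                else:  # delete_row
                    cells.clear()
            if cells:
                self._rows[mutation.row] = cells
            else:
                self._rows.pop(mutation.row, None)

    def scan(self, start_row="", end_row=None, columns=None):
        """Yield (row, column, value) in sorted row order for start_row <= row < end_row."""
        with self._lock:
            snapshot = {r: dict(c) for r, c in self._rows.items()}
        for row in sorted(snapshot):
            if row < start_row or (end_row is not None and row >= end_row):
                continue
            for column in sorted(snapshot[row]):
                if columns is None or column in columns:
                    yield row, column, snapshot[row][column]
```

Batching all changes to one row into a single mutation is what lets a reader see either none or all of them, which is the point of the "(atomic)" note on writes.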

System Structure

[Figure: structure of a Bigtable cell]
• Bigtable client, using the Bigtable client library: Open()
• Bigtable master: performs metadata ops, load balancing
• Bigtable tablet servers: serve data
• Cluster scheduling master: handles failover, monitoring
• GFS: holds tablet data, logs
• Lock service: holds metadata, handles master election

Current status of BigTable

• Design and initial implementation started at the beginning of 2004
• Currently ~100 BigTable cells
• Production use or active development for many projects:
– Google Print
– My Search History
– Orkut
– Crawling/indexing pipeline
– Google Maps/Google Earth
– Blogger
– …
• Largest bigtable cell manages ~200TB of data spread over several thousand machines (larger cells planned)

Typical Cluster

[Figure: a typical cluster]
• Cluster-wide services: lock service, GFS master, scheduling masters
• Each machine (Machine 1, Machine 2, … Machine N) runs on Linux and hosts:
– User apps (app1, app2, app3, …)
– A scheduler slave
– A GFS chunkserver

ACCI in Oct. 2007

• Stands for Academic Cloud Computing Initiative
• IBM and Google partnership
• Helps universities teach distributed-systems programming skills
• Started at the University of Washington and is scaling to many others

Google’s ACCI activities in Greater China

• Google Greater China helped create a cloud computing course at Tsinghua in summer 2007
• Now scaling to other universities in mainland China and Taiwan

Example: THU MR Course, Fall 2007

• “Massive Data Processing” course based on Google Cloud technology
• Google employees gave lectures during the course
• Students produced interesting results

• http://hpc.cs.tsinghua.edu.cn/dpcourse/

Continued: THU MR Course, Fall 2007

[Photo: students presenting the course project “Simulating the operation of the solar system based on MapReduce technology” at the Google office]
[Photo: massive data processing to simulate the operation of the solar system]

THANK YOU

More info on
http://code.google.com/intl/zh-CN/
