Computational Social Science, Lecture 03: Counting at Scale, Part I

Counting @ Scale

Sharad Goel
Columbia University
Computational Social Science: Lecture 3

February 8, 2013

Descriptive statistics
(as opposed to inferential statistics)
is about counting
contingency tables
means, variances, quantiles
summaries of conditional distributions

Long tail
video consumption on YouTube

Digital divide
time spent across various online properties

Viral diffusion
propagation of tweets on Twitter

Counting @ scale
conceptually easy
computationally hard

I/O bound
difficult to read terabytes of data

Network bound
hard to transfer terabytes of data

Memory bound
cannot randomly access data points

CPU bound
even simple manipulations add up

Rank videos by popularity
local video store
1K movies, 100K viewings

Load dataset into memory
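
At video-store scale the whole viewing log fits in memory, so ranking can be a count followed by a sort. The following is a minimal sketch under that assumption; the `viewings` list, its movie IDs, and the helper name `rank_in_memory` are made up for illustration.

```python
from collections import Counter

def rank_in_memory(viewings):
    """Return (movie_id, view_count) pairs, most viewed first."""
    counts = Counter(viewings)          # one pass over the in-memory list
    # Sort by descending count; break ties by movie ID for a stable result.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical viewing log: one movie ID per viewing.
viewings = ["m2", "m1", "m2", "m3", "m2", "m1"]
ranking = rank_in_memory(viewings)
```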

Rank videos by popularity
Netflix
100K movies, 1B viewings

store counter for each movie in memory
and stream through the dataset
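
One way to picture the streaming approach: only the counters live in memory, and each viewing is read once and then discarded. This sketch assumes a log with one movie ID per line; the `stream_counts` name and the `io.StringIO` stand-in for a large file are illustrative choices, not part of the lecture.

```python
import io
from collections import defaultdict

def stream_counts(lines):
    """Count viewings per movie while holding only the counters in memory."""
    counts = defaultdict(int)
    for line in lines:                  # a file object yields lines lazily
        movie_id = line.strip()
        if movie_id:                    # skip blank lines
            counts[movie_id] += 1
    return dict(counts)

# Stand-in for a large log file; in practice this would be open(path).
log = io.StringIO("m1\nm2\nm1\n\nm3\nm1\n")
counts = stream_counts(log)
ranked = sorted(counts, key=counts.get, reverse=True)
```

Memory use grows with the number of distinct movies (100K here), not with the number of viewings (1B).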

Rank videos by popularity
YouTube
10B videos, 10T viewings

Trouble, with a capital ‘T’

Parallel computation
Distribute work across several machines

10 parallel workers
1T views per worker
maybe 5B unique videos on each

100 parallel workers
100B views per worker
maybe 1B videos on each

split → count → sort by video → merge → sort by popularity
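
The five steps above can be sketched on a single machine, with list slices standing in for workers. This assumes the split ignores movie identity (round-robin here), so the merge step has to add up partial counts for the same video; all function names and the toy data are hypothetical.

```python
import heapq
from collections import Counter
from itertools import groupby

def split(viewings, n_workers):
    """Deal viewings out round-robin, ignoring which movie each one is."""
    return [viewings[i::n_workers] for i in range(n_workers)]

def count_and_sort(chunk):
    """Per-worker step: count locally, then sort by video ID."""
    return sorted(Counter(chunk).items())

def merge(sorted_partials):
    """Merge sorted partial counts, summing entries for the same video."""
    merged = heapq.merge(*sorted_partials)   # output stays sorted by video ID
    return [(video, sum(c for _, c in group))
            for video, group in groupby(merged, key=lambda kv: kv[0])]

viewings = ["a", "a", "b", "a", "c", "b"]
partials = [count_and_sort(chunk) for chunk in split(viewings, 2)]
totals = merge(partials)
by_popularity = sorted(totals, key=lambda kv: -kv[1])
```

In this toy run both workers see video "a", so neither partial count for it is final until the merge.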

Core problem
the same movie appears on multiple machines

Solution
do not split viewing data at random
ensure individual movies are never split apart

split → count → sort by video → merge → sort by popularity

Shuffle (1st attempt)
create a new file for every movie

append viewing data to the appropriate file
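
A minimal sketch of this first attempt, assuming each record is a `(movie_id, viewer)` pair (the viewer field and the `.txt` naming are invented for illustration). It creates one file per distinct movie, which would mean billions of files at the YouTube scale quoted earlier.

```python
import os
import tempfile

def shuffle_one_file_per_movie(viewings, out_dir):
    """Append each record to a file named after its movie."""
    for movie_id, viewer in viewings:
        path = os.path.join(out_dir, f"{movie_id}.txt")
        with open(path, "a") as f:      # mode "a" creates the file on first sight
            f.write(viewer + "\n")

viewings = [("m1", "ann"), ("m2", "bob"), ("m1", "cat")]
out_dir = tempfile.mkdtemp()
shuffle_one_file_per_movie(viewings, out_dir)
files = sorted(os.listdir(out_dir))
```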

Shuffle (2nd attempt)
First time you see a movie,
append it randomly to one of 10K files

Next time you see the movie, append it to same file
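
The second attempt needs some record of where each movie went the first time. A sketch, assuming an in-memory dictionary plays that role and in-memory lists stand in for the files; `assign_with_lookup` and the fixed seed are illustrative.

```python
import random

def assign_with_lookup(movie_ids, n_files, seed=0):
    """Send each movie to a random bucket on first sight, then reuse it."""
    rng = random.Random(seed)           # seeded only to make the sketch repeatable
    assignment = {}                     # movie_id -> bucket index
    buckets = [[] for _ in range(n_files)]
    for movie_id in movie_ids:
        if movie_id not in assignment:  # first time we see this movie
            assignment[movie_id] = rng.randrange(n_files)
        buckets[assignment[movie_id]].append(movie_id)
    return assignment, buckets

assignment, buckets = assign_with_lookup(["a", "b", "a", "c", "b", "a"], 4)
```

The number of files is now fixed, but the lookup table has one entry per distinct movie and every worker would need to agree on it.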

Shuffle (3rd attempt)
Hash the movie ID to determine which
file to append it to

(Hash function:
maps large input space to small output space,
approximately uniformly)
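
Hashing removes the lookup table: the bucket is computed from the movie ID itself, so any worker reaches the same answer independently. The sketch below uses an MD5 digest as the hash, which is an assumption made here for repeatability (Python's built-in `hash()` on strings is salted per process by default); the function names and data are hypothetical.

```python
import hashlib
from collections import Counter

def partition_for(movie_id, n_partitions):
    """Map a movie ID to a partition index in [0, n_partitions)."""
    digest = hashlib.md5(movie_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_partitions

def shuffle_by_hash(viewings, n_partitions):
    """Route every viewing of a movie to that movie's single partition."""
    partitions = [[] for _ in range(n_partitions)]
    for movie_id in viewings:
        partitions[partition_for(movie_id, n_partitions)].append(movie_id)
    return partitions

viewings = ["a", "b", "a", "c", "b", "a"]
partitions = shuffle_by_hash(viewings, 3)

# Each partition's counts are already final: no movie is split across two.
per_partition = [Counter(p) for p in partitions]
totals = Counter()
for c in per_partition:
    totals.update(c)
```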

MapReduce:
Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
OSDI, 2004

Map
assign each input line to one or more groups
v  [(k1, v1), …, (km, vm)]

Shuffle
aggregate groups

Reduce
operate on grouped data
(k, [v1, …, vn]) → [w1, …, wp]
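
The three phases can be mimicked on one machine to make the signatures concrete. This is a toy in-memory sketch, not how a real MapReduce cluster is implemented: `map_reduce`, `view_mapper`, and `view_reducer` are hypothetical names, and a dictionary of lists stands in for the shuffle.

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Run map, shuffle, and reduce over an in-memory list of inputs."""
    # Map: each input value v yields a list of (key, value) pairs.
    groups = defaultdict(list)
    for value in inputs:
        for key, mapped in mapper(value):
            # Shuffle: collect all values that share a key.
            groups[key].append(mapped)
    # Reduce: each (key, [v1, ..., vn]) yields a list of outputs.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def view_mapper(video_id):
    return [(video_id, 1)]              # one group per video

def view_reducer(video_id, ones):
    return [(video_id, sum(ones))]      # group size is the view count

result = map_reduce(["a", "b", "a"], view_mapper, view_reducer)
```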

The Insight of MapReduce
One can efficiently group identical items

Many tasks are computationally easier on grouped data

Word Count

Input
text corpus

Output
number of occurrences of each word

Word Count

Map
line  words

Reduce
word group  size of group

MapReduce
the unreasonable effectiveness of aggregation
