BigData @ comScore
1.
BigData @ comScore
Michael Brown, CTO, comScore, Inc.
March 25th, 2011
2.
comScore is a Global Leader in Measuring the Digital World
NASDAQ: SCOR
Clients: 1600+ worldwide
Employees: 1,000+
Headquarters: Reston, VA
Global Coverage: 170+ countries under measurement; 43 markets reported
Local Presence: 30+ locations in 21 countries
© comScore, Inc. Proprietary.
3.
Broad Client Base and Deep Expertise Across Key Industries
Media, Agencies, Telecom/Mobile, Financial, Retail, Travel, CPG, Pharma, Technology
4.
The Trusted Source for Digital Intelligence Across Vertical Markets
– 47 out of the top 50 ONLINE PROPERTIES
– 45 out of the top 50 ADVERTISING AGENCIES
– 9 out of the top 10 MAJOR MEDIA COMPANIES
– 9 out of the top 10 PHARMACEUTICAL COMPANIES
– 9 out of the top 10 CONSUMER FINANCE COMPANIES
– 9 out of the top 10 CPG COMPANIES
– 4 out of the top 4 WIRELESS CARRIERS
– 9 out of the top 10 INVESTMENT BANKS
– 9 out of the top 10 INTERNET SERVICE PROVIDERS
– 9 out of the top 10 AUTO INSURERS
5.
comScore History of Leadership and Innovation
1st:
– To measure the search market
– To measure video streaming
– To provide behavioral ad effectiveness
– To meter mobile user behavior
– To unify census + panel measurement
– To build and project from 2 million+ longitudinal panel
– To monitor and report e-commerce data
– To deliver a worldwide Internet audience measurement
Global Shaper Company 2010
6.
Average Records Captured per Day (2005-2009)
[Chart: average records captured per day, 2005 to 2009; y-axis from 0 to 1,800,000,000 records]
7.
Launching the 3rd Generation
In 2009, in the midst of the recession, comScore decided to build and release its 3rd Generation Product – Unified Digital Measurement (UDM or Hybrid)
Technology Goals
– Ramp up data collection
– Deploy new methodologies for data processing and analysis
– Be able to scale the environment linearly to support growth
– Have yesterday's data available today
And one more thing … do it in 4 months or less.
8.
Unified Digital Measurement™ (UDM) Establishes Platform For Panel + Census Data Integration
– Global PERSON Measurement: PANEL
– Global MACHINE Measurement: PAGE TAGS
Unified Digital Measurement (UDM)
– Patent-Pending Methodology
– Adopted by 88% of Top U.S. Media Properties
9.
How Does the Hybrid Process Work?
– Collect traffic from PCs and devices (Total Traffic)
– Clean traffic: remove non-human traffic and bots, apply edit rules (Filtered Traffic)
– Apply comScore URL Dictionary
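The cleaning step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the slide does not describe the record schema or the actual edit rules, so the `Hit` record, the `BOT_MARKERS` list and the example URLs are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical record shape; the slide does not describe the real schema.
@dataclass(frozen=True)
class Hit:
    url: str
    user_agent: str


# Assumed edit rule: a user agent containing one of these markers is non-human.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_human(hit: Hit) -> bool:
    """Return True when the hit passes the (assumed) non-human filter."""
    user_agent = hit.user_agent.lower()
    return not any(marker in user_agent for marker in BOT_MARKERS)


def clean_traffic(total_traffic):
    """Turn total traffic into filtered traffic by dropping non-human hits."""
    return [hit for hit in total_traffic if is_human(hit)]


total = [
    Hit("http://search.example.com/q", "Mozilla/5.0"),
    Hit("http://search.example.com/q", "ExampleBot/1.0"),
    Hit("http://news.example.com/", "Mozilla/5.0"),
]
filtered = clean_traffic(total)
```

The filtered list is what a later dictionary lookup would consume.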
10.
URL Dictionary (CFD): Advertising Industry “Currency”
Intelligent grouping of Properties with 7+ levels of detail
– Property (e.g., Yahoo! Properties, Microsoft Sites)
– Media Title (e.g., Yahoo!, MSN)
– Channel (e.g., Yahoo! Search, MSN Homepages)
– Subchannel (e.g., Yahoo! Image Search, MSNBC)
– Group/Subgroup (e.g., Yahoo! Calendar, Today)
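A rough sketch of how URL patterns might map onto such a hierarchy is given below. It is not comScore's implementation: the `Entity` tuple, the `(host, path prefix)` pattern format, the longest-prefix rule and the `.example` hosts are all assumptions. Only the entity names come from the slide.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class Entity(NamedTuple):
    property_name: str
    media_title: str
    channel: str
    subchannel: Optional[str] = None


# Hypothetical dictionary entries keyed by (host, path prefix).
DICTIONARY = {
    ("search.yahoo.example", "/"): Entity(
        "Yahoo! Properties", "Yahoo!", "Yahoo! Search"),
    ("search.yahoo.example", "/images"): Entity(
        "Yahoo! Properties", "Yahoo!", "Yahoo! Search", "Yahoo! Image Search"),
    ("www.msn.example", "/"): Entity(
        "Microsoft Sites", "MSN", "MSN Homepages"),
}


def classify(url: str) -> Optional[Entity]:
    """Return the entity whose pattern matches the URL with the longest prefix."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    best, best_len = None, -1
    for (pattern_host, prefix), entity in DICTIONARY.items():
        if host == pattern_host and path.startswith(prefix) and len(prefix) > best_len:
            best, best_len = entity, len(prefix)
    return best


image_hit = classify("http://search.yahoo.example/images/cats")
web_hit = classify("http://search.yahoo.example/web")
unknown_hit = classify("http://unknown.example/")
```

Preferring the longest matching prefix lets a more specific pattern (a subchannel) override a general one (its channel) without any ordering requirement on the dictionary.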
11.
URL Dictionary (CFD) Coverage Statistics
11MM Unique Domains Average/Month in 2010
• Over 80% of pages viewed came from the top 131K domains in 2010 vs. 392K in 2009
• 2,360K patterns in January 2011 represent 85% of all pages
• 1,254K syndicated entities in January 2010
• 41K patterns added/month in 2010
12.
Worldwide UDM™ Penetration
Percentage of Machines Included in UDM Measurement (July 2010 Penetration Data)
Europe: Austria 80%, Belgium 85%, Switzerland 84%, Germany 84%, Denmark 82%, Spain 90%, Finland 85%, France 91%, Ireland 91%, Italy 80%, Netherlands 88%, Norway 84%, Portugal 86%, Sweden 85%, United Kingdom 90%
Asia Pacific: Australia 91%, Hong Kong 88%, India 84%, Japan 73%, Malaysia 87%, New Zealand 88%, Singapore 91%
North America: Canada 94%, United States 91%
Latin America: Argentina 94%, Brazil 92%, Chile 94%, Colombia 95%, Mexico 93%, Puerto Rico 92%
Middle East & Africa: Israel 93%, South Africa 73%
13.
Worldwide Tags per Day
[Chart: number of records per day, Jul 2009 to Feb 2011; series: Beacon Records, Panel Records; y-axis from 0 to 25,000,000,000]
14.
Monthly Totals
[Chart: number of records per month, Jul 2009 to Feb 2011; series: Beacon Records, Panel Records; y-axis from 0 to 600,000,000,000]
15.
High Level Data Flow
[Diagram with labeled stages: Panel, Census, ETL, Delivery]
16.
Enterprise Data Warehouse: Sybase IQ 15.2 Multiplex
EDW currently comprises 20 servers running Windows 2003 R2 x64
– Currently 220 Intel CPUs
– Dedicated EDW technical team of 3 DBAs and 1 Administrator
– Ability to grow compute capacity and storage capacity independently
EDW data repository housed on both EMC VMAX and CLARiiON
– 4 EDW instances (2 in Virginia and 2 in Illinois)
– One EDW instance is 147TB usable (approx. 200TB of raw data)
– Production EDW Drive Layout
  416 x 1TB SATA, RAID6, 14+2
  42 x 600GB 15K, RAID1
  8 x 400GB Flash, RAID5, 7+1
Current Capacity and Performance Metrics
– 1,835,412,793,799 rows loaded
– 140TB in 14,168 tables
– Capable of loading 56 billion rows per hour
17.
Subsystem
– System designed using multiple subsystems
– Easily take out and replace different components as demands change
– Moved from a single server to a cluster of servers within a few months in some cases, as with first-stage tag processing
– Periodically redesign different subsystems to support increased processing demands
– Many systems are on their third generation of technology
18.
Homegrown Distributed Processing
– 2002: comScore distributed processing framework
– Reduced core aggregation from 48 hours to 7 hours
– Reduced final product creation from 24 hours to 2 hours
– Scalability Wall
– Open Source Hadoop framework
19.
GreenPlum
GreenPlum MPP
– 80 Node Cluster: 1 Master; 6 ETL; 72 Workers
– Using Dell R510 with 12 x 600GB 15K RAID, 64GB RAM, 24 cores (HT)
– Support analytic end users with access to record level data, through a SQL interface
– Ability to load over 400 billion rows in 8 hours
– Hourly data loading in place
– Allow the analysts to mine the data for the business uses
– Use for quick analysis of raw event data and for the ideation and creation of new products
20.
Hadoop
Hadoop
– Dev: 6x Dell 2950 w/ 6 x 1TB
– Prod: 10x Dell R710 w/ 6 x 600GB
– Prod in 2 weeks: 10x Dell R710 w/ 6 x 600GB & 20x Dell R510 w/ 12 x 2TB
– Moving large processing jobs that are constrained by our current framework to Hadoop. We have some large analytical runs that currently go for over 40 hours on 32 servers, and we are re-engineering them to reduce processing time.
– We have found that the Fair Scheduler works well for our job loads
– We use a “homegrown” workflow system (BORG) that manages tasks inside and outside Hadoop.
21.
Sharding
– Sharding divides work across multiple systems using different mechanisms
– Shard data as far upstream as possible
– Breaking data into multiple chunks early in processing makes it possible to scale compute capacity downstream to accommodate large volume increases in data ingest
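One common sharding mechanism is a stable hash of a key. The sketch below assumes the shard key is a cookie identifier and uses CRC32 as the hash. Neither choice is stated on the slide, and `shard_for` and `shard_records` are hypothetical names.

```python
import zlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using CRC32, which is stable across processes."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def shard_records(records, num_shards):
    """Group (cookie_id, url) records by the shard of their cookie_id."""
    shards = defaultdict(list)
    for cookie_id, url in records:
        shards[shard_for(cookie_id, num_shards)].append((cookie_id, url))
    return shards


records = [("c1", "/a"), ("c2", "/b"), ("c1", "/c"), ("c3", "/d")]
shards = shard_records(records, num_shards=4)
```

Because all rows for one key land on the same shard, per-key work such as counting uniques can run on each shard independently.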
22.
Sorting
– We use DMExpress from Syncsort across hundreds of servers; this allows for efficient data processing
– We sort input data based on a column in advance
– To calculate uniques, check whether the current value differs from the prior value and, if so, increment a counter
– We now have aggregation systems that can process over 50 GB of data with 357 million rows in less than an hour on a Dell R710 2U server
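The uniques calculation on pre-sorted input can be sketched in a few lines. This is an illustration in Python rather than DMExpress, and `count_uniques` is a hypothetical name. It relies on the input already being sorted on the column of interest.

```python
def count_uniques(sorted_values):
    """Count distinct values in a stream that is already sorted.

    Only the previous value is kept, so memory use stays constant no matter
    how many rows pass through.
    """
    count = 0
    previous = object()  # sentinel that never equals a real value
    for value in sorted_values:
        if value != previous:
            count += 1
            previous = value
    return count


cookies = ["b", "a", "b", "c", "a"]
unique_cookies = count_uniques(sorted(cookies))
```

The single comparison per row is what makes the up-front sort worthwhile: no hash table or second sort is needed at aggregation time.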
23.
Compression w/Sorting
Compress log files when processing large volumes of log data
Several advantages to sorting data first:
– Reduces the size of the data
– Improves application performance
Examples:
– 1 hour of our data (313 GB raw, 815 million rows)
– Standard compression of time ordered data is 93GB (30% of original)
– Standard compression on a 2 key sorted set is 56GB (18% of original)
– For one day it saves 800GB
– For one month it saves 25 TB
– For 90 days it saves 75TB
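The effect of sorting before compression can be tried on synthetic data. The sketch below is only an illustration: the row format, the value ranges and the use of zlib are assumptions, and the ratios it produces say nothing about the figures quoted on the slide.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic stand-in for log rows: (cookie, url) pairs in arrival order.
# Real rows would carry more fields; these value ranges are arbitrary.
rows = [
    f"cookie{rng.randrange(50):04d},/page/{rng.randrange(20):03d}"
    for _ in range(20_000)
]


def compressed_size(lines):
    """Size in bytes of the zlib-compressed, newline-joined lines."""
    return len(zlib.compress("\n".join(lines).encode("utf-8")))


arrival_order_size = compressed_size(rows)
# Fixed-width fields mean a plain string sort orders by cookie, then by url.
two_key_sorted_size = compressed_size(sorted(rows))

print(arrival_order_size, two_key_sorted_size)
```

Sorting places identical and near-identical rows next to each other, which gives a dictionary-based compressor longer repeated runs to work with.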
24.
Big data makes you think differently
Question: How many distinct cookies over 3 months?
Data: 3 monthly tables with distinct cookies, indexed
Size: 10B records per table
Platform: Sybase IQ
Attempt: UNION select count(cookies) over 3 monthly tables
– The UNION operator removes duplicates (applies a DISTINCT)
Result: FAIL. Out of temp space. Out of luck.
– Failed after 30 minutes.
Why? UNION performs a SELECT and then a DISTINCT (sorting 30B rows)
25.
Rethink the problem!
INNER joins are cheaper
– No sort, they use existing indexes
Remember set theory? Of course you do!
Let months be {A, B, C}
[Venn diagram of A, B and C]
|A ∪ B ∪ C| = |A| + |B| + |C| – |A ∩ B| – |A ∩ C| – |C ∩ B| + |A ∩ B ∩ C|
A ∩ B ∩ C = (A ∩ B) ∩ (A ∩ C) ∩ (C ∩ B)
INNER join on only 2 tables of data at a time
– 2 month intersections took 2 hours each and were less taxing on memory
– Used intersection of intermediate (indexed!) results… 5 mins
Total query time: 6.5 hours
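The inclusion-exclusion identity can be checked on small sets. In this sketch Python sets stand in for the monthly cookie tables and set intersection stands in for the INNER joins. The function name and sample values are made up for illustration.

```python
def union_size_by_inclusion_exclusion(a, b, c):
    """Count distinct members of a, b and c without building their union."""
    ab = a & b  # pairwise intersections, one pair of months at a time
    ac = a & c
    cb = c & b
    abc = ab & ac & cb  # triple intersection from the intermediate results
    return (len(a) + len(b) + len(c)
            - len(ab) - len(ac) - len(cb)
            + len(abc))


month_a = {1, 2, 3, 4}
month_b = {3, 4, 5}
month_c = {4, 5, 6, 7}

distinct_cookies = union_size_by_inclusion_exclusion(month_a, month_b, month_c)
```

For these sample sets the sizes are 4 + 3 + 4 = 11, the pairwise intersections total 5, and the triple intersection has 1 member, giving 7 distinct values.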
26.
TCO with Large Cluster Systems
Examine replication factor and disk configuration for systems with replication built into the framework to support redundancy and concurrency
Example: Hadoop cluster that supports 108TB of base compressed data
Hypothetical Configurations:
– Replication Factor of 3
  R710 (6x drives, JBOD); requires 162 servers
  R510 (12x drives, JBOD); requires 68 servers
– Replication Factor of 2
  R710 (6x drives, RAID 5); requires 129 servers
  R510 (12x drives, RAID 5); requires 54 servers
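The server counts follow from simple arithmetic once a usable capacity per server is fixed. The slide does not state those capacities, so the values below are assumptions back-solved from its numbers: 108 TB × 3 copies over 162 servers implies about 2 TB usable per R710, and 108 TB × 2 copies over 54 servers implies about 4 TB usable per R510 with RAID 5. `servers_needed` is a hypothetical helper.

```python
import math


def servers_needed(base_tb, replication_factor, usable_tb_per_server):
    """Servers required to hold every replica of the base data."""
    stored_tb = base_tb * replication_factor
    return math.ceil(stored_tb / usable_tb_per_server)


BASE_TB = 108  # base compressed data, from the slide

# Usable capacities are back-solved assumptions, not figures from the slide.
r710_jbod_rf3 = servers_needed(BASE_TB, 3, usable_tb_per_server=2.0)
r510_raid5_rf2 = servers_needed(BASE_TB, 2, usable_tb_per_server=4.0)
```

Running the same function over candidate replication factors and drive layouts gives a quick way to compare configurations before pricing them.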
27.
Useful Factoids
Colorful, bite-sized graphical representations of the best discoveries we unearth.
Visit www.comscoredatamine.com or follow @datagems for the latest gems.
28.
Thank You!
Michael Brown
CTO
comScore, Inc.
mbrown@comscore.com