SlideShare ist ein Scribd-Unternehmen logo
1 von 24
New Data, New Insights
 it isn’t what we thought it was

©MapR Technologies - Confidential   1
Do you
                                     remember
                                    the future?

©MapR Technologies - Confidential        2
©MapR Technologies - Confidential   3
Some things
                                    turned out
                                    as expected

©MapR Technologies - Confidential        4
Guys wearing
                                      Fedoras



©MapR Technologies - Confidential       5
What about
                                    “Big Data”?


©MapR Technologies - Confidential        6
Harvard University
                               will have 200 x 106

                                 volumes by 2040
                                            Fremont Rider, 1944




©MapR Technologies - Confidential      7
To cope … only short papers
              should be published. … not
              more than 2500 characters
              counting “space,” punctuation
              marks, etc.     Gray and Ruston in IEEE Transactions on
                                            Electronic Computers, 1964




©MapR Technologies - Confidential       8
Remember
                                     the guy in
                                    the Fedora?


©MapR Technologies - Confidential        9
He’s tweeting
                                     about this
                                      right now


©MapR Technologies - Confidential        10
So what is the big
                    data monorail and
                   what is the Fedora?


©MapR Technologies - Confidential   11
Data curation
                      Rigid Schemas
                   Engineered Structure



©MapR Technologies - Confidential   12
Data curation
                      Rigid Schemas
                   Engineered Structure



©MapR Technologies - Confidential   13
Data as-you-find-it
                         Flexible schemas
                           Late binding



©MapR Technologies - Confidential   14
Data as-you-find-it
                         Flexible schemas
                           Late binding



©MapR Technologies - Confidential   15
©MapR Technologies - Confidential   16
©MapR Technologies - Confidential   17
©MapR Technologies - Confidential   18
©MapR Technologies - Confidential   19
Why is it different?
                  How does it work?

                                    Scan, don’t seek
                                    Stream, don’t store

©MapR Technologies - Confidential          20
Targeted Advertising
               - not a monorail
               - but not what we
                 thought
               - the value is in the
                 provenance
©MapR Technologies - Confidential   21
Social Media Analytics
  - what are they saying
  - and exactly when?
  - combined with?


©MapR Technologies - Confidential   22
Deep Learning
  - what does it mean
  - the math keeps on
    getting easier
  - blame it on scale

©MapR Technologies - Confidential   23
 Send me email !
       tdunning@maprtech.com

 Tweet as the spirit moves
       @ted_dunning


©MapR Technologies - Confidential   24

Weitere ähnliche Inhalte

Andere mochten auch

Cmu-2011-09.pptx
Cmu-2011-09.pptxCmu-2011-09.pptx
Cmu-2011-09.pptx
Ted Dunning
 

Andere mochten auch (7)

R user-group-2011-09
R user-group-2011-09R user-group-2011-09
R user-group-2011-09
 
Cmu-2011-09.pptx
Cmu-2011-09.pptxCmu-2011-09.pptx
Cmu-2011-09.pptx
 
Oxford 05-oct-2012
Oxford 05-oct-2012Oxford 05-oct-2012
Oxford 05-oct-2012
 
Cmu Lecture on Hadoop Performance
Cmu Lecture on Hadoop PerformanceCmu Lecture on Hadoop Performance
Cmu Lecture on Hadoop Performance
 
Drill lightning-london-big-data-10-01-2012
Drill lightning-london-big-data-10-01-2012Drill lightning-london-big-data-10-01-2012
Drill lightning-london-big-data-10-01-2012
 
Graham Mossman - SQL and high performance computing on Hadoop
Graham Mossman - SQL and high performance computing on HadoopGraham Mossman - SQL and high performance computing on Hadoop
Graham Mossman - SQL and high performance computing on Hadoop
 
SQL + Hadoop: The High Performance Advantage�
SQL + Hadoop:  The High Performance Advantage�SQL + Hadoop:  The High Performance Advantage�
SQL + Hadoop: The High Performance Advantage�
 

Ähnlich wie Bda-dunning-2012-12-06

Ähnlich wie Bda-dunning-2012-12-06 (20)

Big Data Analytics London
Big Data Analytics LondonBig Data Analytics London
Big Data Analytics London
 
predictive-analytics-san-diego-2013-02-21
predictive-analytics-san-diego-2013-02-21predictive-analytics-san-diego-2013-02-21
predictive-analytics-san-diego-2013-02-21
 
What is the past future tense of data?
What is the past future tense of data?What is the past future tense of data?
What is the past future tense of data?
 
Hcj 2013-01-21
Hcj 2013-01-21Hcj 2013-01-21
Hcj 2013-01-21
 
Super-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapRSuper-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapR
 
Strata New York 2012
Strata New York 2012Strata New York 2012
Strata New York 2012
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
 
Big data, why now?
Big data, why now?Big data, why now?
Big data, why now?
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
 
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
 
Unlocking the Power of Point Clouds_Tony Kent - Intergraph Geospatial World T...
Unlocking the Power of Point Clouds_Tony Kent - Intergraph Geospatial World T...Unlocking the Power of Point Clouds_Tony Kent - Intergraph Geospatial World T...
Unlocking the Power of Point Clouds_Tony Kent - Intergraph Geospatial World T...
 
Tensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworks
 
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO...
 
Distributed Deep Learning on Spark
Distributed Deep Learning on SparkDistributed Deep Learning on Spark
Distributed Deep Learning on Spark
 
Chicago Hadoop in Finance - Ted Dunning
Chicago Hadoop in Finance - Ted DunningChicago Hadoop in Finance - Ted Dunning
Chicago Hadoop in Finance - Ted Dunning
 
Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
 
MapR and Machine Learning Primer
MapR and Machine Learning PrimerMapR and Machine Learning Primer
MapR and Machine Learning Primer
 

Mehr von Ted Dunning

Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in Kubernetes
Ted Dunning
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015
Ted Dunning
 

Mehr von Ted Dunning (20)

Dunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxDunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptx
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with Kubernetes
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in Kubernetes
 
Anomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forAnomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look for
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine Learning
 
Machine Learning Logistics
Machine Learning LogisticsMachine Learning Logistics
Machine Learning Logistics
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logistics
 
T digest-update
T digest-updateT digest-update
T digest-update
 
Finding Changes in Real Data
Finding Changes in Real DataFinding Changes in Real Data
Finding Changes in Real Data
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC Keynote
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoop
 
Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015
 
Sharing Sensitive Data Securely
Sharing Sensitive Data SecurelySharing Sensitive Data Securely
Sharing Sensitive Data Securely
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
 
How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside Down
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on Hadoop
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015
 
Doing-the-impossible
Doing-the-impossibleDoing-the-impossible
Doing-the-impossible
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine Learning
 
Cognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approachesCognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approaches
 

Kürzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Kürzlich hochgeladen (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Bda-dunning-2012-12-06

  • 1. New Data, New Insights it isn’t what we thought it was ©MapR Technologies - Confidential 1
  • 2. Do you remember the future? ©MapR Technologies - Confidential 2
  • 3. ©MapR Technologies - Confidential 3
  • 4. Some things turned out as expected ©MapR Technologies - Confidential 4
  • 5. Guys wearing Fedoras ©MapR Technologies - Confidential 5
  • 6. What about “Big Data”? ©MapR Technologies - Confidential 6
  • 7. Harvard University will have 200 x 106 volumes by 2040 Fremont Rider, 1944 ©MapR Technologies - Confidential 7
  • 8. To cope … only short papers should be published. … not more than 2500 characters counting “space,” punctuation marks, etc. Gray and Ruston in IEEE Transactions on Electronic Computers, 1964 ©MapR Technologies - Confidential 8
  • 9. Remember the guy in the Fedora? ©MapR Technologies - Confidential 9
  • 10. He’s tweeting about this right now ©MapR Technologies - Confidential 10
  • 11. So what is the big data monorail and what is the Fedora? ©MapR Technologies - Confidential 11
  • 12. Data curation Rigid Schemas Engineered Structure ©MapR Technologies - Confidential 12
  • 13. Data curation Rigid Schemas Engineered Structure ©MapR Technologies - Confidential 13
  • 14. Data as-you-find-it Flexible schemas Late binding ©MapR Technologies - Confidential 14
  • 15. Data as-you-find-it Flexible schemas Late binding ©MapR Technologies - Confidential 15
  • 16. ©MapR Technologies - Confidential 16
  • 17. ©MapR Technologies - Confidential 17
  • 18. ©MapR Technologies - Confidential 18
  • 19. ©MapR Technologies - Confidential 19
  • 20. Why is it different? How does it work? Scan, don’t seek Stream, don’t store ©MapR Technologies - Confidential 20
  • 21. Targeted Advertising - not a monorail - but not what we thought - the value is in the provenance ©MapR Technologies - Confidential 21
  • 22. Social Media Analytics - what are they saying - and exactly when? - combined with? ©MapR Technologies - Confidential 22
  • 23. Deep Learning - what does it mean - the math keeps on getting easier - blame it on scale ©MapR Technologies - Confidential 23
  • 24.  Send me email ! tdunning@maprtech.com  Tweet as the spirit moves @ted_dunning ©MapR Technologies - Confidential 24