Hal’s Headache
Data Tuesday
02/25/2013
Florian Douetteau

Meet Hal Alowne

Dim Sum, CEO & Founder, Dim’s Private Showroom:
“Hey Hal! We need a big data platform like the big guys.”

Hal Alowne, BI Manager, Dim’s Private Showroom

Let’s just do as they do!

European E-commerce Web Site
• $100M revenue
• 1 million customers
• 1 data analyst (Hal himself)

Big Data Copy Cat Project

Big Guys
• $10B+ revenue
• 100M+ customers
• 100+ data scientists

Dataiku - Data Tuesday                                                     2/28/2013             2
CHOOSE TECHNOLOGY

• NoSQL-Slavia: Elastic Search, SOLR, MongoDB, Riak, Membase
• Scalability Central: Hadoop, Ceph, Cassandra, Sphere, Spark
• Machine Learning Mystery Land: Scikit-Learn, Mahout, WEKA, MLBase, RapidMiner
• SQL Columnar Republic: InfiniDB, LucidDB, Impala
• Statistician Old House: R, Pandas
• Visualization County: D3, Crossfilter
• Data Clean Wasteland: Pig, Hive, Cascading, Talend

LEARN MACHINE LEARNING STUFF

• Try to understand myself
• Find people that understand machine learning and all this stuff

DO IT

Data sources: Open Data (megabytes), CRM (gigabytes), Web Logs (terabytes)
Tools: Storm, Hadoop, R, Elastic Search, SQL, D3

• Connect things together
• Pour data in
• Clean data
• Fix the leaks
• Start again

MERIT = TIME + ROI

TIME: 6 MONTHS

2013 to 2014:
• Find the right people (6 months?)
• Choose the technology (6 months?)
• Make it work (6 months?)

2013:
• Build the lab (6 months)
  • Train people
  • Reuse working patterns

Build a lab in 6 months (rather than 18 months)

ROI: APPS

• Targeted Newsletter
• Recommender System
• Dynamic Pricing

Deploy apps that actually deliver value

Dataiku

One Goal

“Help you build your data lab in less than six months”

One platform with an open source core

• Impact: Export predictions
• Flow: Manage datasets and transformations
• Feedback: Continuous loopback
• Doctor: Diagnose data
• Shaker: Prepare data
• D1: All-in-one data scientists distribution

One fake customer
A few real ones

Data Is Money

8 Douetteau - Dataiku - Data Tuesday Open Source, 26 Feb 2013
