BACK TO THE FUTURE: DATAFLOW FINALLY COMES OF AGE!
SPEAKER: Damian Black, CEO, SQLstream

Tuesday, November 27, 12
Real-time Big Data with Relational Streaming Dataflow Technology



Copyright © 2012 – Proprietary and Confidential Information of SQLstream Inc.

Brief History of Dataflow

What is Dataflow?
  ✓ Parallel processing model invented in the 70s
  ✓ Graph-based execution, without destructive updates
  ✓ Data flow along arcs to nodes, are combined, and flow along output arcs
What happened to Dataflow?
  ✓ A number of experimental parallel computers were designed and built
  ✓ Transputer and Occam were literally decades ahead of their time
  ✓ Due for a resurgence thanks to inexpensive multi-core servers & SQL
What is Relational Streaming?
  ✓ A dataflow paradigm for processing Streaming Big Data tuples
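
The dataflow model described on this slide can be illustrated with a minimal Python sketch. It is not SQLstream's implementation; the `Node` class, the `run` scheduler, and the `(a + b) * (a - b)` example graph are hypothetical names chosen for illustration. A node fires once every input arc holds a token, builds a new value from those tokens, and sends it along its output arcs, so no existing value is updated in place.

```python
from collections import deque


class Node:
    """Hypothetical dataflow node: fires once every input arc holds a token."""

    def __init__(self, name, func, n_inputs):
        self.name = name
        self.func = func
        self.inputs = [deque() for _ in range(n_inputs)]  # one queue per input arc
        self.outputs = []  # output arcs as (target_node, target_port)

    def connect(self, target, port):
        self.outputs.append((target, port))

    def ready(self):
        # A non-empty deque is truthy, so this checks every input arc.
        return all(self.inputs)

    def fire(self):
        # Consume one token per input arc and build a NEW value;
        # nothing upstream is modified in place.
        args = [arc.popleft() for arc in self.inputs]
        result = self.func(*args)
        for target, port in self.outputs:
            target.inputs[port].append(result)
        return result


def run(nodes):
    """Fire ready nodes until none can fire; return the last result per node."""
    results = {}
    progress = True
    while progress:
        progress = False
        for node in nodes:
            if node.ready():
                results[node.name] = node.fire()
                progress = True
    return results


# Example graph computing (a + b) * (a - b): "add" and "sub" are independent
# and could run in parallel; "mul" combines their outputs.
add = Node("add", lambda x, y: x + y, 2)
sub = Node("sub", lambda x, y: x - y, 2)
mul = Node("mul", lambda x, y: x * y, 2)
add.connect(mul, 0)
sub.connect(mul, 1)

for node in (add, sub):
    node.inputs[0].append(5)  # a
    node.inputs[1].append(3)  # b

results = run([add, sub, mul])
```

The scheduler here is sequential for simplicity. The point of the model is that any nodes that are ready at the same moment have no ordering constraint between them.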
Dataflow Graph: Pipelined and Superscalar Processing




Relational Streaming: DAGs of fine-grained dataflow.
Comparison of Techniques for Dataflow Scaling


                        Hadoop and HDFS                  Relational Streaming

  Data Distribution     Fat File                         Fat Stream

  Dataflow Enablement   Generate new tuples from old,    Generate new tuples from old,
                        leaving old tuples unaltered     leaving old tuples unaltered


Dataflow: Hadoop versus Relational Streaming




                     	
Hadoop style: data chunking, coarse-grained dataflow.

Relational Streaming: DAGs of fine-grained dataflow.
Parallel Dataflow Execution


      » Hadoop Map Reduce Process
      [Figure: pipeline stages Collect, Clean, Aggregate, Analyze, Deliver]


Parallel Dataflow Execution


      » Relational Streaming Approach:
               » Continuous Parallel Dataflow Execution
               » Real-time Answers Immediately
               » Intelligently populate data store: Hadoop or Data Warehouse
      [Figure: pipeline stages Collect, Clean, Aggregate, Analyze, Deliver, labeled "Low Latency"]
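
The continuous pipeline on this slide can be sketched with Python generators. This is an illustration only, not SQLstream code: the stage functions reuse the slide's stage names, while the sample `events`, the `threshold` parameter, and the shared `log` list are assumptions made for the example. Each record is pulled through every stage before the next record is read, so results are delivered while input is still arriving, instead of after a whole batch has been processed.

```python
log = []  # hypothetical trace of reads and deliveries, in the order they happen


def collect(source):
    for raw in source:
        log.append(("read", raw))
        yield raw


def clean(records):
    # Drop blank records and trim whitespace, yielding new values.
    for record in records:
        record = record.strip()
        if record:
            yield record


def aggregate(records):
    # Emit a new (key, running_count) tuple for every incoming record.
    counts = {}
    for key in records:
        counts[key] = counts.get(key, 0) + 1
        yield key, counts[key]


def analyze(pairs, threshold):
    for key, count in pairs:
        if count >= threshold:
            yield f"{key} seen {count} times"


def deliver(alerts):
    for alert in alerts:
        log.append(("deliver", alert))


events = ["/a ", "/b", "", "/a", "/a"]  # made-up request paths
deliver(analyze(aggregate(clean(collect(events))), threshold=2))
```

After this runs, `log` shows the first delivery happening before the last event is read, which is the low-latency behavior the slide contrasts with a batch job.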
Relational Streaming synergies with Hadoop
      » Relational Stream Processors co-located with Hadoop Servers
               » Stream/re-stream into and from local data stores in parallel

      » Combination performs Real-time and Historical processing:
               » Querying the future – Continuous ETL and Analytics (parallel pipelines)

               » Querying the past – Hadoop batch jobs on stored tuples (parallel batches)



      [Figure: several "Hadoop & Relational Streaming Server" nodes, each showing streaming pipelines (Select, Project, Join, Agg, Order, Group) alongside MapReduce stages (Split, Map, Combine, Sort, Reduce)]



Application Example – Google: “Youtube Mozilla Glow”




     » Mozilla Firefox 4 – Real-time Download Monitor

     » Continuous processing of download requests

     » Real-time integration with Hadoop and HBase




Cloud Monitoring – Detecting Service Error Spikes
   -- Flag URLs whose per-minute error count exceeds the trailing average
   -- by more than two standard deviations (an upper Bollinger band).
   SELECT STREAM ROWTIME, url, "numErrorsLastMinute"
   FROM (
      SELECT STREAM ROWTIME, url, "numErrorsLastMinute",
      AVG("numErrorsLastMinute") OVER
          (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING) AS "avgErrorsPerMinute",
      STDDEV("numErrorsLastMinute") OVER
          (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING) AS "stdDevErrorsPerMinute"
      FROM "ServiceRequestsPerMinute") AS S
   WHERE S."numErrorsLastMinute" > S."avgErrorsPerMinute" + 2 * S."stdDevErrorsPerMinute";
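
For readers unfamiliar with windowed streaming SQL, the rule in the query above can be approximated in plain Python. This is a sketch under stated assumptions, not a description of how SQLstream evaluates the query: the `SpikeDetector` class and its parameters are hypothetical, the standard deviation is taken as the population form (the slide does not say which form `STDDEV` uses), and the example uses a 600-second window rather than the slide's one-minute range so that several samples fall inside it.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class SpikeDetector:
    """Hypothetical per-URL detector: value > mean + k * stddev over a trailing window."""

    def __init__(self, window_seconds, k=2.0):
        self.window_seconds = window_seconds
        self.k = k
        self.history = defaultdict(deque)  # url -> deque of (timestamp, errors)

    def observe(self, timestamp, url, errors):
        """Record one sample and return True if it is a spike for this url."""
        window = self.history[url]
        window.append((timestamp, errors))
        # Evict samples older than the trailing time range; the current
        # sample stays in the window, as the current row does in the query.
        while window[0][0] < timestamp - self.window_seconds:
            window.popleft()
        values = [value for _, value in window]
        return errors > mean(values) + self.k * pstdev(values)


# Made-up per-minute error counts for a single URL; the last one is a spike.
detector = SpikeDetector(window_seconds=600)
samples = [2, 3, 2, 3, 2, 3, 2, 40]
flags = [
    detector.observe(timestamp=60 * i, url="/checkout", errors=count)
    for i, count in enumerate(samples)
]
```

Because the current sample is part of its own window, a single outlier can only exceed the two-sigma band once enough earlier samples are present, which is one reason the window length matters in practice.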




   » Millions of records per second

   » Real-time Bollinger Bands

   » Amazon EC2

   [Figure: streams flowing between multiple servers]



A New Streaming Data Management Quadrant
   [Figure: quadrant chart]
   Vertical axis: High-level Declarative Language & Operation (top) to Low-level Procedural Language & Operation (bottom)
   Horizontal axis: Historical analysis, Periodic batches (left) to Continuous analysis, Real-time processing (right)
   Plotted: Data Warehouses, Relational Streaming, Hadoop Big Data, Messaging Middleware
   Region labels: Real-time Big Data, Batched Big Data
Benefits of Real-time “Big Dataflow” with Relational Streaming


   1. Real-time Integration       Continuous, real-time data integration

   2. Real-time Analysis          Process, analyze, and react – all in real-time

   3. RT Parallel Processing      Made easy, auto-optimized, massive scale

   Dataflow finally comes of age.
   Relational Streaming. The Next Wave of Big Data.
Query the Future ®
     The Future of Query.




 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Tech4Africa - Opportunities around Big Data
Tech4Africa - Opportunities around Big DataTech4Africa - Opportunities around Big Data
Tech4Africa - Opportunities around Big Data
 
Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
Processing Big Data
Processing Big DataProcessing Big Data
Processing Big Data
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 
Planing and optimizing data lake architecture
Planing and optimizing data lake architecturePlaning and optimizing data lake architecture
Planing and optimizing data lake architecture
 
Planning and Optimizing Data Lake Architecture - Milos Milovanovic
 Planning and Optimizing Data Lake Architecture - Milos Milovanovic Planning and Optimizing Data Lake Architecture - Milos Milovanovic
Planning and Optimizing Data Lake Architecture - Milos Milovanovic
 
Hadoop Big data Solution Provider
Hadoop Big data Solution ProviderHadoop Big data Solution Provider
Hadoop Big data Solution Provider
 
Hadoop For Enterprises
Hadoop For EnterprisesHadoop For Enterprises
Hadoop For Enterprises
 
Cloud computing era
Cloud computing eraCloud computing era
Cloud computing era
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Bringing olap fully online analyze changing datasets in mem sql and spark wi...
Bringing olap fully online  analyze changing datasets in mem sql and spark wi...Bringing olap fully online  analyze changing datasets in mem sql and spark wi...
Bringing olap fully online analyze changing datasets in mem sql and spark wi...
 
13 09-28 hadoop-in_taiwan_2013_opening
13 09-28 hadoop-in_taiwan_2013_opening13 09-28 hadoop-in_taiwan_2013_opening
13 09-28 hadoop-in_taiwan_2013_opening
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 

BACK TO THE FUTURE: DATAFLOW FINALLY COMES OF AGE from Structure 2012

  • 1. BACK TO THE FUTURE: DATAFLOW FINALLY COMES OF AGE! SPEAKER: Damian Black, CEO, SQLstream. Tuesday, November 27, 12
  • 2. Real-time Big Data with Relational Streaming Dataflow Technology. Copyright © 2012 – Proprietary and Confidential Information of SQLstream Inc.
  • 6. Brief History of Dataflow
      What is Dataflow?
        ✓ Parallel processing model invented in the 70s
        ✓ Graph-based execution, without destructive updates
        ✓ Data flow along arcs to nodes, are combined, and flow along output arcs
      What happened to Dataflow?
        ✓ A number of experimental parallel computers were designed and built
        ✓ Transputer and Occam were literally decades ahead of their time
        ✓ Due for a resurgence thanks to inexpensive multi-core servers & SQL
      What is Relational Streaming?
        ✓ A dataflow paradigm for processing Streaming Big Data tuples
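The dataflow model on this slide (tokens travel along arcs, a node combines them once all its inputs are present, and the result travels along the output arcs without any input being overwritten) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SQLstream's implementation: the `Node` class, its `connect` and `send` methods, and `run_example` are hypothetical names introduced here.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class Node:
    """Hypothetical dataflow node: fires when every input arc holds a token."""

    def __init__(self, func: Callable[..., object], num_inputs: int) -> None:
        if num_inputs < 1:
            raise ValueError("a node needs at least one input arc")
        self.func = func
        # One FIFO queue per input arc.
        self.inputs: List[Deque[object]] = [deque() for _ in range(num_inputs)]
        # Output arcs as (downstream node, input port on that node).
        self.outputs: List[Tuple["Node", int]] = []
        # Values produced by a node that has no output arcs (a sink).
        self.results: List[object] = []

    def connect(self, downstream: "Node", port: int) -> None:
        self.outputs.append((downstream, port))

    def send(self, port: int, token: object) -> None:
        self.inputs[port].append(token)
        self._fire_if_ready()

    def _fire_if_ready(self) -> None:
        # Fire as long as every input arc has at least one token waiting.
        while all(self.inputs):
            args = [arc.popleft() for arc in self.inputs]
            value = self.func(*args)  # a new token; the inputs are not mutated
            if not self.outputs:
                self.results.append(value)
            for node, port in self.outputs:
                node.send(port, value)


def run_example() -> List[object]:
    """Build add -> double and push two pairs of tokens through it."""
    add = Node(lambda a, b: a + b, 2)
    double = Node(lambda x: x * 2, 1)
    add.connect(double, 0)
    add.send(0, 1)
    add.send(1, 2)    # both arcs filled: add fires, then double fires
    add.send(0, 10)
    add.send(1, 20)
    return double.results
```

Calling `run_example()` collects the values that reach the `double` sink. Because a node fires purely on token availability, independent nodes in a larger graph could run in parallel, which is the property the slide attributes to the model.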
  • 7. Dataflow Graph: Pipelined and Superscalar Processing. Relational Streaming: DAGs of fine-grained dataflow.
  • 9. Comparison of Techniques for Dataflow Scaling
      Data Distribution
        § Hadoop and HDFS: Fat File
        § Relational Streaming: Fat Stream
      Dataflow Enablement
        § Hadoop and HDFS: Generate new tuples from old, leaving old tuples unaltered
        § Relational Streaming: Generate new tuples from old, leaving old tuples unaltered
  • 10. Dataflow: Hadoop versus Relational Streaming
      Hadoop style: data chunking, coarse-grained dataflow.
      Relational Streaming: DAGs of fine-grained dataflow.
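The contrast between coarse-grained chunking and fine-grained dataflow can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not how Hadoop or SQLstream schedule work: `batch_style`, `streaming_style`, and `trace_order` are hypothetical helpers, and the stage names "clean" and "aggregate" are borrowed from the pipeline slides purely as labels.

```python
from typing import Callable, Iterable, Iterator, List


def batch_style(records: List[int], stages: List[Callable[[int], int]]) -> List[int]:
    """Coarse-grained: each stage finishes the whole chunk before the next starts."""
    chunk = list(records)
    for stage in stages:
        # Build a new chunk from the old one; the old chunk is left unaltered.
        chunk = [stage(r) for r in chunk]
    return chunk


def streaming_style(records: Iterable[int], stages: List[Callable[[int], int]]) -> Iterator[int]:
    """Fine-grained: each record passes through every stage as soon as it arrives."""
    stream: Iterator[int] = iter(records)
    for stage in stages:
        stream = map(stage, stream)  # lazy: nothing runs until a record is pulled
    return stream


def trace_order(style: str) -> List[str]:
    """Record the order in which (stage, record) pairs are executed."""
    log: List[str] = []

    def make_stage(name: str) -> Callable[[int], int]:
        def stage(record: int) -> int:
            log.append(f"{name}({record})")
            return record
        return stage

    stages = [make_stage("clean"), make_stage("aggregate")]
    if style == "batch":
        batch_style([1, 2], stages)
    else:
        list(streaming_style([1, 2], stages))
    return log
```

Comparing `trace_order("batch")` with `trace_order("stream")` shows the difference in ordering: in the batch version no record reaches the second stage until the whole chunk has cleared the first, whereas in the streaming version the first record is fully processed before the second is touched, which is where the low-latency claim comes from.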
  • 12. Parallel Dataflow Execution
      Stages: Collect, Clean, Aggregate, Analyze, Deliver
      » Hadoop Map Reduce Process
      Relational Streaming Approach:
      » Continuous Parallel Dataflow Execution
      » Real-time Answers Immediately
      » Intelligently populate data store: Hadoop or Data Warehouse
  • 13. Parallel Dataflow Execution
      Stages: Collect, Clean, Aggregate, Analyze, Deliver
      Relational Streaming Approach:
      » Continuous Parallel Dataflow Execution
      » Real-time Answers Immediately
      » Intelligently populate data store: Hadoop or Data Warehouse
      Low Latency
  • 14. Relational Streaming synergies with Hadoop
      » Relational Stream Processors co-located with Hadoop Servers
      » Stream/re-stream into and from local data stores in parallel
      » Combination performs Real-time and Historical processing:
          » Querying the future – Continuous ETL and Analytics (parallel pipelines)
          » Querying the past – Hadoop batch jobs on stored tuples (parallel batches)
      [Diagram: several "Hadoop & Relational Streaming Server" nodes, each showing streaming operators (Select, Project, Join, Agg, Order, Group) alongside MapReduce stages (Split, Map, Combine, Sort, Reduce)]
  • 15. Application Example – Google: "Youtube Mozilla Glow"
      » Mozilla Firefox 4 – Real-time Download Monitor
      » Continuous processing of download requests
      » Real-time integration with Hadoop and HBase
  • 16. Cloud Monitoring – Detecting Service Error Spikes

      SELECT STREAM ROWTIME, url, "numErrorsLastMinute"
      FROM (
        SELECT STREAM ROWTIME, url, "numErrorsLastMinute",
          AVG("numErrorsLastMinute") OVER (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING) AS "avgErrorsPerMinute",
          STDDEV("numErrorsLastMinute") OVER (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING) AS "stdDevErrorsPerMinute"
        FROM "ServiceRequestsPerMinute") AS S
      WHERE S."numErrorsLastMinute" > S."avgErrorsPerMinute" + 2 * S."stdDevErrorsPerMinute";

      » Millions of records per second
      » Real-time Bollinger Bands
      » Amazon EC2
      [Diagram: multiple streams feeding multiple servers]
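The logic of the query on this slide (per URL, flag a row whose error count exceeds the windowed average plus two standard deviations) can be mirrored in plain Python to make the computation explicit. This is a rough sketch under assumptions, not a reproduction of SQLstream semantics: `error_spikes` and the `Row` type are hypothetical, rows are assumed to arrive in time order, the window is assumed to include the current row, and population standard deviation (`statistics.pstdev`) is used because the slide does not say which variant `STDDEV` computes.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev
from typing import Deque, Dict, Iterable, Iterator, Tuple

# (rowtime in seconds, url, errors counted in the last minute)
Row = Tuple[float, str, float]


def error_spikes(
    rows: Iterable[Row],
    window_seconds: float = 60.0,
    num_std: float = 2.0,
) -> Iterator[Row]:
    """Yield rows whose error count exceeds mean + num_std * stddev of the
    trailing window for the same url (assumption: window includes the row)."""
    windows: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)
    for rowtime, url, errors in rows:
        window = windows[url]  # one sliding window per url (PARTITION BY url)
        window.append((rowtime, errors))
        # Evict rows that fall outside the trailing time range.
        while window[0][0] < rowtime - window_seconds:
            window.popleft()
        values = [value for _, value in window]
        threshold = mean(values) + num_std * pstdev(values)
        if errors > threshold:
            yield (rowtime, url, errors)


def demo_rows() -> list:
    """Hypothetical sample data: '/a' is steady then spikes, '/b' stays flat."""
    rows = []
    for i in range(10):
        t = float(i * 10)
        rows.append((t, "/a", 50.0 if i == 9 else 1.0))
        rows.append((t, "/b", 3.0))
    return rows
```

With the sample data from `demo_rows()`, only the final `/a` row clears its band. Because the generator consumes one row at a time and keeps only the current window per URL, memory stays bounded by the window size, which matches the continuous-query style the slide describes.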
  • 17. A New Streaming Data Management Quadrant
      [Quadrant diagram. Vertical axis: High-level Declarative Language & Operation to Low-level Procedural Language & Operation. Horizontal axis: Batched Big Data (historical analysis, periodic batches) to Real-time Big Data (continuous analysis, real-time processing). Technologies plotted: Data Warehouses, Hadoop Big Data, Relational Streaming, Messaging Middleware.]
  • 19. Benefits of Real-time "Big Dataflow" with Relational Streaming
      1. Real-time Integration: Continuous, real-time data integration
      2. Real-time Analysis: Process, analyze, and react – all in real-time
      3. RT Parallel Processing: Made easy, auto-optimized, massive scale
      Dataflow finally comes of age. Relational Streaming. The Next Wave of Big Data.
      Confidential and Trade Secret SQLstream Inc. © 2012
  • 20. Query the Future ® The Future of Query.