MonetDB/DataCell

Exploiting the Power of Relational Databases for Efficient Stream Processing

CWI
Project Meeting@Innsbruck
Feb 28 - Mar 04, 2011




Wednesday, March 02, 2011
DBMS versus DSMS

DBMS: one-time query over incoming data held in disk storage
   1    Store incoming tuples
   2    Submit one-time query
   3    Query processing on the already stored data
   4    Create answer

DSMS: continuous queries over an input stream held in memory
   1    Submit continuous queries
   2    Incoming streams
   3    Input stream is processed on the fly
   4    The produced results are continuously delivered to the clients (notification)

A data stream is a never-ending sequence of tuples.

One-time Queries versus Continuous Queries

[Figure: timeline of data arrival (t of data) with the arrival time of q marked between t_n and t_n+1; the one-time query lies on the t_n side, the continuous query on the t_n+1 side.]

   One-time query
      • Evaluated once over the already stored tuples

   Continuous query
      • Waits for future incoming tuples
      • Evaluated continuously as new tuples arrive



Observation
   • Nowadays, stream systems are built from scratch

   • Operators and optimizations are redesigned

   • Relational databases are considered inefficient and too complex

   • Modern stream applications require management of both stored and
      streaming data




Goals
   • We design the DataCell on top of an existing database kernel

   • Exploit database techniques, query optimization and operators

   • Provide full language functionality (SQL'03)

   • Research questions
      • Is it viable?
      • Multi-query processing/scheduling
      • Real-time processing



The Basic Idea of DataCell
      • Stream tuples are first stored in (appended to) baskets.

      • We evaluate the continuous queries over the baskets.
             Instead of throwing each incoming tuple against the waiting queries (data streams),
             first collect the data and then throw the queries against the tuples (database).

      • Once a tuple is seen, it is dropped from its basket.
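
A minimal Python sketch of this idea follows. It is not the MonetDB implementation; the `Basket` class, its `drain` method, and the `run_queries` helper are hypothetical names chosen for the illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


class Basket:
    """Holds stream tuples until the continuous queries have seen them."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def append(self, row: Row) -> None:
        # Incoming tuples are first collected, not matched one by one.
        self.rows.append(row)

    def drain(self) -> List[Row]:
        """Return everything collected so far and drop it from the basket."""
        batch, self.rows = self.rows, []
        return batch


def run_queries(basket: Basket,
                queries: Dict[str, Callable[[Row], bool]]) -> Dict[str, List[Row]]:
    """Throw the whole query set against the collected batch of tuples."""
    batch = basket.drain()
    return {name: [r for r in batch if pred(r)] for name, pred in queries.items()}


basket = Basket()
for value in (12, 3, 100):
    basket.append({"a": value})

results = run_queries(basket, {
    "large": lambda r: r["a"] > 10,
    "small": lambda r: r["a"] <= 10,
})
```

After `run_queries` returns, the basket is empty again: every tuple has been seen and dropped.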


The MonetDB/DataCell stack
(layers from top to bottom; "+" marks what DataCell adds to the MonetDB stack)

   SQL Query
   SQL:   Query parser + CQ
          Query Optimizer + DC opt
   Continuous Query Scheduler
   MAL:   MAL Interpreter
   Query Executor




DataCell Components
   Receptor   <=>   Listens to a stream
   Emitter    <=>   Delivers events to the clients
   Factory    <=>   Continuous query
   Basket     <=>   Holds events

   Input Stream  ->  R  ->  Q  ->  E  ->  Output Stream
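
The four components can be wired together as in the following Python sketch. It is an assumption-laden illustration rather than DataCell's actual interfaces: the function names `receptor`, `factory`, and `emitter`, the integer events, and the single-threaded control flow are all made up for the example.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


class Basket:
    """Holds events between two processing steps."""

    def __init__(self) -> None:
        self.events: Deque[int] = deque()


def receptor(stream: Iterable[int], out: Basket) -> None:
    """Listens to a stream and appends each incoming event to its basket."""
    out.events.extend(stream)


def factory(query: Callable[[int], bool], inp: Basket, out: Basket) -> None:
    """A continuous query: consumes the input basket, fills the output basket."""
    while inp.events:
        event = inp.events.popleft()
        if query(event):
            out.events.append(event)


def emitter(inp: Basket, deliver: Callable[[int], None]) -> None:
    """Delivers the events of its basket to the clients."""
    while inp.events:
        deliver(inp.events.popleft())


b_in, b_out = Basket(), Basket()
delivered: List[int] = []

receptor([12, 3, 100, 14], b_in)           # R
factory(lambda a: a > 10, b_in, b_out)     # Q
emitter(b_out, delivered.append)           # E
```

Baskets are the only coupling between the steps, so each step can be scheduled independently of the others.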


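DataCell Architecture

[Figure: DataCell architecture. Top: SQL Compiler and SPARQL Compiler above the MAL Optimizer. Inside the DataCell: receptors R1-R3 on the input side; baskets held as data columns (id a, id b, id c, id k, id m, id n); the Continuous Query Scheduler with the factories; result baskets (id a', id b', id k', id k'', id n'); emitters E1-E3 on the output side. Bottom: Disk Storage. Legend: Basket, Receptor, Emitter, Factory.]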
Basket Expressions
      • Syntax:
             A basket expression is an SQL sub-query surrounded by square brackets

      • Semantics:
             All tuples that qualify in a basket expression are removed from the basket by the factories

           Tumbling window
           Q1: SELECT * FROM [SELECT * FROM X TOP 3] AS S WHERE S.a > 10;

           Sliding window
           Q2: SELECT * FROM (
                   [SELECT * FROM X TOP 1]
                   UNION
                   SELECT * FROM X TOP 2 OFFSET 1) AS S
               WHERE S.a > 10;

           Example: with basket X holding a = 12, 3, 100, 14, both Q1 and Q2 output 12 and 100.

      • Flexible, expressive continuous queries, by selectively picking the data to
         process from a basket

      • Allows processing of predicate windows on a stream
         • Out-of-order processing


Query processing strategies
            Separate Baskets

     • Each continuous query is encapsulated within a single factory
     • Each factory f has its own input baskets, which are accessed only by f
     • If more than one factory is interested in the same data, we create
          multiple copies of that data

     • Factories are completely independent
     • Exploit the column-store to minimize the overhead of replication

     [Figure: basket b is read by Qcopy, which fills bcopy1, bcopy2 and bcopy3; these are read by Q1, Q2 and Q3 respectively.]
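
The separate-baskets strategy can be sketched as follows. This is an illustration under assumptions, not DataCell code: `copy_factory` stands in for Qcopy, baskets are plain Python lists, and the three predicates are invented for the example.

```python
from typing import Callable, List


def copy_factory(source: List[int], targets: List[List[int]]) -> None:
    """Qcopy: replicate every tuple of b into each factory's private basket."""
    for target in targets:
        target.extend(source)
    source.clear()                  # b itself is emptied once the copies exist


def run_factory(query: Callable[[int], bool], private: List[int]) -> List[int]:
    """A factory reads and empties only its own basket."""
    result = [a for a in private if query(a)]
    private.clear()
    return result


b = [12, 3, 100, 14]
bcopy1: List[int] = []
bcopy2: List[int] = []
bcopy3: List[int] = []
copy_factory(b, [bcopy1, bcopy2, bcopy3])

r1 = run_factory(lambda a: a > 10, bcopy1)      # Q1
r2 = run_factory(lambda a: a < 10, bcopy2)      # Q2
r3 = run_factory(lambda a: a % 2 == 0, bcopy3)  # Q3
```

Because no basket is shared, the three factories could run in any order or in parallel; the price is one copy of the data per interested query.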

Query processing strategies
          Shared Baskets

      • Exploit query similarities to avoid replication
      • Baskets are shared among factories
      • Two new (cheap) factories: Locker and Unlocker

      [Figure: basket b is passed to Lock, then through FL1, FL2, FL3 to Q1, Q2, Q3, then through FU1, FU2, FU3 to Unlock.]
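
The slides do not spell out what Locker and Unlocker do, so the following Python sketch rests on an assumption: the locker freezes the shared basket and fixes how many tuples the queries may read, and the unlocker drops those tuples once every query has seen them and then releases the basket. All names (`SharedBasket`, `locker`, `unlocker`, `run_shared`) are hypothetical.

```python
import threading
from itertools import islice
from typing import Callable, List


class SharedBasket:
    """One basket read by several factories, without per-query copies."""

    def __init__(self) -> None:
        self.rows: List[int] = []
        self.lock = threading.Lock()


def locker(basket: SharedBasket) -> int:
    """Lock the basket and return how many tuples the queries may read."""
    basket.lock.acquire()
    return len(basket.rows)


def unlocker(basket: SharedBasket, seen: int) -> None:
    """Drop the tuples every query has seen, then release the basket."""
    del basket.rows[:seen]
    basket.lock.release()


def run_shared(basket: SharedBasket,
               queries: List[Callable[[int], bool]]) -> List[List[int]]:
    seen = locker(basket)
    # Each query scans the same rows in place; nothing is replicated.
    results = [[a for a in islice(basket.rows, seen) if q(a)] for q in queries]
    unlocker(basket, seen)
    return results


shared = SharedBasket()
shared.rows.extend([12, 3, 100, 14])
shared_results = run_shared(shared, [lambda a: a > 10, lambda a: a < 10])
```

Compared with separate baskets, the data is stored once, at the cost of coordinating the factories through the lock.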


Summary

   [Figure: two images joined as "+" and "=", giving DataCell]




Wednesday, March 02, 2011

Weitere Àhnliche Inhalte

Mehr von PlanetData Network of Excellence

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoPlanetData Network of Excellence
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksPlanetData Network of Excellence
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingPlanetData Network of Excellence
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPlanetData Network of Excellence
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Exploiting Relational Databases for Efficient Stream Processing

  • 1. MonetDB/DataCell: Exploiting the Power of Relational Databases for Efficient Stream Processing. CWI, Project Meeting@Innsbruck, Feb 28 - Mar 04, 2011. Wednesday, March 02, 2011
  • 2. DBMS versus DSMS: DBMS. Diagram labels: Incoming data, DB, Disk storage, One-time query, answer.
      1. Store incoming tuples
      2. Submit one-time query
      3. Query processing on the already stored data
      4. Create answer
  • 3. DBMS versus DSMS: DSMS. Diagram labels: Input stream, Memory, Continuous queries, notification.
      1. Submit continuous queries
      2. Incoming streams
      3. Input stream is processed on the fly
      4. The produced results are continuously delivered to the clients
      A data stream is a never-ending sequence of tuples.
  • 4-8. One-time Queries versus Continuous Queries. Timeline labels: arrival time of q, one-time query, continuous query, t of data, t_n, t_n+1.
      One-time query
          • Evaluated once over the already stored tuples
      Continuous query
          • Waits for future incoming tuples
          • Evaluated continuously as new tuples arrive
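The contrast between the two query types can be sketched in Python. This is only an illustration under assumed names (`one_time_query`, `continuous_query`, and the sample rows are hypothetical, not part of DataCell): the one-time query sees only what is stored when it arrives, while the continuous query keeps producing results as later tuples arrive.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]


def one_time_query(stored: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Evaluate once over the tuples already stored when the query arrives."""
    return [row for row in stored if predicate(row)]


def continuous_query(stream: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Wait for future tuples and emit each qualifying tuple as it arrives."""
    for row in stream:
        if predicate(row):
            yield row


def is_large(row: Row) -> bool:
    return row["a"] > 10


# Hypothetical data: tuples stored before the query arrives, and tuples arriving later.
stored = [{"a": 5}, {"a": 20}]
future = [{"a": 30}, {"a": 1}, {"a": 11}]

one_time_result = one_time_query(stored, is_large)
continuous_result = list(continuous_query(iter(future), is_large))
```

A generator is used for the continuous query because it naturally models a result set that is never complete while the stream is still open.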
  • 9. Observation
      • Stream systems today are built from scratch
      • Operators and optimizations are redesigned
      • Relational databases are considered inefficient and too complex
      • Modern stream applications require management of both stored and streaming data
  • 10. Goals
      • We design the DataCell on top of an existing database kernel
      • Exploit database techniques, query optimization and operators
      • Provide full language functionalities (SQL’03)
      • Research questions
          • Is it viable?
          • Multi-query processing/scheduling
          • Real-time processing
  • 11. The Basic Idea of DataCell
      • Stream tuples are first stored in (appended to) baskets.
      • We evaluate the continuous queries over the baskets. Instead of throwing each incoming tuple against the waiting queries (Data Streams), first collect the data and then throw the queries against the tuples (DataBase).
      • Once a tuple is seen, it is dropped from its basket.
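The basket idea above can be sketched as follows. This is a minimal Python illustration, not DataCell code: the `Basket` class, `run_query_set`, and the predicates are assumed names chosen for the example. Tuples are first collected, the whole query set is then evaluated over the collected batch, and tuples that have been seen are dropped.

```python
from typing import Callable, Dict, List


class Basket:
    """Hypothetical basket: an append-only buffer that is emptied once read."""

    def __init__(self) -> None:
        self._tuples: List[int] = []

    def append(self, value: int) -> None:
        self._tuples.append(value)

    def drain(self) -> List[int]:
        # Hand over everything collected so far and drop it from the basket.
        batch, self._tuples = self._tuples, []
        return batch

    def __len__(self) -> int:
        return len(self._tuples)


def run_query_set(basket: Basket, queries: Dict[str, Callable[[int], bool]]) -> Dict[str, List[int]]:
    """Throw every waiting query against the collected batch (set at a time)."""
    batch = basket.drain()
    return {name: [t for t in batch if pred(t)] for name, pred in queries.items()}


basket = Basket()
for value in (3, 12, 100):
    basket.append(value)

results = run_query_set(basket, {"gt10": lambda t: t > 10, "even": lambda t: t % 2 == 0})
```

Processing a batch per query, rather than one tuple per query, is what lets the relational operators of the underlying kernel be reused.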
  • 12. The MonetDB/DataCell stack. Diagram labels, in order: SQL Query, SQL, Query parser, Query Optimizer, MAL, MAL Interpreter, Query Executor.
  • 13. The MonetDB/DataCell stack. Diagram labels, in order: SQL Query, SQL, Query parser + CQ, Query Optimizer + DC opt, Continuous Query Scheduler, MAL, MAL Interpreter, Query Executor.
  • 14. DataCell Components
      • Receptor <=> Listens to a stream
      • Emitter <=> Delivers events to the clients
      • Factory <=> Continuous query
      • Basket <=> Holds events
      Diagram labels: Input Stream, R, Q, E, Output Stream.
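The four components can be wired together as in the R, Q, E diagram. The sketch below is an assumption-laden illustration in Python: the function names, the use of `collections.deque` as a basket, and the `a > 10` predicate are all hypothetical and do not reflect the MonetDB implementation.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def receptor(source: Iterable[int], in_basket: Deque[int]) -> None:
    """Listen to a stream and append each event to the input basket."""
    for event in source:
        in_basket.append(event)


def factory(in_basket: Deque[int], out_basket: Deque[int], predicate: Callable[[int], bool]) -> None:
    """Continuous query: consume the input basket, keep qualifying events."""
    while in_basket:
        event = in_basket.popleft()
        if predicate(event):
            out_basket.append(event)


def emitter(out_basket: Deque[int], deliver: Callable[[int], None]) -> None:
    """Deliver the events in the output basket to the client."""
    while out_basket:
        deliver(out_basket.popleft())


in_basket: Deque[int] = deque()
out_basket: Deque[int] = deque()
delivered: List[int] = []

receptor([12, 3, 100], in_basket)                   # R
factory(in_basket, out_basket, lambda a: a > 10)    # Q
emitter(out_basket, delivered.append)               # E
```

In this sketch the three steps run once in sequence; in a running system each would be invoked repeatedly by a scheduler as new events arrive.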
  • 15-19. DataCell Architecture. Diagram components: SQL Compiler (slide 19 adds a SPARQL Compiler), MAL Optimizer, Continuous Query Scheduler, Data Columns, Disk Storage, and the DataCell with receptors R1-R3, emitters E1-E3, factories, and baskets (labelled with ids such as a, b, c, k, m, n and primed variants). Legend: Basket, Receptor, Emitter, Factory.
  • 20-26. Basket Expressions
      • Syntax: an SQL sub-query surrounded by square brackets
      • Semantics: all qualifying tuples in a basket expression are removed by the factories
      Tumbling window
          Q1: SELECT * FROM [SELECT * FROM X TOP 3] AS S WHERE S.a > 10;
          Figure: basket holds 12, 3, 100, 14; Q1 outputs 12, 100.
      Sliding window
          Q2: SELECT * FROM (
                  [SELECT * FROM X TOP 1]
                  UNION
                  SELECT * FROM X TOP 2 OFFSET 1) AS S
              WHERE S.a > 10;
          Figure: basket holds 12, 3, 100, 14; Q2 outputs 12, 100.
      • Flexible/expressive continuous queries, by selectively picking the data to process from a basket
      • Allows processing of predicate windows on a stream
      • Out-of-order processing
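The effect of the square brackets in Q1 and Q2 can be simulated in Python. This is a sketch under stated assumptions: the basket is a plain list of values for column a, the function names are hypothetical, the basket is assumed to hold enough tuples for one window, and UNION is modelled as order-preserving de-duplication. Only the bracketed part of each query removes tuples from the basket.

```python
from typing import List


def tumbling_q1(basket: List[int], size: int = 3, threshold: int = 10) -> List[int]:
    """Q1: the bracketed TOP `size` tuples are consumed, then filtered on a > threshold."""
    window = basket[:size]
    del basket[:size]              # bracketed tuples leave the basket
    return [a for a in window if a > threshold]


def sliding_q2(basket: List[int], threshold: int = 10) -> List[int]:
    """Q2: [TOP 1] is consumed; TOP 2 OFFSET 1 is only read and stays for the next run."""
    consumed = basket[:1]
    peeked = basket[1:3]
    del basket[:1]                 # only the bracketed tuple leaves the basket
    window = list(dict.fromkeys(consumed + peeked))  # UNION drops duplicates
    return [a for a in window if a > threshold]


x_tumbling = [12, 3, 100, 14]
q1_out = tumbling_q1(x_tumbling)

x_sliding = [12, 3, 100, 14]
q2_first = sliding_q2(x_sliding)
q2_second = sliding_q2(x_sliding)
```

With the values from the slide figure, the tumbling window leaves only 14 in the basket, while the sliding window advances by one tuple per evaluation and re-reads the overlap.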
  • 27. Query processing strategies: Separate Baskets
      • Each continuous query is encapsulated within a single factory
      • Each factory f has its own input baskets, which are accessed only by f
      • If more than one factory is interested in the same data, we create multiple copies of this data
      • Factories are completely independent
      • Exploit the column-store to minimize the overhead of replication
      Diagram labels: b, Qcopy, bcopy1 with Q1, bcopy2 with Q2, bcopy3 with Q3.
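The separate-baskets strategy can be sketched as a copy step followed by independent factories. The Python below is illustrative only: `copy_query` stands in for the Qcopy box of the diagram, and the three predicates are hypothetical queries.

```python
from typing import Callable, Dict, List


def copy_query(source: List[int], n_copies: int) -> List[List[int]]:
    """Drain the source basket b into one private copy per interested factory."""
    batch = list(source)
    source.clear()
    return [list(batch) for _ in range(n_copies)]


def run_factory(private_basket: List[int], predicate: Callable[[int], bool]) -> List[int]:
    """A factory consumes only its own basket, so it never affects other factories."""
    result = [t for t in private_basket if predicate(t)]
    private_basket.clear()
    return result


b = [12, 3, 100, 14]
queries: Dict[str, Callable[[int], bool]] = {
    "Q1": lambda a: a > 10,
    "Q2": lambda a: a < 10,
    "Q3": lambda a: a % 2 == 0,
}

copies = copy_query(b, len(queries))
separate_results = {
    name: run_factory(basket, pred)
    for (name, pred), basket in zip(queries.items(), copies)
}
```

The cost of this independence is the replication itself, which is why the slide points to the column-store as a way to keep copying cheap.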
  • 28-32. Query processing strategies: Shared Baskets
      • Exploit query similarities to avoid replication
      • Baskets are shared among factories
      • Two new (cheap) factories: Locker and Unlocker
      Diagram labels: b, Lock, FL1, FL2, FL3, Q1, Q2, Q3, FU1, FU2, FU3, Unlock.
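One possible reading of the shared-baskets strategy is sketched below in Python. The slides name the Locker and Unlocker factories but do not specify their behavior, so everything here is an assumption: the locker acquires a lock on the shared basket, all queries read the same tuples without copying them, and the unlocker drops the tuples that were read before releasing the lock. The class and function names are hypothetical.

```python
import threading
from typing import Callable, Dict, List


class SharedBasket:
    """Hypothetical basket shared by several factories, guarded by one lock."""

    def __init__(self, tuples: List[int]) -> None:
        self.tuples = list(tuples)
        self.lock = threading.Lock()


def locker(basket: SharedBasket) -> None:
    basket.lock.acquire()


def unlocker(basket: SharedBasket, seen: int) -> None:
    # Assumption: tuples seen by every query are dropped before unlocking.
    del basket.tuples[:seen]
    basket.lock.release()


def run_shared(basket: SharedBasket, queries: Dict[str, Callable[[int], bool]]) -> Dict[str, List[int]]:
    locker(basket)
    seen = len(basket.tuples)
    try:
        # Every query reads the same basket contents; no per-query copy is made.
        return {
            name: [t for t in basket.tuples[:seen] if pred(t)]
            for name, pred in queries.items()
        }
    finally:
        unlocker(basket, seen)


shared = SharedBasket([12, 3, 100, 14])
shared_results = run_shared(shared, {"Q1": lambda a: a > 10, "Q2": lambda a: a < 10})
```

Compared with the separate-baskets sketch, no data is replicated; the trade-off is that the queries are now coupled through the lock on the shared basket.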
  • 33. Summary + = DataCell