Astronomical Data Processing Using
   SciQL, an SQL Based Query
     Language for Array Data



 Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes
                          CWI Amsterdam

             ADASS XXI, Nov. 06-10, 2011, Paris, France

Why Not RDBMS?

             SQL is difficult

                No appropriate array denotations

                No functionally complete operation set

             DBMSs are slow

                Too much overhead

                Size limitations (due to BLOB representations)

                Existing foreign files

                Scale

                ...




2011-11-09                               ADASS XXI                  3
SciQL
             An array query language based on SQL:2003

             To lower the entrance fee to RDBMSs



             Distinguishing features (Kersten et al., AD ’11; Zhang et al.,
             IDEAS 2011):

                Arrays and tables as first class citizens of DBMSs

                Seamless integration of relational and array paradigms

                Named dimensions with constraints

                Flexible structure-based grouping




             LOFAR Transient Key Project use case

Array Definitions
                        y               null

                    3       0.0   0.0          0.0     0.0
                    2       0.0   0.0          0.0     0.0
             null                                                null
                    1       0.0   0.0          0.0     0.0
                    0       0.0   0.0          0.0     0.0
                                                             x
                            0     1            2       3
                                        null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);




             dimensions: any scalar data type

             dimensional range: [(start|∗) : (step|∗) : (stop|∗)]

             cell values: any column data type

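As a rough illustration of the dimensional range notation, the sketch below (Python, not SciQL) expands a bounded [start:step:stop] range into the index values it covers. It assumes the stop bound is exclusive, which is what the 0..3 axes drawn for [0:1:4] suggest. The helper names dim_values and create_array are hypothetical, and unbounded (∗) ranges are not modelled.

```python
from itertools import product


def dim_values(start, step, stop):
    """Index values of a bounded dimension [start:step:stop]; stop is assumed exclusive."""
    return list(range(start, stop, step))


def create_array(dims, default):
    """Hypothetical stand-in for CREATE ARRAY: every cell starts at the DEFAULT value.

    dims maps a dimension name to its (start, step, stop) range; the result
    maps each coordinate tuple to a cell value.
    """
    axes = [dim_values(*rng) for rng in dims.values()]
    return {coord: default for coord in product(*axes)}


# Mirrors: CREATE ARRAY A1 (x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
#                           v FLOAT DEFAULT 0.0);
A1 = create_array({"x": (0, 1, 4), "y": (0, 1, 4)}, default=0.0)

print(dim_values(0, 1, 4))  # [0, 1, 2, 3]
print(len(A1))              # 16
```

Under the same assumption, a range with a step other than 1 works the same way: dim_values(30, 10, 241) yields 30, 40, ..., 240.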
Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y              null

                    3       0.0   0.0         0.0   0.0

                    2       0.0   0.0         0.0   0.0
             null                                         null
                    1       0.0   0.5         0.5   0.5

                    0       0.0   0.0         0.0   0.0
                             0     1           2     3
                                                           x
                                       null




             Anchor point: A1[x][y]

Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y               null

                    3       0.0     0.0     0.0     0.0

                    2       0.0     0.0     0.0     0.0
             null                                          null
                    1       0.125   0.25    0.25    0.25

                    0       0.125   0.25    0.25    0.25
                            0       1       2       3
                                                            x
                                        null

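The tiling query can be mimicked outside the database to check the numbers. The sketch below is plain Python, not SciQL: for every anchor point (x, y) it averages the cells of the tile A1[x:x+2][y:y+2]. It assumes that cells falling outside the array count as null and are skipped by AVG, which is consistent with the values at the right-hand edge of the result. The function name tile_avg is hypothetical.

```python
def tile_avg(cells, size=2):
    """Average each size x size tile anchored at (x, y).

    cells maps (x, y) -> value. Cells outside the array are treated as
    missing and skipped (assumption: AVG ignores nulls).
    """
    result = {}
    for (x, y) in cells:
        tile = [cells[(i, j)]
                for i in range(x, x + size)
                for j in range(y, y + size)
                if (i, j) in cells]
        result[(x, y)] = sum(tile) / len(tile)
    return result


# Input array from the slide: all 0.0 except v = 0.5 at y = 1 for x = 1, 2, 3.
A1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
for x in (1, 2, 3):
    A1[(x, 1)] = 0.5

avg = tile_avg(A1)
for y in (3, 2, 1, 0):
    print(y, [avg[(x, y)] for x in range(4)])
# 3 [0.0, 0.0, 0.0, 0.0]
# 2 [0.0, 0.0, 0.0, 0.0]
# 1 [0.125, 0.25, 0.25, 0.25]
# 0 [0.125, 0.25, 0.25, 0.25]
```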
LOFAR Catalogue
  [Figure: the LOFAR catalogue as a multi-dimensional array.
   Dimensions: zone (-90 ... 90; Gray et al. 2006), meridian (0 ... 359),
   frequency (ν1, ν2, ν3, ν4, ...), time (t1, t2, t3, t4, ...) and
   Stokes parameter (I, Q, U, V).
   Cell attributes: ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE,
   flux DOUBLE, ...]

         CREATE ARRAY LOFARsrc (
           zone INT DIMENSION[-90:1:91], mrdn INT DIMENSION[0:1:360],
           ts   TIMESTAMP DIMENSION,     freq INT DIMENSION[30:10:241],
           id   INT DIMENSION[0:1:*],    stks CHAR(1) DIMENSION
                   CHECK(stks='I' OR stks='Q' OR stks='U' OR stks='V'),
           ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE,
           flux DOUBLE, ...);

LOFAR Use Case

             Similarity of the flux of a LOFAR source at frequencies 30
             MHz and 200 MHz

                          cross-correlation of two time series


Cross-Correlation

        idx       0         1        2        3
  F
        val       4         3        6        2

        idx       0         1        2
  G
        val       1         5        7

     Cr.idx      F slice        G slice        Cr.val
       -3        F [3 : 4]      G [0 : 1]         2
       -2        F [2 : 4]      G [0 : 2]        16
       -1        F [1 : 4]      G [0 : 3]        47
        0        F [0 : 3]      G [0 : 3]        61
        1        F [0 : 2]      G [1 : 3]        41
        2        F [0 : 1]      G [2 : 3]        28

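The worked example can be reproduced with a few lines of plain Python (a sketch, not SciQL). For each value of Cr.idx it takes the overlapping slices of F and G, multiplies them element-wise and sums the products. The slice bounds are written as max/min expressions chosen so that they yield exactly the slices listed in the steps; the function name cross_correlate is hypothetical.

```python
def cross_correlate(F, G):
    """Cross-correlation of two sequences for every lag k in [-len(F)+1, len(G)).

    For each lag, the overlapping slices F[a:b] and G[c:d] (end exclusive)
    are multiplied element-wise and summed.
    """
    fcnt, gcnt = len(F), len(G)
    Cr = {}
    for k in range(-fcnt + 1, gcnt):
        f_slice = F[max(0, -k):min(fcnt, gcnt - k)]
        g_slice = G[max(0, k):min(gcnt, fcnt + k)]
        Cr[k] = sum(f * g for f, g in zip(f_slice, g_slice))
    return Cr


F = [4, 3, 6, 2]
G = [1, 5, 7]
print(cross_correlate(F, G))
# {-3: 2, -2: 16, -1: 47, 0: 61, 1: 41, 2: 28}
```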
LOFAR Use Case
         -- retrieve the time series
         DECLARE fcnt INT, gcnt INT;
         SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11]['I'];
         SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11]['I'];

         CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS
            SELECT flux FROM LOFARsrc[*][*][*][30][11]['I'];
         CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], flux DOUBLE DEFAULT 0.0) AS
            SELECT flux FROM LOFARsrc[*][*][*][200][11]['I'];

         -- dynamic grouping for every iteration
         CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
         INSERT INTO CrCorr30_200 SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
           GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];

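The first half of the query, retrieving the two time series, can be pictured with a small in-memory stand-in. The sketch below is Python, not SciQL, and rests on several assumptions: the array is modelled as a dictionary keyed by (zone, mrdn, ts, freq, id, stks), timestamps are plain integers, and all cell values are made-up toy data (the flux values reuse the numbers of the worked cross-correlation example). The function name time_series is hypothetical.

```python
def time_series(cells, freq, src_id, stks):
    """Flux values of one source at one frequency and Stokes parameter, ordered by time.

    cells maps (zone, mrdn, ts, freq, id, stks) -> flux; it is a toy stand-in
    for the slice LOFARsrc[*][*][*][freq][id][stks].
    """
    hits = [(ts, flux)
            for (zone, mrdn, ts, f, i, s), flux in cells.items()
            if f == freq and i == src_id and s == stks]
    return [flux for ts, flux in sorted(hits)]


# Toy data (made up): source 11 observed at 30 MHz and 200 MHz in Stokes I,
# plus two unrelated cells that the selection has to skip.
cells = {}
for ts, flux in enumerate([4.0, 3.0, 6.0, 2.0], start=1):
    cells[(0, 0, ts, 30, 11, "I")] = flux
for ts, flux in enumerate([1.0, 5.0, 7.0], start=1):
    cells[(0, 0, ts, 200, 11, "I")] = flux
cells[(0, 0, 1, 30, 12, "I")] = 9.0   # different source
cells[(0, 0, 1, 30, 11, "Q")] = 8.0   # different Stokes parameter

F = time_series(cells, freq=30, src_id=11, stks="I")
G = time_series(cells, freq=200, src_id=11, stks="I")
fcnt, gcnt = len(F), len(G)

# Index range of the result array: idx INT DIMENSION[-fcnt+1:1:gcnt]
lags = list(range(-fcnt + 1, gcnt))
print(F, G, lags)
# [4.0, 3.0, 6.0, 2.0] [1.0, 5.0, 7.0] [-3, -2, -1, 0, 1, 2]
```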
Conclusion

             SciQL: a novel query language for scientific data

                A symbiosis of the relational and array paradigms

             Simplifies expression of complex scientific algorithms

             Leaves optimisation to the DBMS kernel

             Opens opportunities to enhance scientific data mining



             Under active implementation



                   www.scilens.org          www.monetdb.org
                                                   2.#(4&#$5()*+,#-&$".1(6&$&



                                                   !"#$%&'()&"#*+,-(     ./0/123
                                                   4")*'()5"%%,%*'(*#-(( 6!7(8 9:7;;9




2011-11-09                             ADASS XXI                                                 21

More Related Content

More from PlanetData Network of Excellence

Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchPlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSPlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReducePlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...PlanetData Network of Excellence
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsPlanetData Network of Excellence
 
Exploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataExploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataPlanetData Network of Excellence
 

Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data

  • 1. Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data. Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes, CWI Amsterdam. ADASS XXI, Nov. 06-10, 2011, Paris, France.
  • 2.
  • 3. Why Not RDBMS?
        SQL is difficult
          - No appropriate array denotations
          - No functionally complete operation set
        DBMSs are slow
          - Too much overhead
          - Size limitations (due to BLOB representations)
          - Existing foreign files
          - Scale
          - ...
        2011-11-09 ADASS XXI
  • 4. SciQL
        An array query language based on SQL:2003
        To lower the entrance fee to RDBMSs
        Distinguishing features (Kersten et al. AD ’11; Zhang et al. IDEAS2011):
          - Arrays and tables as first class citizens of DBMSs
          - Seamless integration of relational and array paradigms
          - Named dimensions with constraints
          - Flexible structure-based grouping
        LOFAR Transient Key Project use case
  • 5. Array Definitions
        [Figure: 4 x 4 grid over x = 0..3 and y = 0..3, every cell 0.0; positions outside the grid are null]
        CREATE ARRAY A1 (
          x INT DIMENSION [0:1:4],
          y INT DIMENSION [0:1:4],
          v FLOAT DEFAULT 0.0);
  • 6. Array Definitions (same figure and statement as slide 5). Annotation: dimensions, any scalar data type
  • 7. Array Definitions (same figure and statement as slide 5). Annotation: dimensional range: [(start|∗) : (step|∗) : (stop|∗)]
  • 8. Array Definitions (same figure and statement as slide 5). Annotation: cell values, any column data type
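The array definition above can be mimicked in a few lines of Python. This is only a sketch, not SciQL's implementation: the helper names `dimension_values` and `create_array` are made up for illustration, and the stop bound is assumed to be exclusive because the figure shows indices 0..3 for the range [0:1:4].

```python
from itertools import product


def dimension_values(start, step, stop):
    """Index values of a bounded dimension [start:step:stop].

    Assumption: stop is exclusive, as the 0..3 axes of the figure suggest.
    """
    return list(range(start, stop, step))


def create_array(dims, default):
    """Model an array as a dict from index tuples to cell values.

    dims maps each dimension name to its (start, step, stop) triple;
    every cell inside the bounds starts out holding `default`.
    """
    axes = [dimension_values(*bounds) for bounds in dims.values()]
    return {index: default for index in product(*axes)}


# Counterpart of CREATE ARRAY A1 (x ... [0:1:4], y ... [0:1:4], v FLOAT DEFAULT 0.0)
a1 = create_array({"x": (0, 1, 4), "y": (0, 1, 4)}, default=0.0)
print(len(a1))          # number of cells inside the bounds
print(a1.get((4, 0)))   # outside the bounds: None stands in for null
```

A dict keyed by index tuples keeps the sketch short; it says nothing about how a DBMS would actually store the cells.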
  • 9. Array Tiling
        SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2];
        [Figure: input grid A1 over x = 0..3 and y = 0..3; cells (1,1), (2,1) and (3,1) hold 0.5, all other cells 0.0; positions outside the grid are null]
  • 10. Array Tiling (same query and grid as slide 9). Anchor point: A1[x][y]
  • 15. Array Tiling
        SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2];
        Result (positions outside the grid are null):
               x=0     x=1    x=2    x=3
        y=3    0.0     0.0    0.0    0.0
        y=2    0.0     0.0    0.0    0.0
        y=1    0.125   0.25   0.25   0.25
        y=0    0.125   0.25   0.25   0.25
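The tiling query can be traced with a small Python sketch. It assumes that each tile A1[x:x+2][y:y+2] covers the 2 x 2 block starting at its anchor point and that cells falling outside the array are null and skipped by AVG, which is what the result values on the slide suggest. The function name `tile_average` is hypothetical.

```python
def tile_average(cells, size, tile=2):
    """AVG(v) over A1[x:x+tile][y:y+tile] for every anchor point (x, y)."""
    result = {}
    for x in range(size):
        for y in range(size):
            # Keep only the tile cells that exist; the rest are null and
            # are skipped, the way SQL aggregates skip NULL (assumption).
            values = [cells[(i, j)]
                      for i in range(x, x + tile)
                      for j in range(y, y + tile)
                      if (i, j) in cells]
            result[(x, y)] = sum(values) / len(values)
    return result


# Input grid from the slide: 0.5 at (1,1), (2,1), (3,1), 0.0 elsewhere.
a1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
for x in (1, 2, 3):
    a1[(x, 1)] = 0.5

avg = tile_average(a1, size=4)
for y in (3, 2, 1, 0):
    print(y, [avg[(x, y)] for x in range(4)])
```

The two bottom rows come out as 0.125, 0.25, 0.25, 0.25, matching the result grid; at x = 3 the tile holds only two in-bounds cells, so the average is taken over two values.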
  • 16. LOFAR Catalogue
        [Figure: source catalogue laid out by zone (Gray et al. 2006), -90 to 90, against meridian, 0 to 359, with further axes frequency (ν1 to ν4), time (t1 to t4) and Stokes parameter (I, Q, U, V); each cell holds ra, decl, ra_err, decl_err, flux, ... (all DOUBLE)]
        CREATE ARRAY LOFARsrc (
          zone INT DIMENSION[-90:1:91],
          mrdn INT DIMENSION[0:1:360],
          ts TIMESTAMP DIMENSION,
          freq INT DIMENSION[30:10:241],
          id INT DIMENSION[0:1:*],
          stks CHAR(1) DIMENSION CHECK(stks='I' OR stks='Q' OR stks='U' OR stks='V'),
          ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE, flux DOUBLE, ...);
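As a rough illustration of what the dimension constraints in LOFARsrc admit, the following Python sketch checks an index against the declared ranges and the CHECK on stks. The helpers `in_range` and `valid_index` are hypothetical, stop bounds are assumed exclusive, and the unbounded ts dimension is left out.

```python
STOKES = {"I", "Q", "U", "V"}   # values allowed by the CHECK on stks


def in_range(value, start, step, stop):
    """True if value lies on the grid start, start+step, ... below stop."""
    return start <= value < stop and (value - start) % step == 0


def valid_index(zone, mrdn, freq, src_id, stks):
    """Check one index against the bounded dimensions of LOFARsrc."""
    return (in_range(zone, -90, 1, 91)        # zone INT DIMENSION[-90:1:91]
            and in_range(mrdn, 0, 1, 360)     # mrdn INT DIMENSION[0:1:360]
            and in_range(freq, 30, 10, 241)   # freq INT DIMENSION[30:10:241]
            and src_id >= 0                   # id INT DIMENSION[0:1:*]
            and stks in STOKES)


print(valid_index(0, 0, 30, 11, "I"))   # on the grid in every dimension
print(valid_index(0, 0, 35, 11, "I"))   # 35 is not a multiple-of-10 step from 30
```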
  • 17. LOFAR Use Case (same catalogue figure as slide 16). Similarity of the flux of a LOFAR source at frequencies 30 MHz and 200 MHz: cross-correlation of two time series
  • 18. Cross-Correlation
        F:   idx    0   1   2   3
             val    4   3   6   2
        G:   idx    0   1   2
             val    1   5   7
        Cr:  idx   -3  -2  -1   0   1   2
             val   (empty)
  • 19. Cross-Correlation, Cr.idx = -3: F[3 : 4], G[0 : 1]; Cr val so far: 2
  • 20. Cross-Correlation, Cr.idx = -2: F[2 : 4], G[0 : 2]; Cr val so far: 2 16
  • 21. Cross-Correlation, Cr.idx = -1: F[1 : 4], G[0 : 3]; Cr val so far: 2 16 47
  • 22. Cross-Correlation, Cr.idx = 0: F[0 : 3], G[0 : 3]; Cr val so far: 2 16 47 61
  • 23. Cross-Correlation, Cr.idx = 1: F[0 : 2], G[1 : 3]; Cr val so far: 2 16 47 61 41
  • 24. Cross-Correlation, Cr.idx = 2: F[0 : 1], G[2 : 3]; Cr val so far: 2 16 47 61 41 28
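The six steps can be reproduced in plain Python. The sketch below is an illustration rather than the authors' code: `cross_correlate` and `lag` are made-up names, and the window bounds are taken from the MAX/MIN expressions of the GROUP BY clause in the LOFAR use case slides, with the stop of each window treated as exclusive.

```python
def cross_correlate(f, g):
    """Return {lag: sum of products} for lags -len(f)+1 .. len(g)-1."""
    fcnt, gcnt = len(f), len(g)
    cr = {}
    for lag in range(-fcnt + 1, gcnt):
        # Overlapping windows of F and G for this lag (stop exclusive).
        f_win = f[max(0, -lag):min(fcnt, gcnt - lag)]
        g_win = g[max(0, lag):min(gcnt, fcnt + lag)]
        cr[lag] = sum(a * b for a, b in zip(f_win, g_win))
    return cr


F = [4, 3, 6, 2]
G = [1, 5, 7]
print(cross_correlate(F, G))
# {-3: 2, -2: 16, -1: 47, 0: 61, 1: 41, 2: 28}
```

For lag -2, for example, the windows are F[2:4] = [6, 2] and G[0:2] = [1, 5], giving 6*1 + 2*5 = 16, the value shown for Cr.idx = -2.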
  • 25. LOFAR Use Case (same catalogue figure as slide 16)
        DECLARE fcnt INT, gcnt INT;
        SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11]['I'];
        SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11]['I'];
        CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0)
          AS SELECT flux FROM LOFARsrc[*][*][*][30][11]['I'];
        CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], flux DOUBLE DEFAULT 0.0)
          AS SELECT flux FROM LOFARsrc[*][*][*][200][11]['I'];
        CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
        INSERT INTO CrCorr30_200
          SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
          GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)],
                   G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];
  • 26. LOFAR Use Case (same figure and statements as slide 25). Annotation: retrieve the time series
  • 27. LOFAR Use Case (same figure and statements as slide 25). Annotation: dynamic grouping for every iteration
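The slicing step that retrieves a time series, LOFARsrc[*][*][*][30][11]['I'], can be pictured as a filter over index tuples. The Python sketch below is a toy model only: `slice_cells`, the sample cells and the integer timestamps are invented for illustration, and ordering the series by the ts dimension is an assumption about how the view F is numbered.

```python
WILDCARD = "*"


def slice_cells(cells, pattern):
    """Cells whose index matches pattern; WILDCARD matches any value."""
    return {index: value for index, value in cells.items()
            if all(p == WILDCARD or p == i for p, i in zip(pattern, index))}


# Toy data: index (zone, mrdn, ts, freq, id, stks) -> flux
cells = {
    (10, 20, 1, 30, 11, "I"): 4.0,
    (10, 20, 2, 30, 11, "I"): 3.0,
    (10, 20, 1, 200, 11, "I"): 1.0,
    (10, 20, 1, 30, 12, "I"): 9.0,   # different source id
    (10, 20, 1, 30, 11, "Q"): 8.0,   # different Stokes parameter
}

# Counterpart of LOFARsrc[*][*][*][30][11]['I']
selected = slice_cells(cells, ("*", "*", "*", 30, 11, "I"))

# Order by the ts component (position 2) to obtain the time series.
f_series = [flux for index, flux in
            sorted(selected.items(), key=lambda item: item[0][2])]
print(len(f_series), f_series)
```

The length of `f_series` plays the role of fcnt, and the list itself the role of the view F.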
  • 28. Conclusion
        SciQL: a novel query language for scientific data
          - A symbiosis of the relational and array paradigms
          - Simplifies expression of complex scientific algorithms
          - Leaves optimisation to the DBMS kernel
          - Opens opportunities to enhance scientific data mining
          - Under active implementation
        www.scilens.org
        www.monetdb.org