Cubes
                   light-weight OLAP




Stefan Urbanek ■ @Stiivi ■ stefan.urbanek@gmail.com ■ July 2012
source

  github.com/Stiivi/cubes

         documentation

packages.python.org/cubes/
Overview

■   purpose
■   analytical modelling and OLAP
■   slicing and dicing
■   OLAP server
■   SQL backend
analytical data modelling
        lightweight
http://tendre.sme.sk
aggregation browsing
     slicing and dicing
modelling   reporting
            aggregation browsing
Architecture
✂
 model     browser




             http

backends   server
Logical Model
 multidimensional, analytical
business/analyst’s
  point of view
transactions                 analysis
         OLTP                        OLAP




application (operational) data   analytical data
Model
           {
               "name": "My Model",
               "description": ....,

               "cubes": [...],
               "dimensions": [...]
           }




cubes                         dimensions
measures                        levels, attributes, hierarchy
Facts

                  measurable


      fact

                    fact data cell




most detailed information
location




type




              time



           dimensions
Dimension

■ provide context for facts
■ used to filter queries or reports
■ control scope of aggregation of
  facts
Hierarchy


     2010 May 1st



        levels
Dimension

■   levels and attributes
■   hierarchy*
■   key attributes
■   label attributes

"dimensions": [
  {
    "name": "date",
    "levels": ...,
    "hierarchy": ...
  },
  ...
]




                       *partial support for multiple hierarchies
label attribute   key attribute
                  for links to slices
Cube

■ dimensions
■ measures

"cubes": [
  {
    "name": "contracts",
    "dimensions": [ "date", "category" ],
    "measures": [
      {
        "name": "amount",
        "label": "Contract Amount",
        "aggregations": ["sum"]
      }
    ]
  },
  ...
]
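A minimal sketch of how a model description like this could be sanity-checked once it is loaded as a plain Python dictionary. The `undefined_dimensions` helper and the `category` dimension entry are hypothetical and not part of Cubes; the library has its own validation (see `slicer model validate` later in the deck).

```python
# Hypothetical model dictionary mirroring the structure shown on the slides.
model = {
    "name": "My Model",
    "dimensions": [
        {"name": "date"},
        {"name": "category"},
    ],
    "cubes": [
        {
            "name": "contracts",
            "dimensions": ["date", "category"],
            "measures": [
                {
                    "name": "amount",
                    "label": "Contract Amount",
                    "aggregations": ["sum"],
                }
            ],
        }
    ],
}


def undefined_dimensions(model):
    """Return (cube, dimension) pairs where a cube uses an undeclared dimension."""
    declared = {dim["name"] for dim in model["dimensions"]}
    return [
        (cube["name"], name)
        for cube in model["cubes"]
        for name in cube["dimensions"]
        if name not in declared
    ]


problems = undefined_dimensions(model)
print(problems)
```

An empty list means every dimension a cube refers to is declared in the model.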


                *partial support for multiple hierarchies
localizable model and attributes

"attributes": [
  {
    "name": "group",
    "label": "Group code"
  },
  {
    "name": "group_label",
    "label": "Group",
    "locales": ["en", "sk"]
  }
]
Aggregation
  Browser

    ∑
∑ measures
get more details
Aggregation
                               Browser




SQL Snowflake Browser   SQL Denormalized Browser   MongoDB Browser   Some HTTP Data Service Browser




                                                                     ?




                        “batteries” that are included
Browser Workspace




logical model
                +   data
Cell
context of interest




cell
cell
Path

              [45,2]




[2012, 6]
                       list of level keys
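To make the idea of a path concrete, here is a small sketch that assumes a date hierarchy with the levels year, month and day. The level names and the `describe_path` helper are illustrative assumptions, not Cubes API. A path holds one key per level, starting at the top of the hierarchy, and may stop at any depth.

```python
# Assumed hierarchy levels, ordered from the most general to the most detailed.
DATE_LEVELS = ["year", "month", "day"]


def describe_path(levels, path):
    """Pair each key in the path with the hierarchy level it belongs to."""
    if len(path) > len(levels):
        raise ValueError("path is deeper than the hierarchy")
    return dict(zip(levels, path))


described = describe_path(DATE_LEVELS, [2012, 6])
print(described)
```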
1   load_model("model.json")
2   create_workspace("sql",
                     model,
                     url="sqlite:///data.sqlite")
3   model.cube("sales")
4   workspace.browser(cube)

[Diagram: Application, cubes, Aggregation Browser (∑), backend]
summary




drill-down
browser.aggregate(cell)




                            summary
browser.aggregate(cell,
                  drilldown=["sector"])




                         drill-down
for row in result.drilldown:




              row["amount_sum"]
row[label_attribute]            row[key]
received_amount_sum


measure      aggregation




           record_count
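The key naming above (measure name, underscore, aggregation name, plus a `record_count`) can be illustrated in plain Python. This is only a sketch of the convention, not how the Cubes backends compute aggregates; the `summarize` function and the sample facts are made up for illustration.

```python
def summarize(facts, measures):
    """Aggregate a list of fact dictionaries.

    `measures` maps a measure name to a list of aggregation names;
    only "sum", "min" and "max" are handled in this sketch.
    """
    functions = {"sum": sum, "min": min, "max": max}
    summary = {"record_count": len(facts)}
    for measure, aggregations in measures.items():
        values = [fact[measure] for fact in facts]
        for aggregation in aggregations:
            # Result key follows the <measure>_<aggregation> pattern.
            key = f"{measure}_{aggregation}"
            summary[key] = functions[aggregation](values)
    return summary


# Made-up sample facts.
facts = [
    {"sector": "construction", "received_amount": 100},
    {"sector": "construction", "received_amount": 250},
    {"sector": "health", "received_amount": 50},
]
summary = summarize(facts, {"received_amount": ["sum"]})
print(summary)
```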
browser.facts(cell)


browser.values(cell, dimension)


browser.cell_details(cell)
✂
    Slicing and Dicing
✂
✂
✂
               April 2012
construction work              construction work in
                                    April 2012
             type




    supplier



                            date
cut types
✂

point         set           range
           [[2010,10],   from=[2010,10]
[2010]
            [2010,12]]   to=[2010,12]
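The three cut types can be sketched as predicates over a fact's path in one dimension. This is an illustration only: the function names are hypothetical, and treating both ends of a range as inclusive is an assumption of this sketch, not something the slides state.

```python
def point_cut(path):
    """Match every fact path that starts with the given path."""
    def matches(fact_path):
        return list(fact_path[:len(path)]) == list(path)
    return matches


def set_cut(paths):
    """Match fact paths that start with any of the given paths."""
    cuts = [point_cut(path) for path in paths]

    def matches(fact_path):
        return any(cut(fact_path) for cut in cuts)
    return matches


def range_cut(from_path, to_path):
    """Match fact paths between two paths (both ends inclusive here)."""
    def matches(fact_path):
        lower = tuple(fact_path[:len(from_path)]) >= tuple(from_path)
        upper = tuple(fact_path[:len(to_path)]) <= tuple(to_path)
        return lower and upper
    return matches


# Made-up [year, month, day] paths of individual facts.
fact_paths = [[2009, 12, 1], [2010, 10, 5], [2010, 11, 1], [2010, 12, 31], [2011, 1, 1]]

in_2010 = [p for p in fact_paths if point_cut([2010])(p)]
oct_or_dec = [p for p in fact_paths if set_cut([[2010, 10], [2010, 12]])(p)]
oct_to_dec = [p for p in fact_paths if range_cut([2010, 10], [2010, 12])(p)]
```

A cell would then keep only the facts for which every one of its cuts matches.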
Implicit Hierarchy
       drilldown
whole cube


                                          cell = Cell(cube)
                                          browser.aggregate(cell)
                Total




                                          browser.aggregate(cell,
                                                       drilldown=["date"])


2006 2007 2008 2009 2010


                                          cut = PointCut("date", [2010])
                                          cell = cell.slice(cut)

                                          browser.aggregate(cell,
                                                       drilldown=["date"])
Jan   Feb Mar Apr March April May   ...
Drill-down Level
drilldown = ["date"]


                implicit: next from cell




drilldown = {"date": "month"}


                              explicit
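A sketch of the rule behind implicit and explicit drill-down, assuming a date hierarchy of year, month and day. The `drilldown_level` function is hypothetical; it only mirrors the behaviour described on the slides (the next level below the cell's current depth unless a level is named).

```python
# Assumed hierarchy levels, most general first.
DATE_LEVELS = ["year", "month", "day"]


def drilldown_level(levels, cell_path, explicit_level=None):
    """Pick the level to drill down to for one dimension.

    With no explicit level, use the level right below the depth of the
    cell's path in that dimension.
    """
    if explicit_level is not None:
        if explicit_level not in levels:
            raise ValueError(f"unknown level: {explicit_level}")
        return explicit_level
    depth = len(cell_path)
    if depth >= len(levels):
        raise ValueError("cell is already at the most detailed level")
    return levels[depth]


whole_cube = drilldown_level(DATE_LEVELS, [])          # no cut on date
within_2010 = drilldown_level(DATE_LEVELS, [2010])     # cell cut to year 2010
by_day = drilldown_level(DATE_LEVELS, [2010], "day")   # explicit level
```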
Cross Table
 experimental interface
                                         2009     2010

     Assets           Due from Banks     3044     1803
     Assets              Investments    41012    36012
     Assets        Loans Outstanding   103657   118104
     Assets            Nonnegotiable     1202     1123
     Assets             Other Assets     2247     3071
     Assets        Other Receivables      984      811
     Assets              Receivables      176      171
     Assets               Securities       33      289
     Equity            Capital Stock    11491    11492
     Equity         Deferred Amounts      359      313
     Equity                    Other    -1683    -3043
     Equity        Retained Earnings    29870    28793
Liabilities               Borrowings   110040   128577
Liabilities   Derivative Liabilities   115642   110418
Liabilities                    Other       57        8
Liabilities        Other Liabilities     7321     5454
Liabilities             Sold or Lent     2323      998
rows = ["item.category",
        "item.subcategory"]

columns = ["year"]

measures = ["amount_sum"]

table = result.cross_table(
              rows,
              columns,
              measures
        )
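What such a cross table does can be sketched in plain Python: flat drill-down rows are pivoted so that row labels come from some keys and column headers from another. The `pivot` function is a hypothetical stand-in, not the experimental `cross_table` interface itself; the sample rows reuse figures from the table above.

```python
def pivot(rows, row_keys, column_key, measure):
    """Pivot flat rows into {row_label_tuple: {column_value: measure_value}}."""
    table = {}
    for row in rows:
        label = tuple(row[key] for key in row_keys)
        table.setdefault(label, {})[row[column_key]] = row[measure]
    return table


drilldown_rows = [
    {"item.category": "Assets", "item.subcategory": "Due from Banks",
     "year": 2009, "amount_sum": 3044},
    {"item.category": "Assets", "item.subcategory": "Due from Banks",
     "year": 2010, "amount_sum": 1803},
    {"item.category": "Equity", "item.subcategory": "Capital Stock",
     "year": 2009, "amount_sum": 11491},
    {"item.category": "Equity", "item.subcategory": "Capital Stock",
     "year": 2010, "amount_sum": 11492},
]

table = pivot(
    drilldown_rows,
    row_keys=["item.category", "item.subcategory"],
    column_key="year",
    measure="amount_sum",
)
```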
Slicer
The HTTP OLAP Server



      ✂
Application




HTTP                         JSON
             Slicer



                   ∑




       Aggregation Browser
GET /model

GET /aggregate

GET /values

GET /report
 logical model       configuration   data




$ slicer serve slicer.ini
[server]
backend: sql
log_level: info

[model]
path: model.json
locales: en,sk

[workspace]
url: postgres://localhost/database
schema: datamart
fact_prefix: ft_
dimension_prefix: dm_
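A configuration in this format can be read with Python's standard `configparser`, which accepts `key: value` pairs. The sketch below only shows how the sections and options come out; deriving the fact table name as prefix plus cube name is an assumption made here for illustration.

```python
from configparser import ConfigParser

# Same content as the slicer.ini shown above.
SLICER_INI = """
[server]
backend: sql
log_level: info

[model]
path: model.json
locales: en,sk

[workspace]
url: postgres://localhost/database
schema: datamart
fact_prefix: ft_
dimension_prefix: dm_
"""

config = ConfigParser()
config.read_string(SLICER_INI)

locales = config["model"]["locales"].split(",")
workspace = dict(config["workspace"])

# Assumption for illustration: fact table name = prefix + cube name.
fact_table = workspace["fact_prefix"] + "contracts"
```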



∑      amount




GET /aggregate
GET /aggregate




{
    "cell": [],
    "drilldown": [],
    "summary": {
        "record_count": 62,
        "amount_sum": 1116860
    }
}
∑         amount
✂




GET /aggregate?cut=date:2010
GET /aggregate?cut=year:2010




{
    "cell": [
        {
            "path": ["2010"],
            "type": "point",
            "dimension": "year",
            "level_depth": 1
        }
    ],
    "drilldown": [],
    "summary": {
        "record_count": 31,
        "amount_sum": 566020
    }
}
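A client could assemble such request URLs with the standard library. This is a sketch: the base URL (host and port) and the `aggregate_url` helper are assumptions, and only the `cut` and `drilldown` parameters shown on the slides are handled.

```python
from urllib.parse import urlencode

# Assumed address of a running Slicer server.
BASE_URL = "http://localhost:5000"


def aggregate_url(base_url, cut=None, drilldown=None):
    """Build an /aggregate URL with optional cut and drilldown parameters."""
    params = {}
    if cut is not None:
        params["cut"] = cut
    if drilldown is not None:
        params["drilldown"] = drilldown
    # Keep ":" readable, as in cut=year:2010.
    query = urlencode(params, safe=":")
    url = base_url.rstrip("/") + "/aggregate"
    return url + "?" + query if query else url


url_total = aggregate_url(BASE_URL)
url_cut = aggregate_url(BASE_URL, cut="year:2010")
url_both = aggregate_url(BASE_URL, cut="year:2010", drilldown="year")
```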
GET /aggregate?drilldown=year



{
     "cell": [],
     "total_cell_count": 2,
     "drilldown": [
         {
             "record_count": 31,
             "amount_sum": 550840,
             "year": 2009
         },
         {
             "record_count": 31,
             "amount_sum": 566020,
             "year": 2010
         }
     ],
     "summary": {
         "record_count": 62,
         "amount_sum": 1116860
     }
}
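On the client side, a response like the one above is ordinary JSON. The sketch below parses the response text from the slide and checks that the drill-down rows add up to the summary; variable names are illustrative.

```python
import json

# Response body as shown on the slide.
RESPONSE = """
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        {"record_count": 31, "amount_sum": 550840, "year": 2009},
        {"record_count": 31, "amount_sum": 566020, "year": 2010}
    ],
    "summary": {"record_count": 62, "amount_sum": 1116860}
}
"""

result = json.loads(RESPONSE)

# Index the drill-down rows by year for easy lookup.
amount_by_year = {row["year"]: row["amount_sum"] for row in result["drilldown"]}

# The per-year rows should add up to the overall summary.
total_amount = sum(row["amount_sum"] for row in result["drilldown"])
total_records = sum(row["record_count"] for row in result["drilldown"])
```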
GET /report

Content-Type: application/json

"cell": list of cuts
"queries": list of named queries

{
    "cell": [
        {
            "dimension": "date",
            "type": "range",
            "from": [2009],
            "to": [2011,6]
        }
    ],
    "queries": {
        "by_segment": {
            "query": "aggregate",
            "drilldown": ["segment"]
        },
        "by_year": {
            "query": "aggregate",
            "drilldown": {"date":"year"}
        }
    }
}
SQL Backend
 What data does it work with?
★   or
         ❄
★

dimensions   fact table
❄


             fact table
dimensions
Aggregation Browser


                     Browsing Context


               Snowflake            Denormalized
                             or
                Mapper               Mapper



denormalized view




snowflake
           ❄
logical




              physical
          ❄
SQL Features
■ does not require DB write access
■ denormalisation
 ■   denormalised browsing, indexing


■ simple date datatype dimension
 ■   extraction of date parts during mapping


■ multiple schema support
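The idea of denormalised browsing can be sketched with SQLite from the Python standard library. Everything here is illustrative: the table and column names (using the `ft_` and `dm_` prefixes from the configuration example), the view name and the sample data are assumptions, and the SQL is not what Cubes generates.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    -- Star schema: one fact table and two dimension tables.
    CREATE TABLE dm_date (id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dm_category (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE ft_contracts (
        id INTEGER PRIMARY KEY,
        date_id INTEGER REFERENCES dm_date (id),
        category_id INTEGER REFERENCES dm_category (id),
        amount INTEGER
    );

    INSERT INTO dm_date VALUES (1, 2009, 12), (2, 2010, 4);
    INSERT INTO dm_category VALUES (1, 'construction work'), (2, 'services');
    INSERT INTO ft_contracts VALUES (1, 1, 1, 100), (2, 2, 1, 250), (3, 2, 2, 50);

    -- Denormalized view: one wide row per fact with dimension attributes inlined.
    CREATE VIEW contracts_denormalized AS
    SELECT f.id, f.amount,
           d.year AS date_year, d.month AS date_month,
           c.label AS category_label
    FROM ft_contracts AS f
    JOIN dm_date AS d ON d.id = f.date_id
    JOIN dm_category AS c ON c.id = f.category_id;
""")

# Aggregate with a drill-down by year, reading only from the view.
rows = connection.execute("""
    SELECT date_year, COUNT(*) AS record_count, SUM(amount) AS amount_sum
    FROM contracts_denormalized
    GROUP BY date_year
    ORDER BY date_year
""").fetchall()
print(rows)
```

Only SELECT statements are needed against the view, which fits the point that browsing does not require write access. Materialising and indexing the view would be a separate step.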
Slicer
command-line tool
■ model validation
  slicer model validate model.json



■ model translation
  slicer model translate model.json translation.json



■ workspace testing
  slicer test config.ini



■ denormalization
  slicer denormalize --materialize --index config.ini
Future
■ formatters for visualisation libraries
■ JavaScript library*             help needed

■ backends
■ derived measures


                        *http://github.com/Stiivi/cubes-js
Open Data

■ shared repository of models
■ shared repository of dimensions
■ public cubes
   open Slicer HTTP APIs




                           http://github.com/Stiivi/cubes/wiki
stay light
 Nutrition Facts
 Serving Size 1 cube

 Amount Per Serving
                       % Daily Value
 Total Fat 0g                    0%

   Saturated Fat 0g
   Trans Fat 0g
Thank You
              source:
    github.com/Stiivi/cubes
           documentation:
  packages.python.org/cubes/
             examples:
github.com/Stiivi/cubes-examples
Backup
Transactions                 Reporting
                              multidimensional
object–relational modelling
                                 modelling

      ORM mapping              logical model
                                 (and mapping)


   database connection            browser

     database engine             workspace
Limitations

■ one cut per dimension in a cell
 ■   logical conjunction of cuts (cut1 AND cut2 AND cut3 ...)


■ dimension-only selection
■ only one (default) hierarchy
 ■   some internals are ready for multiple

2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Kürzlich hochgeladen (20)

2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Cubes light-weight OLAP analytical modelling

  • 1. Cubes light-weight OLAP Stefan Urbanek ■ @Stiivi ■ stefan.urbanek@gmail.com ■ July 2012
  • 2. source github.com/Stiivi/cubes documentation packages.python.org/cubes/
  • 3. Overview ■ purpose ■ analytical modelling and OLAP ■ slicing and dicing ■ OLAP server ■ SQL backend
  • 6. aggregation browsing slicing and dicing
  • 7. modelling reporting aggregation browsing
  • 9. ✂ model browser http backends server
  • 12. transactions analysis OLTP OLAP application (operational) data analytical data
  • 13. Model { “name” = “My Model” “description” = .... “cubes” = [...] “dimensions” = [...] } cubes dimensions measures levels, attributes, hierarchy
  • 14. Facts measurable fact fact data cell most detailed information
  • 15. location type time dimensions
  • 16. Dimension ■ provide context for facts ■ used to filter queries or reports ■ control scope of aggregation of facts
  • 17. Hierarchy 2010 May 1st levels
  • 18. Dimension ■ levels and attributes ■ hierarchy* ■ key attributes ■ label attributes "dimensions" = [ { "name": "date", "levels": ..., "hierarchy": ... }, ... ] *partial support for multiple hierarchies
  • 19. label attribute key attribute for links to slices
  • 20. Cube ■ dimensions ■ measures "cubes" = [ { "name": "contracts", "dimensions": [ "date", "category" ], "measures": [ { "name": "amount", "label": "Contract Amount", "aggregations": ["sum"] } ] }, ... ] *partial support for multiple hierarchies
  • 21. localizable model and attributes "attributes": [ { "name": "group", "label": "Group code" }, { "name": "group_label", "label": "Group", "locales": ["en", "sk"] } ]
  • 25. Aggregation Browser ■ SQL Snowflake Browser ■ SQL Denormalized Browser ■ MongoDB Browser ■ Some HTTP Data Service Browser ■ ? ("batteries" that are included)
  • 27. Cell
  • 29. cell
  • 30. Path [45,2] [2012, 6] list of level keys
  • 31. Application, cubes ∑, Aggregation Browser, backend ■ 1 load_model("model.json") ■ 2 create_workspace("sql", model, url="sqlite:///data.sqlite") ■ 3 model.cube("sales") ■ 4 workspace.browser(cube)
  • 34. drill-down: browser.aggregate(cell, drilldown=["sector"])
  • 35. for row in result.drilldown: row["amount_sum"] row[label_attribute] row[key]
  • 36. received_amount_sum (measure + aggregation), record_count
  • 37. browser.facts(cell) browser.values(cell, dimension) browser.cell_details(cell)
  • 38. Slicing and Dicing ✂
  • 39. ✂ ✂ April 2012, construction work: construction work in April 2012 (dimensions: type, supplier, date)
  • 40. cut types ✂ ■ point: [2010] ■ set: [[2010,10], [2010,12]] ■ range: from=[2010,10] to=[2010,12]
  • 41. Implicit Hierarchy drilldown
  • 42. whole cube: cell = Cell(cube) ■ browser.aggregate(cell) → Total ■ browser.aggregate(cell, drilldown=["date"]) → 2006 2007 2008 2009 2010 ■ ✂ cut = PointCut("date", [2010]); cell = cell.slice(cut) ■ browser.aggregate(cell, drilldown=["date"]) → Jan Feb Mar Apr May ...
  • 43. Drill-down Level ■ drilldown = ["date"] (implicit: next from cell) ■ drilldown = {"date": "month"} (explicit)
  • 45.
      Category     Subcategory               2009     2010
      Assets       Due from Banks            3044     1803
      Assets       Investments              41012    36012
      Assets       Loans Outstanding       103657   118104
      Assets       Nonnegotiable             1202     1123
      Assets       Other Assets              2247     3071
      Assets       Other Receivables          984      811
      Assets       Receivables                176      171
      Assets       Securities                  33      289
      Equity       Capital Stock            11491    11492
      Equity       Deferred Amounts           359      313
      Equity       Other                    -1683    -3043
      Equity       Retained Earnings        29870    28793
      Liabilities  Borrowings              110040   128577
      Liabilities  Derivative Liabilities  115642   110418
      Liabilities  Other                       57        8
      Liabilities  Other Liabilities         7321     5454
      Liabilities  Sold or Lent              2323      998
  • 46. rows = ["item.category", "item.subcategory"] columns = ["year"] measures = ["amount_sum"] table = result.cross_table( rows, columns, measures )
  • 47. Slicer The HTTP OLAP Server ✂
  • 48. Application HTTP JSON Slicer ∑ Aggregation Browser
  • 49. GET /model GET /aggregate GET /values GET /report
  • 50. logical model, configuration, data $ slicer serve slicer.ini
  • 51.
      [server]
      backend: sql
      log_level: info
      [model]
      path: model.json
      locales: en,sk
      [workspace]
      url: postgres://localhost/database
      schema: datamart
      fact_prefix: ft_
      dimension_prefix: dm_
  • 52. amount GET /aggregate
  • 53. GET aggregate { "cell": [], "drilldown": [], "summary": { "record_count": 62, "amount_sum": 1116860 } }
  • 54. amount ✂ GET /aggregate?cut=date:2010
  • 55. GET aggregate?cut=year:2010 { "cell": [ { "path": ["2010"], "type": "point", "dimension": "year", "level_depth": 1 } ], "drilldown": [], "summary": { "record_count": 31, "amount_sum": 566020 } }
  • 56. GET aggregate?drilldown=year { "cell": [], "total_cell_count": 2, "drilldown": [ { "record_count": 31, "amount_sum": 550840, "year": 2009 }, { "record_count": 31, "amount_sum": 566020, "year": 2010 } ], "summary": { "record_count": 62, "amount_sum": 1116860 } }
  • 57. GET report Content-Type: application/json { "cell": [ { "dimension": "date", "type": "range", "from": [2009], "to": [2011,6] } ], "queries": { "by_segment": { "query": "aggregate", "drilldown": ["segment"] }, "by_year": { "query": "aggregate", "drilldown": {"date":"year"} } } } ("cell": list of cuts; "queries": list of named queries)
  • 58. SQL Backend: What data does it work with?
  • 59. or ❄
  • 60. ★ dimensions fact table
  • 61. fact table dimensions
  • 62.
  • 63. Aggregation Browser, Browsing Context, Snowflake Mapper or Denormalized Mapper, snowflake ❄ or denormalized view
  • 64. logical physical ❄
  • 65. SQL Features ■ does not require DB write access ■ denormalisation ■ denormalised browsing, indexing ■ simple date datatype dimension ■ extraction of date parts during mapping ■ multiple schema support
  • 67. ■ model validation slicer model validate model.json ■ model translation slicer model translate model.json translation.json ■ workspace testing slicer test config.ini ■ denormalization slicer denormalize --materialize --index config.ini
  • 69. ■ formatters for visualisation libraries ■ JavaScript library* help needed ■ backends ■ derived measures *http://github.com/Stiivi/cubes-js
  • 70. Open Data ■ shared repository of models ■ shared repository of dimensions ■ public cubes open Slicer HTTP APIs http://github.com/Stiivi/cubes/wiki
  • 71. stay light Nutrition Facts Serving Size 1 cube Amount Per Serving % Daily Value Total Fat 0g 0% Saturated Fat 0g Trans Fat 0g
  • 72. Thank You source: github.com/Stiivi/cubes documentation: packages.python.org/cubes/ examples: github.com/Stiivi/cubes-examples
  • 74. Transactions vs. Reporting ■ object–relational modelling vs. multidimensional modelling ■ ORM mapping vs. logical model (and mapping) ■ database connection vs. browser ■ database engine vs. workspace
  • 75. Limitations ■ one cut per dimension in a cell ■ logical conjunction of cuts (cut1 AND cut2 AND cut3 ...) ■ dimension-only selection ■ one (default) hierarchy; some internals are ready for multiple hierarchies
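
The cut types from slide 40 (point, set, range) and the "logical conjunction of cuts" listed under Limitations can be illustrated with a small plain-Python sketch. This is not how Cubes implements them: the in-memory `FACTS` list and the helpers `point_cut`, `set_cut`, `range_cut` and `aggregate` are hypothetical names made up for this illustration, and the real SQL backend builds SQL statements instead of filtering Python lists. Only the result keys `record_count` and `amount_sum` follow the slides.

```python
from collections import defaultdict

# Hypothetical in-memory facts: one path (list of level keys) per dimension
# plus a single measure. Real Cubes reads facts from a backend instead.
FACTS = [
    {"date": [2009, 12], "type": ["construction"], "amount": 100},
    {"date": [2010, 10], "type": ["construction"], "amount": 250},
    {"date": [2010, 11], "type": ["services"], "amount": 40},
    {"date": [2010, 12], "type": ["construction"], "amount": 60},
    {"date": [2011, 1], "type": ["services"], "amount": 75},
]


def point_cut(dimension, path):
    """Match facts whose path starts with the given path prefix."""
    def matches(fact):
        return fact[dimension][:len(path)] == list(path)
    return matches


def set_cut(dimension, paths):
    """Match facts that fall under any of the listed paths."""
    cuts = [point_cut(dimension, p) for p in paths]
    return lambda fact: any(cut(fact) for cut in cuts)


def range_cut(dimension, from_path, to_path):
    """Match facts between two paths (inclusive), compared level by level."""
    def matches(fact):
        path = fact[dimension]
        return (path[:len(from_path)] >= list(from_path)
                and path[:len(to_path)] <= list(to_path))
    return matches


def aggregate(facts, cuts=(), drilldown=None, depth=1):
    """Aggregate `amount` over the cell given by the conjunction of cuts.

    Returns (summary, rows); rows are keyed by the first `depth` level keys
    of the `drilldown` dimension and stay empty without a drilldown.
    """
    cell = [f for f in facts if all(cut(f) for cut in cuts)]
    summary = {"record_count": len(cell),
               "amount_sum": sum(f["amount"] for f in cell)}
    rows = defaultdict(lambda: {"record_count": 0, "amount_sum": 0})
    if drilldown:
        for f in cell:
            key = tuple(f[drilldown][:depth])
            rows[key]["record_count"] += 1
            rows[key]["amount_sum"] += f["amount"]
    return summary, dict(rows)


# Cell "construction work in 2010", drilled down to the month level.
cell = [point_cut("date", [2010]), point_cut("type", ["construction"])]
summary, rows = aggregate(FACTS, cell, drilldown="date", depth=2)
print(summary)  # {'record_count': 2, 'amount_sum': 310}
```

Each cut is a predicate over one dimension, and a cell is simply the list of cuts that must all hold, which mirrors the "cut1 AND cut2 AND cut3" limitation above.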

Editor's Notes

  1. OLAP and Logical Model, Architecture, Slicing and Dicing, HTTP Server, SQL Backend
  4. Q: Who is familiar with OLAP?
  5. quick setup and reporting; does not cover everything (intentionally)
  6. example application: public procurements of Slovakia
  7. quick setup and reporting; does not cover everything (intentionally)
  8. will talk about modelling first, then reporting, then a mix of both
  9. what does it look like and what does it do?
  10. FIXME: add slicer tool here
  11. not going into details, just aligning terminology and defining context
  12. it is not so rare to see reports created directly from what is available, instead of starting with business needs and trying to find a way to derive them from what is available
  13. different approach to data use, different needs; in apps you focus on transactions (transactional data, OLTP), in reporting you focus on analysis (analytical data); logically separate (does not have to be physically separate)
  17. CONTEXT: where did the sale happen? who signed the contract? FILTER: how much was spent on construction work? AGGREGATION SCOPE: what was the revenue by country? Also used for ordering or sorting; define master-detail relationships
  20. provides metadata to easily create apps
  27. what does the browser do?
  28. aggregating measures
  30. aggregation browser has to have a concrete backend implementation
  31. + bunch of other stuff
  32. context
  33. before I talk about the aggregation browser, I have to introduce a cell
  36. our filter/selection defines the cell; this is a kind of multidimensional “breadcrumbs”
  37. path: taken from file system terminology for easier understanding; those are keys; note that what is displayed is the level label, not a key
  38. ... let’s put it into a picture
  40. “aggregation result” was designed to match the usual look of a report
  41. FIXME: add picture
  42. you can specify multiple dimensions and an explicit level to be drilled down to (for example the “month” level of a date dimension)
  43. it provides a list of records, which are represented as dictionaries; you have to find out which one is the level attribute or the key
  44. no need to find the context of the dimension of interest; if not sufficient, one can still fall back to the manual method
  46. facts: get details; values: can be used to create selection boxes, a level can also be specified; cell_details: used for creating the multidimensional breadcrumbs mentioned before, it contains data to describe the current context of interest in human-readable form; ordering and pagination are supported
  47. what was that “cell” thing?
  49. also show hierarchy
  52. same drilldown, different cell
  53. implicit: raises error if current level is the last one; example: you are exploring year 2010 (cell) and would like to see split by year (higher level)
  62. just to name a few...
  72. powered by SQLAlchemy
  73. powered by a great abstraction framework; construction of SQL statements
  77. denormalized
  78. thanks to the new browser and browsing context, it is possible to transparently switch between the original snowflake and a generated denormalized view (which can be materialized and indexed based on dimension level keys)
  79. in which table and which column is the attribute?
  84. if someone would like to contribute their skills, they are more than welcome and I will help
  85. so if you have an open-source app, for example one built on Django, that many users use, you can publish a reporting model for others; put your cube in the Wiki
  87. MIT license
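
Notes 42 and 53 describe two ways of choosing a drill-down level: implicit (the next level below the current cell, with an error when the cell is already at the last level) and explicit (a named level such as “month”). A minimal sketch of that rule is shown below. The function `drilldown_level` and the `DATE_LEVELS` list are hypothetical and exist only for this illustration; they are not part of the Cubes API.

```python
# Hypothetical hierarchy of a "date" dimension, ordered coarsest to finest.
DATE_LEVELS = ["year", "month", "day"]


def drilldown_level(levels, cell_path, explicit=None):
    """Return the name of the level to drill down to.

    levels    -- level names of the hierarchy, coarsest first
    cell_path -- level keys of the current cut, e.g. [] or [2010] or [2010, 5]
    explicit  -- requested level name, or None for the implicit next level
    """
    if explicit is not None:
        if explicit not in levels:
            raise ValueError(f"unknown level: {explicit!r}")
        return explicit
    depth = len(cell_path)
    if depth >= len(levels):
        # The cell already sits on the last level; nothing left to drill into.
        raise ValueError("cannot drill down past the last level")
    return levels[depth]


print(drilldown_level(DATE_LEVELS, []))                        # year
print(drilldown_level(DATE_LEVELS, [2010]))                    # month
print(drilldown_level(DATE_LEVELS, [2010], explicit="year"))   # year
```

The last call corresponds to the example in note 53: the cell is cut at year 2010, yet an explicit level lets the caller still ask for a split by year.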