Doctoral Symposium on Software Language Engineering 2010.
This presentation accompanies a paper on applying software language engineering (SLE)
to the creation of data warehouses. Creating a data warehouse is a complex and
therefore costly process. The introduced approach decomposes the creation process
into different aspects. Each aspect is described with its own language, and the
languages are integrated by a common metamodel. Based on this metamodel, large parts
of the data warehouse creation process can be generated. With this approach, data
warehouses can be created more conveniently and in less time.
1. Using SLE for creation of Data Warehouses
22.11.2015
Yvette Teiken
OFFIS Institute for Information Technology,
Escherweg 2, 26121 Oldenburg, Germany
yvette.teiken@offis.de
2. Problem Description and Motivation I
► Goals of a data warehouse (DWH):
  ► Perform complex analyses of all organizational data
  ► Used for decision support
  ► Time-variant
  ► Non-volatile
  ► Integrates data from different sources and formats into one dataset
  ► Uses the OLAP paradigm to allow easy analysis and accessibility
► Problem addressed in my thesis:
  ► Efficient creation of domain-specific DWHs
► Example of use:
  ► Health reporting: preparation and presentation of health-relevant issues relating to the population
3. Problem Description and Motivation II
► Problems during DWH creation:
  ► No standardized process exists
  ► Documentation is spread over many large documents
  ► Information is missing, distributed, or inconsistent
  ► A lot of schematic work is performed during realization
  ► Many different user roles are involved
  ► Initial build-up is a complex task
► Expected benefits:
  ► Faster realization of a DWH
  ► Better documentation of the whole creation process
  ► Less experienced staff can realize a DWH
[Figure labels: Analysis; organizational data; Define information demand; Data source transformation; Multidimensional model; Data quality]
4. Related Work
► Languages covering aspects of DWH creation:
  ► Application Design for Analytical Processing Technologies (ADAPT)
  ► R2O mapping for relational databases
  ► InDaQu for data quality
  ► MDA and DWA
  ► Rizzi et al.: modelling different aspects of DWHs
  ► Each deals with only a certain aspect, not the whole process
► My approach:
  ► Use languages that cover the whole process of DWH creation
  ► Integrated through a common metamodel
  ► Deal with multidimensional structures
  ► Transformations generate large parts of the DWH
  ► Process model that orders, connects, and refines the different aspects
5. Proposed Solution I
► Idea: describe the DWH with SLE techniques and generate it semi-automatically
► Decompose the DWH into different aspects and describe each aspect with a language
► Aspects:
  ► Data Source Schemas: subject, representation, and technical accessibility of the sources
  ► Data Source Transformation: use existing languages such as R2O
  ► Analysis Schema: multidimensional data models, based on ADAPT
  ► Measures: mathematical functions on multidimensional data
  ► Hierarchy: central aspect, complex tree structures
  ► Data Quality: integrate consistency constraints (InDaQu)
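The decomposition into aspects can be pictured as a small object model in which one root element links the aspect descriptions. The following is only a sketch under assumptions: all class and attribute names (`WarehouseModel`, `HierarchyLevel`, `columns`, and so on) are hypothetical and are not taken from the actual metamodel of the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class HierarchyLevel:
    """One node of a hierarchy tree, e.g. Year -> Month -> Day."""
    name: str
    children: list["HierarchyLevel"] = field(default_factory=list)

    def depth(self) -> int:
        # A leaf has depth 1; otherwise one more than the deepest child.
        return 1 + max((c.depth() for c in self.children), default=0)


@dataclass
class DataSource:
    name: str
    columns: dict[str, str]  # column name -> type (hypothetical format)


@dataclass
class Dimension:
    name: str
    hierarchy: HierarchyLevel


@dataclass
class Measure:
    name: str
    expression: str  # textual formula, format assumed for illustration


@dataclass
class WarehouseModel:
    """Hypothetical root element that integrates all aspects."""
    sources: list[DataSource] = field(default_factory=list)
    dimensions: list[Dimension] = field(default_factory=list)
    measures: list[Measure] = field(default_factory=list)
    quality_rules: list[str] = field(default_factory=list)

    def aspect_summary(self) -> dict[str, int]:
        # Number of model elements per aspect.
        return {
            "sources": len(self.sources),
            "dimensions": len(self.dimensions),
            "measures": len(self.measures),
            "quality_rules": len(self.quality_rules),
        }


# Illustrative instance; the concrete values are made up.
time_hierarchy = HierarchyLevel(
    "Year", [HierarchyLevel("Month", [HierarchyLevel("Day")])]
)
model = WarehouseModel(
    sources=[DataSource("OwnCases", {"Gender": "String", "DRG": "String"})],
    dimensions=[Dimension("Time", time_hierarchy)],
    measures=[Measure("BirthMarketShare", "OwnCases / AllCases")],
    quality_rules=["DRG=O01F & G=M invalid"],
)
```

Keeping every aspect reachable from one root is what would let a generator traverse the whole description in a single pass.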
6. Example
► Hospital market analysis:
  ► Determine the percentage of births (market share)
► Measure:
  ► [Formula partially lost in extraction: "…eOfBirthMarketShar…", defined over OwnCases and AllCases]
► Data Source Schema:
  ► Own Cases: hospital information system, „§21 Data“
  ► All Cases: bought from an external source
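The measure in this example relates the hospital's own cases to all cases. A minimal sketch, assuming the measure is the percentage of births that took place in the own hospital; the function name and the sample numbers are made up for illustration and do not come from the slides.

```python
def birth_market_share(own_births: int, all_births: int) -> float:
    """Percentage of all births that are own cases (assumed definition)."""
    if all_births <= 0:
        raise ValueError("all_births must be positive")
    if not 0 <= own_births <= all_births:
        raise ValueError("own_births must be between 0 and all_births")
    return 100.0 * own_births / all_births


# Made-up example: 120 own births out of 480 births in the region.
share = birth_market_share(120, 480)
```

In a DWH, a measure like this would be evaluated per cell of the multidimensional model, for example per year and region, rather than once on global totals.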
Name                 Type     Arity
Id of Insurance      Numeric  10
Year of Birth        Numeric  4
Month of Birth       Numeric  2
Gender               String   1
PLZ                  Numeric  5
Start date           Numeric  12
Reason of admission  String   1
End date             String   12
Age in years         String   3
DRG                  String   4
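A data source schema like the one above can be written down as data and used to check incoming records. The sketch below assumes that "Arity" means the maximum field length in characters, which the slides do not state; `OWN_CASES_SCHEMA` and `validate_record` are hypothetical names.

```python
# Field name -> (type, arity). Arity is ASSUMED to be the maximum length.
OWN_CASES_SCHEMA = {
    "Id of Insurance": ("Numeric", 10),
    "Year of Birth": ("Numeric", 4),
    "Month of Birth": ("Numeric", 2),
    "Gender": ("String", 1),
    "PLZ": ("Numeric", 5),
    "Start date": ("Numeric", 12),
    "Reason of admission": ("String", 1),
    "End date": ("String", 12),
    "Age in years": ("String", 3),
    "DRG": ("String", 4),
}


def validate_record(record: dict[str, str]) -> list[str]:
    """Return a list of schema violations; an empty list means valid."""
    errors = []
    for name, (typ, max_len) in OWN_CASES_SCHEMA.items():
        value = record.get(name)
        if value is None:
            errors.append(f"{name}: missing")
            continue
        if len(value) > max_len:
            errors.append(f"{name}: longer than {max_len} characters")
        if typ == "Numeric" and not value.isdigit():
            errors.append(f"{name}: not numeric")
    return errors


# Made-up sample record for illustration only.
sample = {
    "Id of Insurance": "1234567890",
    "Year of Birth": "1985",
    "Month of Birth": "07",
    "Gender": "w",
    "PLZ": "26121",
    "Start date": "201003000015",
    "Reason of admission": "N",
    "End date": "201003200930",
    "Age in years": "24",
    "DRG": "O01F",
}
```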
8. Example
► Data Source Transformation:
  [Figure: mapping from the Own Cases source fields (start date, reason of admission, year of birth, DRG, gender, Id of insurance, month of birth, end date, age in years, PLZ) to the target schema (day, ICD, Year, DRG, Gender)]
  ► =new Datetime(Q[10,11],Q[4,5],Q[0-3])
  ► (G==m M || G==w F)
► Consistency Rules:
  ► ICD=O10-O16 & G=M invalid
  ► DRG=O01F & G=M invalid
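The transformation expressions and consistency rules of this example can be sketched in ordinary code. Several points are assumptions, not statements from the slides: that `Q` is the 12-character start date, that the arguments of `new Datetime(...)` are day, month, and year in that order, that the gender expression maps `m` to `M` and `w` to `F`, and that an ICD code is compared by its first three characters. All function names are hypothetical.

```python
from datetime import date

# Assumed reading of "(G==m M || G==w F)": source code -> target code.
GENDER_MAP = {"m": "M", "w": "F"}


def parse_start_date(q: str) -> date:
    """Apply the slide's index positions literally.

    Q[0-3] is read as the year, Q[4,5] as the month, Q[10,11] as the day.
    This argument order is an assumption.
    """
    return date(int(q[0:4]), int(q[4:6]), int(q[10:12]))


def map_gender(g: str) -> str:
    return GENDER_MAP[g]


def icd_in_range(icd: str, low: str = "O10", high: str = "O16") -> bool:
    # Compare the three-character category, e.g. "O14.1" -> "O14".
    return low <= icd[:3] <= high


def violations(icd: str, drg: str, gender: str) -> list[str]:
    """Return the consistency rules that a target record violates."""
    found = []
    if gender == "M" and icd_in_range(icd):
        found.append("ICD=O10-O16 & G=M")
    if gender == "M" and drg == "O01F":
        found.append("DRG=O01F & G=M")
    return found
```

Keeping the rules as data-like checks separate from the transformation mirrors the split between the Data Source Transformation and Data Quality aspects.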
9. Current Status
► Already done
  ► Analysis Schema DSL
  ► Hierarchy DSL
  ► Data Quality DSL
  ► Transformations for Data Integration and Cubes
  ► Integrated Metamodel for these aspects
► Left to be done
  ► Data Source Schema
  ► Measures
  ► Data Source Transformation
  ► Integrate these aspects
10. Research Method and Conclusion
► Research Method
  ► Validation via implementation
  ► The described languages, metamodels, and transformations are built on the MUSTANG platform
  ► Ability to generate a configuration for a DWH
► Conclusion
  ► Experts can design and analyze all aspects of the DWH independently in DSLs
  ► Enables semi-automatic DWH creation
  ► Makes development faster
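To illustrate what generating parts of a DWH from a model can look like, the sketch below turns a list of dimensions and measures into table definitions. The choice of a relational star schema as the target, the SQL dialect, and every name here are assumptions made for illustration; the slides do not describe the output format of the actual generator or of the MUSTANG platform.

```python
def generate_star_schema(
    fact: str, dimensions: list[str], measures: list[str]
) -> str:
    """Generate CREATE TABLE statements for a simple star schema.

    One table per dimension plus one fact table that references them.
    Names are expected to be plain identifiers (no quoting is applied).
    """
    statements = []
    for dim in dimensions:
        statements.append(
            f"CREATE TABLE dim_{dim} (id INTEGER PRIMARY KEY, label TEXT);"
        )
    columns = [
        f"{dim}_id INTEGER REFERENCES dim_{dim}(id)" for dim in dimensions
    ]
    columns += [f"{measure} REAL" for measure in measures]
    statements.append(f"CREATE TABLE fact_{fact} ({', '.join(columns)});")
    return "\n".join(statements)


# Dimension names loosely follow the target schema of the example.
ddl = generate_star_schema("cases", ["year", "gender", "drg"], ["case_count"])
```

The point of the sketch is the direction of the workflow: the schematic work is derived from the model instead of being written by hand.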