2. Welcome
• Definition of agility
• Types of agility
• Discuss current approaches
• Hyper-agility
• Observations from the field
– Also: operational data warehousing, operational BI, agile project
management techniques, agility-oriented tools, and operational integration
3. Data Warehouse Agility
• Agility
– The overall measure of adaptability in terms of speed
& scope.
– Overall performance in adapting to change.
NOTE: This is not about warehouse machine throughput, near-real-time (NRT)
processing, or operational DW performance…
Ability of the data warehouse to adapt to change
Versus
Performance of an existing (steady state) warehouse
4. Data Warehouse Agility
• Agility
– Agile in IT
• Agile Project Management
• Agile Software Development
– Agile Manifesto
We are uncovering better ways of developing software by doing it and helping others do it.
Through this work we have come to value:
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.
• Agile Modeling Driven Design (AMDD)
• Test-Driven Design (TDD)
5. Data Warehouse Agility
• Agility in the Data Warehouse
– In data warehousing, agility is the ability to
build incrementally.
– The approach today is more concerned with developing
a business intelligence and data warehousing program: the
capability to increment (adapt and grow).
– Since the business is always changing (new reporting needs,
new business processes, new business units, new data
sources, etc.) the EDW program is an ongoing initiative that
needs to focus on adapting to these changes.
– Note: distinguish between operational integration and data
warehousing.
6. Types of Data Warehouse Agility
Change DW
New Source
New Mart
Data Warehouse
New Attribute
New Subject Area
16. Data Warehouse Agility
• Why create a Data Model for the DW?
• Model Data versus Meaning?
– Separate the capture of data from the meaning?
– The structure of a table versus the semantics
– Business meaning versus data loading
– As XML is to EDI
18. Concept of Name/Value Pair
Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
121202   Lundquist  Carl   22 Bird St  NYC      NY     98291  10/9/1977
123335   Dahlgren   Eva    7 Academy   Madison  NJ     07940  2/12/1982
139090   Lundberg   Scott  444 7th St  Tuborg   MN     70098  4/22/1988
119944   Hultquist  Darla  17 South    Randolf  PA     91121  9/22/1967
120334   Forsberg   Sven   117 East A  NYC      NY     98292  8/19/1976
Each Value or "data item" (the record value for each attribute) is provided in a
list format, paired with the corresponding Name or "field name" (column
header) from the normalized table structure.
Moving to Name / Value Pair…
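The pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_name_value_pairs` and the choice to keep every value as a string are not from the slides.

```python
# Column headers and two records taken from the customer table above.
columns = ["Cust_ID", "Lname", "Fname", "Add", "City", "State", "Zip", "Bdate"]
rows = [
    ["121202", "Lundquist", "Carl", "22 Bird St", "NYC", "NY", "98291", "10/9/1977"],
    ["123335", "Dahlgren", "Eva", "7 Academy", "Madison", "NJ", "07940", "2/12/1982"],
]


def to_name_value_pairs(columns, rows):
    """Flatten a table into one (name, value) pair per cell, row by row.

    Hypothetical helper: each value is paired with its column header.
    """
    pairs = []
    for row in rows:
        pairs.extend(zip(columns, row))
    return pairs


pairs = to_name_value_pairs(columns, rows)
for name, value in pairs[:3]:
    print(name, value)
```

Values are kept as strings here so that a Zip such as 07940 keeps its leading zero; that is a choice made for this sketch, not something the slides prescribe.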
19. Concept of Name/Value Pair
Name   Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
Value  121202   Lundquist  Carl   22 Bird St  NYC      NY     98291  10/9/1977
Name   Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
Value  123335   Dahlgren   Eva    7 Academy   Madison  NJ     07940  2/12/1982
Name   Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
Value  139090   Lundberg   Scott  444 7th St  Tuborg   MN     70098  4/22/1988
Name   Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
Value  119944   Hultquist  Darla  17 South    Randolf  PA     91121  9/22/1967
Name   Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
Value  120334   Forsberg   Sven   117 East A  NYC      NY     98292  8/19/1976
20. Moving to Name/Value Pair
Cust_ID  Lname      Fname  Add         City     State  Zip    Bdate
121202   Lundquist  Carl   22 Bird St  NYC      NY     98291  10/9/1977
123335   Dahlgren   Eva    7 Academy   Madison  NJ     07940  2/12/1982
139090   Lundberg   Scott  444 7th St  Tuborg   MN     70098  4/22/1988
119944   Hultquist  Darla  17 South    Randolf  PA     91121  9/22/1967
120334   Forsberg   Sven   117 East A  NYC      NY     98292  8/19/1976
Transpose the table, with column headings, into NAME and VALUE columns.
21. Name/Value Pair
Name Value
Cust_ID 121202
Lname Lundquist
Fname Carl
Add 22 Bird St
City NYC
State NY
Zip 98291
Bdate 10/9/1977
Cust_ID 123335
Lname Dahlgren
Fname Eva
Add 7 Academy
City Madison
State NJ
Zip 07940
Bdate 2/12/1982
Cust_ID 139090
Lname Lundberg
Fname Scott
22. Name Value
Cust_ID 121202
Lname Lundquist
Fname Carl
Add 22 Bird St
City NYC
State NY
Zip 98291
Bdate 10/9/1977
Cust_ID 123335
Lname Dahlgren
Fname Eva
Add 7 Academy
City Madison
State NJ
Zip 07940
Bdate 2/12/1982
Cust_ID 139090
Lname Lundberg
Fname Scott
The concept of the "record" is effectively lost in this transformation.
Now a RECORD is a set of Name/Value Pair instances…
CON: Lose resolution on the record.
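To make the loss of record resolution concrete, here is a minimal Python sketch that rebuilds records from a flat name/value list. It rests on an assumption that is not stated in the slides: every record begins with a `Cust_ID` pair. The function name `regroup_records` is hypothetical.

```python
def regroup_records(pairs, key_name="Cust_ID"):
    """Rebuild records from a flat list of (name, value) pairs.

    Assumption for this sketch: a pair named key_name marks the start of a
    new record. Nothing in the name/value list itself guarantees that.
    """
    records = []
    for name, value in pairs:
        if name == key_name or not records:
            records.append({})
        records[-1][name] = value
    return records


pairs = [
    ("Cust_ID", "121202"), ("Lname", "Lundquist"), ("Fname", "Carl"),
    ("Cust_ID", "123335"), ("Lname", "Dahlgren"), ("Fname", "Eva"),
]
records = regroup_records(pairs)
print(records)
```

If the feed order changes or a record arrives without its `Cust_ID`, this regrouping silently produces wrong records. One possible alternative, also an assumption here, is to carry an explicit record identifier on every pair.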
23. Name Value
Cust_ID 121202
Lname Lundquist
Fname Carl
Add 22 Bird St
City NYC
State NY
Zip 98291
Bdate 10/9/1977
Cust_ID 123335
Lname Dahlgren
Fname Eva
Add 7 Academy
City Madison
State NJ
Zip 07940
Bdate 2/12/1982
Cust_ID 139090
Lname Lundberg
Fname Scott
Also, the attributes are not defined in advance: we don't know what to expect, and
we can't check for attribute meaning, definitions, domain values or data types.
CON: Attributes are not pre-defined.
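One way to see this drawback is that any check on a value has to come from somewhere outside the name/value list. The Python sketch below assumes a separate, hand-maintained rule dictionary; `ATTRIBUTE_RULES`, `check_pair`, and the specific rules are all hypothetical and are not part of the approach shown in the slides.

```python
from datetime import datetime


def _is_date(value):
    """True if value parses as month/day/year, e.g. 10/9/1977."""
    try:
        datetime.strptime(value, "%m/%d/%Y")
        return True
    except ValueError:
        return False


# Hypothetical attribute dictionary, kept OUTSIDE the name/value store.
ATTRIBUTE_RULES = {
    "Cust_ID": lambda v: v.isdigit(),
    "State": lambda v: len(v) == 2 and v.isalpha() and v.isupper(),
    "Zip": lambda v: len(v) == 5 and v.isdigit(),
    "Bdate": _is_date,
}


def check_pair(name, value):
    """Return 'ok', 'invalid', or 'unknown' (no rule exists for this name)."""
    rule = ATTRIBUTE_RULES.get(name)
    if rule is None:
        return "unknown"
    return "ok" if rule(value) else "invalid"


print(check_pair("Zip", "98291"))
print(check_pair("CustClass", "Big"))
```

An attribute with no entry in the dictionary can only be reported as unknown, which is the situation the slide describes: without pre-defined attributes there is nothing to validate against.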
24. Name Value
Cust_ID 121202
Lname Lundquist
Fname Carl
Add 22 Bird St
City NYC
State NY
Zip 98291
Bdate 10/9/1977
CustClass Big
Cust_ID 123335
Lname Dahlgren
Fname Eva
Add 7 Academy
City Madison
State NJ
Zip 07940
Bdate 2/12/1982
CustClass Small
Cust_ID 139090
New attributes that are introduced into the source feed are added instantly to the DW.
There is no modeling delay, no code change, and no ETL impact…
PRO: Absorb new attributes instantly.
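The "no code change" point can be illustrated with a minimal Python sketch of a loader that never consults an attribute list. The in-memory list `store` and the function `load_feed` are hypothetical stand-ins for the DW and its load process.

```python
def load_feed(store, feed_pairs):
    """Append every incoming (name, value) pair; no attribute list is consulted."""
    store.extend(feed_pairs)
    return store


store = []

# First feed: only the attributes known so far.
load_feed(store, [("Cust_ID", "121202"), ("Lname", "Lundquist")])

# A later feed introduces CustClass; the loader itself is unchanged.
load_feed(store, [("Cust_ID", "123335"), ("Lname", "Dahlgren"), ("CustClass", "Small")])

known_names = sorted({name for name, _ in store})
print(known_names)
```

The new attribute appears in the store simply because it appeared in the feed. The same property means a misspelled or unintended attribute name would be absorbed just as readily.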
25. Hyper Agility
• Dealing with these issues requires a further level of
abstraction, which in effect moves the persisted (historized,
permanent, integrated) data store even further away from the
business context that it is intended to represent.
• The DW model – the data model itself – is then not readable (not
understandable). In fact, ETL professionals will also find themselves
further removed from this model. To the extent that a model should be
intuitive, self-descriptive, and aligned with business meaning, this
approach takes a step in the other direction.
• Addressing these business-driven agility
requirements causes the model itself to move much further away
(an order of magnitude away) from the business, so far that it
effectively becomes a technical solution utilizing only abstract
representations.
26. Hyper Agility
• The context – the meaning of the data – will in these cases need to
be managed in a different way.
• This can include a form of persisted and historized metadata
concerning the mappings and business rules. In effect a form of
EAI within the DW.
• Or it might include a more traditional secondary DW layer.
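As a rough illustration of persisted and historized metadata, the Python sketch below keeps dated definitions for an attribute name and looks up the one in force on a given date. Everything in it is an assumption made for illustration: the `AttributeMeaning` class, the `meaning_as_of` function, the dates, and the definition texts are not from the slides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AttributeMeaning:
    """One historized metadata entry for an attribute name (hypothetical)."""
    name: str
    business_definition: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the entry is still current


def meaning_as_of(history, name, as_of):
    """Return the entry for name that was in force on as_of, or None."""
    for entry in history:
        if entry.name != name:
            continue
        still_open = entry.valid_to is None or as_of < entry.valid_to
        if entry.valid_from <= as_of and still_open:
            return entry
    return None


# Invented example history; the definitions are placeholders.
history = [
    AttributeMeaning("CustClass", "placeholder definition v1",
                     date(2009, 1, 1), date(2010, 1, 1)),
    AttributeMeaning("CustClass", "placeholder definition v2",
                     date(2010, 1, 1)),
]

current = meaning_as_of(history, "CustClass", date(2010, 6, 1))
print(current.business_definition)
```

Keeping the validity dates alongside each definition lets a query interpret a stored value using the meaning that applied when the value was loaded, rather than only the current one.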
27. DW AGILITY SUMMARY
• Consider specific Agility Requirements
• Classify Agility Types and consider Alternatives
• Distinguish between operational integration and DW
• Look to modeling techniques optimized for the Data Warehouse
• Look at the entire picture – people, process, models and data
• Consider specific methodologies, templates and tools
• Determine if hyper agility is a requirement