2. Designing Aggregates
Once you have chosen dimensional aggregates, they must
be designed and documented. This is the point of greatest
risk for aggregate implementation.
4. The Base Schema
Declaration of grain is an essential part of schema design.
Proper definition of grain not only enables the future
identification of aggregates, it is crucial to the success of
the base schema itself.
6. Rollup Dimensions
Rollup dimensions should be sourced from the base
dimensions, and their attributes must follow the same
rules for slow change processing.
7. Hierarchies
Documenting dimensional hierarchies may be important for
business intelligence software and database features such
as materialized views and materialized query tables.
The hierarchies identify potential aggregation points and
can aid in estimating the degree of summarization.
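As a rough illustration of how a documented hierarchy can help estimate the degree of summarization, the sketch below counts the distinct members at each level of a hypothetical Day > Month > Quarter > Year hierarchy. The hierarchy, the function names, and the one-year date range are assumptions made for the example. The ratios describe the dimension only; the actual reduction in fact rows also depends on how sparse the fact table is.

```python
from datetime import date, timedelta

def build_day_rows(start, days):
    """Build hypothetical rows of a Day dimension: (day, month, quarter, year)."""
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        quarter = (d.month - 1) // 3 + 1
        rows.append((d.isoformat(), f"{d.year}-{d.month:02d}",
                     f"{d.year}-Q{quarter}", d.year))
    return rows

def level_cardinalities(rows, level_names):
    """Count the distinct members at each level of the hierarchy."""
    return {name: len({row[i] for row in rows})
            for i, name in enumerate(level_names)}

rows = build_day_rows(date(2023, 1, 1), 365)
cards = level_cardinalities(rows, ["day", "month", "quarter", "year"])

# Ratio of base-level members to members at each candidate aggregation point.
compression = {name: cards["day"] / count for name, count in cards.items()}

print(cards)
print(compression)
```

Each level with a high ratio is a candidate aggregation point worth evaluating against real fact data.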
10. A Separate Star for Each Aggregation
Dimensional aggregates should be stored in separate
tables for each aggregation.
11. A Separate Star for Each Aggregation
Do not store different levels of aggregation in the same
schema. A query that does not filter on the aggregation
level will combine detail rows with summary rows, and the
schema will return wrong results.
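The risk can be shown with a small sketch. The table `sales_mixed`, its `level` column, and the sample values are hypothetical, and an in-memory SQLite database stands in for the warehouse. A monthly summary row is stored beside the daily rows it summarizes, so an unfiltered sum counts the same sales twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_mixed (level TEXT, period TEXT, sales_dollars REAL)"
)
conn.executemany(
    "INSERT INTO sales_mixed VALUES (?, ?, ?)",
    [
        ("day", "2024-01-01", 100.0),
        ("day", "2024-01-02", 150.0),
        ("month", "2024-01", 250.0),  # summary row stored beside its detail
    ],
)

# A query that forgets to filter on the level column double counts.
naive_total = conn.execute(
    "SELECT SUM(sales_dollars) FROM sales_mixed"
).fetchone()[0]

# The correct answer requires every query to remember the level filter.
correct_total = conn.execute(
    "SELECT SUM(sales_dollars) FROM sales_mixed WHERE level = 'day'"
).fetchone()[0]

print(naive_total, correct_total)
```

With a separate table per aggregation level, no such filter is needed, so a query cannot mix levels by accident.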
12. Aggregate Facts
Aggregate facts should be stored in separate tables for
each level of aggregation. These may be separate
aggregate fact tables or separate prejoined aggregate
tables.
13. Naming Conventions
Facts and dimensional attributes should receive the same
name in an aggregate schema as they do in the base
schema.
The name of an aggregate dimension table should
describe the contents of its rows.
The names of aggregate fact tables are always
problematic. The best you can do is establish a convention
and stick to it.
14. Aggregate Dimension Design
Attributes of the aggregate dimension must be identical
to those in the base dimension in name and data type.
Slow change processing rules must be identical. The
natural key of an aggregate dimension will differ from
that of the base dimension.
Source aggregate dimensions from the base dimension,
rather than from the original source system. This
eliminates redundant processing and ensures uniform
presentation of data values.
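A minimal sketch of sourcing an aggregate dimension from the base dimension follows. The `product` and `brand` tables, their columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the warehouse. The aggregate dimension keeps the attribute names and types of the base dimension, but its natural key is `brand_code` rather than `sku`. Slow change processing is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (              -- base dimension, natural key: sku
    product_key  INTEGER PRIMARY KEY,
    sku          TEXT,
    product_name TEXT,
    brand_code   TEXT,
    brand_name   TEXT,
    category     TEXT
);
INSERT INTO product VALUES
    (1, 'A-100', 'Widget Small', 'B1', 'Acme',   'Hardware'),
    (2, 'A-200', 'Widget Large', 'B1', 'Acme',   'Hardware'),
    (3, 'Z-900', 'Gizmo',        'B2', 'Zenith', 'Gadgets');

CREATE TABLE brand (                -- aggregate dimension, natural key: brand_code
    brand_key  INTEGER PRIMARY KEY AUTOINCREMENT,
    brand_code TEXT,
    brand_name TEXT,
    category   TEXT
);

-- Source the aggregate dimension from the base dimension,
-- not from the original source system.
INSERT INTO brand (brand_code, brand_name, category)
SELECT DISTINCT brand_code, brand_name, category
FROM product
ORDER BY brand_code;
""")

brand_rows = conn.execute(
    "SELECT brand_code, brand_name, category FROM brand ORDER BY brand_code"
).fetchall()
print(brand_rows)
```

Because the values are copied from the base dimension, a brand name appears identically in both tables.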
15. Aggregate Dimension Design
Aggregate dimension tables are often shared by multiple
aggregates, and sometimes used by base fact tables. These
shared dimension tables do not need to be built
redundantly; the various fact tables can use the same
dimension table. If the shared table is to be instantiated
more than once, build it a single time and then replicate
it.
The documentation for a shared dimension must enumerate all
dependent fact tables, whether part of the base schema or
aggregates. In some cases, frequent updates to a dimension may
require updates to fact tables outside their normal load
windows.
16. Aggregate Fact Table Design
Aggregate Facts: Names and Data Types
The aggregate fact should have the same business definition and
column name as the base fact.
Unlike dimensional attributes, the aggregate fact may have a different
data type than its counterpart in the base schema, because summarized
values can be larger than any single base value.
No New Facts, Including Counts
Counts cannot be accurately performed against aggregate schemas,
even if all attributes are the same. All counts must be performed
against the base schema.
As a general rule of thumb, the only count to be added to an
aggregate should show the number of base rows summarized. If this
fact is added to the aggregate, it should also appear in the base fact
table with a constant value of 1. Counts of any other attribute should
be directed to the base schema only.
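The rule of thumb about counts can be sketched as follows. The tables `order_facts` and `order_facts_by_day`, the `row_count` column, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the warehouse. The base fact table carries `row_count` with a constant value of 1, and the aggregate stores its sum, so the same expression gives the same answer against either table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_facts (          -- base fact table
    day           TEXT,
    month         TEXT,
    customer      TEXT,
    order_dollars REAL,
    row_count     INTEGER DEFAULT 1 -- constant 1 on every base row
);
INSERT INTO order_facts (day, month, customer, order_dollars) VALUES
    ('2024-01-01', '2024-01', 'C1', 10),
    ('2024-01-01', '2024-01', 'C2', 20),
    ('2024-01-02', '2024-01', 'C1', 30);

CREATE TABLE order_facts_by_day AS  -- aggregate without the customer dimension
SELECT day, month,
       SUM(order_dollars) AS order_dollars,
       SUM(row_count)     AS row_count
FROM order_facts
GROUP BY day, month;
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

# The same SUM(row_count) expression works against base and aggregate.
base_rows = scalar("SELECT SUM(row_count) FROM order_facts")
agg_rows = scalar("SELECT SUM(row_count) FROM order_facts_by_day")

# COUNT(*) against the aggregate counts summary rows, not base rows.
agg_count_star = scalar("SELECT COUNT(*) FROM order_facts_by_day")

# Counts of any other attribute must go to the base schema.
distinct_customers = scalar("SELECT COUNT(DISTINCT customer) FROM order_facts")

print(base_rows, agg_rows, agg_count_star, distinct_customers)
```

The aggregate has no `customer` column, so a distinct customer count cannot be answered from it at all.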
17. Aggregate Fact Table Design
Audit Dimension
The audit record associated with a row in the aggregate fact
table does not summarize the audit data associated with the
base fact table. It describes the process by which the aggregate
row was inserted or updated.
Sourcing Aggregate Fact Tables
Facts will be sourced from the base fact table and aggregated
by the load process as appropriate.
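The two points above can be sketched together. The tables `audit`, `sales_facts`, and `sales_facts_by_month`, the function `load_monthly_aggregate`, and the sample data are hypothetical, and an in-memory SQLite database stands in for the warehouse. The load process writes its own audit row, sources the facts from the base fact table, and stamps every aggregate row with the new audit key instead of summarizing the audit keys of the base rows.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audit (
    audit_key    INTEGER PRIMARY KEY,
    process_name TEXT,
    loaded_at    TEXT
);
CREATE TABLE sales_facts (
    day TEXT, month TEXT, sales_dollars REAL, audit_key INTEGER
);
CREATE TABLE sales_facts_by_month (
    month TEXT, sales_dollars REAL, audit_key INTEGER
);
INSERT INTO audit VALUES
    (1, 'load_sales_facts', '2024-02-01T01:00:00+00:00'),
    (2, 'load_sales_facts', '2024-02-02T01:00:00+00:00');
INSERT INTO sales_facts VALUES
    ('2024-01-30', '2024-01', 100.0, 1),
    ('2024-01-31', '2024-01',  50.0, 2),
    ('2024-02-01', '2024-02',  70.0, 2);
""")

def load_monthly_aggregate(conn):
    """Load the monthly aggregate from the base fact table."""
    # The audit row describes this aggregate load, not the base loads.
    cur = conn.execute(
        "INSERT INTO audit (process_name, loaded_at) VALUES (?, ?)",
        ("load_sales_facts_by_month", datetime.now(timezone.utc).isoformat()),
    )
    agg_audit_key = cur.lastrowid
    # Facts are sourced from the base fact table and summed by the load.
    conn.execute(
        "INSERT INTO sales_facts_by_month (month, sales_dollars, audit_key) "
        "SELECT month, SUM(sales_dollars), ? FROM sales_facts GROUP BY month",
        (agg_audit_key,),
    )
    return agg_audit_key

agg_audit_key = load_monthly_aggregate(conn)
monthly = conn.execute(
    "SELECT month, sales_dollars, audit_key "
    "FROM sales_facts_by_month ORDER BY month"
).fetchall()
print(monthly)
```

The January aggregate row summarizes base rows loaded under two different audit keys, yet it carries only the key of the aggregate load.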