AGILE DATA WAREHOUSE DESIGN WITH BIG DATA

John DiPietro & Jim Stagnitto
1
AGENDA
• INTRODUCTION / A2C OVERVIEW
• MODELING FOR END USERS
• ROLE OF DIMENSIONAL MODELS IN BIG DATA
• EXAMPLE: E-COMMERCE
  • STRUCTURED DATA: SALES
  • SEMI-STRUCTURED DATA: CLICKSTREAM
• AGILE DIMENSIONAL MODELING OVERVIEW
• CASE STUDY REVIEW
• Q&A

2
INTRODUCTION
A2C
• BOUTIQUE EDM (ENTERPRISE DATA MANAGEMENT) CONSULTANCY FIRM:
  • DATA WAREHOUSING
  • MASTER DATA MANAGEMENT
  • CLOSED LOOP ANALYTICS AND VISUALIZATION
  • DATA & APPLICATION ARCHITECTURE
JOHN DIPIETRO
• PRINCIPAL, CHIEF TECHNOLOGY OFFICER
JIM STAGNITTO
• DATA WAREHOUSE & MDM ARCHITECT

3
ON THURSDAY 11/14 A2C’S JIM STAGNITTO AND
JOHN DIPIETRO PRESENTED A WORKSHOP…
FEATURING AGILE DATA WAREHOUSE DESIGN - A STEP-BY-STEP
METHOD FOR DATA WAREHOUSING / BUSINESS INTELLIGENCE (DW/BI)
PROFESSIONALS TO BETTER COLLECT AND TRANSLATE BUSINESS
INTELLIGENCE REQUIREMENTS INTO SUCCESSFUL DIMENSIONAL DATA
WAREHOUSE DESIGNS.

THE METHOD UTILIZES BEAM✲ (BUSINESS EVENT ANALYSIS AND
MODELING) - AN AGILE APPROACH TO DIMENSIONAL DATA MODELING
THAT CAN BE USED THROUGHOUT ANALYSIS AND DESIGN TO IMPROVE
PRODUCTIVITY AND COMMUNICATION BETWEEN DW DESIGNERS AND BI
STAKEHOLDERS.
SPONSORED BY MICROSOFT NERD (NEW ENGLAND RESEARCH AND
DEVELOPMENT CENTER) AND ATTENDED BY 93 DATA SCIENTISTS…
COMPETITIVE ADVANTAGE

CEO, Craig Spitzer

Pres., Scott King

CTO, John DiPietro

CRO, Brian Cassidy

Managing Sales Dir., Joe Cattie

The founders of a2c were part of the fastest growing privately held IT
consulting and staff augmentation firm in the U.S. from 1994-2002. Our
Executive Management Team has over 100 years of collective
experience and has been responsible for delivering over a half billion
dollars of IT Consulting and staff augmentation revenue from 1994
through the present day.

a2c Top Twenty Most
Promising Data Analytics
November 2013

Alliance Consulting, Inc.
1999, 2000, 2001

CEO, Alliance Consulting
Group, Craig Spitzer 2001
AGILE DW DESIGN OVERVIEW

6
MODELING FOR END USERS:
HOW TO DESIGN TO ANSWER BUSINESS
QUESTIONS?

• THINK ABOUT HOW QUESTIONS ARE ARTICULATED AND HOW THE ANSWERS SHOULD BE DELIVERED
• IDENTIFY A COMMON QUESTION FRAMEWORK
• DESIGN AN ARCHITECTURE THAT EMBRACES AND LEVERAGES THIS COMMON QUESTION FRAMEWORK
• UTILIZE THE BEST DESIGNS AND TECHNOLOGIES TO:
  (A) DERIVE THE ANSWERS
  (B) PRESENT THEM IN COMPELLING WAYS THAT LEAD TO THE NEXT INTERESTING QUESTION!

7
HOW DO WE ASK QUESTIONS?
What

When

Who

“HOW DO THIS QUARTER’S SALES BY SALES REP
OF ELECTRONIC PRODUCTS THAT WE PROMOTED
TO RETAIL CUSTOMERS IN THE EAST COMPARE
WITH LAST YEAR’S?”
When
Who

Why
Where

What

8
HOW DO WE ASK QUESTIONS?
EVENTS / TRANSACTIONS
• E.G. SALE
• AN IMMUTABLE "FACT" THAT OCCURS AT A TIME AND (TYPICALLY) A PLACE

INTERROGATIVES:
• WHO, WHAT, WHEN, WHERE, WHY
• DESCRIPTIVE CONTEXT THAT FULLY DESCRIBES THE EVENT
• A SET OF "DIMENSIONS" THAT DESCRIBE EVENTS
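
One way to picture an event with its interrogatives is as an immutable record whose fields are the who, what, when, where, why and how-many of a sale. The sketch below is illustrative only: the `SaleEvent` class, its field names and the sample rows are assumptions made for this example and do not come from the presentation. It answers a cut-down version of the "this quarter vs. last year" question by filtering on the dimensions and summing the measure.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: an event is an immutable fact
class SaleEvent:
    when: date               # when did the sale happen
    who_rep: str             # who sold it
    who_customer_type: str   # who bought it
    what_category: str       # what was sold
    where_region: str        # where it was sold
    why_promoted: bool       # why: was it sold under a promotion
    how_many_revenue: float  # how many: the measure


# Hypothetical sample events, for illustration only.
SALES = [
    SaleEvent(date(2013, 10, 3), "Ann", "Retail", "Electronics", "East", True, 1200.0),
    SaleEvent(date(2013, 11, 9), "Bob", "Retail", "Electronics", "East", True, 800.0),
    SaleEvent(date(2013, 11, 20), "Ann", "Wholesale", "Electronics", "East", True, 5000.0),
    SaleEvent(date(2012, 10, 15), "Ann", "Retail", "Electronics", "East", True, 1000.0),
    SaleEvent(date(2013, 10, 21), "Bob", "Retail", "Apparel", "West", False, 300.0),
]


def quarter(d: date) -> tuple:
    """Return (year, quarter number) for a calendar date."""
    return (d.year, (d.month - 1) // 3 + 1)


def sales_by_rep(events, year: int, qtr: int) -> dict:
    """Promoted electronics sales to retail customers in the East, by sales rep."""
    totals = defaultdict(float)
    for e in events:
        if (quarter(e.when) == (year, qtr)
                and e.what_category == "Electronics"
                and e.why_promoted
                and e.who_customer_type == "Retail"
                and e.where_region == "East"):
            totals[e.who_rep] += e.how_many_revenue
    return dict(totals)


this_quarter = sales_by_rep(SALES, 2013, 4)
same_quarter_last_year = sales_by_rep(SALES, 2012, 4)
print(this_quarter, same_quarter_last_year)
```

Each filter condition corresponds to one interrogative in the question, which is the point the slide makes: the question already has dimensional shape.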

9
DIMENSIONAL VALUE PROPOSITION
• IT MAKES SENSE TO PRESENT ANSWERS TO PEOPLE USING THE SAME TAXONOMY OF EVENTS AND INTERROGATIVES (AKA: FACTS AND DIMENSIONS - DIMENSIONAL STRUCTURE) THAT THEY USE WHEN FORMING QUESTIONS;
• EVENTS ARE INSTANCES OF PROCESSES;
• IT’S BEST TO PRESENT INFORMATION TO PEOPLE WHO WILL ASK THE SYSTEM QUESTIONS IN DIMENSIONAL FORM;
• THIS IS TRUE REGARDLESS OF THE TYPE OF INFORMATION BEING INTERROGATED, ITS SOURCE, OR IT STUFF (LIKE DATABASE TECHNOLOGIES UTILIZED);
• IT’S BEST TO MODEL THIS PRESENTATION LAYER BASED ON THE EVENTS (AKA: BUSINESS PROCESSES) THAT UNDERLIE THE QUESTIONS.

10
How
How
Many

Why
11
SCENARIOS:
A BRIEF DISCUSSION OF HOW AND
WHERE DIMENSIONAL MODELING
AND/OR DATABASES FIT WITHIN
COMMON AND EMERGING “BIG
DATA” DATA WAREHOUSING
ARCHITECTURES

12
KIMBALL DIMENSIONAL DW
Dimensional BI Semantic Layer
Dimensional Data Warehouse
Data Movement / Integration
Source Data
(Structured)

13
KIMBALL WITH BIG DATA
Dimensional BI Semantic Layer
Dimensional Data Warehouse

Big Data Capture
(e.g. HDFS)

Big Data
Discovery
(e.g. MR)

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

14
CORPORATE INFORMATION FACTORY (CIF)
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Corporate Information Factory 3NF DW

Data Movement / Integration
Source Data
(Structured)

15
CIF WITH BIG DATA
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Big Data Capture
(e.g. HDFS)

Big Data
Discovery

Corporate Information
Factory 3NF DW

(e.g. MR)

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

16
DATA VAULT
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Data Vault

Data Movement / Integration
Source Data
(Structured)

17
DATA VAULT WITH BIG DATA
Dimensional BI Semantic Layer
Dimensional Tier
(Virtual or Physical)

Big Data Capture
(e.g. HDFS)

Big Data
Discovery

Data Vault

(e.g. MR)

Data Movement / Integration Tier

Data Movement / Integration Tier

Source Data Tier

Source Data Tier

(Un/Semi-Structured)

(Structured)

18
COMMON FRAMEWORK
Dimensional BI Semantic Layer
Dimensional Tier
[Physical (Kimball) or Virtual (CIF or Data Vault)]

(Virtual or Physical)
Persistent
Un/SemiStructured
Staging Area

Unstructured ->
Structured Data
Discovery
Processing

Persistent Structured Data
Repository
(not needed for Kimball)

Un/Semi-Structured Data Movement

Structured Data Movement

Un/Semi-Structured Source Data

Structured Source Data
(Structured)
19

Insight
Generation /
Data Mining
COMMON FRAMEWORK
Dining Room
Readily Accessible to End Users
(and BI Developers)
Safe, Hospitable Environment
Data Assets “Ready for Primetime”
Dimensionally Structured

Dimensional BI Semantic Layer
Dimensional Tier
[Physical (Kimball) or Virtual (CIF or Data Vault)]

(Virtual or Physical)
Persistent
Un/SemiStructured Staging
Area

Unstructured ->
Structured Data
Discovery
Processing

Persistent Structured Data
Repository

Kitchen

(not needed for Kimball)

Un/Semi-Structured Data Movement

Structured Data Movement

Un/Semi-Structured Source Data

Structured Source Data
(Structured)

Clickstream Data

Off Limits to End Users
Data Professionals Only Please
Dangerous / Inhospitable Environment
Data Assets “Not Ready for Primetime”
Structured Variably For Data Processing

eCommerce Sale

eCommerce Example

20
E-COMMERCE EXAMPLE: CLICKSTREAM
Semi-Structured
Recording of every page request made by a user
Includes some structural elements, such as when the request was made and who the user is
Requires significant prep work in order to fit into a traditional row-based relational database
Apples and Oranges: Pre-Sessionized Page Visits, Detailed Product Views, Catalogue Requests, Shopping Cart Adds / Deletes / Abandons, etc.
Needs to be converted into separate-but-relatable dimensional facts with many shared (conformed) dimensions
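
As a rough illustration of that prep work, the sketch below sessionizes raw page requests and then routes them into separate fact lists that still share the same user, time and session keys. Everything here is assumed for the example: the request layout, the URL prefixes, the 30-minute inactivity timeout and the function names are not taken from the presentation.

```python
from datetime import datetime, timedelta

# Assumption: a gap of more than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)

# Hypothetical raw page requests, for illustration only.
REQUESTS = [
    {"user": "u1", "ts": datetime(2013, 11, 14, 9, 0), "url": "/home"},
    {"user": "u1", "ts": datetime(2013, 11, 14, 9, 5), "url": "/product/42"},
    {"user": "u1", "ts": datetime(2013, 11, 14, 9, 7), "url": "/cart/add?sku=42"},
    {"user": "u1", "ts": datetime(2013, 11, 14, 11, 0), "url": "/home"},
    {"user": "u2", "ts": datetime(2013, 11, 14, 9, 1), "url": "/product/7"},
]


def sessionize(requests):
    """Tag each request with a per-user session id based on inactivity gaps."""
    last_seen = {}
    session_no = {}
    tagged = []
    for r in sorted(requests, key=lambda r: (r["user"], r["ts"])):
        user = r["user"]
        if user not in last_seen or r["ts"] - last_seen[user] > SESSION_TIMEOUT:
            session_no[user] = session_no.get(user, 0) + 1
        last_seen[user] = r["ts"]
        tagged.append({**r, "session_id": f"{user}-{session_no[user]}"})
    return tagged


def split_into_facts(rows):
    """Route sessionized requests into separate facts that share the same keys."""
    facts = {"page_visit": [], "product_view": [], "cart_activity": []}
    for row in rows:
        facts["page_visit"].append(row)  # every request counts as a page visit
        if row["url"].startswith("/product/"):
            facts["product_view"].append(row)
        elif row["url"].startswith("/cart/"):
            facts["cart_activity"].append(row)
    return facts


sessionized = sessionize(REQUESTS)
facts = split_into_facts(sessionized)
print({name: len(rows) for name, rows in facts.items()})
```

Because every fact row keeps the same `user`, `ts` and `session_id` values, the resulting facts remain relatable through shared dimensions.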

21

Raw Clickstream Data
25 52 164 240 274 328 368 448 538 561 630 687 730 775 825
834
39 120 124 205 401 581 704 814 825 834
35 249 674 712 733 759 854 950
39 422 449 704 825 857 895 937 954 964
15 229 262 283 294 352 381 708 738 766 853 883 966 978
26 104 143 320 569 620 798
7 185 214 350 529 658 682 782 809 849 883 947 970 979
227 390
71 192 208 272 279 280 300 333 496 529 530 597 618 674 675
720 855 914 932
183 193 217 256 276 277 374 474 483 496 512 529 626 653 706
878 939
161 175 177 424 490 571 597 623 766 795 853 910 960
125 130 327 698 699 839
392 461 569 801 862
27 78 104 177 733 775 781 845 900 921 938
101 147 229 350 411 461 572 579 657 675 778 803 842 903
71 208 217 266 279 290 458 478 523 614 766 853 888 944 969
43 70 176 204 227 334 369 480 513 703 708 835 874 895
25 52 278 730
151 432 504 830 890
71 73 118 274 310 327 388 419 449 469 484 706 722 795 810
844 846 918
130 274 432 528 967
188 307 326 381 403 523 526 722 774 788 789 834 950 975
89 116 198 201 333 395 653 720 846
70 171 227 289 462 538 541 623 674 701 805 946 964
143 192 317 471 487 631 638 640 678 735 780 865 888 935
17 242 471 758 763 837 956
52 145 161 283 375 385 676 721 731 790 792 885
182 229 276 529
43 522 565 617 859
TYPICAL CLICKSTREAM “PAGE VIEW” DIMENSIONAL
MODEL
What

When

What

Who

Why

22
E-COMMERCE EXAMPLE: WEB SALES
• FULLY STRUCTURED
• THE SALE TRANSACTION TYPICALLY CARRIES ALL FUNDAMENTAL DIMENSIONS:
  • TIME
  • CUSTOMER
  • REFERRING URL / SEARCH PHRASE
  • PRODUCT
  • PURCHASE AND/OR SHIPMENT (GEO OR URL) LOCATIONS
  • PROMOTION / CAMPAIGN
  • ETC.
• AND “HOW MANY” MEASURES
  • UNIT AND PRICE QUANTITIES / AMOUNTS
  • DISCOUNT AMOUNTS
  • ETC.

23
E-COMMERCE DIMENSIONALITY
Facts (below) &
Dimensions (right)

Time
(When)

Page Visit

View Start
View End
Session Start
Session End

Customer
(Who)

Web Page
(Where)

Visitor

Current
Previous
Next

Detailed Product View

View Start
View End
Session Start
Session End

Prospect

Current
Previous
Next

Shopping Cart Activity

Activity Start
Activity End

Sale (Checkout)

Shipment / Delivery

Product
(What)

Referring
URL
(Where)

Promotion /
Campaign
(Why)

Activity
Type
(How)

✔︎

✔︎

✔︎

Prospect

✔︎

✔︎

✔︎

✔︎

Sale Start
Sale End

Customer

✔︎

✔︎

✔︎

✔︎

Shipment
Delivery

Customer
Delivery
Recipient

✔︎

24
AGILE DW DESIGN OVERVIEW

25
THE FIRST DIMENSIONAL MODELER:
R.K.
Ralph Kimball?
Rudyard Kipling

26
“I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who…”
–Rudyard Kipling

27
THE

7WS
Framework
How
How
Many

Why
HOW DID WE GET HERE?
DW ARCHITECTURES: A BRIEF HISTORY
Corporate Information
Factory

Undisciplined
Dimensional

Dimensional Bus
Architecture

Data-Driven Analysis

Report-Driven Analysis

Process-Driven Analysis
7WS DIMENSIONAL MODEL
When

Who

Time

Customer

Day

How – Facts:

Employee

Month

Much

Third Party

Fiscal Period

Many

Organization

Often

£$€
What

Where

Product

Location

Why

Causal

Geographic
Store
Ship To
Hospital

??

Service

Transactions

Promotion
Reason
Weather
Competition
How
Why

BEAM

How
Many

Business Event Analysis & Modeling
TO DOWNLOAD WITH AUDIO WORKSHOP FILE:
PLEASE COMPLETE THE FOLLOWING REQUEST FORM
FOR FREE LINK TO AGILE DATA WAREHOUSE DESIGN
PRESENTATION.
REVIEWS:
“EXCELLENT PRESENTATION. IT IS GOOD TO HEAR MEANINGFUL
…INFORMATION ABOUT NEW DEVELOPMENTS IN HOW AGILE
METHODOLOGIES CAN BE APPLIED TO DW/BI WORK. BIG KUDOS TO
THE PRESENTERS AND ORGANIZERS. THANKS, I FOUND IT VERY
USEFUL AND ENJOYABLE.”- RAMON VENEGAS
“EXTREMELY USEFUL TO UNDERSTAND HOW TO APPLY AGILE
APPROACH TO DWH; HOW CREATE A FRAMEWORK WHERE MODEL
CHANGES ARE WELCOME, AND BRING USERS TO THE PROCESS OF
DWH MODELING.” – ALFREDO GOMEZ

34
HOW
do you design a data warehouse?
TECH DESIGN ARTIFACTS?
OK, NOW VALIDATE WITH BUSINESS…
WHY
Agile Data Warehousing?
WATERFALL BI/DW DEVELOPMENT
Limited Stakeholder Interaction
Analysis
Design
Development
This Year

Stakeholder
Input

BDUF
Requirements

Data
Model

Next Year

Test
Release

ETL

BI

DATA

VALUE?
AGILE DW/BI DEVELOPMENT
Stakeholder interaction

?

JEDUF

BI
Prototyping

ETL

Review
Release

This Year

Next Year

Iteration 1

VALUE?

Iteration 2

ETL
BI
Iteration 3Rev

ADM

VALUE

Iteration …

VALUE!

DATA

Iteration n

VALUE!

VALUE!
STATE OF THE DW FIELD
SOLID:
• DIMENSIONAL DATA WAREHOUSE DESIGN IS MATURE
• PROVEN DESIGN PATTERNS EXIST FOR COMMON REQUIREMENTS

HIT OR MISS:
• COLLECTING UNAMBIGUOUS AND THOROUGH REQUIREMENTS
• SLOTTING REQUIREMENTS INTO PROVEN DESIGN PATTERNS
• END-USER OWNERSHIP AND VALIDATION
• TOO OFTEN: SNATCHING DEFEAT FROM THE JAWS OF VICTORY

41
MODELSTORMING:
QUICK

Interactive

Inclusive

Data
Modeler

BI Stakeholders

Fun
BEAM✲ METHODOLOGY
Structured, non-technical, collaborative
working conversation directly with BI Users

BEAM✲

• BI User’s Business
Process,
Organizational,
Hierarchical, and Data
Knowledge
• Focused Data Profiling

Data
Modeler

BI Stakeholders

• Logical and Physical
(Kimball-esque)
Dimensional Data
Models
• Example data
• Detailed and Testable
ETL Specification
• Instantiated DW
Prototype
REQUIREMENTS = DESIGN

4
COLLABORATION AT
EVERY STEP
AGILE DATA MODELING
REQUIREMENTS:
• TECHNIQUES FOR ENCOURAGING INTERACTION
• MUST USE SIMPLE, INCLUSIVE NOTATION AND TOOLS
• MUST BE QUICK: HOURS RATHER THAN DAYS – MODELSTORMING
• BALANCE ‘JUST IN TIME’ (JIT) AND ‘JUST ENOUGH DESIGN UP FRONT’ (JEDUF) TO REDUCE DESIGN REWORK
• DW DESIGNERS MUST EMBRACE DATA MODEL CHANGE, ALLOW MODELS TO EVOLVE, AVOID GENERIC DATA MODELS; NEED DESIGN PATTERNS THEY CAN TRUST TO REPRESENT TOMORROW’S BI REQUIREMENTS TOMORROW
• ETL AND BI DEVELOPERS MUST EMBRACE DATABASE CHANGE; NEED TOOL SUPPORT

46
WHAT
kind of model?
CALENDAR

PRODUCT

Date Key

Product Key

Date
Day
Day in Week
Day in Month
Day in Qtr
Day in Year
Month
Qtr
Year
Weekday Flag
Holiday Flag

Product Code
Product Description
Product Type
Brand
Subcategory
Category

SALES FACT
Date Key
Product Key
Store Key
Promotion Key

Quantity Sold
Revenue
Cost
Basket Count
STORE

PROMOTION

Store Key

Promotion Key

Store Code
Store Name
URL
Store Manager
Region
Country

Promotion Code
Promotion Name
Promotion Type
Discount Type
Ad Type
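
The star schema above can be tried out with a small in-memory database. The following is a minimal sketch using Python's built-in `sqlite3` module: it keeps only a few of the columns listed on the slide, the snake_case column names are an adaptation for SQL, and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar  (date_key INTEGER PRIMARY KEY, date TEXT, month TEXT, year INTEGER);
CREATE TABLE product   (product_key INTEGER PRIMARY KEY, product_code TEXT, category TEXT);
CREATE TABLE store     (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);
CREATE TABLE promotion (promotion_key INTEGER PRIMARY KEY, promotion_name TEXT);

-- The fact table holds one foreign key per dimension plus the measures.
CREATE TABLE sales_fact (
    date_key      INTEGER REFERENCES calendar(date_key),
    product_key   INTEGER REFERENCES product(product_key),
    store_key     INTEGER REFERENCES store(store_key),
    promotion_key INTEGER REFERENCES promotion(promotion_key),
    quantity_sold INTEGER,
    revenue       REAL,
    cost          REAL
);

-- Hypothetical sample rows, for illustration only.
INSERT INTO calendar  VALUES (20131114, '2013-11-14', 'Nov', 2013);
INSERT INTO product   VALUES (1, 'TV-01', 'Electronics'), (2, 'SH-01', 'Apparel');
INSERT INTO store     VALUES (1, 'Boston', 'East'), (2, 'Seattle', 'West');
INSERT INTO promotion VALUES (1, 'None'), (2, 'Fall Sale');
INSERT INTO sales_fact VALUES
    (20131114, 1, 1, 2, 2, 1000.0, 700.0),
    (20131114, 1, 2, 1, 1,  500.0, 350.0),
    (20131114, 2, 1, 1, 3,   90.0,  45.0),
    (20131114, 1, 1, 1, 1,  500.0, 350.0);
""")

# A typical dimensional query: constrain and group by dimension attributes,
# then sum the additive measures in the fact table.
rows = conn.execute("""
    SELECT p.category, s.region, SUM(f.quantity_sold), SUM(f.revenue)
    FROM sales_fact AS f
    JOIN product AS p ON p.product_key = f.product_key
    JOIN store   AS s ON s.store_key   = f.store_key
    GROUP BY p.category, s.region
    ORDER BY p.category, s.region
""").fetchall()

for row in rows:
    print(row)
```

Every query against this shape follows the same pattern, which is a large part of why the dimensional layer suits end users and BI tools.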
MODELING BY ABSTRACTION
MODELING BY EXAMPLE:
AGILE DW DESIGN
PROCESS

5
COLLABORATIVE / CONVERSATIONAL DESIGN

Who does what?
“Customers buy products”

BEAM✲
Modeler

Subjects Verb Objects

BI
Users
DESIGN USING NATURAL LANGUAGE
• VERBS – EVENTS – RELATIONSHIPS – FACT TABLES
• NOUNS – DETAILS – ENTITIES – DIMENSIONS
• MAIN CLAUSE – SUBJECT-VERB-OBJECT
• PREPOSITIONS – CONNECT ADDITIONAL DETAILS TO THE MAIN CLAUSE
• INTERROGATIVES – THE 7WS – DIMENSION TYPES
• BUSINESS VOCABULARY - NO “IT-SPEAK”

55
“Spreadsheet”-like Models
Event Table Name (filled in later)

Subject Column Name
Verb
Object Column Name

Interrogative

Details
Example Data (4-6
rows)
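
A spreadsheet-like event table of this kind can be mocked up in a few lines. The sketch below is not the BEAM✲ notation itself, only a loose approximation of the layout described on the slide (table name, verb, columns tagged with an interrogative, a handful of example rows). The class names, the `render` layout and the sample rows are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventColumn:
    name: str           # business name of the detail, e.g. CUSTOMER
    interrogative: str  # which W the column answers, e.g. "who"


@dataclass
class EventTable:
    name: str  # event table name (on the slide this is filled in later)
    verb: str  # the verb linking subject and object
    columns: List[EventColumn]
    examples: List[Tuple[str, ...]] = field(default_factory=list)

    def add_example(self, *values: object) -> None:
        """Add one row of example data; one value per column is required."""
        if len(values) != len(self.columns):
            raise ValueError("one value per column is required")
        self.examples.append(tuple(str(v) for v in values))

    def render(self) -> str:
        """Return the table as aligned plain text."""
        headers = [c.name for c in self.columns]
        kinds = [f"[{c.interrogative}]" for c in self.columns]
        grid = [headers, kinds] + [list(row) for row in self.examples]
        widths = [max(len(row[i]) for row in grid) for i in range(len(headers))]
        lines = [" | ".join(cell.ljust(w) for cell, w in zip(row, widths))
                 for row in grid]
        return "\n".join([f"{self.name} (verb: {self.verb})"] + lines)


orders = EventTable(
    name="CUSTOMER ORDERS",
    verb="orders",
    columns=[
        EventColumn("CUSTOMER", "who"),
        EventColumn("PRODUCT", "what"),
        EventColumn("ORDER DATE", "when"),
        EventColumn("QUANTITY", "how many"),
    ],
)
# Hypothetical example rows, for illustration only.
orders.add_example("Customer A", "Widget", "2013-11-14", 1)
orders.add_example("Customer B", "Gadget", "2013-11-15", 12)
print(orders.render())
```

Keeping example rows next to the column definitions mirrors the idea that the example data is part of the model, not an afterthought.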
Straightforward Methodology
[Figure: numbered methodology steps. Labels: Subject-Verb-Object; Declare Event Type; Who, What, When, Where, How (many), Why, How; Sufficient Detail, Fact Granularity; Initial Data Examples; Quantities - Facts]
CAPTURE EXAMPLE DATA:
SUBJECT [who] | verb OBJECT [what] | on/at/every EVENT DATE [when] | [where] | [how many] | [why] | [how]
Typical | Typical/Popular | Typical | Typical | Typical/Average | Typical/Normal | Typical/Normal
Different | Different | Different | Different | Different | Different | Different
Repeat | Repeat | Repeat | Repeat | Repeat | Repeat | Repeat
Missing | Missing | Missing | Missing | Missing | Missing | Missing
Group | Multiple/Bundle | | | | |
Old, Low | Old, Low Value | Oldest needed | Near | Min, Negative, 0 | |
New, High | New, High | Most Recent, Future | Far | Max, Precision | |

Also shown: Multi-Level, Multiple Values, Exceptional

ENGAGE
CLARIFY DEFINITIONS / CONFORM DIMENSIONS

ILLUSTRATE EXCEPTIONS
“DRIVE OUT UNIQUENESS”
“SHOW AND TELL”
THOUGHTFUL EXAMPLE DATA:

Detailed ETL
Specification
IDENTIFY EVENT TYPE EARLY
ADJUST CONVERSATION BASED ON EVENT TYPE
DISCRETE EVENT -> TRANSACTION
• INSTANTANEOUS/SHORT DURATION, IRREGULARLY OCCURRING EVENTS OR TRANSACTIONS

RECURRING EVENT -> PERIODIC SNAPSHOT – MEASUREMENT
• REGULARLY OCCURRING EVENTS, ONGOING PROCESSES; TYPICALLY USED TO MEASURE THE CUMULATIVE EFFECT OF DISCRETE EVENTS

EVOLVING EVENT -> ACCUMULATING SNAPSHOT – TIMELINE
• NON-INSTANTANEOUS/LONGER DURATION, IRREGULARLY OCCURRING EVENTS OR TRANSACTIONS
• REPRESENTS CURRENT STATUS - REFLECTS ADJUSTMENTS
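
The three event types lead to differently shaped data, which a small sketch can make concrete. The example below is an assumption-laden illustration, not the presenters' method: stock movements stand in for discrete events, a daily stock level for the periodic snapshot, and an order with milestone dates for the accumulating snapshot. All names and rows are hypothetical.

```python
from datetime import date, timedelta

# Discrete events (transaction fact): irregular, one row per occurrence.
MOVEMENTS = [
    (date(2013, 11, 1), +10),
    (date(2013, 11, 1), -3),
    (date(2013, 11, 3), -2),
]


def daily_snapshot(movements, start: date, end: date):
    """Recurring event (periodic snapshot): one row per day, even on days
    with no activity, holding the cumulative effect of the transactions."""
    change_by_day = {}
    for day, qty in movements:
        change_by_day[day] = change_by_day.get(day, 0) + qty
    level, rows, day = 0, [], start
    while day <= end:
        level += change_by_day.get(day, 0)
        rows.append((day, level))
        day += timedelta(days=1)
    return rows


def record_milestone(row: dict, milestone: str, when: date) -> dict:
    """Evolving event (accumulating snapshot): the same row is revisited
    and a milestone date is filled in as the process moves along."""
    updated = dict(row)
    updated[milestone] = when
    return updated


snapshot = daily_snapshot(MOVEMENTS, date(2013, 11, 1), date(2013, 11, 3))

order = {"order_id": "A-1", "ordered": date(2013, 11, 1),
         "shipped": None, "delivered": None}
order = record_milestone(order, "shipped", date(2013, 11, 2))

print(snapshot)
print(order)
```

Note how the snapshot has a row for 2 November although nothing happened that day, while the order row shows only its current status.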

61
CAPTURE WHEN DETAILS
When do Customers order
Products?
“On the Order Date”
BEAM✲
Modeler

BI Users
ANY OTHER WHENS ?
ANY OTHER WHOS ?
AND SO ON...
MODEL HOW MANY MEASURES:
• ADDITIVE – CAN BE SUMMED UP OVER ANY COMBINATION OF DIMENSIONS. NO SPECIAL RULES
• NON-ADDITIVE – CANNOT BE SUMMED OVER ANY DIMENSION, E.G. UNIT PRICE OR TEMPERATURE
  • MUST BE AGGREGATED IN OTHER WAYS, E.G. AVERAGE, MIN, MAX
  • DEGENERATE DIMENSIONS – TRANSACTION #, TIMESTAMPS, FLAGS
• SEMI-ADDITIVE – CANNOT BE SUMMED ACROSS AT LEAST ONE DIMENSION, E.G. BALANCES CANNOT BE
SUMMED OVER TIME
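
A short worked example may help separate the three kinds of measure. In the sketch below, quantity is treated as additive, unit price as non-additive and account balance as semi-additive. The sample rows and variable names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from collections import defaultdict

# Hypothetical order lines: (product, quantity, unit_price).
LINES = [("widget", 2, 10.0), ("widget", 8, 5.0)]

# Hypothetical daily balances: (day, account, balance).
BALANCES = [
    ("2013-11-01", "acct-1", 100.0),
    ("2013-11-01", "acct-2", 50.0),
    ("2013-11-02", "acct-1", 120.0),
    ("2013-11-02", "acct-2", 50.0),
]

# Additive: quantity can simply be summed across any dimension.
total_quantity = sum(qty for _, qty, _ in LINES)

# Non-additive: summing unit prices (10 + 5) is meaningless. Derive an
# additive measure (revenue) first, then compute a weighted average price.
revenue = sum(qty * price for _, qty, price in LINES)
average_unit_price = revenue / total_quantity

# Semi-additive: balances may be summed across accounts for a single day...
balance_by_day = defaultdict(float)
for day, _, balance in BALANCES:
    balance_by_day[day] += balance

# ...but not across days. Over time, take the closing value or an average.
closing_balance = balance_by_day[max(balance_by_day)]
average_daily_balance = sum(balance_by_day.values()) / len(balance_by_day)

print(total_quantity, average_unit_price, closing_balance, average_daily_balance)
```

Summing the four balance rows directly would give 320, a figure that corresponds to nothing in the business, which is the trap the semi-additive label warns about.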

66
MODELING DIMENSIONS:
ANNOTATE WITH TARGETED DATA PROFILING:
PROCEED THROUGH THE BUSINESS PROCESS
VALUE CHAIN:
COLLABORATIVE DIMENSION CONFORMANCE:
IDENTIFY HIERARCHY TYPES:
GRAPHICALLY DEPICT HIERARCHIES:
VISUALIZE THE HIERARCHIES
PAINT THE ORGANIZATION
PROTOTYPE! NOT “DATA MODEL REVIEW”
RECAP:
COLLABORATIVE AND AGILE

• DATA MODELING
• DATA SOURCING
• DATA CONFORMANCE

REQUIREMENTS = DESIGN
• SLOTS DIRECTLY INTO PROVEN AND MATURE DIMENSIONAL DATA WAREHOUSING DESIGN PATTERNS

VALIDATION THROUGH PROTOTYPING
• SEMI-AUTOMATED BUILD OF DIMENSIONAL DATA WAREHOUSE
• PERFECT COMPLEMENT TO AGILE BI TOOLS AND METHODS (E.G. PENTAHO)

76
IF YOU HAVE BEEN AFFECTED BY
ANY OF THE ISSUES RAISED
IN THIS PRESENTATION…
AGILE DATA WAREHOUSE DESIGN
LAWRENCE CORR, JIM STAGNITTO,
DECISION PRESS, NOVEMBER 2011
QUESTIONS/COMMENTS?
CONTACT: JIM STAGNITTO
OR JOHN DIPIETRO

215-789-4816
A2C CORPORATE OVERVIEW &
INDUSTRY EXPERIENCE

80
COMPANY OVERVIEW
• TECHNOLOGY SOLUTION CONSULTANCY HEADQUARTERED IN PHILADELPHIA WITH REGIONAL OFFICES IN NEW YORK AND BOSTON
• SERVICING HEALTHCARE, LIFE SCIENCE, TELECOM AND FINANCIAL SERVICES INDUSTRIES, WITH A RECENTLY OBTAINED GSA SCHEDULE TO PURSUE FEDERAL GOVERNMENT OPPORTUNITIES
• CONSULTANT BASE OF OVER 2500 PROVEN IT PROFESSIONALS THROUGHOUT THE NORTHEAST REGION WITH A RECRUITING NETWORK THAT PROVIDES NATIONAL COVERAGE

81
COMPANY OVERVIEW
• FLEXIBLE APPROACH TO HELPING OUR CLIENTS WITH THEIR INITIATIVES
  • PROJECT-BASED SOLUTIONS
  • STAFF AUGMENTATION
  • MANAGED SERVICE OFFERINGS – “ON-SHORE QA, DEVELOPMENT & APPLICATION SUPPORT”
  • EXECUTIVE & PROFESSIONAL SEARCH

82
a2c’s Recruiting Engine and Methodology
is one of the Best in the Industry…
CAPABLE OF PRODUCING QUALITY RESULTS ON-DEMAND
FOR OUR CLIENTS. RESOURCE MANAGERS CONTINUALLY
“SILO” DISCIPLINES WITH AVAILABLE CANDIDATES WHO
HAVE PROVEN THEIR ABILITIES WITH
A2C OVER THE PAST DECADE. THE
A2C SOLUTIONS ORGANIZATION IS
INSTRUMENTAL IN THE SCREENING
AND SELECTION PROCESS TO ENSURE
THAT CANDIDATES SUBMITTED TO CLIENTS
ARE AN IDEAL MATCH.
THE A2C TEAM
A2C’S CULTURE
PROVIDES AN ABILITY TO
ATTRACT AND RETAIN
THE BEST TALENT IN THE
INDUSTRY AND FOSTERS
CREATIVITY, INTEGRITY,
GROWTH AND
TEAMWORK.
ALTERNATIVE SOLUTIONS…
A2C PROVIDES
CLIENTS WITH AN
ALTERNATIVE
SOLUTION TO A “BIG
4” CONSULTANCY AT
SUBSTANTIAL
SAVINGS FOR
PROJECTS THAT ARE
BETWEEN $500K AND
$5M DUE TO
FLEXIBILITY, AGILITY
AND FOCUS.
A2C SOLUTION ENGAGEMENT STRUCTURES
• TECHNOLOGY STRATEGY & ROADMAP FORMULATION
• NEEDS & READINESS ASSESSMENT
• PACKAGE & PLATFORM SELECTIONS
• PROOF OF CONCEPT IMPLEMENTATION
• REQUIREMENTS DISCOVERY & SPECIFICATIONS
• PROGRAM/PROJECT MANAGEMENT
• FULL LIFE CYCLE & APPLICATION DEVELOPMENT
• INFRASTRUCTURE & FACILITIES INITIATIVES
• MANAGED SERVICES & MAINTENANCE SUPPORT

86
A2C SOLUTIONS CAPABILITIES
• ENTERPRISE DATA MANAGEMENT PRACTICE HELPS CLIENTS MANAGE THEIR COMPLETE INFORMATION LIFECYCLE FROM THEIR ON-LINE TRANSACTIONAL SYSTEMS TO THEIR DATA WAREHOUSING, ENTERPRISE REPORTING, DATA MIGRATION, BACK-UP AND RECOVERY STRATEGIES
• BUSINESS ARCHITECTURE & OPTIMIZATION PRACTICE UTILIZES “SIX SIGMA LEAN” METHODOLOGIES TO ANALYZE, RE-ENGINEER AND AUTOMATE OUR CLIENTS’ BUSINESS PROCESSES TO LEVERAGE HUMAN WORKFLOW AND BUSINESS RULES ENGINE TECHNOLOGIES TO CREATE EFFICIENCIES AND PROVIDE BUSINESS UNIT OWNERS WITH THE NECESSARY METRICS TO CONTINUALLY IMPROVE PERFORMANCE
• PROGRAM MANAGEMENT OFFICE OVERSEES ALL ASPECTS OF SOLUTIONS PLANNING AND DELIVERY ACROSS CLIENT ENGAGEMENT TEAMS AND PROVIDES THE METHODOLOGY AND FRAMEWORKS WHICH ARE BASED ON PMI® INDUSTRY STANDARDS

87
A2C SOLUTIONS CAPABILITIES
• APPLICATION DEVELOPMENT & MANAGED SERVICES PRACTICE HELPS CLIENTS ARCHITECT, IMPLEMENT AND DEPLOY THE LATEST MICROSOFT AND ENTERPRISE JAVA BASED APPLICATIONS WHICH ARE BUILT ON PROVEN FRAMEWORKS AND ARCHITECTURES FOR THE ENTERPRISE
• A2C'S SDLC DELIVERY MODEL COMPRISES OVER 20 YEARS OF COLLECTIVE BEST PRACTICES AND INDUSTRY PROVEN METHODOLOGIES THAT ALLOW OUR DELIVERY TEAMS TO RAPIDLY DESIGN, DEVELOP AND IMPLEMENT SOLUTIONS. OUR SDLC MODEL HAS BEEN DESIGNED TO COMPLEMENT OUR PROJECT MANAGEMENT METHODOLOGY, UTILIZING ITERATIVE DEVELOPMENT CYCLES THAT ENABLE PROJECT TEAMS TO PROVIDE CONSISTENTLY HIGH QUALITY, ON-TIME DELIVERABLES, REGARDLESS OF TECHNOLOGY PLATFORM

88
LET A2C HELP WITH ALL YOUR
BUSINESS SOLUTIONS
CONNECT TO A2C
For further information on Agile Data Warehouse Design, please contact:
John DiPietro, CTO

or Jim Stagnitto, Practice Director of Information Services

a2c.com

a2c Philadelphia
1801 Market Street
Suite 2430
Philadelphia, PA 19103
215-789-4816
contact: Joe Cattie
JCattie@a2c.com

a2c Boston
100 Grandview Road
Suite 215
Braintree, MA 02184
781-848-0005
contact: Scott King
SKing@a2c.com

a2c New York
401 Greenwich Street
3rd Floor
New York, NY 10013
212-913-0933
contact: John DiPietro
JDiPietro@a2c.com

Weitere ähnliche Inhalte

Was ist angesagt?

Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingAnalyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingDenodo
 
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...DATAVERSITY
 
Agile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingAgile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingDaniel Upton
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesDATAVERSITY
 
IDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsIDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsDATAVERSITY
 
Data Wearhouse (Dw) concepts
Data Wearhouse (Dw)  conceptsData Wearhouse (Dw)  concepts
Data Wearhouse (Dw) conceptsBeing Topper
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lakeCapgemini
 
ADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonDATAVERSITY
 
Snowflake: The Good, the Bad and the Ugly
Snowflake: The Good, the Bad and the UglySnowflake: The Good, the Bad and the Ugly
Snowflake: The Good, the Bad and the UglySamanthaBerlant
 
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...DATAVERSITY
 
Business Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureBusiness Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureInfosys
 
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov Oslo
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov OsloCore banking Closure bank day OSWA meetup 2018-Alexander Petrov Oslo
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov OsloAlexander Petrov
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
Bi presentation to bkk
Bi presentation to bkkBi presentation to bkk
Bi presentation to bkkguest4e975e2
 
Big Data Analytics Webinar
Big Data Analytics WebinarBig Data Analytics Webinar
Big Data Analytics WebinarEckerson Group
 
Slides: Moving from a Relational Model to NoSQL
Slides: Moving from a Relational Model to NoSQLSlides: Moving from a Relational Model to NoSQL
Slides: Moving from a Relational Model to NoSQLDATAVERSITY
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringDATAVERSITY
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprisekayalvizhi kandasamy
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationVishal Kumar
 

Was ist angesagt? (20)

Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingAnalyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
 
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
Slides: Proven Strategies for Hybrid Cloud Computing with Mainframes — From A...
 
Agile BI via Data Vault and Modelstorming
Agile BI via Data Vault and ModelstormingAgile BI via Data Vault and Modelstorming
Agile BI via Data Vault and Modelstorming
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data Lakes
 
IDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsIDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data Environments
 
Data Wearhouse (Dw) concepts
Data Wearhouse (Dw)  conceptsData Wearhouse (Dw)  concepts
Data Wearhouse (Dw) concepts
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lake
 
ADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and Comparison
 
Snowflake: The Good, the Bad and the Ugly
Snowflake: The Good, the Bad and the UglySnowflake: The Good, the Bad and the Ugly
Snowflake: The Good, the Bad and the Ugly
 
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
 
Business Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureBusiness Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows Azure
 
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov Oslo
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov OsloCore banking Closure bank day OSWA meetup 2018-Alexander Petrov Oslo
Core banking Closure bank day OSWA meetup 2018-Alexander Petrov Oslo
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
Bi presentation to bkk
Bi presentation to bkkBi presentation to bkk
Bi presentation to bkk
 
Big Data Analytics Webinar
Big Data Analytics WebinarBig Data Analytics Webinar
Big Data Analytics Webinar
 
Slides: Moving from a Relational Model to NoSQL
Slides: Moving from a Relational Model to NoSQLSlides: Moving from a Relational Model to NoSQL
Slides: Moving from a Relational Model to NoSQL
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern Engineering
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprise
 
Data vault modeling et retour d'expérience
Data vault modeling et retour d'expérienceData vault modeling et retour d'expérience
Data vault modeling et retour d'expérience
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data Presentation
 

Andere mochten auch

SOA with Data Virtualization (session 4 from Packed Lunch Webinar Series)
SOA with Data Virtualization (session 4 from Packed Lunch Webinar Series)SOA with Data Virtualization (session 4 from Packed Lunch Webinar Series)
SOA with Data Virtualization (session 4 from Packed Lunch Webinar Series)Denodo
 
Microsoft for BI and DW: Using the Right Tool for the Job
Microsoft for BI and DW: Using the Right Tool for the JobMicrosoft for BI and DW: Using the Right Tool for the Job
Microsoft for BI and DW: Using the Right Tool for the JobSenturus
 
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...The Hive
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Denodo
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl conceptsjeshocarme
 
Bi Dw Presentation
Bi Dw PresentationBi Dw Presentation
Bi Dw Presentationvickyc
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouseKomal Choudhary
 
Bhawani prasad data integration-ppt
Bhawani prasad data integration-pptBhawani prasad data integration-ppt
Bhawani prasad data integration-pptBhawani N Prasad
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation RoadmapDavid Walker
 
Microsoft Data Warehouse Business Intelligence Lifecycle - The Kimball Approach
Microsoft Data Warehouse Business Intelligence Lifecycle - The Kimball ApproachMicrosoft Data Warehouse Business Intelligence Lifecycle - The Kimball Approach
Microsoft Data Warehouse Business Intelligence Lifecycle - The Kimball ApproachMark Ginnebaugh
 
Hadoop and Data Virtualization - A Case Study by VHA
Hadoop and Data Virtualization - A Case Study by VHAHadoop and Data Virtualization - A Case Study by VHA
Hadoop and Data Virtualization - A Case Study by VHAHortonworks
 

a2c Boston Big Data Meet-up: Agile Data Warehouse Design

  • 1. AGILE DATA WAREHOUSE DESIGN WITH BIG DATA John DiPietro & Jim Stagnitto 1
  • 2. AGENDA • INTRODUCTION / A2C OVERVIEW • MODELING FOR END USERS • ROLE OF DIMENSIONAL MODELS IN BIG DATA • EXAMPLE: E-COMMERCE • STRUCTURED DATA: SALES • SEMI-STRUCTURED DATA: CLICKSTREAM • AGILE DIMENSIONAL MODELING OVERVIEW • CASE STUDY REVIEW • Q&A 2
  • 3. INTRODUCTION A2C • BOUTIQUE EDM (ENTERPRISE DATA MANAGEMENT) CONSULTANCY FIRM: • DATA WAREHOUSING • MASTER DATA MANAGEMENT • CLOSED LOOP ANALYTICS AND VISUALIZATION • DATA & APPLICATION ARCHITECTURE JOHN DIPIETRO • PRINCIPAL, CHIEF TECHNOLOGY OFFICER JIM STAGNITTO • DATA WAREHOUSE & MDM ARCHITECT
  • 4. ON THURSDAY 11/14 A2C’S JIM STAGNITTO AND JOHN DIPIETRO PRESENTED A WORKSHOP… FEATURING AGILE DATA WAREHOUSE DESIGN - A STEP-BY-STEP METHOD FOR DATA WAREHOUSING / BUSINESS INTELLIGENCE (DW/BI) PROFESSIONALS TO BETTER COLLECT AND TRANSLATE BUSINESS INTELLIGENCE REQUIREMENTS INTO SUCCESSFUL DIMENSIONAL DATA WAREHOUSE DESIGNS. THE METHOD UTILIZES BEAM✲ (BUSINESS EVENT ANALYSIS AND MODELING) - AN AGILE APPROACH TO DIMENSIONAL DATA MODELING THAT CAN BE USED THROUGHOUT ANALYSIS AND DESIGN TO IMPROVE PRODUCTIVITY AND COMMUNICATION BETWEEN DW DESIGNERS AND BI STAKEHOLDERS. SPONSORED BY MICROSOFT NERD (NEW ENGLAND RESEARCH AND DEVELOPMENT CENTER) AND ATTENDED BY 93 DATA SCIENTISTS…
  • 5. COMPETITIVE ADVANTAGE CEO, Craig Spitzer Pres., Scott King CTO, John DiPietro CRO, Brian Cassidy Managing Sales Dir., Joe Cattie The founders of a2c were part of the fastest growing privately held IT consulting and staff augmentation firm in the U.S. from 1994-2002. Our Executive Management Team has over 100 years of collective experience and has been responsible for delivering over a half billion dollars of IT Consulting and staff augmentation revenue from 1994 through the present day. a2c Top Twenty Most Promising Data Analytics November 2013 Alliance Consulting, Inc. 1999, 2000, 2001 CEO, Alliance Consulting Group, Craig Spitzer 2001
  • 6. AGILE DW DESIGN OVERVIEW 6
  • 7. MODELING FOR END USERS: HOW TO DESIGN TO ANSWER BUSINESS QUESTIONS? • THINK ABOUT HOW QUESTIONS ARE ARTICULATED AND HOW THE ANSWERS SHOULD BE DELIVERED • IDENTIFY A COMMON QUESTION FRAMEWORK • DESIGN AN ARCHITECTURE THAT EMBRACES AND LEVERAGES THIS COMMON QUESTION FRAMEWORK • UTILIZE THE BEST DESIGNS AND TECHNOLOGIES TO: (A) DERIVE THE ANSWERS (B) PRESENT THEM IN COMPELLING WAYS THAT LEAD TO THE NEXT INTERESTING QUESTION!
  • 8. HOW DO WE ASK QUESTIONS? “HOW DO THIS QUARTER'S SALES BY SALES REP OF ELECTRONIC PRODUCTS THAT WE PROMOTED TO RETAIL CUSTOMERS IN THE EAST COMPARE WITH LAST YEAR'S?” When: this quarter's, last year's. What: sales, electronic products. Who: sales rep, retail customers. Why: promoted. Where: the East.
  • 9. HOW DO WE ASK QUESTIONS? EVENTS / TRANSACTIONS: • E.G. SALE • AN IMMUTABLE “FACT” THAT OCCURS IN A TIME AND (TYPICALLY A) PLACE INTERROGATIVES: • WHO, WHAT, WHEN, WHERE, WHY • DESCRIPTIVE CONTEXT THAT FULLY DESCRIBES THE EVENT • A SET OF “DIMENSIONS” THAT DESCRIBE EVENTS
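As a minimal sketch of the event-plus-interrogatives idea, the business question on slide 8 can be answered from a list of sale events that each carry their own who/what/when/where/why context. The class name, field names, and sample values below are hypothetical and are not taken from the deck.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event is an immutable fact
class SaleEvent:
    sales_rep: str          # who (hypothetical field)
    customer_type: str      # who
    product_category: str   # what
    year: int               # when
    quarter: int            # when
    region: str             # where
    promoted: bool          # why
    amount: float           # how much (the measure)


def sales_by_rep(events, year, quarter):
    """Total promoted electronics sales to retail customers in the East, per rep."""
    totals = defaultdict(float)
    for e in events:
        if (e.year == year and e.quarter == quarter
                and e.product_category == "Electronics"
                and e.customer_type == "Retail"
                and e.region == "East"
                and e.promoted):
            totals[e.sales_rep] += e.amount
    return dict(totals)


# Made-up sample events, for illustration only.
events = [
    SaleEvent("Ana", "Retail", "Electronics", 2013, 4, "East", True, 120.0),
    SaleEvent("Ana", "Retail", "Electronics", 2012, 4, "East", True, 100.0),
    SaleEvent("Ben", "Retail", "Electronics", 2013, 4, "East", True, 80.0),
    SaleEvent("Ben", "Wholesale", "Electronics", 2013, 4, "East", True, 500.0),
    SaleEvent("Ana", "Retail", "Furniture", 2013, 4, "East", True, 60.0),
]

this_year = sales_by_rep(events, 2013, 4)
last_year = sales_by_rep(events, 2012, 4)
```

Every filter in `sales_by_rep` corresponds to one interrogative in the question, which is the sense in which the dimensions "fully describe" the event.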
  • 10. DIMENSIONAL VALUE PROPOSITION • IT MAKES SENSE TO PRESENT ANSWERS TO PEOPLE USING THE SAME TAXONOMY OF EVENTS AND INTERROGATIVES (AKA: FACTS AND DIMENSIONS - DIMENSIONAL STRUCTURE) THAT THEY USE WHEN FORMING QUESTIONS; • EVENTS ARE INSTANCES OF PROCESSES; • IT'S BEST TO PRESENT INFORMATION TO PEOPLE WHO WILL ASK THE SYSTEM QUESTIONS IN DIMENSIONAL FORM; • THIS IS TRUE REGARDLESS OF THE TYPE OF INFORMATION BEING INTERROGATED, ITS SOURCE, OR IT STUFF (LIKE DATABASE TECHNOLOGIES UTILIZED); • IT'S BEST TO MODEL THIS PRESENTATION LAYER BASED ON THE EVENTS (AKA: BUSINESS PROCESSES) THAT UNDERLIE THE QUESTIONS.
  • 12. SCENARIOS: A BRIEF DISCUSSION OF HOW AND WHERE DIMENSIONAL MODELING AND/OR DATABASES FIT WITHIN COMMON AND EMERGING “BIG DATA” DATA WAREHOUSING ARCHITECTURES 12
  • 13. KIMBALL DIMENSIONAL DW Dimensional BI Semantic Layer Dimensional Data Warehouse Data Movement / Integration Source Data (Structured) 13
  • 14. KIMBALL WITH BIG DATA Dimensional BI Semantic Layer Dimensional Data Warehouse Big Data Capture (e.g. HDFS) Big Data Discovery (e.g. MR) Data Movement / Integration Tier Data Movement / Integration Tier Source Data Tier Source Data Tier (Un/Semi-Structured) (Structured) 14
  • 15. CORPORATE INFORMATION FACTORY (CIF) Dimensional BI Semantic Layer Dimensional Tier (Virtual or Physical) Corporate Information Factory 3NF DW Data Movement / Integration Source Data (Structured) 15
  • 16. CIF WITH BIG DATA Dimensional BI Semantic Layer Dimensional Tier (Virtual or Physical) Big Data Capture (e.g. HDFS) Big Data Discovery Corporate Information Factory 3NF DW (e.g. MR) Data Movement / Integration Tier Data Movement / Integration Tier Source Data Tier Source Data Tier (Un/Semi-Structured) (Structured) 16
  • 17. DATA VAULT Dimensional BI Semantic Layer Dimensional Tier (Virtual or Physical) Data Vault Data Movement / Integration Source Data (Structured) 17
  • 18. DATA VAULT WITH BIG DATA Dimensional BI Semantic Layer Dimensional Tier (Virtual or Physical) Big Data Capture (e.g. HDFS) Big Data Discovery Data Vault (e.g. MR) Data Movement / Integration Tier Data Movement / Integration Tier Source Data Tier Source Data Tier (Un/Semi-Structured) (Structured) 18
  • 19. COMMON FRAMEWORK Dimensional BI Semantic Layer; Dimensional Tier [Physical (Kimball) or Virtual (CIF or Data Vault)]; Persistent Un/Semi-Structured Staging Area; Unstructured -> Structured Data Discovery Processing; Persistent Structured Data Repository (not needed for Kimball); Un/Semi-Structured Data Movement; Structured Data Movement; Un/Semi-Structured Source Data; Structured Source Data; Insight Generation / Data Mining
  • 20. COMMON FRAMEWORK Dining Room: Readily Accessible to End Users (and BI Developers); Safe, Hospitable Environment; Data Assets “Ready for Primetime”; Dimensionally Structured. Dining Room layers: Dimensional BI Semantic Layer; Dimensional Tier [Physical (Kimball) or Virtual (CIF or Data Vault)]. Kitchen: Off Limits to End Users; Data Professionals Only Please; Dangerous / Inhospitable Environment; Data Assets “Not Ready for Primetime”; Structured Variably For Data Processing. Kitchen layers: Persistent Un/Semi-Structured Staging Area; Unstructured -> Structured Data Discovery Processing; Persistent Structured Data Repository (not needed for Kimball); Un/Semi-Structured Data Movement; Structured Data Movement; Un/Semi-Structured Source Data; Structured Source Data. eCommerce Example: Clickstream Data; eCommerce Sale
  • 21. E-COMMERCE EXAMPLE: CLICKSTREAM Semi-Structured. Recording of every page request made by a user. Includes some structural elements – such as when the request was made and who the user is. Requires significant prep work in order to fit into a traditional row-based relational database. Apples and Oranges: Pre-Sessionized Page Visits, Detailed Product Views, Catalogue Requests, Shopping Cart Adds / Deletes / Abandons, etc. Needs to be converted into separate-but-relatable dimensional facts - with many shared (conformed) dimensions. [Figure: Raw Clickstream Data, shown as rows of space-separated integers]
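One piece of the "significant prep work" is sessionizing raw page requests before they can be loaded as page-visit facts. The sketch below is one common way to do that, not the presenters' method. It assumes each request is a `(user, timestamp, url)` tuple and that a session ends after 30 minutes of inactivity; both the record layout and the cutoff are assumptions made for illustration.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cutoff


def sessionize(requests):
    """Group (user, timestamp, url) page requests into per-user sessions."""
    sessions = []
    last_seen = {}      # user -> timestamp of that user's latest request
    open_session = {}   # user -> index in `sessions` of the user's open session
    for user, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        prev = last_seen.get(user)
        if prev is None or ts - prev > SESSION_TIMEOUT:
            # First request for this user, or the gap is too long: new session.
            sessions.append({"user": user, "start": ts, "end": ts, "pages": []})
            open_session[user] = len(sessions) - 1
        session = sessions[open_session[user]]
        session["pages"].append(url)
        session["end"] = ts
        last_seen[user] = ts
    return sessions


# Made-up requests, for illustration only.
t0 = datetime(2013, 11, 14, 9, 0)
requests = [
    ("u1", t0, "/home"),
    ("u1", t0 + timedelta(minutes=5), "/product/42"),
    ("u2", t0 + timedelta(minutes=1), "/home"),
    ("u1", t0 + timedelta(hours=2), "/cart"),
]
sessions = sessionize(requests)
```

Each resulting session supplies the session start and end times that a page-visit fact would reference, alongside the individual view times.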
  • 22. TYPICAL CLICKSTREAM “PAGE VIEW” DIMENSIONAL MODEL What When What Who Why 22
  • 23. E-COMMERCE EXAMPLE: WEB SALES • FULLY STRUCTURED • THE SALE TRANSACTION TYPICALLY CARRIES ALL FUNDAMENTAL DIMENSIONS: • TIME • CUSTOMER • REFERRING URL / SEARCH PHRASE • PRODUCT • PURCHASE AND/OR SHIPMENT (GEO OR URL) LOCATIONS • PROMOTION / CAMPAIGN • ETC. • AND “HOW MANY” MEASURES: • UNIT AND PRICE QUANTITIES / AMOUNTS • DISCOUNT AMOUNTS • ETC.
  • 24. E-COMMERCE DIMENSIONALITY Facts (below) & Dimensions (right) Time (When) Page Visit View Start View End Session Start Session End Customer (Who) Web Page (Where) Visitor Current Previous Next Detailed Product View View Start View End Session Start Session End Prospect Current Previous Next Shopping Cart Activity Activity Start Activity End Sale (Checkout) Shipment / Delivery Product (What) Referring URL (Where) Promotion / Campaign (Why) Activity Type (How) ✔︎ ✔︎ ✔︎ Prospect ✔︎ ✔︎ ✔︎ ✔︎ Sale Start Sale End Customer ✔︎ ✔︎ ✔︎ ✔︎ Shipment Delivery Customer Delivery Recipient ✔︎
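The fact-by-dimension grid on slide 24 can be held as a small lookup structure. The sketch below includes only the Time, Customer, and Web Page roles that the slide spells out by name; the check-marked columns (Product, Referring URL, Promotion / Campaign, Activity Type) are left out because their placement cannot be read from this transcript. The function names are hypothetical.

```python
# Fact -> dimension -> roles that dimension plays in the fact.
bus_matrix = {
    "Page Visit": {
        "Time": ["View Start", "View End", "Session Start", "Session End"],
        "Customer": ["Visitor"],
        "Web Page": ["Current", "Previous", "Next"],
    },
    "Detailed Product View": {
        "Time": ["View Start", "View End", "Session Start", "Session End"],
        "Customer": ["Prospect"],
        "Web Page": ["Current", "Previous", "Next"],
    },
    "Shopping Cart Activity": {
        "Time": ["Activity Start", "Activity End"],
        "Customer": ["Prospect"],
    },
    "Sale (Checkout)": {
        "Time": ["Sale Start", "Sale End"],
        "Customer": ["Customer"],
    },
    "Shipment / Delivery": {
        "Time": ["Shipment", "Delivery"],
        "Customer": ["Customer", "Delivery Recipient"],
    },
}


def dimensions_shared_by_all(matrix):
    """Dimensions that every fact in the matrix uses."""
    shared = None
    for dims in matrix.values():
        shared = set(dims) if shared is None else shared & set(dims)
    return sorted(shared or [])


def facts_using(matrix, dimension):
    """Facts that reference the given dimension, in matrix order."""
    return [fact for fact, dims in matrix.items() if dimension in dims]


shared = dimensions_shared_by_all(bus_matrix)
web_page_facts = facts_using(bus_matrix, "Web Page")
```

Dimensions that appear across several facts are the candidates for conforming, since they are what lets separate facts be related to one another.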
  • 25. AGILE DW DESIGN OVERVIEW 25
  • 26. THE FIRST DIMENSIONAL MODELER: R.K. Ralph Kimball? Rudyard Kipling 26
  • 27. “I keep six honest serving-men (They taught me all I knew); Their names are What and Why and When And How and Where and Who…” –Rudyard Kipling 27
  • 30. HOW DID WE GET HERE?
  • 31. DW ARCHITECTURES: A BRIEF HISTORY Corporate Information Factory Undisciplined Dimensional Dimensional Bus Architecture Data-Driven Analysis Report-Driven Analysis Process-Driven Analysis
  • 32. 7WS DIMENSIONAL MODEL When: Time (Day, Month, Fiscal Period). Who: Customer, Employee, Third Party, Organization. What: Product, Service. Where: Location (Geographic, Store, Ship To, Hospital). Why: Causal (Promotion, Reason, Weather, Competition). How – Facts: How Much, How Many, How Often (£$€). ??: Transactions.
  • 34. TO DOWNLOAD WITH AUDIO WORKSHOP FILE: PLEASE COMPLETE THE FOLLOWING REQUEST FORM FOR FREE LINK TO AGILE DATA WAREHOUSE DESIGN PRESENTATION. REVIEWS: “EXCELLENT PRESENTATION. IT IS GOOD TO HEAR MEANINGFUL …INFORMATION ABOUT NEW DEVELOPMENTS IN HOW AGILE METHODOLOGIES CAN BE APPLIED TO DW/BI WORK. BIG KUDOS TO THE PRESENTERS AND ORGANIZERS. THANKS, I FOUND IT VERY USEFUL AND ENJOYABLE.”- RAMON VENEGAS “EXTREMELY USEFUL TO UNDERSTAND HOW TO APPLY AGILE APPROACH TO DWH; HOW CREATE A FRAMEWORK WHERE MODEL CHANGES ARE WELCOME, AND BRING USERS TO THE PROCESS OF DWH MODELING.” – ALFREDO GOMEZ 34
  • 35. HOW do you design a data warehouse?
  • 37. OK, NOW VALIDATE WITH BUSINESS…
  • 39. WATERFALL BI/DW DEVELOPMENT Limited Stakeholder Interaction Analysis Design Development This Year Stakeholder Input BDUF Requirements Data Model Next Year Test Release ETL BI DATA VALUE?
  • 40. AGILE DW/BI DEVELOPMENT Stakeholder interaction ? JEDUF BI Prototyping ETL Review Release This Year Next Year Iteration 1 VALUE? Iteration 2 ETL BI Iteration 3Rev ADM VALUE Iteration … VALUE! DATA Iteration n VALUE! VALUE!
  • 41. STATE OF THE DW FIELD • SOLID: DIMENSIONAL DATA WAREHOUSE DESIGN IS MATURE • PROVEN DESIGN PATTERNS EXIST FOR COMMON REQUIREMENTS • HIT OR MISS: COLLECTING UNAMBIGUOUS AND THOROUGH REQUIREMENTS • SLOTTING REQUIREMENTS INTO PROVEN DESIGN PATTERNS • END-USER OWNERSHIP AND VALIDATION • TOO OFTEN: SNATCHING DEFEAT FROM THE JAWS OF VICTORY
  • 43. BEAM✲ METHODOLOGY Structured, non-technical, collaborative working conversation directly with BI Users BEAM✲ • BI User’s Business Process, Organizational, Hierarchical, and Data Knowledge • Focused Data Profiling Data Modeler BI Stakeholders • Logical and Physical (Kimball-esque) Dimensional Data Models • Example data • Detailed and Testable ETL Specification • Instantiated DW Prototype
  • 46. AGILE DATA MODELING REQUIREMENTS: • TECHNIQUES FOR ENCOURAGING INTERACTION • MUST USE SIMPLE, INCLUSIVE NOTATION AND TOOLS • MUST BE QUICK: HOURS RATHER THAN DAYS – MODELSTORMING • BALANCE ‘JUST IN TIME’ (JIT) AND ‘JUST ENOUGH DESIGN UP FRONT’ (JEDUF) TO REDUCE DESIGN REWORK • DW DESIGNERS MUST EMBRACE DATA MODEL CHANGE, ALLOW MODELS TO EVOLVE, AVOID GENERIC DATA MODELS; NEED DESIGN PATTERNS THEY CAN TRUST TO REPRESENT TOMORROW’S BI REQUIREMENTS TOMORROW • ETL AND BI DEVELOPERS MUST EMBRACE DATABASE CHANGE; NEED TOOL SUPPORT
  • 49. CALENDAR: Date Key, Date, Day, Day in Week, Day in Month, Day in Qtr, Day in Year, Month, Qtr, Year, Weekday Flag, Holiday Flag • PRODUCT: Product Key, Product Code, Product Description, Product Type, Brand, Subcategory, Category • SALES FACT: Date Key, Product Key, Store Key, Promotion Key, Quantity Sold, Revenue, Cost, Basket Count • STORE: Store Key, Store Code, Store Name, URL, Store Manager, Region, Country • PROMOTION: Promotion Key, Promotion Code, Promotion Name, Promotion Type, Discount Type, Ad Type
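The star schema on slide 49 can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the table and column names follow the slide (with a subset of the calendar attributes), while the data types, the sample rows, and the function names `build_sample_warehouse` and `revenue_by_subcategory_and_region` are assumptions made for the example.

```python
import sqlite3

# Column names follow the slide; data types are assumptions.
DDL = """
CREATE TABLE calendar (
    date_key INTEGER PRIMARY KEY, date TEXT, month TEXT, qtr TEXT,
    year INTEGER, weekday_flag TEXT, holiday_flag TEXT);
CREATE TABLE product (
    product_key INTEGER PRIMARY KEY, product_code TEXT, product_description TEXT,
    product_type TEXT, brand TEXT, subcategory TEXT, category TEXT);
CREATE TABLE store (
    store_key INTEGER PRIMARY KEY, store_code TEXT, store_name TEXT, url TEXT,
    store_manager TEXT, region TEXT, country TEXT);
CREATE TABLE promotion (
    promotion_key INTEGER PRIMARY KEY, promotion_code TEXT, promotion_name TEXT,
    promotion_type TEXT, discount_type TEXT, ad_type TEXT);
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES calendar(date_key),
    product_key INTEGER REFERENCES product(product_key),
    store_key INTEGER REFERENCES store(store_key),
    promotion_key INTEGER REFERENCES promotion(promotion_key),
    quantity_sold INTEGER, revenue REAL, cost REAL, basket_count INTEGER);
"""


def build_sample_warehouse():
    """Create the star schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.execute(
        "INSERT INTO calendar VALUES "
        "(20131114, '2013-11-14', 'November', 'Q4', 2013, 'Y', 'N')")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, "P-001", "Espresso beans 1kg", "Grocery", "Acme", "Coffee", "Beverages"),
        (2, "P-002", "Green tea 50 bags", "Grocery", "Acme", "Tea", "Beverages"),
    ])
    conn.executemany("INSERT INTO store VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, "S-01", "Web Store", "https://shop.example.com", "A. Manager", "Online", "US"),
        (2, "S-02", "Main Street", None, "B. Manager", "North East", "US"),
    ])
    conn.execute(
        "INSERT INTO promotion VALUES "
        "(1, 'NONE', 'No promotion', 'None', 'None', 'None')")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
        (20131114, 1, 1, 1, 2, 30.0, 18.0, 1),
        (20131114, 2, 1, 1, 1, 5.0, 2.0, 1),
        (20131114, 1, 2, 1, 1, 15.0, 9.0, 1),
    ])
    return conn


def revenue_by_subcategory_and_region(conn):
    """Join the fact table to two dimensions and sum an additive measure."""
    return conn.execute("""
        SELECT p.subcategory, s.region, SUM(f.revenue)
        FROM sales_fact AS f
        JOIN product AS p ON p.product_key = f.product_key
        JOIN store AS s ON s.store_key = f.store_key
        GROUP BY p.subcategory, s.region
        ORDER BY p.subcategory, s.region
    """).fetchall()


if __name__ == "__main__":
    for row in revenue_by_subcategory_and_region(build_sample_warehouse()):
        print(row)
```

Every query against the model has the same shape: join the fact table to whichever dimensions supply the grouping attributes, then aggregate the measures.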
  • 54. COLLABORATIVE / CONVERSATIONAL DESIGN • BEAM✲ Modeler: “Who does what?” • BI Users: “Customers buy products” (Subjects, Verb, Objects)
  • 55. DESIGN USING NATURAL LANGUAGE • VERBS – EVENTS – RELATIONSHIPS – FACT TABLES • NOUNS – DETAILS – ENTITIES – DIMENSIONS • MAIN CLAUSE – SUBJECT-VERB-OBJECT • PREPOSITIONS – CONNECT ADDITIONAL DETAILS TO THE MAIN CLAUSE • INTERROGATIVES – THE 7WS – DIMENSION TYPES • BUSINESS VOCABULARY - NO “IT-SPEAK”
  • 56. “Spreadsheet”-like Models • Event Table Name (filled in later) • Subject Column Name • Verb • Object Column Name • Interrogative Details • Example Data (4-6 rows)
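A spreadsheet-like event table of this kind can be sketched as a small data structure. This is a minimal illustration, not the BEAM✲ tooling itself: the class names `Detail` and `EventTable`, the `lead_in` field, and the example rows are all hypothetical, and the "Customers order Products on the Order Date" event is borrowed from the slides.

```python
from dataclasses import dataclass, field

SEVEN_WS = ("who", "what", "when", "where", "how many", "why", "how")


@dataclass
class Detail:
    """One column of the event table, named in business vocabulary."""
    name: str            # e.g. "CUSTOMER"
    interrogative: str   # which of the 7Ws the column answers
    lead_in: str = ""    # verb or preposition placed before the column name

    def __post_init__(self):
        if self.interrogative not in SEVEN_WS:
            raise ValueError(f"unknown interrogative: {self.interrogative}")


@dataclass
class EventTable:
    details: list
    name: str = ""  # the table name is filled in later
    examples: list = field(default_factory=list)

    def story(self):
        """Read the column headers back as one sentence."""
        words = []
        for detail in self.details:
            words.extend(w for w in (detail.lead_in, detail.name) if w)
        return " ".join(words)

    def add_example(self, *values):
        """Append one example row; None marks a missing value."""
        if len(values) != len(self.details):
            raise ValueError("one value per column is required")
        self.examples.append(values)


orders = EventTable([
    Detail("CUSTOMER", "who"),
    Detail("PRODUCT", "what", lead_in="orders"),
    Detail("ORDER DATE", "when", lead_in="on"),
])
orders.add_example("J. Smith", "Espresso beans", "2013-11-14")  # typical
orders.add_example("A. Jones", "Green tea", "2013-11-15")       # different
orders.add_example("J. Smith", "Espresso beans", "2013-11-20")  # repeat
orders.add_example("A. Jones", None, "2013-11-21")              # missing

if __name__ == "__main__":
    print(orders.story())
    for row in orders.examples:
        print(row)
```

Reading the headers back as a sentence is a quick check that the table still says what the BI users said.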
  • 57. Straightforward Methodology [Numbered-step diagram. Labels: Subject-Verb-Object; Initial Data Examples; Declare Event Type; Who, What, When, Where, How (many), Why, How; Sufficient Detail; Fact Granularity; Quantities - Facts]
  • 58. CAPTURE EXAMPLE DATA: SUBJECT [who] verb OBJECT [what] on/at/every EVENT DATE [when], followed by [where], [how many], [why], [how] • Typical: Typical [who]; Typical/Popular [what]; Typical [when]; Typical [where]; Typical/Average [how many]; Typical/Normal [why]; Typical/Normal [how] • Different: in every column • Repeat: in every column • Missing: in every column • Group [who]; Multiple/Bundle [what] • Old, Low [who]; Old, Low Value [what]; Oldest needed [when]; Near [where]; Min, Negative, 0 [how many] • New, High [who]; New, High [what]; Most Recent, Future [when]; Far [where]; Max, Precision [how many] • Multi-Level • Multiple Values • Exceptional • ENGAGE • CLARIFY DEFINITIONS / CONFORM DIMENSIONS • ILLUSTRATE EXCEPTIONS • “DRIVE OUT UNIQUENESS” • “SHOW AND TELL”
  • 61. ADJUST CONVERSATION BASED ON EVENT TYPE • DISCRETE EVENT -> TRANSACTION: INSTANTANEOUS/SHORT DURATION, IRREGULARLY OCCURRING EVENTS OR TRANSACTIONS • RECURRING EVENT -> PERIODIC SNAPSHOT – MEASUREMENT: REGULARLY OCCURRING EVENTS, ONGOING PROCESSES; TYPICALLY USED TO MEASURE THE CUMULATIVE EFFECT OF DISCRETE EVENTS • EVOLVING EVENT -> ACCUMULATING SNAPSHOT – TIMELINE: NON-INSTANTANEOUS/LONGER DURATION, IRREGULARLY OCCURRING EVENTS OR TRANSACTIONS; REPRESENTS CURRENT STATUS - REFLECTS ADJUSTMENTS
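The relationship between discrete and recurring events can be sketched in a few lines: a periodic snapshot measures the cumulative effect of the transactions that precede it. The account movements below and the function name `periodic_snapshot` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical transaction fact rows: (month, account, amount).
transactions = [
    ("2013-09", "A", 100),
    ("2013-09", "A", -30),
    ("2013-10", "B", 40),
    ("2013-10", "A", 50),
    ("2013-11", "A", -20),
]


def periodic_snapshot(rows):
    """Roll discrete transactions up into month-end balances per account.

    Returns (month, account, balance) rows ordered by account, then month.
    A month with no transactions for an account produces no row in this
    simplified version.
    """
    net_change = defaultdict(int)
    for month, account, amount in rows:
        net_change[(account, month)] += amount

    snapshot = []
    running = defaultdict(int)
    # ISO-style "YYYY-MM" strings sort chronologically.
    for account, month in sorted(net_change):
        running[account] += net_change[(account, month)]
        snapshot.append((month, account, running[account]))
    return snapshot


if __name__ == "__main__":
    for row in periodic_snapshot(transactions):
        print(row)
```

A production snapshot would normally also carry forward the balance for periods without activity, which this sketch omits to stay short.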
  • 62. CAPTURE WHEN DETAILS • BEAM✲ Modeler: “When do Customers order Products?” • BI Users: “On the Order Date”
  • 66. MODEL HOW MANY MEASURES: • ADDITIVE – CAN BE SUMMED UP OVER ANY COMBINATION OF DIMENSIONS. NO SPECIAL RULES • NON-ADDITIVE – CANNOT BE SUMMED OVER ANY DIMENSION, E.G. UNIT PRICE OR TEMPERATURE; MUST BE AGGREGATED IN OTHER WAYS, E.G. AVERAGE, MIN, MAX; DEGENERATE DIMENSIONS – TRANSACTION #, TIMESTAMPS, FLAGS • SEMI-ADDITIVE – CANNOT BE SUMMED ACROSS AT LEAST ONE DIMENSION, E.G. BALANCES CANNOT BE SUMMED OVER TIME
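The three kinds of measure can be illustrated with plain Python. The sample rows and function names below are assumptions made for the example; only the additivity rules come from the slide.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales rows: (store, quantity_sold, unit_price).
sales = [("Boston", 2, 10.0), ("Boston", 1, 20.0), ("New York", 3, 10.0)]

# Hypothetical month-end balances: (month, account, balance).
balances = [
    ("2013-10", "A", 100.0), ("2013-10", "B", 50.0),
    ("2013-11", "A", 120.0), ("2013-11", "B", 30.0),
]


def total_quantity(rows):
    """Additive: quantity can be summed over any combination of dimensions."""
    return sum(quantity for _, quantity, _ in rows)


def unit_price_summary(rows):
    """Non-additive: unit price is averaged or ranged, never summed."""
    prices = [price for _, _, price in rows]
    return {"average": mean(prices), "min": min(prices), "max": max(prices)}


def balance_by_month(rows):
    """Semi-additive: balances may be summed across accounts within a month."""
    totals = defaultdict(float)
    for month, _, balance in rows:
        totals[month] += balance
    return dict(totals)


def closing_balance(rows):
    """Across time, take the latest month's total instead of summing months."""
    totals = balance_by_month(rows)
    return totals[max(totals)]


if __name__ == "__main__":
    print(total_quantity(sales))
    print(unit_price_summary(sales))
    print(balance_by_month(balances))
    print(closing_balance(balances))
```

Summing `balance_by_month` over both months would give 300.0, a figure that corresponds to nothing in the business, which is why the time dimension needs its own rule.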
  • 68. ANNOTATE W TARGETED DATA PROFILING:
  • 69. PROCEED THROUGH THE BUSINESS PROCESS VALUE CHAIN:
  • 75. PROTOTYPE! NOT “DATA MODEL REVIEW”
  • 76. RECAP: • COLLABORATIVE AND AGILE: DATA MODELING, DATA SOURCING, DATA CONFORMANCE • REQUIREMENTS = DESIGN: SLOTS DIRECTLY INTO PROVEN AND MATURE DIMENSIONAL DATA WAREHOUSING DESIGN PATTERNS • VALIDATION THROUGH PROTOTYPING: SEMI-AUTOMATED BUILD OF DIMENSIONAL DATA WAREHOUSE; PERFECT COMPLEMENT TO AGILE BI TOOLS AND METHODS (E.G. PENTAHO)
  • 77. IF YOU HAVE BEEN AFFECTED BY ANY OF THE ISSUES RAISED IN THIS PRESENTATION…
  • 78. AGILE DATA WAREHOUSE DESIGN, LAWRENCE CORR, JIM STAGNITTO, DECISIONONE PRESS, NOVEMBER 2011
  • 79. QUESTIONS/COMMENTS? CONTACT: JIM STAGNITTO OR JOHN DIPIETRO 215-789-4816
  • 80. A2C CORPORATE OVERVIEW & INDUSTRY EXPERIENCE
  • 81. COMPANY OVERVIEW • TECHNOLOGY SOLUTION CONSULTANCY HEADQUARTERED IN PHILADELPHIA WITH REGIONAL OFFICES IN NEW YORK AND BOSTON • SERVICING HEALTHCARE, LIFE SCIENCE, TELECOM AND FINANCIAL SERVICES INDUSTRIES WITH RECENT OBTAINMENT OF OUR GSA SCHEDULE TO PURSUE FEDERAL GOVERNMENT OPPORTUNITIES • CONSULTANT BASE OF OVER 2500 PROVEN IT PROFESSIONALS THROUGHOUT THE NORTH EAST REGION WITH A RECRUITING NETWORK WHICH PROVIDES NATIONAL COVERAGE
  • 82. COMPANY OVERVIEW • FLEXIBLE APPROACH TO HELPING OUR CLIENTS WITH THEIR INITIATIVES • PROJECT-BASED SOLUTIONS • STAFF AUGMENTATION • MANAGED SERVICE OFFERINGS – “ON-SHORE QA, DEVELOPMENT & APPLICATION SUPPORT” • EXECUTIVE & PROFESSIONAL SEARCH
  • 83. a2c’s Recruiting Engine and Methodology is one of the Best in the Industry… CAPABLE OF PRODUCING QUALITY RESULTS ON-DEMAND FOR OUR CLIENTS. RESOURCE MANAGERS CONTINUALLY “SILO” DISCIPLINES WITH AVAILABLE CANDIDATES WHO HAVE PROVEN THEIR ABILITIES WITH A2C OVER THE PAST DECADE. THE A2C SOLUTIONS ORGANIZATION IS INSTRUMENTAL IN THE SCREENING AND SELECTION PROCESS TO ENSURE THAT CANDIDATES SUBMITTED TO CLIENTS ARE AN IDEAL MATCH.
  • 84. THE A2C TEAM A2C’S CULTURE PROVIDES AN ABILITY TO ATTRACT AND RETAIN THE BEST TALENT IN THE INDUSTRY AND FOSTERS CREATIVITY, INTEGRITY, GROWTH AND TEAMWORK.
  • 85. ALTERNATIVE SOLUTIONS… A2C PROVIDES CLIENTS WITH AN ALTERNATIVE SOLUTION TO A “BIG 4” CONSULTANCY AT SUBSTANTIAL SAVINGS FOR PROJECTS THAT ARE BETWEEN $500K AND $5M DUE TO FLEXIBILITY, AGILITY AND FOCUS.
  • 86. A2C SOLUTION ENGAGEMENT STRUCTURES • TECHNOLOGY STRATEGY & ROADMAP FORMULATION • NEEDS & READINESS ASSESSMENT • PACKAGE & PLATFORM SELECTIONS • PROOF OF CONCEPT IMPLEMENTATION • REQUIREMENTS DISCOVERY & SPECIFICATIONS • PROGRAM/PROJECT MANAGEMENT • FULL LIFE CYCLE & APPLICATION DEVELOPMENT • INFRASTRUCTURE & FACILITIES INITIATIVES • MANAGED SERVICES & MAINTENANCE SUPPORT
  • 87. A2C SOLUTIONS CAPABILITIES • ENTERPRISE DATA MANAGEMENT PRACTICE HELPS CLIENTS MANAGE THEIR COMPLETE INFORMATION LIFECYCLE FROM THEIR ON-LINE TRANSACTIONAL SYSTEMS TO THEIR DATA WAREHOUSING, ENTERPRISE REPORTING, DATA MIGRATION, BACK-UP AND RECOVERY STRATEGIES • BUSINESS ARCHITECTURE & OPTIMIZATION PRACTICE UTILIZES “SIX SIGMA LEAN” METHODOLOGIES TO ANALYZE, RE-ENGINEER AND AUTOMATE OUR CLIENT’S BUSINESS PROCESSES TO LEVERAGE HUMAN WORKFLOW AND BUSINESS RULES ENGINE TECHNOLOGIES TO CREATE EFFICIENCIES AND PROVIDE BUSINESS UNIT OWNERS WITH THE NECESSARY METRICS TO CONTINUALLY IMPROVE PERFORMANCE • PROGRAM MANAGEMENT OFFICE OVERSEES ALL ASPECTS OF SOLUTIONS PLANNING AND DELIVERY ACROSS CLIENT ENGAGEMENT TEAMS AND PROVIDES THE METHODOLOGY AND FRAMEWORKS WHICH ARE BASED ON PMI® INDUSTRY STANDARDS
  • 88. A2C SOLUTIONS CAPABILITIES • APPLICATION DEVELOPMENT & MANAGED SERVICES PRACTICE HELPS CLIENTS ARCHITECT, IMPLEMENT AND DEPLOY THE LATEST MICROSOFT AND ENTERPRISE JAVA BASED APPLICATIONS WHICH ARE BUILT ON PROVEN FRAMEWORKS AND ARCHITECTURES FOR THE ENTERPRISE • A2C'S SDLC DELIVERY MODEL IS COMPRISED OF OVER 20 YEARS OF COLLECTIVE BEST PRACTICES AND INDUSTRY PROVEN METHODOLOGIES THAT ALLOW OUR DELIVERY TEAMS TO RAPIDLY DESIGN, DEVELOP AND IMPLEMENT SOLUTIONS. OUR SDLC MODEL HAS BEEN DESIGNED TO COMPLEMENT OUR PROJECT MANAGEMENT METHODOLOGY, UTILIZING ITERATIVE DEVELOPMENT CYCLES THAT ENABLE PROJECT TEAMS TO PROVIDE CONSISTENTLY HIGH QUALITY, ON-TIME DELIVERABLES, REGARDLESS OF TECHNOLOGY PLATFORM
  • 89. LET A2C HELP WITH ALL YOUR BUSINESS SOLUTIONS
  • 90. CONNECT TO A2C For Further information on the Agile Data Warehouse Design please contact: John DiPietro, CTO or Jim Stagnitto, Practice Director of Information Services a2c.com a2c Philadelphia 1801 Market Street Suite 2430 Philadelphia, PA 19103 215-789-4816 contact: Joe Cattie JCattie@a2c.com a2c Boston 100 Grandview Road Suite 215 Braintree, MA 02184 781-848-0005 contact: Scott King SKing@a2c.com a2c New York 401 Greenwich Street 3rd Floor New York, NY 10013 212-913-0933 contact: John DiPietro JDiPietro@a2c.com