Data warehousing, after decades of widespread adoption, still holds a strong place in today’s organization. Cloud-based technologies have revolutionized the traditional world of data warehousing, offering transformational ways to support analytics and reporting. Join this webinar to understand what has changed in the world of data warehousing with the introduction of cloud-based technologies, and what has remained the same.
DAS Slides: Cloud-Based Data Warehousing – What’s New and What Stays the Same
1. Copyright Global Data Strategy, Ltd. 2020
Cloud-based Data Warehousing:
What’s New and What Stays the Same
Donna Burbank
Global Data Strategy, Ltd.
March 26th, 2020
Follow on Twitter @donnaburbank
Twitter Event hashtag: #DAStrategies
2. Global Data Strategy, Ltd. 2020
Donna Burbank
Donna is a recognised industry expert in
information management with over 20 years
of experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture.
Her background is multi-faceted across
consulting, product development, product
management, brand strategy, marketing,
and business leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specializes in the alignment of
business drivers with data-centric
technology. In past roles, she has served in
key brand strategy and product
management roles at CA Technologies and
Embarcadero Technologies for several of the
leading data management products in the
market.
As an active contributor to the data
management community, she is a long-time
DAMA International member, Past President
and Advisor to the DAMA Rocky Mountain
chapter, and was awarded the Excellence in
Data Management Award from DAMA
International.
Donna is also an analyst at the Boulder BI
Brain Trust (BBBT) where she provides advice
and gains insight on the latest BI and
Analytics software in the market. She has
served on several review committees for the
Object Management Group for key information
management and process modeling
notations.
She has worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks regularly
at industry conferences. She has co-
authored two books: Data Modeling for the
Business and Data Modeling Made Simple
with ERwin Data Modeler and is a regular
contributor to industry publications. She can
be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
3. Global Data Strategy, Ltd. 2020
DATAVERSITY Data Architecture Strategies
• January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same
• April 23 Master Data Management – Aligning Data, Process, and Governance
• May 28 Data Governance and Data Architecture – Alignment and Synergies
• June 25 Enterprise Architecture vs. Data Architecture
• July 22 Best Practices in Metadata Management
• August 27 Data Quality Best Practices
• September 24 Data Virtualization – Separating Myth from Reality
• October 22 Data Architect vs. Data Engineer vs. Data Modeler
• December 1 Graph Databases: Practical Use Cases
This Year’s Lineup
4. Global Data Strategy, Ltd. 2020
What We’ll Cover Today
• Data warehousing, after decades of widespread adoption, still holds a strong place in today’s
organization.
• Cloud-based technologies have revolutionized the traditional world of data warehousing,
offering transformational ways to support analytics and reporting.
• Join this webinar to understand what has changed in the world of data warehousing with the
introduction of Cloud-based technologies, and what has remained the same.
5. Global Data Strategy, Ltd. 2020
Business Intelligence & Analytics
Business Intelligence & Analytics are key to gaining
business insight.
• 80% of respondents indicated that reporting and
analytics were key drivers for data management.
• 87% are implementing business intelligence
• 87% have a data warehouse in place
• 22% are using a data lake in conjunction with a
data warehouse
Business Intelligence & Analytics provide Business Insight
* based on research from a 2019 DATAVERSITY survey on “Trends in Data Management” by Donna Burbank and Michelle McKnight
6. Global Data Strategy, Ltd. 2020
A number of respondents mentioned Data Governance in their comments as a way to align the various stakeholders around common goals.
Business Goals & Drivers
• Analytics and Reporting continue to lead the
business drivers for data management.
• Top drivers include:
• Gaining insights through reporting and analytics: 79.70%
• Saving cost and increasing efficiency: 68.42%
• Reducing risk: 66.92%
• Improving customer satisfaction: 58.65%
• Driving revenue and growth: 57.14%
• Supporting digital transformations: 53.38%
Gaining Business Insight through Analytics and Reporting continues to be a main business driver for today’s organizations.
* based on research from a 2019 DATAVERSITY survey on “Trends in Data Management” by Donna Burbank and Michelle McKnight
7. Global Data Strategy, Ltd. 2020
Current Platform Cloud Adoption
• Relational databases still dominate the data
management landscape
• Majority is on-premises
• Some Cloud Adoption
Relational databases still dominate the market, both on-premises and Cloud-based
8. Global Data Strategy, Ltd. 2020
Future Platform Adoption – Greater Move Towards Cloud
• Future Plans still include a high percentage
of relational databases, with a higher
percentage of Cloud-based systems.
• A wider distribution of platform usage
indicates the variety of options and fit-for-purpose
solutions – one size doesn’t fit all.
Future plans still feature relational databases, with a higher focus on Cloud Adoption, and a wider mix of technologies.
9. Global Data Strategy, Ltd. 2020
Moving to the Cloud: Pros and Cons
While organizations are moving to the Cloud for better scalability, concerns regarding security & privacy remain.
10. Global Data Strategy, Ltd. 2020
Platform Availability and Uptime
Ability to scale across geographic regions:
• Compliance
• Availability
• Performance
“If Amazon, Google, or Microsoft can’t handle it, do we think we can do better? We build cars, not software.”
– Client quote (CEO) on moving to Cloud
Upside: Availability & Scalability
Downside: Risk
A Matter of Perspective
Some organizations employ a multi-Cloud platform to reduce risk.
11. Global Data Strategy, Ltd. 2020
Benefits & Drivers for a Cloud-based Data Warehouse
• Ease of Entry: Reduce requirements for platform maintenance & setup
• Increased Focus on Analytics: Analytical use cases require scale and flexibility
• Greater Volume and Variety of Data: Larger scale of data, as well as greater variety in
unstructured vs. structured data.
• Cost Savings and Ability to Scale: Many organizations benefit from cost savings due to:
• Low cost of entry and ability to scale
• Ability to flex usage due to seasonal variability (e.g. holiday shopping)
• OPEX vs. CAPEX
• Note: Cloud does not always equate to lower costs – consider usage patterns and practices.
• Democratization of Data: Easy to “spin up” a new instance in the Cloud without being a
platform expert.
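The note that Cloud does not always equate to lower costs can be illustrated with a rough calculation. The sketch below compares a fixed-capacity platform sized for the peak month against pay-per-use pricing, first for a seasonal demand profile and then for a steady one. All demand figures, unit prices, and function names are hypothetical and chosen only to show the shape of the trade-off; they are not taken from any vendor's pricing.

```python
# Hypothetical monthly compute demand (arbitrary units).
# The seasonal profile has a holiday peak in the last two months.
SEASONAL_DEMAND = [40] * 10 + [100, 100]
STEADY_DEMAND = [100] * 12

# Assumed unit prices: pay-per-use carries a premium per unit consumed.
FIXED_COST_PER_UNIT = 10
ON_DEMAND_COST_PER_UNIT = 15


def fixed_capacity_cost(demand):
    """Capacity is sized for the peak month and paid for in every month."""
    return max(demand) * FIXED_COST_PER_UNIT * len(demand)


def pay_per_use_cost(demand):
    """Only the units actually consumed in each month are billed."""
    return sum(demand) * ON_DEMAND_COST_PER_UNIT


if __name__ == "__main__":
    for name, demand in [("seasonal", SEASONAL_DEMAND), ("steady", STEADY_DEMAND)]:
        print(name, fixed_capacity_cost(demand), pay_per_use_cost(demand))
```

Under these assumed numbers, pay-per-use is cheaper for the seasonal profile (9000 vs. 12000) but more expensive for the steady one (18000 vs. 12000), which is why usage patterns matter.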
12. Global Data Strategy, Ltd. 2020
Data Analytics
• The Gartner analyst firm categorizes
several stages of analytic use cases.
• Business Intelligence (BI) with a
traditional DW would be categorized as
Descriptive Analytics
• In order to move to higher levels of
optimization, many organizations are
looking to scale to more Data Lake-style
implementations
13. Global Data Strategy, Ltd. 2020
Data Warehouse vs. Data Lake
Traditional Data Lake
• Casual, Exploratory Environment
• Looser Development Guidelines
• e.g. “Sandbox Analytics”
OR
Traditional Data Warehouse
• Structured Environment
• Formal, Stricter Development
• e.g. Financial Reporting
14. Global Data Strategy, Ltd. 2020
Data Warehouse vs. Data Lake
Traditional Data Lake
• Casual, Exploratory Environment
• Looser Development Guidelines
• e.g. “Sandbox Analytics”
OR
Traditional Data Warehouse
• Structured Environment
• Formal, Stricter Development
• e.g. Financial Reporting
XOR
Modern Data Warehouse
• Best of Both Worlds
• Integrated Development Environment
• e.g. Integrated Data Science Platform
15. Global Data Strategy, Ltd. 2020
Pure-play Data Lakes facing Disillusionment
Pure-play Data Lakes, particularly those
based on Hadoop, are falling out of
favor, according to Gartner’s
Hype Cycle.
Source: Gartner
16. Global Data Strategy, Ltd. 2020
Integrating the Data Lake & Traditional Data Sources
• The Data Lake has a different architecture & purpose than traditional data sources such as data warehouses.
• But the two environments can co-exist to share relevant information.
[Diagram: the Data Lake alongside Enterprise Systems of Record]
• Data Analysis & Discovery – Data Lake: Sandbox, Lightly Modeled Data, Data Exploration
• Enterprise Systems of Record: Master & Reference Data, Data Warehouse, Data Marts, Operational Data
• Reporting & Analytics: Advanced Analytics, Self-Service BI, Standard BI Reports
• Also labeled: Data Governance & Collaboration, Security & Privacy
17. Global Data Strategy, Ltd. 2020
Comparing the Traditional and Modern Cloud Data Warehouse
[Diagram referenced from Qlik. Panels: Traditional Data Warehouse (an example); Modern Cloud Data Warehouse (an example). Labeled component: ETL]
** Remember – there is no “One Size Fits All” approach!
18. Global Data Strategy, Ltd. 2020
Democratization of Data Warehousing
[Diagram labels: DBA; Web Platforms; Data Exploration & Discovery; Citizen Data Scientist; No; Yes]
19. Global Data Strategy, Ltd. 2020
Fundamentals Still Apply
• Database Design: Core design principles still apply in the
Cloud landscape.
• Again, there is “no one size fits all” – match the use case
• Balance performance, scalability, usability
• Metadata: Understanding the context, traceability, and
meaning of data is critical
• Data Quality: The platform doesn’t change the need for clean,
consumable, fit-for-purpose data.
• Data Governance: As usage increases, so does the need for
data governance and accountability.
20. Global Data Strategy, Ltd. 2020
Faster Data Requires Fundamentals
According to a recent TDWI report, the
largest impediments to faster data include:
• Data Quality Issues: 67%
• Data Silos: 51%
• Governance & Regulation: 46%
• Data Transformation: 34%
21. Global Data Strategy, Ltd. 2020
Implement “Just Enough” Data Governance
• Know what to manage closely and what to leave alone
• The more the data is shared across & beyond the organization, the more formal governance needs to be
Core Enterprise Data
• Common data elements used by multiple stakeholders, departments, etc. (e.g. DW)
• Highly governed
• Highly published & shared
Examples
• Common Financial Metrics: for Financial & Regulatory Reporting
• Common Attributes: Core attributes reused across multiple areas (e.g. Customer name, Address, etc.)
Functional & Operational Data
• Lightly modeled & prepared data for limited sharing & reuse
• Collaboration-based governance
• May be future candidates for core data
Examples
• Operational Reporting
• Non-productionized analytical model data
• Ad hoc reporting & discovery
Exploratory Data
• Raw or lightly prepped data for exploratory analysis
• Mainly ad hoc, one-off analysis
• Light touch governance
Examples
• Raw data sets for exploratory analytics
• External & Open data sources
Master & Reference Data
• Common data elements used by multiple stakeholders across functional areas, applications, etc.
• Highly governed
• Highly published & shared
Examples
• Reference Data: Department Codes, Country Codes, etc.
• Master Data: Customer, Product, Student, Supplier, etc.
Flows between tiers (arrows labeled Publish and Promote)
• Exploratory analysis uses core data sets when applicable.
• Derived variables of value can be fed into Core Enterprise, or even Master Data.
23. Global Data Strategy, Ltd. 2020
The Star Schema
[Diagram: a central Fact (Measure) table linked to five surrounding Dimension tables]
Facts/Measures: Contain the actual values to be reported on.
What are we measuring? e.g. Activities (sales transaction,
patient visit, etc.)
• Few attributes (just numbers with links to the dimensions)
• Many values (e.g. all sales transactions)
Dimensions: Contain the details that describe the central fact.
i.e. The things we want to report by. e.g. Date, Region, Quarter
• Many attributes (Individual name, DOB, gender, etc.)
• Few values
Note: Your Master Data domains often feed these dimensions.
[Diagram example: Sales by Month, by Customer, by Region, by Sales Rep, by Product]
The Star Schema is still a user-friendly and performant way to “slice and dice” data for reporting.
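A small worked example may help make the fact/dimension split concrete. The sketch below uses Python's built-in sqlite3 module to build one narrow fact table with keys into three dimension tables, then aggregates the same measure by different dimensions. The table names, column names, and sample rows are all hypothetical, and SQLite stands in for whatever warehouse platform is actually in use.

```python
import sqlite3

# Maps a reporting dimension to (table, join key, label column).
# All names here are illustrative assumptions, not from the slides.
DIMENSIONS = {
    "region": ("dim_region", "region_id", "region_name"),
    "product": ("dim_product", "product_id", "product_name"),
    "month": ("dim_date", "date_id", "month"),
}


def build_star_schema():
    """Create an in-memory star schema: one fact table, three dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT, quarter TEXT);
        -- The fact table holds only the measure plus links to the dimensions.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            region_id INTEGER REFERENCES dim_region(region_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount INTEGER
        );
        INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
        INSERT INTO dim_date VALUES (1, '2020-01', 'Q1'), (2, '2020-02', 'Q1');
        INSERT INTO fact_sales VALUES
            (1, 1, 1, 100), (1, 2, 1, 150), (2, 1, 2, 200),
            (2, 2, 2, 50), (1, 1, 2, 25);
    """)
    return conn


def sales_by(conn, dimension):
    """Total the sales measure grouped by one dimension ("slice and dice")."""
    table, key, label = DIMENSIONS[dimension]  # names come from a fixed mapping
    query = f"""
        SELECT d.{label}, SUM(f.amount)
        FROM fact_sales AS f
        JOIN {table} AS d ON f.{key} = d.{key}
        GROUP BY d.{label}
        ORDER BY d.{label}
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_star_schema()
    print(sales_by(conn, "region"))   # [('East', 325), ('West', 200)]
    print(sales_by(conn, "product"))  # [('Gadget', 250), ('Widget', 275)]
```

The same fact rows answer each "by" question with a single join and a GROUP BY, which is the property that makes the pattern convenient for reporting.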
24. Global Data Strategy, Ltd. 2020
Design Patterns
There are a number of design patterns available to fit a variety of use cases
(again – there is no “one size fits all” )
• Inmon vs. Kimball: The battle still rages...
• Data Vault: Hubs, Links and Satellites
• Flatten Everything: Popular with Data Science
• Columnar: Columns vs. Rows
• And More… Choices abound…
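To illustrate the "Columns vs. Rows" distinction named above, the sketch below stores the same hypothetical records in a row-oriented layout (one record per entry) and a column-oriented layout (one list per column), then totals a single field in each. The data and function names are assumptions for illustration only. Real columnar engines add compression and vectorized execution, which this toy version does not model.

```python
# Row-oriented layout: each entry is a complete record (hypothetical data).
ROWS = [
    {"order_id": 1, "region": "East", "amount": 100},
    {"order_id": 2, "region": "West", "amount": 150},
    {"order_id": 3, "region": "East", "amount": 200},
]


def to_columnar(rows):
    """Pivot a list of row records into one list of values per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def total_row_store(rows, field):
    """Every full record is visited even though only one field is needed."""
    return sum(row[field] for row in rows)


def total_column_store(columns, field):
    """Only the single column being aggregated is read."""
    return sum(columns[field])


if __name__ == "__main__":
    columns = to_columnar(ROWS)
    print(columns["amount"])
    print(total_row_store(ROWS, "amount"), total_column_store(columns, "amount"))
```

Both layouts give the same total; the difference is how much data has to be touched to get it, which is why analytical workloads that scan a few columns over many rows tend to favor columnar storage.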
25. Global Data Strategy, Ltd. 2020
Summary
• Reporting and Analytics continue to be a key business driver for most organizations.
• Cloud-based technologies provide a myriad of new options for scalability, performance,
ease of entry, and cost flexibility
• The concepts of Data Lakes and Data Warehousing are merging with new technology
offerings.
• The ease of entry for Cloud Data Warehousing platforms allows more citizen data analysts
& data scientists to join the game
• Despite apparent ease of entry, core fundamentals still apply: Data Governance, Data
Quality, and Database Design are still as important as ever.
26. Global Data Strategy, Ltd. 2020
White Paper: Trends in Data Management
• Download from www.globaldatastrategy.com
• Under ‘Whitepapers’
• Also available on Dataversity.net
Free Download
27. Global Data Strategy, Ltd. 2020
DATAVERSITY Data Architecture Strategies
Join us next month
28. Global Data Strategy, Ltd. 2020
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that
specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy Aligned With Data Strategy
Visit www.globaldatastrategy.com for more information
29. Global Data Strategy, Ltd. 2020
Questions?
• Thoughts? Ideas?
www.globaldatastrategy.com