SlideShare ist ein Scribd-Unternehmen logo
1 von 40
© 2019 Snowflake Inc. All Rights Reserved
DELIVERING DATA
DEMOCRATIZATION
IN THE CLOUD
WITH SNOWFLAKE
KENT GRAZIANO, CHIEF TECHNICAL EVANGELIST I @KentGraziano
© 2019 Snowflake Inc. All Rights Reserved
• Chief Technical Evangelist, Snowflake Computing
• Oracle ACE Director, Alumni (DW/BI)
• OakTable Network
• Blogger – The Data Warrior
• Certified Data Vault Master and DV 2.0 Practitioner
• Former Member: Boulder BI Brain Trust (#BBBT)
• Member: DAMA Houston & DAMA International
• Data Architecture and Data Warehouse Specialist
• 30+ years in IT
• 25+ years of Oracle-related work
• 20+ years of data warehousing experience
• Author & Co-Author of a bunch of books (Amazon)
• Past-President of ODTUG and Rocky Mountain Oracle User Group
My Bio
© 2019 Snowflake Inc. All Rights Reserved
3 years in stealth + 4 years GA
3
1700+ employees
Over 3400
customers today
Over $1.4B in venture
funding from leading
investors
First customers
2014, general
availability 2015
Founded 2012 by
industry veterans
with over 120
database patents
Queries processed in
Snowflake per day:
406 million
Largest single
table:
46.7 trillion rows
Largest number of
tables single DB:
3.1 million!
Single customer
most data:
> 19.3PB
Single customer
most users:
> 11,000
Fun facts:
OUR MISSION
ENABLE ORGANIZATIONS TO
BE DATA LEADERS
© 2019 Snowflake Inc. All Rights Reserved
AGENDA
Data Challenges
What is a Cloud Data Platform?
Real World Use Cases
© 2020 Snowflake Inc. All Rights Reserved
Data Challenges Today
© 2018 Snowflake Computing Inc. All Rights Reserved.
163 Zettabytes by 2025
Web ERP3rd party apps Enterprise apps IoTMobile
It’s not the data itself
it’s how you take full advantage of the insight it provides
Web ERP3rd party apps Enterprise apps IoTMobile
© 2020 Snowflake Inc. All Rights Reserved
Most firms don’t consistently turn data into action
Source: Forrester
All Possible data All Possible Action
of firms
aspire to be
data-driven.
73%
of firms are
good at
turning data
into action.
29%
© 2020 Snowflake Inc. All Rights Reserved
JOURNEY TO A CLOUD DATA PLATFORM
On Premises
EDW
Data Lake,
Hadoop
1st Gen Cloud
EDW
Cloud Data
Platform
All Data
All Users
Fast Answers
SQL Database
Value
of
Data
Time
© 2020 Snowflake Inc. All Rights Reserved 12
DATA
SOURCES
OLTP DATABASES
ENTERPRISE
APPLICATIONS
THIRD-PARTY
WEB/LOG DATA
IoT
DATA
CONSUMERS
DATA MONETIZATION
OPERATIONAL
REPORTING
AD HOC ANALYSIS
REAL-TIME ANALYTICS
SNOWFLAKE CLOUD DATA PLATFORM
© 2020 Snowflake Inc. All Rights Reserved
Rethink
transformation
with robust
and integrated
data pipelines
Simplify and
accelerate your
data lake with
one platform for
all your data
Develop apps
with fast and
scalable analytics
that delight
customers
Deliver
analytics at
scale with
a modern
data warehouse
Empower your
ecosystem
to securely
collaborate
across all data
Simplify and
accelerate
machine learning
and artificial
intelligence
ONE PLATFORM, ONE COPY OF DATA,
MANY WORKLOADS
© 2020 Snowflake Inc. All Rights Reserved
Real World Examples
© 2020 Snowflake Inc. All Rights Reserved
Delivering compelling results
Simpler data pipeline
Replace noSQL database with Snowflake
for storing & transforming JSON event data
Faster analytics
Replace on-premises data warehouse with
Snowflake for analytics workload
Significantly lower cost
Improved performance while adding new
workloads--at a fraction of the cost
noSQL data base:
8 hours to prepare data
Snowflake:
1.5 minutes
Data warehouse
appliance: 20+ hours
Snowflake:
45 minutes
Data warehouse
appliance:
$5M + to expand
Snowflake:
added 2 new workloads for $50K
© 2020 Snowflake Inc. All Rights Reserved
Simplifying the Data pipeline
EDW
Game Event Data
Internal Data
Third-party Data
Analysts & BI
Tools
Staging
noSQL
Database
Existing
EDW
Game Event Data
Internal Data
Third-party Data
Analysts & BI
Tools
SnowflakeKinesis
Cleanse Normalize Transform
11-24 hours 15 minutes
Scenario
Complex pipeline slowing down analytics
Pain Points
• Fragile data pipeline
• Delays in getting updated data
• High cost and complexity
• Limited data granularity
Solution
Send data from Kinesis to S3 to Snowflake with
schemaless ingestion and easy querying
Snowflake Value
• >50x faster data updates
• 80% lower costs
• Nearly eliminated pipeline failures
• Able to retain full data granularity
© 2020 Snowflake Inc. All Rights Reserved
JSON Support with SQL – No Programming
17
> SELECT … FROM
…
Semi-structured data
(JSON, Avro, XML, Parquet, ORC)
Structured data
(e.g., CSV, TSV, …)
Storage optimization
Transparent discovery and storage optimization
of repeated elements
Query optimization
Full database optimization for queries
on semi-structured data
+
select v:lastName::string as last_name
from json_demo;
© 2020 Snowflake Inc. All Rights Reserved
Load data and run queries,
we do all the rest
Zero infrastructure and admin costs
Secure and highly available
Fully managed with no knobs
or tuning required
No indexes, distribution keys,
partitioning, or vacuuming
18
Automatic Query Optimization
© 2020 Snowflake Inc. All Rights Reserved
EVER EXPANDING ECOSYSTEM
© 2020 Snowflake Inc. All Rights Reserved
DATA ANALYTICS AT EXTREME SCALE
20
Scenario Pain Points
Snowflake Value
120x faster –
from 20 hours
to 45 minutes
Ad-Hoc analytics
available to all users
in minutes
Deployed in a week
during the busiest
time of year
• Financial institution with
a huge focus on security
• Overburdened staff
• Business needs to run monthly reports
that span 10 years of historical data
• No way to analyze
semi-structured data
• Quoted $500,000 to replace their
existing hardware appliance
• 20+ hours to run reports
• Could not continue to scale
• Users unable to query while
performing ETL
Profiles
Ratings
Loses
Microstrategy
1-2 days
1-2 mins
Legacy DW
© 2020 Snowflake Inc. All Rights Reserved
SNOWFLAKE - SECURE BY DESIGN
21
Snowflake
Operational Controls
• NIST 800-53
• SOC2 Type 2
• HIPAA
• PCI
• FedRAMP
Access
• All communication
secured & encrypted
• TLS 1.2 encryption
in both trusted and
untrusted networks
• IP Whitelisting
Authentication
• Password Policy
enforcement
• Multifactor Authentication
• SAML 2.0 support for
Federated Authentication
Application
• Flexible user
management
• Role-based
access control for
granular control
• RBAC for data
and actions
Data
• Encrypted at rest
• Hierarchical key model
rooted in Cloud HSM
• Automatic key rotation
• Time Travel 1-90 days
• Tri-Secret Secure
• Query statement
encryption
Infrastructure
• AWS, Azure Physical Security
• AWS, Azure Redundancy
• Regional Data Centers
▪ US
▪ EU
▪ AP
© 2020 Snowflake Inc. All Rights Reserved
DATA SECURITY & GOVERNANCE
Ensure data privacy and restrict access to source data
22
SECURE VIEWS
Cell-level security in
multi-tenant situations
SECURE UDFs
Prevent consumers
from seeing
underlying data
SECURE JOINS
Securely mask data
during join operations
SECURE
MATERIALIZED VIEWS
Secure,
precomputed views
© 2020 Snowflake Inc. All Rights Reserved
Oil & Gas Use Case
© 2020 Snowflake Inc. All Rights Reserved
Previous bottlenecks
Slow to modify due to so many layers
Legacy database limited size and scalability
Limited datasets to increase stability
Limited enterprise access
Expensive to build and maintain
SITUATION BEFORE: DATA WAREHOUSE
24
Data
Warehouse
Landing
Database
BI
Analytics
Tools
Source
Systems
© 2020 Snowflake Inc. All Rights Reserved
Previous bottlenecks
Not scalable
Too many points of failure
Developers and Change Control still required
Expensive to build and maintain
Too many users
SITUATION BEFORE: DATA SERVICES
25
ODataETL Repo
Database
BI
Analytics
Tools
Transform
and Store
Data Quality
Tool
Host
API
Source
Systems
© 2020 Snowflake Inc. All Rights Reserved
Previous bottlenecks
Opaque solutions
Difficult to support
Duplication of effort from business users
Proprietary solution and code
SITUATION BEFORE: SELF-SERVICE HADOOP
26
SQL
Datamart
HDFS
BI
Analytics
Tools
Transform
and Store
ETL Tool
HostSource
Systems
© 2020 Snowflake Inc. All Rights Reserved
SNOWFLAKE IMPLEMENTATION
27
Centralized raw and enterprise data
Simplified data transformation
Existing tools integrate seamlessly
Scales for the workload
Can handle all of the enterprise’s data
Source
Systems
Attunity
Replicate
Attunity (M)
Azure Blob
OData (L) SAS (M) Power BI (L)Databricks (XL)
Excel SAS Power BIDatabricks
© 2020 Snowflake Inc. All Rights Reserved
“SNOWFLAKE IS HERE
TO STAY, RIGHT?”
IMMEDIATE WINS
DATA MANAGEMENT
PROFESSIONAL
EHS Greenhouse Gas Report
Regulatory report originally ran in 48 hours
in legacy db, reduced to minutes
Data Prep for loading
15 hour process to prepare data for
loading into subsurface application
reduced to 30 minutes (mostly the data
load time)
Massive Scale
2,000 simultaneous and random queries
against a 40 billion record table all return in
under 10 seconds
© 2020 Snowflake Inc. All Rights Reserved
Separation of Storage & Compute
New multi-cluster, shared data architecture
29
Multi-Tenant Snowflake Deployment
Semi-structured
data
Structured data
Data Science (L)
Database Services
Management Optimization Security Availability Transactions Metadata
Easy-to-use Service with No Management
Unlimited Compute Clusters to Serve Every Use
Flexible Cloud Storage For All Kinds of Data
SQL Queries (S)
Data Transformation (M)
BI/Reporting
Multi-Cluster (M)
© 2020 Snowflake Inc. All Rights Reserved
How Snowflake transformed and simplified our data
30
Have all of our data in one place
Separated workloads to never
have downtime
Easy adoption because everything
looks like a traditional database
Give users a consistent location for
data, regardless of tool or use case
Tremendous scalability for users
of a large enterprise
Comparatively low cost
© 2020 Snowflake Inc. All Rights Reserved.
“2000 Simultaneous Queries
against a 40 Billion record table
all return in under 10 seconds.”
Larry Querbach,
Enterprise Data Architect
© 2020 Snowflake Inc. All Rights Reserved
SNOWFLAKE IN ACTION
TODAY
© 2020 Snowflake Inc. All Rights Reserved
SITUATION
• 13 data warehouses, some
end-of-life
• Huge data silos caused
ambiguity and reporting
disparities
• Poor consumer insight
VALUE
• Diverse data ingested for
analytics, data science
• Improved agility, scalability
with compute on demand
• Much faster – 6 hour queries
running in 3 seconds
• In depth customer knowledge
and improved services
© 2020 Snowflake Inc. All Rights Reserved
SITUATION / PAIN
• Painfully slow analytics cycles
• Limited ability to answer
complex questions
• Inability to provide business
continuity and ensure security
SOLUTION / VALUE
• Hundreds of newly
empowered analysts
• Improved scalability and
faster query times
• Guaranteed security and data
availability, 24/7/365
© 2020 Snowflake Inc. All Rights Reserved.
“In terms of processing data,
we are 10X faster and
60% cheaper.”
Rene Greiner,
VP of Data Integration
© 2020 Snowflake Inc. All Rights Reserved
PROVEN BY OVER 3400 CUSTOMERS
© 2020 Snowflake Inc. All Rights Reserved
What does a Cloud Data
Platform Provide?
Cost effective storage and analysis of
GBs, TBs, or even PB’s
Lightning fast query performance
Continuous data loading without impacting
query performance
Unlimited user concurrency
Full SQL relational support of both
structured and semi-structured data
Support for the tools and languages you
already useJava
>_
Scripting
Big Data does not have to
equal Big Effort
Web ERP3rd party
apps
Enterprise
apps
IoTMobile
© 2020 Snowflake Inc. All Rights Reserved
Kent Graziano
Snowflake Computing
Kent.graziano@snowflake.com
On Twitter @KentGraziano
More info at
http://snowflake.com
Visit my blog at
http://kentgraziano.com
Contact Info
THANK YOU
© 2019 Snowflake Inc. All Rights Reserved

Weitere ähnliche Inhalte

Was ist angesagt?

Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
Snowflake Computing
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
DATAVERSITY
 

Was ist angesagt? (20)

Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company Presentation
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data Engineering
 
Data Governance and Metadata Management
Data Governance and Metadata ManagementData Governance and Metadata Management
Data Governance and Metadata Management
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Data mesh
Data meshData mesh
Data mesh
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
 

Ähnlich wie Delivering Data Democratization in the Cloud with Snowflake

The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 

Ähnlich wie Delivering Data Democratization in the Cloud with Snowflake (20)

Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
 
Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a Service
 
Slides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdfSlides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdf
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)
 
Data Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWSData Driven Advanced Analytics using Denodo Platform on AWS
Data Driven Advanced Analytics using Denodo Platform on AWS
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
Platform Deep Dive
Platform Deep DivePlatform Deep Dive
Platform Deep Dive
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute final
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
 

Mehr von Kent Graziano

Mehr von Kent Graziano (20)

Balance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data CloudBalance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data Cloud
 
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
Rise of the Data Cloud
Rise of the Data CloudRise of the Data Cloud
Rise of the Data Cloud
 
Making Sense of Schema on Read
Making Sense of Schema on ReadMaking Sense of Schema on Read
Making Sense of Schema on Read
 
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
 
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsExtreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
 
Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
 
Data Warehousing 2016
Data Warehousing 2016Data Warehousing 2016
Data Warehousing 2016
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse Design
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
 
Agile Methods and Data Warehousing
Agile Methods and Data WarehousingAgile Methods and Data Warehousing
Agile Methods and Data Warehousing
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
 
Top Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data ModelerTop Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data Modeler
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
 
Why Data Vault?
Why Data Vault? Why Data Vault?
Why Data Vault?
 
Introduction to Data Vault Modeling
Introduction to Data Vault ModelingIntroduction to Data Vault Modeling
Introduction to Data Vault Modeling
 

Kürzlich hochgeladen

怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 

Kürzlich hochgeladen (20)

RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 

Delivering Data Democratization in the Cloud with Snowflake

  • 1. © 2019 Snowflake Inc. All Rights Reserved DELIVERING DATA DEMOCRATIZATION IN THE CLOUD WITH SNOWFLAKE KENT GRAZIANO, CHIEF TECHNICAL EVANGELIST I @KentGraziano
  • 2. © 2019 Snowflake Inc. All Rights Reserved • Chief Technical Evangelist, Snowflake Computing • Oracle ACE Director, Alumni (DW/BI) • OakTable Network • Blogger – The Data Warrior • Certified Data Vault Master and DV 2.0 Practitioner • Former Member: Boulder BI Brain Trust (#BBBT) • Member: DAMA Houston & DAMA International • Data Architecture and Data Warehouse Specialist • 30+ years in IT • 25+ years of Oracle-related work • 20+ years of data warehousing experience • Author & Co-Author of a bunch of books (Amazon) • Past-President of ODTUG and Rocky Mountain Oracle User Group My Bio
  • 3. © 2019 Snowflake Inc. All Rights Reserved 3 years in stealth + 4 years GA 3 1700+ employees Over 3400 customers today Over $1.4B in venture funding from leading investors First customers 2014, general availability 2015 Founded 2012 by industry veterans with over 120 database patents Queries processed in Snowflake per day: 406 million Largest single table: 46.7 trillion rows Largest number of tables single DB: 3.1 million! Single customer most data: > 19.3PB Single customer most users: > 11,000 Fun facts:
  • 4. OUR MISSION ENABLE ORGANIZATIONS TO BE DATA LEADERS
  • 5. © 2019 Snowflake Inc. All Rights Reserved AGENDA Data Challenges What is a Cloud Data Platform? Real World Use Cases
  • 6. © 2020 Snowflake Inc. All Rights Reserved Data Challenges Today
  • 7. © 2018 Snowflake Computing Inc. All Rights Reserved.
  • 8. 163 Zettabytes by 2025 Web ERP3rd party apps Enterprise apps IoTMobile
  • 9. It’s not the data itself it’s how you take full advantage of the insight it provides Web ERP3rd party apps Enterprise apps IoTMobile
  • 10. © 2020 Snowflake Inc. All Rights Reserved Most firms don’t consistently turn data into action Source: Forrester All Possible data All Possible Action of firms aspire to be data-driven. 73% of firms are good at turning data into action. 29%
  • 11. © 2020 Snowflake Inc. All Rights Reserved JOURNEY TO A CLOUD DATA PLATFORM On Premises EDW Data Lake, Hadoop 1st Gen Cloud EDW Cloud Data Platform All Data All Users Fast Answers SQL Database Value of Data Time
  • 12. © 2020 Snowflake Inc. All Rights Reserved 12 DATA SOURCES OLTP DATABASES ENTERPRISE APPLICATIONS THIRD-PARTY WEB/LOG DATA IoT DATA CONSUMERS DATA MONETIZATION OPERATIONAL REPORTING AD HOC ANALYSIS REAL-TIME ANALYTICS SNOWFLAKE CLOUD DATA PLATFORM
  • 13. © 2020 Snowflake Inc. All Rights Reserved Rethink transformation with robust and integrated data pipelines Simplify and accelerate your data lake with one platform for all your data Develop apps with fast and scalable analytics that delight customers Deliver analytics at scale with a modern data warehouse Empower your ecosystem to securely collaborate across all data Simplify and accelerate machine learning and artificial intelligence ONE PLATFORM, ONE COPY OF DATA, MANY WORKLOADS
  • 14. © 2020 Snowflake Inc. All Rights Reserved Real World Examples
  • 15. © 2020 Snowflake Inc. All Rights Reserved Delivering compelling results Simpler data pipeline Replace noSQL database with Snowflake for storing & transforming JSON event data Faster analytics Replace on-premises data warehouse with Snowflake for analytics workload Significantly lower cost Improved performance while adding new workloads--at a fraction of the cost noSQL data base: 8 hours to prepare data Snowflake: 1.5 minutes Data warehouse appliance: 20+ hours Snowflake: 45 minutes Data warehouse appliance: $5M + to expand Snowflake: added 2 new workloads for $50K
  • 16. © 2020 Snowflake Inc. All Rights Reserved Simplifying the Data pipeline EDW Game Event Data Internal Data Third-party Data Analysts & BI Tools Staging noSQL Database Existing EDW Game Event Data Internal Data Third-party Data Analysts & BI Tools SnowflakeKinesis Cleanse Normalize Transform 11-24 hours 15 minutes Scenario Complex pipeline slowing down analytics Pain Points • Fragile data pipeline • Delays in getting updated data • High cost and complexity • Limited data granularity Solution Send data from Kinesis to S3 to Snowflake with schemaless ingestion and easy querying Snowflake Value • >50x faster data updates • 80% lower costs • Nearly eliminated pipeline failures • Able to retain full data granularity
  • 17. © 2020 Snowflake Inc. All Rights Reserved JSON Support with SQL – No Programming 17 > SELECT … FROM … Semi-structured data (JSON, Avro, XML, Parquet, ORC) Structured data (e.g., CSV, TSV, …) Storage optimization Transparent discovery and storage optimization of repeated elements Query optimization Full database optimization for queries on semi-structured data + select v:lastName::string as last_name from json_demo;
  • 18. © 2020 Snowflake Inc. All Rights Reserved Load data and run queries, we do all the rest Zero infrastructure and admin costs Secure and highly available Fully managed with no knobs or tuning required No indexes, distribution keys, partitioning, or vacuuming 18 Automatic Query Optimization
  • 19. © 2020 Snowflake Inc. All Rights Reserved EVER EXPANDING ECOSYSTEM
  • 20. © 2020 Snowflake Inc. All Rights Reserved DATA ANALYTICS AT EXTREME SCALE 20 Scenario Pain Points Snowflake Value 120x faster – from 20 hours to 45 minutes Ad-Hoc analytics available to all users in minutes Deployed in a week during the busiest time of year • Financial institution with a huge focus on security • Overburdened staff • Business needs to run monthly reports that span 10 years of historical data • No way to analyze semi-structured data • Quoted $500,000 to replace their existing hardware appliance • 20+ hours to run reports • Could not continue to scale • Users unable to query while performing ETL Profiles Ratings Loses Microstrategy 1-2 days 1-2 mins Legacy DW
  • 21. © 2020 Snowflake Inc. All Rights Reserved SNOWFLAKE - SECURE BY DESIGN 21 Snowflake Operational Controls • NIST 800-53 • SOC2 Type 2 • HIPAA • PCI • FedRAMP Access • All communication secured & encrypted • TLS 1.2 encryption in both trusted and untrusted networks • IP Whitelisting Authentication • Password Policy enforcement • Multifactor Authentication • SAML 2.0 support for Federated Authentication Application • Flexible user management • Role-based access control for granular control • RBAC for data and actions Data • Encrypted at rest • Hierarchical key model rooted in Cloud HSM • Automatic key rotation • Time Travel 1-90 days • Tri-Secret Secure • Query statement encryption Infrastructure • AWS, Azure Physical Security • AWS, Azure Redundancy • Regional Data Centers ▪ US ▪ EU ▪ AP
  • 22. © 2020 Snowflake Inc. All Rights Reserved DATA SECURITY & GOVERNANCE Ensure data privacy and restrict access to source data 22 SECURE VIEWS Cell-level security in multi-tenant situations SECURE UDFs Prevent consumers from seeing underlying data SECURE JOINS Securely mask data during join operations SECURE MATERIALIZED VIEWS Secure, precomputed views
  • 23. © 2020 Snowflake Inc. All Rights Reserved Oil & Gas Use Case
  • 24. © 2020 Snowflake Inc. All Rights Reserved Previous bottlenecks Slow to modify due to so many layers Legacy database limited size and scalability Limited datasets to increase stability Limited enterprise access Expensive to build and maintain SITUATION BEFORE: DATA WAREHOUSE 24 Data Warehouse Landing Database BI Analytics Tools Source Systems
  • 25. © 2020 Snowflake Inc. All Rights Reserved Previous bottlenecks Not scalable Too many points of failure Developers and Change Control still required Expensive to build and maintain Too many users SITUATION BEFORE: DATA SERVICES 25 ODataETL Repo Database BI Analytics Tools Transform and Store Data Quality Tool Host API Source Systems
  • 26. © 2020 Snowflake Inc. All Rights Reserved Previous bottlenecks Opaque solutions Difficult to support Duplication of effort from business users Proprietary solution and code SITUATION BEFORE: SELF-SERVICE HADOOP 26 SQL Datamart HDFS BI Analytics Tools Transform and Store ETL Tool HostSource Systems
  • 27. © 2020 Snowflake Inc. All Rights Reserved SNOWFLAKE IMPLEMENTATION 27 Centralized raw and enterprise data Simplified data transformation Existing tools integrate seamlessly Scales for the workload Can handle all of the enterprise’s data Source Systems Attunity Replicate Attunity (M) Azure Blob OData (L) SAS (M) Power BI (L)Databricks (XL) Excel SAS Power BIDatabricks
  • 28. © 2020 Snowflake Inc. All Rights Reserved “SNOWFLAKE IS HERE TO STAY, RIGHT?” IMMEDIATE WINS DATA MANAGEMENT PROFESSIONAL EHS Greenhouse Gas Report Regulatory report originally ran in 48 hours in legacy db, reduced to minutes Data Prep for loading 15 hour process to prepare data for loading into subsurface application reduced to 30 minutes (mostly the data load time) Massive Scale 2,000 simultaneous and random queries against a 40 billion record table all return in under 10 seconds
  • 29. © 2020 Snowflake Inc. All Rights Reserved Separation of Storage & Compute New multi-cluster, shared data architecture 29 Multi-Tenant Snowflake Deployment Semi-structured data Structured data Data Science (L) Database Services Management Optimization Security Availability Transactions Metadata Easy-to-use Service with No Management Unlimited Compute Clusters to Serve Every Use Flexible Cloud Storage For All Kinds of Data SQL Queries (S) Data Transformation (M) BI/Reporting Multi-Cluster (M)
  • 30. © 2020 Snowflake Inc. All Rights Reserved How Snowflake transformed and simplified our data 30 Have all of our data in one place Separated workloads to never have downtime Easy adoption because everything looks like a traditional database Give users a consistent location for data, regardless of tool or use case Tremendous scalability for users of a large enterprise Comparatively low cost
  • 31. © 2020 Snowflake Inc. All Rights Reserved. “2000 Simultaneous Queries against a 40 Billion record table all return in under 10 seconds.” Larry Querbach, Enterprise Data Architect
  • 32. © 2020 Snowflake Inc. All Rights Reserved SNOWFLAKE IN ACTION TODAY
  • 33. © 2020 Snowflake Inc. All Rights Reserved SITUATION • 13 data warehouses, some end-of-life • Huge data silos caused ambiguity and reporting disparities • Poor consumer insight VALUE • Diverse data ingested for analytics, data science • Improved agility, scalability with compute on demand • Much faster – 6 hour queries running in 3 seconds • In depth customer knowledge and improved services
  • 34. © 2020 Snowflake Inc. All Rights Reserved SITUATION / PAIN • Painfully slow analytics cycles • Limited ability to answer complex questions • Inability to provide business continuity and ensure security SOLUTION / VALUE • Hundreds of newly empowered analysts • Improved scalability and faster query times • Guaranteed security and data availability, 24/7/365
  • 35. © 2020 Snowflake Inc. All Rights Reserved. “In terms of processing data, we are 10X faster and 60% cheaper.” Rene Greiner, VP of Data Integration
  • 36. © 2020 Snowflake Inc. All Rights Reserved PROVEN BY OVER 3400 CUSTOMERS
  • 37. © 2020 Snowflake Inc. All Rights Reserved What does a Cloud Data Platform Provide? Cost effective storage and analysis of GBs, TBs, or even PB’s Lightning fast query performance Continuous data loading without impacting query performance Unlimited user concurrency Full SQL relational support of both structured and semi-structured data Support for the tools and languages you already useJava >_ Scripting
  • 38. Big Data does not have to equal Big Effort Web ERP3rd party apps Enterprise apps IoTMobile
  • 39. © 2020 Snowflake Inc. All Rights Reserved Kent Graziano Snowflake Computing Kent.graziano@snowflake.com On Twitter @KentGraziano More info at http://snowflake.com Visit my blog at http://kentgraziano.com Contact Info
  • 40. THANK YOU © 2019 Snowflake Inc. All Rights Reserved