data.world
How to launch a data catalog in minutes
Tim Gasper
VP of Product
data.world
Five things to consider about Data Mesh and Data Governance
Paul Gancz
Partner Solutions Architect
Snowflake
Juan Sequeda
Principal Scientist
data.world
datadotworld data.world
Better together
The Data Cloud
ONE platform
MANY workloads
NO data silos
The most powerful
combined data mesh
solution to eliminate data
silos and democratize
access to well-governed
data products.
The Modern Data Catalog
Make data discovery,
governance, and analysis
easy.
+
Why Data Mesh?
What is the problem?
Monolithic approaches to data
don’t scale socially
Data is treated as an afterthought
Why do we care?
Centralized processes and teams become a
bottleneck for the business
Data value is being left untapped
Distribute responsibility for data pipelines and data quality to people with domain knowledge.
Serve data as a product using a common self-service IT infrastructure platform.
Domain-Centric Ownership & Architecture
• Data pipelines owned by teams with domain knowledge
• Domains own cleansing, refinement, historization, pre-aggregation, etc.
• Domains responsible for governance, lineage, etc.

Data as a Product
• Domains treat data with consumers in mind
• Data is discoverable
• Data is easy to obtain and use
• Data is documented
• Domains responsible for the quality of their data

Self-Service Data Platform
• Common set of tools across domains
• Domain-agnostic
• Easy to use and low maintenance to support
• Easy to deploy repeatable patterns for data cleansing, transformation, automation, storage, security, governance, sharing

Federated Governance
• Global interoperability standards across domains
• Define and use global data governance policies
• Define and apply governance within each domain and propagate downstream
Data Mesh Principles
Source: Zhamak Dehghani, https://martinfowler.com/articles/data-monolith-to-mesh.html , https://martinfowler.com/articles/data-mesh-principles.html
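The federated governance idea of applying a policy within a domain and propagating it downstream can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `propagate_tags` function, the dataset names, and the `PII` tag are hypothetical and are not taken from the slides or from any product API.

```python
from __future__ import annotations

from collections import deque


def propagate_tags(
    lineage: dict[str, list[str]],
    tags: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Copy every governance tag on a dataset to all datasets derived from it.

    lineage maps a dataset to the datasets built directly from it.
    tags maps a dataset to the tags its owning domain assigned.
    """
    result = {name: set(assigned) for name, assigned in tags.items()}
    queue = deque(tags)
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            child_tags = result.setdefault(child, set())
            before = len(child_tags)
            child_tags |= result[current]
            # Revisit the child only if it gained a tag, so cycles terminate.
            if len(child_tags) > before:
                queue.append(child)
    return result


# Hypothetical lineage: a source table feeds a Customer 360 product,
# which in turn feeds a marketing data product.
lineage = {
    "crm.customers": ["customer.customer_360"],
    "customer.customer_360": ["marketing.campaign_targets"],
}
tags = {"crm.customers": {"PII"}}

for dataset, assigned in sorted(propagate_tags(lineage, tags).items()):
    print(dataset, sorted(assigned))
```

The domain that owns `crm.customers` classifies it once, and every downstream product inherits the classification without a central team touching each table.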
DATA GOVERNANCE CHALLENGES
Data Is
Everywhere
Must be able to eliminate
silos inside and outside
your organization
Managing Data Is
Unnecessarily Complex
Knowing what your
data is — and how it is
being used — is hard
Security and Governance
Are Inherently Rigid
Requires managing risk and
changing regulations, while
getting the most from your data
DATA GOVERNANCE IN THE DATA CLOUD
Know Your Data: Understand, classify, and track data and its usage
Protect Your Data: Secure sensitive data with policy-based access controls
Unlock Your Data: Securely collaborate and share data across teams
What it has been...
Risk avoidance and compliance
Top-down policies
Cumbersome processes
DATA GOVERNANCE
What it needs to be...
DATA GOVERNANCE
Rules of cooperation and collaboration
Process of data & analytics together
Capture knowledge in real-time
What is the goal of data governance?
Data Governance and Data Catalogs
What catalogs do and how they help
Governance is now about data discoverability, not just data
protection.
While application silos pose a governance challenge, inclusive,
agile data governance approaches pose solutions.
Governance needs to be a benefit, not a burden. The friction
has to go away.
Business users don’t want to install software for governance;
SaaS removes the friction and is the way to go.
Understand and trust your data with profiling, sampling and
lineage.
Everyone (producers and consumers) actively contributes to
data as they use it.
Accelerates time to value and uncovers insights.
Cloud-native and multi-tenant approaches are highly available,
scale bigger, perform better and evolve faster.
Five things to consider about Data Mesh and Data Governance
1. What is the scope?
2. Who are the stakeholders?
3. Where should we standardize and productize data?
4. Who is responsible?
5. How to be agile?
1. What is the scope?
Example Architecture
[Diagram: data sources connected to consumers through a chain of ELT/ETL jobs and data models]
Source: Zhamak Dehghani - martinfowler.com
What are your domains?
Data Mesh: Domain-centric Architecture
[Diagram: data sources from different domains connected to consumers through ELT/ETL jobs and data models, grouped by domain]
Domains: Customer; Helpdesk & Support; Products; Orders & Sales; Marketing & Promotions; Customer 360
Shared layer: Interoperability Standards, Federated Governance, Data Catalog
• Domain-centric ownership of data sources, pipelines, and data quality
• Ownership sits with domain knowledge → better data quality for consumers
• Domain teams can react faster to source format changes or quality issues
• Overall easier to scale the number of sources & consumers
• Data assets offered as products
• “Serve & pull” instead of “push & ingest” model
How scope affects your data catalog

Approach: Analytics Catalog
Purpose: Enabling data consumers to discover assets
Coverage: Data lake and data mart tables and related reports
Stakeholders: Analysts, BI team, report writers, report users

Approach: Data Platform Catalog
Purpose: + Enabling the management of the data platform (automation and observability)
Coverage: + Upstream data sources, lineage, streaming data, ML models, usage information
Stakeholders: + Data scientists, data engineers

Approach: Enterprise Resource Graph
Purpose: + Managing and protecting the company’s data-related resources
Coverage: + All data systems, services, classification, access, and provenance
Stakeholders: + Run Time Developers, Security, Privacy
The approach to managing metadata will depend on the problems that are a priority to solve.
2. Who are the stakeholders?
Capture and store what user
data exists, where it is, and
who is responsible for it.
Privacy
Tell me where the sensitive
data is, how it is handled, who
has access, and who is
responsible for it.
Provide a platform to store
and share data best
practices, certifications,
documentation, and curated
data models.
Tell me what data there is, its
usability, how to use it and
who to go to for help.
Tell me who uses my data,
and give me a platform to
interact with them.
Enable automation within
data systems – registration,
provisioning, validation,
access controls, etc.
Stakeholders
Key to buy-in, executive sponsorship, and oversight.
Security Platforms
Data Governance
Data Producers
Data Consumers
Data Leadership
3. Where should we standardize and productize data?
What is a Data Product?
“A product that facilitates an end goal
through the use of data”
DJ Patil, former United States Chief Data Scientist
“Data as a product defines a new
concept, called data product that
embodies standardized characteristics
to make data valuable and usable.”
Zhamak Dehghani, Thoughtworks Director of Emerging
Technologies and founder of data mesh
Data Product ABCs
Explicit Knowledge
E
● Modeling Schemas
● Documentation
● Relationships with other Data Products
Downstream Consumers
D
● Current and Potential Consumers
● Use Cases
● Roadmap
Contracts & Expectations
C
● Data Constraints, Definitions, Tests
● SLAs, SLOs, Sharing Agreements, Consents, Purposes
● Performance, Scale, Maintainability, etc.
Boundaries
B
● What is it? What isn’t it?
● Where will it live?
● Inputs and Outputs
Accountability
A
● Who is the owner?
● Who defines the requirements?
● Who fixes it when it breaks?
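One way to make the Data Product ABCs concrete is to record them as a descriptor that can be checked for gaps. The sketch below is a minimal illustration, not a data.world or Snowflake schema: the `DataProduct` class, its field names, and the `missing_abcs` helper are all assumptions introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor with one group of fields per letter of the ABCs."""

    name: str
    # A: Accountability
    owner: str = ""
    requirements_owner: str = ""
    fixes_when_broken: str = ""
    # B: Boundaries
    description: str = ""
    location: str = ""
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    # C: Contracts & Expectations
    tests: list[str] = field(default_factory=list)
    slas: list[str] = field(default_factory=list)
    # D: Downstream Consumers
    consumers: list[str] = field(default_factory=list)
    use_cases: list[str] = field(default_factory=list)
    # E: Explicit Knowledge
    schema: dict[str, str] = field(default_factory=dict)
    documentation: str = ""


def missing_abcs(product: DataProduct) -> list[str]:
    """Return the ABC sections that are still empty or incomplete."""
    checks = {
        "Accountability": bool(
            product.owner and product.requirements_owner and product.fixes_when_broken
        ),
        "Boundaries": bool(product.description and product.location and product.outputs),
        "Contracts & Expectations": bool(product.tests or product.slas),
        "Downstream Consumers": bool(product.consumers or product.use_cases),
        "Explicit Knowledge": bool(product.schema and product.documentation),
    }
    return [section for section, complete in checks.items() if not complete]


# Hypothetical product that so far only has an owner and its boundaries filled in.
orders = DataProduct(
    name="orders_daily",
    owner="orders-and-sales-domain",
    description="One row per order per day",
    location="sales.orders_daily",
    outputs=["orders_daily"],
)
print(missing_abcs(orders))
```

A check like this could run whenever a product is registered in the catalog, so gaps in ownership or contracts surface before consumers depend on the product.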
What is a Data Product?
Data Producer A
Internal Data
API
Data Product(s)
Data Consumer B
Data Consumer A
Data Platform
Dataset
The Cloud-Native Data Catalog
What is a Data Product?
Data Producer
A
Internal Data
API
Data Product(s)
Data Producer
B
Internal Data
API
Data Producer
C
Internal Data
API
Data
Consumer C
Data
Consumer B
Data
Consumer A
Data Platform
Aggregate or “Enterprise”
Data Product(s)
Data Mesh Reference Architecture
Domain: Customer
Domain: Sales
Domain: Products
Domain: Marketing
Domain: Customer 360
Inventory of shared
data products
Snowflake
Reader Account
Snowflake Data Cloud
Consumers
Interoperability Standards, Federated Governance, 3rd-Party Tools
Snowflake Data Sharing as the preferred interoperability standard. Data Marketplace makes data discoverable.
Data Exchange / Catalog for
Consumers
• Connects providers to consumers
• Inventory of available assets
• No central storage of shared data
• Providers retain full control over shared
assets (data, functions)
• Consumers access live provider data, no
copies or ETL required. Register shared
data for local SQL access in their
environment (no copy)
Data domains:
• Can consume and share data or
functions
• Control access policies, data masking,
etc. for downstream consumers
• Can share external tables, i.e. provide
access to data outside of Snowflake
• Can provide reader accounts for
non-Snowflake consumers
Data Catalog for Producers:
• Technical Metadata Inventory, Lineage,
Sensitive Data, Business Glossary
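The exchange properties listed above (an inventory of assets, no central storage, provider-controlled access, live reads without copies) can be illustrated with a small in-memory model. This is a conceptual sketch only and does not use Snowflake's API: the `Listing` and `Exchange` classes and all names in the example are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Listing:
    """Catalog entry that points at provider data instead of copying it."""

    product: str
    provider: str
    read: Callable[[], list]  # accessor owned by the provider


class Exchange:
    """Inventory of shared data products; it stores listings, never the data."""

    def __init__(self) -> None:
        self._listings: dict[str, Listing] = {}
        self._grants: dict[str, set[str]] = {}

    def publish(self, listing: Listing) -> None:
        self._listings[listing.product] = listing
        self._grants.setdefault(listing.product, set())

    def grant(self, provider: str, product: str, consumer: str) -> None:
        # Providers retain full control: only the owner can grant access.
        if self._listings[product].provider != provider:
            raise PermissionError("only the provider controls access")
        self._grants[product].add(consumer)

    def inventory(self) -> list[str]:
        return sorted(self._listings)

    def query(self, consumer: str, product: str) -> list:
        if consumer not in self._grants.get(product, set()):
            raise PermissionError(f"{consumer} has no access to {product}")
        return self._listings[product].read()


# Hypothetical provider data that stays in the provider's environment.
customer_rows = [{"id": 1, "segment": "retail"}]

exchange = Exchange()
exchange.publish(Listing("customer_360", "customer-domain", lambda: customer_rows))
exchange.grant("customer-domain", "customer_360", "sales-analysts")

# The provider adds a row; the consumer sees it on the next read, with no copy or ETL.
customer_rows.append({"id": 2, "segment": "wholesale"})
print(exchange.inventory())
print(len(exchange.query("sales-analysts", "customer_360")))
```

Because the exchange holds only a reference to the provider's accessor, revoking a grant or changing the data takes effect immediately for every consumer.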
3rd
party
marketing
agency
Reseller
Sales
Analysts
Churn &
Retention
Business
optimization
Finance &
Controlling
Data Sources
Global and Multi-Cloud Data Mesh
Data Domain 1
Data Domain 2
Data Domain 3
Data Domain 5
Data Domain 4
Interoperability Standards, Federated Governance, 3rd-Party Tools
US East
FRA
Snowflake
Reader Account
Consumers
Snowflake enables a truly global and multi-cloud data mesh across cloud platforms and regions.
• Data sources, data domains, and
consumers can sit in different regions
and different cloud platforms
• Snowflake enables a truly global and
multi-cloud data mesh
Tokyo
Zurich
Snowflake Data Cloud
Data Sources
Inventory of shared
data products
GOVERNANCE IN THE DATA CLOUD
Know, protect, and unlock your data
Know your data Protect your data Unlock your data
Object Tagging
Auto Classification
Object
Dependencies**
Access History
(writes)**
Access History
(data access audit)
What
Where
Who
Row Access Policies
Dynamic Data Masking
External Tokenization
Conditional Masking
Secure Data Sharing
Data Exchange
Data Marketplace
Object
Dependencies
(impact analysis)
Access History
(data lineage)
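To show what the "protect" features above mean in practice, here is a conceptual Python sketch of a masking policy and a row access policy applied at query time. It is not Snowflake syntax or a Snowflake API; the `Session` class, the role names, and the sample rows are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical query context: who is asking and from which region."""

    role: str
    region: str


def mask_email(value: str, session: Session) -> str:
    """Masking policy: only the privacy role sees the full value."""
    if session.role == "PRIVACY_OFFICER":
        return value
    _, _, domain = value.partition("@")
    return "***@" + domain


def row_visible(row: dict, session: Session) -> bool:
    """Row access policy: admins see every row, others only their own region."""
    return session.role == "DATA_ADMIN" or row["region"] == session.region


def query(rows: list[dict], session: Session) -> list[dict]:
    """Apply both policies at read time; the stored rows are never modified."""
    return [
        {**row, "email": mask_email(row["email"], session)}
        for row in rows
        if row_visible(row, session)
    ]


rows = [
    {"email": "ana@example.com", "region": "EMEA"},
    {"email": "bo@example.com", "region": "APAC"},
]

print(query(rows, Session(role="ANALYST", region="EMEA")))
# prints [{'email': '***@example.com', 'region': 'EMEA'}]
```

The same stored table yields different results per role, which is what lets a domain share one data product with many consumers instead of maintaining a filtered copy for each.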
4. Who is responsible?
Who is responsible?
Whether you call them data product managers, data stewards, data owners, data
advocates, data custodians, or data trustees…
Let’s revisit Accountability of the Data Product ABCs Framework:
● Who is the owner?
● Who defines the requirements?
● Who fixes it when it breaks?
● Who defines the roadmap?
● Who has the expertise?
What are the fewest number of critical “hats to wear”?
Data Producer Data Consumer
Data Platform
Data Engineering
Data Producer Data Consumer
Data Platform
Data Management
Changing the Paradigm
Data Management as an Intermediary Direct Data Producer and Data
Consumer Collaboration
Data Mesh: Domain-centric Responsibility
[Diagram: data sources from different domains connected to consumers through ELT/ETL jobs and data models, grouped by domain]
Domains: Customer; Helpdesk & Support; Products; Orders & Sales; Marketing & Promotions; Customer 360
Shared layer: Interoperability Standards, Federated Governance, Data Catalog
Layers: Data Sources, Data Integration, Data Management, Data Consumption
5. How to be agile?
What is Agile Data Governance?
The process of creating and improving data
assets by iteratively capturing knowledge as
data producers and consumers work together
so that everyone can benefit.
Empowering the usage of data safely.
It adapts the deeply proven best practices of
Agile and Open software development to data
and analytics.
Agile Data Governance Process: iterate!
The time impact of being fast, incremental, and iterative
Define policies
Release
Refine
Build workflows
Define standards and principles
Use Case 1
Define policies
Release
Build workflows
Define standards/principles
Analysis, Insight, Value
Measure, Learn, Iterate
Use Case 2
Define policies
Release
Build workflows
Define standards/principles
Analysis, Insight, Value
Measure, Learn, Iterate
Use Case 3
Define policies
Release
Build workflows
Define standards/principles
Analysis, Insight, Value
Measure, Learn, Iterate
Use Case 4
Define policies
Release
Build workflows
Define standards/principles
Analysis, Insight, Value
Measure, Learn, Iterate
Takeaways
What is the scope?
● Identify the Domains. You are already doing the work,
they exist!
● Depends on the problems that are a priority to solve:
Analytics, Data Platform, Enterprise Resources
Who are the stakeholders of your data catalog?
● Always need Data Leadership
● Consumers, Producers, Governance, Privacy,
Security, Platforms
Where to Standardize/Productize Data?
● Data Product ABCs: Accountability, Boundaries,
Contracts & Expectations, Downstream
Consumers, Explicit Knowledge
● Consumption, Data Mgmt, Data Producing Systems
Who is responsible?
● Accountability: Owner, Requirements, Who
Fixes, Roadmap, Expertise
● Consumption, Data Mgmt, Data Producing
Systems
How to be agile?
● Empowering the usage of data safely.
● Develop a backlog of questions based on end user
business value
● Sprints, Peer Review, Collaborate, Iterate
Learn more about data mesh governance
What’s inside?
How to…
● Establish a framework for treating data as a product
● Find the right balance of decentralization and centralization
● Transform data into knowledge
Download it here:
data.world/resources/reports-and-tools/data-mesh-governance-white-paper
Halmar dropshipping via API with DroFx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 

Five Things to Consider About Data Mesh and Data Governance

  • 1. data.world: How to launch a data catalog in minutes. Five things to consider about Data Mesh and Data Governance. Speakers: Tim Gasper, VP of Product, data.world; Paul Gancz, Partner Solutions Architect, Snowflake; Juan Sequeda, Principal Scientist, data.world.
  • 2. Better together: The Data Cloud + The Modern Data Catalog. The Data Cloud: ONE platform, MANY workloads, NO data silos. The Modern Data Catalog: make data discovery, governance, and analysis easy. The most powerful combined data mesh solution to eliminate data silos and democratize access to well-governed data products.
  • 3. Why Data Mesh? What is the problem? Monolithic approaches to data don’t scale socially; data is treated as an afterthought. Why do we care? Centralized processes and teams become a bottleneck for the business; data value is being left untapped.
  • 4. Data Mesh Principles. Distribute responsibility for data pipelines and data quality to people with domain knowledge. Serve data as a product using a common self-service IT infrastructure platform. (1) Domain-Centric Ownership & Architecture: data pipelines owned by teams with domain knowledge; domains own cleansing, refinement, historization, pre-aggregation, etc.; domains responsible for governance, lineage, etc. (2) Data as-a-Product: domains treat data with consumers in mind; data is discoverable; data is easy to obtain and use; data is documented; domains responsible for the quality of their data. (3) Self-Service Data Platform: common set of tools across domains; domain-agnostic; easy to use and low maintenance to support; easy to deploy repeatable patterns for data cleansing, transformation, automation, storage, security, governance, sharing. (4) Federated Governance: global interoperability standards across domains; define and use global data governance policies; define and apply governance within each domain and propagate downstream. Source: Zhamak Dehghani, https://martinfowler.com/articles/data-monolith-to-mesh.html , https://martinfowler.com/articles/data-mesh-principles.html
  • 5. DATA GOVERNANCE CHALLENGES. Data Is Everywhere: must be able to eliminate silos inside and outside your organization. Managing Data Is Unnecessarily Complex: knowing what your data is, and how it is being used, is hard. Security and Governance Are Inherently Rigid: requires managing risk and changing regulations, while getting the most from your data.
  • 6. DATA GOVERNANCE IN THE DATA CLOUD. Know Your Data: understand, classify, and track data and its usage. Protect Your Data: secure sensitive data with policy-based access controls. Unlock Your Data: securely collaborate and share data across teams.
  • 7. DATA GOVERNANCE: what it has been... Risk avoidance and compliance; top-down policies; cumbersome processes.
  • 8. DATA GOVERNANCE: what it needs to be... Rules of cooperation and collaboration; process of data & analytics together; capture knowledge in real time.
  • 9. Data Governance and Data Catalogs: what is the goal of data governance, what do catalogs do, and how do they help? Governance is now about data discoverability, not just data protection. While application silos pose a governance challenge, inclusive, agile data governance approaches pose solutions. Governance needs to be a benefit, not a burden. The friction has to go away. Business users don’t want to install software for governance; SaaS removes all the friction and is the way to go. Understand and trust your data with profiling, sampling, and lineage. Everyone (producers and consumers) actively contributes to data as they use it. Accelerate time to value and uncover insights. Cloud-native and multi-tenant approaches are highly available, scale bigger, perform better, and evolve faster.
  • 10. Five things to consider about Data Mesh and Data Governance: (1) What is the scope? (2) Who are the stakeholders? (3) Where should we standardize and productize data? (4) Who is responsible? (5) How to be agile?
  • 11. 1. What is the scope?
  • 12. Example Architecture (diagram: Data sources, a web of ETL/ELT pipelines and Data Models, Consumers)
  • 13. What are your domains? Source: Zhamak Dehghani, martinfowler.com
  • 14. Data Mesh: Domain-centric Architecture (diagram: data sources from different domains, ETL/ELT pipelines and Data Models, Consumers; Domains: Customer, Helpdesk & Support, Products, Orders & Sales, Marketing & Promotions, Customer 360; shared layer: Interoperability Standards, Federated Governance, Data Catalog). Domain-centric ownership of data sources, pipelines, and data quality; ownership sits with domain knowledge → better data quality for consumers; domain teams can react faster to source format changes or quality issues; overall easier to scale the number of sources & consumers; data assets offered as products; “serve & pull” instead of “push & ingest” model.
  • 15. How scope affects your data catalog. Three approaches, where “+” marks what an approach adds to the one before it. Analytics Catalog: Purpose: enabling Data Consumers to discover assets; Coverage: Data Lake and Data Mart tables and related reports; Stakeholders: Analysts, BI Team, Report Writers, Report Users. Data Platform Catalog: Purpose: + enabling the management of the Data Platform (automation and observability); Coverage: + upstream data sources, lineage, streaming data, ML models, usage information; Stakeholders: + Data Scientists, Data Engineers. Enterprise Resource Graph: Purpose: + managing and protecting the company’s data-related resources; Coverage: + all data systems, services, classification, access, and provenance; Stakeholders: + Run Time Developers, Security, Privacy. The approach to managing metadata will depend on the problems that are a priority to solve.
  • 16. 2. Who are the stakeholders?
  • 17. Stakeholders: Privacy, Security, Platforms, Data Governance, Data Producers, Data Consumers, Data Leadership. What they ask of the catalog: Capture and store what user data exists, where it is, and who is responsible for it. Tell me where the sensitive data is, how it is handled, who has access, and who is responsible for it. Provide a platform to store and share data best practices, certifications, documentation, and curated data models. Tell me what data there is, its usability, how to use it, and who to go to for help. Tell me who uses my data, and give me a platform to interact with them. Enable automation within data systems: registration, provisioning, validation, access controls, etc. Key to buy-in, executive sponsorship, and oversight.
  • 18. 3. Where should we standardize and productize data?
  • 19. What is a Data Product? “A product that facilitates an end goal through the use of data” (DJ Patil, former United States Chief Data Scientist). “Data as a product defines a new concept, called data product that embodies standardized characteristics to make data valuable and usable.” (Zhamak Dehghani, Thoughtworks Director of Emerging Technologies and founder of data mesh)
  • 20. Data Product ABCs. A, Accountability: Who is the owner? Who defines the requirements? Who fixes it when it breaks? B, Boundaries: What is it? What isn’t it? Where will it live? Inputs and outputs. C, Contracts & Expectations: Data constraints, definitions, tests; SLAs, SLOs, sharing agreements, consents, purposes; performance, scale, maintainability, etc. D, Downstream Consumers: Current and potential consumers; use cases; roadmap. E, Explicit Knowledge: Modeling schemas; documentation; relationships with other Data Products.
  • 21. What is a Data Product? (diagram labels: Data Producer A, Internal Data API, Dataset, Data Product(s), Data Platform, Data Consumer A, Data Consumer B). data.world: The Cloud-Native Data Catalog
  • 22. What is a Data Product? (diagram labels: Data Producers A, B, and C, each with an Internal Data API; Data Product(s); Aggregate or “Enterprise” Data Product(s); Data Platform; Data Consumers A, B, and C)
  • 23. Data Mesh Reference Architecture (diagram: Data Sources; Domains: Customer, Sales, Products, Marketing, Customer 360; Snowflake Data Cloud with an inventory of shared data products; Snowflake Reader Account; Consumers: 3rd party marketing agency, Reseller, Sales Analysts, Churn & Retention, Business optimization, Finance & Controlling). Interoperability Standards, Federated Governance, 3rd Party Tools: Snowflake Data Sharing as the preferred interoperability standard; Data Marketplace makes data discoverable. Data Exchange / Catalog for Consumers: connects providers to consumers; inventory of available assets; no central storage of shared data; providers retain full control over shared assets (data, functions); consumers access live provider data, no copies or ETL required, and register shared data for local SQL access in their environment (no copy). Data domains: can consume and share data or functions; control access policies, data masking, etc. for downstream consumers; can share external tables, i.e. provide access to data outside of Snowflake; can provide reader accounts for non-Snowflake consumers. Data Catalog for Producers: Technical Metadata Inventory, Lineage, Sensitive Data, Business Glossary.
  • 24. Global and Multi-Cloud Data Mesh (diagram: Data Sources; Data Domains 1 to 5 in the regions US East, FRA, Tokyo, and Zurich; Snowflake Data Cloud with an inventory of shared data products; Snowflake Reader Account; Consumers; shared layer: Interoperability Standards, Federated Governance, 3rd Party Tools). Snowflake enables a truly global and multi-cloud data mesh across cloud platforms and regions. Data sources, data domains, and consumers can sit in different regions and different cloud platforms.
  • 25. GOVERNANCE IN THE DATA CLOUD: know, protect, and unlock your data. Know your data (what, where, who): Object Tagging, Auto Classification, Object Dependencies**, Access History (writes)**, Access History (data access audit). Protect your data: Row Access Policies, Dynamic Data Masking, External Tokenization, Conditional Masking. Unlock your data: Secure Data Sharing, Data Exchange, Data Marketplace. Further labels on the slide: Object Dependencies (impact analysis), Access History (data lineage).
  • 26. 4. Who is responsible?
  • 27. Who is responsible? Whether you call them data product managers, data stewards, data owners, data advocates, data custodians, or data trustees… Let’s revisit Accountability in the Data Product ABCs framework: Who is the owner? Who defines the requirements? Who fixes it when it breaks? Who defines the roadmap? Who has the expertise? What is the smallest number of critical “hats to wear”?
  • 28. Changing the Paradigm: Data Management as an Intermediary versus Direct Data Producer and Data Consumer Collaboration (two diagrams; labels: Data Producer, Data Consumer, Data Platform, Data Engineering; Data Producer, Data Consumer, Data Platform, Data Management)
  • 29. Data Mesh: Domain-centric Responsibility (diagram: data sources from different domains, ETL/ELT pipelines and Data Models, Consumers; Domains: Customer, Helpdesk & Support, Products, Orders & Sales, Marketing & Promotions, Customer 360; shared layer: Interoperability Standards, Federated Governance, Data Catalog; layers: Data Consumption, Data Management, Data Integration, Data Sources)
  • 30. 5. How to be agile?
  • 31. What is Agile Data Governance? The process of creating and improving data assets by iteratively capturing knowledge as data producers and consumers work together so that everyone can benefit. Empowering the usage of data safely. It adapts the deeply proven best practices of Agile and Open software development to data and analytics.
  • 32. Agile Data Governance Process: iterate!
  • 33. The time impact of being fast, incremental, and iterative (diagram). One-pass track: define standards and principles, define policies, build workflows, release, refine. Iterative track, repeated for Use Cases 1 to 4: define standards/principles, define policies, build workflows, release; analysis, insight, value; measure, learn, iterate.
  • 34. Takeaways. What is the scope? Identify the domains: you are already doing the work, they exist! The scope depends on the problems that are a priority to solve: Analytics, Data Platform, Enterprise Resources. Who are the stakeholders of your data catalog? You always need Data Leadership; also Consumers, Producers, Governance, Privacy, Security, Platforms. Where to standardize/productize data? Data Product ABCs: Accountability, Boundaries, Contracts & Expectations, Downstream Consumers, Explicit Knowledge; Consumption, Data Mgmt, Data Producing Systems. Who is responsible? Accountability: Owner, Requirements, Who Fixes, Roadmap, Expertise; Consumption, Data Mgmt, Data Producing Systems. How to be agile? Empower the usage of data safely; develop a backlog of questions based on end user business value; Sprints, Peer Review, Collaborate, Iterate.
  • 35. Learn more about data mesh governance. What’s inside? How to… establish a framework for treating data as a product; find the right balance of decentralization and centralization; transform data into knowledge. Download it here: data.world/resources/reports-and-tools/data-mesh-governance-white-paper
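
The Data Product ABCs from slide 20 lend themselves to a small machine-readable descriptor. The following is only a minimal Python sketch of that idea, not anything taken from data.world or Snowflake: the `DataProduct` class, its field names, the `missing_abcs` helper, and the sample values are all hypothetical, and each of the five letters is reduced to a single representative field for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor with one representative field per ABC."""

    name: str
    # A: Accountability (who is the owner?)
    owner: str = ""
    # B: Boundaries (inputs and outputs)
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    # C: Contracts & Expectations (SLAs, constraints, tests)
    slas: list[str] = field(default_factory=list)
    # D: Downstream Consumers (current and potential consumers)
    consumers: list[str] = field(default_factory=list)
    # E: Explicit Knowledge (documentation, schemas)
    documentation: str = ""


def missing_abcs(product: DataProduct) -> list[str]:
    """Return the ABC sections that are still empty, in A-to-E order."""
    checks = {
        "Accountability": bool(product.owner),
        "Boundaries": bool(product.inputs and product.outputs),
        "Contracts & Expectations": bool(product.slas),
        "Downstream Consumers": bool(product.consumers),
        "Explicit Knowledge": bool(product.documentation),
    }
    return [section for section, filled in checks.items() if not filled]


if __name__ == "__main__":
    # Illustrative values only; none of these names come from the slides.
    orders = DataProduct(
        name="orders_daily",
        owner="orders-team",
        inputs=["raw.orders"],
        outputs=["mart.orders_daily"],
        consumers=["Sales Analysts"],
    )
    print(missing_abcs(orders))
    # prints ['Contracts & Expectations', 'Explicit Knowledge']
```

A check like this could run before a product is registered in a catalog, so that gaps in accountability or contracts surface early; which fields are mandatory would be a decision for each organization's federated governance.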