SlideShare ist ein Scribd-Unternehmen logo
1 von 48
Downloaden Sie, um offline zu lesen
Architect’s Open-Source
Guide for a Data Mesh
Architecture
Lena Hall
Microsoft
Lena Hall
Director at Microsoft
Azure Engineering
ü Architecture
ü Cloud
ü Data
ü ML/AI
lenadroid
Entry Point
How to Move Beyond a Monolithic Data Lake to a Distributed
Data Mesh
https://martinfowler.com/articles/data-monolith-to-mesh.html
Data Mesh Principles and Logical Architecture
https://martinfowler.com/articles/data-mesh-principles.html
Slack for Data-Mesh-Learning
https://launchpass.com/data-mesh-learning
lenadroid
Talk Snapshot
• What is Data Mesh
• When is Data Mesh a Good Idea
• Core Principles and Concepts
• Example: Drone Delivery Service
• Challenges
• OSS and Open Standards
lenadroid
When and Why
Data Mesh
@lenadroid
Data Mesh is Not For Everyone
Challenges Indicating Data Mesh
May Be Considered
@lenadroid
Drone Delivery Service
lenadroid
WHYs
• Ambiguity in Ownership and Responsibility
• Slow Change due to Coupling to Monolithic System
• Data Engineering Resources Bottleneck
lenadroid
Ideas Composing Data Mesh Concept
@lenadroid
Core Ideas
ü Decentralized teams and data ownership
lenadroid
Core Ideas
ü Decentralized teams and data ownership
ü Data Products powered by Domain Driven Design
lenadroid
High-Level View of a Data Product
lenadroid
Core Ideas
ü Decentralized teams and data ownership
ü Data Products powered by Domain Driven Design
ü Self-serve Shared Data Infrastructure
lenadroid
Core Ideas
ü Decentralized teams and data ownership
ü Data Products powered by Domain Driven Design
ü Self-serve Shared Data Infrastructure
ü Global Federated Governance
lenadroid
Drone Delivery Service Data Products
@lenadroid
lenadroid
Core Principles for Data Products
@lenadroid
lenadroid
DISCOVERABLE
Core Principles for Data Products
lenadroid
DISCOVERABLE
SELF-DESCRIBING
Core Principles for Data Products
lenadroid
DISCOVERABLE
SELF-DESCRIBING
ADDRESSABLE
Core Principles for Data Products
lenadroid
DISCOVERABLE
SELF-DESCRIBING
ADDRESSABLE
SECURE
Core Principles for Data Products
lenadroid
DISCOVERABLE
SELF-DESCRIBING
ADDRESSABLE
SECURE
TRUSTWORTHY
Core Principles for Data Products
lenadroid
DISCOVERABLE
SELF-DESCRIBING
ADDRESSABLE
SECURE
TRUSTWORTHY
INTEROPERABLE
Core Principles for Data Products
Input Ports Questions
• Data Source - Where is the data coming from? External dataset or
another data product?
• Data Format - What is the format of the source input?
• Rate of Updates - How frequently does the input need to be updated?
Output Ports Questions
• End-consumers - Who are the end-users of the data product?
• Data purpose - What are they planning to do with the data outputs?
• Data access - Who needs to have access? How do they prefer to access
the data output?
• Data address - How do they prefer to access the data output?
• Data Format - What format of the data do they expect?
Identity and Permission Policies Questions
• Which resources can this data product be allowed to access?
• Which data products or users can read which output ports of this data
product?
• Are all sensitive resources this data product offers protected according
their required privacy standards (e.g. HIPAA, GDPR, PII, CCPA, etc.)
• Is the permissions policy stored and managed in the same package
as the data product?
Data Product Action Questions
• What is the action that needs to happen to produce the outcomes for the
end-users?
• What are the required adjustments, transformations, filters, updates, or
quality improvements to the input data?
Operational Questions
• How can this data product be discovered and how should it be described to
other data products that might want to consume it?
• Which metadata and information should it make available to the end-
users?
• Where and how should data product versioning be managed during
updates to ensure consistency with how the end-users consume it?
• Which SLAs or SLOs does the data product provide?
• Which product success metrics can this data product expose and keep
track of? (adoption, usage, quality)
• Is the automation/resource orchestration logic stored in the same
package?
Other Questions
• Is this product not tightly coupled to any other data source, data product,
or any other resource that makes him not interoperable?
• Does this data product follow the defined global governance standards and
practices defined by the organization?
• Does this data product have any implementation details that could
interfere with its portability?
Cheat Sheet for Planning Data Products lenadroid
Self-Serve Shared Infrastructure
@lenadroid
REAL-TIME DATA
INGESTION
PROCESSING
OBJECT STORAGE
COLUMNAR STORAGE
PROCESSING
COLUMNAR STORAGE
INCOMING
REQUEST
WEB SERVICE
PROCESSING
lenadroid
Types of Workloads Within a Data Product
WEB SERVICE
PROCESSING
lenadroid
It can look like this
Azure Data
Lake
WEB SERVICE
lenadroid
Or, it can look like this
Google
Storage
lenadroid
Self-Serve Shared Infrastructure
SHARED PLATFORM FOR
STREAMING INGESTION
SHARED PLATFORM FOR
RAW DATA STORAGE
SHARED PLATFORM FOR
COLUMNAR DATA STORAGE
SHARED PLATFORM FOR
CONTAINER WORKLOADS
SHARED PLATFORM FOR
CONTINUOUS DELIVERY
SHARED PLATFORM
FOR OBSERVABILITY
AND MORE…
DEPENDING ON THE ORGANIZATION
DATA CATALOGUE
lenadroid
SHARED PLATFORM FOR
CONTAINER WORKLOADS
SHARED PLATFORM FOR
CONTINUOUS DELIVERY
SHARED PLATFORM
FOR OBSERVABILITY
DISCOVERABLE
SELF-DESCRIBING
ADDRESSABLE
SECURE
TRUSTWORTHY
INTEROPERABLE
Data Mesh
SHARED PLATFORM FOR
STREAMING INGESTION
SHARED PLATFORM FOR
RAW DATA STORAGE
SHARED PLATFORM FOR
COLUMNAR DATA STORAGE
DATA CATALOGUE
Wait, What About the OSS Tools for Data Mesh??
@lenadroid
Challenges with Data Mesh
@lenadroid
Challenges
• Cost questions
• Lack of end-to-end examples
• Efforts to shift from centralized architecture to decentralization-
friendly techniques
• Automation required for enabling creating data products
• Underestimating the importance organizational aspects
lenadroid
Considerations for Technology Choices
@lenadroid
Considerations for Technology Choices
• Workload sharing and multi-tenancy
• No-copy data and compute mobility support
• Granularity of access-control
• Richness of automation and extensibility capabilities
• Flexibility and elasticity
• Provider-agnostic/multi-cloud operations support
• Variety of limitations
(quotas, data volume, resource count, etc.)
• Open Standards, Open Protocols, Open-Source Integrations
lenadroid
Examples of Data Mesh-friendly Technologies
@lenadroid
data
Anthos
Azure Arc
Data Catalogue, Data Lineage,
Data Governance
OSS Data Analytics, Data
Processing, Data Querying
Cloud Storage
Open Formats
Data Ingestion, Streaming
Data Orchestration, Workflows
OSS Storage
Products for Data Analytics and
Processing
Data Visualization and BI Tools
Data Experimentation
Cross-Platform Concepts and Tools
Multi and Hybrid Cloud Tools
Amazon S3
Azure Data Lake
Google Storage
Infrastructure Automation
lenadroid
Data Governance Systems
• Metadata
• Data lineage
• Data schemas
• Data relationships
• Data classification
• Data security
• Data catalog
lenadroid
Open Formats
• Open standard
• Atomic updates, serializable isolation, transactions
• Concurrent operations
• Versioning, rollbacks, time-travel
• Schema Evolution
• Scale, Efficiency, Data Volumes
• Compatibility with existing data stores and languages
lenadroid
Data Platforms (Cloud or OSS)
• Separation of storage and compute
• Support for no-copy data sharing
• Bringing compute to data
• Fine-tuned granularity of permissions for access
• Support for automation and resource management
• Open standards and interoperability with other platforms and
tools for governance, visualization, analytics, etc.
lenadroid
Multi-Cloud Infrastructure Management
• Terraform
Open-source infrastructure as code software tool that enables you to safely and
predictably create, change, and improve infrastructure.
• Pulumi
Open-source infrastructure as code SDK that enables you to create, deploy, and
manage infrastructure on any cloud, using your favorite languages.
• Crossplane
Assemble infrastructure from multiple vendors, and expose higher level self-service
APIs for application teams to consume, without having to write any code.
lenadroid
Multi-Cloud Workload Portability
• Azure Arc
Build cloud-native apps anywhere, at scale. Run Azure services in any Kubernetes
environment, whether it’s on-premises, multi-cloud, or at the edge
• Google Athnos
A modern application management platform that provides a consistent development
and operations experience for cloud and on-premises environments
lenadroid
Kubernetes Open-Standard Technologies
• Open Application Model
An open standard for defining cloud native apps.
KubeVella - https://kubevela.io/docs/concepts
• Open Policy Agent
Declarative Policy-as-Code, enables portability, combination with Infra-as-Code.
https://www.openpolicyagent.org/docs/latest
• Service Catalog
Provision managed services and make them available within a Kubernetes cluster.
https://kubernetes.io/docs/concepts/extend-kubernetes/service-catalog/
NOT AN EXHAUSTIVE LIST
lenadroid
Benefits Brought by Data Mesh
• Data Quality
• Tailored resource and focus allocation
• Organizational cohesion while allowing flexibility
• Reducing complexity
• Democratizing creating value
• Better understanding of value and innovation opportunities
• Empowering a more consistent and fast change
@lenadroid
Important Focus Areas for Technology Providers
• Open Standards, Open Protocols, Open-Source Integrations
• Workload sharing and multi-tenancy
• No-copy data and compute mobility support
• Granularity of access-control
• Richness of automation and extensibility capabilities
• Flexibility and elasticity
• Provider-agnostic/multi-cloud operations support
• Variety of limitations
(quotas, data volume, resource count, etc.)
@lenadroid
Data Mesh will drive better Interoperability, Open
Standards, and Data Quality in the Industry
@lenadroid
Thank you!
Follow lenadroid for more insights

Weitere ähnliche Inhalte

Was ist angesagt?

How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for DinnerKent Graziano
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Databricks
 
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Tristan Baker
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...HostedbyConfluent
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshConfluentInc1
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureDmitry Anoshin
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Cathrine Wilhelmsen
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as ProductDATAVERSITY
 

Was ist angesagt? (20)

How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
 
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
 
Data mesh
Data meshData mesh
Data mesh
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as Product
 

Ähnlich wie Architect’s Open-Source Guide for a Data Mesh Architecture

Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?Denodo
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesDATAVERSITY
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationDenodo
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersRevolution Analytics
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketDremio Corporation
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeSaurabh K. Gupta
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Perficient, Inc.
 
AzureDay - Introduction Big Data Analytics.
AzureDay  - Introduction Big Data Analytics.AzureDay  - Introduction Big Data Analytics.
AzureDay - Introduction Big Data Analytics.Łukasz Grala
 
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...Memoori
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Connected development data
Connected development dataConnected development data
Connected development dataRob Worthington
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringDurga Gadiraju
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which DataWorks Summit
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolutionitnewsafrica
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 

Ähnlich wie Architect’s Open-Source Guide for a Data Mesh Architecture (20)

Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
Finding Your Ideal Data Architecture: Data Fabric, Data Mesh or Both?
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech Proposals
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data Lake
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 
AzureDay - Introduction Big Data Analytics.
AzureDay  - Introduction Big Data Analytics.AzureDay  - Introduction Big Data Analytics.
AzureDay - Introduction Big Data Analytics.
 
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...
Simplifying Building Automation: Leveraging Semantic Tagging with a New Breed...
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Connected development data
Connected development dataConnected development data
Connected development data
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 

Mehr von Databricks

Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionDatabricks
 
Jeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityJeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityDatabricks
 

Mehr von Databricks (20)

Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack Detection
 
Jeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityJeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and Quality
 

Kürzlich hochgeladen

BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 

Kürzlich hochgeladen (20)

BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 

Architect’s Open-Source Guide for a Data Mesh Architecture

  • 1. Architect’s Open-Source Guide for a Data Mesh Architecture Lena Hall Microsoft
  • 2. Lena Hall Director at Microsoft Azure Engineering ü Architecture ü Cloud ü Data ü ML/AI lenadroid
  • 3. Entry Point How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh https://martinfowler.com/articles/data-monolith-to-mesh.html Data Mesh Principles and Logical Architecture https://martinfowler.com/articles/data-mesh-principles.html Slack for Data-Mesh-Learning https://launchpass.com/data-mesh-learning lenadroid
  • 4. Talk Snapshot • What is Data Mesh • When is Data Mesh a Good Idea • Core Principles and Concepts • Example: Drone Delivery Service • Challenges • OSS and Open Standards lenadroid
  • 5. When and Why Data Mesh @lenadroid
  • 6. Data Mesh is Not For Everyone
  • 7. Challenges Indicating Data Mesh May Be Considered @lenadroid
  • 9. WHYs • Ambiguity in Ownership and Responsibility • Slow Change due to Coupling to Monolithic System • Data Engineering Resources Bottleneck lenadroid
  • 10. Ideas Composing Data Mesh Concept @lenadroid
  • 11. Core Ideas ü Decentralized teams and data ownership lenadroid
  • 12. Core Ideas ü Decentralized teams and data ownership ü Data Products powered by Domain Driven Design lenadroid
  • 13. High-Level View of a Data Product lenadroid
  • 14. Core Ideas ü Decentralized teams and data ownership ü Data Products powered by Domain Driven Design ü Self-serve Shared Data Infrastructure lenadroid
  • 15. Core Ideas ü Decentralized teams and data ownership ü Data Products powered by Domain Driven Design ü Self-serve Shared Data Infrastructure ü Global Federated Governance lenadroid
  • 16. Drone Delivery Service Data Products @lenadroid
  • 18. Core Principles for Data Products @lenadroid
  • 25. Input Ports Questions • Data Source - Where is the data coming from? External dataset or another data product? • Data Format - What is the format of the source input? • Rate of Updates - How frequently does the input need to be updated? Output Ports Questions • End-consumers - Who are the end-users of the data product? • Data purpose - What are they planning to do with the data outputs? • Data access - Who needs to have access? How do they prefer to access the data output? • Data address - How do they prefer to access the data output? • Data Format - What format of the data do they expect? Identity and Permission Policies Questions • Which resources can this data product be allowed to access? • Which data products or users can read which output ports of this data product? • Are all sensitive resources this data product offers protected according their required privacy standards (e.g. HIPAA, GDPR, PII, CCPA, etc.) • Is the permissions policy stored and managed in the same package as the data product? Data Product Action Questions • What is the action that needs to happen to produce the outcomes for the end-users? • What are the required adjustments, transformations, filters, updates, or quality improvements to the input data? Operational Questions • How can this data product be discovered and how should it be described to other data products that might want to consume it? • Which metadata and information should it make available to the end- users? • Where and how should data product versioning be managed during updates to ensure consistency with how the end-users consume it? • Which SLAs or SLOs does the data product provide? • Which product success metrics can this data product expose and keep track of? (adoption, usage, quality) • Is the automation/resource orchestration logic stored in the same package? Other Questions • Is this product not tightly coupled to any other data source, data product, or any other resource that makes him not interoperable? • Does this data product follow the defined global governance standards and practices defined by the organization? • Does this data product have any implementation details that could interfere with its portability? Cheat Sheet for Planning Data Products lenadroid
  • 27. REAL-TIME DATA INGESTION PROCESSING OBJECT STORAGE COLUMNAR STORAGE PROCESSING COLUMNAR STORAGE INCOMING REQUEST WEB SERVICE PROCESSING lenadroid Types of Workloads Within a Data Product
  • 28. WEB SERVICE PROCESSING lenadroid It can look like this Azure Data Lake
  • 29. WEB SERVICE lenadroid Or, it can look like this Google Storage
  • 30. lenadroid Self-Serve Shared Infrastructure SHARED PLATFORM FOR STREAMING INGESTION SHARED PLATFORM FOR RAW DATA STORAGE SHARED PLATFORM FOR COLUMNAR DATA STORAGE SHARED PLATFORM FOR CONTAINER WORKLOADS SHARED PLATFORM FOR CONTINUOUS DELIVERY SHARED PLATFORM FOR OBSERVABILITY AND MORE… DEPENDING ON THE ORGANIZATION DATA CATALOGUE
  • 31. lenadroid SHARED PLATFORM FOR CONTAINER WORKLOADS SHARED PLATFORM FOR CONTINUOUS DELIVERY SHARED PLATFORM FOR OBSERVABILITY DISCOVERABLE SELF-DESCRIBING ADDRESSABLE SECURE TRUSTWORTHY INTEROPERABLE Data Mesh SHARED PLATFORM FOR STREAMING INGESTION SHARED PLATFORM FOR RAW DATA STORAGE SHARED PLATFORM FOR COLUMNAR DATA STORAGE DATA CATALOGUE
  • 32. Wait, What About the OSS Tools for Data Mesh?? @lenadroid
  • 33. Challenges with Data Mesh @lenadroid
  • 34. Challenges • Cost questions • Lack of end-to-end examples • Efforts to shift from centralized architecture to decentralization- friendly techniques • Automation required for enabling creating data products • Underestimating the importance organizational aspects lenadroid
  • 35. Considerations for Technology Choices @lenadroid
  • 36. Considerations for Technology Choices • Workload sharing and multi-tenancy • No-copy data and compute mobility support • Granularity of access-control • Richness of automation and extensibility capabilities • Flexibility and elasticity • Provider-agnostic/multi-cloud operations support • Variety of limitations (quotas, data volume, resource count, etc.) • Open Standards, Open Protocols, Open-Source Integrations lenadroid
  • 37. Examples of Data Mesh-friendly Technologies @lenadroid
  • 38. data Anthos Azure Arc Data Catalogue, Data Lineage, Data Governance OSS Data Analytics, Data Processing, Data Querying Cloud Storage Open Formats Data Ingestion, Streaming Data Orchestration, Workflows OSS Storage Products for Data Analytics and Processing Data Visualization and BI Tools Data Experimentation Cross-Platform Concepts and Tools Multi and Hybrid Cloud Tools Amazon S3 Azure Data Lake Google Storage Infrastructure Automation lenadroid
  • 39. Data Governance Systems • Metadata • Data lineage • Data schemas • Data relationships • Data classification • Data security • Data catalog lenadroid
  • 40. Open Formats • Open standard • Atomic updates, serializable isolation, transactions • Concurrent operations • Versioning, rollbacks, time-travel • Schema Evolution • Scale, Efficiency, Data Volumes • Compatibility with existing data stores and languages lenadroid
  • 41. Data Platforms (Cloud or OSS) • Separation of storage and compute • Support for no-copy data sharing • Bringing compute to data • Fine-tuned granularity of permissions for access • Support for automation and resource management • Open standards and interoperability with other platforms and tools for governance, visualization, analytics, etc. lenadroid
  • 42. Multi-Cloud Infrastructure Management • Terraform Open-source infrastructure as code software tool that enables you to safely and predictably create, change, and improve infrastructure. • Pulumi Open-source infrastructure as code SDK that enables you to create, deploy, and manage infrastructure on any cloud, using your favorite languages. • Crossplane Assemble infrastructure from multiple vendors, and expose higher level self-service APIs for application teams to consume, without having to write any code. lenadroid
  • 43. Multi-Cloud Workload Portability • Azure Arc Build cloud-native apps anywhere, at scale. Run Azure services in any Kubernetes environment, whether it’s on-premises, multi-cloud, or at the edge • Google Athnos A modern application management platform that provides a consistent development and operations experience for cloud and on-premises environments lenadroid
  • 44. Kubernetes Open-Standard Technologies • Open Application Model An open standard for defining cloud native apps. KubeVella - https://kubevela.io/docs/concepts • Open Policy Agent Declarative Policy-as-Code, enables portability, combination with Infra-as-Code. https://www.openpolicyagent.org/docs/latest • Service Catalog Provision managed services and make them available within a Kubernetes cluster. https://kubernetes.io/docs/concepts/extend-kubernetes/service-catalog/ NOT AN EXHAUSTIVE LIST lenadroid
  • 45. Benefits Brought by Data Mesh • Data Quality • Tailored resource and focus allocation • Organizational cohesion while allowing flexibility • Reducing complexity • Democratizing creating value • Better understanding of value and innovation opportunities • Empowering a more consistent and fast change @lenadroid
  • 46. Important Focus Areas for Technology Providers • Open Standards, Open Protocols, Open-Source Integrations • Workload sharing and multi-tenancy • No-copy data and compute mobility support • Granularity of access-control • Richness of automation and extensibility capabilities • Flexibility and elasticity • Provider-agnostic/multi-cloud operations support • Variety of limitations (quotas, data volume, resource count, etc.) @lenadroid
  • 47. Data Mesh will drive better Interoperability, Open Standards, and Data Quality in the Industry @lenadroid
  • 48. Thank you! Follow lenadroid for more insights