Architecting Modern Data Platforms
by ankitrathi.com
Content
• Data Architecture Principles
• Data Lake Basics
• High Level Architecture
• Data Characteristics
• Putting It All Together
• Product-Driven Data Architecture
• Reference Architecture
Data Architecture Principles
• Adhere to ADDA (Accessibility, Definition, Decoupling, Agility)
• Design for RSM (Reliability, Scalability, Maintainability)
• Use Right Tools
• Cloud Native/Agnostic
• Be Cost Conscious
Adhere to ADDA
• Accessibility: easily accessible data for business
• Definition: data catalog for simplified data discovery
• Decoupling: decoupled layers for flexibility
• Agility: agile enough to cater to evolving business requirements
Design for RSM
• Reliability: works correctly, fault-tolerant
• Scalability: adapts to growth
• Maintainability: remains easy to maintain
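One common way to make a step fault-tolerant is to retry transient failures with exponential backoff. The slides do not prescribe this technique; the sketch below is only an illustration, and the name `with_retries` and its parameters are hypothetical.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on failure with exponential backoff.

    Hypothetical helper: `attempts`, `base_delay` and `sleep` are
    illustrative parameters, not taken from the slides.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Re-raise once the last allowed attempt has failed.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... before retrying.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. In production code one would normally catch only the exception types known to be transient.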
Use Right Tools
• Data Structure: Structured, Semi-structured, Unstructured
• Latency: Low, Medium, High
• Throughput: High, Medium, Low
• Access Pattern: Key-value, Search, Transactions
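The selection criteria above can be sketched as a small rule table. The rules and the returned store categories are assumptions made for illustration, not recommendations from the slides, and throughput is left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    structure: str       # "structured", "semi-structured" or "unstructured"
    latency: str         # acceptable latency: "low", "medium" or "high"
    access_pattern: str  # "key-value", "search" or "transactions"


def suggest_store(w: Workload) -> str:
    """Return a storage category for a workload (illustrative rules only)."""
    if w.structure == "unstructured":
        return "object storage"
    if w.access_pattern == "transactions":
        return "SQL"
    if w.access_pattern == "search":
        return "search index"
    if w.access_pattern == "key-value":
        # Low-latency key lookups favour memory; otherwise a NoSQL store.
        return "in-memory cache" if w.latency == "low" else "NoSQL"
    return "object storage"
```

For example, `suggest_store(Workload("structured", "low", "key-value"))` returns `"in-memory cache"` under these assumed rules.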
Cloud Native/Agnostic
Cloud Native
• Pros: better performance, better efficiency, lower costs (generic services)
• Cons: vendor lock-in, higher costs (specific services)
Cloud Agnostic
• Pros: flexibility, minimal vendor lock-in, standard performance
• Cons: underutilization of vendor capabilities; solution can become complex; performance, logging and monitoring can take a hit
Be Cost Conscious
• Efficient consumption of services
• Select cost-conscious options
• Enforce policies and controls
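A rough way to compare options is to estimate monthly storage cost per tier. The prices below are made-up placeholders, not real vendor rates, and the function name is hypothetical.

```python
# Hypothetical prices in USD per GB per month; replace with real rates.
PRICE_PER_GB_MONTH = {"hot": 0.10, "warm": 0.02, "cold": 0.004}


def monthly_storage_cost(gb_by_tier):
    """Sum the monthly cost of the data volume held in each tier."""
    return sum(PRICE_PER_GB_MONTH[tier] * gb for tier, gb in gb_by_tier.items())
```

Comparing `monthly_storage_cost({"hot": 1000})` with `monthly_storage_cost({"hot": 100, "cold": 900})` shows how moving rarely used data to a cheaper tier changes the bill under the assumed prices.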
Data Lake
• Data Lake Definition
  • An architectural approach
  • Massive heterogeneous data stored centrally
  • Available to a diverse group of users
  • To be categorized, processed, analyzed & consumed
• Data Lake Characteristics
  • Structured, semi-structured & unstructured data
  • Scaled out as required
  • Diverse set of storage, analytics and ML/AI tools
  • Designed for low-cost storage and analytics
High-Level Architecture
Data → Ingest → Store → Process/Analyse → Serve → Actionable Insights
Considerations across stages: Latency, Throughput, Cost
Ingest
Source          | Data Type        | Data
Web/Mobile Apps | Records          | Transactions
Databases       | Records          | Transactions
Logging         | Search documents | Files
Logging         | Log files        | Files
Messaging       | Messages         | Events
IoT             | Data Streams     | Events
Data Characteristics
             | Hot       | Warm    | Cold
Volume       | MB-GB     | GB-PB   | PB-EB
Item Size    | B-KB      | KB-MB   | KB-TB
Latency      | ms        | ms, sec | min, hrs
Durability   | Low-high  | High    | Very high
Request Rate | Very high | High    | Low
Cost/GB      | $$-$      | $-¢¢    | ¢¢-¢
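The hot/warm/cold split can be sketched as a classifier on request rate and acceptable latency. The numeric thresholds are assumptions chosen only to mirror the rough orders of magnitude in the table; the slides give no exact cut-offs.

```python
def classify_temperature(requests_per_day: float, max_latency_seconds: float) -> str:
    """Classify data as "hot", "warm" or "cold" (thresholds are assumed)."""
    # Hot: sub-second latency needed and a very high request rate.
    if max_latency_seconds < 1 and requests_per_day >= 1_000_000:
        return "hot"
    # Warm: latency of up to seconds is acceptable.
    if max_latency_seconds < 60:
        return "warm"
    # Cold: minutes or hours are acceptable.
    return "cold"
```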
Data Characteristics
• Type of Data Structures
  • Fixed Schema
  • Schema Free
  • Key-Value
• Type of Access Patterns
  • Key-Value
  • Simple relations (1:N, M:N)
  • Multi-table joins, transactions
  • Faceting, Search
Storage
• Storage options: In-memory, File Storage, NoSQL, SQL
• Data temperature: Hot data, Warm data, Cold data
• Structure: High to Low
• Request rate, Cost per GB: High to Low
• Latency, Data Volume: Low to High
Analytics Types
• Message/Stream Analysis
• Interactive Analysis
• Batch Analysis
• Machine Learning/AI
ETL Processing
Stages involved: Store, ETL, Process/Analyse
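A minimal extract-transform-load sketch, assuming CSV input with hypothetical `user` and `amount` columns and a plain dictionary standing in for the target store:

```python
import csv
import io


def extract(raw_csv: str) -> list:
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list) -> list:
    """Normalise user names and drop rows with a missing amount."""
    return [
        {"user": row["user"].strip().lower(), "amount": float(row["amount"])}
        for row in rows
        if row["amount"]
    ]


def load(rows: list, store: dict) -> None:
    """Accumulate the amount per user into the target store."""
    for row in rows:
        store[row["user"]] = store.get(row["user"], 0.0) + row["amount"]
```

Keeping the three steps as separate functions lets each one be tested and replaced on its own, which fits the decoupling principle stated earlier in the deck.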
Serve
• Applications & APIs
• Analysis & Visualization
• Notebooks
• IDEs
Putting It All Together
• Sources: Web Apps, Mobile Apps, Data Centers, Logging, Messaging, Devices, Sensors
• Ingest (data types): Records, Documents, Files, Messages, Streams
• Store: Cache, NoSQL, SQL, ElasticSearch, Object Storage, SQS, Streams
• Process/Analyse (with ETL): ML/AI, Interactive, Batch, Message, Streams
• Serve: APIs, Analysis, Visualization, Notebooks, IDE
• Across all stages: Security & Governance, Data Catalog
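The hand-off from ingest to store can be sketched as a routing table from data type to store. The pairing below is an assumption, since the diagram's arrows did not survive extraction; the names `STORE_FOR_TYPE` and `route` are hypothetical.

```python
from collections import defaultdict

# Assumed pairing of ingested data types with stores.
STORE_FOR_TYPE = {
    "records": "SQL, NoSQL or cache",
    "documents": "search index",
    "files": "object storage",
    "messages": "message queue",
    "streams": "stream storage",
}


def route(items):
    """Group (data_type, payload) pairs by their target store."""
    routed = defaultdict(list)
    for data_type, payload in items:
        store = STORE_FOR_TYPE.get(data_type.lower())
        if store is None:
            raise ValueError(f"unknown data type: {data_type}")
        routed[store].append(payload)
    return dict(routed)
```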
Product-Driven Data Architecture
Reference: https://martinfowler.com/articles/data-monolith-to-mesh.html
Reference Architecture - Azure
Reference: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/dataplate2e/data-platform-end-to-end
Reference Architecture - AWS
Reference: https://docs.aws.amazon.com/solutions/latest/data-lake-solution/architecture.html
Reference Architecture - GCP
Reference: https://cloud.google.com/solutions/big-data
Questions…?
Thank You
ankitrathi.com

Architecting Modern Data Platforms

  • 2. Content • Data Architecture Principles • Data Lake Basics • High Level Architecture • Data Characteristics • Putting It All Together • Product-Driven Data Architecture • Reference Architecture
  • 3. Data Architecture Principles • Adhere to ADDA (Accessibility, Definition, Decoupling, Agility) • Design for RSM (Reliability, Scalability, Maintainability) • Use Right Tools • Cloud Native/Agnostic • Be Cost Conscious
  • 4. Adhere to ADDA • Accessibility: easily accessible data for business • Definition: data catalog for simplified data discovery • Decoupling: decoupled layers for flexibility • Agility: agile enough to cater to evolving business requirements
  • 5. Design for RSM • Reliability: works correctly, fault-tolerant • Scalability: adapts to growth • Maintainability: remains easy to maintain
  • 6. Use Right Tools • Data Structure: structured, semi-structured, unstructured • Latency: low, medium, high • Throughput: high, medium, low • Access Pattern: key-value, search, transactions
  • 7. Cloud Native/Agnostic • Cloud Native pros: better performance; better efficiency; lower costs (generic services) • Cloud Native cons: vendor lock-in; higher costs (specific services) • Cloud Agnostic pros: flexibility; minimal vendor lock-in; standard performance • Cloud Agnostic cons: underutilization of vendor capabilities; solution can become complex; performance, logging and monitoring can take a hit
  • 8. Be Cost Conscious • Efficient consumption of services • Select cost-conscious options • Enforce policies and controls
  • 9. Data Lake • Data Lake Definition: an architectural approach; massive heterogeneous data stored centrally; available to a diverse group of users; to be categorized, processed, analyzed & consumed • Data Lake Characteristics: structured, semi-structured & unstructured data; scaled out as required; diverse set of storage, analytics and ML/AI tools; designed for low-cost storage and analytics
  • 10. High-Level Architecture • Stages: Ingest, Store, Process/Analyse, Serve • Input: Data • Output: Actionable Insights • Trade-offs: Latency, Throughput, Cost
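The four stages on the High-Level Architecture slide can be read as a chain of decoupled steps that turn data into insights. The following is a minimal sketch under that reading, not the deck's own implementation: the function bodies, the `amount` field, and the `run_pipeline` helper are all hypothetical and exist only to show how each stage consumes the previous stage's output.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Stage = Callable[[List[Record]], List[Record]]


def ingest(records: List[Record]) -> List[Record]:
    # Hypothetical ingest step: drop empty records coming from the source.
    return [r for r in records if r]


def store(records: List[Record]) -> List[Record]:
    # Hypothetical store step: a real platform would persist the data here;
    # this sketch only copies the records so later stages cannot mutate them.
    return [dict(r) for r in records]


def process(records: List[Record]) -> List[Record]:
    # Hypothetical analysis step: aggregate a made-up "amount" field.
    total = sum(r.get("amount", 0.0) for r in records)
    return [{"total_amount": total, "count": float(len(records))}]


def serve(records: List[Record]) -> List[Record]:
    # Hypothetical serve step: hand the result to an app, API or notebook.
    return records


def run_pipeline(data: List[Record], stages: List[Stage]) -> List[Record]:
    # Each stage only sees the previous stage's output (decoupled layers).
    for stage in stages:
        data = stage(data)
    return data


STAGES: List[Stage] = [ingest, store, process, serve]
insights = run_pipeline([{"amount": 10.0}, {}, {"amount": 5.5}], STAGES)
```

Because every stage shares one signature, a stage can be replaced (for example, a different store) without touching the others, which is one way to picture the "Decoupling" principle from the ADDA slide.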
  • 11. Ingest (source: data, type) • Web/Mobile Apps: records, transactions • Databases: records, transactions • Logging: search documents, files • Logging: log files, files • Messaging: messages, events • IoT: data streams, events
  • 12. Data Characteristics (Hot | Warm | Cold) • Volume: MB-GB | GB-PB | PB-EB • Item Size: B-KB | KB-MB | KB-TB • Latency: ms | ms, sec | min, hrs • Durability: Low-high | High | Very high • Request Rate: Very high | High | Low • Cost/GB: $$-$ | $-¢¢ | ¢¢-¢
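The hot/warm/cold table above can be turned into a simple rule of thumb. The sketch below classifies data by its required access latency only; the two cut-off values are assumptions chosen for illustration, since the slide gives orders of magnitude (ms, sec, min, hrs) rather than exact thresholds, and the names `Temperature` and `classify` are hypothetical.

```python
from enum import Enum


class Temperature(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


# Assumed cut-offs in seconds. The slide only says "ms" for hot,
# "ms, sec" for warm and "min, hrs" for cold, so these are illustrative.
HOT_MAX_LATENCY_S = 0.1
WARM_MAX_LATENCY_S = 60.0


def classify(required_latency_s: float) -> Temperature:
    """Map a latency requirement (in seconds) to a data temperature tier."""
    if required_latency_s < 0:
        raise ValueError("latency must be non-negative")
    if required_latency_s <= HOT_MAX_LATENCY_S:
        return Temperature.HOT
    if required_latency_s <= WARM_MAX_LATENCY_S:
        return Temperature.WARM
    return Temperature.COLD
```

For example, `classify(0.005)` falls in the hot tier and `classify(3600)` in the cold tier. A fuller version would also weigh volume, request rate and cost per GB, which the table lists alongside latency.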
  • 13. Data Characteristics • Type of Data Structures: fixed schema; schema free; key-value • Type of Access Patterns: key-value; simple relations (1:N, M:N); multi-table joins, transactions; faceting, search
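The access patterns listed above are usually what drives the choice of store. The lookup below is a sketch of one common reading, not a mapping stated in the deck: the store categories come from the Storage and Putting It All Together slides, but which pattern goes to which store is an assumption, and `AccessPattern`, `STORE_FOR_PATTERN` and `suggest_store` are hypothetical names.

```python
from enum import Enum


class AccessPattern(Enum):
    KEY_VALUE = "key-value"
    SIMPLE_RELATIONS = "simple relations (1:N, M:N)"
    JOINS_TRANSACTIONS = "multi-table joins, transactions"
    SEARCH = "faceting, search"


# Assumed mapping from access pattern to store category (illustrative only).
STORE_FOR_PATTERN = {
    AccessPattern.KEY_VALUE: "In-memory cache or NoSQL",
    AccessPattern.SIMPLE_RELATIONS: "NoSQL",
    AccessPattern.JOINS_TRANSACTIONS: "SQL",
    AccessPattern.SEARCH: "Search engine",
}


def suggest_store(pattern: AccessPattern) -> str:
    """Return the store category assumed to fit the given access pattern."""
    return STORE_FOR_PATTERN[pattern]
```

Calling `suggest_store(AccessPattern.JOINS_TRANSACTIONS)` returns `"SQL"` under this mapping. In practice the decision also depends on the latency, throughput and cost criteria from the Use Right Tools slide.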
  • 14. Storage [figure: In-memory, NoSQL, SQL and File Storage placed across hot, warm and cold data] • Axes: Structure (high to low); Request rate and cost per GB (high to low); Latency and data volume (low to high)
  • 15. Analytics Types • Message/Stream Analysis • Interactive Analysis • Batch Analysis • Machine Learning/AI
  • 17. Serve • Applications & APIs • Analysis & Visualization • Notebooks • IDEs
  • 18. Putting It All Together • Sources: Web Apps, Mobile Apps, Data Centers, Logging, Messaging, Devices, Sensors • Ingest: Records, Documents, Files, Messages, Streams • Store: Cache, NoSQL, SQL, ElasticSearch, Object Storage, SQS, Streams • ETL • Process/Analyse: ML/AI, Interactive, Batch, Message, Streams • Serve: APIs, Analysis, Visualization, Notebooks, IDE • Cross-cutting: Security & Governance, Data Catalog
  • 19. Product-Driven Data Architecture Reference: https://martinfowler.com/articles/data-monolith-to-mesh.html
  • 20. Reference Architecture - Azure Reference: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/dataplate2e/data-platform-end-to-end
  • 21. Reference Architecture - AWS Reference: https://docs.aws.amazon.com/solutions/latest/data-lake-solution/architecture.html
  • 22. Reference Architecture - GCP Reference: https://cloud.google.com/solutions/big-data