SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Anatomy of a Data Driven
Architecture
@tamir_dresher
System Architect @ Payoneer
2
System Architect @ @tamir_dresher
Tamir Dresher
My Books:
Software Engineering Lecturer
Ruppin Academic Center
tamirdr@payoneer.com
https://www.israelclouds.com/iasaisrael
3
The need for DATA
Your Data
Data-Driven Decision Making Data-powered product
• What markets are leading and where can I expand?
• What’s slowing my process?
• Is there a correlation between the time invested
in a sale and the income from the tenant?
• What products should I recommend to this user?
• Is this action fraudulent ?
• Should I suggest a discount to this user to
raise the chance for a purchase?
4
ETL vs. ELT vs. Streaming
Transform LoadExtract
Transactional DB (OLTP) Analytical DB (OLAP)
LoadExtract
Transactional DB (OLTP)
Analytical DB (OLAP) / Storage
Transform
Events
Event
Stream
Stream
Processor
Real-time
insights
5
Lambda (λ) & Kappa (ϰ) Architectures
Speed Layer
Batch Layer Serving Layer
Real-time
views
Batched
views
Query
Data
Lambda
Speed/Streaming
Layer
Serving
Layer
Speed
views
QueryData
Kappa
Processing
6
The Data Pipeline
Architecture
7Cross Cutting and Ops
The Data Pipeline Architecture
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
(Visualization,
Alerting,
Insights)
Discovery/Catalog/
Metadata Management
Security Quality&Testing Observability
8
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Databases Files Events APIs
• CDC - change data
capture
• Object Storage
• Structured/
Unstructured
• Customer tracking
• WebHooks
• Domain Events
• External Services
• Enrichment
9
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitchdata, rivery
Data Sources
10
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitch
11
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitchdata, rivery
Data Sources
Customized batching/micro-batching
• Spark jobs, Data libraries (Pandas, boto), Hive
• Workflows – Airflow, Dagster, Luigi
12
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitchdata, rivery
Data Sources
Customized batching/micro-batching
• Spark jobs, Data libraries (panda, boto), Hive
• Workflows – Airflow, Dagster, Luigi
https://towardsdatascience.com/building-a-production-level-etl-pipeline-platform-using-apache-airflow-a4cf34203fbd
13
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitchdata, rivery
Data Sources
Customized batching/micro-batching
• Spark jobs, Data libraries (panda, boto), Hive
• Workflows – Airflow, Dagster, Luigi
Event streaming and processing
• Messaging - Kafka, Pulsar, Kinesis, Event Hub
• Processing – Spark, Flink, Samza, Kafka
Streams, Azure Stream Analytics
14
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Integration Platforms
• Connectors to sources and destinations
• E.g. fivtran, stitchdata, rivery
Data Sources
Customized batching/micro-batching
• Spark jobs, Data libraries (panda, boto), Hive
• Workflows – Airflow, Dagster, Luigi
Event streaming and processing
• Messaging - Kafka, Pulsar, Kinesis, Event Hub
• Processing – Spark, Flink, Samza, Kafka
Streams, Azure Stream Analytics
-- Continuously aggregating a stream into a table with a ksqlDB push query.
CREATE STREAM locationUpdatesStream ...;
CREATE TABLE locationsPerUser AS
SELECT username, COUNT(*)
FROM locationUpdatesStream
GROUP BY username
EMIT CHANGES;
// Continuously aggregating to table
KStream<String, String> locationUpdatesStream = ...;
KTable<String, Long> locationsPerUser
= locationUpdatesStream
.groupBy((k, v) -> v.username)
.count();
https://www.confluent.io/blog/kafka-streams-tables-part-1-event-streaming/
15
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Data Warehouse
• Structured format
• Designed to quickly generate
insights based on SQL like
queries
• Modern cloud base offerings –
Redshift, BigQuery, Snowflake,
Azure Synapse
Data Lake
• Structured and non-structured
data – CSV, Parquet, Images,
Audio
• Raw and historical data
• Designed to be used by data
scientists and create models by
various languages
16
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Predictive
• Generate model
• Data Science and ML libraries –
Pandas, Numpy, R, scikit etc
• The model is periodically refreshed
Storage
Retrospective (Historical)
• Deriving Intelligence based on
statistics
• Built-in engine (Data Warehouse)
OR
• Query Engines – Presto, Impala
Real Time Analytics
• Run analytical queries
over big volumes of data
with interactive latencies.
• Apache Pinot,
Clickhouse, Druid
Data Science Platforms
• Helps managing the workflows,
productization and operations
- SageMaker, Iguazio, DataBricks etc.
17
Data Sources
Ingestion
&
Transformation
Storage
Query
&
Processing
Consumption
Custom Apps
• Execute the model – how is the model
reachable?
• Translate user/system actions to
queries
• Visualize – custom (Plotly Dash,
Streamlit etc) or embedded (PowerBI,
Looker etc)
External Apps
• Augmented Analytics - External
services to generate insights and
explain them
(e.g. Anomaly Detection with Anodot/
CrunchMetrics/outlier.ai)
• Customizable Reports and Dashboard
(e.g. Looker, Tableu, PowerBI, Sisense)
18
Summary
Data
at Rest
Data in
Motion Event/Dat
a Stream
Workflow Engine
Stream
Processor
Real-time
insights
Analytical
Storage
/Lake
Query
Engine
Model
Engine
Model
Accessible API
Application
Data Sources Ingestion&Transformation Storage Query&Processing Consume
19
Data at
Rest
Data in
Motion Event/Data
Stream
Workflow Engine
Stream
Processor
Real-time
insights
Analytical
Storage
/Lake
Query
Engine
Model
Engine
Model
Accessible API
Application
Data Sources Ingestion&Transformation Storage Query&Processing Consume
Thank You!
@tamir_dresher

Weitere ähnliche Inhalte

Was ist angesagt?

Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringDurga Gadiraju
 
SQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastDatabricks
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafkaconfluent
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeDatabricks
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerDatabricks
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Designing and Building Next Generation Data Pipelines at Scale with Structure...
Designing and Building Next Generation Data Pipelines at Scale with Structure...Designing and Building Next Generation Data Pipelines at Scale with Structure...
Designing and Building Next Generation Data Pipelines at Scale with Structure...Databricks
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline DevelopmentTimothy Spann
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?Kai Wähner
 

Was ist angesagt? (20)

Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
SQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at Comcast
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafka
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Designing and Building Next Generation Data Pipelines at Scale with Structure...
Designing and Building Next Generation Data Pipelines at Scale with Structure...Designing and Building Next Generation Data Pipelines at Scale with Structure...
Designing and Building Next Generation Data Pipelines at Scale with Structure...
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 

Ähnlich wie Anatomy of a data driven architecture - Tamir Dresher

Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Zhenxiao Luo
 
Towards sql for streams
Towards sql for streamsTowards sql for streams
Towards sql for streamsRadu Tudoran
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
Powering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakePowering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakeDatabricks
 
[WSO2Con USA 2018] The Rise of Streaming SQL
[WSO2Con USA 2018] The Rise of Streaming SQL[WSO2Con USA 2018] The Rise of Streaming SQL
[WSO2Con USA 2018] The Rise of Streaming SQLWSO2
 
MongoDB World 2019: Streaming ETL on the Shoulders of Giants
MongoDB World 2019: Streaming ETL on the Shoulders of GiantsMongoDB World 2019: Streaming ETL on the Shoulders of Giants
MongoDB World 2019: Streaming ETL on the Shoulders of GiantsMongoDB
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWSSungmin Kim
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark Summit
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureLuan Moreno Medeiros Maciel
 
Building Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsBuilding Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsPat Patterson
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaGuido Schmutz
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Databricks
 
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Amazon Web Services LATAM
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream ProcessingJorge Hirtz
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comCedric Vidal
 
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase DataWorks Summit
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 

Ähnlich wie Anatomy of a data driven architecture - Tamir Dresher (20)

Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019
 
Towards sql for streams
Towards sql for streamsTowards sql for streams
Towards sql for streams
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
Powering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta LakePowering Interactive BI Analytics with Presto and Delta Lake
Powering Interactive BI Analytics with Presto and Delta Lake
 
[WSO2Con USA 2018] The Rise of Streaming SQL
[WSO2Con USA 2018] The Rise of Streaming SQL[WSO2Con USA 2018] The Rise of Streaming SQL
[WSO2Con USA 2018] The Rise of Streaming SQL
 
The Rise of Streaming SQL
The Rise of Streaming SQLThe Rise of Streaming SQL
The Rise of Streaming SQL
 
MongoDB World 2019: Streaming ETL on the Shoulders of Giants
MongoDB World 2019: Streaming ETL on the Shoulders of GiantsMongoDB World 2019: Streaming ETL on the Shoulders of Giants
MongoDB World 2019: Streaming ETL on the Shoulders of Giants
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with Spark
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
 
Building Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSetsBuilding Data Pipelines with Spark and StreamSets
Building Data Pipelines with Spark and StreamSets
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream Processing
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
 
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 

Mehr von Tamir Dresher

NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdf
NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdfNET Aspire - NET Conf IL 2024 - Tamir Dresher.pdf
NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdfTamir Dresher
 
Tamir Dresher - DotNet 7 What's new.pptx
Tamir Dresher - DotNet 7 What's new.pptxTamir Dresher - DotNet 7 What's new.pptx
Tamir Dresher - DotNet 7 What's new.pptxTamir Dresher
 
Tamir Dresher - What’s new in ASP.NET Core 6
Tamir Dresher - What’s new in ASP.NET Core 6Tamir Dresher - What’s new in ASP.NET Core 6
Tamir Dresher - What’s new in ASP.NET Core 6Tamir Dresher
 
Tamir Dresher - Async Streams in C#
Tamir Dresher - Async Streams in C#Tamir Dresher - Async Streams in C#
Tamir Dresher - Async Streams in C#Tamir Dresher
 
Tamir Dresher Clarizen adventures with the wild GC during the holiday season
Tamir Dresher   Clarizen adventures with the wild GC during the holiday seasonTamir Dresher   Clarizen adventures with the wild GC during the holiday season
Tamir Dresher Clarizen adventures with the wild GC during the holiday seasonTamir Dresher
 
Debugging tricks you wish you knew Tamir Dresher - Odessa 2019
Debugging tricks you wish you knew   Tamir Dresher - Odessa 2019Debugging tricks you wish you knew   Tamir Dresher - Odessa 2019
Debugging tricks you wish you knew Tamir Dresher - Odessa 2019Tamir Dresher
 
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
From zero to hero with the actor model  - Tamir Dresher - Odessa 2019From zero to hero with the actor model  - Tamir Dresher - Odessa 2019
From zero to hero with the actor model - Tamir Dresher - Odessa 2019Tamir Dresher
 
Tamir Dresher - Demystifying the Core of .NET Core
Tamir Dresher  - Demystifying the Core of .NET CoreTamir Dresher  - Demystifying the Core of .NET Core
Tamir Dresher - Demystifying the Core of .NET CoreTamir Dresher
 
Breaking the monolith to microservice with Docker and Kubernetes (k8s)
Breaking the monolith to microservice with Docker and Kubernetes (k8s)Breaking the monolith to microservice with Docker and Kubernetes (k8s)
Breaking the monolith to microservice with Docker and Kubernetes (k8s)Tamir Dresher
 
.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir Dresher.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir DresherTamir Dresher
 
Testing time and concurrency Rx
Testing time and concurrency RxTesting time and concurrency Rx
Testing time and concurrency RxTamir Dresher
 
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresherBuilding responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresherTamir Dresher
 
.NET Debugging tricks you wish you knew tamir dresher
.NET Debugging tricks you wish you knew   tamir dresher.NET Debugging tricks you wish you knew   tamir dresher
.NET Debugging tricks you wish you knew tamir dresherTamir Dresher
 
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir DresherFrom Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir DresherTamir Dresher
 
Building responsive applications with Rx - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx  - CodeMash2017 - Tamir DresherBuilding responsive applications with Rx  - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx - CodeMash2017 - Tamir DresherTamir Dresher
 
Debugging tricks you wish you knew - Tamir Dresher
Debugging tricks you wish you knew  - Tamir DresherDebugging tricks you wish you knew  - Tamir Dresher
Debugging tricks you wish you knew - Tamir DresherTamir Dresher
 
Rx 101 - Tamir Dresher - Copenhagen .NET User Group
Rx 101  - Tamir Dresher - Copenhagen .NET User GroupRx 101  - Tamir Dresher - Copenhagen .NET User Group
Rx 101 - Tamir Dresher - Copenhagen .NET User GroupTamir Dresher
 
Cloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir DresherCloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir DresherTamir Dresher
 
Reactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 ConferenceReactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 ConferenceTamir Dresher
 
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Rx 101   Codemotion Milan 2015 - Tamir DresherRx 101   Codemotion Milan 2015 - Tamir Dresher
Rx 101 Codemotion Milan 2015 - Tamir DresherTamir Dresher
 

Mehr von Tamir Dresher (20)

NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdf
NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdfNET Aspire - NET Conf IL 2024 - Tamir Dresher.pdf
NET Aspire - NET Conf IL 2024 - Tamir Dresher.pdf
 
Tamir Dresher - DotNet 7 What's new.pptx
Tamir Dresher - DotNet 7 What's new.pptxTamir Dresher - DotNet 7 What's new.pptx
Tamir Dresher - DotNet 7 What's new.pptx
 
Tamir Dresher - What’s new in ASP.NET Core 6
Tamir Dresher - What’s new in ASP.NET Core 6Tamir Dresher - What’s new in ASP.NET Core 6
Tamir Dresher - What’s new in ASP.NET Core 6
 
Tamir Dresher - Async Streams in C#
Tamir Dresher - Async Streams in C#Tamir Dresher - Async Streams in C#
Tamir Dresher - Async Streams in C#
 
Tamir Dresher Clarizen adventures with the wild GC during the holiday season
Tamir Dresher   Clarizen adventures with the wild GC during the holiday seasonTamir Dresher   Clarizen adventures with the wild GC during the holiday season
Tamir Dresher Clarizen adventures with the wild GC during the holiday season
 
Debugging tricks you wish you knew Tamir Dresher - Odessa 2019
Debugging tricks you wish you knew   Tamir Dresher - Odessa 2019Debugging tricks you wish you knew   Tamir Dresher - Odessa 2019
Debugging tricks you wish you knew Tamir Dresher - Odessa 2019
 
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
From zero to hero with the actor model  - Tamir Dresher - Odessa 2019From zero to hero with the actor model  - Tamir Dresher - Odessa 2019
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
 
Tamir Dresher - Demystifying the Core of .NET Core
Tamir Dresher  - Demystifying the Core of .NET CoreTamir Dresher  - Demystifying the Core of .NET Core
Tamir Dresher - Demystifying the Core of .NET Core
 
Breaking the monolith to microservice with Docker and Kubernetes (k8s)
Breaking the monolith to microservice with Docker and Kubernetes (k8s)Breaking the monolith to microservice with Docker and Kubernetes (k8s)
Breaking the monolith to microservice with Docker and Kubernetes (k8s)
 
.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir Dresher.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir Dresher
 
Testing time and concurrency Rx
Testing time and concurrency RxTesting time and concurrency Rx
Testing time and concurrency Rx
 
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresherBuilding responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresher
 
.NET Debugging tricks you wish you knew tamir dresher
.NET Debugging tricks you wish you knew   tamir dresher.NET Debugging tricks you wish you knew   tamir dresher
.NET Debugging tricks you wish you knew tamir dresher
 
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir DresherFrom Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
 
Building responsive applications with Rx - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx  - CodeMash2017 - Tamir DresherBuilding responsive applications with Rx  - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx - CodeMash2017 - Tamir Dresher
 
Debugging tricks you wish you knew - Tamir Dresher
Debugging tricks you wish you knew  - Tamir DresherDebugging tricks you wish you knew  - Tamir Dresher
Debugging tricks you wish you knew - Tamir Dresher
 
Rx 101 - Tamir Dresher - Copenhagen .NET User Group
Rx 101  - Tamir Dresher - Copenhagen .NET User GroupRx 101  - Tamir Dresher - Copenhagen .NET User Group
Rx 101 - Tamir Dresher - Copenhagen .NET User Group
 
Cloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir DresherCloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir Dresher
 
Reactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 ConferenceReactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 Conference
 
Rx 101 Codemotion Milan 2015 - Tamir Dresher
Rx 101   Codemotion Milan 2015 - Tamir DresherRx 101   Codemotion Milan 2015 - Tamir Dresher
Rx 101 Codemotion Milan 2015 - Tamir Dresher
 

Kürzlich hochgeladen

LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456KiaraTiradoMicha
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 

Kürzlich hochgeladen (20)

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456LEVEL 5   - SESSION 1 2023 (1).pptx - PDF 123456
LEVEL 5 - SESSION 1 2023 (1).pptx - PDF 123456
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 

Anatomy of a data driven architecture - Tamir Dresher

  • 1. Anatomy of a Data Driven Architecture @tamir_dresher System Architect @ Payoneer
  • 2. 2 System Architect @ @tamir_dresher Tamir Dresher My Books: Software Engineering Lecturer Ruppin Academic Center tamirdr@payoneer.com https://www.israelclouds.com/iasaisrael
  • 3. 3 The need for DATA Your Data Data-Driven Decision Making Data-powered product • What markets are leading and where can I expand? • What’s slowing my process? • Is there a correlation between the time invested in a sale and the income from the tenant? • What products should I recommend to this user? • Is this action fraudulent ? • Should I suggest a discount to this user to raise the chance for a purchase?
  • 4. 4 ETL vs. ELT vs. Streaming Transform LoadExtract Transactional DB (OLTP) Analytical DB (OLAP) LoadExtract Transactional DB (OLTP) Analytical DB (OLAP) / Storage Transform Events Event Stream Stream Processor Real-time insights
  • 5. 5 Lambda (λ) & Kappa (ϰ) Architectures Speed Layer Batch Layer Serving Layer Real-time views Batched views Query Data Lambda Speed/Streaming Layer Serving Layer Speed views QueryData Kappa Processing
  • 7. 7Cross Cutting and Ops The Data Pipeline Architecture Data Sources Ingestion & Transformation Storage Query & Processing Consumption (Visualization, Alerting, Insights) Discovery/Catalog/ Metadata Management Security Quality&Testing Observability
  • 8. 8 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Databases Files Events APIs • CDC - change data capture • Object Storage • Structured/ Unstructured • Customer tracking • WebHooks • Domain Events • External Services • Enrichment
  • 9. 9 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitchdata, rivery Data Sources
  • 10. 10 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitch
  • 11. 11 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitchdata, rivery Data Sources Customized batching/micro-batching • Spark jobs, Data libraries (Pandas, boto), Hive • Workflows – Airflow, Dagster, Luigi
  • 12. 12 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitchdata, rivery Data Sources Customized batching/micro-batching • Spark jobs, Data libraries (panda, boto), Hive • Workflows – Airflow, Dagster, Luigi https://towardsdatascience.com/building-a-production-level-etl-pipeline-platform-using-apache-airflow-a4cf34203fbd
  • 13. 13 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitchdata, rivery Data Sources Customized batching/micro-batching • Spark jobs, Data libraries (panda, boto), Hive • Workflows – Airflow, Dagster, Luigi Event streaming and processing • Messaging - Kafka, Pulsar, Kinesis, Event Hub • Processing – Spark, Flink, Samza, Kafka Streams, Azure Stream Analytics
  • 14. 14 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Integration Platforms • Connectors to sources and destinations • E.g. fivtran, stitchdata, rivery Data Sources Customized batching/micro-batching • Spark jobs, Data libraries (panda, boto), Hive • Workflows – Airflow, Dagster, Luigi Event streaming and processing • Messaging - Kafka, Pulsar, Kinesis, Event Hub • Processing – Spark, Flink, Samza, Kafka Streams, Azure Stream Analytics -- Continuously aggregating a stream into a table with a ksqlDB push query. CREATE STREAM locationUpdatesStream ...; CREATE TABLE locationsPerUser AS SELECT username, COUNT(*) FROM locationUpdatesStream GROUP BY username EMIT CHANGES; // Continuously aggregating to table KStream<String, String> locationUpdatesStream = ...; KTable<String, Long> locationsPerUser = locationUpdatesStream .groupBy((k, v) -> v.username) .count(); https://www.confluent.io/blog/kafka-streams-tables-part-1-event-streaming/
  • 15. 15 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Data Warehouse • Structured format • Designed to quickly generate insights based on SQL like queries • Modern cloud base offerings – Redshift, BigQuery, Snowflake, Azure Synapse Data Lake • Structured and non-structured data – CSV, Parquet, Images, Audio • Raw and historical data • Designed to be used by data scientists and create models by various languages
  • 16. 16 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Predictive • Generate model • Data Science and ML libraries – Pandas, Numpy, R, scikit etc • The model is periodically refreshed Storage Retrospective (Historical) • Deriving Intelligence based on statistics • Built-in engine (Data Warehouse) OR • Query Engines – Presto, Impala Real Time Analytics • Run analytical queries over big volumes of data with interactive latencies. • Apache Pinot, Clickhouse, Druid Data Science Platforms • Helps managing the workflows, productization and operations - SageMaker, Iguazio, DataBricks etc.
  • 17. 17 Data Sources Ingestion & Transformation Storage Query & Processing Consumption Custom Apps • Execute the model – how is the model reachable? • Translate user/system actions to queries • Visualize – custom (Plotly Dash, Streamlit etc) or embedded (PowerBI, Looker etc) External Apps • Augmented Analytics - External services to generate insights and explain them (e.g. Anomaly Detection with Anodot/ CrunchMetrics/outlier.ai) • Customizable Reports and Dashboard (e.g. Looker, Tableu, PowerBI, Sisense)
  • 18. 18 Summary Data at Rest Data in Motion Event/Dat a Stream Workflow Engine Stream Processor Real-time insights Analytical Storage /Lake Query Engine Model Engine Model Accessible API Application Data Sources Ingestion&Transformation Storage Query&Processing Consume
  • 19. 19 Data at Rest Data in Motion Event/Data Stream Workflow Engine Stream Processor Real-time insights Analytical Storage /Lake Query Engine Model Engine Model Accessible API Application Data Sources Ingestion&Transformation Storage Query&Processing Consume Thank You! @tamir_dresher

Hinweis der Redaktion

  1. Data comes from many sources and feed our application We need it for the OLTP obviously but being the new gold we can use the data feed to and historical data to gain more insights BI ML/AI data-driven decision making (analytic systems) and drive data-powered products
  2. Traditionally , orgs moved the data from the OLTP to the OLAP (DWH) with ETL Modern architectures now relies on ELT for batched load And to get the best real time response streaming is the way to go
  3. Traditionally , orgs moved the data from the OLTP to the OLAP (DWH) with ETL Modern architectures now relies on ELT for batched load And to get the best real time response streaming is the way to go
  4. https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/how-to-build-a-data-architecture-to-drive-innovation-today-and-tomorrow#
  5. https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7