SlideShare a Scribd company logo
1 of 69
Download to read offline
The Quest for an
Open Source Data Science Platform
@joerg_schad
Jörg Schad, PhD
● Previous
○ Suki.ai
○ Mesosphere
○ PhD Distributed
DB Systems
● @joerg_schad
@joerg_schad
3
What you want to be doing
4
Get
Data
Write intelligent machine learning code
Train
Model
Run
Model
Repeat
5
Sculley, D., Holt, G., Golovin, D. et al. Hidden Technical Debt in Machine Learning Systems
What you’re actually doing
6
Challenge: Persona(s)
8
Division of Labor
Configuration
Machine
Resource
Management
and
Monitoring
Serving
Infrastructure
Data Collection
Data
Verification
Process Management
Tools
Feature
Extraction
ML Analysis Tools
Model
Monitoring
Inspired by “Sculley, D., Holt, G., Golovin, D. et al. Hidden Technical Debt in Machine Learning
Systems” article
System Admin/ DevOps
Data Engineer/DataOps
Data Scientist
Machine Learning
Engineering Challenges
9
Das Bild kann nicht angezeigt werden.
Challenges
1. End-to-End pipelines as more than just infrastructure
2. Silos between Data Scientists and Ops
3. Reproducible model builds
4. Data management
5. Versioning: datasets, features, models, environments, pipelines, etc.
6. End-to-end metadata management
7. Resource management: CPU, GPU, TPU, etc.
8. …..
Do we need Data Science Engineering
Principles?
11
Software Engineering
The application of a systematic, disciplined,
quantifiable approach to the development,
operation, and maintenance of software
IEEE Standard Glossary of Software Engineering
Terminology
Do we need Data Science Engineering
Principles?
12
Software Engineering
The application of a systematic, disciplined,
quantifiable approach to the development,
operation, and maintenance of software
IEEE Standard Glossary of Software Engineering
Terminology
Das Bild kann nicht angezeigt werden.
Challenges
1. End-to-End pipelines as more than just infrastructure
2. Silos between Data Scientists and Ops
3. Reproducible model builds
4. Data management
5. Versioning: datasets, features, models, environments, pipelines, etc.
6. End-to-end metadata management
7. Resource management: CPU, GPU, TPU, etc.
8. …..
Solutions:
Machine Learning
Platforms
14
15
TensorFlow Dev Summit
https://medium.com/tensorflow/from-research-to-production-with-tfx-pipelines-and-ml-metadata-443a51dac188
16https://www.youtube.com/playlist?list=PLQY2H8rRoyvzoUYI26kHmKSJBedn3SQuB
TensorFlow Dev Summit
17https://www.youtube.com/playlist?list=PLQY2H8rRoyvzoUYI26kHmKSJBedn3SQuB
18https://www.youtube.com/playlist?list=PLQY2H8rRoyvzoUYI26kHmKSJBedn3SQuB
19https://www.youtube.com/playlist?list=PLQY2H8rRoyvzoUYI26kHmKSJBedn3SQuB
TensorFlow World
20
TensorFlow Extended
TensorFlow Extended
https://www.tensorflow.org/tfx/guide
TensorFlow Extended
https://www.tensorflow.org/tfx/guide
Kubeflow
https://www.kubeflow.org/docs/pipelines/pipelines-overview/
Sagemaker
https://aws.amazon.com/blogs/aws/sagemaker/
Logical Clocks Hops
https://www.logicalclocks.com/
Raw
Data
Event
Data
Monitor
HopsFS
Feature
Store Serving
Feature StoreData
Prep
Ingest DeployExperiment/Train
logs
logs
Metadata Store
Continuous Integration
Monitoring & Operations
Distributed Data
Storage and
Streaming
Data Preparation
and Analysis
Storage of trained
Models and
Metadata
Use trained Model
for Inference
Distributed
Training using
Machine Learning
Frameworks
Data & Streaming
Model
Engineering
Model
Management
Model Serving
Model
Training
Resource and Service Management
TensorBoard
Model Library
Feature Catalogue
Notebook
Library
Metadata
27
28
Metadata… Which model to pick?
Common Metadata
• Accuracy
– Which...
• Latency
• Environments
• Data Privacy
• ….
29
Metadata
https://docs.google.com/document/d/104jv0BvQJ3unVEufmHVhJWgUAqJOk2CwsCpk5H5vYug/
Metadata in TFX
https://www.tensorflow.org/tfx/guide/mlmd
31
Metadata… How to store...
32
● Native Multi Model Database
○ Stores, K/V, Documents & Graphs
● Distributed
○ Graphs can span multiple nodes
● AQL - SQL-like multi -model query language
● ACID Transactions including Multi
Collection Transactions
33
ArangoSearch
GraphsDocum ents - JSON
{
"type": "pants",
"waist": 32,
"length": 34,
"color": "blue",
"material": "cotton"
}
{
"type": "television",
"diagonal size": 46,
"hdmi inputs": 3,
"wall mountable": true,
"built-in tuner": true,
"dynamic contrast": "50,000:1",
"Resolution": "1920x1080"
}
Key Values
K => V
K => V
K => V
K => V
K => V
K => V
K => V
K => V
K => V
K => V
K => V
K => V
34
Metadata… How to store...
Machine Learning
35
36
Machine Learning
Engineering Challenges
37
Do we need Data Science Engineering
Principles?
38
Software Engineering
The application of a systematic, disciplined,
quantifiable approach to the development,
operation, and maintenance of software
IEEE Standard Glossary of Software Engineering
Terminology
Do we need Data Science Engineering
Principles?
39
Software Engineering
The application of a systematic, disciplined,
quantifiable approach to the development,
operation, and maintenance of software
IEEE Standard Glossary of Software Engineering
Terminology
40
• Do I need Machine Learning? *
• Do I need {Neural Networks, Regression,...}*
• What dataset(s)?
– Quality?
• What target/serving environment?
• What model architecture?
• Pre-trained model available?
• How many training resources?
* Can I actually use ...
Challenge: Requirements Engineering
41
• Many adhocs model/training runs
• Regulatory Requirements
• Dependencies
• CI/CD
• Git
• Time-dependent features
Challenge: Reproducible Builds
Step 1: Training
(In Data Center - Over Hours/Days/Weeks)
Dog
Input:
Lots of Labeled
Data
Output:
Trained Model
Deep neural
network model
42
MFlow
43
MFlow Tracking
import mlflow
# Log parameters (key-value pairs)
mlflow.log_param("num_dimensions", 8)
mlflow.log_param("regularization", 0.1)
# Log a metric;
mlflow.log_metric("accuracy", 0.1)
...
mlflow.log_metric("accuracy", 0.45)
# Log artifacts (output files)
mlflow.log_artifact("roc.png")
mlflow.log_artifact("model.pkl")
44
MFlow Project
name: My Project
conda_env: conda.yaml
entry_points:
main:
parameters:
data_file: path
regularization: {type: float, default:
0.1}
command: "python train.py -r
{regularization} {data_file}"
validate:
parameters:
data_file: path
command: "python validate.py {data_file}"
$mlflow run example/project -P alpha=0.5
$mlflow run git@github.com:databricks/mlflow-example.git
45
MFlow Model
time_created: 2018-02-21T13:21:34.12
flavors:
sklearn:
sklearn_version: 0.19.1
pickled_model: model.pkl
python_function:
loader_module: mlflow.sklearn
pickled_model: model.pkl
$mlflow run example/project -P alpha=0.5
$mlflow run git@github.com:databricks/mlflow-example.git
46
Sculley, D., Holt, G., Golovin, D. et al. Hidden Technical Debt in Machine Learning Systems
Challenge: Persona(s)
Continuous Integration
Monitoring & Operations
Distributed Data
Storage and
Streaming
Data Preparation
and Analysis
Storage of trained
Models and
Metadata
Use trained Model
for Inference
Distributed
Training using
Machine Learning
Frameworks
Data &
Streaming
Model
Engineering
Model
Management
Model Serving
Model
Training
Resource and Service Management
TensorBoard
Model Library
Feature Store
Notebook
Library
Challenge: Metadata
Metadata Layer
1. https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-post-training-integer-quantization-
b4964a1ea9ba?postPublishedType=repub&linkId=68863403
2. https://blog.acolyer.org/2019/06/03/ease-ml-ci/
3. https://blog.acolyer.org/2019/06/05/data-validation-for-machine-learning/
4. https://databricks.com/blog/2019/06/06/announcing-the-mlflow-1-0-release.html
5. https://docs.google.com/document/d/104jv0BvQJ3unVEufmHVhJWgUAqJOk2CwsCpk5H5vYug/edit#
6. https://medium.com/tensorflow/from-research-to-production-with-tfx-pipelines-and-ml-metadata-
443a51dac188?linkId=68054243
7. https://github.com/tensorflow/tfx/tree/master/tfx/examples/chicago_taxi_pipeline
48
49
Challenge: Data Science IDE
50
Challenge: CICD
https://blog.acolyer.org/2019/06/03/ease-ml-ci/
51
Challenge: CICD
https://blog.acolyer.org/2019/06/03/ease-ml-ci/
52
Challenge: Testing
• Training/Test/Validation
Datasets
• Unit Tests?
• Different factors
– Accuracy
– Serving performance
– ….
• A/B Testing with live Data
• Shadow Serving
53
Challenge: Data Quality
• Data is typically not ready to be
consumed by ML job*
– Data Cleaning
• Missing/incorrect labels
– Data Preparation
• Same Format
• Same Distribution
* Demo datasets are a fortunate exception :)
Challenge: Data Validation
54
https://blog.acolyer.org/2019/06/05/data-validation-for-machine-learning/
https://www.tensorflow.org/tfx/guide/tfdv
Challenge: Data Validation
55
https://blog.acolyer.org/2019/06/05/data-validation-for-machine-learning/
https://www.tensorflow.org/tfx/guide/tfdv
56
Challenge: Data (Preprocessing) Sharing
Feature Catalogue
Data & Streaming
Model
Engineering
Model
Training
• Preprocessed Data Sets valuable
– Sharing
– Automatic Updating
• Feature Catalogue ⩬
Preprocessing Cache + Discovery
https://eng.uber.com/michelangelo/
Challenge: Features
Feature Stores
• Discoverability
• Consistency
• Versioning
• Monitoring
• Caching
• Backfill (for time-dependent features)
57
58
Challenge: Model Libraries
• Existing architectures
• Pretrained models
59
Machine Learning Model Serving
• Deploying models
– Choice…
– Metrics
• Updating models
– Zero downtime
– Target environment
• Testing models
– Test model with live data
• Ensemble Decision
– Multiple models working
together
https://www.tensorflow.org/tfx/guide/serving
60
Challenge: Distributed TensorFlow
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/distribute
https://eng.uber.com/horovod/
61
Challenge: Distributed TensorFlow
https://eng.uber.com/horovod/
62
Horovod
https://eng.uber.com/horovod/
• All-Reduce to update
Parameter
– Bandwidth Optimal
• Uber Horovod is MPI based
– Difficult to set up
– Other Spark based
implementations
• Wait for TensorFlow 2.0 ;)
63
TF Distribution Strategy
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/distribute
● MirroredStrategy: This does in-graph replication with synchronous training on many GPUs on one machine. Essentially, we
create copies of all variables in the model's layers on each device. We then use all-reduce to combine gradients across the
devices before applying them to the variables to keep them in sync.
● CollectiveAllReduceStrategy: This is a version of MirroredStrategy for multi-working training. It uses a collective op to do all-
reduce. This supports between-graph communication and synchronization, and delegates the specifics of the all-reduce
implementation to the runtime (as opposed to encoding it in the graph). This allows it to perform optimizations like batching
and switch between plugins that support different hardware or algorithms. In the future, this strategy will implement fault-
tolerance to allow training to continue when there is worker failure.
● ParameterServerStrategy: This strategy supports using parameter servers either for multi-GPU local training or
asynchronous multi-machine training. When used to train locally, variables are not mirrored, instead they placed on the
CPU and operations are replicated across all local GPUs. In a multi-machine setting, some are designated as workers and
some as parameter servers. Each variable is placed on one parameter server. Computation operations are replicated
across all GPUs of the workers.
64
Challenge: Writing Distributed Model Functions
65
Challenge: Debugging
https://www.tensorflow.org/programmers_guide/debugger
66
Profiling
https://www.tensorflow.org/performance/performance_guide
• Crucial when using “expensive”
devices
• Memory Access Pattern
• “Secret knowledge”
• More is not necessarily better....
67
Hyperparameter Optimization
Step 1: Training
(In Data Center - Over Hours/Days/Weeks)
Dog
Input:
Lots of Labeled
Data
Output:
Trained Model
Deep neural
network model
https://towardsdatascience.com/understanding-hyperparameters-and-its-
optimisation-techniques-f0debba07568
● Networks Shape
● Learning Rate
● ...
68
Model Optimization
https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-post-training-integer-quantization-b4964a1ea9ba
69
Model Optimization
70
Challenge: Monitoring
• Understand {...}
• Debug
• Model Quality
– Accuracy
– Training Time
– …
• Overall Architecture
– Availability
– Latencies
– ...
• TensorBoard
• Traditional Cluster Monitoring
Tool

More Related Content

What's hot

Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Databricks
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
Databricks
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
Databricks
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
Databricks
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
Databricks
 

What's hot (20)

Scaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflowScaling Ride-Hailing with Machine Learning on MLflow
Scaling Ride-Hailing with Machine Learning on MLflow
 
mlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecyclemlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecycle
 
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
Porting R Models into Scala Spark
Porting R Models into Scala SparkPorting R Models into Scala Spark
Porting R Models into Scala Spark
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Feature drift monitoring as a service for machine learning models at scale
Feature drift monitoring as a service for machine learning models at scaleFeature drift monitoring as a service for machine learning models at scale
Feature drift monitoring as a service for machine learning models at scale
 
Spark ML Pipeline serving
Spark ML Pipeline servingSpark ML Pipeline serving
Spark ML Pipeline serving
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
 
NLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingNLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated Training
 
Seamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflowSeamless MLOps with Seldon and MLflow
Seamless MLOps with Seldon and MLflow
 
Splice Machine's use of Apache Spark and MLflow
Splice Machine's use of Apache Spark and MLflowSplice Machine's use of Apache Spark and MLflow
Splice Machine's use of Apache Spark and MLflow
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in production
 
NextGenML
NextGenML NextGenML
NextGenML
 
Importance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowImportance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLow
 
MLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine LearningMLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine Learning
 

Similar to The Quest for an Open Source Data Science Platform

OSCON 2014: Data Workflows for Machine Learning
OSCON 2014: Data Workflows for Machine LearningOSCON 2014: Data Workflows for Machine Learning
OSCON 2014: Data Workflows for Machine Learning
Paco Nathan
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
Databricks
 
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
Chengjen Lee
 

Similar to The Quest for an Open Source Data Science Platform (20)

Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Deconstructing a Machine Learning Pipeline with Virtual Data Lake
Deconstructing a Machine Learning Pipeline with Virtual Data LakeDeconstructing a Machine Learning Pipeline with Virtual Data Lake
Deconstructing a Machine Learning Pipeline with Virtual Data Lake
 
Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill
 
Apache Drill
Apache DrillApache Drill
Apache Drill
 
MLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to productionMLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to production
 
MLFlow 1.0 Meetup
MLFlow 1.0 Meetup MLFlow 1.0 Meetup
MLFlow 1.0 Meetup
 
OSCON 2014: Data Workflows for Machine Learning
OSCON 2014: Data Workflows for Machine LearningOSCON 2014: Data Workflows for Machine Learning
OSCON 2014: Data Workflows for Machine Learning
 
Utilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learningUtilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learning
 
Organizing the Data Chaos of Scientists
Organizing the Data Chaos of ScientistsOrganizing the Data Chaos of Scientists
Organizing the Data Chaos of Scientists
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
 
DataFinder: A Python Application for Scientific Data Management
DataFinder: A Python Application for Scientific Data ManagementDataFinder: A Python Application for Scientific Data Management
DataFinder: A Python Application for Scientific Data Management
 
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
將 Open Data 放上 Open Source Platforms: 開源資料入口平台 CKAN 開發經驗分享
 
Tango with django
Tango with djangoTango with django
Tango with django
 
2018 data engineering for ml asset management for features and models
2018 data engineering for ml asset management for features and models2018 data engineering for ml asset management for features and models
2018 data engineering for ml asset management for features and models
 
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
 
Reproducibility and automation of machine learning process
Reproducibility and automation of machine learning processReproducibility and automation of machine learning process
Reproducibility and automation of machine learning process
 
Apache Drill: An Active, Ad-hoc Query System for large-scale Data Sets
Apache Drill: An Active, Ad-hoc Query System for large-scale Data SetsApache Drill: An Active, Ad-hoc Query System for large-scale Data Sets
Apache Drill: An Active, Ad-hoc Query System for large-scale Data Sets
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 

More from QAware GmbH

"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
QAware GmbH
 
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See... Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
QAware GmbH
 

More from QAware GmbH (20)

50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf
 
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
 
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN MainzFully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
 
Down the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile ArchitectureDown the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile Architecture
 
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
 
Make Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform EngineeringMake Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform Engineering
 
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit PlaywrightDer Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
 
Was kommt nach den SPAs
Was kommt nach den SPAsWas kommt nach den SPAs
Was kommt nach den SPAs
 
Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo
 
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See... Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
 
Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAPKontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
 
Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.
 
Per Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysPer Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API Gateways
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 

Recently uploaded

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 

Recently uploaded (20)

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 

The Quest for an Open Source Data Science Platform