Flink Forward San Francisco 2018 keynote: Anand Iyer - "Apache Flink + Apache Beam: Expanding the horizons of Big Data"

Over the past few months, the Apache Flink and Apache Beam communities have been busy developing an industry-leading solution for authoring batch and streaming pipelines in Python. This was made possible by a significant effort to revamp Beam’s portability framework, build the corresponding Flink Runner, and simplify Flink’s artifact distribution and deployment mechanisms. The “killer big-data app” enabled by this integration is production TensorFlow pipelines. Building production machine learning pipelines that process large distributed data sets can get complex. In this talk, we will describe a set of open source libraries developed at Google that simplify and unify the pre- and post-processing stages of a production TensorFlow pipeline. These libraries are built on Beam’s Python SDK and can be run on Apache Flink at scale. Last but not least, we will describe how Beam and Flink aim to bring the power of big data to newer audiences, in particular developers using the Go programming language.

© 2017 Google Inc. All rights reserved.
Expanding the horizons of Big-Data
Apache Flink + Apache Beam
Presenter: Anand Iyer, Product Manager, Google Cloud

Rich history of collaboration
Unified Batch & Streaming
Comprehensive streaming semantics & correctness
Streaming SQL (w/ Apache Calcite)

Flexible Big-Data Platform for Batch & Streaming
Horizontal Framework in multiple languages: Java, Python, ...
Vertical Solutions via domain specific libraries & tools: Machine Learning, Genomics, Time Series, ...

Making the power of Flink available in multiple languages

The Apache Beam Model

Cross-language Portability Framework
Language agnostic abstractions are at the core of the Beam Model
[Diagram: Language A, B, and C SDKs; the Beam Model; Runners 1, 2, and 3]

Prototype Flink Runner
❏ Works with Beam’s Python SDK
❏ Collaborators: Flink, Beam, Lyft, GetInData
❏ https://issues.apache.org/jira/browse/BEAM-2889
❏ For updates, please subscribe to the Apache Flink and Apache Beam blogs

Prototype Flink Runner
❏ Currently supports batch workloads
❏ Streaming capabilities on the roadmap

Tools & libraries for compelling use case verticals

Machine-Learning + Big-Data
Joined at the Hip

ML Code
Because, in addition to the actual ML...

...you have to worry about so much more.
Configuration
Data Collection
Data Verification
Feature Engineering
Process Management Tools
Analysis Tools
Machine Resource Management
Serving Infrastructure
Monitoring
ML Code

TensorFlow Transform
Consistent In-Graph Transformations in Training and Serving

Typical ML Pipeline
During training: data → batch processing
During serving: request → “live” processing

TensorFlow Transform
During training: data → tf.Transform batch processing
During serving: request → transform as tf.Graph
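
The idea on this slide, a full pass over the training data that produces a transform which is then applied unchanged at serving time, can be sketched in plain Python. This is only an illustration and does not use the tf.Transform API: `analyze` and `make_transform` are hypothetical names, and z-score scaling is an assumed example of a transform that needs dataset-wide statistics.

```python
import math
from typing import Callable, Sequence, Tuple


def analyze(values: Sequence[float]) -> Tuple[float, float]:
    """Full pass over the training data: compute mean and standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def make_transform(mean: float, std: float) -> Callable[[float], float]:
    """Freeze the analysis results into a per-record transform."""
    def transform(value: float) -> float:
        # Guard against a constant column, where the standard deviation is zero.
        return 0.0 if std == 0 else (value - mean) / std
    return transform


# During training: batch analysis over the whole dataset.
training_data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = analyze(training_data)
transform = make_transform(mean, std)
training_features = [transform(v) for v in training_data]

# During serving: the same frozen transform handles a single request.
serving_feature = transform(6.0)
```

Because the serving path reuses the statistics captured during training instead of recomputing them, the two paths cannot drift apart, which is the consistency property the slide title refers to.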

Rich collection of pre-implemented transforms
Scale to ...: tft.scale_to_z_score
Bag of Words / N-Grams: tf.string_split, tft.ngrams, tft.string_to_int
Bucketization: tft.quantiles, tft.apply_buckets
Feature Crosses: tf.string_join, tft.string_to_int
Apply another TensorFlow Model: tft.apply_saved_model
...
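
As a rough illustration of the bucketization pairing listed on this slide (a full-pass quantile analysis followed by a per-record bucket lookup), here is a plain-Python sketch. The functions below are local stand-ins written for this example, not the `tft` functions themselves, and the library's own implementation may differ; the sample data is made up.

```python
import bisect
from typing import List, Sequence


def quantile_boundaries(values: Sequence[float], num_buckets: int) -> List[float]:
    """Full pass: pick num_buckets - 1 boundaries that split the sorted values evenly."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // num_buckets] for i in range(1, num_buckets)]


def apply_buckets(value: float, boundaries: Sequence[float]) -> int:
    """Per record: map a value to the index of the bucket it falls into."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical feature column.
ages = [18, 22, 25, 31, 38, 44, 52, 67]
boundaries = quantile_boundaries(ages, num_buckets=4)
bucket_ids = [apply_buckets(a, boundaries) for a in ages]
```

The boundaries are computed once over the full dataset, while the lookup needs only a single value, so the lookup step can run per request at serving time.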

github.com/tensorflow/transform

TensorFlow Model Analysis
Scalable, sliced, and full-pass metrics

Analyzing model mistakes by subgroup
[ROC curve: Sensitivity (True Positive Rate) vs. 1 − Specificity (False Positive Rate), plotted for all groups, Group A, and Group B]
Learn more at ml-fairness.com
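
Analyzing mistakes by subgroup amounts to computing the same metric once per slice of the evaluation data. The sketch below does this in plain Python for the true positive rate and false positive rate; it is not the TensorFlow Model Analysis API, and `rates_by_group`, the `"all"` slice name, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each example: (group, true label, predicted label); labels are 0 or 1.
Example = Tuple[str, int, int]


def rates_by_group(examples: Iterable[Example]) -> Dict[str, Tuple[float, float]]:
    """Return {slice: (true positive rate, false positive rate)}, including an 'all' slice."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, label, prediction in examples:
        if label == 1:
            key = "tp" if prediction == 1 else "fn"
        else:
            key = "fp" if prediction == 1 else "tn"
        # Count every example in its own group and in the overall slice.
        for slice_name in ("all", group):
            counts[slice_name][key] += 1

    rates = {}
    for slice_name, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        tpr = c["tp"] / positives if positives else float("nan")
        fpr = c["fp"] / negatives if negatives else float("nan")
        rates[slice_name] = (tpr, fpr)
    return rates


# Hypothetical evaluation data for two groups.
examples = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 1), ("B", 0, 0),
]
rates = rates_by_group(examples)
```

In this made-up data the overall numbers look moderate while Group B is clearly worse than Group A, which is the kind of gap an aggregate metric hides.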

TensorFlow Model Analysis

github.com/tensorflow/model-analysis
https://medium.com/tensorflow/introducing-tensorflow-model-analysis-scaleable-sliced-and-full-pass-metrics-5cde7baf0b7b

TensorFlow Extended
https://youtu.be/vdG7uKQ2eKk

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. KDD (2017).
https://youtu.be/fPTwLVCq00U

Big-Data in Genomics
Beam library for transforming and processing VCF files at scale
https://github.com/googlegenomics/gcp-variant-transforms

Come join us!
Complete the Flink Runner
Add new language SDKs. JavaScript anyone?
Build vertical libraries & tools

Empfohlen

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...Flink Forward

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH Flink Forward

Streaming your Lyft Ride Prices - Flink Forward SF 2019

Streaming your Lyft Ride Prices - Flink Forward SF 2019

Streaming your Lyft Ride Prices - Flink Forward SF 2019Thomas Weise

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...Flink Forward

Realizing the promise of portability with Apache Beam

Realizing the promise of portability with Apache Beam

Realizing the promise of portability with Apache BeamJ On The Beach

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...Flink Forward

Building Applications with Streams and Snapshots

Building Applications with Streams and Snapshots

Building Applications with Streams and SnapshotsJ On The Beach

Empfohlen

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...

Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach...Flink Forward

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...

Flink Forward San Francisco 2019: Managing Flink on Kubernetes - FlinkK8sOper...Flink Forward

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH

A stream: Ad-hoc Shared Stream Processing - Jeyhun Karimov, DFKI GmbH Flink Forward

Streaming your Lyft Ride Prices - Flink Forward SF 2019

Streaming your Lyft Ride Prices - Flink Forward SF 2019

Streaming your Lyft Ride Prices - Flink Forward SF 2019Thomas Weise

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...

Flink Forward San Francisco 2019: Apache Beam portability in the times of rea...Flink Forward

Realizing the promise of portability with Apache Beam

Realizing the promise of portability with Apache Beam

Realizing the promise of portability with Apache BeamJ On The Beach

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...

Flink Forward San Francisco 2018: Dave Torok & Sameer Wadkar - "Embedding Fl...Flink Forward

Building Applications with Streams and Snapshots

Building Applications with Streams and Snapshots

Building Applications with Streams and SnapshotsJ On The Beach

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source toolsAll Things Open

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark WuFlink Forward

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink MeetupBowen Li

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...Flink Forward

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...Flink Forward

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...Flink Forward

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...HostedbyConfluent

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Flink Forward

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...Flink Forward

Do Flink on Web with FLOW

Do Flink on Web with FLOW

Do Flink on Web with FLOWDongwon Kim

dA Platform Overview

dA Platform Overview

dA Platform OverviewRobert Metzger

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Till Rohrmann

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Flink Forward

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, YelpFlink Forward

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...Flink Forward

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...Flink Forward

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006Randy Huang

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...HostedbyConfluent

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connectconfluent

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent

Weitere ähnliche Inhalte

Was ist angesagt?

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source toolsAll Things Open

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark WuFlink Forward

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink MeetupBowen Li

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...Flink Forward

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...Flink Forward

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...Flink Forward

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...HostedbyConfluent

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Flink Forward

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...Flink Forward

Do Flink on Web with FLOW

Do Flink on Web with FLOW

Do Flink on Web with FLOWDongwon Kim

dA Platform Overview

dA Platform Overview

dA Platform OverviewRobert Metzger

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...Till Rohrmann

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Flink Forward

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, YelpFlink Forward

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...Flink Forward

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...Flink Forward

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006Randy Huang

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...Flink Forward

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...HostedbyConfluent

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connectconfluent

Was ist angesagt? (20)

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source tools

Implementing MySQL Database-as-a-Service using open source tools

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Apache Flink @ Alibaba - Seattle Apache Flink Meetup

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: - Jinkui Shi and Radu Tudoran "Flink real-t...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Ac...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Flink Forward Berlin 2018: Viktor Klang - Keynote "The convergence of stream ...

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Achieving end-to-end visibility into complex event-sourcing transactions usin...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Flink Forward San Francisco 2018 keynote: Srikanth Satya - "Stream Processin...

Do Flink on Web with FLOW

Do Flink on Web with FLOW

Do Flink on Web with FLOW

dA Platform Overview

dA Platform Overview

dA Platform Overview

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Future of Apache Flink Deployments: Containers, Kubernetes and More - Flink F...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Virtual Flink Forward 2020: How Streaming Helps Your Staging Environment and ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Flink Forward San Francisco 2018: Ken Krugler - "Building a scalable focused ...

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006

Apache Beam @ GCPUG.TW Flink.TW 20161006

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Flink Forward San Francisco 2018: Xu Yang - "Alibaba’s common algorithm platf...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connect

End to-end large messages processing with Kafka Streams & Kafka Connect

Ähnlich wie Flink Forward San Francisco 2018 keynote: Anand Iyer - "Apache Flink + Apache Beam: Expanding the horizons of Big Data"

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, GoogleHostedbyConfluent

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...Edureka!

Hopsworks at Google AI Huddle, Sunnyvale

Hopsworks at Google AI Huddle, Sunnyvale

Hopsworks at Google AI Huddle, SunnyvaleJim Dowling

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...Lviv Startup Club

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und ExpertenAWS Germany

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSync

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSync

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSyncAmazon Web Services

Ten compelling reasons to learn .net framework

Ten compelling reasons to learn .net framework

Ten compelling reasons to learn .net frameworkJanBask Training

Comparative Study of programming Languages

Comparative Study of programming Languages

Comparative Study of programming LanguagesIshan Monga

Machine Learning State of the Union - MCL210 - re:Invent 2017

Machine Learning State of the Union - MCL210 - re:Invent 2017

Machine Learning State of the Union - MCL210 - re:Invent 2017Amazon Web Services

Artificial Intelligence (Machine Learning) on AWS: How to Start

Artificial Intelligence (Machine Learning) on AWS: How to Start

Artificial Intelligence (Machine Learning) on AWS: How to StartVladimir Simek

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...Information Development World

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...Amazon Web Services

Ashutosh's resume (3)

Ashutosh's resume (3)

Ashutosh's resume (3)Ashutosh Vishnoi

Building an MLOps Stack for Companies at Reasonable Scale

Building an MLOps Stack for Companies at Reasonable Scale

Building an MLOps Stack for Companies at Reasonable ScaleMerelda

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016Alluxio, Inc.

雲端推動的人工智能革命Amazon Web Services

An Early Evaluation of Running Spark on Kubernetes

An Early Evaluation of Running Spark on Kubernetes

An Early Evaluation of Running Spark on KubernetesDataWorks Summit

Generative AI on Enterprise Cloud with NiFi and Milvus

Generative AI on Enterprise Cloud with NiFi and Milvus

Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...Amazon Web Services

Ähnlich wie Flink Forward San Francisco 2018 keynote: Anand Iyer - "Apache Flink + Apache Beam: Expanding the horizons of Big Data" (20)

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

Hybrid Streaming Analytics for Apache Kafka Users | Firat Tekiner, Google

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...

PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...

Hopsworks at Google AI Huddle, Sunnyvale

Hopsworks at Google AI Huddle, Sunnyvale

Hopsworks at Google AI Huddle, Sunnyvale

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...

Mykola Murha "Using Google Cloud Platform for creating of Big Data Analysis ...

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten

Maschinelles Lernen auf AWS für Entwickler, Data Scientists und Experten

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSync

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSync

Hands-On Lab: Building a Serverless Real-Time Chat Application with AWS AppSync

Ten compelling reasons to learn .net framework

Ten compelling reasons to learn .net framework

Ten compelling reasons to learn .net framework

Comparative Study of programming Languages

Comparative Study of programming Languages

Comparative Study of programming Languages

Machine Learning State of the Union - MCL210 - re:Invent 2017

Machine Learning State of the Union - MCL210 - re:Invent 2017

Machine Learning State of the Union - MCL210 - re:Invent 2017

Artificial Intelligence (Machine Learning) on AWS: How to Start

Artificial Intelligence (Machine Learning) on AWS: How to Start

Artificial Intelligence (Machine Learning) on AWS: How to Start

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...

Single-Source Publishing Across Multiple Formats with George Bina and Radu Co...

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...

Orchestrating Machine Learning Training for Netflix Recommendations - MCL317 ...

Ashutosh's resume (3)

Ashutosh's resume (3)

Ashutosh's resume (3)

Building an MLOps Stack for Companies at Reasonable Scale

Building an MLOps Stack for Companies at Reasonable Scale

Building an MLOps Stack for Companies at Reasonable Scale

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

雲端推動的人工智能革命

An Early Evaluation of Running Spark on Kubernetes

An Early Evaluation of Running Spark on Kubernetes

An Early Evaluation of Running Spark on Kubernetes

Generative AI on Enterprise Cloud with NiFi and Milvus

Generative AI on Enterprise Cloud with NiFi and Milvus

Generative AI on Enterprise Cloud with NiFi and Milvus

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...

ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle (AIM396-S...

Flink Forward San Francisco 2018 keynote: Anand Iyer - "Apache Flink + Apache Beam: Expanding the horizons of Big Data"

1. © 2017 Google Inc. All rights reserved. Expanding the horizons of Big-Data: Apache Flink + Apache Beam. Presenter: Anand Iyer, Product Manager, Google Cloud

2. Rich history of collaboration: Unified Batch & Streaming; Comprehensive streaming semantics & correctness; Streaming SQL (w/ Apache Calcite)

3. Flexible Big-Data Platform for Batch & Streaming. Horizontal framework in multiple languages: Java, Python, ... Vertical solutions via domain-specific libraries & tools: Machine Learning, Genomics, Time Series, ...

4. Making the power of Flink available in multiple languages

5. The Apache Beam Model

6. Cross-language Portability Framework. Language-agnostic abstractions are at the core of the Beam Model. Diagram: Language A, B, and C SDKs; the Beam Model; Runners 1, 2, and 3.

7. Prototype Flink Runner ❏ Works with Beam’s Python SDK ❏ Collaborators: Flink, Beam, Lyft, GetInData ❏ https://issues.apache.org/jira/browse/BEAM-2889 ❏ For updates, please subscribe to the Apache Flink and Apache Beam blogs

8. Prototype Flink Runner ❏ Currently supports batch workloads ❏ Streaming capabilities on the roadmap

9. Tools & libraries for compelling use case verticals

10. Machine-Learning + Big-Data: Joined at the Hip

12. ML Code. Because, in addition to the actual ML...

13. ...you have to worry about so much more: Configuration, Data Collection, Data Verification, Feature Engineering, Process Management Tools, Analysis Tools, Machine Resource Management, Serving Infrastructure, Monitoring, ML Code

14. TensorFlow Transform: Consistent In-Graph Transformations in Training and Serving

15. Typical ML Pipeline. During training: batch processing of data. During serving: “live” processing of each request.

16. Typical ML Pipeline. During training: batch processing of data. During serving: “live” processing of each request.

17. TensorFlow Transform (tf.Transform). During training: batch processing of data. During serving: transform as tf.Graph, applied to each request.

18. Rich collection of pre-implemented transforms. Categories: Scale to ..., Bag of Words / N-Grams, Bucketization, Feature Crosses. Functions shown: tft.scale_to_z_score, tf.string_split, tft.ngrams, tft.string_to_int, tft.quantiles, tft.apply_buckets, tf.string_join, ...

19. Rich collection of pre-implemented transforms. Categories: Scale to ..., Bag of Words / N-Grams, Bucketization, Feature Crosses, Apply another TensorFlow Model. Functions shown: tft.scale_to_z_score, tf.string_split, tft.ngrams, tft.string_to_int, tft.quantiles, tft.apply_buckets, tf.string_join, tft.apply_saved_model, ...
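
The analyze-then-transform idea behind slides 14 to 19 can be sketched in plain Python. This is only an illustration of the concept and does not use the tf.Transform API: the names `analyze` and `make_transform`, the sample values, and the choice of population standard deviation are all assumptions made for the example. A full pass over the training data computes the constants, and the same frozen transform is then reused for a single serving request, which is the consistency the slides describe.

```python
import math


def analyze(values):
    """Full pass over the training data: compute mean and population std.

    Hypothetical helper; using the population (not sample) standard
    deviation is an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return {"mean": mean, "std": math.sqrt(variance)}


def make_transform(stats):
    """Freeze the analyzed constants into a function reusable at serving time."""
    mean, std = stats["mean"], stats["std"]

    def transform(value):
        # Guard against a constant column, where std would be zero.
        return 0.0 if std == 0 else (value - mean) / std

    return transform


# Training time: batch processing over the whole (made-up) dataset.
training_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = analyze(training_values)
scale = make_transform(stats)
train_scaled = [scale(v) for v in training_values]

# Serving time: the same transform is applied to one incoming request.
served = scale(7.0)
```

Because `scale` closes over the constants computed during training, the serving path cannot drift from the training path, which is the problem the "Typical ML Pipeline" slides point at.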

20. github.com/tensorflow/transform

21. TensorFlow Model Analysis: Scalable, sliced, and full-pass metrics

22. Analyzing model mistakes by subgroup. ROC curve (x-axis: Specificity (False Positive Rate); y-axis: Sensitivity (True Positive Rate)) for all groups. Learn more at ml-fairness.com

23. Analyzing model mistakes by subgroup. ROC curve (x-axis: Specificity (False Positive Rate); y-axis: Sensitivity (True Positive Rate)) for all groups, Group A, and Group B. Learn more at ml-fairness.com
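
What "sliced" metrics mean in slides 21 to 23 can be illustrated with a small plain-Python sketch. It does not use the TensorFlow Model Analysis API: the record layout, the field names `group`, `label`, and `prediction`, and the sample data are hypothetical. It computes the true positive rate and false positive rate (one point on an ROC curve) over all examples and again per subgroup.

```python
from collections import defaultdict


def rates(examples):
    """Return TPR and FPR for a list of (label, prediction) pairs of 0/1 values."""
    tp = fp = tn = fn = 0
    for label, prediction in examples:
        if label == 1 and prediction == 1:
            tp += 1
        elif label == 1 and prediction == 0:
            fn += 1
        elif label == 0 and prediction == 1:
            fp += 1
        else:
            tn += 1
    # A rate is undefined (None) when its denominator is empty.
    tpr = tp / (tp + fn) if (tp + fn) else None
    fpr = fp / (fp + tn) if (fp + tn) else None
    return {"tpr": tpr, "fpr": fpr}


def sliced_rates(records, slice_key):
    """Compute rates over all records and separately for each slice value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[slice_key]].append((record["label"], record["prediction"]))
    result = {"all": rates([(r["label"], r["prediction"]) for r in records])}
    for name, examples in groups.items():
        result[name] = rates(examples)
    return result


# Hypothetical evaluation records for two subgroups.
records = [
    {"group": "A", "label": 1, "prediction": 1},
    {"group": "A", "label": 1, "prediction": 1},
    {"group": "A", "label": 0, "prediction": 0},
    {"group": "A", "label": 0, "prediction": 0},
    {"group": "B", "label": 1, "prediction": 0},
    {"group": "B", "label": 1, "prediction": 1},
    {"group": "B", "label": 0, "prediction": 1},
    {"group": "B", "label": 0, "prediction": 0},
]
metrics = sliced_rates(records, "group")
```

In this made-up data the aggregate rates hide the fact that every error falls in group B, which is the kind of gap a per-slice view is meant to expose.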

24. TensorFlow Model Analysis

25. github.com/tensorflow/model-analysis https://medium.com/tensorflow/introducing-tensorflow-model-analysis-scaleable-sliced-and-full-pass-metrics-5cde7baf0b7b

26. TensorFlow Extended https://youtu.be/vdG7uKQ2eKk

27. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. KDD (2017). https://youtu.be/fPTwLVCq00U

28. Big-Data in Genomics. Beam library for transforming and processing VCF files at scale: https://github.com/googlegenomics/gcp-variant-transforms

29. Come join us! Complete the Flink Runner. Add new language SDKs (JavaScript, anyone?). Build vertical libraries & tools.