SlideShare a Scribd company logo
1 of 31
Download to read offline
Workflow Hacks! #1
Taro L. Saito

leo@treasure-data.com
Dec. 14, 2015
dots. Tokyo, Japan
Workflow Hacks! #1
2
アンケート
• 終了後 メールにてアンケートを送付します
• 質問内容
• 現在、どのようなシステムを使っているか?
• ワークフローでどのような問題を解決したいか?
• 回答いただいた方に、抽選でTreasure Dataパーカー
をプレゼント!
3
About Me: Taro L. Saito
4
2007 University of Tokyo. Ph.D.
XML DBMS, Transaction Processing
Relational-Style XML Query [SIGMOD 2008]
~ 2014 Assistant Professor at University of Tokyo
Genome Science Research
- Big Data Processing
- Distributed Computing
2014.03~ Treasure Data, Inc. Tokyo
2015.07~ Treasure Data, Inc. 

Mountain View, CA
Cloud Platform for Data Analytics
8
• Importing 1,000,000~ records / sec.
• Presto (Distributed SQL engine)
• 50,000~ queries / day
• Processing 10 trillion records / day
• http://qiita.com/xerial/items/a9093b60062f2c613fda
Import Export
Store
Analyze with
Presto/Hive
(Distributed SQL Engine)
Enterp
Enterprise
Data
BI
Workflow Fundamental Features
• Dependency management
• task1 -> task2 -> task3 …
• Scheduling
• Execution monitoring
• State management
• Error handling
• Easy access to logs
• Notification
9
Workflow Tools
• Workflow Management Tools
• Python: Luigi, Airflow, pinball
• For Hadoop: Oozie (XML)
• Script-based: Makefile, Azkaban
• Biological Science: Galaxy (Web UI), nextflow
• Domestic: JP1, Hinemos
• Dataflow DSL
• Spark, Flink, DriadLINQ, TensorFlow
• Cascading (Java -> MR), Scalding (Scala -> MR)
10
Dataflow DSL
• Translate this data processing program
• into a cluster computing program
11
A B
A0
A1
A2
B1
B2
f
B0
C
C
g
map reduce
f g
Redbook: Dataflow Engines
• Chapter 5: Large-Scale Dataflow Engine, by Peter Bailis
• http://www.redbook.io/ch5-dataflow.html
• DryadLINQ
• Most influential interface

for dataflow DSL
• SQL-like operation
• Functional style
• Spark
• SparkSQL
• 70% of Spark accesses
• Dataset API
• Shift to the dataframe based API
12
Dataflow -> Execution Plan
• Example - Hive: SQL to MapReduce
• Mapping SQL stages into MapReduce program
• SELECT page, count(*) FROM weblog

GROUP BY page
13
HDFS
A0
B0
A1
A2
B
B1
B2
B3
A
map reduce mergesplit
HDFS
TableScan(weblog)
GroupBy(hash(page))
count(weblog of a page)
result
Workflows
14
A
f
B C
g
D E
F
G
Hadoop is not enough
• C. Olston et al. [SIGMOD 2011]
• continuous processing
• independent scheduling
• Incremental processing
• Google Parcolator [OSDI 2010]
• Naiad - Differential Workflow

Microsoft [SOSP 2013]
15
Continuous Processing
• The Dataflow Model
• Akidau et al., Google [VLDB2015]
• Unbounded data processing
• late-coming data
• Integration of
• batch processing
• accumulation
16
Cluster Computing with Dryad 

M. Budiu, 2008
Cluster Computing with Dryad 

M. Budiu, 2008
Workflow Hacks!
Airflow
19
Airflow
• Best practices with Airflow - An open source
platform for workflows & schedules (Nov 2015)
• At Silicon Valley Data Engineering Meetup
• https://youtu.be/dgaoqOZlvEA
20
Workflow Development
• Programmatic
• Generate workflows by code
• Configuration as Code
• Workflow reuse/overwrite
• object oriented
• Parameterization
21
Luigi
• Luigiによるワークフロー管理
• http://qiita.com/k24d/items/
fb9bed08423e6249d376
22
Nextflow
• http://www.nextflow.io/
23
Dataflow DSL vs Workflow DSL
• Dataflow
• A -> B -> C -> …
• Data dependencies
• Workflow
• Task A -> Task B -> Task C -> …
• Task dependencies
• Data transfer is optional (through file or DB)
• + Scheduling
• + Task names
• For monitoring, redo, etc.
24
Weavelet (wvlet)
• Object-oriented workflow DSL for Scala
• Workflow reuse, extension, override
• Parameterization
• Function := Task, Workflow := Class
25
Isolating DAG generation and its execution
• Alternatives of MR
• Tez
• Pig on Spark https://issues.apache.org/jira/browse/PIG-4059
• Asakusa on Hadoop, Spark
26
Local
Hadoop
Spark
Result
DSL generates DAG
Stream DSL
• Add “moving stream” support to Dataflow DSL
• ”moving" streams and "resting" datasets
• Example
• Spark Streaming
• Spark DSL + Micro-batch for stream
• Microsoft Azure Stream SQL
• Windowing support for moving data
• Norikra
• Stream processing with SQL
• Reactive programming
• ReactiveX (Netflix), Akka Streaming (beta)  <- Stream DSL (DAG)
• Back-pressure support
• Controlling data transfer speed from receiver side
27
Task Execution Retry
• リトライと冪等性のデザインパターン
• http://frsyuki.hatenablog.com/entry/2014/06/09/164559
• System failures
• Process is not responding
• network, hardware failures
• Middleware failures
• provisioning failures, missing components
• User failures
• Wrong configuration
• Programming error
28
Retry Example
• Example: Task calling a REST API /create/xxx
• Client: First attempt
• Server returns 200 Success
• But failed to get the status code
• Client retries the task
• Get 409 conflict error (entry xxx is already created)
• Solution (Application side)
• Handle 409 error as success in the client (idempotent
execution)
• More strict approach
• Making xxx unique for each request
29
Fault Tolerance
• Presto: Distributed query engine developed by Facebook
• Uses HTTP data transfer
• No fault-tolerance
• 99.5% of queries finishes without any failure
• For queries processing 10 billions or more rows => Drops to 85%
30
A0
B0
A1
A2
B
B1
B2
B3
A
map reduce mergesplit
TableScan(weblog)
GroupBy(hash(page))
count(weblog of a page)
result
Summary
• Recent workflow tools
• Driven by Python community
• Because of this book! (=>)
• Airflow, Luigi, etc.
• Workflow manager
• Handle system failures, monitoring
• Workflow development
• DAG based DSL (dataflow, workflow, stream processing) -> Execution
• Does not cover application logic errors
• Idempotent execution
• Requires splitting large tasks into smaller ones
31

More Related Content

What's hot

Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure DataTaro L. Saito
 
20140120 presto meetup_en
20140120 presto meetup_en20140120 presto meetup_en
20140120 presto meetup_enOgibayashi
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoSadayuki Furuhashi
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookTreasure Data, Inc.
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Spark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteSpark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteDatabricks
 
Presto at Twitter
Presto at TwitterPresto at Twitter
Presto at TwitterBill Graham
 
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...viirya
 
Case study- Real-time OLAP Cubes
Case study- Real-time OLAP Cubes Case study- Real-time OLAP Cubes
Case study- Real-time OLAP Cubes Ziemowit Jankowski
 
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevWebinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevAltinity Ltd
 
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...Databricks
 
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and RSpark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and RDatabricks
 
Visualizing big data in the browser using spark
Visualizing big data in the browser using sparkVisualizing big data in the browser using spark
Visualizing big data in the browser using sparkDatabricks
 
Building Data Pipelines in Python
Building Data Pipelines in PythonBuilding Data Pipelines in Python
Building Data Pipelines in PythonC4Media
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangDatabricks
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of MusicLars Albertsson
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
Functional architectural patterns
Functional architectural patternsFunctional architectural patterns
Functional architectural patternsLars Albertsson
 
Presto updates to 0.178
Presto updates to 0.178Presto updates to 0.178
Presto updates to 0.178Kai Sasaki
 
Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure DataTaro L. Saito
 

What's hot (20)

Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
20140120 presto meetup_en
20140120 presto meetup_en20140120 presto meetup_en
20140120 presto meetup_en
 
Prestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for PrestoPrestogres, ODBC & JDBC connectivity for Presto
Prestogres, ODBC & JDBC connectivity for Presto
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @Facebook
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Spark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin KeynoteSpark Summit EU 2015: Reynold Xin Keynote
Spark Summit EU 2015: Reynold Xin Keynote
 
Presto at Twitter
Presto at TwitterPresto at Twitter
Presto at Twitter
 
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
 
Case study- Real-time OLAP Cubes
Case study- Real-time OLAP Cubes Case study- Real-time OLAP Cubes
Case study- Real-time OLAP Cubes
 
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander ZaitsevWebinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
Webinar 2017. Supercharge your analytics with ClickHouse. Alexander Zaitsev
 
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
 
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and RSpark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R
 
Visualizing big data in the browser using spark
Visualizing big data in the browser using sparkVisualizing big data in the browser using spark
Visualizing big data in the browser using spark
 
Building Data Pipelines in Python
Building Data Pipelines in PythonBuilding Data Pipelines in Python
Building Data Pipelines in Python
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of Music
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
Functional architectural patterns
Functional architectural patternsFunctional architectural patterns
Functional architectural patterns
 
Presto updates to 0.178
Presto updates to 0.178Presto updates to 0.178
Presto updates to 0.178
 
Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure Data
 

Viewers also liked

Apache Airflow入門 (マーケティングデータ分析基盤技術勉強会)
Apache Airflow入門  (マーケティングデータ分析基盤技術勉強会)Apache Airflow入門  (マーケティングデータ分析基盤技術勉強会)
Apache Airflow入門 (マーケティングデータ分析基盤技術勉強会)Takeshi Mikami
 
Apache Hbase バルクロードの使い方
Apache Hbase バルクロードの使い方Apache Hbase バルクロードの使い方
Apache Hbase バルクロードの使い方Takeshi Mikami
 
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月VirtualTech Japan Inc.
 
並列データベースシステムの概念と原理
並列データベースシステムの概念と原理並列データベースシステムの概念と原理
並列データベースシステムの概念と原理Makoto Yui
 
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決 - OpenStack最新情報セミナー...
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決  - OpenStack最新情報セミナー...Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決  - OpenStack最新情報セミナー...
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決 - OpenStack最新情報セミナー...VirtualTech Japan Inc.
 
OCP, Kubernetes ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)
OCP, Kubernetes  ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)OCP, Kubernetes  ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)
OCP, Kubernetes ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)VirtualTech Japan Inc.
 
EmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤とEmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤とToru Takahashi
 
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11Sadayuki Furuhashi
 
Rd gatewayによるwindowsインスタンスへの接続
Rd gatewayによるwindowsインスタンスへの接続Rd gatewayによるwindowsインスタンスへの接続
Rd gatewayによるwindowsインスタンスへの接続Amazon Web Services Japan
 

Viewers also liked (11)

Apache Airflow入門 (マーケティングデータ分析基盤技術勉強会)
Apache Airflow入門  (マーケティングデータ分析基盤技術勉強会)Apache Airflow入門  (マーケティングデータ分析基盤技術勉強会)
Apache Airflow入門 (マーケティングデータ分析基盤技術勉強会)
 
xrdpを使ったお手軽BYOD環境の構築
xrdpを使ったお手軽BYOD環境の構築xrdpを使ったお手軽BYOD環境の構築
xrdpを使ったお手軽BYOD環境の構築
 
Apache Hbase バルクロードの使い方
Apache Hbase バルクロードの使い方Apache Hbase バルクロードの使い方
Apache Hbase バルクロードの使い方
 
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月
Yahoo!Japan北米DCでOCPのツボをみせてもらってきました - OpenStack最新情報セミナー 2016年5月
 
xrdpで変える!社内のPC環境
xrdpで変える!社内のPC環境xrdpで変える!社内のPC環境
xrdpで変える!社内のPC環境
 
並列データベースシステムの概念と原理
並列データベースシステムの概念と原理並列データベースシステムの概念と原理
並列データベースシステムの概念と原理
 
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決 - OpenStack最新情報セミナー...
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決  - OpenStack最新情報セミナー...Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決  - OpenStack最新情報セミナー...
Z Lab社におけるOpenStack × Kubernetesの活用 〜アプリケーション開発者からみた課題解決 - OpenStack最新情報セミナー...
 
OCP, Kubernetes ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)
OCP, Kubernetes  ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)OCP, Kubernetes  ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)
OCP, Kubernetes ハイパースケールアーキテクチャ 導入の道のり - OpenStack最新情報セミナー(2016年7月)
 
EmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤とEmbulkとDigdagとデータ分析基盤と
EmbulkとDigdagとデータ分析基盤と
 
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
分散ワークフローエンジン『Digdag』の実装 at Tokyo RubyKaigi #11
 
Rd gatewayによるwindowsインスタンスへの接続
Rd gatewayによるwindowsインスタンスへの接続Rd gatewayによるwindowsインスタンスへの接続
Rd gatewayによるwindowsインスタンスへの接続
 

Similar to Workflow Hacks #1 - dots. Tokyo

Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriar
Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriarAdf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriar
Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriarNilesh Shah
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformGoDataDriven
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS CloudIdan Tohami
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
 
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!Teamstudio
 
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...Wilco Turnhout
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkkbajda
 
Untangling - fall2017 - week 9
Untangling - fall2017 - week 9Untangling - fall2017 - week 9
Untangling - fall2017 - week 9Derek Jacoby
 
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)Gabriele Bartolini
 
UK Community day 20180427 Microsoft Flow hackathon
UK Community day 20180427 Microsoft Flow hackathonUK Community day 20180427 Microsoft Flow hackathon
UK Community day 20180427 Microsoft Flow hackathonPenny Coventry
 
Datastage Online Training
Datastage Online TrainingDatastage Online Training
Datastage Online TrainingNagendra Kumar
 
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Wes McKinney
 
Practical automation for beginners
Practical automation for beginnersPractical automation for beginners
Practical automation for beginnersSeoweon Yoo
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoPGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoEqunix Business Solutions
 
SharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSPC Adriatics
 

Similar to Workflow Hacks #1 - dots. Tokyo (20)

Internals of Presto Service
Internals of Presto ServiceInternals of Presto Service
Internals of Presto Service
 
Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriar
Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriarAdf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriar
Adf and ala design c sharp corner toronto chapter feb 2019 meetup nik shahriar
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data Platform
 
JavaOne_2010
JavaOne_2010JavaOne_2010
JavaOne_2010
 
Background processing with hangfire
Background processing with hangfireBackground processing with hangfire
Background processing with hangfire
 
Intro to Big Data - Spark
Intro to Big Data - SparkIntro to Big Data - Spark
Intro to Big Data - Spark
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS Cloud
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!
The Autobahn Has No Speed Limit - Your XPages Shouldn't Either!
 
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...
SharePoint Connections Conference Amsterdam - Pitfalls and success factors of...
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
 
Untangling - fall2017 - week 9
Untangling - fall2017 - week 9Untangling - fall2017 - week 9
Untangling - fall2017 - week 9
 
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)
 
UK Community day 20180427 Microsoft Flow hackathon
UK Community day 20180427 Microsoft Flow hackathonUK Community day 20180427 Microsoft Flow hackathon
UK Community day 20180427 Microsoft Flow hackathon
 
DOTNET8.pptx
DOTNET8.pptxDOTNET8.pptx
DOTNET8.pptx
 
Datastage Online Training
Datastage Online TrainingDatastage Online Training
Datastage Online Training
 
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
 
Practical automation for beginners
Practical automation for beginnersPractical automation for beginners
Practical automation for beginners
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoPGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
 
SharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi Vončina
 

More from Taro L. Saito

Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Taro L. Saito
 
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Taro L. Saito
 
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Taro L. Saito
 
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020Taro L. Saito
 
Airframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpecAirframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpecTaro L. Saito
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
 
Reading The Source Code of Presto
Reading The Source Code of PrestoReading The Source Code of Presto
Reading The Source Code of PrestoTaro L. Saito
 
How To Use Scala At Work - Airframe In Action at Arm Treasure Data
How To Use Scala At Work - Airframe In Action at Arm Treasure DataHow To Use Scala At Work - Airframe In Action at Arm Treasure Data
How To Use Scala At Work - Airframe In Action at Arm Treasure DataTaro L. Saito
 
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018Taro L. Saito
 
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17Taro L. Saito
 
Tips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTaro L. Saito
 
Learning Silicon Valley Culture
Learning Silicon Valley CultureLearning Silicon Valley Culture
Learning Silicon Valley CultureTaro L. Saito
 
Scala at Treasure Data
Scala at Treasure DataScala at Treasure Data
Scala at Treasure DataTaro L. Saito
 
Presto As A Service - Treasure DataでのPresto運用事例
Presto As A Service - Treasure DataでのPresto運用事例Presto As A Service - Treasure DataでのPresto運用事例
Presto As A Service - Treasure DataでのPresto運用事例Taro L. Saito
 
Treasure Dataを支える技術 - MessagePack編
Treasure Dataを支える技術 - MessagePack編Treasure Dataを支える技術 - MessagePack編
Treasure Dataを支える技術 - MessagePack編Taro L. Saito
 
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoWeaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoTaro L. Saito
 
Spark Internals - Hadoop Source Code Reading #16 in Japan
Spark Internals - Hadoop Source Code Reading #16 in JapanSpark Internals - Hadoop Source Code Reading #16 in Japan
Spark Internals - Hadoop Source Code Reading #16 in JapanTaro L. Saito
 
Silkによる並列分散ワークフロープログラミング
Silkによる並列分散ワークフロープログラミングSilkによる並列分散ワークフロープログラミング
Silkによる並列分散ワークフロープログラミングTaro L. Saito
 

More from Taro L. Saito (20)

Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
 
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020
 
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020
 
Airframe RPC
Airframe RPCAirframe RPC
Airframe RPC
 
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
 
Airframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpecAirframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpec
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
 
Reading The Source Code of Presto
Reading The Source Code of PrestoReading The Source Code of Presto
Reading The Source Code of Presto
 
How To Use Scala At Work - Airframe In Action at Arm Treasure Data
How To Use Scala At Work - Airframe In Action at Arm Treasure DataHow To Use Scala At Work - Airframe In Action at Arm Treasure Data
How To Use Scala At Work - Airframe In Action at Arm Treasure Data
 
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
Airframe: Lightweight Building Blocks for Scala - Scale By The Bay 2018
 
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
Airframe: Lightweight Building Blocks for Scala @ TD Tech Talk 2018-10-17
 
Tips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTips For Maintaining OSS Projects
Tips For Maintaining OSS Projects
 
Learning Silicon Valley Culture
Learning Silicon Valley CultureLearning Silicon Valley Culture
Learning Silicon Valley Culture
 
Scala at Treasure Data
Scala at Treasure DataScala at Treasure Data
Scala at Treasure Data
 
Presto As A Service - Treasure DataでのPresto運用事例
Presto As A Service - Treasure DataでのPresto運用事例Presto As A Service - Treasure DataでのPresto運用事例
Presto As A Service - Treasure DataでのPresto運用事例
 
JNuma Library
JNuma LibraryJNuma Library
JNuma Library
 
Treasure Dataを支える技術 - MessagePack編
Treasure Dataを支える技術 - MessagePack編Treasure Dataを支える技術 - MessagePack編
Treasure Dataを支える技術 - MessagePack編
 
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoWeaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
 
Spark Internals - Hadoop Source Code Reading #16 in Japan
Spark Internals - Hadoop Source Code Reading #16 in JapanSpark Internals - Hadoop Source Code Reading #16 in Japan
Spark Internals - Hadoop Source Code Reading #16 in Japan
 
Silkによる並列分散ワークフロープログラミング
Silkによる並列分散ワークフロープログラミングSilkによる並列分散ワークフロープログラミング
Silkによる並列分散ワークフロープログラミング
 

Recently uploaded

Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm Systemirfanmechengr
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Piping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringPiping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringJuanCarlosMorales19600
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptJasonTagapanGulla
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 

Recently uploaded (20)

Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm System
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Piping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringPiping Basic stress analysis by engineering
Piping Basic stress analysis by engineering
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.ppt
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 

Workflow Hacks #1 - dots. Tokyo

  • 1. Workflow Hacks! #1 Taro L. Saito
 leo@treasure-data.com Dec. 14, 2015 dots. Tokyo, Japan
  • 3. アンケート • 終了後 メールにてアンケートを送付します • 質問内容 • 現在、どのようなシステムを使っているか? • ワークフローでどのような問題を解決したいか? • 回答いただいた方に、抽選でTreasure Dataパーカー をプレゼント! 3
  • 4. About Me: Taro L. Saito 4 2007 University of Tokyo. Ph.D. XML DBMS, Transaction Processing Relational-Style XML Query [SIGMOD 2008] ~ 2014 Assistant Professor at University of Tokyo Genome Science Research - Big Data Processing - Distributed Computing 2014.03~ Treasure Data, Inc. Tokyo 2015.07~ Treasure Data, Inc. 
 Mountain View, CA
  • 5.
  • 6.
  • 7.
  • 8. Cloud Platform for Data Analytics 8 • Importing 1,000,000~ records / sec. • Presto (Distributed SQL engine) • 50,000~ queries / day • Processing 10 trillion records / day • http://qiita.com/xerial/items/a9093b60062f2c613fda Import Export Store Analyze with Presto/Hive (Distributed SQL Engine) Enterp Enterprise Data BI
  • 9. Workflow Fundamental Features • Dependency management • task1 -> task2 -> task3 … • Scheduling • Execution monitoring • State management • Error handling • Easy access to logs • Notification 9
  • 10. Workflow Tools • Workflow Management Tools • Python: Luigi, Airflow, pinball • For Hadoop: Oozie (XML) • Script-based: Makefile, Azkaban • Biological Science: Galaxy (Web UI), nextflow • Domestic: JP1, Hinemos • Dataflow DSL • Spark, Flink, DriadLINQ, TensorFlow • Cascading (Java -> MR), Scalding (Scala -> MR) 10
  • 11. Dataflow DSL • Translate this data processing program • into a cluster computing program 11 A B A0 A1 A2 B1 B2 f B0 C C g map reduce f g
  • 12. Redbook: Dataflow Engines • Chapter 5: Large-Scale Dataflow Engine, by Peter Bailis • http://www.redbook.io/ch5-dataflow.html • DryadLINQ • Most influential interface
 for dataflow DSL • SQL-like operation • Functional style • Spark • SparkSQL • 70% of Spark accesses • Dataset API • Shift to the dataframe based API 12
  • 13. Dataflow -> Execution Plan • Example - Hive: SQL to MapReduce • Mapping SQL stages into MapReduce program • SELECT page, count(*) FROM weblog
 GROUP BY page 13 HDFS A0 B0 A1 A2 B B1 B2 B3 A map reduce mergesplit HDFS TableScan(weblog) GroupBy(hash(page)) count(weblog of a page) result
  • 15. Hadoop is not enough • C. Olston et al. [SIGMOD 2011] • continuous processing • independent scheduling • Incremental processing • Google Parcolator [OSDI 2010] • Naiad - Differential Workflow
 Microsoft [SOSP 2013] 15
  • 16. Continuous Processing • The Dataflow Model • Akidau et al., Google [VLDB2015] • Unbounded data processing • late-coming data • Integration of • batch processing • accumulation 16
  • 17. Cluster Computing with Dryad 
 M. Budiu, 2008
  • 18. Cluster Computing with Dryad 
 M. Budiu, 2008 Workflow Hacks!
  • 20. Airflow • Best practices with Airflow - An open source platform for workflows & schedules (Nov 2015) • At Silicon Valley Data Engineering Meetup • https://youtu.be/dgaoqOZlvEA 20
  • 21. Workflow Development • Programmatic • Generate workflows by code • Configuration as Code • Workflow reuse/overwrite • object oriented • Parameterization 21
  • 24. Dataflow DSL vs Workflow DSL • Dataflow • A -> B -> C -> … • Data dependencies • Workflow • Task A -> Task B -> Task C -> … • Task dependencies • Data transfer is optional (through file or DB) • + Scheduling • + Task names • For monitoring, redo, etc. 24
  • 25. Weavelet (wvlet) • Object-oriented workflow DSL for Scala • Workflow reuse, extension, override • Parameterization • Function := Task, Workflow := Class 25
  • 26. Isolating DAG generation and its execution • Alternatives of MR • Tez • Pig on Spark https://issues.apache.org/jira/browse/PIG-4059 • Asakusa on Hadoop, Spark 26 Local Hadoop Spark Result DSL generates DAG
  • 27. Stream DSL • Add “moving stream” support to Dataflow DSL • ”moving" streams and "resting" datasets • Example • Spark Streaming • Spark DSL + Micro-batch for stream • Microsoft Azure Stream SQL • Windowing support for moving data • Norikra • Stream processing with SQL • Reactive programming • ReactiveX (Netflix), Akka Streaming (beta)  <- Stream DSL (DAG) • Back-pressure support • Controlling data transfer speed from receiver side 27
  • 28. Task Execution Retry • リトライと冪等性のデザインパターン • http://frsyuki.hatenablog.com/entry/2014/06/09/164559 • System failures • Process is not responding • network, hardware failures • Middleware failures • provisioning failures, missing components • User failures • Wrong configuration • Programming error 28
  • 29. Retry Example • Example: Task calling a REST API /create/xxx • Client: First attempt • Server returns 200 Success • But failed to get the status code • Client retries the task • Get 409 conflict error (entry xxx is already created) • Solution (Application side) • Handle 409 error as success in the client (idempotent execution) • More strict approach • Making xxx unique for each request 29
  • 30. Fault Tolerance • Presto: Distributed query engine developed by Facebook • Uses HTTP data transfer • No fault-tolerance • 99.5% of queries finishes without any failure • For queries processing 10 billions or more rows => Drops to 85% 30 A0 B0 A1 A2 B B1 B2 B3 A map reduce mergesplit TableScan(weblog) GroupBy(hash(page)) count(weblog of a page) result
  • 31. Summary • Recent workflow tools • Driven by Python community • Because of this book! (=>) • Airflow, Luigi, etc. • Workflow manager • Handle system failures, monitoring • Workflow development • DAG based DSL (dataflow, workflow, stream processing) -> Execution • Does not cover application logic errors • Idempotent execution • Requires splitting large tasks into smaller ones 31