Distributed Heterogeneous Mixture Learning On Spark
Spark Summit 2016 talk by Masato Asahara (NEC) and Ryohei Fujimaki (NEC)
1.
Distributed Heterogeneous Mixture Learning on Spark
Masato Asahara and Ryohei Fujimaki, NEC Data Science Research Labs.
Jun/08/2016 @ Spark Summit 2016
2.
© NEC Corporation 2016
Who are we?
▌Masato Asahara (Ph.D.), Researcher, NEC Data Science Research Laboratory
Masato received his Ph.D. from Keio University and has worked at NEC for 6 years on distributed computing systems and computing resource management technologies. He currently leads the development of Spark-based machine learning and data analytics systems, particularly those using NEC’s Heterogeneous Mixture Learning technology.
▌Ryohei Fujimaki (Ph.D.), Research Fellow, NEC Data Science Research Laboratory
Ryohei is responsible for the core technologies behind NEC’s leading predictive and prescriptive analytics solutions, including “heterogeneous mixture learning” and “predictive optimization”. In addition to technology R&D, he co-develops advanced analytics solutions with NEC’s global business clients and partners in the North American and APAC regions. Ryohei received his Ph.D. from the University of Tokyo and became the youngest research fellow in NEC Corporation’s 115-year history.
3.
Agenda [figure label: HML]
4.
Agenda [figure labels: CPU; 10 SEP, 11 SEP, 12 SEP; HML]
5.
Agenda [figure label: HML]
6.
NEC’s Predictive Analytics and Heterogeneous Mixture Learning
7.
Enterprise Applications of HML
Driver Risk Assessment; Inventory Optimization; Churn Retention; Predictive Maintenance; Product Price Optimization; Sales Optimization; Energy/Water Operation Mgmt.
8.
NEC’s Heterogeneous Mixture Learning (HML)
NEC’s machine learning technology that automatically derives “accurate” and “transparent” formulas behind Big Data.
[Figure: heterogeneous mixture data (daily series, Mon to Sun) → Explore Massive Formulas → transparent data segmentation and predictive formulas. Example formulas shown: $Y = a_1 x_1 + \cdots + a_n x_n$; $Y = a_1 x_1 + \cdots + k_n x_3$; $Y = b_1 x_2 + \cdots + h_n x_n$; $Y = c_1 x_4 + \cdots + f_n x_n$]
9.
The HML Model
[Figure: rule-based tree with splits on Gender and Height (tall/short); features shown at the leaves: age, smoke, food, weight, sports, tea/coffee, beer/wine]
The health risk of a tall man can be predicted by age, alcohol, and smoking!
10.
HML (Heterogeneous Mixture Learning)
11.
HML (Heterogeneous Mixture Learning)
12.
HML Algorithm
Segment data by a rule-based tree
[Diagram: Training data, Data Segmentation, Feature Selection, Pruning]
13.
HML Algorithm
Select features and fit predictive models for data segments
[Diagram: Training data, Data Segmentation, Feature Selection, Pruning]
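As a rough illustration of the two steps on slides 12 and 13 (segment the data by a rule, then fit a predictive formula per segment), the following Python sketch splits toy data with one hand-written rule and fits a one-variable least-squares line in each segment. The rule, the toy data, and the helper names `fit_line`, `segment_rule`, and `predict` are hypothetical. This is not NEC's HML algorithm, which also learns the tree, selects features, and prunes.

```python
from collections import defaultdict

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def segment_rule(sample):
    """Hypothetical one-level rule tree: split on the day type."""
    return "weekend" if sample["day"] in ("Sat", "Sun") else "weekday"

# Toy data: weekdays follow y = 2x + 1, weekends follow y = -x + 10.
samples = (
    [{"day": "Mon", "x": x, "y": 2 * x + 1} for x in range(5)]
    + [{"day": "Sat", "x": x, "y": -x + 10} for x in range(5)]
)

# Step 1: data segmentation by the rule.
segments = defaultdict(list)
for s in samples:
    segments[segment_rule(s)].append((s["x"], s["y"]))

# Step 2: one predictive formula per segment.
models = {name: fit_line(pts) for name, pts in segments.items()}

def predict(sample):
    """Route the sample through the rule, then apply that segment's line."""
    a, b = models[segment_rule(sample)]
    return a * sample["x"] + b
```

Each segment ends up with its own readable formula, which is the "transparent" property the slides emphasize; a single global line would fit neither group well.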
14.
Enterprise Applications of HML
Driver Risk Assessment, Inventory Optimization, Churn Retention, Predictive Maintenance, Energy/Water Operation Mgmt., Sales Optimization, Product Price Optimization
15.
Demand for Big Data Analysis
24 (hours) × 365 (days) × 3 (years) × 1,000 (shops) ≈ 26,000,000 training samples
Driver Risk Assessment, Inventory Optimization, Churn Retention, Predictive Maintenance, Energy/Water Operation Mgmt., Product Price Optimization, Sales Optimization
16.
Demand for Big Data Analysis
5 million (customers) × 12 (months) = 60,000,000 training samples
Driver Risk Assessment, Inventory Optimization, Churn Retention, Predictive Maintenance, Energy/Water Demand Forecasting, Sales Forecasting, Product Price Optimization
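The two sample counts quoted on slides 15 and 16 follow from plain multiplication; the first one is rounded on the slide. A quick check in Python (the variable names are only for illustration):

```python
# Slide 15: hourly records for 1,000 shops over three years.
sales_samples = 24 * 365 * 3 * 1000

# Slide 16: 12 monthly records for each of 5 million customers.
customer_samples = 5_000_000 * 12

print(f"{sales_samples:,}")     # 26,280,000
print(f"{customer_samples:,}")  # 60,000,000
```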
17.
Demand for Big Data Analysis
Driver Risk Assessment, Inventory Optimization, Churn Retention, Predictive Maintenance, Energy/Water Operation Mgmt., Product Price Optimization, Sales Optimization
18.
Distributed HML on Spark
19.
Why Spark, not Hadoop?
Because Spark’s in-memory architecture can run HML faster
[Diagram: CPUs running Data Segmentation, Feature Selection, and Pruning]
20.
Data Scalability powered by Spark
Handle training data of virtually unlimited scale by adding executors
Add executors to get more memory and CPU power
[Diagram: driver, executors (CPU), HDFS, training data]
21.
Distributed HML Engine: Architecture
[Diagram: Data Scientist, invoke, HML model; Distributed HML Interface (Python); Distributed HML Core (Scala); Matrix computation libraries (Breeze, BLAS); YARN; HDFS]
22.
3 Technical Key Points to Run ML Fast on Spark [graphic only]
23.
3 Technical Key Points to Run ML Fast on Spark [graphic only]
24.
Challenge to Avoid Data Shuffling
▌Naïve Design: training data (TB~) moves among the driver and executors
▌Distributed HML: only models (10KB~10MB) move between the executors and the driver, which combines them with the HML Model Merge Algorithm
25.
Execution Flow of Distributed HML
Executors build a rule-based tree from their local data in parallel
26.
Execution Flow of Distributed HML
The driver merges the trees (HML Segment Merge Algorithm)
27.
Execution Flow of Distributed HML
The driver broadcasts the merged tree to the executors
28.
Execution Flow of Distributed HML
Executors perform feature selection on their local data in parallel
29.
Execution Flow of Distributed HML
The driver merges the results of feature selection (HML Feature Merge Algorithm)
30.
Execution Flow of Distributed HML
The driver merges the results of feature selection (HML Feature Merge Algorithm)
31.
Execution Flow of Distributed HML
The driver broadcasts the merged results of feature selection to the executors
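The flow on slides 25 to 31 (local computation on executors, merge on the driver, broadcast back) can be mimicked in plain Python without Spark. The sketch below uses a stand-in merge rule: each simulated executor reduces its partition to small per-segment sufficient statistics, and the driver sums them. The slides do not describe the actual HML Segment Merge or Feature Merge Algorithms, so `local_stats`, `merge_stats`, and `solve_line` are hypothetical names, and the summation only illustrates why small summaries, rather than raw training data, need to move.

```python
def local_stats(partition, threshold):
    """Executor side: summarize local (x, y) samples per segment as
    [count, sum_x, sum_y, sum_xx, sum_xy]. Only this summary is sent on."""
    stats = {"low": [0, 0.0, 0.0, 0.0, 0.0], "high": [0, 0.0, 0.0, 0.0, 0.0]}
    for x, y in partition:
        s = stats["low" if x < threshold else "high"]
        s[0] += 1
        s[1] += x
        s[2] += y
        s[3] += x * x
        s[4] += x * y
    return stats

def merge_stats(all_stats):
    """Driver side: element-wise sum of the executors' summaries."""
    merged = {}
    for stats in all_stats:
        for seg, s in stats.items():
            acc = merged.setdefault(seg, [0, 0.0, 0.0, 0.0, 0.0])
            for i, v in enumerate(s):
                acc[i] += v
    return merged

def solve_line(s):
    """Least-squares slope and intercept from merged sufficient statistics."""
    n, sx, sy, sxx, sxy = s
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

# Three simulated executors, each holding a slice of the training data.
data = [(x, 3 * x if x < 10 else 50 - x) for x in range(20)]
partitions = [data[0::3], data[1::3], data[2::3]]

threshold = 10  # stands in for the merged tree that the driver broadcasts
merged = merge_stats(local_stats(p, threshold) for p in partitions)
models = {seg: solve_line(s) for seg, s in merged.items()}
```

Each summary here is ten numbers regardless of partition size, which mirrors the size gap the talk points out between models (KB to MB) and training data (TB).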
32.
3 Technical Key Points to Run ML Fast on Spark [graphic only]
33.
Wait Time for Other Executors Delays Execution
Machine learning workloads tend to produce unbalanced computation loads
[Diagram: per-executor timelines of unequal length]
34.
Balance Computational Effort for Each Executor
Each executor optimizes all predictive formulas using an equal share of the training data
[Diagram: per-executor timelines]
35.
Balance Computational Effort for Each Executor
Each executor optimizes all predictive formulas using an equal share of the training data
[Diagram: per-executor timelines of equal length]
Completion time reduced!
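To see why equal division shortens the completion time, compare two assignments under a toy cost model in which work is proportional to the number of samples processed. The segment sizes and the cost model are assumptions for illustration, not figures from the talk.

```python
# Hypothetical segment sizes (training samples per data segment).
segment_sizes = {"weekday": 700, "Sat": 200, "Sun": 100}
num_executors = 3

# Design 1: one executor per segment. The job finishes only when the
# most heavily loaded executor finishes.
time_per_segment = max(segment_sizes.values())

# Design 2: every executor works on all segments, but only on an
# equal share of the samples (rounded up for any remainder).
total = sum(segment_sizes.values())
time_balanced = -(-total // num_executors)  # ceiling division

print(time_per_segment, time_balanced)  # 700 334
```

Under this model the slowest executor's load drops from 700 to 334 units, so the other executors no longer sit idle waiting for it.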
36.
3 Technical Key Points to Run ML Fast on Spark [graphic only]
37.
Machine Learning Consists of Many Matrix Computations
[Diagram: Data Segmentation, Feature Selection, and Pruning on the CPU involve Basic Matrix Computation, Matrix Inversion, Combinatorial Optimization, …]
38.
RDD Design for Leveraging High-Speed Libraries
▌Straightforward RDD Design: each element of a partition is one training sample (Training sample1, Training sample2, …); a map over the iterator requires matrix object creation before the matrix computation libraries (Breeze, BLAS) can be used
▌Distributed HML’s RDD Design: each element is a matrix object (Matrix object1, …) built from seq(Training sample1, Training sample2, …); mapPartition hands the matrix objects to the matrix computation libraries (Breeze, BLAS)
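A minimal Python sketch of the partition-level idea, without Spark: a function consumes a partition's iterator and yields a single matrix object, so later numeric code works on whole matrices rather than on individual samples. Plain nested lists stand in for Breeze/BLAS matrices, and `to_matrix`, `gram`, and `add` are hypothetical helper names. In PySpark, a function with this iterator-in, iterator-out shape is what `RDD.mapPartitions` accepts.

```python
def to_matrix(sample_iter):
    """Iterator of samples -> iterator holding one matrix (list of rows)."""
    rows = [list(sample) for sample in sample_iter]
    if rows:  # skip empty partitions
        yield rows

def gram(matrix):
    """X^T X for one matrix; a real system would hand this to BLAS."""
    cols = len(matrix[0])
    return [[sum(r[i] * r[j] for r in matrix) for j in range(cols)]
            for i in range(cols)]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Two simulated partitions of two-feature training samples.
partitions = [[(1, 2), (3, 4)], [(5, 6)]]

# One matrix object per partition, then one matrix operation per matrix.
matrices = [m for p in partitions for m in to_matrix(iter(p))]
grams = [gram(m) for m in matrices]

# Combine the per-partition results, as a driver-side reduce would.
total = grams[0]
for g in grams[1:]:
    total = add(total, g)

print(total)  # [[35, 44], [44, 56]]
```

Building the matrix once per partition means the object-creation cost is paid once, after which each partition needs only a single library call.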
39.
Performance Degradation by Long-Chained RDD
Long-chained RDD operations cause high-cost recalculations
RDD.map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .map(PruningMapFunc)
40.
Performance Degradation by Long-Chained RDD
Long-chained RDD operations cause high-cost recalculations
RDD.map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .map(PruningMapFunc)
[Diagram: after GC, the CPU recalculates from the first .map(TreeMapFunc)]
41.
Performance Degradation by Long-Chained RDD
Cut long-chained operations periodically by checkpoint()
RDD.map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .map(TreeMapFunc)
   .map(FeatureSelectionMapFunc)
   .reduce(TreeReduceFunc)
   .reduce(FeatureSelectionReduceFunc)
   .checkpoint()   (= savedRDD)
   .map(PruningMapFunc)
[Diagram: after GC, the CPU recalculates from savedRDD instead of from the first .map(TreeMapFunc)]
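The cost of replaying a long lineage, and the effect of cutting it, can be shown with a toy model in plain Python. `LazyData` is a hypothetical stand-in for an RDD, not Spark's API: it recomputes its whole chain on every evaluation, and its `checkpoint` method materializes eagerly. In Spark itself, `RDD.checkpoint()` only marks the RDD, which is then saved under the directory set with `SparkContext.setCheckpointDir` when the RDD is next computed.

```python
class LazyData:
    """Toy stand-in for an RDD: remembers its parent and a function,
    and recomputes the whole chain every time it is evaluated."""
    calls = 0  # counts how many map-function calls have been executed

    def __init__(self, source=None, parent=None, func=None):
        self.source, self.parent, self.func = source, parent, func

    def map(self, func):
        return LazyData(parent=self, func=func)

    def collect(self):
        if self.parent is None:
            return list(self.source)
        out = []
        for item in self.parent.collect():
            LazyData.calls += 1
            out.append(self.func(item))
        return out

    def checkpoint(self):
        """Materialize once and drop the lineage (eager, unlike Spark)."""
        return LazyData(source=self.collect())

def build_chain(base, steps):
    for _ in range(steps):
        base = base.map(lambda v: v + 1)
    return base

data = LazyData(source=[0, 1, 2])

# Without a checkpoint: each evaluation replays all earlier steps.
LazyData.calls = 0
long_chain = build_chain(data, 8)
long_chain.collect()
long_chain.map(lambda v: v * 2).collect()
calls_without = LazyData.calls

# With a checkpoint after step 8: later evaluations start from saved data.
LazyData.calls = 0
saved = build_chain(data, 8).checkpoint()
saved.collect()
result = saved.map(lambda v: v * 2).collect()
calls_with = LazyData.calls

print(calls_without, calls_with)  # 51 27
```

The gap grows with every extra evaluation of the chain, which is the situation an iterative learner is in when cached partitions are lost and must be recomputed.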
42.
Benchmark Performance Evaluations
43.
Eval 1: Prediction Error (vs. Spark MLlib algorithms)
Distributed HML achieves low error, competitive with a complex model
data                           # samples    Distributed HML  Logistic Regression  Decision Tree  Random Forests  HML rank
gas sensor array (CO)*         4,208,261    0.542            0.597                0.587          0.576           1st
household power consumption*   2,075,259    0.524            0.531                0.529          0.655           1st
HIGGS*                         11,000,000   0.335            0.358                0.337          0.317           2nd
HEPMASS*                       7,000,000    0.156            0.163                0.167          0.175           1st
* UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/)
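The rank labels on this slide (three "1st" and one "2nd") can be matched to datasets by sorting each row's errors, lowest first. The snippet below copies the numbers from the slide; the dictionary layout and the `rank_of` helper are illustrative only.

```python
# Prediction errors as printed on the slide (lower is better).
errors = {
    "gas sensor array (CO)": {
        "Distributed HML": 0.542, "Logistic Regression": 0.597,
        "Decision Tree": 0.587, "Random Forests": 0.576},
    "household power consumption": {
        "Distributed HML": 0.524, "Logistic Regression": 0.531,
        "Decision Tree": 0.529, "Random Forests": 0.655},
    "HIGGS": {
        "Distributed HML": 0.335, "Logistic Regression": 0.358,
        "Decision Tree": 0.337, "Random Forests": 0.317},
    "HEPMASS": {
        "Distributed HML": 0.156, "Logistic Regression": 0.163,
        "Decision Tree": 0.167, "Random Forests": 0.175},
}

def rank_of(method, row):
    """1-based position of `method` when the row is sorted by error."""
    ordered = sorted(row, key=row.get)
    return ordered.index(method) + 1

hml_ranks = {name: rank_of("Distributed HML", row)
             for name, row in errors.items()}
print(hml_ranks)
```

By these numbers Distributed HML is second only on HIGGS, where Random Forests has the lowest error.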
44.
Eval 2: Performance Scalability (vs. Spark MLlib algos)
Distributed HML is competitive with Spark MLlib implementations.
[Plots: HIGGS* (11M samples) and HEPMASS* (7M samples); x-axis: # CPU cores; series: Distributed HML, Logistic Regression]
* UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/)
45.
Evaluation in Real Case
46.
ATM Cash Balance Prediction
▌Too much cash: high inventory cost
▌Too little cash: high risk of running out of stock
▌Around 20,000 ATMs
47.
Training Speed
Serial HML*: 9 days → Distributed HML: 2 hours (110x speed-up)
▌Summary of data
# ATMs: around 20,000 (in Japan)
# training samples: around 10M
▌Cluster spec (10 nodes)
# CPU cores: 128
Memory: 2.5TB
Spark 1.6.1, Hadoop 2.7.1 (HDP 2.3.2)
* Run w/ 1 CPU core and 256GB memory
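As a sanity check on the quoted speed-up, the rounded times on this slide give about 108x, which is consistent with the stated 110x if the measured times were not exactly 9 days and 2 hours (an assumption). Dividing by the 128 cores gives a rough parallel efficiency relative to the single-core serial run.

```python
serial_hours = 9 * 24      # Serial HML: 9 days on 1 CPU core
distributed_hours = 2      # Distributed HML on the 128-core cluster
cores = 128

speedup = serial_hours / distributed_hours  # from the rounded slide values
efficiency = speedup / cores                # fraction of ideal linear scaling

print(speedup, round(efficiency, 2))  # 108.0 0.84
```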
48.
Prediction Error
Serial HML vs. Distributed HML. Winner: Distributed HML (average error: 17% lower)
▌Summary of data
# ATMs: around 20,000 (in Japan)
# training samples: around 23M
▌Cluster spec (10 nodes)
# CPU cores: 128
Memory: 2.5TB
Spark 1.6.1, Hadoop 2.7.1 (HDP 2.3.2)
49.
Summary [graphic only]
50.
Summary [graphic only]