© 2015 IBM Corporation
Accelerating Machine Learning
Applications on Spark Using GPUs
Wei Tan, Liana Fong
Other contributors: Minsik Cho, Rajesh Bordawekar
October 25
Please Note:
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal
without notice at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction
and it should not be relied on in making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or
legal obligation to deliver any material, code or functionality. Information about potential future
products may not be incorporated into any contract.
• The development, release, and timing of any future features or functionality described for our
products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will vary
depending upon many factors, including considerations such as the amount of multiprogramming in the
user’s job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results similar to those stated
here.
Background: Apache Spark and MLlib
• Apache Spark
 – An in-memory engine for large-scale data processing
 – Used in database, stream, machine learning, and graph processing
[Figure: iterative processing of the input (iter. 1, iter. 2, ...)]
Background: Apache Spark and MLlib
[Figure: MLlib algorithm families: classification (LR, SVM, …), trees, recommendation, clustering, …]
Background: GPU computing
Xeon E5 2687 CPU vs. Tesla K40 GPU
GPU characteristics:
• Slower clock, less cache: not optimized for latency
• More transistors devoted to compute
• Higher flops and memory bandwidth
• Optimized for data-parallel, high-throughput workloads
Background: Apache Spark and MLlib
[Figure: MLlib algorithm families: classification (LR, SVM, …), trees, recommendation, clustering, …]
+ (GPU) connectors and libs?
Problem: large-scale matrix factorization
• Why
 – Recommendation is important in cognitive applications
 – Digital ads market in US: 37.3 b*: Spark/Facebook/IBM Commerce
 – Need a fast and scalable solution
Problem: large-scale matrix factorization
• Why
– Factorize the word co-occurrence matrix as a rating matrix
– Obtain word features that embed semantics
man – woman = king – queen = brother – sister …
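The analogy above can be illustrated with a small sketch. The three-dimensional feature vectors below are hypothetical values chosen by hand for illustration; they are not taken from the slides, and real word features would be learned by factorizing the co-occurrence matrix and would have many more dimensions.

```python
# Hypothetical word features (hand-picked, not learned from data).
features = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [3.0, 0.0, 0.9],
    "queen": [3.0, 1.0, 0.9],
}


def subtract(a, b):
    """Element-wise difference of two equally sized feature vectors."""
    return [x - y for x, y in zip(a, b)]


# If the features embed semantics, both differences encode the same relation.
offset_man_woman = subtract(features["man"], features["woman"])
offset_king_queen = subtract(features["king"], features["queen"])

print(offset_man_woman, offset_king_queen)
```

With features that capture semantics, the two offsets come out (approximately) equal, which is what the "man – woman = king – queen" line expresses.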
MF: the state of the art
• Many systems are optimized for medium-sized problems; very few target huge problems.
• Distributed solutions are slow.
 – Do not reach the roofline of CPU performance
 – Do not optimize communication
• Distributed solutions need a lot of resources and are costly.
MF: what we want to achieve
• Scale to problems of any size.
• Fast.
• Cost-efficient.
Solution: cuMF - ALS on a machine with GPUs
• On one GPU
 – GPU (Nvidia K40): memory BW: 288 GB/sec, compute: 5 Tflops
 – Memory slower than compute → need to optimize memory access!
• The roofline model
 – Higher Gflops → higher op intensity (more flops per byte) → caching!
[Figure: roofline plot; x-axis: operational intensity (Flops/Byte), marked at 1 and 17; y-axis: Gflops/s, marked at 288G and 5T]
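The roofline bound can be worked through with the two K40 figures quoted on this slide (288 GB/sec and 5 Tflops). The following is a minimal sketch of the standard roofline formula; the function and variable names are illustrative and not part of cuMF.

```python
# Roofline model: attainable throughput is capped either by peak compute or by
# memory bandwidth times operational intensity, whichever is lower.

PEAK_FLOPS = 5e12       # 5 Tflops, as quoted on the slide
MEM_BANDWIDTH = 288e9   # 288 GB/sec, as quoted on the slide


def attainable_flops(intensity, peak=PEAK_FLOPS, bandwidth=MEM_BANDWIDTH):
    """Roofline bound in flops/s for a kernel doing `intensity` flops per byte."""
    return min(peak, bandwidth * intensity)


# Ridge point: the intensity above which a kernel is no longer memory-bound.
ridge_point = PEAK_FLOPS / MEM_BANDWIDTH

print(f"ridge point: {ridge_point:.1f} flops/byte")
for intensity in (1, 4, 17, 32):
    gflops = attainable_flops(intensity) / 1e9
    print(f"{intensity:>2} flops/byte -> at most {gflops:.0f} Gflops/s")
```

At 1 flop per byte the bound is the 288 Gflops/s set by memory bandwidth; the full 5 Tflops only becomes reachable above roughly 17 flops per byte, which is why the slide stresses caching to raise operational intensity.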
Solution: cuMF - ALS on a machine with GPUs
• MO-ALS on one GPU: Memory-Optimized ALS
• Access many $\theta_v$ columns: irregular due to R’s sparseness
• Aggregate many $\theta_v \theta_v^T$ products: memory intensive
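The two steps named above (gathering the θv columns of the items a user rated, then aggregating the θvθvᵀ products) can be sketched for a single user. This is a plain-Python illustration of the textbook ALS update, not cuMF's GPU kernel; the names `update_user`, `solve`, and `lam`, and the use of an unweighted regularization term `lam * I`, are assumptions made for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def update_user(ratings, theta, lam):
    """One ALS update for one user.

    ratings: dict item_id -> rating (only the items this user rated)
    theta:   dict item_id -> item feature vector
    """
    # Irregular access: only the columns of rated items are gathered.
    columns = [(ratings[v], theta[v]) for v in ratings]
    f = len(columns[0][1])
    a = [[lam if i == j else 0.0 for j in range(f)] for i in range(f)]
    b = [0.0] * f
    # Memory-intensive part: accumulate theta_v * theta_v^T and r_v * theta_v.
    for r, t in columns:
        for i in range(f):
            b[i] += r * t[i]
            for j in range(f):
                a[i][j] += t[i] * t[j]
    return solve(a, b)


theta = {0: [1.0, 0.0], 3: [0.5, 0.5], 7: [0.0, 1.0]}
x_u = update_user({0: 2.0, 7: 4.0}, theta, lam=1.0)
print(x_u)
```

The item ids in the usage example are deliberately non-contiguous (0 and 7) to mirror the sparse, irregular access pattern the slide describes.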
Solution: cuMF - ALS on a machine with GPUs
• Texture memory to smooth non-contiguous, irregular memory access
• Register memory to hold hotspot variables
Solution: cuMF - ALS on a machine with GPUs
• On multiple GPUs
• Exploit data & model parallelism
– Data parallelism: solve using a portion of the training data
– Model parallelism: solve a portion of the model
• Exploit connection topology to minimize communication overhead
[Figure: data parallel vs. model parallel]
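The two forms of parallelism can be sketched in plain Python, with list slices standing in for GPUs. This is an illustration under stated assumptions rather than cuMF's implementation: the helper names (`partial_normal_eq`, `reduce_sum`, `split`) and the round-robin partitioning are hypothetical.

```python
def partial_normal_eq(ratings_shard, theta):
    """Accumulate A = sum(theta_v theta_v^T) and b = sum(r_v theta_v) over one shard."""
    f = len(next(iter(theta.values())))
    a = [[0.0] * f for _ in range(f)]
    b = [0.0] * f
    for v, r in ratings_shard:
        t = theta[v]
        for i in range(f):
            b[i] += r * t[i]
            for j in range(f):
                a[i][j] += t[i] * t[j]
    return a, b


def reduce_sum(parts):
    """Element-wise sum of the (A, b) pairs produced by the workers."""
    f = len(parts[0][1])
    a = [[sum(p[0][i][j] for p in parts) for j in range(f)] for i in range(f)]
    b = [sum(p[1][i] for p in parts) for i in range(f)]
    return a, b


def split(seq, n_workers):
    """Round-robin partition, standing in for a per-GPU assignment."""
    return [seq[w::n_workers] for w in range(n_workers)]


theta = {0: [1.0, 2.0], 1: [0.5, 1.0], 2: [2.0, 0.0], 3: [1.0, 1.0]}
ratings_u = [(0, 5.0), (1, 3.0), (2, 4.0), (3, 1.0)]  # (item_id, rating) for one user

# Data parallelism: each worker sees a portion of the training data and the
# partial results are reduced before the system is solved.
shards = split(ratings_u, 2)
a_dp, b_dp = reduce_sum([partial_normal_eq(s, theta) for s in shards])
a_full, b_full = partial_normal_eq(ratings_u, theta)

# Model parallelism: each worker is responsible for a portion of the model,
# here a disjoint subset of the user ids whose features it solves for.
user_ids = [10, 11, 12, 13, 14]
user_partitions = split(user_ids, 2)

print(a_dp == a_full, b_dp == b_full, user_partitions)
```

The reduction step in the data-parallel path is where inter-GPU communication occurs, which is why the slide pairs these techniques with exploiting the connection topology.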
CuMF performance
CuMF Performance
• cuMF: ALS on a single machine with 2 × Nvidia K80 (4 cards)
 – Compared with state-of-the-art distributed solutions
• 6-10x as fast
• 33-100x as cost-efficient (cuMF costs $2.5 per hour on SoftLayer)
 – Able to factorize the largest matrix ever reported
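How a speedup and an hourly price combine into a cost-efficiency figure can be sketched as below. The sketch assumes that job cost is simply price per hour times running time; the baseline cluster price used in the example is hypothetical and is not a figure from the slides.

```python
CUMF_PRICE_PER_HOUR = 2.5  # from the slide


def cost_efficiency_gain(speedup, baseline_price_per_hour,
                         cumf_price_per_hour=CUMF_PRICE_PER_HOUR):
    """Ratio of baseline job cost to cuMF job cost.

    Assumes job cost = price per hour * hours, and that cuMF needs
    1 / speedup of the baseline's running time.
    """
    return speedup * baseline_price_per_hour / cumf_price_per_hour


# Hypothetical baseline: a cluster billed at $25 per hour that is 10x slower.
example_gain = cost_efficiency_gain(speedup=10, baseline_price_per_hour=25.0)
print(f"{example_gain:.0f}x as cost-efficient")
```

Under this assumption the gain is the product of the speedup and the price ratio, so a cost-efficiency gain larger than the speedup implies that the compared cluster is also more expensive per hour than the single GPU machine.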
CuMF Performance
• cuMF: ALS on a machine with one GPU
 – 4x speedup when used as an accelerator for Spark ALS
[Figure: Spark ALS (Spark run-time, MLlib) compared with cuMF with Spark (cuMF, C)]
Roadmap
• Current work
 – Impressive acceleration of MF with GPUs on one machine
 – GPU acceleration techniques with model and data parallelism
 – Illustrated applicability of GPU acceleration to Spark/MLlib
 – Performance evaluations on K40 and K80 GPUs, Intel and Power
• Future work
 – GPU acceleration of other ML algorithms in MLlib or elsewhere
 – Acceleration of algorithms for multiple GPUs on a single machine and across machines, with and without RDMA across machines
 – Performance evaluation on other hardware, including
  • Other GPUs such as Nvidia Maxwell
  • Forthcoming NVLink connectivity across GPUs within a single machine
Notices and Disclaimers
Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form
without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for
accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to
update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO
EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO,
LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted
according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as
illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other
results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services
available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the
views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or
other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the
identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the
customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will
ensure that the customer is in compliance with any law.
Notices and Disclaimers (cont’d)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly
available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to
interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
• IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document
Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM
SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON,
OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®,
pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ,
Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of
International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at:
www.ibm.com/legal/copytrade.shtml.
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

IBM z/OS V2R2 Networking Technologies Update
IBM z/OS V2R2 Networking Technologies UpdateIBM z/OS V2R2 Networking Technologies Update
IBM z/OS V2R2 Networking Technologies UpdateAnderson Bassani
 
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014IBM Systems UKI
 
Ims01 ims trends and directions - IMS UG May 2014 Sydney & Melbourne
Ims01   ims trends and directions - IMS UG May 2014 Sydney & MelbourneIms01   ims trends and directions - IMS UG May 2014 Sydney & Melbourne
Ims01 ims trends and directions - IMS UG May 2014 Sydney & MelbourneRobert Hain
 
IBM Server Makeover. Your first step towards lower costs, lower risks
IBM Server Makeover. Your first step towards lower costs, lower risksIBM Server Makeover. Your first step towards lower costs, lower risks
IBM Server Makeover. Your first step towards lower costs, lower risksIBM India Smarter Computing
 
Optimizing z/OS Batch
Optimizing z/OS BatchOptimizing z/OS Batch
Optimizing z/OS BatchMartin Packer
 
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...IBM India Smarter Computing
 
Tip from IBM Connect2014: XPages Accessibility
Tip from IBM Connect2014: XPages AccessibilityTip from IBM Connect2014: XPages Accessibility
Tip from IBM Connect2014: XPages AccessibilitySocialBiz UserGroup
 
OpenWhisk Part 1 Research Data at Interconnect 2017
OpenWhisk Part 1 Research Data at Interconnect 2017OpenWhisk Part 1 Research Data at Interconnect 2017
OpenWhisk Part 1 Research Data at Interconnect 2017Perry Cheng
 
Improving Software Delivery with Software Defined Environments (IBM Interconn...
Improving Software Delivery with Software Defined Environments (IBM Interconn...Improving Software Delivery with Software Defined Environments (IBM Interconn...
Improving Software Delivery with Software Defined Environments (IBM Interconn...Michael Elder
 
Become an IBM Cloud Architect in 40 Minutes
Become an IBM Cloud Architect in 40 MinutesBecome an IBM Cloud Architect in 40 Minutes
Become an IBM Cloud Architect in 40 MinutesAndrew Ferrier
 
AD 1656 - Transforming social data into business insight
AD 1656 - Transforming social data into business insightAD 1656 - Transforming social data into business insight
AD 1656 - Transforming social data into business insightVincent Burckhardt
 
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...paul young cpa, cga
 
OpenWhisk Part 2 Research Day at Interconnect 2017
OpenWhisk Part 2 Research Day at Interconnect 2017OpenWhisk Part 2 Research Day at Interconnect 2017
OpenWhisk Part 2 Research Day at Interconnect 2017Perry Cheng
 
2016 interconnect 7 habits of a successful scaled agile adoption using ibm clm
2016 interconnect   7 habits of a successful scaled agile adoption using ibm clm2016 interconnect   7 habits of a successful scaled agile adoption using ibm clm
2016 interconnect 7 habits of a successful scaled agile adoption using ibm clmReedy Feggins Jr
 
From Creepy to Cool: Fine Lines in Audience Analytics
From Creepy to Cool: Fine Lines in Audience AnalyticsFrom Creepy to Cool: Fine Lines in Audience Analytics
From Creepy to Cool: Fine Lines in Audience Analyticsgraemeknows
 
Creepy to cool audience analytics e merge 2014
Creepy to cool   audience analytics e merge 2014Creepy to cool   audience analytics e merge 2014
Creepy to cool audience analytics e merge 2014graemeknows
 
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - Strategy
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - StrategyNRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - Strategy
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - StrategyNRB
 
TI 1641 - delivering enterprise software at the speed of cloud
TI 1641 - delivering enterprise software at the speed of cloudTI 1641 - delivering enterprise software at the speed of cloud
TI 1641 - delivering enterprise software at the speed of cloudVincent Burckhardt
 

Was ist angesagt? (19)

IBM z/OS V2R2 Networking Technologies Update
IBM z/OS V2R2 Networking Technologies UpdateIBM z/OS V2R2 Networking Technologies Update
IBM z/OS V2R2 Networking Technologies Update
 
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014
PureApp Hybrid Cloud Jonathan Langley Presentation 11th September 2014
 
Ims01 ims trends and directions - IMS UG May 2014 Sydney & Melbourne
Ims01   ims trends and directions - IMS UG May 2014 Sydney & MelbourneIms01   ims trends and directions - IMS UG May 2014 Sydney & Melbourne
Ims01 ims trends and directions - IMS UG May 2014 Sydney & Melbourne
 
IBM Server Makeover. Your first step towards lower costs, lower risks
IBM Server Makeover. Your first step towards lower costs, lower risksIBM Server Makeover. Your first step towards lower costs, lower risks
IBM Server Makeover. Your first step towards lower costs, lower risks
 
Optimizing z/OS Batch
Optimizing z/OS BatchOptimizing z/OS Batch
Optimizing z/OS Batch
 
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...
Getting the MAX from your Virtualized Environment: Comprehensive Solutions fr...
 
Tip from IBM Connect2014: XPages Accessibility
Tip from IBM Connect2014: XPages AccessibilityTip from IBM Connect2014: XPages Accessibility
Tip from IBM Connect2014: XPages Accessibility
 
OpenWhisk Part 1 Research Data at Interconnect 2017
OpenWhisk Part 1 Research Data at Interconnect 2017OpenWhisk Part 1 Research Data at Interconnect 2017
OpenWhisk Part 1 Research Data at Interconnect 2017
 
Improving Software Delivery with Software Defined Environments (IBM Interconn...
Improving Software Delivery with Software Defined Environments (IBM Interconn...Improving Software Delivery with Software Defined Environments (IBM Interconn...
Improving Software Delivery with Software Defined Environments (IBM Interconn...
 
Become an IBM Cloud Architect in 40 Minutes
Become an IBM Cloud Architect in 40 MinutesBecome an IBM Cloud Architect in 40 Minutes
Become an IBM Cloud Architect in 40 Minutes
 
AD 1656 - Transforming social data into business insight
AD 1656 - Transforming social data into business insightAD 1656 - Transforming social data into business insight
AD 1656 - Transforming social data into business insight
 
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...
Vision 2016 fpm 1072 - tips on using ibm cognos command center with ibm plann...
 
OpenWhisk Part 2 Research Day at Interconnect 2017
OpenWhisk Part 2 Research Day at Interconnect 2017OpenWhisk Part 2 Research Day at Interconnect 2017
OpenWhisk Part 2 Research Day at Interconnect 2017
 
IOD 2012_ADP_092912
IOD 2012_ADP_092912 IOD 2012_ADP_092912
IOD 2012_ADP_092912
 
2016 interconnect 7 habits of a successful scaled agile adoption using ibm clm
2016 interconnect   7 habits of a successful scaled agile adoption using ibm clm2016 interconnect   7 habits of a successful scaled agile adoption using ibm clm
2016 interconnect 7 habits of a successful scaled agile adoption using ibm clm
 
From Creepy to Cool: Fine Lines in Audience Analytics
From Creepy to Cool: Fine Lines in Audience AnalyticsFrom Creepy to Cool: Fine Lines in Audience Analytics
From Creepy to Cool: Fine Lines in Audience Analytics
 
Creepy to cool audience analytics e merge 2014
Creepy to cool   audience analytics e merge 2014Creepy to cool   audience analytics e merge 2014
Creepy to cool audience analytics e merge 2014
 
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - Strategy
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - StrategyNRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - Strategy
NRB - LUXEMBOURG MAINFRAME DAY 2017 - z platform - Strategy
 
TI 1641 - delivering enterprise software at the speed of cloud
TI 1641 - delivering enterprise software at the speed of cloudTI 1641 - delivering enterprise software at the speed of cloud
TI 1641 - delivering enterprise software at the speed of cloud
 

Andere mochten auch

SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingSIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingMark Kilgard
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrKohei KaiGai
 
Computational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in RComputational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in Rherbps10
 
GPUs in Big Data - StampedeCon 2014
GPUs in Big Data - StampedeCon 2014GPUs in Big Data - StampedeCon 2014
GPUs in Big Data - StampedeCon 2014StampedeCon
 
Deep learning on spark
Deep learning on sparkDeep learning on spark
Deep learning on sparkSatyendra Rana
 
GTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path RenderingGTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path Rendering Mark Kilgard
 
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...odsc
 
Heterogeneous System Architecture Overview
Heterogeneous System Architecture OverviewHeterogeneous System Architecture Overview
Heterogeneous System Architecture Overviewinside-BigData.com
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015Kohei KaiGai
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScaleGoDataDriven
 
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...Spark Summit
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillotsparktc
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudSubbu Rama
 
How to Solve Real-Time Data Problems
How to Solve Real-Time Data ProblemsHow to Solve Real-Time Data Problems
How to Solve Real-Time Data ProblemsIBM Power Systems
 
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Chris Fregly
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkSpark Summit
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit
 

Andere mochten auch (20)

SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingSIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
Computational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in RComputational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in R
 
GPU Ecosystem
GPU EcosystemGPU Ecosystem
GPU Ecosystem
 
GPUs in Big Data - StampedeCon 2014
GPUs in Big Data - StampedeCon 2014GPUs in Big Data - StampedeCon 2014
GPUs in Big Data - StampedeCon 2014
 
Deep learning on spark
Deep learning on sparkDeep learning on spark
Deep learning on spark
 
GTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path RenderingGTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path Rendering
 
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...
 
Heterogeneous System Architecture Overview
Heterogeneous System Architecture OverviewHeterogeneous System Architecture Overview
Heterogeneous System Architecture Overview
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at Scale
 
Hadoop + GPU
Hadoop + GPUHadoop + GPU
Hadoop + GPU
 
Deep Learning on Hadoop
Deep Learning on HadoopDeep Learning on Hadoop
Deep Learning on Hadoop
 
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillot
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the Cloud
 
How to Solve Real-Time Data Problems
How to Solve Real-Time Data ProblemsHow to Solve Real-Time Data Problems
How to Solve Real-Time Data Problems
 
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim Hunter
 

Ähnlich wie Accelerating ML Apps on Spark Using GPUs

DESY's new data taking and analysis infrastructure for PETRA III
DESY's new data taking and analysis infrastructure for PETRA IIIDESY's new data taking and analysis infrastructure for PETRA III
DESY's new data taking and analysis infrastructure for PETRA IIIUlf Troppens
 
Best practices for cloud hosted api management
Best practices for cloud hosted api managementBest practices for cloud hosted api management
Best practices for cloud hosted api managementsflynn073
 
Creating your own cloud hosted APIM platform
Creating your own cloud hosted APIM platformCreating your own cloud hosted APIM platform
Creating your own cloud hosted APIM platformsflynn073
 
Making People Flow in Cities Measurable and Analyzable
Making People Flow in Cities Measurable and AnalyzableMaking People Flow in Cities Measurable and Analyzable
Making People Flow in Cities Measurable and AnalyzableWeiwei Yang
 
Disaster Recovery using Spectrum Scale Active File Management
Disaster Recovery using Spectrum Scale Active File ManagementDisaster Recovery using Spectrum Scale Active File Management
Disaster Recovery using Spectrum Scale Active File ManagementTrishali Nayar
 
Evolving a monolithic Java EE application to microservices
Evolving a monolithic Java EE application to microservicesEvolving a monolithic Java EE application to microservices
Evolving a monolithic Java EE application to microservicesErin Schnabel
 
The Bluemix Quadruple Threat
The Bluemix Quadruple ThreatThe Bluemix Quadruple Threat
The Bluemix Quadruple ThreatRam Vennam
 
IBM Design Thinking + Agile + DevOps Interconnect 2017
IBM Design Thinking + Agile + DevOps Interconnect 2017IBM Design Thinking + Agile + DevOps Interconnect 2017
IBM Design Thinking + Agile + DevOps Interconnect 2017David Luke
 
InterConnect 2017 : z/OS-as-a-Service: The Disposable LPAR
InterConnect 2017 : z/OS-as-a-Service: The Disposable LPARInterConnect 2017 : z/OS-as-a-Service: The Disposable LPAR
InterConnect 2017 : z/OS-as-a-Service: The Disposable LPARDevOps for Enterprise Systems
 
Union Bank Slashes Onboarding Times with Analytics
Union Bank Slashes Onboarding Times with Analytics Union Bank Slashes Onboarding Times with Analytics
Union Bank Slashes Onboarding Times with Analytics Pyramid Solutions, Inc.
 
IC6284A - The Art of Choosing the Best Cloud Solution
IC6284A - The Art of Choosing the Best Cloud SolutionIC6284A - The Art of Choosing the Best Cloud Solution
IC6284A - The Art of Choosing the Best Cloud SolutionHendrik van Run
 
Managing integration in a multi cluster world
Managing integration in a multi cluster worldManaging integration in a multi cluster world
Managing integration in a multi cluster worldShikha Srivastava
 
Fnb optimizes retail banking product offers using real-time propensity models...
Fnb optimizes retail banking product offers using real-time propensity models...Fnb optimizes retail banking product offers using real-time propensity models...
Fnb optimizes retail banking product offers using real-time propensity models...Avsharn
 
Integrating BigInsights and Puredata system for analytics with query federati...
Integrating BigInsights and Puredata system for analytics with query federati...Integrating BigInsights and Puredata system for analytics with query federati...
Integrating BigInsights and Puredata system for analytics with query federati...Seeling Cheung
 
Informix REST API Tutorial
Informix REST API TutorialInformix REST API Tutorial
Informix REST API TutorialBrian Hughes
 
Java and the GPU - Everything You Need To Know
Java and the GPU - Everything You Need To KnowJava and the GPU - Everything You Need To Know
Java and the GPU - Everything You Need To KnowAdam Roberts
 
IBM Message Hub: Cloud-Native Messaging
IBM Message Hub: Cloud-Native MessagingIBM Message Hub: Cloud-Native Messaging
IBM Message Hub: Cloud-Native MessagingAndrew Schofield
 
Witness the Evolution of Teamwork
Witness the Evolution of TeamworkWitness the Evolution of Teamwork
Witness the Evolution of TeamworkMatt Holitza
 
Exposing auto-generated Swagger 2.0 documents from Liberty!
Exposing auto-generated Swagger 2.0 documents from Liberty!Exposing auto-generated Swagger 2.0 documents from Liberty!
Exposing auto-generated Swagger 2.0 documents from Liberty!Arthur De Magalhaes
 
JavaOne 2015 CON7547 "Beyond the Coffee Cup: Leveraging Java Runtime Technolo...
JavaOne 2015 CON7547 "Beyond the Coffee Cup: Leveraging Java Runtime Technolo...JavaOne 2015 CON7547 "Beyond the Coffee Cup: Leveraging Java Runtime Technolo...
JavaOne 2015 CON7547 "Beyond the Coffee Cup: Leveraging Java Runtime Technolo...0xdaryl
 

Accelerating ML Apps on Spark Using GPUs

  • 1. Accelerating Machine Learning Applications on Spark Using GPUs • Wei Tan, Liana Fong • Other contributors: Minisk Cho, Rajesh Bordawekar • October 25 • © 2015 IBM Corporation
  • 2. Please Note: • IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. • Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. • The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. • The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 3. Background: Apache Spark and MLlib • Apache Spark – An in-memory engine for large-scale data processing – Used in database, stream, machine learning, and graph processing [Figure: iterative processing starting from Input: iter. 1, iter. 2, ...]
  • 4. Background: Apache Spark and MLlib • Classification (LR, SVM…) • Trees • Recommendation • Clustering • …
  • 5. Background: GPU computing • Xeon E5 2687 CPU vs. Tesla K40 GPU • Compared with the CPU, the GPU has: • A slower clock and less cache: not optimized for latency • More transistors devoted to compute • Higher flops and memory bandwidth • Optimized for data-parallel, high-throughput workloads
  • 6. Background: Apache Spark and MLlib • Classification (LR, SVM…) • Trees • Recommendation • Clustering • … • + (GPU) connectors and libs?
  • 7. Problem: large-scale matrix factorization • Why – Recommendation is important in cognitive applications – Digital ads market in US: 37.3 b*: Spark/Facebook/IBM Commerce – Need a fast and scalable solution
  • 8. Problem: large-scale matrix factorization • Why – Factorize the word co-occurrence matrix as a rating matrix – Obtain word features that embed semantics: man – woman = king – queen = brother – sister …
  • 9. MF: the state of the art • Many systems are optimized for medium-sized problems; very few target huge problems. • Distributed solutions are slow. – They do not reach the CPU's roofline performance – They do not optimize communication • Distributed solutions need a lot of resources and are costly.
  • 10. MF: what we want to achieve • Scale to problems of any size. • Fast. • Cost-efficient.
  • 11. Solution: cuMF - ALS on a machine with GPUs • On one GPU – GPU (Nvidia K40): memory bandwidth 288 GB/sec, compute 5 Tflops – Memory is slower than compute → need to optimize memory access! • The roofline model – Higher Gflops → higher operational intensity (more flops per byte) → caching! [Figure: roofline plot of Gflops/s against operational intensity (Flops/Byte); compute peak 5T, memory bandwidth 288G, ridge point near 17 Flops/Byte]
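The roofline argument on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard roofline bound (attainable performance is the smaller of the compute peak and bandwidth times operational intensity) and plugs in the K40 figures quoted on the slide (5 Tflops, 288 GB/sec). The function names are hypothetical and not part of cuMF.

```python
def attainable_gflops(intensity_flops_per_byte,
                      peak_gflops=5000.0,
                      bandwidth_gb_per_s=288.0):
    """Roofline bound: capped by compute peak or by memory traffic."""
    # Memory-bound ceiling: bytes/s delivered times flops done per byte.
    memory_bound = bandwidth_gb_per_s * intensity_flops_per_byte
    return min(peak_gflops, memory_bound)


def ridge_point(peak_gflops=5000.0, bandwidth_gb_per_s=288.0):
    """Operational intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gb_per_s


if __name__ == "__main__":
    print("ridge point (flops/byte):", round(ridge_point(), 2))
    for intensity in (1, 4, 17, 32):
        print(intensity, "flops/byte ->", attainable_gflops(intensity), "Gflops/s")
```

Under these numbers a kernel needs roughly 17 flops per byte moved before it can reach the compute peak, which is why the slide stresses caching and memory access.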
  • 12. Solution: cuMF - ALS on a machine with GPUs • MO-ALS on one GPU: Memory-Optimized ALS • Access many θ_v columns: irregular due to R’s sparseness • Aggregate many θ_v θ_v^T products: memory intensive
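To make the two memory-heavy steps concrete, here is a minimal pure-Python sketch of one ALS update for a single user. It is not the cuMF kernel: it assumes the textbook ALS normal equations, (Σ θ_v θ_v^T + λI) x_u = Σ r_uv θ_v over the items v the user rated, with plain λI regularization chosen for simplicity. The names `als_update_user` and `solve_linear_system` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations act on a and b together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def als_update_user(ratings_u, theta, lam):
    """ratings_u: {item index: rating}; theta: list of item feature vectors."""
    f = len(theta[0])
    a = [[lam if i == j else 0.0 for j in range(f)] for i in range(f)]
    b = [0.0] * f
    for v, r_uv in ratings_u.items():
        t = theta[v]  # irregular access: v follows the sparsity pattern of R
        for i in range(f):
            b[i] += r_uv * t[i]
            for j in range(f):
                a[i][j] += t[i] * t[j]  # aggregate theta_v theta_v^T
    return solve_linear_system(a, b)


if __name__ == "__main__":
    theta = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(als_update_user({0: 2.0, 1: 3.0}, theta, lam=1.0))
```

The inner loop shows both costs named on the slide: `theta[v]` is gathered at indices dictated by the sparse ratings, and every gathered vector contributes an f × f outer product to the accumulator.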
  • 13. Solution: cuMF - ALS on a machine with GPUs • Texture memory to smooth non-contiguous, irregular memory access • Register memory to hold hotspot variables
  • 14. Solution: cuMF - ALS on a machine with GPUs • On multiple GPUs • Exploit data & model parallelism – Data parallelism: solve using a portion of the training data – Model parallelism: solve a portion of the model • Exploit connection topology to minimize communication overhead [Figure: data parallel vs. model parallel]
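The two forms of parallelism can be sketched without any GPU code. The example below is an assumption-laden illustration, not the cuMF scheme: for data parallelism each simulated worker sums θ_v θ_v^T over its share of the items and the partial results are reduced by addition; for model parallelism the users (rows of the model) are simply divided among workers. All function names are hypothetical, and the round-robin split is an arbitrary choice.

```python
def gram_partial(item_ids, theta):
    """Sum of theta_v theta_v^T over one worker's share of the items."""
    f = len(theta[0])
    acc = [[0.0] * f for _ in range(f)]
    for v in item_ids:
        t = theta[v]
        for i in range(f):
            for j in range(f):
                acc[i][j] += t[i] * t[j]
    return acc


def reduce_sum(partials):
    """Element-wise sum of the workers' partial matrices."""
    f = len(partials[0])
    total = [[0.0] * f for _ in range(f)]
    for p in partials:
        for i in range(f):
            for j in range(f):
                total[i][j] += p[i][j]
    return total


def split(seq, parts):
    """Round-robin split of seq into `parts` lists."""
    return [list(seq[k::parts]) for k in range(parts)]


def data_parallel_gram(item_ids, theta, workers):
    """Data parallelism: each worker sees a portion of the data, then reduce."""
    partials = [gram_partial(chunk, theta) for chunk in split(item_ids, workers)]
    return reduce_sum(partials)


def model_parallel_assignment(user_ids, workers):
    """Model parallelism: each worker solves a portion of the user factors."""
    return split(user_ids, workers)


if __name__ == "__main__":
    theta = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(data_parallel_gram([0, 1, 2, 3], theta, workers=2))
    print(model_parallel_assignment([0, 1, 2, 3, 4], workers=2))
```

The reduction step is where inter-GPU communication would occur, which is why the slide pairs this scheme with exploiting the connection topology.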
  • 16. cuMF Performance • cuMF: ALS on a single machine with 2 × Nvidia K80 (4 cards) – Compared with state-of-the-art distributed solutions: • 6-10x as fast • 33-100x as cost-efficient (cuMF costs $2.5 per hour on SoftLayer) – Able to factorize the largest matrix ever reported
  • 17. cuMF Performance • cuMF: ALS on a machine with one GPU – 4x speedup as a Spark ALS accelerator [Figure: software stack labeled Spark ALS, Spark run-time, MLlib, cuMF with Spark, cuMF, C]
  • 18. Roadmap • Current work – Impressive acceleration of MF with GPUs on one machine – GPU acceleration techniques with model and data parallelism – Illustrated applicability of GPU acceleration to Spark/MLlib – Performance evaluations on K40 and K80 GPUs, Intel and Power • Future work – GPU acceleration of other ML algorithms in MLlib or elsewhere – Acceleration of algorithms for multiple GPUs on a single machine and across machines, with and without RDMA across machines – Performance evaluation on other hardware, including • Other GPUs such as Nvidia Maxwell • Forthcoming NVLink connectivity across GPUs within a single machine
  • 19. Notices and Disclaimers Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. 
All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
  • 20. Notices and Disclaimers (cont’d) Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. • IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. 
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
  • 21. © 2015 IBM Corporation Thank You