SlideShare ist ein Scribd-Unternehmen logo
1 von 38
Downloaden Sie, um offline zu lesen
Intel® Confidential — INTERNAL USE ONLY
SPARKMEETUP@INTEL
CO-HOSTEDbyIntelandDATABRICKS
March23,2017
https://github.com/intel-analytics/BigDL 2
AGENDA
7:00 - 7:10 pm Opening Remarks & Introductions
7:10 - 7:55: pm Intel Tech Talk from Jiao Wang and Sergey Ermolin
7:55 - 8:00 pm Short Break
8:00 - 8:45 pm Databricks Tech Talk from Tathagata Das (TD)
8:45 - 9:00 pm Mingling
WIFI ACCESS
Connect to the wireless network: Guest
Enter access code: 81995579
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 3
INTELBIGDATATECHNOLOGIESgroup
IntelandHadoop/SparkEcosystem
• PMC members and committers on multiple projects
• Unique perspectives gained from customers
• Industry and academia collaboration on open-
source projects
software.intel.com/bigdata
software.intel.com/AI
https://software.intel.com/ai
Jason Dai
Senior Principal Engineer and
Chief Architect,
Big Data Technologies,
Intel Corporation
Intel® Confidential — INTERNAL USE ONLY
DistributedDeepLearningAtScaleon
ApacheSparkwithBigDL
Jiao(Jennie)Wang, SergeyErmolin
BigDataTechnologies,SoftwareandServiceGroup
Intelcorporation
https://github.com/intel-analytics/BigDL 5
WhatisBigDL?
BigDL is a distributed deep learning library for Apache Spark*
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 6
WhyBigDL?
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL https://software.intel.com/ai 7
Production ML/DL system is Complex and Distributed.
Spark-based Deep Learning library is a natural fit
WhyBigDL?
https://github.com/intel-analytics/BigDL 8
BIGDLWITHINSPARKFRAMEWORK
End-to-end Big Data Analytics with
Deep Learning Functionalities
Directly on Spark
• Natively integrated with Big Data
(Hadoop/Spark) ecosystem
• Massively distributed, scale out
• Sends compute to data
• Fault tolerance
• Elasticity
• Incremental scaling
• Dynamic resource sharing
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 9
BIGDLbenefits
• Allows to write deep learning applications as standard Spark programs
• Runs on top of existing Spark or Hadoop/Hive clusters
• Adds rich Deep Learning functionalities to Apache Spark
• Feature parity with Caffe and other single-node DL frameworks.
• High performance - Intel MKL and multi-threaded programming
• Efficient scale-out with an all-reduce communications on Spark
BigDL has been open sourced since 2016: https://github.com/intel-analytics/BigDL
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 10
WHATisinbigdlforyou?
You may want to write your deep learning programs using BigDL if you
need to:
• Analyze “big data” using deep learning on the same Hadoop/Spark cluster
where the data are stored
• Add deep learning functionalities to the Big Data (Spark) programs and/or
workflow
• Leverage existing Hadoop/Spark clusters to run deep learning applications
• Dynamically share with other workloads (e.g., ETL, data warehouse, feature
engineering, classical machine learning, graph analytics, etc.)
• Making deep learning more accessible for Big Data users and data scientists,
who are usually not experts for deep learning
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 11
BigDLFeatures
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 12
BigDLFeatures
Distributed Deep learning applications (training, fine-tuning & prediction)
on Apache Spark*
• No changes to the existing Hadoop/Spark clusters needed
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 13
BigDLcanre-use/fine-tunemodelsfromotherframeworks
• Load existing Caffe/Torch Model
• Allows for transition from single-node
to distributed application deployment
• Useful for inference
• Allows for minor model tuning
• Allows for model sharing between
Data Scientists and Production Engr.
Caffe
Model File
Torch
Model File
Storage
BigDL
BigDL
Model File
Load
Save
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 14
BigDLintegrationwithsparkml
Integrates with Spark-ML Pipeline:
• Wrapper with Spark ML Transformer
• BigDL Plugs into Spark ML pipeline
• Support Spark v1.5/1.6/2.0
DataFrame
Transfomer1
Transfomer2
DLClassifer
…
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 15
PythonAPISupport
Based on PySpark, Python API in BigDL
allows use of existing Python libs:
• Numpy
• Scipy
• Pandas
• Scikit-learn
• NLTK
• Matplotlib
• …
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 16
WorkswithJupyterNotebook
Running BigDL applications directly in
Jupyter notebooks
 Share and Reproduce
– Notebooks can be shared with others
– Easy to reproduce and track
 Rich Content
– Texts, images, videos, LaTeX and JavaScript
– Code can also produce rich contents
 Rich toolbox
– Apache Spark, from Python, R and Scala
– Pandas, scikit-learn, ggplot2, dplyr, etc
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 17
VisualizationforLearning
BigDL integration with TensorBoard
• TensorBoard is a suite of web applications from Google for visualizing and
understanding deep learning applications
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 18
Spark
Streaming RDDs
EvaluatorBigDL
Model
StreamWriter
Integration with Spark Streaming for runtime training and prediction
HDFS/S3
Kafka
Flume
Kinesis
Twitter
Train
Predict
BigDLintegrationwithsparkstreaming
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 19
Tight Integrations with Spark SQL, DataFrames and Structured Streaming
*Image classification on ImageNet(http://www.image-net.org)
BigDLFeatures
Kafka File
Data Frame
(Batch/Stream)
BigDL UDF
Filtered Data
Frame
(Batch/Stream)
df.select($’image’)
.withColumn(
“image_type”,
ImgClassifier(“image”))
.filter($’image_type’
== ‘dog’)
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 20
BigDLFeatures
BigDL provides out-of-the box examples of popular CNN models
- helps a developer to get started
https://github.com/intel-analytics/BigDL/wiki/Examples
Models (Train and Inference Example Code):
• LeNet, Inception, VGG, ResNet, RNN, Auto-encoder
Examples:
• Text Classification
• Image Classification
• Load Torch/Caffe model
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 21
BigDLFeatures
• Single node Xeon performance
• Benchmarked to be best on Xeon E5-26XX v3 or E5-26XX v4
• Orders of magnitude speedup vs. out-of-box open source Caffe, Torch or
TensorFlow
• Scaling-out
• Efficiently scales out to 10s~100s of Xeon servers on Spark
* For more complete information about performance and benchmark results, visit www.intel.com/benchmarks
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 22
BigDLinstallationonmajorcloudframeworks–AWS.
https://github.com/intel-analytics/BigDL/wiki/Running-on-EC2
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 23
BigDLonotherplatforms
• “Intel’s BigDL on Databricks”
https://databricks.com/blog/2017/02/09/intels-bigdl-databricks.html
• “How to use BigDL on Apache Spark for Azure HDInsight”
https://blogs.msdn.microsoft.com/azuredatalake/2017/03/17/how-to-
use-bigdl-on-apache-spark-for-azure-hdinsight/
• More to come – check back periodically and stay tuned
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 24
BigDLUsecases
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 25
VisualrecognitionandObjectDetection
Faster-RCNN SSD: Single Shot MultiBox Detector
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 26
ObjectDetectiononPASCAL
*(http://host.robots.ox.ac.uk/pascal/VOC/)
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL https://software.intel.com/ai 27
NaturalLanguageModel-RNN
Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
https://github.com/intel-analytics/BigDL 28
LearnfromShakespearePoems
Output of RNN:
Long live the King . The King and Queen , and the Strange of the Veils of the
rhapsodic . and grapple, and the entreatments of the pressure .
Upon her head , and in the world ? `` Oh, the gods ! O Jove ! To whom the king :
`` O friends !
Her hair, nor loose ! If , my lord , and the groundlings of the skies . jocund and
Tasso in the Staggering of the Mankind . and
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 29
MoreRNNSupport:LSTM
BigDL also supports LSTM variants such as GRU and LSTM with peepholes
Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 30
FinTech:TransactionFraudDetection
• Historical data is stored on Hive
• Data preprocessing with SparkSQL
• Spark ML pipeline for complex
feature engineering
• Use multiple BigDL CNN models
• Use Sample+Bagging to solve
unbalance problem
• Grid search for hyper parameter
tuning
Powered by BigDL
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 31
Manufacturing:ProductDefectDetectionandClassification
Data source:
• Feed from video cameras installed within manufacturing pipeline
Objective:
• Identify surface defects areas from camera feeds
• Classify defects (eg. Scrape vs smudge).
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 32
ProductDefectDetectionandClassification
(KeyStone ML Pipeline)
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL
BigDLOngithub
https://github.com/intel-analytics/BigDL
33https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 34
BIGDLCommunity
Join Our Mail List
bigdl-user-group+subscribe@googlegroups.com
Report Bugs And Create Feature Request
https://github.com/intel-analytics/BigDL/issues
https://software.intel.com/ai
https://github.com/intel-analytics/BigDL 35
Demo
https://github.com/intel-analytics/BigDL 36
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL
PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO
FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S
PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE
DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY
OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR
ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions
marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to
them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized
errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go
to: http://www.intel.com/design/literature.htm
Intel, Quark, VTune, Xeon, Cilk, Atom, Look Inside and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright ©2015 Intel Corporation.
https://github.com/intel-analytics/BigDL 37
Risk Factors
The above statements and any others in this document that refer to plans and expectations for the first quarter, the year and the future are forward-looking
statements that involve a number of risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,”
“should” and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify
forward-looking statements. Many factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual
results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could
cause actual results to differ materially from the company’s expectations. Demand could be different from Intel's expectations due to factors including changes in
business and economic conditions; customer acceptance of Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes
in customer order patterns including order cancellations; and changes in the level of inventory at customers. Uncertainty in global economic and financial conditions
poses a risk that consumers and businesses may defer purchases in response to negative financial events, which could negatively affect product demand and other
related matters. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short
term and product demand that is highly variable and difficult to forecast. Revenue and the gross margin percentage are affected by the timing of Intel product
introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions,
marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to
incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on capacity utilization; variations in
inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and
execution of the manufacturing ramp and associated costs; start-up costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of
materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including manufacturing, assembly/test and intangible assets.
Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers
operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates.
Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand
for Intel's products and the level of revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel's results could
be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving
intellectual property, stockholder, consumer, antitrust, disclosure and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. An
unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular
business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. A detailed
discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent reports on Form 10-Q,
Form 10-K and earnings release.
Distributed Deep Learning At Scale On Apache Spark With BigDL

Weitere ähnliche Inhalte

Was ist angesagt?

Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Databricks
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Databricks
 
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...
 Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive... Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...Databricks
 
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Databricks
 
AWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchAWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchIntel® Software
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysTaylor Riggan
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFKeith Kraus
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...Databricks
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDatabricks
 
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...Databricks
 
Ai platform at scale
Ai platform at scaleAi platform at scale
Ai platform at scaleHenry Saputra
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...MLconf
 
Democratizing AI with Apache Spark
Democratizing AI with Apache SparkDemocratizing AI with Apache Spark
Democratizing AI with Apache SparkSpark Summit
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Spark Summit
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...Databricks
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsStavros Kontopoulos
 
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureMLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureData Science Milan
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsBlue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsDatabricks
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OSri Ambati
 

Was ist angesagt? (20)

Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
 
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...
 Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive... Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...
 
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
 
AWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchAWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI Research
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate Arrays
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
 
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
Debugging Big Data Analytics in Apache Spark with BigDebug with Muhammad Gulz...
 
Ai platform at scale
Ai platform at scaleAi platform at scale
Ai platform at scale
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
Democratizing AI with Apache Spark
Democratizing AI with Apache SparkDemocratizing AI with Apache Spark
Democratizing AI with Apache Spark
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
 
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureMLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsBlue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 

Andere mochten auch

Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Anyscale
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheLeslie Samuel
 
Donner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsDonner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsVienna Data Science Group
 
Best Deep Learning Post from LinkedIn Group
Best Deep Learning Post from LinkedIn Group Best Deep Learning Post from LinkedIn Group
Best Deep Learning Post from LinkedIn Group Farshid Pirahansiah
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceJonathan Mugan
 
Philosophy of Deep Learning
Philosophy of Deep LearningPhilosophy of Deep Learning
Philosophy of Deep LearningMelanie Swan
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningSri Ambati
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer BuildPetteriTeikariPhD
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learningYu Huang
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習台灣資料科學年會
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through ExamplesSri Ambati
 
Sap hana platform sps 11 introduces new sap hana hadoop integration features
Sap hana platform sps 11 introduces new sap hana hadoop integration featuresSap hana platform sps 11 introduces new sap hana hadoop integration features
Sap hana platform sps 11 introduces new sap hana hadoop integration featuresAvinash Kumar Gautam
 
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENW
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENWHPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENW
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENWViplava Kumar Madasu
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Sri Ambati
 
Anti-Money Laundering Solution
Anti-Money Laundering SolutionAnti-Money Laundering Solution
Anti-Money Laundering SolutionSri Ambati
 
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopProject Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopDatabricks
 

Andere mochten auch (20)

Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
 
Donner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsDonner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspects
 
Best Deep Learning Post from LinkedIn Group
Best Deep Learning Post from LinkedIn Group Best Deep Learning Post from LinkedIn Group
Best Deep Learning Post from LinkedIn Group
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial Intelligence
 
Philosophy of Deep Learning
Philosophy of Deep LearningPhilosophy of Deep Learning
Philosophy of Deep Learning
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer Build
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learning
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through Examples
 
UC San Diego's Big Data Specialization Capstone
UC San Diego's Big Data Specialization CapstoneUC San Diego's Big Data Specialization Capstone
UC San Diego's Big Data Specialization Capstone
 
Sap hana platform sps 11 introduces new sap hana hadoop integration features
Sap hana platform sps 11 introduces new sap hana hadoop integration featuresSap hana platform sps 11 introduces new sap hana hadoop integration features
Sap hana platform sps 11 introduces new sap hana hadoop integration features
 
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENW
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENWHPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENW
HPE Reference Architecture for SAP HANA Vora with Spark and Hadoop-4AA6-7739ENW
 
NLP with H2O
NLP with H2ONLP with H2O
NLP with H2O
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
 
Anti-Money Laundering Solution
Anti-Money Laundering SolutionAnti-Money Laundering Solution
Anti-Money Laundering Solution
 
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a LaptopProject Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
Project Tungsten Phase II: Joining a Billion Rows per Second on a Laptop
 

Ähnlich wie Distributed Deep Learning At Scale On Apache Spark With BigDL

BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...Databricks
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Jason Dai
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkDatabricks
 
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...Spark Summit
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...GetInData
 
Automated Time Series Analysis using Deep Learning, Ray and Analytics Zoo
Automated Time Series Analysis using Deep Learning, Ray and Analytics ZooAutomated Time Series Analysis using Deep Learning, Ray and Analytics Zoo
Automated Time Series Analysis using Deep Learning, Ray and Analytics ZooJason Dai
 
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatengeKarin Patenge
 
Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.Peadar Coyle
 
A short introduction to Spark and its benefits
A short introduction to Spark and its benefitsA short introduction to Spark and its benefits
A short introduction to Spark and its benefitsJohan Picard
 
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
IOT with Drupal 8 -  Webinar Hyderabad Drupal CommunityIOT with Drupal 8 -  Webinar Hyderabad Drupal Community
IOT with Drupal 8 - Webinar Hyderabad Drupal CommunityPrateek Jain
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesPetteriTeikariPhD
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingPaco Nathan
 
Scientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchScientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchDirk Petersen
 
Big data Big Analytics
Big data Big AnalyticsBig data Big Analytics
Big data Big AnalyticsAjay Ohri
 
Spark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersSpark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersDatabricks
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Michael Rys
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...Keiichiro Ono
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 

Ähnlich wie Distributed Deep Learning At Scale On Apache Spark With BigDL (20)

BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
 
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ...
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
Automated Time Series Analysis using Deep Learning, Ray and Analytics Zoo
Automated Time Series Analysis using Deep Learning, Ray and Analytics ZooAutomated Time Series Analysis using Deep Learning, Ray and Analytics Zoo
Automated Time Series Analysis using Deep Learning, Ray and Analytics Zoo
 
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge
20180921_DOAG_BigDataDays_OracleSpatialandPython_kpatenge
 
Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.
 
A short introduction to Spark and its benefits
A short introduction to Spark and its benefitsA short introduction to Spark and its benefits
A short introduction to Spark and its benefits
 
Big data hadoop
Big data hadoopBig data hadoop
Big data hadoop
 
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
IOT with Drupal 8 -  Webinar Hyderabad Drupal CommunityIOT with Drupal 8 -  Webinar Hyderabad Drupal Community
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
 
Started with-apache-spark
Started with-apache-sparkStarted with-apache-spark
Started with-apache-spark
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
Scientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchScientific Computing @ Fred Hutch
Scientific Computing @ Fred Hutch
 
Big data Big Analytics
Big data Big AnalyticsBig data Big Analytics
Big data Big Analytics
 
Spark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production usersSpark Summit EU 2015: Lessons from 300+ production users
Spark Summit EU 2015: Lessons from 300+ production users
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Kürzlich hochgeladen (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Distributed Deep Learning At Scale On Apache Spark With BigDL

  • 1. Intel® Confidential — INTERNAL USE ONLY SPARKMEETUP@INTEL CO-HOSTEDbyIntelandDATABRICKS March23,2017
  • 2. https://github.com/intel-analytics/BigDL 2 AGENDA 7:00 - 7:10 pm Opening Remarks & Introductions 7:10 - 7:55: pm Intel Tech Talk from Jiao Wang and Sergey Ermolin 7:55 - 8:00 pm Short Break 8:00 - 8:45 pm Databricks Tech Talk from Tathagata Das (TD) 8:45 - 9:00 pm Mingling WIFI ACCESS Connect to the wireless network: Guest Enter access code: 81995579 https://software.intel.com/ai
  • 3. https://github.com/intel-analytics/BigDL 3 INTELBIGDATATECHNOLOGIESgroup IntelandHadoop/SparkEcosystem • PMC members and committers on multiple projects • Unique perspectives gained from customers • Industry and academia collaboration on open- source projects software.intel.com/bigdata software.intel.com/AI https://software.intel.com/ai Jason Dai Senior Principal Engineer and Chief Architect, Big Data Technologies, Intel Corporation
  • 4. Intel® Confidential — INTERNAL USE ONLY DistributedDeepLearningAtScaleon ApacheSparkwithBigDL Jiao(Jennie)Wang, SergeyErmolin BigDataTechnologies,SoftwareandServiceGroup Intelcorporation
  • 5. https://github.com/intel-analytics/BigDL 5 WhatisBigDL? BigDL is a distributed deep learning library for Apache Spark* https://software.intel.com/ai
  • 7. https://github.com/intel-analytics/BigDL https://software.intel.com/ai 7 Production ML/DL system is Complex and Distributed. Spark-based Deep Learning library is a natural fit WhyBigDL?
  • 8. https://github.com/intel-analytics/BigDL 8 BIGDLWITHINSPARKFRAMEWORK End-to-end Big Data Analytics with Deep Learning Functionalities Directly on Spark • Natively integrated with Big Data (Hadoop/Spark) ecosystem • Massively distributed, scale out • Sends compute to data • Fault tolerance • Elasticity • Incremental scaling • Dynamic resource sharing https://software.intel.com/ai
  • 9. https://github.com/intel-analytics/BigDL 9 BIGDLbenefits • Allows to write deep learning applications as standard Spark programs • Runs on top of existing Spark or Hadoop/Hive clusters • Adds rich Deep Learning functionalities to Apache Spark • Feature parity with Caffe and other single-node DL frameworks. • High performance - Intel MKL and multi-threaded programming • Efficient scale-out with an all-reduce communications on Spark BigDL has been open sourced since 2016: https://github.com/intel-analytics/BigDL https://software.intel.com/ai
  • 10. https://github.com/intel-analytics/BigDL 10 WHATisinbigdlforyou? You may want to write your deep learning programs using BigDL if you need to: • Analyze “big data” using deep learning on the same Hadoop/Spark cluster where the data are stored • Add deep learning functionalities to the Big Data (Spark) programs and/or workflow • Leverage existing Hadoop/Spark clusters to run deep learning applications • Dynamically share with other workloads (e.g., ETL, data warehouse, feature engineering, classical machine learning, graph analytics, etc.) • Making deep learning more accessible for Big Data users and data scientists, who are usually not experts for deep learning https://software.intel.com/ai
  • 12. https://github.com/intel-analytics/BigDL 12 BigDLFeatures Distributed Deep learning applications (training, fine-tuning & prediction) on Apache Spark* • No changes to the existing Hadoop/Spark clusters needed https://software.intel.com/ai
  • 13. https://github.com/intel-analytics/BigDL 13 BigDLcanre-use/fine-tunemodelsfromotherframeworks • Load existing Caffe/Torch Model • Allows for transition from single-node to distributed application deployment • Useful for inference • Allows for minor model tuning • Allows for model sharing between Data Scientists and Production Engr. Caffe Model File Torch Model File Storage BigDL BigDL Model File Load Save https://software.intel.com/ai
  • 14. https://github.com/intel-analytics/BigDL 14 BigDLintegrationwithsparkml Integrates with Spark-ML Pipeline: • Wrapper with Spark ML Transformer • BigDL Plugs into Spark ML pipeline • Support Spark v1.5/1.6/2.0 DataFrame Transfomer1 Transfomer2 DLClassifer … https://software.intel.com/ai
  • 15. https://github.com/intel-analytics/BigDL 15 PythonAPISupport Based on PySpark, Python API in BigDL allows use of existing Python libs: • Numpy • Scipy • Pandas • Scikit-learn • NLTK • Matplotlib • … https://software.intel.com/ai
  • 16. https://github.com/intel-analytics/BigDL 16 WorkswithJupyterNotebook Running BigDL applications directly in Jupyter notebooks  Share and Reproduce – Notebooks can be shared with others – Easy to reproduce and track  Rich Content – Texts, images, videos, LaTeX and JavaScript – Code can also produce rich contents  Rich toolbox – Apache Spark, from Python, R and Scala – Pandas, scikit-learn, ggplot2, dplyr, etc https://software.intel.com/ai
  • 17. https://github.com/intel-analytics/BigDL 17 VisualizationforLearning BigDL integration with TensorBoard • TensorBoard is a suite of web applications from Google for visualizing and understanding deep learning applications https://software.intel.com/ai
  • 18. https://github.com/intel-analytics/BigDL 18 Spark Streaming RDDs EvaluatorBigDL Model StreamWriter Integration with Spark Streaming for runtime training and prediction HDFS/S3 Kafka Flume Kinesis Twitter Train Predict BigDLintegrationwithsparkstreaming https://software.intel.com/ai
  • 19. https://github.com/intel-analytics/BigDL 19 Tight Integrations with Spark SQL, DataFrames and Structured Streaming *Image classification on ImageNet(http://www.image-net.org) BigDLFeatures Kafka File Data Frame (Batch/Stream) BigDL UDF Filtered Data Frame (Batch/Stream) df.select($’image’) .withColumn( “image_type”, ImgClassifier(“image”)) .filter($’image_type’ == ‘dog’) https://software.intel.com/ai
  • 20. https://github.com/intel-analytics/BigDL 20 BigDLFeatures BigDL provides out-of-the box examples of popular CNN models - helps a developer to get started https://github.com/intel-analytics/BigDL/wiki/Examples Models (Train and Inference Example Code): • LeNet, Inception, VGG, ResNet, RNN, Auto-encoder Examples: • Text Classification • Image Classification • Load Torch/Caffe model https://software.intel.com/ai
  • 21. https://github.com/intel-analytics/BigDL 21 BigDLFeatures • Single node Xeon performance • Benchmarked to be best on Xeon E5-26XX v3 or E5-26XX v4 • Orders of magnitude speedup vs. out-of-box open source Caffe, Torch or TensorFlow • Scaling-out • Efficiently scales out to 10s~100s of Xeon servers on Spark * For more complete information about performance and benchmark results, visit www.intel.com/benchmarks https://software.intel.com/ai
  • 23. https://github.com/intel-analytics/BigDL 23 BigDLonotherplatforms • “Intel’s BigDL on Databricks” https://databricks.com/blog/2017/02/09/intels-bigdl-databricks.html • “How to use BigDL on Apache Spark for Azure HDInsight” https://blogs.msdn.microsoft.com/azuredatalake/2017/03/17/how-to- use-bigdl-on-apache-spark-for-azure-hdinsight/ • More to come – check back periodically and stay tuned https://software.intel.com/ai
  • 28. https://github.com/intel-analytics/BigDL 28 LearnfromShakespearePoems Output of RNN: Long live the King . The King and Queen , and the Strange of the Veils of the rhapsodic . and grapple, and the entreatments of the pressure . Upon her head , and in the world ? `` Oh, the gods ! O Jove ! To whom the king : `` O friends ! Her hair, nor loose ! If , my lord , and the groundlings of the skies . jocund and Tasso in the Staggering of the Mankind . and https://software.intel.com/ai
  • 29. https://github.com/intel-analytics/BigDL 29 MoreRNNSupport:LSTM BigDL also supports LSTM variants such as GRU and LSTM with peepholes Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ https://software.intel.com/ai
  • 30. https://github.com/intel-analytics/BigDL 30 FinTech:TransactionFraudDetection • Historical data is stored on Hive • Data preprocessing with SparkSQL • Spark ML pipeline for complex feature engineering • Use multiple BigDL CNN models • Use Sample+Bagging to solve unbalance problem • Grid search for hyper parameter tuning Powered by BigDL https://software.intel.com/ai
  • 31. https://github.com/intel-analytics/BigDL 31 Manufacturing:ProductDefectDetectionandClassification Data source: • Feed from video cameras installed within manufacturing pipeline Objective: • Identify surface defects areas from camera feeds • Classify defects (eg. Scrape vs smudge). https://software.intel.com/ai
  • 34. https://github.com/intel-analytics/BigDL 34 BIGDLCommunity Join Our Mail List bigdl-user-group+subscribe@googlegroups.com Report Bugs And Create Feature Request https://github.com/intel-analytics/BigDL/issues https://software.intel.com/ai
  • 36. https://github.com/intel-analytics/BigDL 36 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Intel, Quark, VTune, Xeon, Cilk, Atom, Look Inside and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others. Copyright ©2015 Intel Corporation.
  • 37. https://github.com/intel-analytics/BigDL 37 Risk Factors The above statements and any others in this document that refer to plans and expectations for the first quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,” “should” and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could cause actual results to differ materially from the company’s expectations. Demand could be different from Intel's expectations due to factors including changes in business and economic conditions; customer acceptance of Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Uncertainty in global economic and financial conditions poses a risk that consumers and businesses may defer purchases in response to negative financial events, which could negatively affect product demand and other related matters. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. Revenue and the gross margin percentage are affected by the timing of Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; start-up costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including manufacturing, assembly/test and intangible assets. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the level of revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel's results could be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. A detailed discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent reports on Form 10-Q, Form 10-K and earnings release.