Jongwook Woo
HiPIC
CalStateLA
Dong-Eui University
Prof. Dong-Soon Lim, Department of Economics, College of Commerce and Economics
May 29 2018
Jongwook Woo, PhD, jwoo5@calstatela.edu
High-Performance Information Computing Center (HiPIC)
California State University Los Angeles
Introduction to AI on Big Data
High Performance Information Computing Center
Jongwook Woo
CalStateLA
Contents
 Myself
 Introduction To Big Data
 Artificial Intelligence
 Artificial Intelligence and Big Data
 Summary
Myself
Experience:
 Since 2002, Professor at California State University Los Angeles
– PhD in 2001: Computer Science and Engineering at USC
 Since 1998: R&D consulting in Hollywood
– Warner Bros (Matrix Online game), E!, citysearch.com, ARM, etc.
– Information search and integration with FAST, Lucene/Solr, Sphinx
– Implemented eBusiness applications using J2EE and middleware
 Since 2007: Exposed to Big Data at CitySearch.com
 2012 - Present : Big Data Academic Partnerships
– For Big Data research and training
• Amazon AWS, Microsoft Azure, IBM Bluemix
• Databricks, Hadoop vendors
Myself: S/W Development Lead
http://www.mobygames.com/game/windows/matrix-online/credits
Myself
Experience (Cont’d): Bringing Big Data R&D and training to Korea since 2009
Collaborating with the City of Los Angeles since 2016
– Collect, search, and analyze city data
• Spark, Hadoop, ElasticSearch, Solr, Java, Cloudera
Sept 2013: Samsung Advanced Technology Training Institute
Since 2008
– Introducing Hadoop Big Data and education to universities and research centers
• Yonsei, Gachon, DongEui
• US: USC, Pennsylvania State Univ, University of Maryland College Park, Univ of Bridgeport, Louisiana State Univ, California State Univ LB
• Europe: Univ of Luxembourg
Myself: Partners for Services
Experience in Big Data
 Collaboration
 Council Member of IBM Spark Technology Center
 City of Los Angeles for OpenHub and Open Data
 Startup Companies in Los Angeles
 External Collaborator and Advisor in Big Data
– IMSC of USC
– Pennsylvania State University
– The Big Link, Softzen, Wiken in Korea
 Grants
 Research and education grants from IBM Bluemix, Microsoft Windows Azure, and Amazon AWS
 Partnership
 Academic Education Partnership with Databricks, Tableau, Qlik, Cloudera, Hortonworks, SAS, Teradata
Myself: Public Partners
Contents
 Myself
 Introduction To Big Data
 Artificial Intelligence
 Artificial Intelligence and Big Data
 Summary
Data Issues
Large-scale data
Terabyte (10^12 bytes), petabyte (10^15 bytes)
– Because of the web
– Sensor data (IoT), bioinformatics, social computing, streaming data, smartphones, online games…
Cannot be handled with the legacy approach
Too big
Non-/semi-structured data
Too expensive
Need new systems
Inexpensive
Two Cores in Big Data
How to store Big Data
How to compute Big Data
Google
How to store Big Data
– GFS
– Distributed systems on inexpensive commodity computers
How to compute Big Data
– MapReduce
– Parallel computing with inexpensive computers
Own supercomputers
Published the GFS paper in 2003 and the MapReduce paper in 2004
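The MapReduce model named above can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, shuffle, and reduce phases using word counting; the function names are illustrative and are not Hadoop's API, and a real cluster would run each phase on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for a file block stored on a different node.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data on hadoop", "big data on spark"])
print(counts)
```

Because each map call touches only its own split and each reduce call touches only one key, both phases can be spread across inexpensive machines without coordination inside a phase.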
What is Hadoop?
 Hadoop Founder:
o Doug Cutting
 Apache Committer:
Lucene, Nutch, …
Super Computer vs Hadoop
Source: “Parallel vs. Distributed file systems” by Michael Malak, updated by Jongwook Woo
[Figure: a supercomputer uses one cluster for storage and a separate cluster for compute; Hadoop uses a single cluster for both compute and storage]
Hadoop Cluster: Logical Diagram
[Figure: a web browser connects over HTTP(S) to the cluster monitor (CM/Ambari), which communicates with an agent on every Hadoop node; the nodes store data in HDFS and run services such as Hive, ZooKeeper, and Impala]
Hadoop Ecosystems
http://dawn.dbsdataprojects.com/tag/hadoop/
Definition: Big Data
Inexpensive frameworks, built as distributed parallel systems,
that can store large-scale data and process it in parallel [1, 2]
Hadoop
– Inexpensive supercomputer
– More accessible than traditional supercomputers
• You can store and process your applications
– In your university labs, small companies, research centers
Others
– NoSQL DB (Cassandra, MongoDB, Redis, HBase)
– ElasticSearch
NoSQL DB
 Key-Value
Memcached, MemcacheDB, Redis
 Column Oriented (Column Family Store)
BigTable, HBase
Cassandra (Key-Value Column Oriented)
Amazon SimpleDB
 Document Oriented
MongoDB, Couchbase, CouchDB
 Graph Oriented
Neo4j, InfiniteGraph
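The four NoSQL families listed above differ mainly in the shape of the data they store. A rough sketch with plain Python structures follows; the record, keys, and helper function are made up for illustration and do not correspond to any product's API.

```python
# Key-value: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "LA"}'}

# Column family: row key -> column family -> column -> value.
column_family = {"42": {"profile": {"name": "Ada", "city": "LA"}}}

# Document: a nested, self-describing record.
document = {"_id": 42, "name": "Ada", "address": {"city": "LA"}}

# Graph: nodes plus labeled edges between them.
nodes = {42: {"name": "Ada"}, 7: {"name": "Bob"}}
edges = [(42, "FOLLOWS", 7)]


def followers_of(node_id):
    """Return the ids of all nodes with a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node_id]


print(followers_of(7))
```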
Alternative to Hadoop MapReduce
Limitations of MapReduce
Hard to program in Java
Batch processing
– Not interactive
Disk storage for intermediate data
– Performance issue
Spark by UC Berkeley AMPLab
 In-memory storage for intermediate data
 20 to 100 times faster than MapReduce,
– which goes through the network and disk
Good for machine learning
– Iterative algorithms
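Why in-memory storage helps iterative algorithms can be sketched without Spark at all. The toy below counts how often a dataset is read from "storage" when every iteration reloads it (the MapReduce pattern) versus when it is loaded once and kept in memory (the Spark pattern). `CountingSource` and both loop functions are hypothetical stand-ins, not Spark or Hadoop APIs.

```python
class CountingSource:
    """Stands in for a dataset on disk and counts how often it is read."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)


def iterate_with_reload(source, steps):
    """MapReduce-style: every iteration re-reads its input from storage."""
    estimate = 0.0
    for _ in range(steps):
        data = source.load()
        estimate += 0.5 * (sum(data) / len(data) - estimate)
    return estimate


def iterate_with_cache(source, steps):
    """Spark-style: load once and keep the data in memory across iterations."""
    data = source.load()
    estimate = 0.0
    for _ in range(steps):
        estimate += 0.5 * (sum(data) / len(data) - estimate)
    return estimate


disk_a = CountingSource([1.0, 2.0, 3.0])
disk_b = CountingSource([1.0, 2.0, 3.0])
result_reload = iterate_with_reload(disk_a, steps=10)
result_cached = iterate_with_cache(disk_b, steps=10)
print(disk_a.reads, disk_b.reads)
```

Both loops compute the same estimate; only the number of storage reads differs, which is where the performance gap for iterative workloads comes from.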
Spark and Hadoop
Spark
File Systems: Tachyon
Resource Manager: Mesos
But Hadoop has been dominating the market
Integrating Spark into the Hadoop cluster
Cloud Computing
– Amazon AWS, Azure HDInsight, IBM Bluemix
• Object Storage, S3
Hadoop vendors
– HDP, CDH
Databricks: Spark on AWS & Azure
– No Hadoop ecosystems
Sentiment Map of Alphago
Positive
Negative
Sentiment Map of Lee Se-Dol vs Alphago
 YouTube video: “alphago sentiment” by Google
 The sentiment of the World in Geo and Time:
https://youtu.be/vAzdnj4fkOg?list=PLaEg1tCLuW0BYLqVS5RTbToiB8wQ2w14a
K-Election 2017
(April 29 – May 9)
Map of crimes that occurred within 5 miles of CalStateLA, UCLA, and USC in 2015
Review count of popular business sub-categories
Popular businesses within 5 miles of CalStateLA, USC, and UCLA
Average Undergraduates Receiving Pell Grant in Each College
East Georgia State College: $2,854 Avg. Pell Grant: 97.285%
Big Data Analysis Flow
Data Collection
– Batch API: Yelp, Google
– Streaming: Twitter, Apache NiFi, Kafka, Storm
– Open Data: Government
Data Storage
– HDFS, S3, Object Storage, NoSQL DB (Couchbase)…
Data Filtering
– Hive, Pig
Data Analysis and Science
– Hive, Pig, Spark, BI Tools (Datameer, Qlik, Tableau,…)
Data Visualization
– Qlik, Datameer, Excel PowerView
- Big Data Engineering
- Big Data Analysis
- Big Data Science
- Data Visualization
Terms
We know
Data Engineering
– Collect, clean, filter data
Data Analysis
– Find insights from the data
Data Science (Predictive Analysis)
– Predict the trend or pattern from the existing data
Do we know?
Big Data Analysis and Science
– Using Big Data for Data Analysis and Science
• Hadoop, Spark, NoSQL DB, SAP HANA, ElasticSearch,..
– For Massive Data Set
• How to store and compute?
Contents
 Myself
 Introduction To Big Data
 Artificial Intelligence
 Artificial Intelligence and Big Data
 Summary
AI and Deep Learning
[Figure: nested diagram of Artificial Intelligence, Machine Learning, Neural Networks, and Deep Learning]
▪ Deep learning
▪ Sub-field of neural networks, machine learning, and artificial intelligence
▪ Deep learning is neural networks with many layers
▪ Inspired by, but not limited to, the architecture of the human brain
© 2017 SAP SE or an SAP affiliate company. All rights
reserved. ǀ PUBLIC
Deep Learning and TensorFlow
▪ Development led by Google
▪ Open-source library for deep learning
▪ Defines model structures and provides a library for efficient execution
▪ Define once, run anywhere:
▪ Can run on CPUs and GPUs, and on many devices
▪ NVidia, Google GPU
▪ Can be used in Python
▪ and many other languages
▪ Built for large-scale machine learning
▪ development and operations
Deep Learning [9]
• Neural Networks
• Multi-Layer Perceptron
• Convolutional Neural Networks
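A multi-layer perceptron is a stack of fully connected layers with a nonlinearity between them. The forward pass can be sketched in plain Python as follows; the layer sizes, weights, and the choice of ReLU and softmax are illustrative assumptions, not values taken from the slides.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: output[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(inputs, layers):
    """layers is a list of (weights, biases); ReLU between layers, softmax at the end."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return softmax(activations)


# Hypothetical 2 -> 3 -> 2 network with hand-picked weights.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1, 0.0]),
    ([[0.2, -0.1], [0.4, 0.3], [-0.6, 0.9]], [0.0, 0.0]),
]
probs = mlp_forward([1.0, 2.0], layers)
print(probs)
```

Adding more `(weights, biases)` pairs to `layers` is what makes the network "deep".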
Convolutional Neural Networks
• Good at problems like image classification.
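The core operation of a convolutional layer is sliding a small kernel over the image. A minimal sketch in plain Python, assuming no padding and stride 1; the tiny image and the edge-detecting kernel are made up for illustration.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 3x4 image that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A kernel that responds where brightness increases from left to right.
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, vertical_edge)
print(feature_map)
```

The feature map is non-zero only in the column where the edge sits, which hints at why the same small set of weights, reused across the whole image, suits image classification.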
Recurrent Neural Networks (RNN)
• Has 3 types of parameters
▫ W – Hidden weights
▫ U – Hidden to Hidden weights
▫ V – Hidden to Label weights
• Good for Text Processing such as sentiment analysis:
• My Projects > sapDeepLearningTensorflow > Week_03_Unit_05_S
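How the three parameter sets interact can be sketched with a simple (Elman-style) recurrent step. The slide labels W only as "Hidden weights"; the sketch assumes W maps the input to the hidden state, which is a common convention but is not confirmed by the source. All weight values and the input sequence are made up.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, W, U, V):
    """Run a simple RNN over a sequence and return the final label scores."""
    hidden = [0.0] * len(U)
    for x in inputs:
        # New hidden state mixes the current input (W) with the previous state (U).
        pre = [a + b for a, b in zip(matvec(W, x), matvec(U, hidden))]
        hidden = [math.tanh(p) for p in pre]
    # V maps the last hidden state to label scores.
    return matvec(V, hidden)


W = [[0.5, -0.3], [0.1, 0.8]]  # assumed: input -> hidden
U = [[0.2, 0.0], [0.0, 0.2]]   # hidden -> hidden
V = [[1.0, -1.0]]              # hidden -> label
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

scores = rnn_forward(sequence, W, U, V)
print(scores)
```

Because the hidden state is carried from one element to the next, the final score depends on the order of the inputs, which is why this structure fits text tasks such as sentiment analysis.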
Key takeaways from modeling Deep Neural Networks
 Neural networks are resource intensive
o Typically require huge dedicated hardware (RAM, GPUs)
 The parameter space is huge
o Hundreds of thousands of parameters
o Tuning is important
 Architecture choice is important:
o See http://www.asimovinstitute.org/neural-network-zoo/
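The "hundreds of thousands of parameters" figure is easy to reproduce for a small fully connected network. A quick sketch of the count follows; the layer sizes are an illustrative assumption (a 28x28-pixel input, two hidden layers, and ten classes), not a model from the slides.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    Each pair of consecutive layers contributes n_in * n_out weights
    plus n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical network: 784 inputs, hidden layers of 256 and 128, 10 outputs.
count = mlp_parameter_count([784, 256, 128, 10])
print(count)
```

Even this modest network has well over 200,000 trainable values, and every one of them is affected by hyperparameters such as the learning rate.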
Contents
 Myself
 Introduction To Big Data
 Artificial Intelligence
 Artificial Intelligence and Big Data
 Summary
Recap
Spark:
an efficient framework for running computations on
thousands of computers
TensorFlow:
high-performance numerical framework
Get the best of both
Simple API for distributed numerical computing
Can leverage the hardware of the cluster
Deep Learning + Apache Spark
 Investment in Big Data infrastructure
 GPUs
o Require specialized hardware
o Niche use cases
 Can enterprises reuse existing infrastructure for deep learning applications?
 What use cases in deep learning can leverage Apache Spark?
Spark using TensorFlow [8, 9]
 Neural networks
 have seen spectacular progress during the last few years
 are now the state of the art in image recognition and automated translation.
 TensorFlow
 a new framework released by Google
– for numerical computations and neural networks.
 Spark and TensorFlow
 use Spark and a cluster of machines
– to improve deep learning pipelines with TensorFlow
– to train and apply deep learning models with TensorFlow and Spark together
 Hyperparameter Tuning:
– use Spark to find the best set of hyperparameters for neural network training,
• leading to a 10x reduction in training time and a 34% lower error rate.
 Deploying models at scale:
– use Spark to apply a trained neural network model on a large amount of data
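The hyperparameter-tuning idea, training one configuration per worker and keeping the best, can be sketched without a cluster. The example below uses Python's standard `concurrent.futures` thread pool as a stand-in for Spark workers and a toy one-parameter model as a stand-in for a TensorFlow network; the data, the grid, and `train_and_evaluate` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Toy training data for the relationship y = 3x.
DATA = [(x, 3.0 * x) for x in (0.0, 1.0, 2.0, 3.0)]


def train_and_evaluate(params):
    """Fit y = w * x by gradient descent; return (mean squared error, params)."""
    learning_rate, epochs = params
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in DATA) / len(DATA)
        w -= learning_rate * grad
    mse = sum((w * x - y) ** 2 for x, y in DATA) / len(DATA)
    return mse, params


# Every (learning rate, epochs) combination is an independent task.
grid = list(product([0.001, 0.01, 0.1], [10, 50]))

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(train_and_evaluate, grid))

best_mse, best_params = min(results)
print(best_params, best_mse)
```

The configurations do not depend on each other, so the search parallelizes trivially; on a real cluster the grid would be distributed across Spark workers instead of threads.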
Spark Cluster with TensorFlow
 Accuracy with the default set of hyperparameters
 99.2%.
 Best result with hyperparameter tuning
– has a 99.47% accuracy on the test set,
• which is a 34% reduction of the test error.
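The 34% figure follows from comparing error rates rather than accuracies: the error drops from 0.8% to 0.53%. A short check of that arithmetic (the function name is illustrative):

```python
def relative_error_reduction(baseline_accuracy, tuned_accuracy):
    """Fraction by which the test error shrinks when accuracy improves."""
    baseline_error = 1.0 - baseline_accuracy
    tuned_error = 1.0 - tuned_accuracy
    return (baseline_error - tuned_error) / baseline_error


reduction = relative_error_reduction(0.992, 0.9947)
print(f"{reduction:.1%}")
```

A gain of 0.27 percentage points in accuracy therefore removes about a third of the remaining errors.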
Efforts on using Deep Learning Frameworks with Spark
 Databricks
 Platform for running Spark with TensorFlow
 BigDL
 Intel’s library for deep learning on existing data frameworks.
 TensorFlowOnSpark
 Yahoo’s Distributed Deep Learning on Big Data
 SparkNet
 AMPLab’s framework for training deep networks in Spark
Efforts on using Deep Learning Frameworks with Spark
 DeepLearning4J
 Uses data parallelism to train separate neural networks
 DeepDist
 Lightning-fast deep learning on Spark via parallel stochastic gradient updates
 IBM DSX
Databricks
 Deploying trained models
o to make predictions on data stored in Spark RDDs or DataFrames
o Inception model: https://www.tensorflow.org/tutorials/image_recognition
o Each prediction requires about 4.8 billion operations
o Parallelizing with Spark helps scale operations
https://databricks.com/blog/2016/12/21/deep-learning-on-databricks.html
Databricks
• Distributed model training
 Use deep learning libraries like TensorFlow to test different model hyperparameters on each worker
 Task parallelism
https://databricks.com/blog/2016/12/21/deep-learning-on-databricks.html
IBM DSX
 Data Science Experience (DSX) includes
TensorFlow library
GPU
Easy to develop and run Spark with TensorFlow
No need to configure the library
Databricks’ examples run in DSX
– while Databricks CE does not support GPU
Brunel for visualization, added recently
Spark Cluster with TensorFlow (Cont’d)
Multiple nodes in the cluster:
 The computations scaled linearly
Graph: computation times (in seconds) with respect to the number of machines on the cluster
– Using a 13-node cluster,
• train 13 models in parallel,
• which translates into a 7x speedup compared to training the models one at a time on one machine.
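The 13-node result can also be read as a parallel efficiency, that is, the achieved speedup divided by the number of machines. A small sketch of that arithmetic (the function name is illustrative):

```python
def parallel_efficiency(speedup, workers):
    """Share of the ideal speedup (one per worker) that was actually achieved."""
    return speedup / workers


efficiency = parallel_efficiency(7.0, 13)
print(f"{efficiency:.0%}")
```

A 7x speedup on 13 machines is roughly 54% of the ideal 13x, so "scaled linearly" here most plausibly describes the trend of the curve rather than perfect one-to-one scaling.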
Spark Cluster with TensorFlow (Cont’d)
The learning rate for different numbers of neurons:
The learning rate is critical:
– If it is too low,
• the neural network does not learn anything (high test error).
– If it is too high,
• the training process may oscillate randomly and even diverge in some configurations.
The number of neurons
– is not as important for getting good performance,
• and networks with many neurons
– are much more sensitive to the learning rate.
– This is the Occam’s Razor principle:
• simpler models tend to be “good enough” for most purposes.
• If you have the time and resources to go after the missing 1% test error, you must be willing to invest a lot of resources in training
• to find the proper hyperparameters that will make the difference.
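The effect of the learning rate can be reproduced on the simplest possible problem. The sketch below minimizes f(w) = w² with plain gradient descent; the function and the three learning rates are illustrative choices, not the experiment from the slides.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w**2 (gradient 2w) and return the final distance from 0."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return abs(w)


too_low = gradient_descent(0.001)  # barely moves away from the start
good = gradient_descent(0.3)       # converges close to the minimum at 0
too_high = gradient_descent(1.1)   # overshoots further on every step

print(too_low, good, too_high)
```

With the lowest rate the model has hardly learned after 20 steps, with the highest rate each update overshoots the minimum and the distance grows, and only the middle setting converges.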
Distributed processing of images using TensorFlow
 Apache Spark with a Deep Learning library
Takes an existing neural network (Inception-v3)
– and applies it to a corpus of images.
Requires that TensorFlow be installed on the cluster
Runs in IBM DSX
– Not in Databricks CE
• Built by Databricks but needs GPU
 Spark integration workflow:
Define TensorFlow operations as methods, to be used within Spark tasks.
Broadcast the model for use within Spark tasks.
Parallelize a list of image URLs.
Using Spark, process the image URLs in parallel:
– Load the image.
– Run inference on the image using TensorFlow to predict the image contents.
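The shape of this workflow (share one read-only model, split a list of URLs, run load-then-infer per URL) can be sketched with the standard library alone. Everything below is a stand-in: the thread pool replaces Spark tasks, `MODEL` replaces the broadcast Inception network, and `load_image` and `run_inference` are hypothetical placeholders that do not download or classify real images.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the broadcast model: shared, read-only, used by every task.
MODEL = {"labels": ["dark", "bright"], "threshold": 0.5}


def load_image(url):
    """Placeholder loader: derives fake pixel intensities from the URL text."""
    return [(ord(ch) % 10) / 10 for ch in url]


def run_inference(model, pixels):
    """Placeholder for a TensorFlow forward pass: thresholds the mean intensity."""
    mean = sum(pixels) / len(pixels)
    return model["labels"][int(mean >= model["threshold"])]


def process(url):
    """One task: load the image, then run inference with the shared model."""
    return url, run_inference(MODEL, load_image(url))


urls = [
    "http://example.com/a.jpg",
    "http://example.com/b.jpg",
    "http://example.com/c.jpg",
]

# Distribute the URL list across workers and collect (url, label) pairs.
with ThreadPoolExecutor(max_workers=3) as pool:
    predictions = dict(pool.map(process, urls))

print(predictions)
```

Since each URL is processed independently and the model is only read, the result is the same as a sequential loop; parallelism changes only how long it takes.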
Distributed processing of image classification using TensorFlow
 Use the “Simple image classification with Inception” example from TensorFlow,
which applies the Inception model to predict the contents of a set of images.
 For example, given a photo of two scuba divers,
the Inception model will tell us the contents of the image:
('scuba diver', 0.88708681),
('electric ray, crampfish, numbfish, torpedo', 0.012277877),
('sea snake', 0.005639134),
('tiger shark, Galeocerdo cuvieri', 0.0051873429),
('reel', 0.0044495272)
Distributed processing of image classification using TensorFlow (Cont’d)
Each of the lines above represents a “synset,”
or a set of synonymous terms
– representing a concept.
The weight given to each synset
– represents a confidence in how applicable the synset is to the image.
– In this case, “scuba diver” is pretty accurate!
Making predictions with Inception-v3 is expensive:
– each prediction requires about 4.8 billion operations (Szegedy et al., 2015).
Even with smaller datasets,
– worthwhile to parallelize this computation.
– distribute these costly predictions using Spark.
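The ranked (label, confidence) pairs that a classifier reports for an image are typically produced by applying softmax to the network's raw scores and keeping the top few. A minimal sketch of that last step follows; the labels and scores are made up and are not real Inception output.

```python
import math


def top_k(logits, labels, k=5):
    """Softmax the raw scores and return the k most probable (label, probability) pairs."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


labels = ["scuba diver", "sea snake", "reel", "tiger shark"]
logits = [4.0, 1.0, 0.5, 0.8]  # hypothetical raw scores

best = top_k(logits, labels, k=2)
print(best)
```

The probabilities over all labels sum to 1, so a high value for one synset necessarily leaves little weight for the others.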
Contents
 Myself
 Introduction To Big Data
 Artificial Intelligence
 Artificial Intelligence and Big Data
 Summary
Summary
Introduction to Big Data
Introduction to AI
AI on Big Data
Databricks Partners
Training Hadoop and Spark
Cloudera visits to interview Jongwook Woo
Training Hadoop on IBM Bluemix at
California State Univ. Los Angeles
Question?
References
1. Jongwook Woo and Yuhang Xu, “Market Basket Analysis Algorithm with Map/Reduce of Cloud Computing”, The 2011 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2011), Las Vegas, July 18-21, 2011
2. Jongwook Woo, “Market Basket Analysis Algorithms with MapReduce”, DMKD-00150, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Oct 28 2013, Volume 3, Issue 6, pp. 445-452, ISSN 1942-4795
3. Jongwook Woo, “Big Data Trend and Open Data”, UKC 2016, Dallas, TX, Aug 12 2016
4. How to choose algorithms for Microsoft Azure Machine Learning, https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-choice
5. Manik Katyal, Parag Chhadva, Shubhra Wahi, and Jongwook Woo, “Big Data Analysis using Spark for Collision Rate Near CalStateLA”, https://globaljournals.org/GJCST_Volume16/1-Big-Data-Analysis-using-Spark.pdf
6. Spark Programming Guide, http://spark.apache.org/docs/latest/programming-guide.html
7. GitHub URL: https://github.com/nmelche/IntroductionToBigDataScience
8. TensorFrames: Google Tensorflow on Apache Spark, https://www.slideshare.net/databricks/tensorframes-google-tensorflow-on-apache-spark
9. Deep learning and Apache Spark, https://www.slideshare.net/QuantUniversity/deep-learning-and-apache-spark
10. Which Is Deeper - Comparison Of Deep Learning Frameworks On Spark, https://www.slideshare.net/SparkSummit/which-is-deeper-comparison-of-deep-learning-frameworks-on-spark
11. Accelerating Machine Learning and Deep Learning At Scale with Apache Spark, https://www.slideshare.net/SparkSummit/accelerating-machine-learning-and-deep-learning-at-scalewith-apache-spark-keynote-by-ziya-ma
12. Deep Learning with Apache Spark and TensorFlow, https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html
13. Tensor Flow Deep Learning Open SAP
Deep Learning for the Intelligent Enterprise
[Figure: nested diagram of Artificial Intelligence, Machine Learning, Neural Networks, and Deep Learning]
Deep learning
▪ Sub-field of neural networks, machine learning, and artificial intelligence
▪ Deep learning is neural networks with many layers
▪ Inspired by, but not limited to, the architecture of the human brain
▪ Deep learning is the reality behind artificial intelligence

Weitere ähnliche Inhalte

Was ist angesagt?

Traffic Data Analysis and Prediction using Big Data
Traffic Data Analysis and Prediction using Big DataTraffic Data Analysis and Prediction using Big Data
Traffic Data Analysis and Prediction using Big DataJongwook Woo
 
The Importance of Open Innovation in AI era
The Importance of Open Innovation in AI eraThe Importance of Open Innovation in AI era
The Importance of Open Innovation in AI eraJongwook Woo
 
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformPredictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformSavita Yadav
 
The Evolution of Data Science
The Evolution of Data ScienceThe Evolution of Data Science
The Evolution of Data ScienceKenny Daniel
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data ScienceAndrew Gardner
 
Big Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesBig Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesRukshan Batuwita
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsChandan Rajah
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big DataIndu Khemchandani
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data ScienceJason Geng
 
Predictive Analytics - Big Data & Artificial Intelligence
Predictive Analytics - Big Data & Artificial IntelligencePredictive Analytics - Big Data & Artificial Intelligence
Predictive Analytics - Big Data & Artificial IntelligenceManish Jain
 
Data science e machine learning
Data science e machine learningData science e machine learning
Data science e machine learningGiuseppe Manco
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceANOOP V S
 
Introduction on Data Science
Introduction on Data ScienceIntroduction on Data Science
Introduction on Data ScienceEdureka!
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceEdureka!
 
Public Data and Data Mining Competitions - What are Lessons?
Public Data and Data Mining Competitions - What are Lessons?Public Data and Data Mining Competitions - What are Lessons?
Public Data and Data Mining Competitions - What are Lessons?Gregory Piatetsky-Shapiro
 

Was ist angesagt? (20)

Traffic Data Analysis and Prediction using Big Data
Traffic Data Analysis and Prediction using Big DataTraffic Data Analysis and Prediction using Big Data
Traffic Data Analysis and Prediction using Big Data
 
The Importance of Open Innovation in AI era
The Importance of Open Innovation in AI eraThe Importance of Open Innovation in AI era
The Importance of Open Innovation in AI era
 
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data PlatformPredictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform
 
Analytics and Data Mining Industry Overview
Analytics and Data Mining Industry OverviewAnalytics and Data Mining Industry Overview
Analytics and Data Mining Industry Overview
 
The Evolution of Data Science
The Evolution of Data ScienceThe Evolution of Data Science
The Evolution of Data Science
 
Big Data and the Art of Data Science
Big Data and the Art of Data ScienceBig Data and the Art of Data Science
Big Data and the Art of Data Science
 
Big Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesBig Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our Lives
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and Benefits
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
 
#BigDataCanarias: "Big Data & Career Paths"
#BigDataCanarias: "Big Data & Career Paths"#BigDataCanarias: "Big Data & Career Paths"
#BigDataCanarias: "Big Data & Career Paths"
 
Data Science: Past, Present, and Future
Data Science: Past, Present, and FutureData Science: Past, Present, and Future
Data Science: Past, Present, and Future
 
Predictive Analytics - Big Data & Artificial Intelligence
Predictive Analytics - Big Data & Artificial IntelligencePredictive Analytics - Big Data & Artificial Intelligence
Predictive Analytics - Big Data & Artificial Intelligence
 
Data science e machine learning
Data science e machine learningData science e machine learning
Data science e machine learning
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Introduction on Data Science
Introduction on Data ScienceIntroduction on Data Science
Introduction on Data Science
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Data science
Data scienceData science
Data science
 
Analytics Education in the era of Big Data
Analytics Education in the era of Big DataAnalytics Education in the era of Big Data
Analytics Education in the era of Big Data
 
Public Data and Data Mining Competitions - What are Lessons?
Public Data and Data Mining Competitions - What are Lessons?Public Data and Data Mining Competitions - What are Lessons?
Public Data and Data Mining Competitions - What are Lessons?
 

Ähnlich wie AI on Big Data

Big Data Trend and Open Data
Big Data Trend and Open DataBig Data Trend and Open Data
Big Data Trend and Open DataJongwook Woo
 
Big Data and Data Intensive Computing: Use Cases
Big Data and Data Intensive Computing: Use CasesBig Data and Data Intensive Computing: Use Cases
Big Data and Data Intensive Computing: Use CasesJongwook Woo
 
Big Data and Data Intensive Computing: Education and Training
Big Data and Data Intensive Computing: Education and TrainingBig Data and Data Intensive Computing: Education and Training
Big Data and Data Intensive Computing: Education and TrainingJongwook Woo
 
President Election of Korea in 2017
President Election of Korea in 2017President Election of Korea in 2017
President Election of Korea in 2017Jongwook Woo
 
Big Data Platform adopting Spark and Use Cases with Open Data
Big Data  Platform adopting Spark and Use Cases with Open DataBig Data  Platform adopting Spark and Use Cases with Open Data
Big Data Platform adopting Spark and Use Cases with Open DataJongwook Woo
 
Big Data and Data Intensive Computing on Networks
Big Data and Data Intensive Computing on NetworksBig Data and Data Intensive Computing on Networks
Big Data and Data Intensive Computing on NetworksJongwook Woo
 
Introduction To Big Data and Use Cases on Hadoop
Introduction To Big Data and Use Cases on HadoopIntroduction To Big Data and Use Cases on Hadoop
Introduction To Big Data and Use Cases on HadoopJongwook Woo
 
Big Data Trend with Open Platform
Big Data Trend with Open PlatformBig Data Trend with Open Platform
Big Data Trend with Open PlatformJongwook Woo
 
Big Data and Advanced Data Intensive Computing
Big Data and Advanced Data Intensive ComputingBig Data and Advanced Data Intensive Computing
Big Data and Advanced Data Intensive ComputingJongwook Woo
 
Big Data and Data Intensive Computing: Education and Training
Big Data and Data Intensive Computing: Education and TrainingBig Data and Data Intensive Computing: Education and Training
Big Data and Data Intensive Computing: Education and TrainingJongwook Woo
 
Introduction to Spark: Data Analysis and Use Cases in Big Data
Introduction to Spark: Data Analysis and Use Cases in Big Data Introduction to Spark: Data Analysis and Use Cases in Big Data
Introduction to Spark: Data Analysis and Use Cases in Big Data Jongwook Woo
 
Special talk: Introduction to Big Data and FinTech at Financial Supervisory S...
Special talk: Introduction to Big Data and FinTech at Financial Supervisory S...Special talk: Introduction to Big Data and FinTech at Financial Supervisory S...
Special talk: Introduction to Big Data and FinTech at Financial Supervisory S...Jongwook Woo
 
Big Data Analysis and Industrial Approach using Spark
Big Data Analysis and Industrial Approach using SparkBig Data Analysis and Industrial Approach using Spark
Big Data Analysis and Industrial Approach using SparkJongwook Woo
 
Introduction to Big Data, MapReduce, its Use Cases, and the Ecosystems
Introduction to Big Data, MapReduce, its Use Cases, and the EcosystemsIntroduction to Big Data, MapReduce, its Use Cases, and the Ecosystems
Introduction to Big Data, MapReduce, its Use Cases, and the EcosystemsJongwook Woo
 
Spark tutorial @ KCC 2015
Spark tutorial @ KCC 2015Spark tutorial @ KCC 2015
Spark tutorial @ KCC 2015Jongwook Woo
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Chek mate geolocation analyzer
Chek mate geolocation analyzerChek mate geolocation analyzer
Chek mate geolocation analyzerpriyal mistry
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesQubole
 
Introduction To Big Data and Use Cases using Hadoop
Introduction To Big Data and Use Cases using HadoopIntroduction To Big Data and Use Cases using Hadoop
Introduction To Big Data and Use Cases using HadoopJongwook Woo
 

AI on Big Data

  • 1. Introduction to AI on Big Data. Jongwook Woo, PhD, jwoo5@calstatela.edu, High-Performance Information Computing Center (HiPIC), California State University Los Angeles. May 29 2018. Dong-Eui University (동의대학교), College of Commerce and Economics, Department of Economics, Professor Lim Dong-Soon (임동순).
  • 2. High Performance Information Computing Center Jongwook Woo CalStateLA Contents: Myself; Introduction To Big Data; Artificial Intelligence; Artificial Intelligence and Big Data; Summary
  • 3. Myself. Experience: Since 2002, Professor at California State University Los Angeles – PhD in 2001: Computer Science and Engineering at USC. Since 1998: R&D consulting in Hollywood – Warner Bros (Matrix online game), E!, citysearch.com, ARM, etc. – Information Search and Integration with FAST, Lucene/Solr, Sphinx – implemented eBusiness applications using J2EE and middleware. Since 2007: Exposed to Big Data at CitySearch.com. 2012 - Present: Big Data Academic Partnerships – For Big Data research and training • Amazon AWS, Microsoft Azure, IBM Bluemix • Databricks, Hadoop vendors
  • 4. Myself: S/W Development Lead http://www.mobygames.com/game/windows/matrix-online/credits
  • 5. Myself. Experience (Cont'd): Bringing Big Data R&D and training to Korea since 2009. Collaborating with LA city since 2016 – Collect, Search, and Analyze City Data • Spark, Hadoop, ElasticSearch, Solr, Java, Cloudera. Sept 2013: Samsung Advanced Technology Training Institute. Since 2008 – Introducing Hadoop Big Data and education to universities and research centers • Yonsei, Gachon, DongEui • US: USC, Pennsylvania State Univ, University of Maryland College Park, Univ of Bridgeport, Louisiana State Univ, California State Univ LB • Europe: Univ of Luxembourg
  • 6. Myself: Partners for Services
  • 7. Experience in Big Data. Collaboration: Council Member of IBM Spark Technology Center; City of Los Angeles for OpenHub and Open Data; Startup Companies in Los Angeles; External Collaborator and Advisor in Big Data – IMSC of USC – Pennsylvania State University – The Big Link, Softzen, Wiken in Korea. Grants: IBM Bluemix, Microsoft Windows Azure, Amazon AWS in Research and Education Grant. Partnership: Academic Education Partnership with Databricks, Tableau, Qlik, Cloudera, Hortonworks, SAS, Teradata
  • 8. Myself: Public Partners
  • 9. Contents: Myself; Introduction To Big Data; Artificial Intelligence; Artificial Intelligence and Big Data; Summary
  • 10. Data Issues. Large-scale data: Tera-byte (10^12), Peta-byte (10^15) – because of the web – sensor data (IoT), bioinformatics, social computing, streaming data, smart phones, online games… Cannot be handled with the legacy approach: too big; non-/semi-structured data; too expensive. Need new systems: inexpensive
  • 11. Two Cores in Big Data: how to store Big Data; how to compute Big Data. Google: how to store Big Data – GFS – distributed systems on inexpensive commodity computers; how to compute Big Data – MapReduce – parallel computing with inexpensive computers; own super computers; published papers in 2003, 2004
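The MapReduce idea named on the slide above can be illustrated with a small single-machine sketch. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are made up for illustration, and a real framework would run the map and reduce steps on many machines.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data on Hadoop", "big data on Spark"])
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is what allows a framework to spread the calls across inexpensive machines.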
  • 12. What is Hadoop? Hadoop Founder: Doug Cutting. Apache Committer: Lucene, Nutch, …
  • 13. Super Computer vs Hadoop. Figure: "Parallel vs. Distributed file systems" by Michael Malak, updated by Jongwook Woo (labels: Cluster for Store, Cluster for Compute, Cluster for Compute/Store)
  • 14. Hadoop Cluster: Logical Diagram. Diagram labels: web browser of the cluster monitor (CM/Ambari) over HTTP(S); Cluster Monitor; Hadoop nodes, each shown with an Agent and HDFS; HIVE, ZooKeeper, Impala
  • 15. Hadoop Ecosystems http://dawn.dbsdataprojects.com/tag/hadoop/
  • 16. Definition: Big Data. Inexpensive frameworks that are distributed parallel systems and that can store large-scale data and process it in parallel [1, 2]. Hadoop – inexpensive super computer – more public than the traditional super computers • You can store and process your applications – in your university labs, small companies, research centers. Others – NoSQL DB (Cassandra, MongoDB, Redis, HBase) – ElasticSearch
  • 17. NoSQL DB. Key-Value: Memcached, Memcachedb, Redis. Column Oriented (Column Family Store): BigTable, HBase, Cassandra (Key-Value Column Oriented), Amazon SimpleDB. Document Oriented: MongoDB, Couchbase, CouchDB. Graph Oriented: Neo4j, InfiniteGraph
  • 18. Alternative to Hadoop MapReduce. Limitations of MapReduce: hard to program in Java; batch processing – not interactive; disk storage for intermediate data – performance issue. Spark by UC Berkeley AMP Lab: in-memory storage for intermediate data; 20 ~ 100 times faster than network- and disk-based MapReduce; good in Machine Learning – iterative algorithms
  • 19. Spark and Hadoop. Spark: File Systems: Tachyon; Resource Manager: Mesos. But Hadoop has been dominating the market: integrating Spark into the Hadoop cluster. Cloud Computing – Amazon AWS, Azure HDInsight, IBM Bluemix • Object Storage, S3. Hadoop vendors – HDP, CDH. Databricks: Spark on AWS & Azure – no Hadoop ecosystems
  • 20. Sentiment Map of AlphaGo (legend: Positive, Negative)
  • 21. Sentiment Map of Lee Se-Dol vs AlphaGo. YouTube video: "alphago sentiment" by Google. The sentiment of the World in Geo and Time: https://youtu.be/vAzdnj4fkOg?list=PLaEg1tCLuW0BYLqVS5RTbToiB8wQ2w14a
  • 22. K-Election 2017 (April 29 – May 9)
  • 23. Mapping of Crimes That Occurred within 5 Miles of CalStateLA, UCLA and USC in 2015
  • 24. Review count of popular sub-categories of business
  • 25. Popular Businesses within 5 Miles of CalStateLA, USC, UCLA
  • 26. Average Undergraduates Receiving PELL GRANT in Each College. East Georgia State College: $2,854 Avg. PELL grant: 97.285%
  • 27. Big Data Analysis Flow. Data Collection: Batch API: Yelp, Google; Streaming: Twitter, Apache NiFi, Kafka, Storm; Open Data: Government. Data Storage: HDFS, S3, Object Storage, NoSQL DB (Couchbase)… Data Filtering: Hive, Pig. Data Analysis and Science: Hive, Pig, Spark, BI Tools (Datameer, Qlik, Tableau, …). Data Visualization: Qlik, Datameer, Excel PowerView. - Big Data Engineering - Big Data Analysis - Big Data Science - Data Visualization
  • 28. Terms. We know: Data Engineering – collect, clean, filter data; Data Analysis – find insights from the data; Data Science (Predictive Analysis) – predict the trend or pattern from the existing data. Do we know? Big Data Analysis and Science – using Big Data for Data Analysis and Science • Hadoop, Spark, NoSQL DB, SAP HANA, ElasticSearch, … – for massive data sets • How to store and compute?
  • 29. NoSQL DB. Key-Value: Memcached, Memcachedb, Redis. Column Oriented (Column Family Store): BigTable, HBase, Cassandra (Key-Value Column Oriented), Amazon SimpleDB. Document Oriented: MongoDB, Couchbase, CouchDB. Graph Oriented: Neo4j, InfiniteGraph
  • 30. Contents: Myself; Introduction To Big Data; Artificial Intelligence; Artificial Intelligence and Big Data; Summary
  • 31. AI and Deep Learning (diagram labels: Artificial Intelligence, Machine Learning, Neural Networks, Deep Learning). Deep learning ▪ Sub-field of neural networks, machine learning, and artificial intelligence ▪ Deep learning is neural networks with many layers ▪ Inspired by, but not limited to, the architecture of the human brain. © 2017 SAP SE or an SAP affiliate company. All rights reserved.
  • 32. Deep Learning and TensorFlow ▪ Development led by Google ▪ Open-source library for deep learning ▪ Define model structures, library for efficient execution ▪ Define once, run anywhere: can run on CPUs and GPUs, many devices (NVidia, Google GPU) ▪ Can be used in Python and many other languages ▪ Built for large-scale machine learning development and operations
  • 33. Deep Learning [9] • Neural Networks • Multi-Layer Perceptron • Convolutional Neural Networks
  • 34. Convolutional Neural Networks • Good at problems like image classification.
  • 35. Recurrent Neural Networks (RNN) • Has 3 types of parameters ▫ W – Hidden weights ▫ U – Hidden to Hidden weights ▫ V – Hidden to Label weights • Good for text processing such as sentiment analysis: • My Projects > sapDeepLearningTensorflow > Week_03_Unit_05_S
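A minimal sketch of one recurrent step may help make the three parameter sets on the RNN slide concrete. It assumes the common formulation h_t = tanh(W x_t + U h_{t-1}) with label scores V h_t, reading W as the input-to-hidden weights; that reading, the toy sizes, and every weight value below are assumptions made for illustration, not taken from the slides.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_step(x, h_prev, W, U, V):
    """One step: h = tanh(W x + U h_prev); label scores = V h."""
    wx = matvec(W, x)
    uh = matvec(U, h_prev)
    h = [math.tanh(a + b) for a, b in zip(wx, uh)]
    return h, matvec(V, h)

def run_rnn(inputs, W, U, V):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = [0.0] * len(U)
    scores = []
    for x in inputs:
        h, scores = rnn_step(x, h, W, U, V)
    return h, scores

# Toy sizes: 2 input features, 3 hidden units, 2 labels (all values made up).
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
U = [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0], [0.3, 0.0, 0.1]]
V = [[1.0, -1.0, 0.5], [-0.5, 0.2, 1.0]]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
final_hidden, label_scores = run_rnn(sequence, W, U, V)
```

Because the hidden state is carried from one token to the next, the final scores depend on the whole sequence, which is why this family of models suits text tasks such as sentiment analysis.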
  • 36. Key takeaways from modeling Deep Neural Networks. Neural networks are resource intensive o Typically require huge dedicated hardware (RAM, GPUs). Parameter space is huge o 100s of thousands of parameters o Tuning is important. Architecture choice is important o See http://www.asimovinstitute.org/neural-network-zoo/
  • 37. Contents: Myself; Introduction To Big Data; Artificial Intelligence; Artificial Intelligence and Big Data; Summary
  • 38. Recap. Spark: an efficient framework for running computations on thousands of computers. TensorFlow: high-performance numerical framework. Get the best of both: simple API for distributed numerical computing; can leverage the hardware of the cluster
  • 39. Deep Learning + Apache Spark. Investment in Big Data infrastructure. GPUs: require specialized hardware – niche use cases. Can enterprises reuse existing infrastructure for deep learning applications? What use cases in deep learning can leverage Apache Spark?
  • 40. Spark using TensorFlow [8, 9]. Neural networks: have seen spectacular progress during the last few years; the state of the art in image recognition and automated translation. TensorFlow: a new framework released by Google for numerical computations and neural networks. Spark and TensorFlow: use Spark and a cluster of machines to improve deep learning pipelines with TensorFlow; how to use TensorFlow and Spark together to train and apply deep learning models. Hyperparameter tuning: use Spark to find the best set of hyperparameters for neural network training, leading to a 10X reduction in training time and a 34% lower error rate. Deploying models at scale: use Spark to apply a trained neural network model on a large amount of data
  • 41. Spark Cluster with TensorFlow. The accuracy with the default set of hyperparameters is 99.2%. The best result with hyperparameter tuning has a 99.47% accuracy on the test set, which is a 34% reduction of the test error.
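The 34% figure can be checked from the two accuracies quoted above: the test error falls from 0.8% to 0.53%, a relative reduction of roughly one third. A short sketch of that arithmetic (the function name is made up for illustration):

```python
def relative_error_reduction(baseline_accuracy, tuned_accuracy):
    """Fraction by which the test error shrinks between two accuracies."""
    baseline_error = 1.0 - baseline_accuracy
    tuned_error = 1.0 - tuned_accuracy
    return (baseline_error - tuned_error) / baseline_error

# Accuracies quoted on the slide: 99.2% by default, 99.47% after tuning.
reduction = relative_error_reduction(0.992, 0.9947)
print(f"relative reduction in test error: {reduction:.4f}")
```

The result is about 0.34, so the accuracy gain of 0.27 percentage points and the 34% error reduction describe the same improvement.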
  • 42. Efforts on using Deep Learning Frameworks with Spark. Databricks: platform for running Spark with TensorFlow. BigDL: Intel's library for deep learning on existing data frameworks. TensorflowOnSpark: Yahoo's distributed deep learning on Big Data. SparkNet: AMPLab's framework for training deep networks in Spark
  • 43. Efforts on using Deep Learning Frameworks with Spark. DeepLearning4J: uses data parallelism to train on separate neural networks. DeepDist: lightning-fast deep learning on Spark via parallel stochastic gradient updates. IBM DSX
  • 44. Databricks https://databricks.com/blog/2016/12/21/deep-learning-on-databricks.html Deploying trained models o to make predictions on data stored in Spark RDDs or Dataframes o Inception model: https://www.tensorflow.org/tutorials/image_recognition o Each prediction requires about 4.8 billion operations o Parallelizing with Spark helps scale operations
  • 45. Databricks https://databricks.com/blog/2016/12/21/deep-learning-on-databricks.html • Distributed model training: use deep learning libraries like TensorFlow to test different model hyperparameters on each worker; task parallelism
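The task-parallel pattern on the slide above, one hyperparameter combination per worker, can be sketched on a single machine. Here a thread pool stands in for the Spark workers, and `train_and_score` is a hypothetical placeholder with a made-up error surface rather than real TensorFlow training; the grid values are also assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_score(params):
    """Hypothetical stand-in for training one model; returns a test error."""
    learning_rate, num_neurons = params
    # Toy error surface with its minimum at learning_rate=0.1, num_neurons=64.
    return (learning_rate - 0.1) ** 2 + ((num_neurons - 64) / 1000.0) ** 2

learning_rates = [0.001, 0.01, 0.1, 1.0]
neuron_counts = [16, 64, 256]
grid = list(product(learning_rates, neuron_counts))

# Every combination is an independent task, so the tasks can run in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    errors = list(pool.map(train_and_score, grid))

best_error, best_params = min(zip(errors, grid))
```

On a Spark cluster the same shape would typically be expressed by distributing `grid` and mapping the training function over it, with each worker training one complete model.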
  • 46. IBM DSX. Data Science Experience (DSX) includes the TensorFlow library and GPU. Easy to develop and run Spark with TensorFlow: no need to configure the library. Databricks' examples run in DSX, while Databricks CE does not support GPU. Brunel for visualization lately
  • 47. Spark Cluster with TensorFlow (Cont'd). Multiple nodes in the cluster: the computations scaled linearly; a graph of the computation times (in seconds) with respect to the number of machines on the cluster: using a 13-node cluster, train 13 models in parallel, which translates into a 7x speedup compared to training the models one at a time on one machine.
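As a rough reading of those numbers, a 7x speedup on 13 machines corresponds to a parallel efficiency of a little over one half. The sketch below only restates that arithmetic; the function name is made up, and the slides do not say where the remaining time goes.

```python
def parallel_efficiency(speedup, num_machines):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / num_machines

# Figures quoted on the slide: 13 nodes, 7x speedup.
efficiency = parallel_efficiency(7.0, 13)
print(f"parallel efficiency: {efficiency:.3f}")
```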
  • 48. High Performance Information Computing Center Jongwook Woo CalStateLA Spark Cluster with TensorFlow (Cont’d)
  • 49. Spark Cluster with TensorFlow (Cont’d). Effect of the learning rate for different numbers of neurons. The learning rate is critical: if it is too low, the neural network does not learn anything (high test error); if it is too high, the training process may oscillate randomly and even diverge in some configurations. The number of neurons is not as important for getting good performance, and networks with many neurons are much more sensitive to the learning rate. This is the Occam’s Razor principle: simpler models tend to be “good enough” for most purposes. If you have the time and resources to go after the missing 1% test error, you must be willing to invest a lot of resources in training to find the proper hyperparameters that will make the difference.
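The sensitivity to the learning rate can be seen even on a toy problem. The sketch below is not a neural network: it assumes plain gradient descent on the one-parameter function f(w) = w², and the function name and the three rates are chosen only for illustration.

```python
def gradient_descent(learning_rate, steps=50, w0=1.0):
    """Minimize f(w) = w**2 starting from w0 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w               # derivative of w**2
        w -= learning_rate * grad    # each step multiplies w by (1 - 2 * learning_rate)
    return w * w


for lr in (0.0001, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With the smallest rate the loss barely moves from its starting value of 1.0, with 0.1 it shrinks toward zero, and with 1.1 each step overshoots the minimum so the iterate flips sign and grows.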
  • 50. Distributed processing of images using TensorFlow. Apache Spark with a deep learning library takes an existing neural network (Inception-v3) and applies it to a corpus of images. This requires that TensorFlow be installed on the cluster. The example runs in IBM DSX but not in Databricks CE (it was built by Databricks but needs GPU). Spark integration workflow: (1) define TensorFlow operations as methods to be used within Spark tasks; (2) broadcast the model for use within Spark tasks; (3) parallelize a list of image URLs; (4) using Spark, process the image URLs in parallel: load each image, then run inference on it using TensorFlow to predict the image contents.
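The workflow on this slide can be outlined without a cluster. The sketch below is an assumption-laden stand-in, not the Databricks example: a thread pool plays the role of the Spark tasks, `MODEL` is a hypothetical keyword table instead of Inception-v3, and `load_image` and `run_inference` are placeholders for the download and the TensorFlow call. The URLs are made up.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a loaded network: keyword -> predicted label.
MODEL = {"diver": "scuba diver", "shark": "tiger shark"}


def load_image(url):
    """Placeholder loader; a real task would download and decode the image."""
    return url.lower()


def run_inference(model, image):
    """Placeholder inference; a real task would run the TensorFlow graph."""
    for keyword, label in model.items():
        if keyword in image:
            return label
    return "unknown"


def classify(url, model=MODEL):
    """One task: load a single image and predict its contents."""
    return url, run_inference(model, load_image(url))


def classify_all(urls, max_workers=4):
    # Spark analogue: broadcast the model with sc.broadcast(...), then
    # sc.parallelize(urls).map(classify).collect().
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(classify, urls))


urls = [
    "http://example.com/diver.jpg",
    "http://example.com/shark.jpg",
    "http://example.com/cat.jpg",
]
results = classify_all(urls)
print(results)
```

Broadcasting matters in the real setting because the model is large: it is shipped to each worker once rather than with every task.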
  • 51. Distributed processing of image classification using TensorFlow. Use the “Simple image classification with Inception” example from TensorFlow, which applies the Inception model to predict the contents of a set of images. For example, given a photo of two scuba divers, the Inception model reports the contents of the image: ('scuba diver', 0.88708681), ('electric ray, crampfish, numbfish, torpedo', 0.012277877), ('sea snake', 0.005639134), ('tiger shark, Galeocerdo cuvieri', 0.0051873429), ('reel', 0.0044495272)
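A ranked list like the one on this slide is typically obtained by sorting the per-label scores and keeping the top few. The sketch below shows that step only; the `top_k` helper is hypothetical, and the scores are the five values quoted on the slide, deliberately shuffled.

```python
def top_k(scores, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


scores = {
    "sea snake": 0.005639134,
    "scuba diver": 0.88708681,
    "reel": 0.0044495272,
    "electric ray, crampfish, numbfish, torpedo": 0.012277877,
    "tiger shark, Galeocerdo cuvieri": 0.0051873429,
}
best = top_k(scores, k=3)
for label, score in best:
    print(f"{label}: {score:.4f}")
```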
  • 52. Distributed processing of image classification using TensorFlow (Cont’d). Each of the lines above represents a “synset,” a set of synonymous terms representing a concept. The weight given to each synset represents the confidence that the synset applies to the image. In this case, “scuba diver” is pretty accurate. Making predictions with Inception-v3 is expensive: each prediction requires about 4.8 billion operations (Szegedy et al., 2015). Even with smaller datasets, it is worthwhile to parallelize this computation and distribute these costly predictions using Spark.
  • 53. Contents: Myself; Introduction to Big Data; Artificial Intelligence; Artificial Intelligence and Big Data; Summary
  • 54. Summary: Introduction to Big Data; Introduction to AI; AI on Big Data
  • 55. Databricks Partners
  • 56. Training Hadoop and Spark. Cloudera visits to interview Jongwook Woo.
  • 57. Training Hadoop on IBM Bluemix at California State Univ. Los Angeles
  • 58. Questions?
  • 59. References. 1. Jongwook Woo and Yuhang Xu, “Market Basket Analysis Algorithm with Map/Reduce of Cloud Computing”, The 2011 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2011), Las Vegas, July 18-21, 2011. 2. Jongwook Woo, “Market Basket Analysis Algorithms with MapReduce”, DMKD-00150, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Oct 28 2013, Volume 3, Issue 6, pp. 445-452, ISSN 1942-4795. 3. Jongwook Woo, “Big Data Trend and Open Data”, UKC 2016, Dallas, TX, Aug 12 2016. 4. How to choose algorithms for Microsoft Azure Machine Learning, https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-choice 5. Manik Katyal, Parag Chhadva, Shubhra Wahi and Jongwook Woo, “Big Data Analysis using Spark for Collision Rate Near CalStateLA”, https://globaljournals.org/GJCST_Volume16/1-Big-Data-Analysis-using-Spark.pdf 6. Spark Programming Guide, http://spark.apache.org/docs/latest/programming-guide.html 7. GitHub URL: https://github.com/nmelche/IntroductionToBigDataScience
  • 60. References. 8. TensorFrames: Google TensorFlow on Apache Spark, https://www.slideshare.net/databricks/tensorframes-google-tensorflow-on-apache-spark 9. Deep Learning and Apache Spark, https://www.slideshare.net/QuantUniversity/deep-learning-and-apache-spark 10. Which Is Deeper: Comparison of Deep Learning Frameworks on Spark, https://www.slideshare.net/SparkSummit/which-is-deeper-comparison-of-deep-learning-frameworks-on-spark 11. Accelerating Machine Learning and Deep Learning at Scale with Apache Spark, https://www.slideshare.net/SparkSummit/accelerating-machine-learning-and-deep-learning-at-scalewith-apache-spark-keynote-by-ziya-ma 12. Deep Learning with Apache Spark and TensorFlow, https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html 13. Tensor Flow Deep Learning Open SAP
  • 61. Deep learning (from SAP, “Deep Learning for the Intelligent Enterprise”). [Diagram labels: Artificial Intelligence, Machine Learning, Neural Networks, Deep Learning.] Deep learning is a sub-field of neural networks, machine learning, and artificial intelligence. Deep learning is neural networks with many layers. It is inspired by, but not limited to, the architecture of the human brain. Deep learning is the reality behind artificial intelligence. © 2017 SAP SE or an SAP affiliate company. All rights reserved.