SlideShare ist ein Scribd-Unternehmen logo
1 von 66
Downloaden Sie, um offline zu lesen
Machine Learning –
Why we should know
and how it works
Kevin Lee, AVP of ML & AI Consultant
The Agenda
➢ Why Machine Learning?
➢ Introduction of Machine Learning
– Supervised, Unsupervised, Deep
Neural Network
➢ Machine Learning
Implementation
➢ PhUSE Machine Learning Working
Group
➢ Questions and Discussion
Kevin, do you
know about
Machine
Learning?
Why did people ask / expect me if I
know about Machine Learning?
• Programming
• Statistics / modeling
• Working with data all the times
What is ML?
An application of
artificial intelligence
(AI) that provides
systems the ability to
automatically learn
and improve from
experience without
being explicitly
programmed.
Explicit
Programing
Machine Learning
How does Human Learn?
- Experience
How does Machine learn?
Algorithm
Input
Data
How does Machine learn?
Algorithm
Test Data (cats)
Real Data
cat
How does Machine Learning
work?
X0 X1 X2 … Xn Y
• Hypothesis Function
- hθ(x) = θx + b
• Minimize Cost
Function,
J(θ) = hθ(x) – Y,
using Gradient
Descent
Input data
Algorithm
Machine Learning Algorithms
• Hypothesis function
• Model for data
• Y = θ0x0 + θ1x1 + θ2x2 + θ3x3 + …. + θnxn (Y = 2X + 30)
• Cost function
• measures how well hypothesis function fits into data.
• Difference between actual data point and
hypothesized data point.
• Gradient Descent
• Engine that minimizes cost function
Machine Learning Training Process
1. Select Data and hypothesis function
2. Start with random parameters of hypothesis function
3. Process first sets of data in hypothesis function
4. Find predicted values from hypothesis function
5. Calculate cost function ( actual – predicted)
6. Minimize cost function using Gradient Descent
7. Update parameters of hypothesis function based on cost
function
8. Repeat steps ( 3 and 7) with next sets of data until the cost
function becomes 0 or at the lowest point
Machine finds best model
using input data
X
Y
hθ(x) = x + 30
hθ(x) = 2x + 30
Model building in Machine Learning
Hypothesis
Function
New
Iteration
Cost
Function
Calculation
Applying
Gradient
Descent
Updating
Parameters
Best model can provide best
predicted value
X
Y
Xi
Yi
More data,
Better model,
Better prediction
X
Y
X
Y
Different data,
Different model,
Different prediction
X
Y
Garbage
in
Garbage
out
Bad data, Bad model, Bad prediction
Machine
Learning Type
• Supervised - we know
the correct answers
• Unsupervised – no
answers
• Artificial Neural
Network – like human
neural network
Machine Learning Type
• Supervised
• Input data labeled : has correct
answers
• Specific purpose
• Types
• Classification for distinct output
values
• Regression for continuous output
values
• Unsupervised
• Input data not-labeled : No correct
answers
• Exploratory
• Types : Clustering (clusters)
X0 X1 X2 … Xn Y
X0 X1 X2 … Xn
Classification
X1
• Categorical target
• Binary or Multiple
Classification
• Example : Yes/No, 0 to 9,
mild/moderate/severe
• Logistic Regression, SVM,
K-nearest Neighbor, Naïve
Bayes, Decision Tree,
Random Forests, XGBoost
X2
Support Vector Machine (SVM)
• SVM is one of the most powerful classification
model (both binary & multi), especially for
complex, but small/mid-sized datasets.
• Best for binary classification
• Easy to train
• It finds out a line/ hyper-plane (in multidimensional
space that separate outs classes) that maximizes
margin between two classes.
• parameters – Kernel (linear, polynomial), gamma,
margin(a separation of line to the closest class
points)
*** SAS codes;
proc svmachine data=x_train C=1.0; kernel linear;
input x1 x2 x3 x4 / level=interval;
target y;
run;
SVM in SAS Visual
Data Mining and
Machine Learning
SAS Machine
Learning portal
can provide an
interactive
modeling.
SVM in SAS Visual Data
Mining and Machine
Learning
SVM in SAS Visual Data Mining and
Machine Learning
Python codes for SVM
#import ML algorithm
from sklearn.svm import SVC
#prepare train and test datasets
x_train = …
y_train = ….
x_test = ….
#select and train model
svm = SVC(kernel=‘linear’, C=1.0, random_state=1)
svm.fit(x_train, y_train)
#predict output
predicted = svm.predict(x_test)
Decision Trees
The widely used
machine learning
algorithm that use tree
to visually and explicitly
represent decision and
its conditions.
It is used for both
classification (more
popular) and regression.
Why Decision Trees?
• Easy to understand and interpret
• Require a very little data preparation
• Possible to validate model using statistical tests.
Python codes for Decision Tree
#import ML algorithm
from sklearn.tree import DecisionClassifier
#prepare train and test datasets
x_train = …
y_train = ….
x_test = ….
#select and train model
d_tree = DecisionClassifier(max_depth=4)
d_tree.fit(x_train, y_train)
#predict output
predicted = d_tree.predict(x_test)
Decision Tree in SAS Visual Data
Mining and Machine Learning
Python in Machine Learning
Most popular programming language in Machine
Learning
• Scikit-Learn
• simple and efficient ML tool, open-source
• Classification, Regression, SVM, Clustering,
Dimensionality Reduction, Decision Trees,
Random Forests
• TensorFlow
• developed by Google, open source since
Nov.’2015
• Artificial Neural Network, Convolutional NN,
Recurrent NN, Autoencoder, Reinforcement
Learning
Logistic Regression
• Output: Binary target (Yes/No)
• Input: Continuous/categorical variables
• Example : predicting if the tumor is malignant based on tumor size
Tumor size
(mm)
Yes
(1)
No (0)
Yes
(1)
No (0) Tumor size
(mm)
Malignant?Malignant?
Threshold : 0.5
Logistic Regression Architecture
• Hypothesis function : hθ(x) = 1 / (1 + e-z)
where z= ( θ0x0 + θ1x1 + θ2x2 + θ3x3 +,, + θnxn )
x1
x2
x3
xn
∑ f(z)
θ1
θ2
θ3
θn
x0
z hθ(x)
• What are the parameters?θ0
Python codes for Logistic Regression
#import ML algorithm
from sklearn.linear_model import LogisticRegression
#prepare train and test datasets
x_train = …
y_train = ….
x_test = ….
#select and train model
lr = LogisticRegression()
lr.fit(x_train, y_train)
#predict output
predicted = lr.predict(x_test)
Regression
x
• To predict values of a desired target
quantity when the target quantity is
continuous.
• Output: Continuous variables
• Example : predicting house price per sqft
• Types: Linear Regression, Polynomial
Regression, Decision Tree, Random Forest
• Hypothesis function:
• hθ(x) = θ0x0 + θ1x1 + θ2x2 + θ3x3 +,, + θnxn
• y = 10 + 2x
• Accuracy of Regression model
• Square Error(SE): sum (y – pred)2
• Mean Square Error (MSE) = (1/n)*SE
• Root Mean Square Error (RMSE) = square
root of MSE
• Relative MSE (rMSE) = MSE / Var(y)
• R2 = 1 - rMSE
y
Python codes for ML Linear Regression
#import ML algorithm
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
#prepare train and test datasets
x_train = …
y_train = ….
x_test = ….
#select and train model
linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)
#predict output
predicted = linear.predict(x_test)
#evaluate model (MSE)
Mean_squiared_error(y_test, predicted)
Unsupervised Machine Learning
• Not-labeled input data : no correct answers
• Exploratory
• Clustering : the assignment of set of observations into subsets
(clusters) so that data in the same clusters are more similar to each
other than to those from different clusters.
• Algorithms : k-means
• Industry implementation - the grouping of documents, music, movies
by different topics, finding customers that share similar interests based
on common purchase behaviors.
• Customer segmentation
• Document clustering
• Image segmentation
• Recommendation engines
k-means clustering
• The simplest and popular unsupervised machine learning algorithm
• To group similar data points together and discover underlying patterns. To
achieve this objective, K-means looks for a fixed number (k) of clusters in a
dataset.
• the K-means algorithm identifies k number of centroids, and then allocates
every data point to the nearest cluster, while keeping the centroids as
small as possible.
How it works
1. Randomly pick k centroids as initial
cluster centers.
2. Assign each sample to the nearest
centroid.
3. Move the centroids to the center of the
samples that were assigned to it.
4. Repeat 2 and 3 until the cluster
assignments do not change under a
tolerance.
Python codes for k-means
#import ML algorithm
from sklearn.cluster import KMeans
#prepare datasets of X whose size is (100,2)
X = …
# build k means model with 2 clusters
kmeans = KMeans(n_clusters = 2, tol=0.0001, max_iter=100)
kmeans.fit(X)
#print the results of the value of centroids
print(kmeans.cluster_centers_) # two centroids
---- array([[-0.94665068, -0.97138368],[ 2.01559419, 2.02597093]])
#print the results of the value of centroids
print(kmeans.labels_) # 100 labels based on two centroids.
---- array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
Artificial Neural Network (ANN)
• Most powerful ML
algorithm
• Game Changer
• Works very similar to
human brain – Neural
Network
• Types – DNN, CNN, RNN
• Architecture
• Artificial Neurons
• Input Layers
• Hidden Layers
• Output Layers
Human Neural Network vs Artificial
Neural Network (ANN)
➢ Human
Neurons :
Neural
Network – 100
billions
➢ Artificial
Neurons :
Artificial Neural
Networks
Artificial Neuron
• Input (x1, x2, x3,,,,xn) – numeric values that goes into neuron model
• Weight (w1, w2, w3,,,,wn) – weight of each input values
• Bias (b1) – bias of (input * weight)
• Artificial neuron
• Model (input values * weights + bias): z1 = x1w1,1 + x2w1,2 +
x3w1,3 +,,,, + xnw1,n + b1
• Activation function : a1 = f(z1)
x1
x2
x3
xn
∑
Activation
function
W1,1
W1,2
W1,3
W1,n
b1
z1 a1
Neural Network in SAS using
proc nnet
Proc nnet data=Train;
architecture mlp;
hidden 4;
hidden 2;
input x1 x2 x3 ;
target Y;
Run;
Neural Network in SAS Visual Data
Mining and Machine Learning
Python codes for DNN
#import ANN - TensorFlow
Import tensorflow as tf
X = tf.placeholder(..)
Y = tf.placeholder(..)
hidden1 = tf.layer.dense(X, 4, activation=tf.nn.relu)
hidden2 = tf.layer.dense(hidden1, 2, activation=tf.nn.relu)
logits = neuron_layer(hidden2, 2)
….
loss = tf.reduce_mean(….)
optimizer = tf.train.GradientDescentOptimezer(0.1)
traing_op = optimizer.minimizer(loss)
tf.Session.run(training_op, feed_dict={X:x_train, Y:y_train})
Tensor Flow Demo
http://playground.tensorflow.org
Where is SAS in ML?
SAS Visual Data Mining and Machine Learning
• Linear Regression
• Logistic Regression
• Support Vector Machine
• Deep Neural Networks ( limited layers)
How ML is
being used in
our daily life
Recommendation
• Amazon
• Netflix
• Spotify
Customer
Service
• Online
Chatting
• Call
Amazon Go – Cashless
Shopping
AlphaGO
Why is AI(ML) so popular now?
• Cost effective
• Automate a lot of works
• Can replace or enhance human labors
• “Pretty much anything that a normal person
can do in <1 sec, we can now automate with
AI” Andrew Ng
• Accurate
• Better than humans
• Can solve a lot of complex business
problems
Now, how Pharma goes into AI/ML
market
• GSK sign $43 million contract with
Exscientia to speed drug discovery
• aiming ¼ time and ¼ cost
• identifying a target for disease intervention to a
molecule from 5.5 to 1 year
• J&J
• Surgical Robotics – partners with Google.
Leverage AI/ML to help surgeons by interpreting
what they see or predict during surgery
Now, how Pharma goes into AI/ML
market
• Roche
• With GNS Healthcare, use ML to find novel targets
for cancer therapy using cancer patient data
• Pfizer
• With IBM, utilize Watson for drug discovery
• Watson has accumulated data from 25 million
articles compared to 200 articles a human
researcher can read in a year.
• Novartis
• With IBM Watson, developing a cognitive
solution using real-time data to gain better
insights on the expected outcomes.
• With Cota Healthcare, aiming to accelerate
clinical development of new therapies for
breast cancer.
• New CEO claimed Novartis as Medicine
and Data Science company.
Now, how Pharma goes into AI/ML
market
• First FDA approval for medical
device using AI/ML
• FDA is open for machine learning
algorithm based on devices.
How FDA sees ML/AL
ML application in Pharma R&D
• Drug discovery
• Drug candidate selection
• Clinical system optimization
• Medical image recognition
• Medical diagnosis
• Optimum site selection /
recruitment
• Data anomality detection
• Personized medicine
Adoption of AI/ML in Pharma
• Slow
• Regulatory restriction
• Machine Learning Black Box challenge
– need to build ML models, statistically
or mathematically proven and
validated, to explain final results.
• Big investment in Healthcare and a lot
of AI Start up aiming Pharma.
Healthcare AI/ML market
• US - 320 million in 2016
• Europe – 270 million in 2016
• 40% annual rate
• 10 billion in 2024
• Short in talents
PhUSE Machine Learning & AI Project
Goals
• To raise awareness in Machine Learning in
Pharma Industry
• To get more helps (volunteers and expertise)
• To Introduce Machine Learning Education
• To curate Machine Learning Materials
• White papers
• Projects
Machine Learning & AI Team
• PhUSE Wiki in
http://www.phusewiki.org/wiki/index.php?titl
e=Machine_Learning_/_Artificial_Intelligence
• Project of “Educating For the Future”
Why Machine
Learning & AI
Team?
Machine
Learning
& AI
Team
• Kevin Lee at
kevin.kyosun.lee@gmail.com
• Nicolas Dupuis at
ndupuis@protonmail.com
• Wendy Dobson at wendy@phuse.com
How to Volunteer
Kevin, do
you know
about
Machine
Learning?
Thanks!!!
Please contact
kevin.kyosun.lee@gmail.com
https://www.linkedin.com/in/
HelloKevinLee/

Weitere ähnliche Inhalte

Was ist angesagt?

Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3
butest
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender Systems
Gabriel Moreira
 

Was ist angesagt? (20)

[系列活動] 機器學習速遊
[系列活動] 機器學習速遊[系列活動] 機器學習速遊
[系列活動] 機器學習速遊
 
Machine learning
Machine learningMachine learning
Machine learning
 
Gradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learnGradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learn
 
Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3Machine Learning: Decision Trees Chapter 18.1-18.3
Machine Learning: Decision Trees Chapter 18.1-18.3
 
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender Systems
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
 
Introduction to Boosted Trees by Tianqi Chen
Introduction to Boosted Trees by Tianqi ChenIntroduction to Boosted Trees by Tianqi Chen
Introduction to Boosted Trees by Tianqi Chen
 
MS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning AlgorithmMS CS - Selecting Machine Learning Algorithm
MS CS - Selecting Machine Learning Algorithm
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
 
Random forest using apache mahout
Random forest using apache mahoutRandom forest using apache mahout
Random forest using apache mahout
 
My7class
My7classMy7class
My7class
 
Leakage in Meta Modeling And Its Connection to HCC Target-Encoding - Mathias ...
Leakage in Meta Modeling And Its Connection to HCC Target-Encoding - Mathias ...Leakage in Meta Modeling And Its Connection to HCC Target-Encoding - Mathias ...
Leakage in Meta Modeling And Its Connection to HCC Target-Encoding - Mathias ...
 
Boosted tree
Boosted treeBoosted tree
Boosted tree
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Learning to Rank with Neural Networks
Learning to Rank with Neural NetworksLearning to Rank with Neural Networks
Learning to Rank with Neural Networks
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
 
XGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competitionXGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competition
 

Ähnlich wie Machine Learning : why we should know and how it works

Ähnlich wie Machine Learning : why we should know and how it works (20)

The ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptxThe ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptx
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive models
 
Machine Learning from a Software Engineer's perspective
Machine Learning from a Software Engineer's perspectiveMachine Learning from a Software Engineer's perspective
Machine Learning from a Software Engineer's perspective
 
Machine learning from a software engineer's perspective - Marijn van Zelst - ...
Machine learning from a software engineer's perspective - Marijn van Zelst - ...Machine learning from a software engineer's perspective - Marijn van Zelst - ...
Machine learning from a software engineer's perspective - Marijn van Zelst - ...
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Machine Learning Classifiers
Machine Learning ClassifiersMachine Learning Classifiers
Machine Learning Classifiers
 
Machine learning
Machine learningMachine learning
Machine learning
 
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An IntroMachine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An Intro
 
ML SFCSE.pptx
ML SFCSE.pptxML SFCSE.pptx
ML SFCSE.pptx
 
Ml ppt at
Ml ppt atMl ppt at
Ml ppt at
 
ML basics.pptx
ML basics.pptxML basics.pptx
ML basics.pptx
 
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
 
AWS re:Invent Deep Learning: Goin Beyond Machine Learning (BDT311)
AWS re:Invent Deep Learning: Goin Beyond Machine Learning (BDT311)AWS re:Invent Deep Learning: Goin Beyond Machine Learning (BDT311)
AWS re:Invent Deep Learning: Goin Beyond Machine Learning (BDT311)
 
Machine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis IntroductionMachine Learning, Deep Learning and Data Analysis Introduction
Machine Learning, Deep Learning and Data Analysis Introduction
 
LR2. Summary Day 2
LR2. Summary Day 2LR2. Summary Day 2
LR2. Summary Day 2
 

Mehr von Kevin Lee

Introduction of AWS Cloud Computing and its future for Biometric Department
Introduction of AWS Cloud Computing and its future for Biometric DepartmentIntroduction of AWS Cloud Computing and its future for Biometric Department
Introduction of AWS Cloud Computing and its future for Biometric Department
Kevin Lee
 
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
Kevin Lee
 
Prompt it, not Google it - Prompt Engineering for Data Scientists
Prompt it, not Google it - Prompt Engineering for Data ScientistsPrompt it, not Google it - Prompt Engineering for Data Scientists
Prompt it, not Google it - Prompt Engineering for Data Scientists
Kevin Lee
 
Leading into the Unknown? Yes, we need Change Management Leadership
Leading into the Unknown? Yes, we need Change Management LeadershipLeading into the Unknown? Yes, we need Change Management Leadership
Leading into the Unknown? Yes, we need Change Management Leadership
Kevin Lee
 
Perfect partnership - machine learning and CDISC standard data
Perfect partnership - machine learning and CDISC standard dataPerfect partnership - machine learning and CDISC standard data
Perfect partnership - machine learning and CDISC standard data
Kevin Lee
 
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
Kevin Lee
 
Are you ready for Dec 17, 2016 - CDISC compliant data?
Are you ready for Dec 17, 2016 - CDISC compliant data?Are you ready for Dec 17, 2016 - CDISC compliant data?
Are you ready for Dec 17, 2016 - CDISC compliant data?
Kevin Lee
 
SAS integration with NoSQL data
SAS integration with NoSQL dataSAS integration with NoSQL data
SAS integration with NoSQL data
Kevin Lee
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmers
Kevin Lee
 

Mehr von Kevin Lee (20)

Patient’s Journey using Real World Data and its Advanced Analytics
Patient’s Journey using Real World Data and its Advanced AnalyticsPatient’s Journey using Real World Data and its Advanced Analytics
Patient’s Journey using Real World Data and its Advanced Analytics
 
Introduction of AWS Cloud Computing and its future for Biometric Department
Introduction of AWS Cloud Computing and its future for Biometric DepartmentIntroduction of AWS Cloud Computing and its future for Biometric Department
Introduction of AWS Cloud Computing and its future for Biometric Department
 
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
A fear of missing out and a fear of messing up : A Strategic Roadmap for Chat...
 
Prompt it, not Google it - Prompt Engineering for Data Scientists
Prompt it, not Google it - Prompt Engineering for Data ScientistsPrompt it, not Google it - Prompt Engineering for Data Scientists
Prompt it, not Google it - Prompt Engineering for Data Scientists
 
Leading into the Unknown? Yes, we need Change Management Leadership
Leading into the Unknown? Yes, we need Change Management LeadershipLeading into the Unknown? Yes, we need Change Management Leadership
Leading into the Unknown? Yes, we need Change Management Leadership
 
How to create SDTM DM.xpt using Python v1.1
How to create SDTM DM.xpt using Python v1.1How to create SDTM DM.xpt using Python v1.1
How to create SDTM DM.xpt using Python v1.1
 
Enterprise-level Transition from SAS to Open-source Programming for the whole...
Enterprise-level Transition from SAS to Open-source Programming for the whole...Enterprise-level Transition from SAS to Open-source Programming for the whole...
Enterprise-level Transition from SAS to Open-source Programming for the whole...
 
How I became ML Engineer
How I became ML Engineer How I became ML Engineer
How I became ML Engineer
 
Artificial Intelligence in Pharmaceutical Industry
Artificial Intelligence in Pharmaceutical IndustryArtificial Intelligence in Pharmaceutical Industry
Artificial Intelligence in Pharmaceutical Industry
 
Tell stories with jupyter notebook
Tell stories with jupyter notebookTell stories with jupyter notebook
Tell stories with jupyter notebook
 
Perfect partnership - machine learning and CDISC standard data
Perfect partnership - machine learning and CDISC standard dataPerfect partnership - machine learning and CDISC standard data
Perfect partnership - machine learning and CDISC standard data
 
Big data for SAS programmers
Big data for SAS programmersBig data for SAS programmers
Big data for SAS programmers
 
Big data in pharmaceutical industry
Big data in pharmaceutical industryBig data in pharmaceutical industry
Big data in pharmaceutical industry
 
How FDA will reject non compliant electronic submission
How FDA will reject non compliant electronic submissionHow FDA will reject non compliant electronic submission
How FDA will reject non compliant electronic submission
 
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke...
 
Are you ready for Dec 17, 2016 - CDISC compliant data?
Are you ready for Dec 17, 2016 - CDISC compliant data?Are you ready for Dec 17, 2016 - CDISC compliant data?
Are you ready for Dec 17, 2016 - CDISC compliant data?
 
SAS integration with NoSQL data
SAS integration with NoSQL dataSAS integration with NoSQL data
SAS integration with NoSQL data
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmers
 
Standards Metadata Management (system)
Standards Metadata Management (system)Standards Metadata Management (system)
Standards Metadata Management (system)
 
Data centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data developmentData centric SDLC for automated clinical data development
Data centric SDLC for automated clinical data development
 

Kürzlich hochgeladen

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 

Kürzlich hochgeladen (20)

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 

Machine Learning : why we should know and how it works

  • 1. Machine Learning – Why we should know and how it works Kevin Lee, AVP of ML & AI Consultant
  • 2. The Agenda ➢ Why Machine Learning? ➢ Introduction of Machine Learning – Supervised, Unsupervised, Deep Neural Network ➢ Machine Learning Implementation ➢ PhUSE Machine Learning Working Group ➢ Questions and Discussion
  • 3. Kevin, do you know about Machine Learning?
  • 4. Why did people ask / expect me if I know about Machine Learning? • Programming • Statistics / modeling • Working with data all the times
  • 5. What is ML? An application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
  • 7. How does Human Learn? - Experience
  • 8. How does Machine learn? Algorithm Input Data
  • 9. How does Machine learn? Algorithm Test Data (cats) Real Data cat
  • 10. How does Machine Learning work? X0 X1 X2 … Xn Y • Hypothesis Function - hθ(x) = θx + b • Minimize Cost Function, J(θ) = hθ(x) – Y, using Gradient Descent Input data Algorithm
  • 11. Machine Learning Algorithms • Hypothesis function • Model for data • Y = θ0x0 + θ1x1 + θ2x2 + θ3x3 + …. + θnxn (Y = 2X + 30) • Cost function • measures how well hypothesis function fits into data. • Difference between actual data point and hypothesized data point. • Gradient Descent • Engine that minimizes cost function
  • 12. Machine Learning Training Process 1. Select Data and hypothesis function 2. Start with random parameters of hypothesis function 3. Process first sets of data in hypothesis function 4. Find predicted values from hypothesis function 5. Calculate cost function ( actual – predicted) 6. Minimize cost function using Gradient Descent 7. Update parameters of hypothesis function based on cost function 8. Repeat steps ( 3 and 7) with next sets of data until the cost function becomes 0 or at the lowest point
  • 13. Machine finds best model using input data X Y hθ(x) = x + 30 hθ(x) = 2x + 30
  • 14. Model building in Machine Learning Hypothesis Function New Iteration Cost Function Calculation Applying Gradient Descent Updating Parameters
  • 15. Best model can provide best predicted value X Y Xi Yi
  • 16. More data, Better model, Better prediction X Y X Y Different data, Different model, Different prediction
  • 18. Machine Learning Type • Supervised - we know the correct answers • Unsupervised – no answers • Artificial Neural Network – like human neural network
  • 19. Machine Learning Type • Supervised • Input data labeled : has correct answers • Specific purpose • Types • Classification for distinct output values • Regression for continuous output values • Unsupervised • Input data not-labeled : No correct answers • Exploratory • Types : Clustering (clusters) X0 X1 X2 … Xn Y X0 X1 X2 … Xn
  • 20. Classification X1 • Categorical target • Binary or Multiple Classification • Example : Yes/No, 0 to 9, mild/moderate/severe • Logistic Regression, SVM, K-nearest Neighbor, Naïve Bayes, Decision Tree, Random Forests, XGBoost X2
  • 21. Support Vector Machine (SVM) • SVM is one of the most powerful classification model (both binary & multi), especially for complex, but small/mid-sized datasets. • Best for binary classification • Easy to train • It finds out a line/ hyper-plane (in multidimensional space that separate outs classes) that maximizes margin between two classes. • parameters – Kernel (linear, polynomial), gamma, margin(a separation of line to the closest class points) *** SAS codes; proc svmachine data=x_train C=1.0; kernel linear; input x1 x2 x3 x4 / level=interval; target y; run;
  • 22. SVM in SAS Visual Data Mining and Machine Learning SAS Machine Learning portal can provide an interactive modeling.
  • 23. SVM in SAS Visual Data Mining and Machine Learning SVM in SAS Visual Data Mining and Machine Learning
  • 24. Python codes for SVM #import ML algorithm from sklearn.svm import SVC #prepare train and test datasets x_train = … y_train = …. x_test = …. #select and train model svm = SVC(kernel=‘linear’, C=1.0, random_state=1) svm.fit(x_train, y_train) #predict output predicted = svm.predict(x_test)
  • 25. Decision Trees The widely used machine learning algorithm that use tree to visually and explicitly represent decision and its conditions. It is used for both classification (more popular) and regression. Why Decision Trees? • Easy to understand and interpret • Require a very little data preparation • Possible to validate model using statistical tests.
  • 26. Python codes for Decision Tree #import ML algorithm from sklearn.tree import DecisionClassifier #prepare train and test datasets x_train = … y_train = …. x_test = …. #select and train model d_tree = DecisionClassifier(max_depth=4) d_tree.fit(x_train, y_train) #predict output predicted = d_tree.predict(x_test)
  • 27. Decision Tree in SAS Visual Data Mining and Machine Learning
  • 28. Python in Machine Learning Most popular programming language in Machine Learning • Scikit-Learn • simple and efficient ML tool, open-source • Classification, Regression, SVM, Clustering, Dimensionality Reduction, Decision Trees, Random Forests • TensorFlow • developed by Google, open source since Nov.’2015 • Artificial Neural Network, Convolutional NN, Recurrent NN, Autoencoder, Reinforcement Learning
  • 29. Logistic Regression • Output: Binary target (Yes/No) • Input: Continuous/categorical variables • Example : predicting if the tumor is malignant based on tumor size Tumor size (mm) Yes (1) No (0) Yes (1) No (0) Tumor size (mm) Malignant?Malignant? Threshold : 0.5
  • 30. Logistic Regression Architecture • Hypothesis function : hθ(x) = 1 / (1 + e-z) where z= ( θ0x0 + θ1x1 + θ2x2 + θ3x3 +,, + θnxn ) x1 x2 x3 xn ∑ f(z) θ1 θ2 θ3 θn x0 z hθ(x) • What are the parameters?θ0
  • 31. Python codes for Logistic Regression #import ML algorithm from sklearn.linear_model import LogisticRegression #prepare train and test datasets x_train = … y_train = …. x_test = …. #select and train model lr = LogisticRegression() lr.fit(x_train, y_train) #predict output predicted = lr.predict(x_test)
  • 32. Regression x • To predict values of a desired target quantity when the target quantity is continuous. • Output: Continuous variables • Example : predicting house price per sqft • Types: Linear Regression, Polynomial Regression, Decision Tree, Random Forest • Hypothesis function: • hθ(x) = θ0x0 + θ1x1 + θ2x2 + θ3x3 +,, + θnxn • y = 10 + 2x • Accuracy of Regression model • Square Error(SE): sum (y – pred)2 • Mean Square Error (MSE) = (1/n)*SE • Root Mean Square Error (RMSE) = square root of MSE • Relative MSE (rMSE) = MSE / Var(y) • R2 = 1 - rMSE y
  • 33. Python codes for ML Linear Regression #import ML algorithm from sklearn import linear_model from sklearn.metrics import mean_squared_error #prepare train and test datasets x_train = … y_train = …. x_test = …. #select and train model linear = linear_model.LinearRegression() linear.fit(x_train, y_train) #predict output predicted = linear.predict(x_test) #evaluate model (MSE) Mean_squiared_error(y_test, predicted)
  • 34. Unsupervised Machine Learning • Not-labeled input data : no correct answers • Exploratory • Clustering : the assignment of set of observations into subsets (clusters) so that data in the same clusters are more similar to each other than to those from different clusters. • Algorithms : k-means • Industry implementation - the grouping of documents, music, movies by different topics, finding customers that share similar interests based on common purchase behaviors. • Customer segmentation • Document clustering • Image segmentation • Recommendation engines
  • 35. k-means clustering • The simplest and popular unsupervised machine learning algorithm • To group similar data points together and discover underlying patterns. To achieve this objective, K-means looks for a fixed number (k) of clusters in a dataset. • the K-means algorithm identifies k number of centroids, and then allocates every data point to the nearest cluster, while keeping the centroids as small as possible. How it works 1. Randomly pick k centroids as initial cluster centers. 2. Assign each sample to the nearest centroid. 3. Move the centroids to the center of the samples that were assigned to it. 4. Repeat 2 and 3 until the cluster assignments do not change under a tolerance.
  • 36. Python codes for k-means #import ML algorithm from sklearn.cluster import KMeans #prepare datasets of X whose size is (100,2) X = … # build k means model with 2 clusters kmeans = KMeans(n_clusters = 2, tol=0.0001, max_iter=100) kmeans.fit(X) #print the results of the value of centroids print(kmeans.cluster_centers_) # two centroids ---- array([[-0.94665068, -0.97138368],[ 2.01559419, 2.02597093]]) #print the results of the value of centroids print(kmeans.labels_) # 100 labels based on two centroids. ---- array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
  • 37. Artificial Neural Network (ANN) • Most powerful ML algorithm • Game Changer • Works very similar to human brain – Neural Network • Types – DNN, CNN, RNN • Architecture • Artificial Neurons • Input Layers • Hidden Layers • Output Layers
  • 38. Human Neural Network vs Artificial Neural Network (ANN) ➢ Human Neurons : Neural Network – 100 billions ➢ Artificial Neurons : Artificial Neural Networks
  • 39. Artificial Neuron • Input (x1, x2, x3,,,,xn) – numeric values that goes into neuron model • Weight (w1, w2, w3,,,,wn) – weight of each input values • Bias (b1) – bias of (input * weight) • Artificial neuron • Model (input values * weights + bias): z1 = x1w1,1 + x2w1,2 + x3w1,3 +,,,, + xnw1,n + b1 • Activation function : a1 = f(z1) x1 x2 x3 xn ∑ Activation function W1,1 W1,2 W1,3 W1,n b1 z1 a1
  • 40. Neural Network in SAS using proc nnet Proc nnet data=Train; architecture mlp; hidden 4; hidden 2; input x1 x2 x3 ; target Y; Run;
  • 41. Neural Network in SAS Visual Data Mining and Machine Learning
  • 42. Python codes for DNN #import ANN - TensorFlow Import tensorflow as tf X = tf.placeholder(..) Y = tf.placeholder(..) hidden1 = tf.layer.dense(X, 4, activation=tf.nn.relu) hidden2 = tf.layer.dense(hidden1, 2, activation=tf.nn.relu) logits = neuron_layer(hidden2, 2) …. loss = tf.reduce_mean(….) optimizer = tf.train.GradientDescentOptimezer(0.1) traing_op = optimizer.minimizer(loss) tf.Session.run(training_op, feed_dict={X:x_train, Y:y_train})
  • 44. Where is SAS in ML? SAS Visual Data Mining and Machine Learning • Linear Regression • Logistic Regression • Support Vector Machine • Deep Neural Networks ( limited layers)
  • 45. How ML is being used in our daily life
  • 46.
  • 49. Amazon Go – Cashless Shopping
  • 51.
  • 52. Why is AI(ML) so popular now? • Cost effective • Automate a lot of works • Can replace or enhance human labors • “Pretty much anything that a normal person can do in <1 sec, we can now automate with AI” Andrew Ng • Accurate • Better than humans • Can solve a lot of complex business problems
  • 53. Now, how Pharma goes into AI/ML market • GSK sign $43 million contract with Exscientia to speed drug discovery • aiming ¼ time and ¼ cost • identifying a target for disease intervention to a molecule from 5.5 to 1 year • J&J • Surgical Robotics – partners with Google. Leverage AI/ML to help surgeons by interpreting what they see or predict during surgery
  • 54. Now, how Pharma goes into AI/ML market • Roche • With GNS Healthcare, use ML to find novel targets for cancer therapy using cancer patient data • Pfizer • With IBM, utilize Watson for drug discovery • Watson has accumulated data from 25 million articles compared to 200 articles a human researcher can read in a year.
  • 55. • Novartis • With IBM Watson, developing a cognitive solution using real-time data to gain better insights on the expected outcomes. • With Cota Healthcare, aiming to accelerate clinical development of new therapies for breast cancer. • New CEO claimed Novartis as Medicine and Data Science company. Now, how Pharma goes into AI/ML market
  • 56. • First FDA approval for medical device using AI/ML • FDA is open for machine learning algorithm based on devices. How FDA sees ML/AL
  • 57. ML application in Pharma R&D • Drug discovery • Drug candidate selection • Clinical system optimization • Medical image recognition • Medical diagnosis • Optimum site selection / recruitment • Data anomality detection • Personized medicine
  • 58. Adoption of AI/ML in Pharma • Slow • Regulatory restriction • Machine Learning Black Box challenge – need to build ML models, statistically or mathematically proven and validated, to explain final results. • Big investment in Healthcare and a lot of AI Start up aiming Pharma.
  • 59. Healthcare AI/ML market • US - 320 million in 2016 • Europe – 270 million in 2016 • 40% annual rate • 10 billion in 2024 • Short in talents
  • 60. PhUSE Machine Learning & AI Project Goals • To raise awareness in Machine Learning in Pharma Industry • To get more helps (volunteers and expertise) • To Introduce Machine Learning Education • To curate Machine Learning Materials • White papers • Projects
  • 61. Machine Learning & AI Team • PhUSE Wiki in http://www.phusewiki.org/wiki/index.php?titl e=Machine_Learning_/_Artificial_Intelligence • Project of “Educating For the Future”
  • 64. • Kevin Lee at kevin.kyosun.lee@gmail.com • Nicolas Dupuis at ndupuis@protonmail.com • Wendy Dobson at wendy@phuse.com How to Volunteer