9. Why Probability & Statistics?
• To judge whether data is meaningful, using methods such as:
• optimization,
• inference,
• testing,
• and others.
• This lets us analyze patterns in data and use them to predict, understand, and improve results.
Probability: to predict the likelihood of future events.
Statistics: to analyze the frequency of past events.
22. Neural Network
• Purpose: to find the output of a complex input by modeling the relationship between inputs and outputs
• To find patterns in data (pattern
recognition)
• To predict sampled functions when no closed form of the function is given (function estimation)
• Mathematically speaking
Let the input be x = (x_1, x_2, x_3, …, x_n) and the weights w = (w_1, w_2, w_3, …, w_n); then the activation is
a = ∑_{i=1}^{n} x_i w_i = x_1 w_1 + x_2 w_2 + x_3 w_3 + ⋯ + x_n w_n
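The weighted sum above can be sketched in a few lines of Python. The input and weight values here are hypothetical; a real network would also add a bias term and pass the sum through a nonlinearity:

```python
def weighted_sum(x, w):
    """Neuron pre-activation: a = sum of x_i * w_i over paired inputs and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))

# Hypothetical example values (weights would normally be learned):
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.1]
a = weighted_sum(x, w)  # 1*0.5 + 2*(-0.25) + 3*0.1 ≈ 0.3
```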
23. Classification
• Decision Tree
• Random Forest
• Naïve Bayes
• Support Vector
Machine
• k-Nearest Neighbor
• Neural Network
(Diagram: all of the above are supervised learning methods, where a task is given; they build on probability & statistics, linear algebra, and calculus.)
25. Decision Tree
• Purpose: to model the relationships among the features and the possible outcomes as a tree structure
• Mathematically speaking
• Gini impurity
• Information gain
• Variance reduction
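Of the splitting criteria above, Gini impurity is the simplest to sketch: it measures how mixed the class labels at a node are, and a split is chosen to reduce it. A minimal Python version:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_c^2) over the class proportions p_c at a node."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

gini(["a", "a", "b", "b"])  # 0.5 (maximally mixed for two classes)
gini(["a", "a", "a"])       # 0.0 (a pure node)
```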
27. Random Forest
• Purpose: to model the relationships among the features and the possible outcomes with an ensemble of decision trees, each trained on a bootstrap sample of the data with a random subset of the features (bootstrap aggregation, i.e. BAGGING, not boosting)
• Mathematically speaking
• Prediction (an average over the m trees):
ŷ = (1/m) ∑_{j=1}^{m} ∑_{i=1}^{n} W_j(x_i, x′) y_i
with W(x_i, x′) = 1/k′ if x_i is one of the k′ points in the same leaf as x′, and 0 otherwise.
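The two ensemble ingredients, bootstrap sampling and vote averaging, can be sketched in Python. This is a toy illustration, not a full forest; the per-tree predictions at the end are hypothetical:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) points with replacement: one tree's training set."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """The forest's class prediction: the most common vote across trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)  # some items repeat, some are left out
majority_vote(["cat", "dog", "cat"])             # hypothetical votes from 3 trees -> "cat"
```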
29. Naïve Bayes
• Purpose: to describe the probability of events and how probabilities should be revised in light of additional information
• Mathematically speaking
Bayes’ theorem: P(A | B) = P(B | A) P(A) / P(B)
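Bayes' theorem is a one-line computation. The prior, likelihood, and evidence values below are hypothetical numbers chosen only to illustrate the update:

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    return likelihood * prior / evidence

# Hypothetical values: P(H) = 0.1, P(E | H) = 0.8, P(E) = 0.2
posterior(0.1, 0.8, 0.2)  # ≈ 0.4: the evidence raised the probability of H
```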
31. Support Vector Machine
• Purpose: to create a flat boundary called a hyperplane, which divides the space into fairly homogeneous partitions on either side
• Mathematically speaking
Polynomial kernel of degree 2: K(x_i, x_j) = (x_i · x_j)²
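The degree-2 polynomial kernel above is just a dot product squared; a minimal sketch:

```python
def poly_kernel(xi, xj, degree=2):
    """Polynomial kernel K(xi, xj) = (xi . xj)^degree (no constant term here)."""
    dot = sum(a * b for a, b in zip(xi, xj))
    return dot ** degree

poly_kernel([1, 2], [3, 4])  # (1*3 + 2*4)^2 = 121
```

The kernel lets the SVM behave as if the data were mapped into a higher-dimensional space without ever computing that mapping explicitly (the "kernel trick").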
33. k-Nearest Neighbor
• Purpose: to classify a data point into a class based on its similarity to its nearest neighbors
• Mathematically speaking
ŷ = argmax_y p(y | x, D)
ŷ = majority vote (the prediction)
D = the set of points inside the circle
p(y | x, D) = the proportion of each class among the k nearest points
We want to find a class for the question-tagged object. HOW?
(Figure: scatter plot over axes x and y showing classes A and B and the query object.)
1. Draw a circle that captures the k nearest neighbors of the object (the prediction is a majority vote of those k points; refer to the figure).
2. Repeat the process.
3. Calculate the distance between the object and its nearest neighbors.
4. Find the probability of each class as the proportion of points among the k nearest (refer to the figure: the share of A and B inside the circles), or write it mathematically as above.
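The steps above fit in a few lines of Python. The training points and labels here are hypothetical:

```python
import math
from collections import Counter

def knn_predict(query, data, k):
    """data: list of (point, label) pairs. Vote among the k closest points."""
    by_dist = sorted(data, key=lambda pl: math.dist(query, pl[0]))
    labels = [label for _, label in by_dist[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical 2-D training set with classes "A" and "B":
train = [((0, 0), "A"), ((0, 1), "A"), ((5, 5), "B"), ((6, 5), "B")]
knn_predict((1, 1), train, k=3)  # two of the 3 nearest are "A" -> "A"
```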
36. k-Means Clustering
• Purpose: to minimize the differences within each cluster and maximize the differences between the clusters
• Mathematically speaking
• Data assignment step: assign each point to the cluster of its nearest centroid,
argmin_{c_i ∈ C} dist(c_i, x)²
C = the set of centroids c_i
x = a data point
• Centroid update step:
c_i = (1 / |S_i|) ∑_{x_i ∈ S_i} x_i
S_i = the set of data points assigned to the i-th cluster centroid
x_i = a data point
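One iteration of the two steps can be sketched in Python. This toy version assumes 2-D points and that no cluster ends up empty; the data and starting centroids are hypothetical:

```python
import math

def assign(points, centroids):
    """Assignment step: each point goes to the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        i = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        clusters[i].append(p)
    return clusters

def update(clusters):
    """Update step: each centroid moves to the mean of its assigned points."""
    return [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in clusters]

points = [(0.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 9.0)]
clusters = assign(points, [(0.0, 0.0), (10.0, 10.0)])
update(clusters)  # [(0.0, 0.5), (9.5, 9.0)]
```

In practice the two steps are repeated until the assignments stop changing.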
38. Association Rules
• Purpose: to suggest possible outcomes by making predictions from historical input (e.g. recommending a product to buy)
• Mathematically speaking
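A minimal sketch of the two standard rule metrics, support and confidence, on hypothetical basket data:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs ∪ rhs) / support(lhs)."""
    return support(transactions, lhs | rhs) / support(transactions, lhs)

# Hypothetical shopping baskets:
baskets = [{"milk", "bread"}, {"milk", "eggs"}, {"bread", "eggs"}, {"milk", "bread", "eggs"}]
support(baskets, {"milk", "bread"})        # 2 of 4 baskets -> 0.5
confidence(baskets, {"milk"}, {"bread"})   # 0.5 / 0.75 ≈ 0.67
```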
41. It’s not about equations
BUT
mathematical intuitions, which help you to:
• Choose the right algorithm for the problem
• Make good choices on parameter settings and validation strategies
• Recognize underfitting or overfitting
• Troubleshoot poor or ambiguous results
• Put appropriate bounds of confidence or uncertainty on results
• Do a better job of coding algorithms or incorporating them into
more complex analysis pipelines
42. And the most important thing is…
Understand the concept
NOT
The packages
43. Want to deep dive into a particular algorithm?
Simply request! :D