6. Here is what you should NOT do when you start studying machine learning in Python.
1. Get really good at Python programming and Python syntax.
2. Deeply understand the underlying theory and parameters for machine learning algorithms in
scikit-learn*.
3. Avoid or lightly touch on all of the other tasks needed to complete a real project.
- J. Brownlee in his book
“Machine Learning Mastery with Python”
* https://scikit-learn.org/stable/
Don’ts of Getting Started
8. Data Structures in Python - An Overview
● list: mutable sequence of values
● tuple: immutable sequence of values
● dict: collection of key-value pairs
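A minimal sketch of the three structures above. The variable names and values (scores, shape, params) are made up for illustration only.

```python
# list: a mutable, ordered sequence
scores = [0.72, 0.85, 0.91]
scores.append(0.95)      # lists can grow in place
scores[0] = 0.75         # and individual items can be reassigned

# tuple: an immutable, ordered sequence
shape = (150, 4)         # e.g., (rows, columns) of a dataset
try:
    shape[0] = 200       # item assignment is not allowed on a tuple
except TypeError:
    tuple_is_immutable = True

# dict: a collection of key-value pairs
params = {"n_estimators": 100, "max_depth": 3}
params["max_depth"] = 5  # values are looked up and updated by key
```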
10. “Machine Learning is the interaction between theoretically sound computer science and practically
noisy data. Essentially, it’s about machines making sense out of data
in much the same way as humans do.
Machine learning is a type of artificial intelligence whereby
an algorithm or method extracts patterns from data.”
- Matthew Kirk in his book
“Thoughtful Machine Learning with Python”
Machine Learning Definitions
12. “ML is a multi-disciplinary approach, involving several scientific domains (e.g., mathematics, computer
science, physics, biology, etc.), that enable computers to automatically learn from data. By learning we
mean here a process that takes as input data and gives as output algorithms capable of performing, over
the same kind of data, a desired task.”
- From the book
“Machine Learning for Audio, Image and Video Analysis, 2e”
Machine Learning Definitions
13. “You will find it difficult to describe your mother’s face accurately enough for your friend to recognize
her in a supermarket. But if you show him a few of her photos, he will immediately spot the tell-tale
traits he needs. As they say, a picture- an example -is worth a thousand words.
This is what we want our technology to emulate. Unable to define certain objects or concepts with
accuracy, we want to convey them to the machine by ways of examples. For this to work, however, the
computer has to be able to convert the examples into knowledge.
Hence our interest in algorithms and techniques for machine learning …”
- From the book
“An Introduction to Machine Learning, 2e”
Machine Learning Definitions
14. “Machine learning is functionality that helps software perform a task
without explicit programming or rules.”
- Google
Machine Learning Definitions
15. - Image by
“Amazon Machine Learning on SlideShare”
What Explicit Programming Looks Like
24. - Image by
“Amazon Machine Learning on SlideShare”
Machine Learning - Ditch Explicit Programming
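To make the contrast concrete, here is a small sketch, not taken from the slides. The temperature readings, the 30-degree threshold, and the function name is_hot_explicit are all hypothetical. The first approach hard-codes a rule; the second lets a scikit-learn decision tree infer a comparable rule from labeled examples.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Explicit programming: a human writes the rule by hand.
def is_hot_explicit(temperature_c):
    return int(temperature_c > 30)

# Machine learning: the rule is inferred from labeled examples.
# Hypothetical readings in Celsius, labeled 1 for "hot" and 0 otherwise.
temperatures = np.array([[10], [15], [20], [25], [28], [32], [35], [40]])
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1])

# A depth-1 tree can only learn a single threshold on the one feature.
model = DecisionTreeClassifier(max_depth=1, random_state=0)
model.fit(temperatures, labels)

new_readings = np.array([[12], [29], [31], [38]])
learned = model.predict(new_readings).tolist()
explicit = [is_hot_explicit(t) for t in new_readings.ravel()]
```

On these readings both approaches agree, but only the first required someone to know the threshold in advance.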
29. “Facebook asks you to list your hometown and your current location, ostensibly to make it easier for
your friends to find and connect with you. But it also analyzes these locations to identify global
migration patterns and where the fanbases of different football teams live.”
- From the book
“Data Science from Scratch”
Understanding Data Science
32. Getting Started with ML Algorithms
● Regression: It is used to predict continuous variables (e.g., salary, weight, or temperature). It
is a form of supervised learning, since the model learns to map inputs to outputs from labeled
input-output pairs.
Popular regression algorithms include Linear Regression, Lasso, Decision Tree, Random Forest,
Quantile Regression, and the gradient-boosting libraries eXtreme Gradient Boosting (XGBoost),
CatBoost, and LightGBM.
● Classification: It is used to assign data points to categories (e.g., spam filtering, or predicting
whether people survive a building collapse). It is also a form of supervised learning.
Popular classification algorithms include Logistic Regression (yes, really: despite its name, it is a
classifier), Support Vector Machines (SVM), Naive Bayes, XGBoost, and K-Nearest Neighbors (KNN).
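The two supervised tasks above can be sketched with scikit-learn as follows. This is an illustrative example on synthetic data (make_regression and make_classification), not a recipe from the slides. Linear and Logistic Regression are picked only because they are the simplest entries in each list.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Regression: the target is a continuous number.
X_reg, y_reg = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_reg, y_reg, random_state=0)
regressor = LinearRegression().fit(Xr_train, yr_train)
reg_predictions = regressor.predict(Xr_test)   # real-valued predictions

# Classification: the target is a category (here 0 or 1).
X_clf, y_clf = make_classification(n_samples=200, n_features=4, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(X_clf, y_clf, random_state=0)
classifier = LogisticRegression().fit(Xc_train, yc_train)
clf_predictions = classifier.predict(Xc_test)  # predicted class labels
accuracy = classifier.score(Xc_test, yc_test)  # fraction of correct labels
```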
33. Getting Started with ML Algorithms
● Clustering: It involves grouping data points into ‘clusters’ on the basis of their similarity (e.g.,
grouping customers by their purchasing behavior). It is a form of unsupervised learning, since the
data comes without predefined labels and the algorithm has to find the hidden structure itself.
Popular clustering algorithms include K-Means, Hierarchical clustering, DBSCAN, and Gaussian
Mixture Models (GMM).
● Dimensionality Reduction: As the name suggests, it reduces the number of input dimensions
(variables). It can also be used to obscure confidential data (e.g., banking transactions), because
the transformed features no longer correspond directly to the original variables. It is a form of
unsupervised learning.
Popular dimensionality reduction algorithms include t-SNE (t-distributed Stochastic Neighbor
Embedding) and PCA (Principal Component Analysis).
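Both unsupervised tasks above can be tried on the Iris dataset bundled with scikit-learn. This is a minimal sketch. The choice of three clusters and two components is an assumption made for this dataset, not a general rule.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data          # 150 samples, 4 features; the labels are ignored

# Clustering: group similar samples without using any labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)   # one cluster index per sample

# Dimensionality reduction: project 4 features down to 2.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
variance_kept = pca.explained_variance_ratio_.sum()  # share of variance retained
```

The two-dimensional output X_2d is convenient for a scatter plot colored by cluster_ids.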
36. Getting Started with Core Libraries
● NumPy: NumPy (Numerical Python) is used for numerical computing, including linear algebra (e.g.,
matrix operations). In NumPy, vectors and matrices are represented as n-dimensional ‘arrays’.
● Pandas: Pandas is useful for analyzing tabular data. It can also perform basic visualization
of the data (e.g., line, scatter, box, and bar plots).
● Matplotlib: One of the most comprehensive data visualization libraries for Python. Almost any
graph can be plotted using Matplotlib. However, the default graphs look plain and usually need
additional customization.
● Seaborn: Seaborn is a data visualization library built on top of Matplotlib. It requires less
code for many statistical plots, and its default styling is more visually appealing.
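A short sketch of NumPy and Pandas in action. The matrix, the vector, and the height/weight table are made-up values for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: matrices and vectors are arrays.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])
product = A @ b                # matrix-vector product
x = np.linalg.solve(A, b)      # solves the linear system A x = b

# Pandas: labeled, tabular data with built-in summary statistics.
df = pd.DataFrame({"height_cm": [160, 172, 181], "weight_kg": [55, 70, 82]})
mean_weight = df["weight_kg"].mean()
tallest = df.loc[df["height_cm"].idxmax()]   # row with the largest height
```

With Matplotlib installed, calling df.plot.scatter(x="height_cm", y="weight_kg") would draw the basic scatter plot mentioned above.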
37. Getting Started with Core Libraries
● Plotly: The plots produced by Matplotlib and Seaborn are static by default, while Plotly yields
interactive plots that respond to hovering, zooming, and panning.
● Scikit-Learn: Often referred to as the Swiss Army knife of machine learning, Scikit-Learn (sklearn)
can perform a multitude of ML operations on the data, including data scaling, clustering,
dimensionality reduction, and encoding (converting categorical variables into numeric variables).
● SciPy: A go-to library for performing statistical operations on the data (e.g., calculating
skewness and kurtosis, working with probability distributions, and hypothesis testing).
● OpenCV: OpenCV (opencv-python) is a library for image processing and computer vision.
Applications such as pose estimation and face mask detection can be built with it.
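The scaling, encoding, and statistics operations mentioned above can be sketched as follows. The ages, colors, and sample values are hypothetical, and the example covers only a small part of what Scikit-Learn and SciPy offer.

```python
import numpy as np
from scipy import stats
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Scikit-Learn: scale a numeric column to zero mean and unit variance.
ages = np.array([[22.0], [35.0], [47.0], [58.0]])
scaled_ages = StandardScaler().fit_transform(ages)

# Scikit-Learn: encode a categorical column as numeric indicator columns.
colors = np.array([["red"], ["green"], ["red"], ["blue"]])
encoder = OneHotEncoder()
encoded = encoder.fit_transform(colors).toarray()  # one column per category

# SciPy: descriptive statistics and a hypothesis test.
sample = np.array([1.0, 2.0, 2.0, 3.0, 9.0])
skewness = stats.skew(sample)              # positive for a long right tail
excess_kurtosis = stats.kurtosis(sample)   # Fisher definition (normal = 0)
t_stat, p_value = stats.ttest_ind(sample, sample + 5.0)  # two-sample t-test
```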