“Goal - Become a Data Scientist”
info@zekeLabs.com | www.zekeLabs.com | +91 8095465880
“A Dream becomes a Goal when action is taken towards its achievement” - Bo Bennett
2. Essential Statistics & Maths - 5 hrs
● Relationships - Deterministic vs Statistical
● Statistics - Descriptive vs Inferential
● Sampling
● Variables
● Distribution
● Summarizing Distribution
● Correlation, Collinearity, Causation
● Probability
● Normal Distribution
● Confidence Interval
● Hypothesis Testing
● Calculus
● Linear Algebra
● Matrix Ops
3. Pandas & scipy for Data Wrangling & Statistics - 5 hrs
● Series vs DataFrames
● Loading CSV, JSON, DB etc.
● Access & Filters
● DataFrame
● Exploratory Data Analysis
● Finding & Handling Missing Data
● Duplicate Handling
● Rolling averages
● Applying functions
● Handling Time Series Data
● Merging & Grouping Data
● Pivot Table & Crosstab
● Random data using scipy
● Comparing datasets using scipy
● Analyzing sample using scipy
● Kernel Density Estimation using scipy
4. Data Visualization - 4 hrs
● Understanding matplotlib
● Plotting Quantitative data
● Plotting Qualitative data
● Histograms
● Frequency Polygons
● Box-Plots
● Bar charts
● Line Graphs
● Scatter Plots
● 3D Plots
● Exploring seaborn & Bokeh
● Introduction to Tableau
● Plotting scatter plot
● Bubble chart
● Bullet chart
● Gantt chart
8. Feature Selection - 2 hrs
● SelectKBest for Regression
● SelectKBest for Classification
● Variance Threshold
● Dropping highly correlated features
● Dropping based on non-null values
● SelectFromModel
● Feature Selection using RandomForest
● Based on correlation with target
● Univariate Feature Selection
● Recursive Feature Elimination
9. Model Evaluation - 1 hr
● Why do we need to evaluate at all?
● Metrics for Classification
● Metrics for Regression
● Clustering metrics
● Probability Calibration
● Pairwise metrics
10. Model Selection - 1 hr
● Motivation
● KFold
● StratifiedKFold
● Splitting training testing data
● Cross Validate
● GridSearchCV
● RandomizedSearchCV
11. Linear Regression - 3 hrs
● Understanding Ordinary Least Squares
● Cost Function
● Bias & Variance
● Coefficients & Intercept
● Simple Linear Regression
● Polynomial Linear Regression
● Ridge
● Lasso
● Elastic Net
● Stochastic Gradient Descent
● Robust Regression
● Problem - Insurance Payout Prediction
12. Logistic Regression - 2 hrs
● Basics of Logistic Regression
● Sigmoid
● Cost Function
● Understanding important hyperparameters
● Predicting linear separator
● Predicting nonlinear decision boundary
● Handling Imbalanced classes
● Project - Predicting whether income is below or above 50K
13. Naive Bayes - 2 hrs
● Bayes Theorem
● Gaussian Naive Bayes
● Multinomial Naive Bayes
● Bernoulli Naive Bayes
● Out-of-core Naive Bayes using partial_fit
● Limitations of Naive Bayes
● Choosing right
● Problem - Mail data classification
14. Trees - 2 hrs
● Understanding Information Theory
● Entropy
● Decision Tree creation
● Tree for Classification
● Tree for Regression
● Advantages of Decision Tree
● Important Hyper-parameters
● Limitations of Decision Tree
15. Ensemble Methods - 3 hrs
● Bagging vs Boosting
● Forests
● AdaBoost
● XGBoost
● Gradient Tree Boosting
● Voting Classifier
● Role of weak estimators
● Problem - Attack detection on network data
17. Support Vector Machine - 3 hrs
● Understanding SVM
● Classification
● Regression
● OneClassSVM
● Imbalanced Classes
● Kernel Functions
● Understanding Maths behind it
● Problem - Face recognition
17b. Novelty & Outlier Detection - 1 hr
● Novelty vs Outlier
● OneClassSVM
● Fitting data in an Elliptic Envelope
● Isolation Forest
● Local Outlier Factor
● When to use what
19. Deployment & Scaling - 3 hrs
● Bottom-Up approach for dealing with large data
● Extracting features using Hashing Techniques
● Incremental learning
● Serializing data for quicker access
● Running as a Python .egg or wheel
● Model behind REST server
● Persisting & Loading model
● Deploying model behind web application
20. Use Cases
● Credit Risk - Predicting Defaulters
● Amazon Food Review Sentiment
● Predicting Employee Attrition
● Identifying characters in an unknown language
● Predicting insurance payout amount
● Text Categorization
● Churn Prediction
● Attack Prediction on network data
● Identifying faces
● Predicting patient stay in hospital
Visit www.zekeLabs.com for more details
Let us know how we can help your organization upskill its employees to
stay updated in the ever-evolving IT industry.
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com