1. Machine Learning Algorithms to Detect Objects in Microscope Images
• What was this project about?
• Finding the locations of small objects in microscope images
• How did I do it?
• Used a machine learning algorithm called the Probabilistic Boosting Tree
• Major thing I learned?
• How to code machine learning algorithms from scratch and tune their parameters
Typical noisy synthetic images used as training and test data.
2. Machine Learning Algorithms to Detect Objects in Microscope Images
• Coded the algorithm from scratch in Python: https://github.com/souravc83/pbtspot
• Extension of the Viola-Jones face detection algorithm
• Probabilistic Boosting Tree
• Decision tree with a strong AdaBoost classifier at each node
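The idea of a decision tree with a boosted classifier at each node can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the pbtspot repository: the weak learners are decision stumps, the two features per candidate patch are made up, and the names `train_pbt`, `pbt_posterior`, and the `eps` routing margin are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] >= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best

def stump_predict(stump, x):
    _, f, t, pol = stump
    return pol if x[f] >= t else -pol

def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost over stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clip to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def q_positive(model, x):
    """Approximate q(+1|x) from the boosted score via a logistic link."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1.0 / (1.0 + math.exp(-2.0 * score))

def train_pbt(X, y, depth=2, eps=0.1):
    """Each internal node holds a boosted classifier that routes samples to children."""
    pos = sum(1 for yi in y if yi == 1)
    node = {"prior": pos / len(y), "model": None, "left": None, "right": None}
    if depth == 0 or pos in (0, len(y)):
        return node  # leaf: only the empirical positive fraction is stored
    node["model"] = train_adaboost(X, y)
    left, right = [], []
    for xi, yi in zip(X, y):
        q = q_positive(node["model"], xi)
        if q <= 0.5 + eps:   # confident negatives and ambiguous samples
            left.append((xi, yi))
        if q >= 0.5 - eps:   # confident positives and ambiguous samples
            right.append((xi, yi))
    for name, subset in (("left", left), ("right", right)):
        if subset:
            Xs, ys = zip(*subset)
            node[name] = train_pbt(Xs, ys, depth - 1, eps)
    return node

def pbt_posterior(node, x):
    """Blend the children's posteriors, weighted by this node's q(+1|x)."""
    if node["model"] is None:
        return node["prior"]
    q = q_positive(node["model"], x)
    p_left = pbt_posterior(node["left"], x) if node["left"] else node["prior"]
    p_right = pbt_posterior(node["right"], x) if node["right"] else node["prior"]
    return q * p_right + (1.0 - q) * p_left

# Toy data (made up): the label depends only on the first feature.
X = [[i / 10, (i * 7 % 10) / 10] for i in range(20)]
y = [1 if x[0] >= 1.0 else -1 for x in X]
tree = train_pbt(X, y)
p_spot = pbt_posterior(tree, [1.5, 0.3])
p_background = pbt_posterior(tree, [0.2, 0.3])
```

On this separable toy set the root classifier alone splits the classes, so `p_spot` comes out above 0.5 and `p_background` below it. Deeper levels only start to matter when the root classifier is uncertain about some samples.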
ROC curves for different signal-to-noise ratios (SNR)
Decision tree separates positive (in red) and negative (in blue) examples
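An ROC curve like the ones mentioned on this slide can be computed from classifier scores by sweeping a threshold. The sketch below is only an illustration: the scores and labels are invented, and the helper names `roc_points` and `auc` are hypothetical. To compare SNR levels, one would presumably run it once per test set.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, threshold swept high to low."""
    pos = sum(1 for label in labels if label == 1)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Tied scores share one threshold, so emit a point only when the score changes.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Invented scores for six candidate detections; 1 = object, -1 = background.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, -1, 1, -1, -1]
curve = roc_points(scores, labels)
area = auc(curve)  # 8/9 for this toy example
```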
3. Predicting Box Office Success
• What was this project about?
• A web application that ranks movies currently in theaters
• Predicts box office revenue based on metadata
• How did I do it?
• Scraped movie metadata from raw HTML and fit a regression model
• Major thing I learned?
• Data cleaning and feature extraction
Movie                      Rating
Guardians of the Galaxy    40.4
Dolphin Tale 2             30.2
The Maze Runner            28.7
Into the Storm             25.4
Raw data: What I start with
Final form: What I end up with
See the app at:
http://moviemetr.herokuapp.com
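The scrape-then-regress workflow described on this slide might look roughly like the sketch below. Everything in it is an assumption made for illustration: the HTML snippet, the movie names, and the numbers are made up, a single budget feature stands in for the real metadata, and a one-variable least-squares fit stands in for whatever regression model the app uses.

```python
import re

# Hypothetical raw HTML; the real pages and their markup are not shown in the slides.
RAW_HTML = """
<tr><td class="title">Movie A</td> <td class="budget">$10,000,000</td> <td class="gross">$25,000,000</td></tr>
<tr><td class="title">Movie B</td> <td class="budget">$20,000,000</td> <td class="gross">$45,000,000</td></tr>
<tr><td class="title">Movie C</td> <td class="budget">$40,000,000</td> <td class="gross">$85,000,000</td></tr>
"""

ROW = re.compile(
    r'<td class="title">(.*?)</td>\s*'
    r'<td class="budget">\$([\d,]+)</td>\s*'
    r'<td class="gross">\$([\d,]+)</td>'
)

def parse_rows(html):
    """Data cleaning: pull out the fields and convert '$10,000,000' to 10.0 (millions)."""
    rows = []
    for title, budget, gross in ROW.findall(html):
        rows.append({
            "title": title.strip(),
            "budget_musd": int(budget.replace(",", "")) / 1e6,
            "gross_musd": int(gross.replace(",", "")) / 1e6,
        })
    return rows

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rows = parse_rows(RAW_HTML)
slope, intercept = fit_line([r["budget_musd"] for r in rows],
                            [r["gross_musd"] for r in rows])
predicted_gross = slope * 30.0 + intercept  # forecast for a made-up 30M budget
```

Regular expressions are brittle against markup changes, which is one reason a parser such as BeautifulSoup is usually the first tool to reach for and regexes are kept for cleaning individual fields.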
4. Predicting Box Office Success
• Data manipulation: Pandas
• Machine learning: scikit-learn
• Storing data: SQLite
• Data scraping from raw HTML: BeautifulSoup and regular expressions
• Using the Rotten Tomatoes API
• CSS/HTML: Bootstrap
• Visualization: D3.js
• Querying data: SQLite
• Python to HTML: Flask
Source code at:
https://github.com/souravc83/MovieMetR
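As a rough illustration of the "storing data" and "querying data" steps, the sketch below uses Python's built-in `sqlite3` module with the four ratings shown on the previous slide. The table name `ratings`, its columns, and the `top_movies` helper are assumptions, not the schema used in MovieMetR.

```python
import sqlite3

# An in-memory database keeps the example self-contained; the app presumably uses a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (title TEXT PRIMARY KEY, rating REAL)")

# Ratings as listed on the earlier slide.
movies = [
    ("Guardians of the Galaxy", 40.4),
    ("Dolphin Tale 2", 30.2),
    ("The Maze Runner", 28.7),
    ("Into the Storm", 25.4),
]
conn.executemany("INSERT INTO ratings VALUES (?, ?)", movies)
conn.commit()

def top_movies(connection, limit):
    """Return the highest-rated movies as (title, rating) tuples."""
    cursor = connection.execute(
        "SELECT title, rating FROM ratings ORDER BY rating DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()

ranked = top_movies(conn, 3)
```

A list like `ranked` is the kind of result a Flask view could pass to an HTML template for display.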
5. What does the site look like?
Comparing our ratings with Rotten Tomatoes user ratings
6. Examples of Past Research
Image processing: measured deformation in animal skin (wrote source code in MATLAB and C++)
Quantified ordering in periodic structures using Fourier transforms (Phys. Rev. E, 2012)
Built numerical models to analyze instabilities in fluid mechanics (Soft Matter, 2014)
Currently, I am a data scientist at Western Digital. I use machine learning techniques to predict hard drive failures and identify critical issues.