Use of data science for startups_Sept 2021
1. Bohitesh Misra
Co-Founder and Director
Decisiontree Endeavour Pvt Ltd (www.ndtepl.com)
Bohitesh.misra@gmail.com
#ITNEXT100 #Eminent CIOs of India #CIO200
3. Bohitesh Misra
Health passports - Mobile apps that indicate a person's relative
level of infection risk and whether they can gain access to
buildings, supermarkets, restaurants, public spaces and
transportation. Ex. Aarogya Setu
Embedded AI - It has the potential to increase the accuracy, insights
and intelligence gained from current and next-generation sensors.
Responsible AI - Its purpose is to assist businesses in making more
ethical, balanced business decisions by attempting to reduce bias.
It can also help identify fake news.
Generative AI - The technology most often used for creating “deepfake”
videos and other digital content.
AI-augmented development - Its purpose is to improve the cycle times
of application and DevOps teams so that they create high-quality software
faster and more consistently.
(C) Bohitesh Misra 3
4. Bohitesh Misra
By 2022, at least 40% of new application development projects will
have artificial intelligence co-developers on the team.
By 2022, 10% of new vehicles will have autonomous driving
capabilities, compared with less than 1% in 2018.
By 2030, blockchain will create $3.1 trillion in business value.
Through 2028, storage, computing and advanced AI and analytics
technologies will expand the capabilities of edge devices.
By 2022, 100 million consumers will shop in Augmented Reality.
By 2022, more than 50% of all people collaborating in Industry 4.0
ecosystems will use virtual assistants or intelligent agents to
interact more naturally with their surroundings and with people.
6. Bohitesh Misra
The Internet of Things (IoT) refers to the ever-growing network of physical objects that feature an IP address, and the communication that occurs between these objects and other Internet-enabled devices and systems.
8. IoT Big Data Analytics
▪ The Internet of Things connects all manner of end-points, a treasure trove of data
▪ Networks and device proliferation enable access to a massive and growing amount of traditionally siloed information
▪ Analytics and business intelligence tools empower decision makers by extracting and presenting meaningful information in real time
10. Bohitesh Misra 10
▪ There has been enormous data
growth in both commercial and
scientific databases due to
advances in data generation and
collection technologies
▪ New mantra
▪ Gather whatever data you can
whenever and wherever possible.
▪ Expectations
▪ Gathered data will have value either
for the purpose collected or for a
purpose not envisioned.
Computational Simulations
Social Networking: Twitter
Sensor Networks
Traffic Patterns
Cyber Security
E-Commerce
11. Bohitesh Misra
Big Data is a phrase used to mean a massive volume of both structured and
unstructured data that is so large it is difficult to process using traditional database
and software techniques.
An example of big data might be petabytes
(1,024 terabytes) or exabytes (1,024
petabytes) of data consisting of billions to
trillions of records of millions of people—all
from different sources (e.g. Web, sales,
customer contact center, social media, mobile
data, e-Commerce and so on).
A single jet engine generates 10+ terabytes of data
in 30 minutes of flight time. With many thousands of
flights per day, the data generated reaches
many petabytes.
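The jet-engine figure can be turned into a rough daily estimate. The sketch below uses only the two numbers given on the slide (10 terabytes per engine per 30 minutes, 1,024 terabytes per petabyte); the flight count, flight duration and engines per aircraft are hypothetical inputs chosen for illustration, not figures from the deck.

```python
TB_PER_ENGINE_HALF_HOUR = 10   # from the slide ("10+ terabytes" in 30 minutes)
TB_PER_PB = 1024               # from the slide (1 petabyte = 1,024 terabytes)


def daily_petabytes(flights_per_day, flight_hours=0.5, engines_per_aircraft=1):
    """Estimate petabytes generated per day under the stated assumptions."""
    half_hours = flight_hours / 0.5
    terabytes = (TB_PER_ENGINE_HALF_HOUR * half_hours
                 * engines_per_aircraft * flights_per_day)
    return terabytes / TB_PER_PB


# Hypothetical inputs: 5,000 flights per day, 30-minute flights, one engine each.
estimate = daily_petabytes(flights_per_day=5000)
print(f"Roughly {estimate:.1f} PB per day")
```

Even with these conservative inputs the total lands in the tens of petabytes per day; longer flights or multi-engine aircraft scale the estimate linearly.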
12. Bohitesh Misra
Application of Big Data analytics
▪ Homeland Security
▪ Smarter Healthcare: Integrated and smart patient care systems and processes
▪ Retail & Multi-channel sales: Highly personalized customer experience across channels and devices
▪ Telecom
▪ Manufacturing: Intelligent interconnectivity across the enterprise for enhanced control, speed and efficiency
▪ Traffic Control
▪ Trading Analytics
▪ Search Quality
▪ Log Analysis
▪ Finance & Banking: Seamless customer experience across all banking channels
13. Bohitesh Misra 13
Lots of data is being collected and warehoused
◦ Web data
Yahoo has petabytes of web data
Facebook has billions of active users
◦ purchases at department/ grocery stores, e-commerce
Amazon handles millions of visits/day
◦ Bank/Credit Card transactions
Computers have become cheaper and more powerful
Competitive Pressure is Strong
◦ Provide better, customized services for an edge (e.g. CRM)
14. Bohitesh Misra 14
Data collected and stored at enormous speeds
◦ Remote sensors on a satellite
NASA archives petabytes of earth science data per year
◦ Telescopes scanning the skies
Sky survey data
◦ High-throughput biological data
◦ Scientific simulations
terabytes of data generated in a few hours
Data mining helps scientists
◦ in automated analysis of massive datasets
◦ In hypothesis formation
MRI Data from Brain
Sky Survey Data
Surface Temperature of Earth
15. Bohitesh Misra 15
Improving health care and reducing costs
Finding alternative/ green energy sources
Predicting the impact of climate change
Reducing hunger and poverty by increasing agriculture production
16. Bohitesh Misra
Data mining is the extraction of knowledge from large volumes of
structured or unstructured data.
Data mining is a potential solution to a big problem facing many
firms: an overabundance of data and a relative dearth of staff,
technology, and time to transform numbers and notes into
meaningful information about existing and prospective customers.
Alternative names
◦ Knowledge discovery (mining) in databases (KDD), knowledge extraction, data /
pattern analysis, data archeology, data dredging, information harvesting,
business intelligence
AI refers to the ability of machines to perform cognitive tasks like
thinking, perceiving, learning, problem solving and decision making.
17. Bohitesh Misra 17
Science
◦ Astronomy, bioinformatics, drug discovery
Business
◦ CRM (Customer Relationship management), fraud detection, e-commerce,
manufacturing, sports/entertainment, telecom, targeted marketing, health care,
warehouses
Web:
◦ Search engines, advertising, web and text mining
Government
◦ Surveillance, crime detection, profiling tax cheaters
21. Bohitesh Misra
Supervised Learning
◦ Supervised learning trains the machine on well-labelled data, that is, data in which
each example is already tagged with the correct answer.
◦ The learning algorithm analyses these training examples; when the machine is then given
a new set of examples, it uses what it has learned to produce the correct outcome.
◦ Suppose you are given a basket filled with different kinds of fruit. The first step is to
train the machine on each kind of fruit, one by one:
If the object is round, has a depression at the top and is red, it is labelled Apple.
If the object is a long curving cylinder and is green-yellow, it is labelled Banana.
◦ Now suppose that, after training, the machine is given a new fruit from the basket, say a
banana, and asked to identify it.
◦ Because the machine has already learned from the previous data, it applies that knowledge:
it classifies the fruit by its shape and color, confirms the
fruit name as BANANA and puts it in the Banana category.
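The fruit-basket example can be sketched in a few lines. This is a minimal illustration only: the feature strings, the tiny training set and the vote-counting rule are assumptions made here, not a method taken from the deck, and a real project would normally use a library classifier.

```python
# Labelled training examples: (shape, color) -> fruit name.
# The feature values mirror the slide's description; the encoding is illustrative.
TRAINING_DATA = [
    (("round", "red"), "Apple"),
    (("round", "red"), "Apple"),
    (("long curved cylinder", "green-yellow"), "Banana"),
    (("long curved cylinder", "yellow"), "Banana"),
]


def train(examples):
    """Count how often each feature value appears with each label."""
    counts = {}
    for features, label in examples:
        for position, value in enumerate(features):
            by_label = counts.setdefault((position, value), {})
            by_label[label] = by_label.get(label, 0) + 1
    return counts


def predict(model, features):
    """Pick the label whose training examples share the most feature values."""
    votes = {}
    for position, value in enumerate(features):
        for label, n in model.get((position, value), {}).items():
            votes[label] = votes.get(label, 0) + n
    if not votes:
        return None  # nothing similar was seen during training
    return max(votes, key=votes.get)


model = train(TRAINING_DATA)
new_fruit = ("long curved cylinder", "green-yellow")
prediction = predict(model, new_fruit)
print(prediction)
```

The training step only works because every example carries a label; that is the "supervision" in supervised learning.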
22. Bohitesh Misra
Types:-
• Regression
• Logistic Regression
• Classification
• Naïve Bayes Classifiers
• Decision Trees
• Support Vector Machine
Advantages:-
• Supervised learning lets you collect data and produce outputs based on previous experience.
• Helps to optimize performance criteria with the help of experience.
• Helps to solve various types of real-world computation problems.
Disadvantages:-
• Classifying big data can be challenging.
• Training needs a lot of computation, so it takes a lot of time.
23. Bohitesh Misra
Unsupervised learning is the training of a machine using information that is neither
classified nor labelled, allowing the algorithm to act on that information without
guidance.
Here the task of the machine is to group unsorted information according to similarities,
patterns and differences without any prior training on the data.
For instance, suppose the machine is given an image containing both dogs and cats, which it
has never seen before.
The machine has no idea about the features of dogs and cats, so it cannot label them as
dogs and cats. But it can group them according to their similarities, patterns
and differences.
It allows the model to work on its own to discover patterns and information that
were previously undetected. It mainly deals with unlabelled data.
24. Bohitesh Misra
Unsupervised learning is classified into two categories of
algorithms:
• Clustering: A clustering problem is where you want to discover the
inherent groupings in the data, such as grouping customers by
purchasing behaviour.
• Association: An association rule learning problem is where you want to
discover rules that describe large portions of your data, such as people
that buy X also tend to buy Y.
25. Bohitesh Misra 25
Classification: predicting an item class
Clustering: finding clusters in data
Associations: e.g. A & B & C occur frequently
Visualization: to facilitate human discovery
Summarization: describing a group
Deviation Detection: finding changes
Estimation: predicting a continuous value
Link Analysis: finding relationships
26. Bohitesh Misra
Data Mining Tasks
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
11   No      Married         60K             No
12   Yes     Divorced        220K            No
13   No      Single          85K             Yes
14   No      Married         75K             No
15   No      Single          90K             Yes
27. Bohitesh Misra
Predictive Modeling: Classification
Find a model for the class attribute as a function of the values of
other attributes
Tid  Employed  Level of Education  # years at present address  Credit Worthy
1    Yes       Graduate            5                           Yes
2    Yes       High School         2                           No
3    No        Undergrad           1                           No
4    Yes       High School         10                          Yes
…    …         …                   …                           …
Model for predicting credit worthiness (decision tree):
Employed?
  No: class No
  Yes: Education?
    Graduate: Number of years > 3 yr: Yes; < 3 yr: No
    {High school, Undergrad}: Number of years > 7 yrs: Yes; < 7 yrs: No
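The credit-worthiness model on this slide can be read as a small decision tree: test Employed first, then Education, then the number of years at the present address. The sketch below encodes that reading and checks it against the four visible table rows. The function name and the handling of exactly 3 or 7 years (treated as "No") are assumptions, since the slide only gives the strict inequalities.

```python
def credit_worthy(employed, education, years_at_address):
    """Walk the slide's decision tree and return the predicted class."""
    if not employed:
        return "No"
    if education == "Graduate":
        # Graduates qualify after more than 3 years at the present address.
        return "Yes" if years_at_address > 3 else "No"
    # High School or Undergrad: the threshold is more than 7 years.
    return "Yes" if years_at_address > 7 else "No"


# (Tid, Employed, Level of Education, years at present address, Credit Worthy)
ROWS = [
    (1, True, "Graduate", 5, "Yes"),
    (2, True, "High School", 2, "No"),
    (3, False, "Undergrad", 1, "No"),
    (4, True, "High School", 10, "Yes"),
]

predictions = [credit_worthy(emp, edu, yrs) for _, emp, edu, yrs, _ in ROWS]
matches = [pred == row[4] for pred, row in zip(predictions, ROWS)]
print(predictions, all(matches))
```

In practice the tree is learned from the labelled rows rather than written by hand; the hand-written version simply shows what the learned model looks like once it exists.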
28. Bohitesh Misra
Classification and label prediction
◦ Construct models (functions) based on some training examples
◦ Describe and distinguish classes or concepts for future prediction
E.g., classify countries based on (climate), or classify cars based on (gas mileage)
◦ Predict some unknown class labels
Typical methods
◦ Decision trees, naïve Bayesian classification, support vector machines, neural
networks, rule-based classification, pattern-based classification, logistic regression
Typical applications:
◦ Credit card fraud detection, direct marketing, classifying stars, diseases, web-
pages
29. Bohitesh Misra
▪ Classifying credit card transactions as legitimate or
fraudulent
▪ Classifying land covers (water bodies, urban areas,
forests, etc.) using satellite data
▪ Categorizing news stories as finance, weather,
entertainment, sports, etc
▪ Identifying intruders in the cyberspace
▪ Predicting tumor cells as benign or malignant
30. Bohitesh Misra
◦ Goal: Predict fraudulent cases in credit card transactions.
◦ Approach:
Use credit card transactions and the information on its account-
holder as attributes.
When does a customer buy, what does he buy, how often he pays on
time, etc
Label past transactions as fraud or fair transactions. This forms the
class attribute.
Learn a model for the class of the transactions.
Use this model to detect fraud by observing credit card transactions
on an account.
31. Bohitesh Misra
Churn prediction for telephone customers
◦ Goal: To predict whether a customer is likely to be lost to a competitor.
◦ Approach:
Use detailed record of transactions with each of the past and
present customers, to find attributes.
How often the customer calls, where he calls, what time-of-the
day he calls most, his financial status, marital status, etc.
Label the customers as loyal or disloyal.
Find a model for loyalty.
32. Bohitesh Misra
Clustering
Finding groups of objects such that the objects in a group will be
similar (or related) to one another and different from (or unrelated to)
the objects in other groups
◦ Intra-cluster distances are minimized
◦ Inter-cluster distances are maximized
33. Bohitesh Misra
Unsupervised learning (i.e., Class label is unknown)
Group data to form new categories (i.e., clusters), e.g., cluster houses to
find distribution patterns
Principle: Maximizing intra-class similarity & minimizing interclass
similarity
34. Bohitesh Misra
K-means clustering
◦ aims to partition n observations
into k clusters in which each
observation belongs to
the cluster with the nearest mean
Hierarchical clustering
◦ Produces a set of nested clusters
organized as a hierarchical tree
◦ Can be visualized as a dendrogram
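A minimal K-means sketch, assuming 2-D points and Euclidean distance, shows the two steps that repeat on every iteration: assign each observation to the nearest mean, then recompute each mean. The sample points, `k`, the iteration count and the random seed are illustrative choices, not values from the deck.

```python
import math
import random


def kmeans(points, k, iterations=10, seed=0):
    """Plain K-means: returns (centroids, cluster index for each point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments


# Two visibly separate groups of hypothetical observations.
points = [(-1.5, 0.5), (-1.2, 0.8), (-1.0, 0.4),
          (1.0, 2.5), (1.3, 2.2), (1.5, 2.8)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Which group receives index 0 or 1 depends on the starting centroids; only the grouping itself is meaningful.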
[Figure: K-means clustering, Iterations 1 to 6 (axes x and y)]
[Figure: hierarchical clustering of six points shown as nested clusters and as a dendrogram]
35. Bohitesh Misra
Market Segmentation:
◦ Goal: subdivide a market into distinct subsets of customers where
any subset may conceivably be selected as a market target to be
reached with a distinct marketing mix.
◦ Approach:
Collect different attributes of customers based on their geographical and
lifestyle related information.
Find clusters of similar customers.
Measure the clustering quality by observing buying patterns of customers
in same cluster vs. those from different clusters.
36. Bohitesh Misra
Given a set of records, each of which contains some number of
items from a given collection
◦ Produce dependency rules which will predict the occurrence of an item
based on occurrences of other items.
TID Items
1 Bread, Coke, Milk
2 Beer, Bread
3 Beer, Coke, Diaper, Milk
4 Beer, Bread, Diaper, Milk
5 Coke, Diaper, Milk
Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
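The two rules can be checked against the five transactions by computing support (the share of transactions containing every item in the rule) and confidence (the share of transactions containing the left-hand side that also contain the right-hand side). The function names below are illustrative.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def support(antecedent, consequent):
    """Share of all transactions that contain both sides of the rule."""
    return count(antecedent | consequent) / len(TRANSACTIONS)


def confidence(antecedent, consequent):
    """Share of antecedent transactions that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


rules = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in rules:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs, rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

On this data {Milk} --> {Coke} has support 3/5 and confidence 3/4, while {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.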
37. Bohitesh Misra
Frequent patterns (or frequent itemsets)
◦ What items are frequently purchased together in your Walmart?
Association, correlation vs. causality
◦ A typical association rule
Diaper → Beer [0.5%, 75%] (support, confidence)
◦ Are strongly associated items also strongly correlated?
How to mine such patterns and rules efficiently in large datasets?
38. Bohitesh Misra
Market-basket analysis
◦ Rules are used for sales promotion, shelf management, and
inventory management
Medical Informatics
◦ Rules are used to find combination of patient symptoms and test
results associated with certain diseases
39. Bohitesh Misra 39
Association Analysis - Applications
An example subspace differential coexpression pattern from a lung
cancer dataset
Enriched with the TNF/NF-κB signaling pathway,
which is well known to be related to lung cancer
P-value: 1.4 × 10^-5 (6/10 overlap with the pathway)
Three lung cancer datasets [Bhattacharjee et al. 2001], [Stearman et al. 2005], [Su et al. 2007]
40. Bohitesh Misra
Outlier analysis
◦ Outlier: A data object that does not comply with the general behavior of the data.
Detect significant deviations from normal behavior
◦ Noise or exception? One person’s garbage could be another person’s treasure
◦ Methods: a by-product of clustering or regression analysis
◦ Useful in fraud detection and rare-event analysis
41. Bohitesh Misra 41
Outlier: A data object that deviates significantly from the normal objects as if it
were generated by a different mechanism
◦ Ex.: Unusual credit card purchase
Outliers are different from noise
◦ Noise is random error or variance in a measured variable
◦ Noise should be removed before outlier detection
Applications:
◦ Credit card fraud detection
◦ Telecom fraud detection
◦ Customer segmentation
◦ Medical analysis
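One simple way to flag an unusual credit card purchase is the modified z-score, which measures each amount's distance from the median in units of the median absolute deviation (MAD). This is one common technique, not necessarily the one the deck has in mind; the purchase amounts are made up, and the 0.6745 scaling constant and 3.5 cut-off are the conventional defaults for this score.

```python
import statistics


def modified_z_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # more than half the values are identical; score undefined
    return [v for v in values
            if abs(0.6745 * (v - median) / mad) > threshold]


# Hypothetical purchase amounts on one account.
purchases = [25, 40, 32, 28, 35, 30, 950]
outliers = modified_z_outliers(purchases)
print(outliers)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme purchase would otherwise inflate the very statistics used to judge it.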
42. Bohitesh Misra
Three kinds: global, contextual and collective outliers
Global outlier (or point anomaly)
◦ Object is Og if it significantly deviates from the rest of the data set
◦ Ex. Intrusion detection in computer networks
◦ Issue: Find an appropriate measurement of deviation
Contextual outlier (or conditional outlier)
◦ Object is Oc if it deviates significantly based on a selected context
◦ Ex. 80°F in Urbana: outlier? (depends on whether it is summer or winter)
◦ Can be viewed as a generalization of local outliers—whose density significantly deviates
from its local area
◦ Issue: How to define or formulate meaningful context?
Global Outlier
43. Bohitesh Misra
Collective Outliers
◦ A subset of data objects collectively deviate significantly from the
whole data set, even if the individual data objects may not be
outliers
◦ Applications: e.g., intrusion detection:
When a number of computers keep sending denial-of-service
packets to each other
[Figure: Collective Outlier]
◼ Detection of collective outliers
◼ Consider not only behavior of individual objects, but also that of groups of objects
◼ Need to have the background knowledge on the relationship among data objects, such
as a distance or similarity measure on objects.
◼ A data set may have multiple types of outlier
◼ One object may belong to more than one type of outlier
44. Bohitesh Misra
Regression is the measure of the average relationship between two or more
variables in terms of original units of data.
Predict a value of a given continuous valued variable based on the values of
other variables, assuming a linear or nonlinear model of dependency.
Independent variable: a variable used to predict the variable of interest
Dependent variable: the variable we want to predict
Examples:
◦ Predicting sales amounts of new product based on advertising expenditure.
◦ Predicting wind velocities as a function of temperature, humidity, air
pressure, etc.
◦ Time series prediction of stock market indices.
45. Bohitesh Misra
Regression analysis provides estimates of the values of the
dependent variable (DV) from the values of the independent variable (IV) by means
of regression lines.
It helps in obtaining a measure of the error involved in using the
regression lines as a basis of estimation.
With the help of regression coefficients, we can calculate the
correlation coefficient.
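The link between regression coefficients and the correlation coefficient can be made concrete. For two variables, the slope of y on x and the slope of x on y multiply to r squared, so r is the square root of their product with the sign of either slope. The data and function names below are illustrative assumptions.

```python
import math


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy): slope of y on x and slope of x on y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / syy


def correlation_from_coefficients(b_yx, b_xy):
    """r = sign * sqrt(b_yx * b_xy); both slopes always share one sign."""
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy = regression_coefficients(xs, ys)
r = correlation_from_coefficients(b_yx, b_xy)
print(b_yx, b_xy, round(r, 4))
```

For this data the two slopes are 0.6 and 1.0, giving r of about 0.77.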
46. Regression is the attempt to explain the variation in a dependent variable using the variation in
independent variables.
If the independent variable(s) sufficiently explain the variation in the dependent variable, the
model can be used for prediction.
[Figure: dependent variable plotted against independent variable (x)]
47. The output of a regression is a function that predicts the dependent variable based upon
values of the independent variables.
Simple regression fits a straight line to the data.
y' = b0 + b1x ± ε
b0 = y-intercept
b1 = slope = Δy/Δx
ε = error term
[Figure: fitted regression line, dependent variable (y) against independent variable (x)]
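A short least-squares sketch shows how b0 and b1 are obtained from data and what the error term looks like afterwards. The advertising and sales figures are invented for illustration, and ordinary least squares is assumed as the fitting method since the slide does not name one.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope: change in y per unit change in x
    b0 = mean_y - b1 * mean_x    # intercept: predicted y when x is 0
    return b0, b1


def predict(b0, b1, x):
    """Evaluate the fitted line at x."""
    return b0 + b1 * x


# Hypothetical data: advertising expenditure vs. sales amount.
ad_spend = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(ad_spend, sales)
# Residuals are the observed errors: actual value minus fitted value.
residuals = [y - predict(b0, b1, x) for x, y in zip(ad_spend, sales)]
print(round(b0, 2), round(b1, 2))
```

The residuals of a least-squares line with an intercept always sum to zero, which is a quick sanity check on the fit.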
48. Bohitesh Misra
Logistic Regression is used for a different class of problems known as
classification problems. Here the aim is to predict the group to which the
object under observation belongs. It estimates a probability between 0 and 1,
which is then converted into a discrete binary outcome (0 or 1). A simple
example would be whether a person will vote or not in upcoming elections.
How does it work?
LR measures the relationship between the dependent variable (what we
want to predict) and one or more independent variables by estimating
probabilities using its underlying logistic function.
Making predictions?
The logistic (sigmoid) function produces probabilities whose values range
between 0 and 1. These probabilities must then be transformed into binary
values in order to actually make a prediction; a threshold classifier
converts each probability into 0 or 1.
Logistic vs Linear?
Logistic regression gives a discrete outcome, but linear regression gives a
continuous outcome.
49. Bohitesh Misra
Mining Methodology
◦ Mining knowledge in multi-dimensional
space
◦ Data mining: An interdisciplinary effort
◦ Handling noise, uncertainty, and
incompleteness of data
◦ Pattern evaluation and pattern- or
constraint-guided mining
User Interaction
◦ Interactive mining
◦ Incorporation of background knowledge
◦ Presentation and visualization of data
mining results
Efficiency and Scalability
◦ Efficiency and scalability of data mining
algorithms
◦ Parallel, distributed, stream, and
incremental mining methods
Diversity of data types
◦ Handling complex types of data
◦ Mining dynamic, networked, and global
data repositories
Data mining and society
◦ Social impacts of data mining
◦ Privacy-preserving data mining
54. Bohitesh Misra
So what are the skills needed for a data scientist?
Skill #1: Statistics, probability, hypothesis testing, multivariate analysis
Skill #2: Computer science, data structures, algorithms, parallel computing, scripting languages (R, Python and Perl), cloud computing
Skill #3: Correlation, modeling exercises, business understanding and the ability to assess which models are feasible
56. Bohitesh Misra
Identify the problem or opportunity.
◦ Understanding the customer relationship and the firm's goals is more
crucial than understanding the technology.
◦ Always build bilateral trust and intimacy with consumers.
Prepare the data
◦ To overcome hidden agendas in interpreting data, have a person in the
organisation act as a bridge between statisticians and the concerned heads of department (HODs).
Transform the data into meaningful information
◦ Firms need to establish clear-cut objectives to limit what needs to be found.
◦ Develop a standardized data recording system.
Validate the model on different samples
Fine-tune the model
61. Bohitesh Misra
The Explosive Growth of Data: from terabytes to petabytes
◦ Data collection and data availability
Automated data collection tools, database systems, Web, computerized
society
◦ Major sources of abundant data
Business: Web, e-commerce, transactions, stocks, …
Science: Remote sensing, bioinformatics, scientific simulation, …
Society and everyone: news, digital cameras, YouTube
We are drowning in data, but starving for knowledge!
“Necessity is the mother of invention”: data mining is the automated analysis of
massive data sets
62. Bohitesh Misra
Large amounts of data are readily available, and data is growing
exponentially
Sharp decline in the cost of storage, with virtually unlimited computing
power and bandwidth
Fact-based decision making has resulted in the emergence of self-service
analytics and BI
A rapid rise in AI investments and advanced analytics techniques
has given analysts access to sophisticated algorithms
63. Bohitesh Misra
Analytics helps profitability in cross-selling and up-selling to
current customers
Helps in reducing costs
◦ early payments to suppliers to take advantage of discounts,
◦ Retain cash as long as possible
◦ Use of matrices to find optimal balance
Helps in detection and prevention of frauds
◦ Sophisticated forensic analytics to find irregularities in financial
transactions
Helps in extrapolating current trends
64. Bohitesh Misra
Government of India (NITI Aayog) has released the draft National Strategy
on Artificial Intelligence.
Key features are setting up of research centers to foster breakthroughs,
Intellectual Property (IP) protection and continuous re-skilling to keep
talent up-to-date
http://niti.gov.in/writereaddata/files/document_publication/NationalStrategy-for-AI-Discussion-Paper.pdf
Major focus of use of analytics in areas like:
Healthcare
Agriculture
Education
Smart cities and infrastructure
Transportation
66. Bohitesh Misra
Machine learning uses data to find patterns and generate business insights.
User churn prediction - Engaging a customer at the right time can help reduce churn if we know which specific
customers are about to churn
Recommendation engine - Up-selling and cross-selling based on machine learning basket analytics
Customer segmentation - With statistical segmentation, users can be grouped into specific types to
better understand your customer base.
Marketing campaign optimisation - To better manage the marketing budget, one needs to analyse which
campaigns are doing well and why.
Product inventory optimisation - With demand prediction, a business can be lean enough to reduce storage
and waiting costs for various products
Dynamic deal scoring - Helps to price smartly
69. Bohitesh Misra
Crop and Soil Monitoring - Companies are leveraging sensors and various IoT-based technologies
to monitor crop and soil health, using deep learning for image analysis, agricultural product
grading and alerts on crop infestation.
Predictive Agricultural Analytics - Various AI and machine learning tools are being used to predict
the optimal time to sow seeds, get alerts on risks from pest attacks, and more.
Supply Chain Efficiencies - Companies are using real-time analytics on data streams coming
from multiple sources to build an efficient and smart supply chain.
Image Recognition for Soil Science - Use of AI and machine learning to predict pests and diseases,
forecast commodity prices for better price realization and recommend products to farmers.
Minimum Support Price estimation - Use of AI and machine learning to estimate the MSP for various
crops in real time.
70. Bohitesh Misra
Price optimization allows retailers to consider factors
such as:
•Yield prediction using ML
•Competition
•Weather (IMD), Satellite imagery
•Season
•Special events / holidays
•Macroeconomic variables, farm machinery,
•Operating costs, Input cost - seed, Fertilizer,
pesticides
•Warehouse information (FCI), cold storage
to determine:
•The initial price
•The best price
•The discount price
•The promotional price
•MSP for major crops
Using Dimension reduction, Naïve Bayes Algorithm
which is a Machine Learning Classification technique
e-National Agriculture Market
Soil Health Card
mKisan Portal
Multivariate agricultural commodity MSP price
forecasting model
Directorate of Marketing & Inspection, Ministry of
Agriculture
77. Bohitesh Misra
~1 billion cameras worldwide by 2020
30 billion inferences/sec
Tesla P40: 2,500 inferences/sec @ 720p
AI City needs ~10M P40 servers
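The sizing on this slide is straightforward arithmetic. The sketch below uses only the slide's own numbers; the per-camera rate of 30 inferences per second is not stated directly but follows from dividing 30 billion inferences/sec by 1 billion cameras, and treating one P40 as one unit of capacity is an assumption.

```python
CAMERAS = 1_000_000_000                  # ~1 billion cameras (from the slide)
TOTAL_INFERENCES_PER_SEC = 30_000_000_000  # 30 billion/sec (from the slide)
P40_INFERENCES_PER_SEC = 2_500           # Tesla P40 at 720p (from the slide)

# Implied load per camera: 30 inferences per second.
per_camera_rate = TOTAL_INFERENCES_PER_SEC // CAMERAS

# Number of P40s needed to keep up with the total load.
p40_units_needed = TOTAL_INFERENCES_PER_SEC // P40_INFERENCES_PER_SEC

print(per_camera_rate, f"{p40_units_needed:,}")
```

The result is 12 million units, the same order of magnitude as the slide's "~10M" figure.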
78. Bohitesh Misra
Real-time Surveillance
1. Real-time Surveillance Zone: real-time display of the monitoring screen.
2. Warning/Comparison Zone: real-time display of the currently pictured people vs. the status of people under surveillance.
3. Pictured Display Zone: real-time display of the pictured photos.
4. Menu Zone: capability to complete real-time surveillance, picture/inquiry, police notification/inquiry and database management.
79. Bohitesh Misra
Scene Parsing
Crowd Density Analysis
Crowd Tracking
Search by Face
Face Recognition
License Recognition
Pedestrian Detection
People Counting
Face Alignment
Face Detection
Vehicle Model Recognition
Vehicle Detection
80. Bohitesh Misra
AI analytics to detect COVID violations in high-traffic public places
Camera based AI system detects:
• Intrusive monitoring of temperature of people entering into any premises
• Detects through its AI-based algorithms whether a person is wearing a face cover (mask)
• Detects social distancing between people
82. Bohitesh Misra
Use of Data Science by Zomato
https://analyticsindiamag.com/the-amazing-way-zomato-uses-data-science-for-success/
AIM:
Driving commercial and operational efficiencies in key areas such as logistics
optimisation, call centre/driver fleet capacity planning, delivery time
prediction, ad delivery and supply prioritization
Process:
The Zomato team uses a Scala pipeline which ingests data from S3 and performs
the ETL operations needed for machine learning algorithms. “Most of the
machine learning modelling happens in Python and leverages Scala
transformed historic raw data as input. The model once finalised is then
set up as a service in production, deployed on dedicated servers as
dockerized REST APIs using Elastic Beanstalk/ECS.”
85. Bohitesh Misra
AI could threaten our world
Areas where AI is most likely to be exploited
• Physical
• Remote-controlled car crashes - The biggest concern involves AI being used to
carry out physical attacks on humans, such as hacking into self-driving cars to
cause major collisions.
• Digital
• Sophisticated phishing - In the future, attempts to access sensitive and
personal information from an individual could be carried out by AI almost
entirely. “These attacks may use AI systems to complete certain tasks more
successfully than any human could.”
• Political
• Manipulating public opinion - Fake news and fake videos generated by bots
and AI could have a big impact on public opinion, disrupting all layers of
society, from politics to media. The use of social media bots spreading fake
news was already a reality during the 2016 US presidential campaign.
88. Bohitesh Misra
Data Science Everywhere
INTERNET & CLOUD
Image Classification
Speech Recognition
Language Translation
Language Processing
Sentiment Analysis
Recommendation
MEDIA & ENTERTAINMENT
Video Captioning
Video Search
Real Time Translation
AUTONOMOUS MACHINES
Pedestrian Detection
Lane Tracking
Recognize Traffic Sign
SECURITY & DEFENSE
Face Detection
Video Surveillance
Satellite Imagery
MEDICINE & BIOLOGY
Cancer Cell Detection
Diabetic Grading
Drug Discovery
COVID-19 detection
89. BOHITESH MISRA, PMP
CO-FOUNDER, XIPHIAS XPAY LIFE PVT LTD
BOHITESH.MISRA@GMAIL.COM
#CIO200 #ITNEXT100 #EMINENT CIOS OF INDIA 2019
@bohiteshmisra
/in/bohitesh