zekeLabs
Moving from BI to AI
Learning made Simpler !
www.zekeLabs.com
Agenda
● Workflow with common BI tools
● Limitations of BI tools
● Black box Introduction to Machine Learning
● Machine Learning, Deep Learning & AI
● Machine Learning Pipeline
● Adopting Machine Learning in your Product : Use cases
● Challenges in adopting Machine Learning
● Open Source Options
● Cost Optimization
What and Why of BI
● Data is core to business strategy to gain competitive advantage
● BI has grown from a decision support system to a decisive factor
● BI gives a first-hand look at business health
● Answers the “What” and “Where” of Business
○ Critical to run operations
○ Important for making tactical decisions
Descriptive Inquisitive Predictive Prescriptive
How BI is done today?
ETL
Reporting Server
Visualization
BI
Administration
Iterations
OLAP Systems :
Data Marts/Lakes
OLTP Systems
Web
Mobile
Current BI Implementation - 3 Approaches
● Web Technologies Based BI
● Self Service BI
● Hybrid BI
BI Approaches - Self Service BI
● Self Service BI (Tableau, Qlikview, PowerBI, JasperSoft)
○ Pros :
■ Business Analyst Friendly
■ Quick Turnaround time
■ Quick changes and fixes are easy to make
○ Cons :
■ Less Customization opportunities
■ Major changes require incremental development cycles
■ Least flexible from a developer's perspective, since most solutions are out-of-the-box tools offered by third-party products
BI Approaches - Web Technologies based BI
● Web Technologies based BI (D3, FusionCharts, HighCharts etc.)
○ Pros :
■ Excellent visuals possible
■ Fine grained customizations possible
■ No limit on the kinds of visualizations or integrations with third-party libraries
■ Uses modern web technologies (HTML5, CSS3, JavaScript)
○ Cons :
■ Long turnaround time
■ Even simple customizations or fixes need to go through a full development cycle
■ Requires highly skilled web developers
BI Approaches - Hybrid BI
● Hybrid BI (JasperSoft, Qlikview)
○ Pros :
■ Self service BI
■ Third-party integrations with a few supported charting libraries to extend capabilities
■ Libraries available to build on the fly reports (dynamic reports)
○ Cons :
■ Long turnaround time
■ Even simple customizations or fixes need to go through a full development cycle
■ Requires highly skilled web developers
Current Gaps of BI Approaches
● Essence of BI is visual decision making
○ 2D visuals are the best a human can perceive
● Answers from a BI system come with a considerable delay
○ Delay means loss of money as well as opportunity
○ Business does not wish to wait to fetch its own data
● Dashboards are static until the next change
○ User interactivity and interest drop significantly after the first few hits (especially for strategic dashboards)
● Change management is expensive (in cost, time and effort)
Future of BI - Embrace AI
1. BI is about depth. Exploratory data analysis (EDA) is still a forte of humans. Machines are good at repeating tasks at unparalleled speed
2. AI provides the scale and speed that humans currently can’t offer
3. AI promises to close the delay between a business question and its answer
4. AI gives an enterprise the opportunity to redirect its BI talent toward learning new AI skills and spending quality time on data exploration rather than doing repetitive BI work
5. AI offers innovative solutions for user interactivity which can make a dashboard as easy to use as a personal assistant (voice- and text-driven BI)
Black Box Introduction to ML
What is not Machine Learning ?
● Rule Based Approach
● Legacy Systems
What is Machine Learning ?
● Solve prediction problem
● Logic is learned from examples & not by rules
Diagram: Training Data → Learning Algorithm → Prediction Function (or Trained Model) ← Input Data
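The contrast between a hand-written rule and logic learned from examples can be sketched as below. This is a toy illustration only: the feature (number of links in a mail), the labels and the threshold search are assumptions made for the example, not a method taken from the slides.

```python
from typing import Callable, List, Tuple


def rule_based(num_links: int) -> str:
    # Rule-based approach: a person picked the threshold 3 by hand.
    return "spam" if num_links > 3 else "ham"


def learn_threshold(training_data: List[Tuple[int, str]]) -> Callable[[int], str]:
    """Toy learning algorithm: choose the threshold with the fewest training errors."""
    candidates = sorted({x for x, _ in training_data})

    def errors(threshold: int) -> int:
        return sum(
            ("spam" if x > threshold else "ham") != label
            for x, label in training_data
        )

    best = min(candidates, key=errors)
    # The returned function plays the role of the "prediction function" / trained model.
    return lambda x: "spam" if x > best else "ham"


# Hypothetical training data: (number of links, label)
training_data = [(0, "ham"), (1, "ham"), (2, "ham"), (6, "spam"), (8, "spam"), (9, "spam")]
model = learn_threshold(training_data)
print(model(1), model(7))
```

If the examples change, only the training data has to be updated; the rule-based version needs a code change.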
Types of Machine Learning
Machine Learning
● Supervised - Task Driven
● Unsupervised - Data Driven
● Reinforcement - Environment Driven
Spam Mail Detection
● Input - Mail
● Output - Spam or Ham
● Supervised Machine Learning
● Binary Classification Problem
Predicting Lift Failure
● Input - Sensor Data
● Output - Failure time
● Supervised Machine Learning
● Regression Problem
Predicting Insurance Amount
● Input - Accident details
● Output - Insurance amount
● Supervised Machine Learning
● Regression Problem
Medical Diagnosis
● Input - Patient Synopsis (fever, temperature, BP, etc.)
● Output - Diagnosis
● Supervised Machine Learning
● Multi-class classification Problem
Question - What is common between them ?
Market Segmentation
● Input - Customer Details
● Output - Clusters
● Unsupervised Machine Learning,
● Clustering Problem
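A minimal sketch of what "clusters as output" means, using a one-dimensional k-means written in plain Python. The monthly-spend values and the starting centroids are hypothetical, and real segmentation would normally use several customer attributes and a library implementation.

```python
from typing import List, Tuple


def kmeans_1d(
    values: List[float], centroids: List[float], iterations: int = 10
) -> Tuple[List[float], List[List[float]]]:
    """Group values around the nearest centroid, then move each centroid to its group's mean."""
    clusters: List[List[float]] = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; no labels are provided.
monthly_spend = [12.0, 15.0, 14.0, 80.0, 85.0, 90.0]
centroids, clusters = kmeans_1d(monthly_spend, centroids=[12.0, 90.0])
print(clusters)
```

No labels are given anywhere; the two segments emerge from the data alone, which is what makes this unsupervised.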
Robot playing Football
● Input - Player information, rewards
● Output - Action to score
● Reinforcement Learning
Relationship - AI, ML & DL
ML, DL & AI
Machine Learning Pipeline
Machine Learning Pipeline
Machine Learning Pipeline - Business Understanding
● Business understanding means being clear about what you are trying to achieve.
● Machine learning is rarely effective with a small amount of data
● Consolidate the data pipeline to channel a continuous flow of data.
● Data sources: web scraping, data lake access, REST APIs, etc.
Machine Learning Pipeline - Data Wrangling
● Production data is never clean.
● Making it ready for the next stage takes major effort (around 70% of the total effort)
● Transform & map data from its raw format into another format ready for the next stage
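As an illustration of the kind of work this stage involves, the sketch below cleans a few raw records: it trims whitespace, normalizes case, drops rows with unusable amounts and removes duplicates. The field names and sample rows are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical raw export: inconsistent case, stray spaces, missing values, duplicates.
raw_rows = [
    {"customer": " Alice ", "amount": "120.50", "country": "in"},
    {"customer": "Bob", "amount": "", "country": "IN"},
    {"customer": "alice", "amount": "120.50", "country": "IN"},
    {"customer": "Carol", "amount": "n/a", "country": "us"},
]


def clean_row(row: Dict[str, str]) -> Optional[Dict[str, object]]:
    """Return a normalized record, or None if the amount cannot be parsed."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return {
        "customer": row["customer"].strip().title(),
        "amount": amount,
        "country": row["country"].strip().upper(),
    }


def wrangle(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Clean every row and drop exact duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for row in rows:
        record = clean_row(row)
        if record is None:
            continue
        key = (record["customer"], record["amount"], record["country"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


print(wrangle(raw_rows))
```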
Machine Learning Pipeline - Data Visualization
● Visualization makes it easy to grasp difficult concepts
● Helps find useful patterns in the data
● Interactively drill down into charts for deeper details
Machine Learning Pipeline - Data Preprocessing
Feature Extraction: inputs are converted into Vectors - Fixed length arrays of numbers
● Text documents
● Image files
● CSV
● Audio
● Video
● Time Series data
● Many more ...
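For text, one common way to obtain fixed-length vectors is a bag-of-words count. The sketch below is a plain-Python illustration with made-up sentences; it is not the deck's own preprocessing code.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Lower-case the text and split it into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents: List[str]) -> Dict[str, int]:
    # Each distinct word gets a fixed position in the vector.
    words = sorted({w for doc in documents for w in tokenize(doc)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text: str, vocabulary: Dict[str, int]) -> List[int]:
    # Count how often each vocabulary word occurs; unknown words are ignored.
    vector = [0] * len(vocabulary)
    for w in tokenize(text):
        if w in vocabulary:
            vector[vocabulary[w]] += 1
    return vector


docs = ["Great service, great staff", "Slow service"]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

Every document, whatever its length, ends up as a vector with one entry per vocabulary word, which is the shape a learning algorithm expects.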
Machine Learning Pipeline - Model Training
Diagram: Learning Algorithm (Regression / Trees / SVM / Naive Bayes / Neural Networks) → Prediction Function or Trained Model
Machine Learning Pipeline - Learning Algorithms
● Linear Regression
● Logistic Regression
● Naive Bayes
● Nearest Neighbors
● Decision Trees
● Ensemble Methods
● Clustering
● Support Vector Machines
● Neural Networks
● CNN
● RNN
● GAN
Diagram: Prediction Function or Trained Model → Prediction
Machine Learning Pipeline - Model Validation
● Training different learning methods gives you different trained models.
● Also, each model has a large number of possible configurations (hyper-parameters).
● Finding the best model among all possibilities & the best configuration for it is done as a part of Model Validation.
● If results are not satisfactory, one has to go back in the chain & fix a few things
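The search for the best configuration can be sketched as trying each candidate hyper-parameter on held-out validation data and keeping the best scorer. The example below uses a tiny nearest-neighbors classifier and hypothetical data containing one mislabeled training point; it is an illustration under those assumptions, not a full validation procedure such as cross-validation.

```python
from collections import Counter
from typing import Dict, List, Tuple

Point = Tuple[float, str]


def knn_predict(train: List[Point], x: float, k: int) -> str:
    # Majority label among the k training points closest to x.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train: List[Point], validation: List[Point], k: int) -> float:
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)


def select_k(
    train: List[Point], validation: List[Point], candidates: List[int]
) -> Tuple[int, Dict[int, float]]:
    # Score every candidate hyper-parameter on data the model was not trained on.
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical data; the point (2.5, "high") is deliberately mislabeled noise.
train = [(1.0, "low"), (2.0, "low"), (2.5, "high"), (3.0, "low"), (4.0, "low"),
         (6.0, "high"), (7.0, "high"), (8.0, "high"), (9.0, "high")]
validation = [(2.4, "low"), (2.6, "low"), (3.5, "low"), (6.5, "high"), (8.5, "high")]

best_k, scores = select_k(train, validation, candidates=[1, 3, 5])
print(best_k, scores)
```

With k = 1 the model copies the noisy point and misclassifies its neighbours; larger k smooths it out, which is the kind of difference validation is meant to expose.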
Machine Learning Pipeline - Deployment
Diagram: Consumers → RESTful Interface → Trained Model (or Interface Model)
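A rough sketch of exposing a model behind a RESTful interface with Python's standard library follows. The `/predict` path, the feature names and the scoring formula are all hypothetical stand-ins; a real deployment would load a trained model and would usually sit behind a production web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(features: dict) -> dict:
    # Stand-in for a trained model: a fixed, made-up scoring formula.
    score = 0.8 * features.get("temperature", 0.0) + 0.2 * features.get("vibration", 0.0)
    return {"failure_risk": "high" if score > 50 else "low"}


def handle_predict(body: bytes) -> Tuple[int, bytes]:
    """Turn a JSON request body into an HTTP status code and a JSON response body."""
    try:
        features = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, json.dumps({"error": "invalid JSON"}).encode()
    if not isinstance(features, dict):
        return 400, json.dumps({"error": "expected a JSON object"}).encode()
    return 200, json.dumps(predict(features)).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    # Blocks forever; call this explicitly to start the service.
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Consumers only see the HTTP contract, so the model behind `predict` can be retrained or replaced without changing the clients.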
Business Intelligence vs Machine Learning
Image Sourced from DataRobot
● BI derives relatively simple patterns from historical data
● ML can find complex patterns in high volumes of data
● ML is about predicting the future based on past data
● ML can be automated
Choosing Model
Break - Let’s meet in 5 minutes.
Adopting Machine Learning - Real Stories
1. Customer Service Industry
● Reduce manual effort of classifying reviews.
● Channel data from the web server to the analytics engine.
● Get data ready for visualization.
● Historical data shows past trends.
● Visualization of trend
● Text needs to be tokenized & vectorized
● Different models were trained: Naive Bayes, SGD Classifier
● Choose the best model with the best hyper-parameters
● Naive Bayes (MultinomialNB) was chosen & put in deployment
1. Customer Service Industry
● Manually labeled data is used for training the model.
● Labels are the target & reviews are the feature data
● Batch training is supported by MultinomialNB, allowing incremental learning
● Any misclassification by the model is relabelled correctly & fed back in for training
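The feedback loop above can be sketched with a small multinomial Naive Bayes that stores only counts, so each new batch simply updates them. This is a plain-Python stand-in written for illustration; it is not scikit-learn's `MultinomialNB`, and the class name, reviews and labels are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Iterable, List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z']+", text.lower())


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes that keeps only counts, so it can learn batch by batch."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha                       # Laplace smoothing strength
        self.class_docs: Counter = Counter()     # documents seen per label
        self.word_counts = defaultdict(Counter)  # word counts per label
        self.vocabulary: set = set()

    def partial_fit(self, batch: Iterable[Tuple[str, str]]) -> None:
        # Each (review, label) pair only increments counts; nothing is retrained from scratch.
        for text, label in batch:
            self.class_docs[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text: str) -> Optional[str]:
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                score += math.log((counts[word] + self.alpha) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = IncrementalNaiveBayes()
model.partial_fit([
    ("great service and friendly staff", "positive"),
    ("terrible delay and rude staff", "negative"),
])
before = model.predict("service was slow")         # misclassified at first
model.partial_fit([("service was slow", "negative")])  # corrected label fed back
after = model.predict("service was slow")
print(before, after)
```

The second `partial_fit` call is the "labelled right & fed again" step: the corrected example shifts the counts and the same review is then classified as negative.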
2. Fast Query Chatbots
2. Fast Query Chatbots
● Reduce manual effort of understanding the text query
● Waiting for BI has a long turnaround time
● We are trying to do this using a chatbot
● Get data ready for visualization.
● Historical data shows past trends
● Visualization of trend of text & SQL
● Text cannot be used directly for ML; it needs to be tokenized & vectorized
● Deep learning models with different layer configurations
● Choose the best model with the best hyper-parameters
● Model with the best config was chosen & put in deployment
● Convert natural language query to SQL Query
● Model is trained with historical text (feature) & SQL (target)
● The generated SQL was executed & Output was subjected to visualization libraries
● Anybody without database & infra understanding can get visualization in seconds
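The flow from question to query result can be sketched as follows. The `text_to_sql` function is a hypothetical lookup standing in for the trained deep learning model, and the `sales` table is made up; only the SQLite calls are real standard-library APIs.

```python
import sqlite3
from typing import List, Tuple


def text_to_sql(question: str) -> str:
    """Hypothetical stand-in for the trained text-to-SQL model."""
    lookup = {
        "total sales by region":
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    }
    return lookup[question.lower().strip()]


def answer(conn: sqlite3.Connection, question: str) -> List[Tuple]:
    sql = text_to_sql(question)
    # Generated SQL is untrusted: this sketch only executes read-only SELECT statements.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are executed")
    return conn.execute(sql).fetchall()


# Hypothetical in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 50.0), ("North", 25.0)],
)

rows = answer(conn, "Total sales by region")
print(rows)  # rows like these would be handed to a visualization library
```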
3. Preventing System Failure
Challenges of Adopting Machine Learning
Data & Security
● Volume of data - Machine learning on small datasets is often infeasible.
● Accessibility of data - Important data may not be accessible, or may be in an encrypted format.
Infrastructure for development
● Finding the best model is an iterative process.
● More experiments lead to a better model.
● Hyper-parameter Tuning
● Scaled infrastructure for developers is important.
Infrastructure for deployment
● Speedy Deployment.
● Easy deployment
● Fluctuating Demand.
● Need of Elastic infrastructure.
● Cost optimization.
Talent Acquisition
Talent Acquisition
● Upskill your current team ?
Overcoming the challenges - Getting started
Choose a Good Programming Language
Why Python makes life easy ?
● Easy to learn for ETL developers
● Integrates very well with other technologies
● Full-stack development -
○ Dashboards using Bokeh,
○ Web applications using Django,
○ Machine learning models using scikit-learn,
○ Scaling using PySpark
Choose appropriate Libraries
- Statistical Modeling & Data Processing
Choose appropriate Libraries
- Visualization
Choose appropriate Libraries
- Machine Learning or Deep Learning
Choosing between
Machine Learning or Deep learning
What is Deep Learning ?
● Specialized Learning Technique
● Rather than us choosing features for learning, this technique finds important derived features itself.
● Objective is to learn the best derived features for prediction.
● It is loosely inspired by the way our brain learns
● Very useful for natural language, computer vision, audio, video etc.
Do you always need Deep Learning ?
● More data is required for Deep Learning
● More Compute Power
● Models less interpretable
“Don’t kill a mosquito with a cannon ball”
Don’t use Deep Learning if you don’t need to
Cost optimization:
● Use Open Source alternatives
● Infrastructure optimization
● Don’t reinvent the wheel
Open source resources
Infrastructure Optimization
Monolithic or Serverless
Monolithic Infrastructure - Preallocated Infra
Model Training
● Developers request access whenever required
● Might incur delays in peak working hours.
● Idle in non-working hours
Model Interfacing
● Idle in non-peak hours.
● May fall short in spikes.
● Pay even if infra is not used
Serverless Infrastructure - Elastic Allocation
Model Training
● No pre-allocation
● Pay only for what you use
● Absolutely no idle time for infra
● No wait time for developers
Model Interfacing
● Allocate infra only when required
● Scales down during non-peak hours
● Improved customer experience even in peak hours
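A back-of-the-envelope comparison of the two models is sketched below. The hourly demand profile and the price are hypothetical, and using the same price per server-hour for both models is a simplifying assumption; in practice elastic capacity often costs more per unit, so the numbers only illustrate the effect of idle time.

```python
from typing import List


def preallocated_cost(servers: int, hours: int, price_per_server_hour: float) -> float:
    # Capacity is reserved for every hour, whether it is used or not.
    return servers * hours * price_per_server_hour


def elastic_cost(hourly_demand: List[int], price_per_server_hour: float) -> float:
    # Only the server-hours actually consumed are billed.
    return sum(hourly_demand) * price_per_server_hour


# Hypothetical 24-hour demand (servers needed per hour): idle at night, spike mid-day.
hourly_demand = [0] * 8 + [2] * 4 + [6] * 4 + [2] * 4 + [0] * 4
price = 0.5  # hypothetical price per server-hour

# Preallocated infra must be sized for the peak hour.
fixed = preallocated_cost(max(hourly_demand), len(hourly_demand), price)
elastic = elastic_cost(hourly_demand, price)
print(fixed, elastic)
```

Under these assumptions the preallocated setup pays for 144 server-hours while only 40 are used.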
Serverless Infrastructure Solutions
● OpenFaaS (open-source Functions as a Service)
● AWS Lambda
● Google Cloud Functions
● Azure Functions
Container based CI/CD for ML/AI application
Distributed Machine Learning using Spark
● Apache Spark is a distributed data processing framework.
● Many machine learning algorithms are implemented in Spark.
● Most of the APIs are similar to those of scikit-learn
● Scaled ETL & Machine Learning can be done using Spark
Other alternatives
Google Cloud AI
Q & A
Repositories
● https://github.com/zekelabs/machine-learning-for-beginners
● https://github.com/zekelabs/tensorflow-tutorial/
● Dog breed prediction - https://www.edyoda.com/resources/watch/54AEA4CDC35394F1183A9DD17AA47/
● Python learning course - https://www.edyoda.com/resources/videolisting/98/
Thank You !!!
Visit : www.zekeLabs.com for more details
THANK YOU
Let us know how we can help your organization upskill its employees to stay updated in the ever-evolving IT industry.
Get in touch:
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com

Moving from BI to AI : For decision makers

  • 1. zekeLabs Moving from BI to AI Learning made Simpler ! www.zekeLabs.com
  • 2. Agenda ● Workflow with common BI tools ● Limitations of BI tools ● Black box Introduction to Machine Learning ● Machine Learning, Deep Learning & AI ● Machine Learning Pipeline ● Adopting Machine Learning in your Product : Use cases ● Challenges in adopting Machine Learning ● Open Source Options ● Cost Optimization
  • 3. What and Why of BI ● Data is core to a business strategy that seeks competitive advantage ● BI has grown from a decision support system into a decisive factor ● BI gives a first-hand look at business health ● Answers the “What” and “Where” of the business ○ Critical for running operations ○ Important for making tactical decisions ● Analytics stages shown: Descriptive, Inquisitive, Predictive, Prescriptive
  • 4. How is BI done today? ● Diagram labels: OLTP Systems, ETL, OLAP Systems (Data Marts/Lakes), Reporting Server, Visualization, Web, Mobile, BI Administration, Iterations
  • 5. Current BI Implementation - 3 Approaches Web Technologies Based BI Self Service BI Hybrid BI
  • 6. BI Approaches - Self Service BI ● Self Service BI (Tableau, Qlikview, PowerBI, JasperSoft) ○ Pros : ■ Business-analyst friendly ■ Quick turnaround time ■ Quick changes and fixes are easy ○ Cons : ■ Fewer customization opportunities ■ Major changes require incremental development cycles ■ Least flexible from a developer perspective, since most solutions are out-of-the-box tools offered by third-party products
  • 7. BI Approaches - Web Technologies based BI ● Web Technologies based BI (D3, FusionCharts, HighCharts etc.) ○ Pros : ■ Excellent visuals possible ■ Fine grained customizations possible ■ No limit to kind of visualizations, integrations with third party libraries ■ Uses the modern web technologies HTML5, CSS3, Javascript based approach ○ Cons : ■ Big Turnaround time ■ Even simple customizations or fixes need to go through full development cycle ■ Highly skilled web development skill sets needed
  • 8. BI Approaches - Hybrid BI ● Hybrid BI (JasperSoft, Qlikview) ○ Pros : ■ Self service BI ■ Third-party integrations with a few supported charting libraries to extend capabilities ■ Libraries available to build on-the-fly (dynamic) reports ○ Cons : ■ Big turnaround time ■ Even simple customizations or fixes need to go through a full development cycle ■ Highly skilled web developers needed
  • 9. Current Gaps of BI Approaches ● The essence of BI is visual decision making ○ 2D visuals are the most a human can readily perceive ● Answers from a BI system come with considerable delay ○ Delay means lost money as well as lost opportunity ○ The business does not want to wait to fetch its own data ● Dashboards are static until the next change ○ User interactivity and interest drop significantly after the first few visits (especially for strategic dashboards) ● Change management is expensive (in cost, time and effort)
  • 10. Future of BI - Embrace AI 1. BI is about depth. EDA is still a forte of humans, while machines are good at repeating tasks at unparalleled speed 2. AI provides scale and speed that humans currently can’t offer 3. AI promises to shorten the delay between a business question and its answer 4. AI gives an enterprise the opportunity to move its BI talent toward learning new AI skills and spending quality time on data exploration rather than on repetitive BI work 5. AI offers innovative solutions for user interactivity that can make a dashboard as easy to use as a personal assistant (voice- and text-driven BI)
  • 12. What is not Machine Learning ? ● Rule Based Approach ● Legacy Systems
  • 13. Learning Algorithm What is Machine Learning ? ● Solve prediction problem Input Data ● Logic is learned from examples & not by rules Training Data Prediction Function or Trained Model
  • 14. Types of Machine Learning ● Supervised - Task Driven ● Unsupervised - Data Driven ● Reinforcement - Environment Driven
  • 15. Spam Mail Detection ● Input - Mail ● Output - Spam or Ham ● Supervised Machine Learning, ● Binary Classification Problem
  • 16. Predicting Lift Failure ● Input - Sensor Data ● Output - Failure time ● Supervised Machine Learning ● Regression Problem
  • 17. Predicting Insurance Amount ● Input - Accident details ● Output - Insurance amount ● Supervised Machine Learning ● Regression Problem
  • 18. Medical Diagnosis ● Input - Patient Synopsis (fever, temperature, BP, etc.) ● Output - Diagnosis ● Supervised Machine Learning ● Multi-class Classification Problem
  • 19. Question - What is common between them ?
  • 20. Market Segmentation ● Input - Customer Details ● Output - Clusters ● Unsupervised Machine Learning, ● Clustering Problem
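To make the clustering idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) on a single customer attribute. The spend figures and starting centroids are invented for illustration, and a real segmentation would normally use several features and a library implementation rather than this hand-written loop.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on scalars, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to the index of its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical annual spend per customer (no labels are provided).
annual_spend = [120, 150, 130, 900, 950, 1000]
centroids, clusters = kmeans_1d(annual_spend, centroids=[100, 1000])
print(clusters)
```

No labels are supplied at any point: the two groups emerge from the data alone, which is what makes this an unsupervised problem.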
  • 21. Robot playing Football ● Input - Player information, Rewards ● Output - Action to score ● Reinforcement Learning
  • 23. ML, DL & AI
  • 26. Machine Learning Pipeline - Business Understanding ● Business understanding means being clear about what you are trying to achieve. ● Machine learning is rarely effective on very small datasets. ● Consolidate the data pipeline to channel a continuous flow of data. ● Sources include web scraping, data lake access, REST, etc.
  • 27. Machine Learning Pipeline - Data Wrangling ● Production data is never clean. ● Making it ready for the next stage takes major effort (around 70% of total effort). ● Data is transformed and mapped from its raw format into a format ready for the next stage.
  • 28. Machine Learning Pipeline - Data Visualization ● Visualization makes it easy to grasp difficult concepts ● Find useful pattern in the data ● Interactively drill down into charts for deeper details
  • 29. Machine Learning Pipeline - Data Preprocessing ● Feature Extraction produces vectors (fixed-length arrays of numbers) ● Inputs: Text documents, Image files, CSV, Audio, Video, Time Series data, many more ...
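As one illustration of turning raw input into fixed-length vectors, the sketch below builds a bag-of-words representation for a few short texts. The sample documents and the whitespace tokenizer are assumptions made for this example; the slides do not prescribe a particular feature extraction method.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a fixed column index."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(document, vocabulary):
    """Return a fixed-length count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical sample texts.
docs = ["free prize inside", "meeting at noon", "free free meeting"]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]
print(vectors)
```

Every document, whatever its length, ends up as a list with one slot per vocabulary entry, which is the shape a learning algorithm expects.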
  • 30. Machine Learning Pipeline - Model Training ● Diagram: a Learning Algorithm (Regression / Trees / SVM / Naive Bayes / Neural Networks / ...) produces a Prediction Function or Trained Model
  • 31. Machine Learning Pipeline - Learning Algorithms ● Linear Regression ● Logistic Regression ● Naive Bayes ● Nearest Neighbors ● Decision Trees ● Ensemble Methods ● Clustering ● Support Vector Machines ● Neural Networks ● CNN ● RNN ● GAN
  • 33. Machine Learning Pipeline - Model Validation ● Training with different learning methods gives different trained models. ● Each model also has a large space of possible configurations (hyper-parameters). ● Finding the best model among all candidates, and the best configuration for it, is the job of Model Validation. ● If results are not satisfactory, one has to go back up the chain & fix a few things.
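The selection step described above can be sketched as a search over one hyper-parameter, scored on held-out data. This example uses a hand-written nearest-neighbors classifier and invented one-dimensional data purely for illustration; the classifier, the data, and the candidate values of k are all assumptions, not taken from the slides.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points whose predicted label is correct."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)


def select_k(train, validation, candidates):
    """Score every candidate k on the validation set and return the best one."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best = max(scores, key=scores.get)  # ties resolve to the earliest candidate
    return best, scores


# Hypothetical data: "low" below roughly 5.5, "high" above, plus one noisy label.
train = [(1, "low"), (2, "low"), (3, "low"), (4, "low"), (4.5, "high"),
         (5, "low"), (6, "high"), (7, "high"), (8, "high"), (9, "high")]
validation = [(4.4, "low"), (4.6, "low"), (2.5, "low"), (7.5, "high"), (6.5, "high")]

best_k, scores = select_k(train, validation, candidates=[1, 3, 5])
print(best_k, scores)
```

With k = 1 the noisy training point misleads the model near 4.5, while larger k averages it out; the validation score is what reveals the difference.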
  • 34. Machine Learning Pipeline - Deployment ● Diagram labels: Trained Model Or Interface, RESTful Interface, Model Consumers
  • 35. Business Intelligence vs Machine Learning Image Sourced from DataRobot ● BI is about deriving not-so-complex pattern from historical data ● ML can find complex patterns in high volume of data ● ML is about predicting future based on past data ● ML can be automated
  • 37. Break - Let’s meet in 5 minutes.
  • 38. Adopting Machine Learning - Real Stories
  • 40. 1. Customer Service Industry ● Business understanding: reduce the manual effort of classifying reviews; channel data from the web server to the analytics engine ● Data wrangling: get data ready for visualization; historical data shows past trends ● Visualization: visualize the trend ● Preprocessing: text needs to be tokenized & vectorized ● Training: different models were trained (Naive Bayes, SGD Classifier) ● Validation: choose the best model with the best hyper-parameters ● Deployment: Naive Bayes (MultinomialNB) was chosen & deployed ● Manually labeled data is used to train the model ● Labels are the target & reviews are the feature data ● MultinomialNB supports batch training, allowing incremental learning ● Any misclassification by the model is relabeled correctly & fed back in
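The feedback loop in this use case (train, catch a misclassification, relabel it, feed it back) can be sketched with a small multinomial Naive Bayes whose counts are updated batch by batch. This is a pure-Python stand-in written for illustration, not the project's actual code; scikit-learn's MultinomialNB offers a partial_fit method for the same purpose. The class name, the reviews, and the labels below are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes whose counts can be updated batch by batch."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing strength
        self.class_counts = Counter()             # documents seen per label
        self.token_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()

    def partial_fit(self, reviews, labels):
        """Add one batch of labeled reviews to the running counts."""
        for text, label in zip(reviews, labels):
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        """Return the label with the highest smoothed log-probability."""
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = IncrementalNaiveBayes()
model.partial_fit(
    ["great service", "fast delivery", "terrible support", "slow refund"],
    ["positive", "positive", "negative", "negative"],
)

review = "delivery was late"
before = model.predict(review)             # the word "delivery" pulls it positive
model.partial_fit([review], ["negative"])  # a human corrects the label
after = model.predict(review)
print(before, after)
```

Because only counts are stored, the corrected example is absorbed without retraining on the full history, which is the property the slide attributes to batch training.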
  • 41. 2. Fast Query Chatbots
  • 42. 2. Fast Query Chatbots ● Business understanding: reduce the manual effort of understanding text queries; waiting for BI has a long turnaround time; we are trying to address this with a chatbot ● Data wrangling: get data ready for visualization; historical data shows past trends ● Visualization: visualize the trend of text & SQL ● Preprocessing: raw text cannot be used for ML directly; it needs to be tokenized & vectorized ● Training: deep learning models with different layer configurations ● Validation: choose the best model with the best hyper-parameters ● Deployment: the model with the best configuration was chosen & deployed ● Converts a natural language query to a SQL query ● The model is trained on historical text (feature) & SQL (target) ● The generated SQL is executed & the output is passed to visualization libraries ● Anybody without database & infrastructure knowledge can get a visualization in seconds
  • 44. Challenges of Adopting Machine Learning
  • 45. Data & Security ● Volume of data - Machine learning on small datasets is often infeasible. ● Accessibility of data - Important data may be inaccessible or stored in an encrypted format.
  • 46. Infrastructure for development ● Finding the best model is an iterative process. ● More experiments lead to a better model. ● Hyper-parameter tuning. ● Scaled infrastructure for developers is important.
  • 47. Infrastructure for deployment ● Speedy deployment. ● Easy deployment. ● Fluctuating demand. ● Need for elastic infrastructure. ● Cost optimization.
  • 49. Talent Acquisition ● Upskill your current team ?
  • 50. Overcoming the challenges - Getting started
  • 51. Choose a Good Programming Language
  • 52. Why Python makes life easy ? ● Easy to learn for ETL developers ● Integrates very well with other technologies ● Full-stack development - ○ Dashboard using bokeh, ○ Web application using django, ○ Machine learning models using scikit, ○ Scaling using PySpark
  • 53. Choose appropriate Libraries - Statistical Modeling & Data Processing
  • 55. Choose appropriate Libraries - Machine Learning or Deep Learning
  • 57. What is Deep Learning ? ● Specialized learning technique ● Rather than us choosing features for learning, this technique finds important derived features itself. ● The objective is to learn the best derived features for prediction. ● It is loosely inspired by the way our brain learns ● Very useful for natural language, computer vision, audio, video etc.
  • 58. Do you always need Deep Learning ? ● More data is required for Deep Learning ● More Compute Power ● Models less interpretable “Don’t kill a mosquito with a cannon ball” Don’t use Deep Learning if you don’t need to
  • 59. Cost optimization: ● Use Open Source alternatives ● Infrastructure optimization ● Don’t reinvent the wheel
  • 62. Monolithic Infrastructure - Preallocated Infra ● Model Training ○ Developers request access whenever required ○ May incur delays during peak working hours ○ Idle during non-working hours ● Model Interfacing ○ Idle during non-peak hours ○ May fall short during spikes ○ You pay even when the infra is not used
  • 63. Serverless Infrastructure - Elastic Allocation ● Model Training ○ No preallocation ○ Pay only for what you use ○ No idle time for infra ○ No wait time for developers ● Model Interfacing ○ Allocate infra only when required ○ Scales down during non-peak hours ○ Improved customer experience even during peak hours
  • 64. Serverless Infrastructure Solutions ● Open Function as a Service (OpenFaas) ● AWS Lambda ● Google Cloud Function ● Azure Function
  • 65. Container based CI/CD for ML/AI application
  • 66. Distributed Machine Learning using Spark ● Apache Spark is a distributed data processing framework. ● Many machine learning algorithms are implemented in Spark. ● Many of its APIs resemble those of scikit-learn ● Scaled ETL & Machine Learning can be done using Spark
  • 68. Q & A
  • 69. Repositories ● https://github.com/zekelabs/machine-learning-for-beginners ● https://github.com/zekelabs/tensorflow-tutorial/ ● Dog breed prediction - https://www.edyoda.com/resources/watch/54AEA4CDC35394F1183A9DD17AA47/ ● Python learning course - https://www.edyoda.com/resources/videolisting/98/
  • 71. THANK YOU ● Visit www.zekeLabs.com for more details ● Let us know how we can help your organization upskill its employees to stay current in the ever-evolving IT industry. ● Get in touch: www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com