Department of Statistics
The Maharaja Sayajirao University of Baroda
Agenda
 What is Data Science?
 What does Data Science promise for your business?
 Investment in Data Science and ROI
 Data Science Process
 Data Science Roles
 Infrastructure Requirements
 Data Science Tools and Techniques
 Where do I begin?
 Developing Data Science Culture
 Questions
What is Data Science?
Everything concerning Data
is in the purview of Data Science
What is Data Science?
Data science is a young inter-disciplinary field that uses
scientific principles, methods, processes, algorithms and
systems to extract knowledge and insights from data.
 Data science involves Statistics at its core.
 Data Science extends the field of statistics to
incorporate advances in computing with data
 Apart from Statistics, Computer Science is another
major discipline that plays a major role in capturing,
managing and sharing data.
 It is a driving force behind innovations in almost all
disciplines of Science.
 This new approach is termed Data driven science.
Data Science Discipline
Data Science Profession
The Data Science promise
Top Objectives of Successful Businesses
 Increase profitability
 Ensure customer satisfaction
 Optimize productivity
 Make your employees happy
 Social and public responsibility
Businesses traditionally rely on intuition, creativity and
experience to fulfill these objectives.
This has been reflected for decades in the HiPPO (highest
paid person's opinion) phenomenon.
The Data Science promise
Without Data, you are just another person with an opinion
– W. Edwards Deming
Although intuition, experience, etc. are important, they work
much better when supported with data.
Data Science helps you to
 Understand your customers better by
 Learning about their needs
 Their struggles, their motivations, their habits and their
relationships to your product or service.
 Use this understanding to create a better product and/or
service and turn that into profit.
The Data Science promise
Data science helps you to
 See clearly how your business performs.
 Understand dynamics of your business
 Improve business processes
 Discover new opportunities / products / services that
your customers need.
 Discover new audiences for your current products /
services.
and much more...
The Data Science promise
If you manage to collect the right data and use it well,
 You will be able to make better decisions more quickly
and more easily.
 That will lead to a better product, happier
customers and eventually more revenue.
That’s what business data science is all about.
If you are among the first in your domain to embrace
data science, you can outsmart your competition.
Signs that You Should Invest in Data
Science
 Your marketing budgets are growing, but your sales
numbers are not.
 Your company is struggling with personalization
 It’s taking too long for the sales team to score leads
 You are unable to analyze your marketing ROI
 You want the competitive edge without significantly
increasing your budget
 Your competitors are already investing in Data Science
Data Science Investments
Human Resource
According to an estimate, good teams spend about 5% of
their total working hours with data and quantitative
research.
 So, if you are working alone, that's around 2-3 hours a
week.
 If you are a team of 50, then ideally you should have
one or two full-time dedicated people for Data Science
projects.
 As your business grows, you may set up a Data Science
division.
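The rule of thumb above can be turned into a quick calculation. This is a minimal sketch: the 40-hour working week (`HOURS_PER_WEEK`) and the function names are assumptions made for illustration, not part of the original estimate.

```python
# Rough staffing estimate from the 5% rule of thumb quoted above.
# HOURS_PER_WEEK is an assumption (a 40-hour week); it is not from the slides.
DATA_SHARE = 0.05
HOURS_PER_WEEK = 40


def weekly_data_hours(team_size: int, hours_per_week: float = HOURS_PER_WEEK) -> float:
    """Total hours per week the whole team would spend on data work."""
    return team_size * hours_per_week * DATA_SHARE


def full_time_equivalents(team_size: int) -> float:
    """The same effort expressed as a number of full-time people."""
    return team_size * DATA_SHARE


print(weekly_data_hours(1))       # one person working alone, hours per week
print(full_time_equivalents(50))  # team of 50, in full-time people
```

For one person this gives about 2 hours a week, and for a team of 50 about 2.5 full-time people, which is roughly in line with the figures on this slide.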
Data Science Investments
Data Infrastructure
A data infrastructure is a digital infrastructure for
promoting data sharing and consumption.
 It includes data assets, hardware, software and
processes.
 It includes data ingestion and storage infrastructure
 It includes data management, data security and data
privacy.
Data Science Investments
Analytics Infrastructure
Much of data science work involves computationally
intensive experiments.
 Thus, Data scientists should be able to access large
machines/ specialized hardware for running
experiments or doing exploratory analysis.
 They should also be able to easily use burst/elastic
compute on demand.
 Data Scientists need software support for
communicating their findings to business
stakeholders.
Cloud Analytics
On-premises analytics solutions have challenges
 Cost of infrastructure
 Need for specialized skills
 Time required to configure and maintain these
systems
 Nonscalability
Cloud Analytics provides a solution. Some major players:
 IBM Cognos Analytics
 Microsoft Azure Stream Analytics
 AWS Analytics
Success Stories
 Southwest Airlines saved $100 million by reducing the
time its planes stood idle on the airstrip.
 UPS, a logistics company, saved 38 million gallons of
fuel by optimizing its fleet.
 The Internal Revenue Service saved $2 billion in tax
dollars by improving its ability to detect identity fraud
and improper payments.
 Croma, a subsidiary of Tata Sons, used data science to
build a 360° view of its users and used it to give a
personalized shopping experience to its online
customers. Their conversions have improved
significantly.
And many more…
With Data in your possession,
You are sitting on a gold mine…
However, if you don't know this fact OR don’t know how
to extract it, you won't be able to benefit from it.
Data Science Process
The diagram shows the major phases of the data science
process, following the CRISP-DM methodology.
Data Science Process
The six steps of a data science project
 Data Collection
 Data Storage
 Data Preparation
 Data Utilization
   Business Analytics
   Predictive Analytics
   Developing Data Product
 Communication, data visualization
 Data-driven Decision
Data Collection
This is where many businesses fail. Too many companies collect
incomplete, unreliable data and everything they do after that is just
messed up.
Proper tracking and collection of data, and ensuring its quality is
crucial for every business doing data science.
What to collect?
 It is important to decide the details of the data that must be
collected/ captured.
 The general idea is to collect everything you can – because the
value of data can be realized any time in future.
 However, the more data you capture, the more engineering time
you need to allocate to implement it, the slower your business
processes will be, the more complex your data infrastructure
becomes, and so on…
Also consider legal and ethical aspects!
Data Wrangling
Data wrangling is all about getting the data into the right
form that is suitable for feeding into the modeling and
visualization stages.
This activity involves a variety of tasks, from discovering
data to acquiring it and transforming it into a form
that is ready to be processed.
The tasks following the data acquisition are also referred
to by different terms such as Data Munging or Data
Preprocessing.
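As a small illustration of what wrangling looks like in practice, the sketch below cleans a tiny, made-up export using only the Python standard library. The records, field names and cleaning rules are all hypothetical; a real project would pick rules that fit its own data.

```python
# Hypothetical raw export: inconsistent casing, stray whitespace,
# a missing value and a duplicate row. None of this comes from the slides.
raw_rows = [
    {"customer": " Asha ", "city": "vadodara", "amount": "1,200"},
    {"customer": "asha", "city": "Vadodara", "amount": "1200"},
    {"customer": "Ravi", "city": "Surat ", "amount": ""},
    {"customer": "Meena", "city": "SURAT", "amount": "850.50"},
]


def clean_row(row):
    """Normalise one record; return None if it cannot be used."""
    amount_text = row["amount"].replace(",", "").strip()
    if not amount_text:
        return None  # drop rows with no amount rather than guess a value
    return {
        "customer": row["customer"].strip().title(),
        "city": row["city"].strip().title(),
        "amount": float(amount_text),
    }


def wrangle(rows):
    """Clean every row and remove exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for row in rows:
        record = clean_row(row)
        if record is None:
            continue
        key = (record["customer"], record["city"], record["amount"])
        if key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


tidy = wrangle(raw_rows)
for record in tidy:
    print(record)
```

Dropping the row with the missing amount is only one possible choice. Depending on the question being asked, it may be better to keep such rows and flag them.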
Big Data
Big data is like teenage sex: everyone talks about it,
nobody really knows how to do it, everyone thinks
everyone else is doing it, so everyone claims they are
doing it.
- Dan Ariely
What is Big data?
 Big data is a data set whose volume is beyond the ability of
commonly used hardware and software tools to capture, manage,
and process the data within a tolerable execution time.
 Such data sets are gathered by information-sensing mobile
devices, remote sensing technologies, software logs, cameras,
microphones, RFID readers, and many such devices.
 As a result, such datasets are continuously growing in size.
 By 2020, there will be around 40 trillion gigabytes of data
 90% of the data in the world today was created within just the
past two years.
 Internet users generate about 2.5 quintillion bytes (2.5 million
terabytes) of data each day
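The unit conversions behind these figures can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where one terabyte is 10^12 bytes, and the short-scale quintillion (10^18). The variable names are illustrative.

```python
# Decimal (SI) units: 1 gigabyte = 10**9 bytes, 1 terabyte = 10**12 bytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_TERABYTE = 10**12
BYTES_PER_ZETTABYTE = 10**21

daily_bytes = 25 * 10**17  # 2.5 quintillion bytes, kept as an exact integer
daily_terabytes = daily_bytes // BYTES_PER_TERABYTE
print(daily_terabytes)     # 2500000, i.e. 2.5 million terabytes

total_gigabytes = 40 * 10**12  # 40 trillion gigabytes
total_zettabytes = total_gigabytes * BYTES_PER_GIGABYTE // BYTES_PER_ZETTABYTE
print(total_zettabytes)    # 40
```

So 2.5 quintillion bytes is indeed 2.5 million terabytes, and 40 trillion gigabytes is the same quantity as 40 zettabytes.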
Twitter
 500 million tweets per day
Facebook
 Facebook generates 4 petabytes of data per day.
 Users generate 4 million likes every minute.
 350 million photos are uploaded per day.
Instagram
 The Like button is hit an average of 4.2 billion times/ day.
WhatsApp
 In 2018, WhatsApp users sent 65 billion messages per
day
Almost every field
Some Examples
Characteristics of big data (3V’s)
In a 2001 research report, Gartner analyst Doug Laney
defined data growth challenges (and opportunities) as being
three-dimensional: increasing volume, velocity, and variety.
Data volume:
 This is the primary attribute of big data. Most people
define big data in terms of multiple terabytes, sometimes
petabytes.
Data variety
 Big data is coming from a greater variety of sources than
ever before. Many of the newer ones are Web sources,
including logs, click-streams, and social media.
Data velocity
 Big data can also be described by its velocity, that is, the
rate at which new data is generated.
Data Analysis
Data Analysis is the process of extracting value from data.
This is where data science gets exciting. It’s a creative process.
 Ask the right questions
It is important to ask the right questions. They usually come
from the management or other colleagues, who may
already have suspicions based on their experience.
 Do Qualitative research
It’s important to understand the things concerning
business and its customers in detail. This can be achieved
through qualitative research, which in turn gives direction
to the useful investigations through data.
Three Major Business Applications
 Business Analytics
It answers the questions of “what has happened in the
past?” and “where are we now?”
E.g. reporting, measuring retention, finding the right user
segments, funnel analysis, etc.
 Predictive Analytics
It answers the question, “what will happen in the future?”
E.g. early warning, predicting the marketing budget you will
need in the next quarter, etc.
 Data (Based) Product
A product that is built, and works using your data.
E.g. recommendation systems, image recognition, voice
recognition, etc.
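To make the business analytics case concrete, here is a minimal sketch of a funnel analysis, one of the examples named above. The event log, the step names and the `funnel_counts` helper are hypothetical. A real analysis would read events from the company's own tracking data.

```python
# Hypothetical event log: (user_id, step). Step names are assumptions.
FUNNEL_STEPS = ["visit", "signup", "purchase"]
events = [
    (1, "visit"), (1, "signup"), (1, "purchase"),
    (2, "visit"), (2, "signup"),
    (3, "visit"),
    (4, "visit"), (4, "signup"), (4, "purchase"),
]


def funnel_counts(events, steps):
    """Count users who reached each step and all the steps before it."""
    reached = {step: set() for step in steps}
    for user_id, step in events:
        if step in reached:
            reached[step].add(user_id)
    counts = []
    survivors = None
    for step in steps:
        survivors = reached[step] if survivors is None else survivors & reached[step]
        counts.append((step, len(survivors)))
    return counts


counts = funnel_counts(events, FUNNEL_STEPS)
# Pair each step with the one before it to get step-to-step conversion.
for (step, n), (_, prev) in zip(counts, [counts[0]] + counts[:-1]):
    rate = n / prev if prev else 0.0
    print(f"{step:<10}{n:>3} users  {rate:.0%} of previous step")
```

With this toy data, four users visit, three sign up and two purchase, so the largest drop is between signup and purchase. That is the kind of "where are we now?" answer business analytics is meant to give.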
 SafetiPin is a map-based mobile phone application, which
leverages the power of big data to make our communities
and cities safer for women.
 It provides safety-related information collected through
crowdsourcing.
 The app captures data on 9 parameters (Lighting,
openness, visibility, people density, security in the area,
walk path, transportation, gender diversity, feeling in the
area), and uses it to compute and provide safety score, the
information on personal vulnerability to crime, in every
pocket of the city.
 The app uses this score and integrates with big data sources
such as Google Maps to recommend the Safest Route, that is,
the best possible route in terms of safety.
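The idea of combining nine parameters into a score and then choosing a route can be sketched as follows. The slides do not describe how SafetiPin actually computes its score, so the 0 to 5 rating scale, the plain average and the route data below are assumptions for illustration only.

```python
from statistics import mean

# The nine parameters named above. The 0-5 rating scale and the plain
# average are assumptions for illustration, not the app's actual formula.
PARAMETERS = [
    "lighting", "openness", "visibility", "people_density", "security",
    "walk_path", "transportation", "gender_diversity", "feeling",
]


def safety_score(ratings):
    """Average the nine ratings for one location (higher is safer)."""
    missing = [p for p in PARAMETERS if p not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return mean(ratings[p] for p in PARAMETERS)


def safest_route(routes):
    """Pick the route whose locations have the highest average score."""
    return max(routes, key=lambda name: mean(routes[name]))


# Two hypothetical locations: one rated 4 everywhere, one poorly lit.
well_lit = dict.fromkeys(PARAMETERS, 4)
dark_lane = {**dict.fromkeys(PARAMETERS, 2), "lighting": 0}

routes = {
    "main_road": [safety_score(well_lit), safety_score(well_lit)],
    "shortcut": [safety_score(well_lit), safety_score(dark_lane)],
}
print(safest_route(routes))
```

A production system would likely weight the parameters differently and take distance into account as well. The sketch only shows how per-location scores can drive a route recommendation.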
Data Communication
This is the step where most data science projects fail.
To reap the benefits of Data Science, effective
communication of the findings is crucial.
 It is necessary to build a culture where people can
communicate and use data. For this, everyone at your
company needs to be involved.
 Business people should also educate data scientists by
helping them to create and deliver better presentations.
 Communication should be as simple as it can be.
 No fancy scientific words
 No complicated charts
What people do you need in your team?
Your data science team should feature
 Best Data Engineers,
 Best software developers, and
 Best statisticians
They need to have domain knowledge to know the actual
business application of their data projects.
Data Science Roles: Data Engineer
The data engineer is someone who develops, constructs,
tests and maintains data architectures, such as
databases, data warehouses, data lakes and large-scale
processing systems.
Data engineers manage data of all sizes, and types. They
develop, deploy, manage, and optimize data pipelines
and infrastructure to transform and transfer data to data
scientists for querying.
Skills needed: SQL, databases, data warehousing,
ETL, Big data tools, building APIs
Data Science Roles: Data Analyst
Data analysts perform the following tasks
 Data wrangling
 Create data visualizations and dashboards
 Analyze data to discover interesting trends in the data
 Presenting the results of analysis to business clients or
internal teams
 Help other stakeholders to optimize their data utilization
Skills needed: Programming skills (SAS, R, Python),
statistical and mathematical skills, data wrangling, data
visualization tools like Tableau/Power BI
Data Science Roles: Data Scientist
A data scientist is a specialist having expertise in
Statistics and developing models, including predictive
models and machine learning models.
 Data scientists can tackle more open-ended questions
by leveraging their knowledge of advanced statistics.
 Data scientists bring an entirely new approach and
perspective to understanding data
Skills needed: Programming skills (SAS, R, Python),
statistical and mathematical skills, storytelling and data
visualization, Hadoop, SQL, machine learning, Big data
analytics.
Data Science projects can fail
Yes, that’s true!
Here are some of the reasons.
 Not every manager is ready for this change.
Even a very well-executed data project can fail, just
because someone’s feelings or ego is hurt.
 Answering the wrong question
 Failure to integrate into business operations
 Stakeholders disengaged
 Benefits don’t justify the costs
Developing Data Science culture
Failures can be prevented by establishing a data-driven
company culture early on. As the company size
increases, it becomes harder to make the organization
data-driven.
 It’s important that the managers develop the right
mindset.
 It is important that everyone in the organization
understands the importance of data science.
Data professionals should hold frequent presentations
about their recent findings.
Data Strategy
Why Data Strategy?
If you don't have a data strategy, you won't have enough
information to make the right decisions. Having data
strategy is crucial to become a data-driven organization.
Without it
 you will waste money on the wrong marketing
campaigns
 you will have wrong product development plans
Where do I begin?
It is recommended to start with the development of a Data Strategy.
For this, the following questions need to be answered
 What are the right metrics to focus on? And how do you figure them out?
 How should you collect and store the data? Which tools should you use?
 Can you trust your data? And how can you make it trustworthy?
 How can you communicate the data in your organization efficiently?
Start with a simple data project that answers the basic questions
about your business.
Subsequently, as you recognize your customers’ needs, you may
initiate other projects such as Predictive modelling, and Machine
learning
Pick your first data project
Develop and use the Prioritization matrix.
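One way to read the prioritization matrix is as a two-axis scoring of candidate projects by feasibility and business value. The sketch below is a hypothetical example: the project names, the 1 to 5 scores, the threshold and the quadrant labels are assumptions, not taken from the slides.

```python
# Hypothetical candidate projects scored 1 (low) to 5 (high) on the two
# axes of the matrix. Names, scores and the threshold are assumptions.
projects = {
    "Weekly sales report": {"feasibility": 5, "value": 4},
    "Churn prediction model": {"feasibility": 2, "value": 5},
    "Logo colour A/B test": {"feasibility": 4, "value": 1},
    "Real-time recommendation engine": {"feasibility": 1, "value": 3},
}
THRESHOLD = 3


def quadrant(scores, threshold=THRESHOLD):
    """Place a project in one of the four cells of the matrix."""
    feasible = scores["feasibility"] >= threshold
    valuable = scores["value"] >= threshold
    if feasible and valuable:
        return "do first"
    if valuable:
        return "plan for later"
    if feasible:
        return "nice to have"
    return "avoid for now"


# Rank by the product of the two scores, highest first.
ranked = sorted(
    projects,
    key=lambda name: projects[name]["feasibility"] * projects[name]["value"],
    reverse=True,
)
for name in ranked:
    print(f"{name}: {quadrant(projects[name])}")
```

In this toy example the simple report lands in the "do first" cell, which matches the advice to begin with a project that is both feasible and valuable.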
Your first data project
Your first data project should be a simple project (feasible)
that aims at understanding your own business and your
customers better (high business value).
In other words, start by investing in business analytics and
simple reports.
This project answers the basic questions about your business,
such as
 Who prefers what and why?
 How to win customer loyalty?
 Why did a particular product fail?
And so on …
Questions?
You can write to me
kalamkar.vipul-stat@msubaroda.ac.in
Thanks!

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 

Embracing data science

  • 1. Department of Statistics, The Maharaja Sayajirao University of Baroda
  • 2. Agenda
      - What is Data Science?
      - What does Data Science promise for your business?
      - Investment in Data Science and ROI
      - Data Science Process
      - Data Science Roles
      - Infrastructure Requirements
      - Data Science Tools and Techniques
      - Where do I begin?
      - Developing Data Science Culture
      - Questions
  • 3. What is Data Science? Everything concerning Data is in the purview of Data Science
  • 4. What is Data Science?
    Data science is a young interdisciplinary field that uses scientific principles, methods, processes, algorithms and systems to extract knowledge and insights from data.
      - Data science has Statistics at its core.
      - Data science extends the field of statistics to incorporate advances in computing with data.
      - Apart from Statistics, Computer Science is the other major discipline; it plays a major role in capturing, managing and sharing data.
      - Data science is a driving force behind innovations in almost all disciplines of science.
      - This new approach is termed data-driven science.
  • 7. The Data Science promise
    Top objectives of successful businesses:
      - Increase profitability
      - Ensure customer satisfaction
      - Optimize productivity
      - Make your employees happy
      - Social and public responsibility
    Businesses traditionally rely on intuition, creativity and experience to fulfill these objectives. For decades this has shown up as the HiPPO (highest-paid person's opinion) phenomenon.
  • 8. The Data Science promise
    "Without data, you are just another person with an opinion." – W. Edwards Deming
    Although intuition, experience, etc. are important, they work much better when supported by data.
    Data science helps you to
      - Understand your customers better by learning about
          - their needs
          - their struggles, their motivations, their habits and their relationship to your product or service.
      - Use this understanding to create a better product and/or service, and turn that into profit.
  • 9. The Data Science promise
    Data science helps you to
      - See clearly how your business performs.
      - Understand the dynamics of your business.
      - Improve business processes.
      - Discover new opportunities, products and services that your customers need.
      - Discover new audiences for your current products and services.
    And much more...
  • 10. The Data Science promise
    If you manage to collect the right data and use it well,
      - You will be able to make better decisions more quickly and more easily.
      - That will lead to a better product, happier customers and eventually more revenue.
    That's what business data science is all about. If you are among the first in your domain to embrace data science, you can outsmart your competition.
  • 11. Signs that You Should Invest in Data Science
      - Your marketing budgets are growing, but your sales numbers are not.
      - Your company is struggling with personalization.
      - It's taking too long for the sales team to score leads.
      - You are unable to analyze your marketing ROI.
      - You want a competitive edge without significantly increasing your budget.
      - Your competitors are already investing in Data Science.
  • 12. Data Science Investments: Human Resource
    According to one estimate, good teams spend about 5% of their total working hours on data and quantitative research.
      - So, if you are working alone, that's around 2-3 hours a week.
      - If you are a team of 50, then ideally you should have one or two full-time people dedicated to Data Science projects.
      - As your business grows, you may set up a Data Science division.
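The 5% rule of thumb on slide 12 can be checked with a little arithmetic. The sketch below is only an illustration: the 40-hour working week and the function names are assumptions, not figures from the slides.

```python
# Minimal sketch of the "5% of working hours" rule of thumb from slide 12.
# Assumption (not from the slides): a standard 40-hour working week.

def data_hours_per_week(team_size, hours_per_week=40.0, data_share=0.05):
    """Total weekly hours the team would spend on data work."""
    return team_size * hours_per_week * data_share


def full_time_equivalents(team_size, data_share=0.05):
    """Number of full-time people that the same share corresponds to."""
    return team_size * data_share


if __name__ == "__main__":
    print(data_hours_per_week(1))     # hours per week for a solo founder
    print(full_time_equivalents(50))  # dedicated people for a team of 50
```

With these assumptions a solo worker lands at 2 hours a week (3 hours for a 60-hour week), and a team of 50 at 2.5 full-time equivalents, which is in line with the "2-3 hours" and "one or two people" on the slide.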
  • 13. Data Science Investments: Data Infrastructure
    A data infrastructure is a digital infrastructure for promoting data sharing and consumption.
      - It includes data assets, hardware, software and processes.
      - It includes data ingestion and storage infrastructure.
      - It includes data management, data security and data privacy.
  • 14. Data Science Investments: Analytics Infrastructure
    Much of data science work involves computationally intensive experiments.
      - Thus, data scientists should be able to access large machines or specialized hardware for running experiments or doing exploratory analysis.
      - They should also be able to easily use burst/elastic compute on demand.
      - Data scientists need software support for communicating their findings to business stakeholders.
  • 15. Cloud Analytics
    On-premises analytics solutions have challenges:
      - Cost of infrastructure
      - Need for specialized skills
      - Time required to configure and maintain these systems
      - Lack of scalability
    Cloud analytics provides a solution. Some major players:
      - IBM Cognos Analytics
      - Microsoft Azure Stream Analytics
      - AWS Analytics
  • 16. Success Stories
      - Southwest Airlines saved $100 million by reducing the time its planes stood idle on the airstrip.
      - UPS, a logistics company, saved 38 million gallons of fuel by optimizing its fleet.
      - The Internal Revenue Service saved $2 billion in tax dollars by improving its ability to detect identity fraud and improper payments.
      - Croma, a subsidiary of Tata Sons, used data science to build a 360° view of its users and give its online customers a personalized shopping experience; its conversions improved significantly.
    And many more…
  • 17. With data in your possession, you are sitting on a gold mine… However, if you don't know this fact, or don't know how to extract the value, you won't be able to benefit from it.
  • 18. Data Science Process
    The diagram shows the major phases of the data science process, following the CRISP-DM methodology.
  • 19. Data Science Process
    The six steps of a data science project:
      - Data Collection
      - Data Storage
      - Data Preparation
      - Data Utilization
          - Business Analytics
          - Predictive Analytics
          - Developing Data Product
      - Communication, data visualization
      - Data-driven Decision
  • 20. Data Collection
    This is where many businesses fail. Too many companies collect incomplete, unreliable data, and everything they do after that is compromised. Proper tracking and collection of data, and ensuring its quality, is crucial for every business doing data science.
    What to collect?
      - It is important to decide the details of the data that must be collected or captured.
      - The general idea is to collect everything you can, because the value of data may be realized at any time in the future.
      - However, the more data you capture, the more engineering time you need to allocate to implement it, the slower your business processes will be, the more complex your data infrastructure becomes, and so on.
    Also consider the legal and ethical aspects!
  • 21. Data Wrangling
    Data wrangling is all about getting the data into a form that is suitable for feeding into the modeling and visualization stages. This activity involves a variety of tasks, from discovering data to acquiring it and transforming it into a form that is ready to be processed. The tasks that follow data acquisition are also referred to by other terms, such as data munging or data preprocessing.
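As a small illustration of the wrangling tasks described on slide 21, the sketch below cleans a tiny, made-up CSV extract using only the Python standard library. The column names, the sample rows and the cleaning rules (trim whitespace, normalise city names, drop rows with a missing amount, remove exact duplicates) are all assumptions chosen for the example, not part of the original presentation.

```python
import csv
import io

# Hypothetical raw extract: stray spaces, inconsistent case,
# one missing amount and one exact duplicate row.
RAW = """customer_id,city,amount
101, Vadodara ,250.50
102,mumbai,
101, Vadodara ,250.50
103,DELHI,99
"""


def wrangle(raw_text):
    """Return cleaned rows as dicts with proper types."""
    rows = []
    seen = set()
    for record in csv.DictReader(io.StringIO(raw_text)):
        city = record["city"].strip().title()    # normalise spacing and case
        amount_text = record["amount"].strip()
        if not amount_text:                      # drop rows with a missing amount
            continue
        cleaned = (int(record["customer_id"]), city, float(amount_text))
        if cleaned in seen:                      # drop exact duplicates
            continue
        seen.add(cleaned)
        rows.append(
            {"customer_id": cleaned[0], "city": cleaned[1], "amount": cleaned[2]}
        )
    return rows


if __name__ == "__main__":
    for row in wrangle(RAW):
        print(row)
```

In practice a team would more likely use a data-frame library for this; the standard-library version is used here only to keep the sketch self-contained.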
  • 22. Big Data Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it. - Dan Ariely
  • 23. What is Big Data?
      - Big data is a data set whose volume is beyond the ability of commonly used hardware and software tools to capture, manage and process within a tolerable execution time.
      - Such data sets are gathered by information-sensing mobile devices, remote sensing technologies, software logs, cameras, microphones, RFID readers and many similar devices.
      - As a result, they are continuously growing in size.
      - By 2020, there will be around 40 trillion gigabytes (40 zettabytes) of data.
      - 90% of the data in the world today was created within just the past two years.
      - Internet users generate about 2.5 quintillion bytes (2.5 million terabytes) of data each day.
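The unit conversions quoted on slide 23 are easy to verify. The short sketch below assumes decimal (SI) units and the short-scale meaning of "quintillion" (10^18) and "trillion" (10^12); the helper names are illustrative only.

```python
# Decimal (SI) storage units, assumed for this check.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_TERABYTE = 10**12
BYTES_PER_ZETTABYTE = 10**21


def bytes_to_terabytes(n_bytes):
    """Convert a byte count to whole terabytes."""
    return n_bytes // BYTES_PER_TERABYTE


def gigabytes_to_zettabytes(n_gigabytes):
    """Convert a gigabyte count to whole zettabytes."""
    return n_gigabytes * BYTES_PER_GIGABYTE // BYTES_PER_ZETTABYTE


if __name__ == "__main__":
    daily_bytes = 25 * 10**17        # 2.5 quintillion bytes per day
    total_gigabytes = 40 * 10**12    # 40 trillion gigabytes
    print(bytes_to_terabytes(daily_bytes))           # terabytes per day
    print(gigabytes_to_zettabytes(total_gigabytes))  # zettabytes in total
```

Integer arithmetic is used so the results are exact: 2.5 quintillion bytes is 2,500,000 terabytes, and 40 trillion gigabytes is 40 zettabytes.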
  • 24. Some Examples
    Twitter
      - 500 million tweets per day.
    Facebook
      - Facebook generates 4 petabytes of data per day.
      - Users generate 4 million likes every minute.
      - 350 million photos are uploaded per day.
    Instagram
      - The Like button is hit an average of 4.2 billion times per day.
    WhatsApp
      - In 2018, WhatsApp users sent 65 billion messages per day.
    Almost every field
  • 25. Characteristics of big data (3 V's)
    In a 2001 research report, Gartner analyst Doug Laney defined data growth challenges (and opportunities) as being three-dimensional: increasing volume, velocity and variety.
      - Data volume: This is the primary attribute of big data. Most people define big data in terms of multiple terabytes, sometimes petabytes.
      - Data variety: Big data is coming from a greater variety of sources than ever before. Many of the newer ones are Web sources, including logs, click-streams and social media.
      - Data velocity: Big data can also be described by its velocity, the rate at which new data is generated.
  • 26. Data Analysis
    Data analysis is the process of extracting value from data. This is where data science gets exciting. It's a creative process.
      - Ask the right questions: It is important to ask the right questions. They usually come from management or other colleagues, who may already have suspicions based on their experience.
      - Do qualitative research: It's important to understand the business and its customers in detail. This can be achieved through qualitative research, which in turn gives direction to useful investigations through data.
  • 27. Three Major Business Applications
      - Business Analytics: It answers the questions "what has happened in the past?" and "where are we now?" E.g. reporting, measuring retention, finding the right user segments, funnel analysis, etc.
      - Predictive Analytics: It answers the question "what will happen in the future?" E.g. early warning, predicting the marketing budget you will need in the next quarter, etc.
      - Data (Based) Product: A product that is built on, and works using, your data. E.g. recommendation systems, image recognition, voice recognition, etc.
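Funnel analysis, one of the business analytics examples named on slide 27, can be sketched in a few lines. The event log, the step names and the rule that a user only counts for a step if they also reached every earlier step are assumptions made for this illustration.

```python
# Hypothetical funnel and event log: (user, step) pairs.
FUNNEL_STEPS = ["visit", "signup", "purchase"]

EVENTS = [
    ("u1", "visit"), ("u1", "signup"), ("u1", "purchase"),
    ("u2", "visit"), ("u2", "signup"),
    ("u3", "visit"),
    ("u4", "visit"), ("u4", "signup"),
]


def funnel_report(events, steps):
    """Return (step, users_reached, conversion_from_previous_step) tuples."""
    users_by_step = {step: set() for step in steps}
    for user, step in events:
        if step in users_by_step:
            users_by_step[step].add(user)

    report = []
    reached = None     # users who completed every step so far
    previous = None    # count at the previous step
    for step in steps:
        if reached is None:
            reached = users_by_step[step]
        else:
            reached = reached & users_by_step[step]
        count = len(reached)
        if previous is None:
            rate = None                      # first step has no predecessor
        else:
            rate = count / previous if previous else 0.0
        report.append((step, count, rate))
        previous = count
    return report


if __name__ == "__main__":
    for step, count, rate in funnel_report(EVENTS, FUNNEL_STEPS):
        print(step, count, rate)
```

For this sample log, 4 users visit, 3 of them sign up (75%) and 1 of those purchases (about 33%), which shows at a glance where users drop off.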
  • 28. SafetiPin is a map-based mobile phone application that leverages the power of big data to make our communities and cities safer for women.
      - It provides safety-related information collected through crowdsourcing.
      - The app captures data on 9 parameters (lighting, openness, visibility, people density, security in the area, walk path, transportation, gender diversity, feeling in the area) and uses them to compute a safety score, an indication of personal vulnerability to crime, for every pocket of the city.
      - The app combines this score with big data sources such as Google Maps to recommend the Safest Route, the best possible route in terms of safety.
  • 29. Data Communication
    This is the step where most data science projects fail. To reap the benefits of data science, effective communication of the findings is crucial.
      - It is necessary to build a culture where people can communicate and use data. For this, everyone at your company needs to be involved.
      - Business people should also educate data scientists by helping them to create and deliver better presentations.
      - Communication should be as simple as it can be.
          - No fancy scientific words
          - No complicated charts
  • 30. What People Do You Need in Your Team?
    Your data science team should feature
      - the best data engineers,
      - the best software developers, and
      - the best statisticians.
    They need domain knowledge to understand the actual business application of their data projects.
  • 31. Data Science Roles: Data Engineer
    A data engineer is someone who develops, constructs, tests and maintains data architectures, such as databases, data warehouses, data lakes and large-scale processing systems. Data engineers manage data of all sizes and types. They develop, deploy, manage and optimize data pipelines and infrastructure to transform data and transfer it to data scientists for querying.
    Skills needed: SQL, databases, data warehousing, ETL, big data tools, building APIs
  • 32. Data Science Roles: Data Analyst
    Data analysts perform the following tasks:
      - Data wrangling
      - Creating data visualizations and dashboards
      - Analyzing data to discover interesting trends
      - Presenting the results of analysis to business clients or internal teams
      - Helping other stakeholders to optimize their data utilization
    Skills needed: programming skills (SAS, R, Python), statistical and mathematical skills, data wrangling, data visualization tools like Tableau or Power BI
  • 33. Data Science Roles: Data Scientist
    A data scientist is a specialist with expertise in Statistics and in developing models, including predictive models and machine learning models.
      - Data scientists can tackle more open-ended questions by leveraging their knowledge of advanced statistics.
      - Data scientists bring an entirely new approach and perspective to understanding data.
    Skills needed: programming skills (SAS, R, Python), statistical and mathematical skills, storytelling and data visualization, Hadoop, SQL, machine learning, big data analytics
  • 34. Data Science projects can fail
    Yes, that's true! Here are some of the reasons.
      - Not every manager is ready for this change. Even a very well-executed data project can fail, just because someone's feelings or ego are hurt.
      - Answering the wrong question
      - Failure to integrate into business operations
      - Disengaged stakeholders
      - Benefits that don't justify the costs
  • 35. Developing Data Science culture
    Failures can be prevented by establishing a data-driven company culture early on. As the company grows, it becomes harder to make the organization data-driven.
      - It's important that managers develop the right mindset.
      - It's important that everyone in the organization understands the importance of data science.
    Data professionals should hold frequent presentations about their recent findings.
  • 36. Data Strategy
    Why Data Strategy? If you don't have a data strategy, you won't have enough information to make the right decisions. Having a data strategy is crucial to becoming a data-driven organization. Without it
      - you will waste money on the wrong marketing campaigns
      - you will have the wrong product development plans
  • 37. Where do I begin?
    It is recommended to start by developing a data strategy. For this, the following questions need to be answered:
      - What are the right metrics to focus on? And how do you figure that out?
      - How do you collect and store the data? Which tools should you use?
      - Can you trust your data? And how can you make it trustworthy?
      - How do you communicate the data in your organization efficiently?
    Start with a simple data project that answers the basic questions about your business. Subsequently, as you recognize your customers' needs, you may initiate other projects such as predictive modelling and machine learning.
  • 38. Pick your first data project Develop and use the Prioritization matrix.
  • 39. Your first data project
    Your first data project should be a simple (feasible) project that aims at understanding your own business and your customers better (high business value). In other words, start by investing in business analytics and simple reports. This project answers the basic questions about your business, such as
      - Who prefers what, and why?
      - How to win customer loyalty?
      - Why did a particular product fail?
    And so on…
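One possible way to build the prioritization matrix mentioned on slides 38 and 39 is to score each candidate project on the two axes the slides name, business value and feasibility. Everything else in the sketch below is an assumption for illustration: the 1-5 scale, the threshold of 3, the quadrant labels, the example projects and ranking by the product of the two scores.

```python
def quadrant(value, feasibility, threshold=3):
    """Place a project in one of four quadrants of the matrix."""
    high_value = value >= threshold
    feasible = feasibility >= threshold
    if high_value and feasible:
        return "do first"
    if high_value:
        return "plan for later"
    if feasible:
        return "nice to have"
    return "avoid"


def rank(projects):
    """Order project names by value x feasibility, best first."""
    return sorted(
        projects,
        key=lambda name: projects[name][0] * projects[name][1],
        reverse=True,
    )


# Hypothetical candidates: name -> (business value, feasibility), each 1-5.
PROJECTS = {
    "Weekly sales report": (4, 5),
    "Churn prediction model": (5, 2),
    "Office snack survey": (1, 5),
    "Real-time image recognition": (2, 1),
}

if __name__ == "__main__":
    for name in rank(PROJECTS):
        value, feasibility = PROJECTS[name]
        print(name, "->", quadrant(value, feasibility))
```

Under these made-up scores the simple reporting project comes out on top, which matches the advice on the slide to begin with business analytics and simple reports.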
  • 41. You can write to me at kalamkar.vipul-stat@msubaroda.ac.in