Data Science.pdf

Data Science is a blend of various tools, algorithms, and machine learning principles with the goal of discovering hidden patterns from the raw data.

What is Data Science?
Data Science is a blend of various tools, algorithms, and machine
learning principles with the goal of discovering hidden patterns
from the raw data. But how is this different from what
statisticians have been doing for years?
The answer lies in the difference between explaining and
predicting.
As you can see from the above image, a Data Analyst usually
explains what is going on by processing history of the data. On the
other hand, Data Scientist not only does the exploratory analysis to
discover insights from it, but also uses various advanced machine
learning algorithms to identify the occurrence of a particular event
in the future. A Data Scientist will look at the data from many
angles, sometimes angles not known earlier.
So, Data Science is primarily used to make decisions and
predictions making use of predictive causal analytics, prescriptive
analytics (predictive plus decision science) and machine learning.
Predictive causal analytics – If you want a model that can
predict the possibilities of a particular event in the future,
you need to apply predictive causal analytics. Say, if you are
providing money on credit, then the probability of customers
making future credit payments on time is a matter of
concern for you. Here, you can build a model that can
perform predictive analytics on the payment history of the
customer to predict if the future payments will be on time or
not.
 Prescriptive analytics: If you want a model that has the
intelligence of taking its own decisions and the ability to
modify it with dynamic parameters, you certainly need
prescriptive analytics for it. This relatively new field is all
about providing advice. In other terms, it not only predicts
but suggests a range of prescribed actions and associated
outcomes.
The best example for this is Google’s self-driving car which I
had discussed earlier too. The data gathered by vehicles can
be used to train self-driving cars. You can run algorithms on
this data to bring intelligence to it. This will enable your car
to take decisions like when to turn, which path to take, when
to slow down or speed up.
 Machine learning for making predictions — If you have
transactional data of a finance company and need to build a
model to determine the future trend, then machine learning
algorithms are the best bet. This falls under the paradigm of
supervised learning. It is called supervised because you already
have the data based on which you can train your machines.
For example, a fraud detection model can be trained using a
historical record of fraudulent purchases.
 Machine learning for pattern discovery — If you don’t have
the parameters based on which you can make predictions,
then you need to find out the hidden patterns within the
dataset to be able to make meaningful predictions. This is
nothing but the unsupervised model as you don’t have any
predefined labels for grouping. The most common algorithm
used for pattern discovery is Clustering.
Let’s say you are working in a telephone company and you
need to establish a network by putting towers in a region.
Then, you can use the clustering technique to find those
tower locations which will ensure that all the users receive
optimum signal strength.
Let’s see how the proportion of above-described approaches differ
for Data Analysis as well as Data Science. As you can see in the
image below, Data Analysis includes descriptive analytics and
prediction to a certain extent. On the other hand, Data Science is
more about Predictive Causal Analytics and Machine Learning.
Why Data Science?
 Traditionally, the data that we had was mostly structured
and small in size, which could be analyzed by using simple BI
tools. Unlike data in the traditional systems which were
mostly structured, today most of the data is unstructured or
semi-structured. Let’s have a look at the data trends in the
image given below which shows that by 2020, more than
80 % of the data will be unstructured.
This data is generated from different sources like financial
logs, text files, multimedia forms, sensors, and instruments.
Simple BI tools are not capable of processing this huge volume
and variety of data. This is why we need more complex and
advanced analytical tools and algorithms for processing,
analyzing and drawing meaningful insights out of it.
This is not the only reason why Data Science has become so popular.
Let’s dig deeper and see how Data Science is being used in various
domains.
 How about if you could understand the precise requirements
of your customers from the existing data like the customer’s
past browsing history, purchase history, age and income. No
doubt you had all this data earlier too, but now with the vast
amount and variety of data, you can train models more
effectively and recommend the product to your customers
with more precision. Wouldn’t it be amazing as it will bring
more business to your organization?
 Let’s take a different scenario to understand the role of Data
Science in decision making. How about if your car had the
intelligence to drive you home? Self-driving cars collect live
data from sensors, including radars, cameras, and lasers to
create a map of its surroundings. Based on this data, it takes
decisions like when to speed up, when to speed down, whento
overtake, where to take a turn – making use of advanced
machine learning algorithms.
 Let’s see how Data Science can be used in predictive analytics.
Let’s take weather forecasting as an example. Data from ships,
aircraft, radars, satellites can be collected and analyzed to
build models. These models will not only forecast the weather
but also help in predicting the occurrence of any natural
calamities. It will help you to take appropriate measures
beforehand and save many precious lives.

Recomendados

Data science tutorial von
Data science tutorialData science tutorial
Data science tutorialAakashdata
82 views12 Folien
DataMining Techniq von
DataMining TechniqDataMining Techniq
DataMining TechniqRespa Peter
448 views5 Folien
Credit card fraud detection using python machine learning von
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learningSandeep Garg
1.8K views43 Folien
A guide to preparing your data for tableau von
A guide to preparing your data for tableauA guide to preparing your data for tableau
A guide to preparing your data for tableauPhillip Reinhart
329 views6 Folien
Unit IV.pdf von
Unit IV.pdfUnit IV.pdf
Unit IV.pdfPreethaSuresh2
4 views31 Folien
Simplify our analytics strategy von
Simplify our analytics strategySimplify our analytics strategy
Simplify our analytics strategysaurabh sethia
96 views19 Folien

Más contenido relacionado

Similar a Data Science.pdf

Data Analytics Introduction.pptx von
Data Analytics Introduction.pptxData Analytics Introduction.pptx
Data Analytics Introduction.pptxamitparashar42
31 views17 Folien
Machine learning von
Machine learningMachine learning
Machine learningRajib Kumar De
2.5K views20 Folien
Regression and correlation von
Regression and correlationRegression and correlation
Regression and correlationVrushaliSolanke
66 views72 Folien
BigData Analytics_1.7 von
BigData Analytics_1.7BigData Analytics_1.7
BigData Analytics_1.7Rohit Mittal
322 views22 Folien
Guide for a Data Scientist von
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data ScientistRohit Dubey
87 views258 Folien

Similar a Data Science.pdf(20)

Guide for a Data Scientist von Rohit Dubey
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data Scientist
Rohit Dubey87 views
Machine Learning for Business - Eight Best Practices for Getting Started von Bhupesh Chaurasia
Machine Learning for Business - Eight Best Practices for Getting StartedMachine Learning for Business - Eight Best Practices for Getting Started
Machine Learning for Business - Eight Best Practices for Getting Started
Bhupesh Chaurasia172 views
Simplify your analytics strategy von Shikhar Gupta
Simplify your analytics strategySimplify your analytics strategy
Simplify your analytics strategy
Shikhar Gupta232 views
In-Depth Data Analytics von YASH GAIKWAD
In-Depth Data AnalyticsIn-Depth Data Analytics
In-Depth Data Analytics
YASH GAIKWAD138 views
McKinsey Big Data Trinity for self-learning culture von Matt Ariker
McKinsey Big Data Trinity for self-learning cultureMcKinsey Big Data Trinity for self-learning culture
McKinsey Big Data Trinity for self-learning culture
Matt Ariker397 views
Applied_Data_Science_Presented_by_Yhat von Charlie Hecht
Applied_Data_Science_Presented_by_YhatApplied_Data_Science_Presented_by_Yhat
Applied_Data_Science_Presented_by_Yhat
Charlie Hecht431 views
Machine Learning with Azure and Databricks Virtual Workshop von CCG
Machine Learning with Azure and Databricks Virtual WorkshopMachine Learning with Azure and Databricks Virtual Workshop
Machine Learning with Azure and Databricks Virtual Workshop
CCG252 views

Más de Satishkumar722293

TARGET AUDIENCE.pdf von
TARGET AUDIENCE.pdfTARGET AUDIENCE.pdf
TARGET AUDIENCE.pdfSatishkumar722293
4 views3 Folien
CAREER DEVELOPMENT.pdf von
CAREER DEVELOPMENT.pdfCAREER DEVELOPMENT.pdf
CAREER DEVELOPMENT.pdfSatishkumar722293
4 views3 Folien
DAX.pdf von
DAX.pdfDAX.pdf
DAX.pdfSatishkumar722293
6 views3 Folien
WHAT IS JVM.pdf von
WHAT IS JVM.pdfWHAT IS JVM.pdf
WHAT IS JVM.pdfSatishkumar722293
3 views3 Folien
Data Science.pdf von
Data Science.pdfData Science.pdf
Data Science.pdfSatishkumar722293
6 views5 Folien
JAVA SCRIPT.pdf von
JAVA SCRIPT.pdfJAVA SCRIPT.pdf
JAVA SCRIPT.pdfSatishkumar722293
3 views3 Folien

Más de Satishkumar722293(20)

Último

Readiness Quiz - Sr. Engineer.pptx von
Readiness Quiz - Sr. Engineer.pptxReadiness Quiz - Sr. Engineer.pptx
Readiness Quiz - Sr. Engineer.pptxguptanavneet1
443 views7 Folien
kibria_portfolio.pdf von
kibria_portfolio.pdfkibria_portfolio.pdf
kibria_portfolio.pdfMasumKhan59
6 views17 Folien
Resume_McCauleyFynnBullock-1 (1).pdf von
Resume_McCauleyFynnBullock-1 (1).pdfResume_McCauleyFynnBullock-1 (1).pdf
Resume_McCauleyFynnBullock-1 (1).pdfFynnBullock
16 views2 Folien
CH05 Job Costing.ppt von
CH05 Job Costing.pptCH05 Job Costing.ppt
CH05 Job Costing.ppthassanakhar
9 views27 Folien
Software Engineer's Career Management Toolkit von
Software Engineer's Career Management ToolkitSoftware Engineer's Career Management Toolkit
Software Engineer's Career Management Toolkitozgengungor1
17 views41 Folien
Public Speaking von
Public SpeakingPublic Speaking
Public SpeakingBasel Ahmed
45 views11 Folien

Último(14)

Readiness Quiz - Sr. Engineer.pptx von guptanavneet1
Readiness Quiz - Sr. Engineer.pptxReadiness Quiz - Sr. Engineer.pptx
Readiness Quiz - Sr. Engineer.pptx
guptanavneet1443 views
Resume_McCauleyFynnBullock-1 (1).pdf von FynnBullock
Resume_McCauleyFynnBullock-1 (1).pdfResume_McCauleyFynnBullock-1 (1).pdf
Resume_McCauleyFynnBullock-1 (1).pdf
FynnBullock16 views
Software Engineer's Career Management Toolkit von ozgengungor1
Software Engineer's Career Management ToolkitSoftware Engineer's Career Management Toolkit
Software Engineer's Career Management Toolkit
ozgengungor117 views
WordCamp (Why fret over AI overlords when you can befriend them).pdf von BiaAhmed1
WordCamp (Why fret over AI overlords when you can befriend them).pdfWordCamp (Why fret over AI overlords when you can befriend them).pdf
WordCamp (Why fret over AI overlords when you can befriend them).pdf
BiaAhmed125 views
Readiness Quiz - Staff Engineer.pptx von guptanavneet1
Readiness Quiz - Staff Engineer.pptxReadiness Quiz - Staff Engineer.pptx
Readiness Quiz - Staff Engineer.pptx
guptanavneet1621 views
Danny Gaethofs CV - n English.pdf von Danny Gaethofs
Danny Gaethofs  CV - n English.pdfDanny Gaethofs  CV - n English.pdf
Danny Gaethofs CV - n English.pdf
Danny Gaethofs13 views
SUDIP DHAR Resume.pdf von Sudip Dhar
SUDIP DHAR  Resume.pdfSUDIP DHAR  Resume.pdf
SUDIP DHAR Resume.pdf
Sudip Dhar13 views

Data Science.pdf

  • 1. What is Data Science? Data Science is a blend of various tools, algorithms, and machine learning principles with the goal of discovering hidden patterns from the raw data. But how is this different from what statisticians have been doing for years? The answer lies in the difference between explaining and predicting. As you can see from the above image, a Data Analyst usually explains what is going on by processing history of the data. On the other hand, Data Scientist not only does the exploratory analysis to discover insights from it, but also uses various advanced machine learning algorithms to identify the occurrence of a particular event in the future. A Data Scientist will look at the data from many angles, sometimes angles not known earlier. So, Data Science is primarily used to make decisions and predictions making use of predictive causal analytics, prescriptive
  • 2. analytics (predictive plus decision science) and machine learning. Predictive causal analytics – If you want a model that can predict the possibilities of a particular event in the future, you need to apply predictive causal analytics. Say, if you are providing money on credit, then the probability of customers making future credit payments on time is a matter of concern for you. Here, you can build a model that can perform predictive analytics on the payment history of the customer to predict if the future payments will be on time or not.  Prescriptive analytics: If you want a model that has the intelligence of taking its own decisions and the ability to modify it with dynamic parameters, you certainly need prescriptive analytics for it. This relatively new field is all about providing advice. In other terms, it not only predicts but suggests a range of prescribed actions and associated outcomes. The best example for this is Google’s self-driving car which I had discussed earlier too. The data gathered by vehicles can be used to train self-driving cars. You can run algorithms on this data to bring intelligence to it. This will enable your car to take decisions like when to turn, which path to take, when to slow down or speed up.
  • 3.  Machine learning for making predictions — If you have transactional data of a finance company and need to build a model to determine the future trend, then machine learning algorithms are the best bet. This falls under the paradigm of supervised learning. It is called supervised because you already have the data based on which you can train your machines. For example, a fraud detection model can be trained using a historical record of fraudulent purchases.  Machine learning for pattern discovery — If you don’t have the parameters based on which you can make predictions, then you need to find out the hidden patterns within the dataset to be able to make meaningful predictions. This is nothing but the unsupervised model as you don’t have any predefined labels for grouping. The most common algorithm used for pattern discovery is Clustering. Let’s say you are working in a telephone company and you need to establish a network by putting towers in a region. Then, you can use the clustering technique to find those tower locations which will ensure that all the users receive optimum signal strength. Let’s see how the proportion of above-described approaches differ for Data Analysis as well as Data Science. As you can see in the image below, Data Analysis includes descriptive analytics and prediction to a certain extent. On the other hand, Data Science is more about Predictive Causal Analytics and Machine Learning.
  • 4. Why Data Science?  Traditionally, the data that we had was mostly structured and small in size, which could be analyzed by using simple BI tools. Unlike data in the traditional systems which were mostly structured, today most of the data is unstructured or semi-structured. Let’s have a look at the data trends in the image given below which shows that by 2020, more than 80 % of the data will be unstructured. This data is generated from different sources like financial logs, text files, multimedia forms, sensors, and instruments. Simple BI tools are not capable of processing this huge volume and variety of data. This is why we need more complex and advanced analytical tools and algorithms for processing, analyzing and drawing meaningful insights out of it. This is not the only reason why Data Science has become so popular. Let’s dig deeper and see how Data Science is being used in various
  • 5. domains.  How about if you could understand the precise requirements of your customers from the existing data like the customer’s past browsing history, purchase history, age and income. No doubt you had all this data earlier too, but now with the vast amount and variety of data, you can train models more effectively and recommend the product to your customers with more precision. Wouldn’t it be amazing as it will bring more business to your organization?  Let’s take a different scenario to understand the role of Data Science in decision making. How about if your car had the intelligence to drive you home? Self-driving cars collect live data from sensors, including radars, cameras, and lasers to create a map of its surroundings. Based on this data, it takes decisions like when to speed up, when to speed down, whento overtake, where to take a turn – making use of advanced machine learning algorithms.  Let’s see how Data Science can be used in predictive analytics. Let’s take weather forecasting as an example. Data from ships, aircraft, radars, satellites can be collected and analyzed to build models. These models will not only forecast the weather but also help in predicting the occurrence of any natural calamities. It will help you to take appropriate measures beforehand and save many precious lives.