Introduction to Data Science
Week 1
www.swaraadyasolutions.co.in
Agenda
• Defining Data Science
• What Does a Data Science Professional Do?
• Data Science in Business
• Use Cases for Data Science
• Installation of R and RStudio
Defining Data Science
• Data Science deals with the science and algorithms
related to data.
• Data is generated from many kinds of sources.
• One report says, “Every day, approximately 2 quintillion bytes
of data are generated. If this pace continues, then within the
next 3 years it is expected that 2MB of data will be
created every second for every individual on this planet.”
• Reportedly, 90% of the world's data was created in
the last 2 years.
• Data comes in two forms:
• Structured
• Unstructured
• Structured data is information that fits into
a relational database.
• E.g. ATM transactions and flight tickets, which SQL can
query and update.
• Unstructured data comes from tweets and comments
on social media and from audio and video files, which SQL cannot
process directly.
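As a minimal sketch of why structured data suits SQL, the snippet below stores a few ATM transactions in an in-memory SQLite database, then updates and aggregates them. The table name `atm_transactions`, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE atm_transactions "
    "(id INTEGER PRIMARY KEY, account TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO atm_transactions (account, amount) VALUES (?, ?)",
    [("A-100", 500), ("A-100", 200), ("B-200", 1000)],
)

# Structured rows can be changed with SQL ...
conn.execute("UPDATE atm_transactions SET amount = 250 WHERE id = 2")

# ... and aggregated per account.
totals = dict(
    conn.execute(
        "SELECT account, SUM(amount) FROM atm_transactions GROUP BY account"
    ).fetchall()
)
print(totals)
conn.close()
```

A tweet or a video file has no such fixed columns, which is why it cannot be handled this way without further processing.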
Definition
“Data Science is a broad field that brings together scientific techniques,
methods, and processes used to clean data and then extract useful
patterns and insights, often in the form of visualizations.”
• Visualizations are crucial for making important business decisions and devising
strategies that are instrumental to an organization’s well-being.
History
In 1997, C. F. Jeff Wu at the University of Michigan proposed that the concepts below
be studied under the name Data Science.
• Data Collection
• Data Modeling
• Data Analysis
Role of Data Science on Statistics
• Statistics
• Mathematics
• Computer Science
• Data Analysis
• Critical Thinking
• Problem Solving
• Machine Learning
• Data Visualization
Data Science??
In 2012, Harvard Business Review called data scientist “the sexiest job of the
21st century”.
Statistics
• Statistics is the branch of mathematics that deals with data collection,
categorization, interpretation and presentation.
• These techniques help with processing and analyzing data at a
large scale.
Statistics Techniques to Deal with Data
• Data Collection
– Collecting relevant data/information
– Primary data comes from surveys, observations and experiments.
– Secondary data comes from internal records and government-published data.
• Data Categorization and Classification
– Data is organized to reveal insights.
For example, suppose we have the heights of 10 people:
160cm, 165cm, 155cm, 190cm, 177cm, 181cm, 179cm, 185cm, 159cm, 173cm
In an ordered array, this data looks like:
155cm, 159cm, 160cm, 165cm, 173cm, 177cm, 179cm, 181cm, 185cm, 190cm
The ordered data shows that 155cm is the shortest height and 190cm is the tallest.
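The ordering step can be sketched in a few lines of Python. The variable names are illustrative; the values are the ten heights listed above.

```python
# Heights (in cm) of the 10 people from the example.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

# Sorting produces the ordered array in ascending order.
ordered = sorted(heights_cm)

print(ordered)
print("shortest:", ordered[0], "cm")
print("tallest:", ordered[-1], "cm")
```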
Statistics Techniques to Deal with Data
• Data Classification
– Grouping relevant facts/data into categories according to their features.
– Bases for classification are:
• Geographical
• Chronological (basis of time)
• Qualitative
• Quantitative
• Data Presentation
– Includes frequency distributions shown as histograms.
– For example, assume you are looking for prospective clients for your new
product, which is an electric bike.
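The slides give no data for the electric bike scenario, so the sketch below instead builds a frequency distribution from the ten heights used in the ordered-array example. The 10 cm class width is an assumption made for illustration, and the text bars merely stand in for a real histogram.

```python
from collections import Counter

# Heights (in cm) from the ordered-array example.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]
BIN_WIDTH = 10  # assumed class width, not taken from the slides

# Map each height to the lower bound of its class, e.g. 165 -> 160.
frequency = Counter((h // BIN_WIDTH) * BIN_WIDTH for h in heights_cm)

# Print one text bar per class, in ascending order.
for lower in sorted(frequency):
    upper = lower + BIN_WIDTH - 1
    count = frequency[lower]
    print(f"{lower}-{upper} cm | {'#' * count} ({count})")
```

The same grouping could be applied to a numeric attribute of prospective clients, such as age or commuting distance, once that data is collected.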
Applications
• Data Science has many real-world applications.
• Recommender Systems
– Content-based – keeps track of a user's own watching habits.
– Collaborative – finds users with similar tastes.
• Voice and Image Recognition
• Spam and Fraud Detection
• Many more
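To make the collaborative idea concrete, here is a minimal sketch that finds the most similar other user and suggests what that user has watched. The user names, titles, and the choice of Jaccard similarity are all assumptions for illustration; production recommenders use far richer signals and models.

```python
# Hypothetical viewing histories; user names and titles are made up.
watched = {
    "asha": {"Movie A", "Movie B", "Movie C"},
    "ravi": {"Movie A", "Movie B", "Movie D"},
    "meera": {"Movie E"},
}


def jaccard(a, b):
    """Share of titles two users have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user):
    """Suggest titles watched by the most similar other user."""
    others = [u for u in watched if u != user]
    nearest = max(others, key=lambda u: jaccard(watched[user], watched[u]))
    return sorted(watched[nearest] - watched[user])


print(recommend("asha"))
```

A content-based recommender would instead compare the attributes of the titles a single user has already watched, without looking at other users.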
Data Scientists and Their Role
• A Data Scientist is a rockstar!
• A Data Scientist is an individual who has the power and freedom to
experiment with tons of different kinds of data.
• Based on knowledge of:
– Mathematics
– Problem solving
– Critical thinking
– Careful analysis
• Anyone who wants to carry this “tag” should be well-versed in a lot
of concepts.
Some of them are:
• Mathematics
• Statistics
• Problem-solving
• Data wrangling or data munging
• Coding prowess in both R and Python
• SQL
• Hadoop
• Machine learning and AI
• Data visualization
• Communication skills
Data Analyst v/s Data Scientist
• A Data Analyst mostly converts data into a structured
format so that it can be processed further.
• Focuses more on data mining and data auditing.
• Data mining involves retrieving information from large databases with the help of SQL to
extract new data/information.
• Data auditing involves checking the quality of the data to determine whether it is
good enough to yield useful insights.
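A data audit can start with very simple checks. The sketch below counts missing values per field and exact duplicate rows; the records and field names are hypothetical, and a real audit would cover many more checks (ranges, types, consistency).

```python
# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": None, "amount": 80},
    {"id": 3, "city": "Mumbai", "amount": None},
    {"id": 3, "city": "Mumbai", "amount": None},  # exact duplicate row
]

# Count missing values per field.
missing = {
    field: sum(1 for row in records if row[field] is None)
    for field in records[0]
}

# Count rows that repeat an earlier row exactly.
seen = set()
duplicates = 0
for row in records:
    key = tuple(sorted(row.items()))
    if key in seen:
        duplicates += 1
    seen.add(key)

print("missing per field:", missing)
print("duplicate rows:", duplicates)
```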
Data Analyst v/s Data Scientist
• A Data Scientist takes the clean data and tries to gain meaningful
insights from it.
• A classification or regression algorithm is used to
build a model that is robust enough to yield
business insights, with the help of visualization tools.
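As one small instance of the regression case, the sketch below fits a straight line by ordinary least squares and uses it to predict a new value. The spend and sales figures are hypothetical, and a single-feature linear model is assumed only to keep the example short.

```python
# Hypothetical clean data: advertising spend (x) and sales (y).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

mean_x = sum(spend) / len(spend)
mean_y = sum(sales) / len(sales)

# Ordinary least squares for one feature:
# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(spend, sales)) / sum(
    (x - mean_x) ** 2 for x in spend
)
intercept = mean_y - slope * mean_x


def predict(x):
    """Predicted sales for a given spend under the fitted line."""
    return intercept + slope * x


print(slope, intercept, predict(6))
```

In practice the fitted line and the data points would then be plotted so that the business audience can read the trend at a glance.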
Are There Enough Skilled Data Scientists In The Industry?
• According to a survey conducted by IBM, the demand for data
scientists will soar by 28% by 2020.
• That includes all jobs which require machine learning, big data,
visualization expertise in tools like Tableau and Power BI, and knowledge of
data analysis.
• This demand is spread across industries such as
finance, insurance, professional services, and IT.
A candidate who is always thirsty for new challenges and loves problem-solving
of any kind can become a skilled data scientist.
Such a person likes observing and defining a problem from different angles and
perspectives.
Coding is their daily hustle, and they love doing it, not because the problem demands
it, but because they know how interesting it is to come up with new findings
and insights and then make a cute little story out of them!
Data Science Effects
How Can Data Science Help a Business/Company Grow?
• Data Science has been present in the IT industry for a long time.
• The sudden increase in the amount of data has pushed companies to gradually adopt it as the norm.
• There are numerous areas in which this emerging discipline can help an organization grow and achieve
new heights:
• Business logistics, including supply chain optimization
• Finance
• Health and wellness
• Education and electronic teaching
• Climate and energy
Popular Data Processing Tools in Data Science
• Jupyter – open-source tool to create and share documents.
• RStudio – open-source tool for R programming.
• SAS – analytics tool.
• Apache Spark – open-source engine that specializes in cluster computing.
• Microsoft Excel – spreadsheet.
• SQL – query language for relational databases.
• Tableau – data visualization tool used for representing data as charts.
• Power BI – business intelligence tool developed by Microsoft.
What Does a Data Science Professional Do?
Installation of R and RStudio
Conclusion/Endnotes
• Data Science is turning out to be one of the fastest-growing fields in the US and India.
• Today, it is used in weather forecasting, sales prediction, fraud and spam detection, pattern recognition, taxi fare
prediction, sentiment analysis, and neural networks.
• The future of data science is going to be dominated by Artificial Intelligence and Automation.
• These two forces have the capability of changing the current market scenario into something that data scientists describe
as the “age of revolution”.
• Machines are picking up new concepts and technology every second, which is making them smarter
and sharper than humans.
• Looking at the current market, data science is gradually making its
way into businesses and enterprises.
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 

introduction to data science

  • 1. Introduction to Data Science Week 1 www.swaraadyasolutions.co.in
  • 2. Agenda • Defining Data Science • What Does a Data Science Professional Do? • Data Science in Business • Use Cases for Data Science • Installation of R and RStudio
  • 4. Defining Data Science • Data Science deals with the science and algorithms related to data. • Data is generated from many kinds of sources. • A report says, “Every day, approximately 2 quintillion bytes of data is generated. If it grows at this pace, then by the next 3 years, it is expected that 2MB of data will be created every second for every individual on this planet.” • About 90% of the world's data is said to have been created in the last 2 years.
  • 5. • Data comes from two kinds of sources: • Structured • Unstructured • Structured data is information that fits the rows and columns of a relational database. • E.g. ATM transactions and flight tickets, which SQL can query and modify. • Unstructured data comes from tweets and comments on social media and from audio and video files, which SQL cannot process directly.
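To make the point about structured data concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `atm_transactions`, its columns and the three sample rows are hypothetical, invented purely for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical ATM transaction records (illustrative values only).
rows = [
    (1, "ACC-001", "withdrawal", 200.0),
    (2, "ACC-002", "deposit", 500.0),
    (3, "ACC-001", "withdrawal", 50.0),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE atm_transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, kind TEXT, amount REAL)"
)
conn.executemany("INSERT INTO atm_transactions VALUES (?, ?, ?, ?)", rows)

# Every row has the same columns, so SQL can filter and aggregate directly.
total_withdrawn = conn.execute(
    "SELECT SUM(amount) FROM atm_transactions "
    "WHERE account = ? AND kind = 'withdrawal'",
    ("ACC-001",),
).fetchone()[0]

# SQL can also change the stored records.
conn.execute("UPDATE atm_transactions SET amount = 60.0 WHERE id = 3")
updated_amount = conn.execute(
    "SELECT amount FROM atm_transactions WHERE id = 3"
).fetchone()[0]
conn.close()

print(total_withdrawn, updated_amount)
```

A tweet or a video file has no such fixed columns, which is why it usually needs other processing before a query like this becomes possible.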
  • 6. Definition “Data Science is a broad field, an assembly of scientific techniques, methods and processes used to clean data and then extract useful patterns and insights in the form of visualizations.” • Visualizations are crucial for making important business decisions and for devising strategies that are instrumental to an organization’s well-being.
  • 7. History In 1997, C. F. Jeff Wu, speaking at the University of Michigan, stated that the following concepts should be studied under the name Data Science: • Data Collection • Data Modeling • Data Analysis
  • 8. Role of Data Science on Statistics • Statistics • Mathematics • Computer Science • Data Analysis • Critical Thinking • Problem Solving • Machine Learning • Data Visualization
  • 9. Data Science?? In 2012, Harvard Business Review called data scientist “the sexiest job of the 21st century”.
  • 11. Statistics • Statistics is the branch of mathematics that deals with data collection, categorization, interpretation and presentation. • These techniques help with processing and analyzing data at a large scale.
  • 12. Statistics Techniques to Deal with Data • Data Collection – Collecting relevant data/information. – Primary data includes surveys, observations and experiments. – Secondary data includes internal records and government-published data. • Data Categorization and Classification – Data is organized to yield insights. For example, take the heights of 10 people: 160cm, 165cm, 155cm, 190cm, 177cm, 181cm, 179cm, 185cm, 159cm, 173cm. In an ordered array this data looks like: 155cm, 159cm, 160cm, 165cm, 173cm, 177cm, 179cm, 181cm, 185cm, 190cm. The ordered data shows at a glance that 155cm is the shortest height and 190cm is the tallest.
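The ordering step on this slide can be reproduced in a few lines of Python. This is only a sketch; the variable names are our own choice, and the ten heights are the ones listed on the slide.

```python
# The ten heights from the slide, in centimetres, in their original order.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

# sorted() returns a new list in ascending order (the "ordered array").
ordered = sorted(heights_cm)

# In an ordered array the extremes sit at the two ends.
shortest, tallest = ordered[0], ordered[-1]

print(ordered)
print(f"shortest: {shortest}cm, tallest: {tallest}cm")
```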
  • 13. Statistics Techniques to Deal with Data • Data Classification – Grouping relevant facts/data into categories according to their features. – Bases of classification: • Geographical • Chronological (on the basis of time) • Qualitative • Quantitative • Data Presentation – Includes frequency distributions shown as histograms. – For example, assume you are looking for prospective clients for your new product, an electric bike.
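As a rough illustration of a frequency distribution, the sketch below groups the ten heights from the previous slide into classes and prints a text histogram. The 10cm class width is an assumption made for this example; the slides do not specify one.

```python
from collections import Counter

# The ten heights from the previous slide, in centimetres.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]
BIN_WIDTH = 10  # assumed class width, not taken from the slides

# Map each height to the lower edge of its class, e.g. 177 -> 170.
counts = Counter((h // BIN_WIDTH) * BIN_WIDTH for h in heights_cm)

# One row per class: the bar length equals the class frequency.
for lower in sorted(counts):
    upper = lower + BIN_WIDTH - 1
    print(f"{lower}-{upper} cm | {'#' * counts[lower]} ({counts[lower]})")
```

The same counting approach would apply to any quantitative feature, such as the ages or incomes of prospective clients.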
  • 14. Applications • Data Science has many real-world applications. • Recommender Systems – Content-based: keeps track of a user's watching habits. – Collaborative: recognizes users with similar tastes. • Voice and Image Recognition • Spam and Fraud Detection • Many more…
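One common way to "recognize users with similar tastes" is to compare their rating vectors with cosine similarity. The sketch below assumes that approach; the user names and ratings are hypothetical, and real recommender systems are considerably more involved.

```python
import math

# Hypothetical ratings (1-5) by three users for the same four titles; 0 = not rated.
ratings = {
    "asha": [5, 4, 0, 1],
    "ben": [4, 5, 0, 2],
    "chloe": [1, 0, 5, 4],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar_user(target):
    """Return the other user whose ratings are closest to the target's."""
    others = (user for user in ratings if user != target)
    return max(others, key=lambda u: cosine_similarity(ratings[target], ratings[u]))

print(most_similar_user("asha"))
```

A collaborative recommender could then suggest to one user the titles that their most similar user rated highly.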
  • 15. Data Scientists and Their Role • Data Scientist is a Rockstar!!! • A Data Scientist is an individual who has the power and freedom to experiment with tons of different kinds of data. • Based on knowledge of: – Mathematics – Problem solving – Critical thinking – Careful analysis
  • 16. • Anyone willing to carry this “tag” should be well-versed in many concepts. Some of them are: • Mathematics • Statistics • Problem-solving • Data wrangling or data munging • Coding prowess in both R and Python • SQL • Hadoop • Machine learning and AI • Data visualization • Communication skills
  • 17. Data Analyst v/s Data Scientist • A Data Analyst mostly converts data into a structured format so that it can be processed further. • The focus is on data mining and data auditing. • Data mining involves retrieving information from large databases with the help of SQL in order to extract new information. • Data auditing involves examining the data to judge whether it is good enough to yield useful insights.
  • 18. Data Analyst v/s Data Scientist • A Data Scientist takes the clean data and tries to gain meaningful insights from it. • A classification or regression algorithm is used to build a model, which is made robust enough to yield business insights with the help of visualization tools.
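As a minimal sketch of the regression case, the code below fits a straight line by ordinary least squares using plain Python. The data points and the names `ad_spend` and `sales` are hypothetical, and a real project would normally use a library such as scikit-learn rather than hand-written formulas.

```python
# Hypothetical clean data: advertising spend (x) and resulting sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(ad_spend, sales)

def predict(x):
    """Use the fitted model to estimate sales for a new spend value."""
    return slope * x + intercept

print(slope, intercept, predict(6.0))
```

The fitted line is the "model"; plotting it over the data points is the kind of visualization the slide refers to.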
  • 20. Are There Enough Skilled Data Scientists in the Industry? • According to a survey conducted by IBM, the demand for data scientists will soar by 28% by 2020. • That includes all jobs which require expertise in machine learning, big data, visualization tools like Tableau and PowerBI, and knowledge of data analysis. • This demand is spread across industries looking for such professionals, including the finance, insurance, professional services, and IT sectors.
  • 21. A candidate who is always thirsty for new challenges and loves problem-solving of any kind can become a skilled data scientist. Such a person likes observing and defining a problem from different angles and perspectives. Coding is their daily hustle, and they love doing it not because the problem demands it, but because they know how interesting it is to come up with new findings and insights and then make a cute little story out of them!
  • 22. Data Science Effects: How Can Data Science Help a Business/Company Grow? • Data Science has been present in the IT industry for a long time. • The sudden increase in the amount of data pushed companies to make it a norm, slowly and steadily. • There are numerous areas in which this emerging discipline can help an organization grow and achieve new heights: • Business logistics, including supply chain optimization • Finance • Health and wellness • Education and electronic teaching • Climate and energy
  • 23. Popular Data Processing Tools in Data Science • Jupyter – open-source tool for creating and sharing documents. • RStudio – open-source tool for R programming. • SAS – analytics tool. • Apache Spark – open-source engine that specializes in cluster computing. • Microsoft Excel – spreadsheet. • SQL – query language for relational databases. • Tableau – data visualization tool used for representing data as charts. • PowerBI – business intelligence tool developed by Microsoft.
  • 24. What Does a Data Science Professional Do?
  • 27. Installation of R and RStudio
  • 28. Conclusion/Endnotes • Data Science is turning out to be one of the fastest growing fields in the US and India. • Today, it is applied in weather forecasting, sales prediction, fraud and spam detection, pattern recognition, taxi fare prediction, sentiment analysis, and neural networks. • The future of data science is going to be dominated by Artificial Intelligence and Automation. • These two heavyweights have the capability of changing the current market scenario into something that data scientists describe as the “age of revolution”. • Machines are picking up new concepts and technology every second, which is making them smarter and sharper, in some tasks even more so than humans. • Looking at the current market, data science is gradually making its way into businesses and enterprises.