WHAT IS DATA SCIENCE?
BY SHILPA KRISHNA, RESEARCH SCHOLAR
DATA SCIENCE PROCESS
1. DISCOVERY
2. DATA PREPARATION
3. MODEL PLANNING
4. MODEL BUILDING
5. OPERATION
6. COMMUNICATE RESULTS
DISCOVERY
• This stage involves acquiring data from all the identified internal and external sources that can help you answer the business question.
• The data can be:
1. Logs from web servers
2. Data gathered from social media
3. Census datasets
4. Data streamed from online sources using APIs
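As one illustration of the first source, web server logs usually have to be parsed into fields before they can be analyzed. The sketch below assumes the logs follow the widely used Common Log Format; the sample lines, IP addresses, and the name LOG_PATTERN are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: Common Log Format (host, identity, user, time, request, status, size).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical log lines, for illustration only.
sample_log = """\
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.11 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 153
192.0.2.10 - - [10/Oct/2023:13:56:02 +0000] "POST /login HTTP/1.1" 200 512
"""

# Turn each matching line into a dictionary of named fields.
records = []
for line in sample_log.splitlines():
    match = LOG_PATTERN.match(line)
    if match:
        records.append(match.groupdict())

# A first question the data can answer: how many requests per status code?
status_counts = Counter(record["status"] for record in records)
print(status_counts)
```

Lines that do not match the assumed layout are skipped silently here; a real pipeline would probably count or log them instead.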
DATA PREPARATION
• Data can have many inconsistencies, such as missing values, blank columns, and incorrect data formats, which need to be cleaned.
• You need to process, explore, and condition the data before modeling.
• The cleaner your data, the better your predictions.
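The three kinds of inconsistency named above can be handled with a few lines of standard-library Python. This is a minimal sketch on made-up data: the column names, the two date formats, and the choice to fill missing ages with the mean are all assumptions for illustration, not a recommended recipe.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw data: a missing age, an always-blank column, mixed date formats.
RAW = """id,age,signup_date,notes
1,34,2021-03-05,
2,,05/04/2021,
3,29,2021-07-19,
"""

def parse_date(text):
    """Try the two date formats assumed to occur in this toy data."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

rows = list(csv.DictReader(io.StringIO(RAW)))

# Blank columns: drop any column that is empty in every row.
blank_columns = [c for c in rows[0] if all(not r[c].strip() for r in rows)]
for r in rows:
    for c in blank_columns:
        del r[c]

# Missing values: fill with the mean of the known ages (one simple choice among many).
known_ages = [int(r["age"]) for r in rows if r["age"].strip()]
mean_age = sum(known_ages) / len(known_ages)

# Incorrect formats: convert every field to a consistent type.
cleaned = [
    {
        "id": int(r["id"]),
        "age": int(r["age"]) if r["age"].strip() else mean_age,
        "signup_date": parse_date(r["signup_date"]),
    }
    for r in rows
]
print(cleaned)
```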
MODEL PLANNING
• In this stage, you need to determine the methods and techniques for drawing out the relationships between input variables.
• Model planning is performed using various statistical formulas and visualization tools such as SQL Analysis Services, R, and SAS/ACCESS.
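One statistical formula commonly used to examine the relationship between two variables is the Pearson correlation coefficient. The sketch below computes it directly from its definition; the variable names and values are hypothetical, and the slides do not say which formulas the author had in mind.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical input variables, for illustration only.
ad_spend = [10, 20, 30, 40, 50]
visits = [12, 24, 33, 46, 55]
discount = [5, 4, 3, 2, 1]

print(pearson(ad_spend, visits))    # close to +1: the two rise together
print(pearson(ad_spend, discount))  # close to -1: one falls as the other rises
```

A value near zero would suggest no linear relationship, which is useful to know before deciding which variables to feed into a model.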
MODEL BUILDING
• The data scientist splits the dataset into training and testing sets.
• Techniques such as association, classification, and clustering are applied to the training dataset.
• Once prepared, the model is tested against the testing dataset.
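The split-train-test sequence above can be sketched in plain Python. The example assumes an 80/20 split and uses a one-nearest-neighbour classifier on a single made-up feature; both choices, and the helper names train_test_split and predict, are illustrative rather than taken from the slides.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split them into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

def predict(train, x):
    """1-nearest-neighbour: return the label of the closest training point."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]

# Hypothetical dataset of (feature, label) pairs with two well-separated groups.
data = [(i * 0.1, "low") for i in range(10)] + [(5 + i * 0.1, "high") for i in range(10)]

train, test = train_test_split(data)

# The model only ever sees the training rows; the test rows measure how it generalizes.
accuracy = sum(predict(train, x) == label for x, label in test) / len(test)
print(len(train), len(test), accuracy)
```

Because the two groups are far apart, this toy model classifies every held-out point correctly; real data is rarely this clean.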
OPERATIONALIZE
• You deliver the final baselined model along with reports, code, and technical documents.
• The model is deployed into a real-time production environment after thorough testing.
COMMUNICATE RESULTS
• The key findings are communicated to all stakeholders.
• This helps you decide whether the results of the project are a success or a failure, based on the inputs from the model.
THE MOST PROMINENT DATA SCIENCE JOB TITLES ARE:
1) Data scientist
2) Data engineer
3) Data analyst
4) Statistician
5) Data administrator
6) Business analyst
Data Scientist
ROLE
• A professional who manages enormous amounts of data to come up with compelling business visions using various tools, techniques, methodologies, and algorithms.
LANGUAGES
• R
• SAS
• PYTHON
• SQL
• HIVE
• MATLAB
• PIG
• SPARK
Data Engineer
ROLE
• Works with large amounts of data; develops, constructs, tests, and maintains architectures such as large-scale processing systems and databases.
LANGUAGES
• SQL
• HIVE
• R
• SAS
• MATLAB
• PYTHON
• JAVA
• RUBY
• C++
• PERL
Data Analyst
ROLE
• Responsible for mining vast amounts of data and looking for relationships, patterns, and trends in the data.
• Later delivers compelling reports and visualizations for analyzing the data, so that the most viable business decisions can be made.
LANGUAGES
• R
• PYTHON
• HTML
• JS
• C
• C++
• SQL
Statistician
ROLE
• Collects, analyzes, and understands qualitative and quantitative data using statistical theories and methods.
LANGUAGES
• SQL
• R
• MATLAB
• TABLEAU
• PYTHON
• PERL
• SPARK
• HIVE
Data Administrator
ROLE
• Ensures that the database is accessible to all relevant users, and also makes sure that it is performing correctly and is kept safe from hacking.
LANGUAGES
• RUBY ON RAILS
• SQL
• JAVA
• C#
• PYTHON
Business Analyst
ROLE
• This professional needs to improve business processes and acts as an intermediary between the business executive team and the IT department.
LANGUAGES
• SQL
• TABLEAU
• POWER BI
• PYTHON
DEFINE THE GOAL
• Define a measurable and quantifiable goal.
• The goal should be specific and precise.
• The goal is to come up with candidate hypotheses. These hypotheses can then be turned into concrete questions or goals for a full-scale modeling project.
COLLECT AND MANAGE DATA
• This is a time-consuming step.
• Conduct initial exploration and visualization of the data.
• Clean the data: repair data errors and transform variables as needed.
BUILD THE MODEL
The most common data science modeling tasks are:
• Classification
• Scoring
• Ranking
• Clustering
• Finding relations
• Characterization
EVALUATE AND CRITIQUE MODEL
Once you have a model, you need to determine whether it meets your goals:
• Is it accurate enough for your needs?
• Does it perform better than the obvious guess?
• Do the results of the model make sense in the context of the problem domain?
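The first two questions can be checked numerically. In the sketch below, the "obvious guess" is taken to mean always predicting the most common label in the training data, and the required accuracy of 0.85 is an arbitrary threshold; both are assumptions for illustration, as are all the labels and predictions.

```python
from collections import Counter

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical labels and model output, for illustration only.
training_labels = ["no", "no", "yes", "no", "no", "yes", "no", "no"]
actual = ["no", "no", "no", "yes", "no", "yes", "no", "no", "yes", "no"]
model_predictions = ["no", "no", "no", "yes", "no", "yes", "no", "yes", "yes", "no"]

# The "obvious guess": always predict the most common training label.
majority_label = Counter(training_labels).most_common(1)[0][0]
baseline_predictions = [majority_label] * len(actual)

model_accuracy = accuracy(model_predictions, actual)
baseline_accuracy = accuracy(baseline_predictions, actual)

REQUIRED_ACCURACY = 0.85  # assumed business requirement
meets_goal = model_accuracy >= REQUIRED_ACCURACY
beats_baseline = model_accuracy > baseline_accuracy
print(model_accuracy, baseline_accuracy, meets_goal, beats_baseline)
```

The third question, whether the results make sense in the problem domain, cannot be answered by a metric alone and still needs review by someone who knows the domain.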
PRESENT RESULTS AND DOCUMENT
• Present the results to your project sponsor and other stakeholders.
• Document the model for those in the organization who are responsible for using, running, and maintaining the model once it has been deployed.
DEPLOY MODEL
• Make sure that the model can be updated as its environment changes.
• The model should initially be deployed in a small pilot program.
Several ways of gathering data for analysis are:
• CSV FILE
• FLAT FILE (tab, space, or any other separator)
• TEXT FILE (in a single file, reading the data all at once or line by line)
• ZIP FILE
• APIs (JSON)
• MULTIPLE TEXT FILES (data is split over multiple text files)
• DOWNLOAD FILE FROM INTERNET (file hosted on a server)
• WEBPAGE (scraping)
• RDBMS (SQL tables)
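Several of the file-based sources above can be read with Python's standard library alone. The sketch below keeps everything in memory so it runs without any files on disk; the sample data, the file name scores.csv, and the shape of the JSON body are hypothetical.

```python
import csv
import io
import json
import zipfile

# CSV file: comma-separated values with a header row.
csv_text = "name,score\nAsha,82\nRavi,75\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# Flat file with another separator: the same reader, with a tab delimiter.
tsv_text = "name\tscore\nAsha\t82\nRavi\t75\n"
tsv_rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

# Text file read line by line instead of all at once.
line_count = sum(1 for _ in io.StringIO(csv_text))

# API response: a JSON body (hypothetical structure) parsed into Python objects.
api_body = '{"results": [{"name": "Asha", "score": 82}]}'
api_rows = json.loads(api_body)["results"]

# ZIP file: build an archive in memory, then read the CSV member back out of it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("scores.csv", csv_text)
with zipfile.ZipFile(buffer) as archive:
    with archive.open("scores.csv") as member:
        text_stream = io.TextIOWrapper(member, encoding="utf-8", newline="")
        zipped_rows = list(csv.DictReader(text_stream))

print(csv_rows, api_rows, line_count)
```

Note that the CSV reader returns every value as a string, while the JSON parser preserves numbers, so the two sources usually need different type handling afterwards.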
• A relational database stores data in tables (relations); each row of a table is called a record.
• Connections among records are established using primary keys and foreign keys.
• This allows users to establish defined relationships between tables.
• In an RDBMS, we use SQL statements to retrieve and analyze the data.
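These ideas can be tried out with SQLite, which ships with Python. The sketch below uses two hypothetical tables, customers and orders, linked by a primary key and a foreign key; the table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Follow the foreign key with a JOIN to analyze records from both tables together.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(totals)

# The foreign key rejects an order that points at a customer who does not exist.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```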
SOME COMMONLY USED PLOTS FOR EDA ARE:
• Histogram
• Scatter plots
• Maps
• Feature correlation plot (heatmap)
• Time series plots
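To show what a histogram actually computes, the sketch below groups values into fixed-width bins and prints a text version. In practice a plotting library would draw the chart; the ages list and the bin width of 10 are made up for illustration.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical data, for illustration only.
ages = [22, 25, 27, 31, 34, 35, 38, 41, 45, 52]
bin_width = 10
bins = histogram(ages, bin_width)

# One row per bin, one '#' per value in that bin.
for start, count in bins.items():
    print(f"{start}-{start + bin_width - 1}: {'#' * count}")
```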
Data management platforms enable organizations and enterprises to use data analytics in beneficial ways, such as:
• Personalizing the customer experience
• Adding value to customer interactions
• Improving customer engagement
• Increasing customer loyalty
• Reaping the revenues associated with data-driven marketing
• Identifying the root causes of marketing failures and business issues in real time
