CHAPTER: 3.7
Overview of Data Processing and Analysis
Processing
• Editing
• Coding
• Classification and tabulation (data entry)
Data Analysis
• Descriptive and inferential statistics
• Univariate, bivariate, and multivariate analysis
7.1. Data processing
Data processing implies:
• Editing: examining the collected raw data to detect errors and omissions and to correct them when possible
– Field editing: completing what was written in abbreviated and/or illegible form at the time of recording the respondents' responses
– Central editing: correcting errors such as an entry in the wrong place or an omission
• Coding: assigning numerical or other symbols to answers so that responses can be put into a limited number of categories
• Classification: arranging data in groups or classes on the basis of common characteristics.
Classifications:
• According to attributes, which are either descriptive in nature (such as literacy, sex, honesty, etc.) or numerical (such as weight, age, height, income, etc.)
• According to class interval: data relating to income, production, age, weight, etc. come under this category. Such data are known as statistics of variables and are classified on the basis of class intervals.
• Tabulation: arrangement of data into rows and columns so that it becomes easy to carry out analysis, comparison, statistical computations, summation of items, and detection of errors and omissions
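The coding, classification, and tabulation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the code values in `LITERACY_CODES`, the 20-year class width, and the helper names `age_class` and `format_table` are all hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical raw survey records (illustrative values only).
responses = [
    {"literate": "yes", "age": 23},
    {"literate": "no", "age": 41},
    {"literate": "yes", "age": 35},
    {"literate": "yes", "age": 58},
    {"literate": "no", "age": 29},
]

# Coding: assign a numerical symbol to each answer category (assumed codes).
LITERACY_CODES = {"no": 0, "yes": 1}


def age_class(age, width=20):
    """Classification by class interval: map an age to a label such as '20-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Each record becomes a (literacy code, age class) pair.
coded = [(LITERACY_CODES[r["literate"]], age_class(r["age"])) for r in responses]

# Tabulation: count the records that fall in each (row, column) cell.
table = Counter(coded)
classes = sorted({label for _, label in coded})


def format_table():
    """Lay the counts out in rows (literacy code) and columns (age class)."""
    lines = ["literate  " + "  ".join(classes)]
    for code in sorted(LITERACY_CODES.values()):
        cells = "  ".join(f"{table[(code, label)]:>5}" for label in classes)
        lines.append(f"{code:<8}  {cells}")
    return "\n".join(lines)


print(format_table())
```

Arranging the counts in rows and columns makes row totals, column totals, and empty cells easy to check, which is the practical benefit of tabulation named above.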
7.2. Analysis
• Analysis is the further transformation of the processed data to look for patterns and relationships among data groups
• It involves computing certain measures and searching for relationships that exist among the data groups
• It also involves estimating the values of unknown population parameters and testing hypotheses in order to draw inferences
• Analysis can be categorized as:
– Descriptive Analysis
– Inferential (Statistical) Analysis
7.2.1 Descriptive analysis
• It is largely the study of the distribution of one variable
• It provides profiles of companies, work groups, persons, etc. on any of a number of characteristics such as size, composition, efficiency, preference, etc. This sort of analysis can be in respect of one, two, or more than two variables (univariate, bivariate, multivariate)
• The calculation of averages, frequency distributions, and percentage distributions is the most common form of summarizing data.
The most common forms of describing the processed data are:
• Tabulation
• Percentage
• Measure of central tendency
• Measure of dispersion
• Measure of asymmetry
Data transformation
• It is the process of changing the original form of the data into a form that is better suited to the data analysis that will achieve the research objective.
1) Tabulation
• Refers to the orderly arrangement of data in a table or other summary format.
• It presents the responses or observations on a question-by-question basis and provides the most basic form of information.
• It tells the researcher how frequently each response occurs
• The starting point of analysis is counting the responses or observations in each of the categories, e.g. frequency tables
2) Percentage
– Whether the data are tabulated by computer or by hand, it is useful to have percentages and cumulative percentages.
– A table containing both percentages and the frequency distribution is easier to interpret.
– Percentages are useful for comparing trends over time or among categories
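A frequency table with percentages and cumulative percentages can be built as in the following minimal sketch. The function name `frequency_table` and the sample answers are assumptions made for illustration. Categories are listed alphabetically here; cumulative percentages are most meaningful when the categories have a natural order.

```python
from collections import Counter


def frequency_table(observations):
    """Return rows of (category, frequency, percentage, cumulative percentage)."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category in sorted(counts):
        cumulative += counts[category]
        rows.append((
            category,
            counts[category],
            100 * counts[category] / total,  # share of all observations
            100 * cumulative / total,        # running share up to this row
        ))
    return rows


# Hypothetical answers to a single survey question.
answers = ["agree", "neutral", "agree", "disagree", "agree",
           "neutral", "agree", "disagree", "agree", "neutral"]

for category, freq, pct, cum_pct in frequency_table(answers):
    print(f"{category:<10}{freq:>4}{pct:>8.1f}{cum_pct:>8.1f}")
```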
3) Measure of central tendency
– It is also known as the statistical average. The mean, median, and mode are the most popular averages.
– The mean (arithmetic mean) is the most common measure of central tendency
– The mode is less commonly used
– The median is commonly used to estimate the average of a qualitative phenomenon, such as intelligence.
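The three averages can be computed with Python's standard `statistics` module. The scores below are made up for illustration; they include one extreme value to show that the mean is pulled toward it while the median and mode are not.

```python
import statistics

# Hypothetical test scores with one extreme value (25).
scores = [4, 7, 7, 8, 9, 10, 25]

mean = statistics.mean(scores)      # arithmetic mean: sum of values / count
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

print(f"mean={mean}, median={median}, mode={mode}")
```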
4) Measure of dispersion
• Dispersion describes how the values of the items are scattered around the mean.
• It measures how far the values of a variable lie from the average value.
Important measures of dispersion are:
• Range: the difference between the largest and the smallest value
• Mean deviation: the average absolute deviation of the observations from the mean value, Σ|Xi – X̄| / n
• Variance: It measures the sample
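The measures of dispersion listed above can be sketched as follows. The helper names `value_range` and `mean_deviation` and the data are assumptions for illustration. The variance is shown in both its population form (divide by n) and its sample form (divide by n – 1) using the standard `statistics` module, since the slide does not say which one it intends.

```python
import statistics


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Mean deviation: average absolute distance from the mean."""
    mean = statistics.fmean(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Hypothetical observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("range:", value_range(data))
print("mean deviation:", mean_deviation(data))
print("population variance:", statistics.pvariance(data))  # divides by n
print("sample variance:", statistics.variance(data))       # divides by n - 1
```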
5) Measure of asymmetry (skewness)
• When the distribution of items is perfectly symmetrical, skewness is altogether absent. The normal curve is the standard example: a perfectly bell-shaped curve for which Mean = Median = Mode.
• If the curve is distorted (whether on the right or the left side), we have an asymmetric distribution, which indicates that there is skewness.
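One common way to put a number on skewness is the moment coefficient, the third central moment divided by the 1.5th power of the second central moment. The slides do not name a specific coefficient, so this choice, the function name `skewness`, and the data are assumptions made for illustration. The value is zero for a symmetric data set and positive when the long tail is on the right.

```python
import statistics


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print("symmetric:", skewness(symmetric))
print("right-skewed:", skewness(right_skewed))
print("mean vs median:", statistics.fmean(right_skewed),
      statistics.median(right_skewed))
```

In the right-skewed sample the mean lies above the median, which is the departure from Mean = Median = Mode described above.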
7.2.2. Inferential Analysis
• Researchers frequently seek to determine the relationship between variables and to test its statistical significance
• If we have data on two variables, we are said to have a bivariate population; if the data cover more than two variables, the population is known as a multivariate population
• If for every measurement of a variable X we have a corresponding value of a variable Y, the resulting pairs of values are called a bivariate population
• In the case of a bivariate or multivariate population, we often wish to know the relationship between the two or more variables from the data obtained.
E.g. we may like to know whether the number of hours students devote to study is somehow related to their family income, to age, to sex, or to other similar factors.
Two questions should be answered to determine the relationship between variables:
1. Does an association or correlation exist between the two or more variables? If yes, to what degree?
• This is answered by the use of correlation techniques.
• In the case of a bivariate population, correlation can be found using:
– Cross tabulation
– Karl Pearson's coefficient of correlation: the simple correlation coefficient, and the one most commonly used
– Charles Spearman's coefficient of rank correlation
• In the case of a multivariate population, correlation can be studied through:
– Coefficient of multiple correlation
– Coefficient of partial correlation
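The two bivariate coefficients can be sketched in plain Python. Pearson's coefficient is computed from the raw values, and Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the values. The function names and the study-hours and income figures are hypothetical and serve only as an illustration.

```python
import math


def pearson(x, y):
    """Karl Pearson's coefficient of correlation for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired data: daily study hours and family income.
hours = [1, 2, 3, 4, 5]
income = [10, 20, 40, 80, 160]

print("Pearson: ", round(pearson(hours, income), 4))
print("Spearman:", round(spearman(hours, income), 4))
```

Because income rises with hours in every pair but not along a straight line, the rank coefficient reaches 1 while Pearson's coefficient stays below 1.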
2. Is there any cause-and-effect (causal) relationship between two variables, or between one variable on one side and two or more variables on the other side?
• This question can be answered by the use of regression analysis.
• In regression analysis the researcher tries to estimate or predict the average value of one variable on the basis of the value of another variable.
• For instance, a researcher estimates a student's average score in statistics from the student's score in mathematics
• There are different techniques of regression:
– In the case of a bivariate population, the cause-and-effect relationship can be studied through simple regression.
– In the case of a multivariate population, the causal relationship can be studied through multiple regression analysis.
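Simple regression for the mathematics and statistics example can be sketched with ordinary least squares. The scores are invented for illustration and the function names `fit_simple_regression` and `predict` are assumptions, not part of the slides.

```python
def fit_simple_regression(x, y):
    """Least-squares line y = intercept + slope * x for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_value):
    """Estimated average value of y for a given x."""
    return intercept + slope * x_value


# Hypothetical scores of five students.
math_scores = [50, 60, 70, 80, 90]
stats_scores = [52, 61, 68, 79, 90]

slope, intercept = fit_simple_regression(math_scores, stats_scores)
print(f"statistics = {intercept:.2f} + {slope:.2f} * mathematics")
print("predicted statistics score for mathematics = 75:",
      round(predict(slope, intercept, 75), 1))
```

Multiple regression follows the same idea with several predictor variables on the right-hand side.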
Time series Analysis
• Successive observations of a given phenomenon over a period of time are analyzed through time series analysis. It measures the relationship between variables and time (trend)
• Time series analysis measures seasonal, cyclical, and irregular fluctuations, as well as trend.
• The analysis of time series is done to understand the dynamic conditions for achieving the short-term and long-term goals of a business firm, and for forecasting
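One simple way to separate trend from seasonal fluctuation is a moving average whose window equals the length of the season. The sketch below assumes quarterly data and a four-quarter window; the sales figures and the function name `moving_average` are hypothetical, and the slides do not prescribe this particular method.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive observations."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical quarterly sales over two years: a seasonal pattern
# (high in the second quarter, low in the third) on top of a rising trend.
sales = [10, 14, 8, 12, 12, 16, 10, 14]

trend = moving_average(sales, window=4)
print(trend)
```

Averaging over a full year cancels the within-year pattern, so the smoothed values rise steadily even though the raw series moves up and down.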
