GOOD
MORNING
1
DESCRIPTIVE
DATA
Presented by
Dr. P. Gnana Sarita Kumari
I MDS
Department of Public Health Dentistry
2
CONTENTS
• INTRODUCTION
• TYPES OF VARIABLES AND LEVELS OF MEASUREMENT
• MEASURES OF CENTRAL TENDENCY
• MEASURES OF DISPERSION
• NORMAL DISTRIBUTION
• MEASURES OF ASYMMETRY
• MEASURES OF RELATIONSHIP
• CONCLUSION
• REFERENCES
3
INTRODUCTION
DESCRIPTIVE ANALYSIS :
• The data describe one group and that group only.
• Descriptive data analysis limits generalization to the particular group of individuals observed.
• No conclusions are extended beyond this group.
• It provides valuable information about the nature of a particular group of individuals.
4
CLASSIFICATION OF VARIABLES
• QUALITATIVE : NOMINAL, ORDINAL
• QUANTITATIVE : DISCRETE, CONTINUOUS
5
LEVELS OF MEASUREMENT
• Introduced by STEVENS
6
NOMINAL MEASUREMENT SCALE
• Represents the simplest type of data
• Values fall into unordered categories
• No quantitative relationship between categories
• Numbers are used only for the sake of convenience
7
ORDINAL MEASUREMENT SCALE
• Categories can be ordered or ranked
• Though ordered, the categories are not quantified
• The number or label assigned indicates relative magnitude (rank) only
• Precise measurement of the differences between ranks does not exist
8
INTERVAL MEASUREMENT SCALE
• Observations can be ordered
• Precise differences between units of measure exist
• No meaningful absolute zero
9
RATIO MEASUREMENT SCALE
• Possesses the same properties as the interval scale
• Highest level of measurement
• A true zero exists
10
MEASURES OF CENTRAL TENDENCY
• Mean
• Median
• Mode
11
TYPES OF MEAN
• Sample mean
• Weighted mean
• Geometric mean
• Harmonic mean
• Mean of two or more means
12
SAMPLE MEAN
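Mean = Sum of observations / Number of observations, i.e. $\bar{X} = \frac{\sum X}{n}$
For an ungrouped series it is calculated by :
1. DIRECT METHOD : $\bar{X} = \frac{\sum X}{n}$
2. ASSUMED MEAN METHOD : $\bar{X} = A + \frac{\sum d}{n}$
Where, $A$ is the assumed mean, $d = X - A$ is the deviation of each observation from $A$, and $n$ is the number of observations.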
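The two methods for an ungrouped series can be sketched in Python as follows. This is a minimal illustration, not the presenter's worked example: the function names, the sample series and the choice of 75 as the assumed mean are all assumptions made here.

```python
def mean_direct(values):
    """Direct method: sum of observations divided by their number."""
    return sum(values) / len(values)


def mean_assumed(values, assumed):
    """Assumed mean method: A + (sum of deviations from A) / n."""
    deviations = [x - assumed for x in values]
    return assumed + sum(deviations) / len(values)


# Illustrative series (an assumption, chosen only for easy arithmetic).
scores = [70, 73, 74, 75, 75, 75, 75, 80]

print(mean_direct(scores))               # 74.625
print(mean_assumed(scores, assumed=75))  # 74.625
```

Both methods give the same mean. The assumed mean only keeps the intermediate numbers small when working by hand.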
13
WEIGHTED MEAN
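Let $X_1, X_2, \ldots, X_n$ be n measurements, and let their relative importance be expressed by a corresponding set of weights $w_1, w_2, \ldots, w_n$.
Calculation :
$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$
(when the weights sum to 1, this reduces to $\sum_{i=1}^{n} w_i X_i$)
• Grouped data with a range of values :
o By middle point method
o By alternative method
• Also called GRAND MEAN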
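A minimal Python sketch of the weighted mean follows. It assumes the weights need not sum to 1, so the weighted sum is divided by the total weight. The function name and the example figures (two group means weighted by group size) are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: combining two group means, weighted by group size.
group_means = [10.0, 20.0]
group_sizes = [30, 10]
print(weighted_mean(group_means, group_sizes))  # 12.5
```

Weighting each group mean by its group size in this way is also how a mean of two or more means (grand mean) is obtained.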
14
GEOMETRIC MEAN
• The sample geometric mean of n non-negative observations, $X_1, X_2, \ldots, X_n$, in a sample is defined by the $n$th root of the product.
$\bar{X}_G = \sqrt[n]{X_1 \cdot X_2 \cdots X_n} = (X_1 \cdot X_2 \cdots X_n)^{1/n}$
• If there are any negative measurements in a data set, the geometric
mean cannot be used.
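The definition can be sketched in Python as below. Working through logarithms, which avoids overflow in the product, is a choice made here and not something stated on the slide. The function name and sample values are illustrative.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n non-negative observations."""
    if any(x < 0 for x in values):
        raise ValueError("geometric mean cannot be used with negative values")
    if any(x == 0 for x in values):
        return 0.0  # the product is zero, so its n-th root is zero
    # exp(mean(log x)) equals (x1 * x2 * ... * xn) ** (1 / n)
    return math.exp(sum(math.log(x) for x in values) / len(values))


print(geometric_mean([1, 4, 16]))  # approximately 4
```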
15
HARMONIC MEAN
• Harmonic mean is defined as the reciprocal of the
average of reciprocals of the values of items of a series.
• Harmonic mean : $H = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$
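A short Python sketch of the definition above (the reciprocal of the average of reciprocals). The function name and the two example values are assumptions for illustration.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(x <= 0 for x in values):
        raise ValueError("this sketch requires strictly positive values")
    return len(values) / sum(1 / x for x in values)


# Hypothetical example: two rates of 40 and 60 units.
print(harmonic_mean([40, 60]))  # approximately 48
```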
16
MEAN OF TWO OR MORE MEANS
17
MEDIAN
• The median is the value that divides the distribution of data
points into two equal parts, that is, the value at which 50% of
the data points lie above it and 50% lie below it.
• The median is the middle of the quartiles (the values that
divide the series into quarters) and the middle of the
percentiles (the values that divide the series into defined
percentages).
18
Calculation :
Median for ungrouped series :
a) In a series with an odd number of untied values, the values in the series are
arranged from lowest to highest, and the value that divides the series in half is the
median.
b) In a series with even number of untied values, the two values that divide the
series in half are determined, and the arithmetic mean of these values is the median.
c) An alternative method for calculating the median is to determine the 50% value
on a cumulative frequency curve.
19
d) If the data include tied scores at the median point, interpolation within the tied scores is necessary.
• Let us consider the series 70, 73, 74, 75, 75, 75, 75, 80, in which the midpoint observations are tied.
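One common way to interpolate within tied scores is sketched below. It assumes scores are recorded to the nearest unit, so a score of 75 covers the interval 74.5 to 75.5. That convention and the function name are assumptions, since the slide does not show its own working.

```python
def interpolated_median(values, interval=1.0):
    """Median with linear interpolation inside the tied middle scores."""
    data = sorted(values)
    n = len(data)
    mid = data[n // 2]                        # score at the midpoint
    below = sum(1 for x in data if x < mid)   # observations below that score
    tied = sum(1 for x in data if x == mid)   # observations tied at it
    lower_limit = mid - interval / 2          # lower boundary of the score
    return lower_limit + interval * (n / 2 - below) / tied


series = [70, 73, 74, 75, 75, 75, 75, 80]
print(interpolated_median(series))  # 74.75
```

Three scores lie below 74.5 and the median needs four, so it falls one quarter of the way into the four tied 75s.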
20
Median for grouped data :
21
MODE
• The mode of a data set is that value that occurs with the greatest frequency.
• Whenever there are two non-adjacent scores with the same frequency and
they are the highest in the distribution, each score may be referred to as the
‘mode’ and the distribution is ‘bimodal’.
• In truly bimodal distribution, the population contains two sub-groups, each
of which has a different distribution that peaks at a different point.
• Calculation (empirical relationship, approximate for moderately skewed distributions) :
Mode = Mean – 3 [ Mean – Median ]
= 3 Median – 2 Mean
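The observed mode and the empirical formula can be compared in Python as sketched here. The sample series and the helper name `empirical_mode` are illustrative assumptions. `multimode` from the standard library returns every value tied for the highest frequency.

```python
from statistics import mean, median, multimode


def empirical_mode(mean_value, median_value):
    """Empirical relationship: Mode = 3 * Median - 2 * Mean."""
    return 3 * median_value - 2 * mean_value


scores = [70, 73, 74, 75, 75, 75, 75, 80]

print(multimode(scores))                             # [75]
print(empirical_mode(mean(scores), median(scores)))  # 75.75
```

The empirical value is only an approximation, so it need not coincide with the most frequent score.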
22
MEASURES OF DISPERSION
• To understand the data more completely, it is necessary to know how the members of the data set arrange themselves about the central or typical value.
• The following questions must be answered:
1. How spread out are the data points?
2. How stable are the values in the group?
Based on percentiles
• Percentile
• Range
• Inter-quartile range
Based on mean
• Mean deviation
• Standard deviation
• Variance
• Coefficient of variation
23
RANGE
• The range is the difference between the highest and
lowest values in a series.
Range = Maximum – Minimum.
• For example in the following series :
8, 8,10,10,10,12,13,14,15,16,58
Range = 58 – 8 = 50 min
24
PERCENTILE
• A percentile is the point below which the indicated percentage of observations lies when all of the observations are ranked in ascending order.
• The median is the 50th percentile.
• The 75th percentile is the point below which 75% of the
observations lie, while the 25th percentile is the point below which
25% of the observations lie.
25
INTER-QUARTILE RANGE
• The range of a variable between first quartile and the third
quartile is called inter-quartile range.
• Interquartile range = Q3 – Q1
• Median is the second quartile.
• Half of the interquartile range is called the semi-interquartile range, or sometimes the quartile deviation, which is a measure of dispersion around the median.
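Quartiles, the interquartile range and the semi-interquartile range can be sketched with Python's standard library. The data are the series listed on the RANGE slide. Textbooks and software differ on how quartiles are interpolated, so the default convention of `statistics.quantiles` used here is an assumption.

```python
from statistics import median, quantiles

series = [8, 8, 10, 10, 10, 12, 13, 14, 15, 16, 58]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(series, n=4)

iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range (quartile deviation)

print(q1, q2, q3)     # 10.0 12.0 15.0
print(iqr, semi_iqr)  # 5.0 2.5
```

The second quartile equals the median, and the extreme value 58 has no effect on the interquartile range.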
26
MEAN DEVIATION
• Because the mean has several advantages, it might seem logical to
measure dispersion by taking the “average deviation” from the mean.
That proves to be useless, because the sum of the deviations from the
mean is 0.
• However, this inconvenience can easily be solved by computing the
mean deviation, which is the average of the absolute value of the
deviations from the mean, as shown in the following formula:
Mean deviation = $\frac{\sum |X - \bar{X}|}{n}$
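A small Python sketch makes both points concrete: the signed deviations from the mean sum to zero, while their absolute values give a usable measure of spread. The function name and the four sample values are illustrative assumptions.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


values = [2, 4, 6, 8]
m = sum(values) / len(values)

print(sum(x - m for x in values))  # 0.0, signed deviations cancel out
print(mean_deviation(values))      # 2.0
```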
27
VARIANCE
• The variance is the sum of the squared deviations from the mean divided by the number of values in the series minus 1.
• Variance is symbolized by $S^2$ or V.
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$, where $\sum (X - \bar{X})^2$ is called the sum of squares.
• Dividing by $n - 1$ (called the degrees of freedom), instead of dividing by $n$, is necessary for the sample variance to be an unbiased estimator of the population variance.
• The numerator of the variance (i.e., the sum of the squared deviations of the
observations from the mean) is an extremely important entity in statistics. It is usually
called either the sum of squares (abbreviated SS) or the total sum of squares.
28
STANDARD DEVIATION
• The standard deviation is a measure of the variability among the
individual values within a group.
• Loosely defined, it is a description of the average distance of
individual observations from the group mean.
• From one point of view, however, s is similar to the mean; that is, it represents the square root of the mean of the squared deviations.
29
• Taking the mean and the standard deviation together, a sample can be described
in terms of its average score and in terms of its average variation.
• If more samples were taken from the same population it would be possible to
predict with some accuracy the average score of these samples and also the
amount of variation.
• The mathematical derivation of the standard deviation is presented here in some detail because the intermediate steps in its calculation (1) create a theme (called “sum of squares”) that is repeated over and over in statistical arithmetic and (2) create the quantity known as the sample variance.
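The intermediate steps can be sketched in Python as sum of squares, then sample variance, then standard deviation. The helper names and sample values are assumptions. The results are compared with the standard library's `variance` and `stdev`, which also divide by n - 1.

```python
import math
from statistics import stdev, variance


def sum_of_squares(values):
    """Sum of squared deviations of the observations from their mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def sample_variance(values):
    """Sum of squares divided by the degrees of freedom, n - 1."""
    return sum_of_squares(values) / (len(values) - 1)


def sample_sd(values):
    """Standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


values = [2, 4, 6, 8]
print(sum_of_squares(values))                      # 20.0
print(sample_variance(values), variance(values))   # both about 6.67
print(sample_sd(values), stdev(values))            # both about 2.58
```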
30
• The standard deviation is reported along with the sample mean, usually in the following format: mean ± SD.
• This format serves as a pertinent reminder that the SD measures the variability of values surrounding the middle of the data set.
• It also leads us to the practical application of the concepts of mean and standard deviation shown in the following rules of thumb:
$\bar{X}$ ± 1 SD encompasses approximately 68% of the values in a group.
$\bar{X}$ ± 2 SD encompasses approximately 95% of the values in a group.
$\bar{X}$ ± 3 SD encompasses approximately 99.7% of the values in a group.
31
• These rules of thumb are useful when deciding whether to report the mean ± SD or the median and range as the appropriate descriptive statistics for a group of data points.
• If roughly 95% of the values in a group are contained in the interval $\bar{X}$ ± 2 SD, researchers tend to use mean ± SD. Otherwise the median and the range are perhaps more appropriate.
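This rule of thumb can be sketched as a simple check in Python. The 0.95 threshold is a literal reading of "roughly 95%", and the function names are assumptions. The data are the series listed on the RANGE slide.

```python
from statistics import mean, stdev


def fraction_within(values, k=2):
    """Share of observations lying within k sample SDs of the mean."""
    m = mean(values)
    s = stdev(values)
    inside = [x for x in values if abs(x - m) <= k * s]
    return len(inside) / len(values)


def suggested_summary(values, threshold=0.95):
    """Pick a descriptive summary using the 2 SD rule of thumb."""
    if fraction_within(values, k=2) >= threshold:
        return "mean ± SD"
    return "median and range"


series = [8, 8, 10, 10, 10, 12, 13, 14, 15, 16, 58]
print(fraction_within(series))    # 10 of 11 values, about 0.91
print(suggested_summary(series))  # median and range
```

The single extreme value of 58 falls outside the 2 SD interval, so the check points towards the median and range for this series.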
32
Applications and characteristics
1. The standard deviation is extremely important in sampling theory, in correlational analysis, in estimating reliability of measures, and in determining relative position of an individual within a distribution of scores and between distributions of scores.
2. The standard deviation is the most widely used estimate of variation because of its
known algebraic properties and its amenability to use with other statistics.
3. It also provides a better estimate of variation in the population than the other indexes.
33
4. When the standard deviation of any sample is small, the sample mean is
close to any individual value.
5. When standard deviation of a random sample is small, the sample mean is
likely to be close to the mean of all the data in the population.
6. The standard error of the mean (SD / √n), not the standard deviation itself, decreases when the sample size increases.
34
COEFFICIENT OF VARIATION
• The coefficient of variation is the ratio of the standard deviation of a series to
the arithmetic mean of the series.
• The coefficient of variation is unitless and is expressed as a percentage.
Application and characteristics
The coefficient of variation is used to compare the relative variation, or spread, of the distributions of different series, samples, or populations or of the distributions of different characteristics of a single series.
35
Calculation:
• The coefficient of variation (CV) is calculated as CV (%) = (SD / $\bar{X}$) × 100
• For example,
In a typical medical school, the mean weight of 100 fourth-year medical students is 140 lb, with a standard deviation of 28 lb.
CV (%) = (28 / 140) × 100 = 20%
The coefficient of variation for weight is 28 lb divided by 140 lb, or 20%.
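The worked example can be reproduced with a short Python function. The function name is an assumption; the figures are those given for the medical students' weights.

```python
def coefficient_of_variation(sd, mean):
    """CV (%) = SD / mean * 100; unitless because the units cancel."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean


# Mean weight 140 lb, standard deviation 28 lb.
print(coefficient_of_variation(sd=28, mean=140))  # 20.0
```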
36
NORMAL DISTRIBUTION
• Normal distribution, also called Gaussian distribution, is a continuous, symmetric, bell shaped distribution and can be defined by a number of measures.
• The majority of measurements of continuous data in medicine and biology tend to approximate this theoretical distribution, which is named after Carl Friedrich Gauss, the person who best described it.
37
• The normal distribution is one of the most frequently used distributions in biomedical and dental
research.
• The normal distribution is a population frequency distribution.
• It is characterized by a bell-shaped curve that is unimodal and is symmetric around the mean of the
distribution.
• The normal curve depends on two parameters: the population mean and the population standard
deviation.
• In order to discuss the area under the normal curve in terms of easily seen percentages of the population distribution, the normal distribution has been standardized to the standard normal distribution, in which the population mean is 0 and the population standard deviation is 1.
• The area under the normal curve can be segmented starting with the mean in the center (on the x
axis) and moving by increments of 1 SD above and below the mean.
38
Figure shows a standard normal distribution (mean = 0; SD= 1) and the
percentages of area under the curve at each increment of SD.
39
• The total area beneath the normal curve is 1, or 100% of the observations in the
population represented by the curve.
• As indicated in the figure, the portion of the area under the curve between the mean and 1 SD above the mean is 34.13% of the total area.
• The same area is found between the mean and 1 SD below the mean.
• Moving from 1 SD to 2 SD above the mean cuts off an additional 13.59% of the area, and moving from 2 SD to 3 SD above the mean cuts off another 2.14%.
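The areas under the standard normal curve can be computed with `statistics.NormalDist` from Python's standard library. This is a minimal sketch; the helper name `area_between` is an assumption.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)


def area_between(a, b):
    """Area under the standard normal curve between a and b (in SD units)."""
    return std_normal.cdf(b) - std_normal.cdf(a)


for low, high in [(0, 1), (1, 2), (2, 3)]:
    share = 100 * area_between(low, high)
    print(f"{low} to {high} SD above the mean: {share:.2f}%")
```

By symmetry the same areas lie below the mean, so doubling the running totals gives the share of observations within 1, 2 and 3 SD.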
40
• The theory of the standard normal distribution leads us, therefore, to the following
property of a normally distributed variable:
Approximately 68.27% of the observations lie within 1 SD of the mean.
Approximately 95.45% of the observations lie within 2 SD of the mean.
Approximately 99.73% of the observations lie within 3 SD of the mean.
• Virtually all of the observations are contained within 3 SD of the mean. This is the justification used by those who label values outside of the interval $\bar{X}$ ± 3 SD as “outliers” or unlikely values.
• Incidentally, the number of standard deviations away from the mean is called Z
score.
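A Z score and the 3 SD outlier rule can be sketched as follows. The function names and the two test weights are assumptions. The mean of 140 lb and SD of 28 lb are borrowed from the coefficient of variation example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies away from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def is_outlier(x, mean, sd, limit=3.0):
    """Flag values lying more than `limit` SDs from the mean."""
    return abs(z_score(x, mean, sd)) > limit


print(z_score(196, mean=140, sd=28))     # 2.0
print(is_outlier(196, mean=140, sd=28))  # False
print(is_outlier(230, mean=140, sd=28))  # True
```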
41
MEASURES OF ASYMMETRY
• Skewness
• Kurtosis
42
SKEWNESS
A horizontal stretching of a frequency distribution to one side or
the other, so that one tail of observations is longer and has more
observations than the other tail, is called skewness.
43
• If a distribution is skewed, the mean moves farther in the direction of the
long tail than does the median, because the mean is more heavily
influenced by extreme values.
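The pull of the long tail on the mean can be checked numerically. The sketch below uses the series from the RANGE slide and a moment-based skewness coefficient. That coefficient is one common definition chosen here as an assumption, since the slides do not give a formula.

```python
from statistics import mean, median


def moment_skewness(values):
    """Third central moment divided by the 1.5 power of the second."""
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


series = [8, 8, 10, 10, 10, 12, 13, 14, 15, 16, 58]

print(mean(series), median(series))  # mean lies above the median
print(moment_skewness(series))       # positive: long tail to the right
```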
44
KURTOSIS
• It is characterized by a vertical
stretching of the frequency distribution.
• It is the measure of the peakedness of
a probability distribution.
• As shown in the figure, a kurtotic distribution could look more peaked or could look more flattened than the bell shaped normal distribution.
• A normal distribution has a kurtosis of 3, that is, zero excess kurtosis.
45
46
• Any distribution with kurtosis = 3 (excess kurtosis = 0) is called mesokurtic.
• In leptokurtic distributions, the central peak is higher and sharper, and the tails are longer and fatter.
• In platykurtic distributions, the central peak is lower and broader, and the tails are shorter and thinner.
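The three labels can be sketched in Python with the moment-based kurtosis (fourth central moment divided by the squared second moment), for which a normal distribution gives 3. The tolerance around 3 and both sample data sets are arbitrary assumptions for illustration.

```python
def moment_kurtosis(values):
    """Fourth central moment divided by the square of the second."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return m4 / m2 ** 2


def classify_kurtosis(kurtosis, tolerance=0.1):
    """Label a distribution by comparing its kurtosis with 3."""
    if abs(kurtosis - 3) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if kurtosis > 3 else "platykurtic"


flat = [1, 2, 3, 4, 5]         # evenly spread values
peaked = [0] * 8 + [-4, 4]     # sharp centre with two far-out values

print(moment_kurtosis(flat), classify_kurtosis(moment_kurtosis(flat)))
print(moment_kurtosis(peaked), classify_kurtosis(moment_kurtosis(peaked)))
```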
MEASURES OF RELATIONSHIP
Correlation :
• This is used to assess the relationship between two continuous
variables within a group of subjects.
• This is used for quantifying any association between two
continuous variables. But it does not prove that one particular
variable alone causes the change in the other.
47
Correlation coefficient :
• This is a measure of the degree of straight line association between two continuous variables.
• It is denoted by ‘r’, which may vary from -1 to +1.
• This can be of 5 types:
r = +1 [ perfect positive correlation ]
r = -1 [ perfect negative correlation ]
r = 0 [ no correlation ]
0 < r < 1 [ partially positive correlation ]
0 > r > -1 [ partially negative correlation ]
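A minimal Python sketch of the correlation coefficient and the five types listed above. It assumes the product-moment (Pearson) form of r, which the slides do not name explicitly. The function names and the small data sets are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(r, tol=1e-9):
    """Map r onto the five types of correlation."""
    if abs(r - 1) <= tol:
        return "perfect positive correlation"
    if abs(r + 1) <= tol:
        return "perfect negative correlation"
    if abs(r) <= tol:
        return "no correlation"
    return "partially positive correlation" if r > 0 else "partially negative correlation"


xs = [1, 2, 3, 4]
print(describe(pearson_r(xs, [2, 4, 6, 8])))  # perfect positive correlation
print(describe(pearson_r(xs, [8, 6, 4, 2])))  # perfect negative correlation
print(describe(pearson_r(xs, [2, 1, 4, 3])))  # partially positive correlation
```

A value of r near +1 or -1 still only quantifies association and does not show that one variable causes the change in the other.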
48
Types of correlation
49
CONCLUSION
• In conclusion, it is worth noting that the best research studies are initiated with a statistical plan already created.
• This plan may or may not have been developed with the assistance of a
statistician.
• The first step of data analysis is usually to describe the sample and then
sub groups within the sample. Frequency distribution, mean, median,
mode, range and the standard deviation are the most commonly used
statistics for accomplishing this task.
• This information can also be used as a background for the discussion
regarding inferential statistics.
50
REFERENCES :
 SANJEEV. B SARMUKADDAM, FUNDAMENTALS OF BIOSTATISTICS, 1st EDITION,
NEW DELHI, JITENDRA.P, 2006
 JOHN W. BEST AND JAMES V. KAHN, RESEARCH IN EDUCATION, 9th EDITION,
NEW DELHI, ASOKE K. GHOSH, 2006
 JAY S. KIM AND RONALD J. DAILEY, BIOSTATISTICS FOR ORAL HEALTH CARE, 1st
EDITION, NEW DELHI, BLACKWELL, 2008
 C. R. KOTHARI, RESEARCH METHODOLOGY, 2nd EDITION, NEW DELHI, NEW AGE
INTERNATIONAL LIMITED, 2004
 RONALD N. FORTHOFER, INTRODUCTION TO BIOSTATISTICS, LONDON,
ACADEMIC PRESS, 1995
51
 BRATATI BANERJEE, MAHAJAN’S METHODS IN BIOSTATISTICS, 9th
EDITION, NEW DELHI, JAYPEE BROTHERS, 2018
 F GAO SMITH AND J E SMITH, CLINICAL RESEARCH, 2nd EDITION, UK, BIOS
SCIENTIFIC PUBLISHERS LIMITED, 2005
 JAMES. F JEKEL, EPIDEMIOLOGY, BIOSTATISTICS AND PREVENTIVE
MEDICINE, 3rd EDITION, SAUNDERS, ELSEVIER PUBLICATIONS, 2007
 CHERYL BAGLEY THOMPSON, ‘DESCRIPTIVE DATA ANALYSIS’, AIR
MEDICAL JOURNAL, 2009, VOLUME 28 [ 2 ] : 56 - 59
52
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 

descriptive data analysis

  • 2. DESCRIPTIVE DATA. Presented by Dr. P. Gnana Sarita Kumari, I MDS, Department of Public Health Dentistry
  • 3. CONTENTS: INTRODUCTION; TYPES OF VARIABLES AND LEVELS OF MEASUREMENT; MEASURES OF CENTRAL TENDENCY; MEASURES OF DISPERSION; NORMAL DISTRIBUTION; MEASURES OF ASYMMETRY; MEASURES OF RELATIONSHIP; CONCLUSION; REFERENCES
  • 4. INTRODUCTION. DESCRIPTIVE ANALYSIS: • The data describe one group and that group only. • Descriptive data analysis limits generalization to the particular group of individuals observed. • No conclusions are extended beyond this group. • It provides valuable information about the nature of a particular group of individuals.
  • 5. CLASSIFICATION OF VARIABLES. QUALITATIVE: NOMINAL, ORDINAL. QUANTITATIVE: DISCRETE, CONTINUOUS.
  • 6. LEVELS OF MEASUREMENT • Introduced by Stevens: the nominal, ordinal, interval and ratio scales.
  • 7. NOMINAL MEASUREMENT SCALE. The nominal scale: • Represents the simplest type of data. • Values fall into unordered categories. • No quantitative relationship exists between the categories. • Numbers are used only for the sake of convenience.
  • 8. ORDINAL MEASUREMENT SCALE. The ordinal scale: • Categories can be ordered or ranked. • Although ordered, the categories are not quantified. • The number or label assigned indicates rank order, not the size of the difference between categories. • Precise measurement of differences does not exist.
  • 9. INTERVAL MEASUREMENT SCALE. The interval scale: • Observations can be ordered. • Precise differences between units of measure exist. • There is no meaningful absolute zero.
  • 10. RATIO MEASUREMENT SCALE • Possesses the same properties as the interval scale. • A true zero exists. • It is the highest level of measurement.
  • 11. MEASURES OF CENTRAL TENDENCY • Mean • Median • Mode
  • 12. TYPES OF MEAN • Sample mean • Weighted mean • Geometric mean • Harmonic mean • Mean of two or more means
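  • 13. SAMPLE MEAN • Mean = (sum of observations) / (number of observations), that is, $\bar{X} = \frac{\sum X}{n}$. • For an ungrouped series it is calculated by: 1. DIRECT METHOD, using the formula above. 2. ASSUMED MEAN METHOD, $\bar{X} = A + \frac{\sum (X - A)}{n}$, where A is an assumed (working) mean and n is the number of observations.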
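  • 14. WEIGHTED MEAN • Used for grouped data with a range of values; also called the GRAND MEAN. • Let $X_1, X_2, \ldots, X_n$ be n measurements, and let their relative importance be expressed by a corresponding set of weights $w_1, w_2, \ldots, w_n$. • Calculation: $\bar{X}_w = w_1 X_1 + w_2 X_2 + \cdots + w_n X_n = \sum_{i=1}^{n} w_i X_i$ when the weights sum to 1; for general weights, this sum is divided by $\sum_{i=1}^{n} w_i$. • For grouped data it can be computed by the middle point method or by the alternative method.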
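The weighted mean can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's worked example: the function name `weighted_mean` and the two group means with their group sizes are hypothetical, and the sum is divided by the total weight so that the weights need not add up to 1.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * X_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical example: two group means weighted by their group sizes.
group_means = [10.0, 20.0]
group_sizes = [30, 10]
grand_mean = weighted_mean(group_means, group_sizes)
print(grand_mean)
```

Using the group sizes as weights gives the mean of all observations pooled together, which is why the larger group pulls the result toward its own mean.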
  • 15. GEOMETRIC MEAN • The sample geometric mean of n non-negative observations $X_1, X_2, \ldots, X_n$ is defined as the nth root of their product: $\bar{X}_G = \sqrt[n]{X_1 X_2 \cdots X_n} = (X_1 X_2 \cdots X_n)^{1/n}$. • If there are any negative measurements in a data set, the geometric mean cannot be used.
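A minimal Python sketch of the geometric mean follows. The function name `geometric_mean` and the sample values are illustrative assumptions; the product is computed through logarithms, a common way to avoid overflow for long series, and negative values are rejected as the definition requires.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of n non-negative values."""
    if not values:
        raise ValueError("at least one observation is required")
    if any(x < 0 for x in values):
        raise ValueError("geometric mean is undefined for negative values")
    if any(x == 0 for x in values):
        return 0.0  # a zero factor makes the whole product zero
    # exp(mean(log x)) equals (x1 * x2 * ... * xn) ** (1 / n)
    return math.exp(sum(math.log(x) for x in values) / len(values))


print(geometric_mean([2, 8]))  # square root of 16, up to rounding
```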
  • 16. HARMONIC MEAN • The harmonic mean is defined as the reciprocal of the average of the reciprocals of the values in a series. • Harmonic mean: $\bar{X}_H = \frac{n}{\sum_{i=1}^{n} 1/X_i}$
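The definition above translates directly into code. The sketch below is illustrative only: the function name `harmonic_mean` and the two speeds are hypothetical, and the values are assumed to be strictly positive so that every reciprocal exists.

```python
def harmonic_mean(values):
    """Return the reciprocal of the average of the reciprocals."""
    if not values:
        raise ValueError("at least one observation is required")
    if any(x <= 0 for x in values):
        raise ValueError("this sketch assumes strictly positive values")
    mean_of_reciprocals = sum(1.0 / x for x in values) / len(values)
    return 1.0 / mean_of_reciprocals


# Hypothetical example: equal distances travelled at 40 and 60 km/h.
average_speed = harmonic_mean([40, 60])
print(average_speed)
```

The result lies below the arithmetic mean of 50, because the slower leg takes more time and therefore counts for more.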
  • 17. MEAN OF TWO OR MORE MEANS
  • 18. MEDIAN • The median is the value that divides the distribution of data points into two equal parts, that is, the value at which 50% of the data points lie above it and 50% lie below it. • The median is the second quartile (quartiles are the values that divide the series into quarters) and the 50th percentile (percentiles are the values that divide the series into defined percentages).
  • 19. Calculation: Median for ungrouped series: a) In a series with an odd number of untied values, the values are arranged from lowest to highest, and the value that divides the series in half is the median. b) In a series with an even number of untied values, the two values that divide the series in half are determined, and the arithmetic mean of these two values is the median. c) An alternative method for calculating the median is to determine the 50% value on a cumulative frequency curve.
  • 20. d) If the data include tied scores at the median point, interpolation within the tied scores is necessary. • Consider the series 70, 73, 74, 75, 75, 75, 75, 80, in which the midpoint observations are tied.
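The cases above can be checked with Python's standard `statistics` module. This is a sketch under assumptions: the short odd and even series are made up for illustration, and `median_grouped` with its default interval of 1 is used as one common way to interpolate within tied scores, which may differ in detail from the interpolation the slides intended.

```python
import statistics

# a) Odd number of untied values: the middle value after sorting.
odd_series = [9, 3, 7, 1, 5]
median_odd = statistics.median(odd_series)

# b) Even number of untied values: mean of the two middle values.
even_series = [1, 2, 3, 4]
median_even = statistics.median(even_series)

# d) Tied scores at the median point (series from the slide).
tied_series = [70, 73, 74, 75, 75, 75, 75, 80]
plain_median = statistics.median(tied_series)
# Treat each score as the midpoint of a unit-wide interval and
# interpolate within the interval that contains the tied scores.
interpolated_median = statistics.median_grouped(tied_series, interval=1)

print(median_odd, median_even, plain_median, interpolated_median)
```

For the tied series, three values lie below 75 and four values equal 75, so the interpolated median falls one quarter of the way into the interval 74.5 to 75.5.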
  • 21. Median for grouped data:
  • 22. MODE • The mode of a data set is the value that occurs with the greatest frequency. • Whenever there are two non-adjacent scores with the same frequency and they are the highest in the distribution, each score may be referred to as a ‘mode’ and the distribution is ‘bimodal’. • In a truly bimodal distribution, the population contains two sub-groups, each of which has a different distribution that peaks at a different point. • Calculation (an empirical approximation for moderately skewed distributions): Mode = Mean - 3 (Mean - Median), that is, Mode = 3 Median - 2 Mean.
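A short Python sketch of both ideas follows. The data values, the helper name `empirical_mode`, and the mean and median passed to it are hypothetical; `statistics.multimode` is the standard-library call that returns every most-frequent value, so a bimodal series yields two entries.

```python
import statistics


def empirical_mode(mean, median):
    """Approximate the mode as 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


# Hypothetical series in which 3 and 5 each occur twice.
scores = [2, 3, 3, 5, 5, 7]
modes = statistics.multimode(scores)
is_bimodal = len(modes) == 2

# Hypothetical summary values for a moderately skewed distribution.
approx_mode = empirical_mode(mean=10, median=9)

print(modes, is_bimodal, approx_mode)
```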
  • 23. MEASURES OF DISPERSION • To understand the data more completely, it is necessary to know how the members of the data set arrange themselves about the central or typical value. • The following questions must be answered: 1. How spread out are the data points? 2. How stable are the values in the group? • Measures based on percentiles: percentile, range, inter-quartile range. • Measures based on the mean: mean deviation, standard deviation, variance, coefficient of variation.
  • 24. RANGE • The range is the difference between the highest and lowest values in a series. Range = Maximum - Minimum. • For example, in the series 8, 8, 10, 10, 10, 12, 13, 14, 15, 16, 18 (in minutes): Range = 18 - 8 = 10 min.
  • 25. PERCENTILE • A percentile is the point below which the indicated percentage of observations lies when all of the observations are ranked in ascending order. • The median is the 50th percentile. • The 75th percentile is the point below which 75% of the observations lie, while the 25th percentile is the point below which 25% of the observations lie.
  • 26. INTER-QUARTILE RANGE • The range of a variable between the first quartile and the third quartile is called the inter-quartile range. • Interquartile range = Q3 - Q1. • The median is the second quartile. • Half of the interquartile range is called the semi-interquartile range, or sometimes the quartile deviation, which is a measure of dispersion around the median.
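The quartiles and the ranges derived from them can be sketched with `statistics.quantiles`. The nine-value series is hypothetical, and the `"inclusive"` method is an assumption: several conventions for locating quartiles exist, and they can give slightly different cut points on small samples.

```python
import statistics

# Hypothetical series, already in ascending order.
observations = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(observations, n=4, method="inclusive")

value_range = max(observations) - min(observations)
interquartile_range = q3 - q1
semi_interquartile_range = interquartile_range / 2

print(q1, q2, q3, value_range, interquartile_range, semi_interquartile_range)
```

The second cut point `q2` is the median, which matches the statement that the median is the second quartile.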
  • 27. MEAN DEVIATION • Because the mean has several advantages, it might seem logical to measure dispersion by taking the “average deviation” from the mean. That proves to be useless, because the sum of the deviations from the mean is 0. • This inconvenience is easily solved by computing the mean deviation, which is the average of the absolute values of the deviations from the mean: Mean deviation $= \frac{\sum |X - \bar{X}|}{n}$
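Both points can be verified numerically. In the sketch below the four values and the function name `mean_deviation` are illustrative assumptions; the first computation shows the signed deviations cancelling, and the second takes absolute values before averaging.

```python
def mean_deviation(values):
    """Return the average absolute deviation from the arithmetic mean."""
    if not values:
        raise ValueError("at least one observation is required")
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Hypothetical series with mean 5.
series = [2, 4, 6, 8]
series_mean = sum(series) / len(series)

# Signed deviations cancel, so their sum carries no information on spread.
sum_of_signed_deviations = sum(x - series_mean for x in series)

print(sum_of_signed_deviations, mean_deviation(series))
```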
  • 28. VARIANCE • The variance is the sum of the squared deviations from the mean divided by the number of values in the series minus 1. • Variance is symbolized by $S^2$ or V: $S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$, where $\sum (X - \bar{X})^2$ is called the sum of squares. • Dividing by n - 1 (called the degrees of freedom), instead of dividing by n, is necessary for the sample variance to be an unbiased estimator of the population variance. • The numerator of the variance (i.e., the sum of the squared deviations of the observations from the mean) is an extremely important entity in statistics. It is usually called either the sum of squares (abbreviated SS) or the total sum of squares.
  • 29. STANDARD DEVIATION • The standard deviation is a measure of the variability among the individual values within a group. • Loosely defined, it is a description of the average distance of individual observations from the group mean. • From one point of view, the standard deviation s is similar to the mean; that is, it is the square root of the mean of the squared deviations.
  • 30. • Taking the mean and the standard deviation together, a sample can be described in terms of its average score and in terms of its average variation. • If more samples were taken from the same population, it would be possible to predict with some accuracy the average score of these samples and also the amount of variation. • The mathematical derivation of the standard deviation is presented here in some detail because the intermediate steps in its calculation (1) create a theme (called the “sum of squares”) that is repeated over and over in statistical arithmetic and (2) create the quantity known as the sample variance.
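The intermediate steps named above (sum of squares, then sample variance, then standard deviation) can be sketched as follows. The function names and the four scores are hypothetical; the divisor is n - 1, as in the definition of the sample variance.

```python
import math


def sum_of_squares(values):
    """Return the sum of squared deviations from the mean (SS)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """Return SS divided by the degrees of freedom, n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    return sum_of_squares(values) / (n - 1)


def sample_sd(values):
    """Return the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical scores with mean 5: deviations are -3, -1, 1, 3.
scores = [2, 4, 6, 8]
ss = sum_of_squares(scores)
variance = sample_variance(scores)
sd = sample_sd(scores)
print(ss, variance, sd)
```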
  • 31. • The standard deviation is reported along with the sample mean, usually in the following format: mean ± SD. • This format serves as a pertinent reminder that the SD measures the variability of values surrounding the middle of the data set. • It also leads us to the practical application of the concepts of mean and standard deviation shown in the following rules of thumb: $\bar{X}$ ± 1 SD encompasses approximately 68% of the values in a group. $\bar{X}$ ± 2 SD encompasses approximately 95% of the values in a group. $\bar{X}$ ± 3 SD encompasses approximately 99% of the values in a group.
  • 32. • These rules of thumb are useful when deciding whether to report the mean ± SD or the median and range as the appropriate descriptive statistics for a group of data points. • If roughly 95% of the values in a group are contained in the interval $\bar{X}$ ± 2 SD, researchers tend to use mean ± SD. Otherwise the median and the range are perhaps more appropriate.
  • 33. Applications and characteristics 1. The standard deviation is extremely important in sampling theory, in correlational analysis, in estimating the reliability of measures, and in determining the relative position of an individual within a distribution of scores and between distributions of scores. 2. The standard deviation is the most widely used estimate of variation because of its known algebraic properties and its amenability to use with other statistics. 3. It also provides a better estimate of variation in the population than the other indexes.
  • 34. 4. When the standard deviation of a sample is small, the individual values lie close to the sample mean. 5. When the standard deviation of a random sample is small, the sample mean is likely to be close to the mean of all the data in the population. 6. The standard deviation itself does not shrink systematically as the sample size increases; what decreases is the standard error of the mean, SD / $\sqrt{n}$.
  • 35. COEFFICIENT OF VARIATION • The coefficient of variation is the ratio of the standard deviation of a series to the arithmetic mean of the series. • The coefficient of variation is unitless and is expressed as a percentage. Application and characteristics: The coefficient of variation is used to compare the relative variation, or spread, of the distributions of different series, samples, or populations, or of the distributions of different characteristics of a single series.
  • 36. Calculation: • The coefficient of variation (CV) is calculated as CV (%) $= \frac{SD}{\bar{X}} \times 100$. • For example, in a typical medical school, the mean weight of 100 fourth-year medical students is 140 lb, with a standard deviation of 28 lb. CV (%) = 28 / 140 × 100 = 20%.
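The worked example can be reproduced in code. The function name `coefficient_of_variation` is an illustrative assumption; the mean of 140 lb and standard deviation of 28 lb are the figures from the slide.

```python
def coefficient_of_variation(sd, mean):
    """Return the standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation needs a non-zero mean")
    return 100.0 * sd / mean


# Figures from the slide: weights of 100 fourth-year medical students.
cv_weight = coefficient_of_variation(sd=28, mean=140)
print(f"CV = {cv_weight:.0f}%")
```

Because the units cancel, the same function can compare the spread of variables measured on different scales, for example weight in pounds against height in inches.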
  • 37. NORMAL DISTRIBUTION • The normal distribution, also called the Gaussian distribution (named after Carl Friedrich Gauss, the person who best described it), is a continuous, symmetric, bell-shaped distribution and can be defined by a number of measures. • The majority of measurements of continuous data in medicine and biology tend to approximate this theoretical distribution.
  • 38. • The normal distribution is one of the most frequently used distributions in biomedical and dental research. • The normal distribution is a population frequency distribution. • It is characterized by a bell-shaped curve that is unimodal and is symmetric around the mean of the distribution. • The normal curve depends on two parameters: the population mean and the population standard deviation. • In order to discuss the area under the normal curve in terms of easily seen percentages of the population distribution, the normal distribution has been standardized to the standard normal distribution, in which the population mean is 0 and the population standard deviation is 1. • The area under the normal curve can be segmented starting with the mean in the center (on the x axis) and moving by increments of 1 SD above and below the mean.
  • 39. Figure: a standard normal distribution (mean = 0; SD = 1) and the percentages of area under the curve at each increment of SD.
  • 40. • The total area beneath the normal curve is 1, or 100% of the observations in the population represented by the curve. • As indicated in the figure, the portion of the area under the curve between the mean and 1 SD above the mean is 34.13% of the total area. • The same area is found between the mean and 1 SD below the mean. • Moving from 1 SD to 2 SD above the mean cuts off an additional 13.59% of the area, and moving from 2 SD to 3 SD above the mean cuts off another 2.14%.
  • 41. • The theory of the standard normal distribution leads us, therefore, to the following property of a normally distributed variable: Approximately 68.27% of the observations lie within 1 SD of the mean. Approximately 95.45% of the observations lie within 2 SD of the mean. Approximately 99.73% of the observations lie within 3 SD of the mean. • Virtually all of the observations are contained within 3 SD of the mean. This is the justification used by those who label values outside of the interval $\bar{X}$ ± 3 SD as “outliers” or unlikely values. • Incidentally, the number of standard deviations by which a value lies away from the mean is called its Z score.
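The areas quoted for the standard normal curve can be recomputed from the error function in Python's `math` module. The function names `area_within` and `z_score` are illustrative assumptions, and the weight of 168 lb is a hypothetical value chosen against the mean of 140 lb and SD of 28 lb used earlier in the slides.

```python
import math


def area_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2.0))


def z_score(x, mean, sd):
    """Number of standard deviations by which x lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


# Percentages within 1, 2 and 3 SD of the mean.
within = {k: 100.0 * area_within(k) for k in (1, 2, 3)}

# One-sided bands: mean to 1 SD, 1 SD to 2 SD, 2 SD to 3 SD.
band_0_1 = within[1] / 2
band_1_2 = (within[2] - within[1]) / 2
band_2_3 = (within[3] - within[2]) / 2

for k, pct in within.items():
    print(f"within {k} SD: {pct:.2f}%")
print(f"bands: {band_0_1:.2f}%, {band_1_2:.2f}%, {band_2_3:.2f}%")

# Hypothetical student weighing 168 lb (mean 140 lb, SD 28 lb).
print(z_score(168, mean=140, sd=28))
```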
  • 42. MEASURES OF ASYMMETRY • Skewness • Kurtosis
  • 43. SKEWNESS • A horizontal stretching of a frequency distribution to one side or the other, so that one tail of observations is longer and has more observations than the other tail, is called skewness.
  • 44. • If a distribution is skewed, the mean moves farther in the direction of the long tail than does the median, because the mean is more heavily influenced by extreme values.
  • 45. KURTOSIS • It is characterized by a vertical stretching of the frequency distribution. • It is the measure of the peakedness of a probability distribution. • As shown in the figure, a kurtotic distribution could look more peaked or more flattened than the bell-shaped normal distribution. • A normal distribution has a kurtosis of 3, that is, zero excess kurtosis.
  • 46. • Any distribution with kurtosis = 3 is called mesokurtic. • In a leptokurtic distribution, the central peak is higher and sharper, and the tails are longer and fatter. • In a platykurtic distribution, the central peak is lower and broader, and the tails are shorter and thinner.
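Skewness and kurtosis can be sketched from central moments. The formulas used here (third moment over the 1.5 power of the second, and fourth moment over the square of the second) are the simple population versions and are an assumption; statistical packages often apply small-sample corrections, so their output can differ. The two data sets are hypothetical.

```python
def central_moment(values, order):
    """Return the average of (x - mean) ** order."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (0 for symmetric data)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based kurtosis: m4 / m2 ** 2 (3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]       # no long tail on either side
right_skewed = [1, 1, 1, 2, 10]   # one extreme value in the upper tail

print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric), kurtosis(symmetric) - 3)  # kurtosis and excess
```

A kurtosis below 3 (negative excess kurtosis), as for the flat symmetric series, corresponds to the platykurtic case; a value above 3 corresponds to the leptokurtic case.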
  • 47. MEASURES OF RELATIONSHIP. Correlation: • This is used to assess and quantify the relationship between two continuous variables within a group of subjects. • It does not prove that one particular variable alone causes the change in the other.
  • 48. Correlation coefficient: • This is a measure of the degree of straight-line association between two continuous variables. • It is denoted by ‘r’, which may vary from -1 to +1. • It can be of 5 types: r = +1 [perfect positive correlation]; r = -1 [perfect negative correlation]; r = 0 [no correlation]; 0 < r < 1 [partial positive correlation]; -1 < r < 0 [partial negative correlation].
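The slide does not show how r is computed; the sketch below assumes the usual Pearson product-moment formula, the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The function name `pearson_r` and the paired values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements.
x = [1, 2, 3, 4]
r_positive = pearson_r(x, [2, 4, 6, 8])   # y rises in step with x
r_negative = pearson_r(x, [8, 6, 4, 2])   # y falls in step with x
r_partial = pearson_r(x, [1, 3, 2, 4])    # y rises with x, but not perfectly

print(r_positive, r_negative, r_partial)
```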
  • 50. CONCLUSION • In conclusion, the best research studies are initiated with a statistical plan already created. • This plan may or may not have been developed with the assistance of a statistician. • The first step of data analysis is usually to describe the sample and then the sub-groups within the sample. The frequency distribution, mean, median, mode, range and standard deviation are the statistics most commonly used for this task. • This information can also be used as background for the discussion of inferential statistics.