Box whisker show
1. Application of Box-whisker Plot in
Psychological Research
Dr. D. Dutta Roy, Ph.D.
Psychology Research Unit
INDIAN STATISTICAL INSTITUTE
203, B.T. Road, Kolkata – 700 108
E-mail: ddroy@isical.ac.in
http://www.isical.ac.in/~ddroy
Venue: Psychology Research Unit, ISI., Kolkata
2. Box-Whisker Plot
JOHN WILDER TUKEY (1915-2000)
• It is a plot that displays summary
information about the
distribution of the values.
• SPSS and STATISTICA are
statistical software packages that
can draw box-whisker plots.
Dr. D. Dutta Roy, Indian Statistical Institute
3. PROPERTIES
4. HINGES
• There are two hinges: the 25th and 75th percentiles.
• The lower boundary of the box is the 25th
percentile and the upper boundary of the box is the 75th
percentile.
• The horizontal line inside the box represents the
median. 50% of the cases are included within the
box.
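A minimal sketch of the hinges, using only Python's standard library. The scores are made-up illustration data, and the `inclusive` quantile method is an assumption: statistical packages may use a slightly different percentile or hinge convention, so values can differ a little from their output.

```python
from statistics import quantiles

def box_summary(values):
    """Return (q1, median, q3, box_length) for a list of numbers."""
    # n=4 splits the data into quartiles: 25th, 50th and 75th percentiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, median, q3, q3 - q1

scores = [2, 4, 4, 5, 6, 7, 8, 9, 11]  # hypothetical scores
q1, median, q3, box_length = box_summary(scores)
print("lower hinge:", q1, "median:", median, "upper hinge:", q3)
print("box length:", box_length)
```

The box length (upper hinge minus lower hinge) is the unit used on later slides to define outliers and extreme values.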
5. Whiskers
The largest and smallest
observed values that
are not outliers are
marked.
Lines are drawn from the
ends of the box to
these values. These
lines are called
whiskers.
6. OUTLYING VALUES
Cases with values that are more than 3 box-lengths from the upper or
lower edge of the box are called extreme values. These are designated
with an asterisk (*).
Cases with values that are between 1.5 and 3 box-lengths from the upper
or lower edge of the box are called outliers and are designated with a
circle (O).
[Figures: two box plots of DATA (N = 6), one marking the 3 box-length boundary and one the 1.5 box-length boundary.]
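The whisker and outlier rules above can be sketched as follows. This is an illustration under stated assumptions, not the SPSS or STATISTICA implementation: the data are hypothetical, the function name `classify` is made up, and the `inclusive` quantile method is one of several percentile conventions.

```python
from statistics import quantiles

def classify(values):
    """Split values into whisker ends, outliers and extreme values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    box = q3 - q1  # box length
    inner_lo, inner_hi = q1 - 1.5 * box, q3 + 1.5 * box
    outer_lo, outer_hi = q1 - 3 * box, q3 + 3 * box
    # Extreme values: more than 3 box-lengths from an edge of the box.
    extremes = [v for v in values if v < outer_lo or v > outer_hi]
    # Outliers: between 1.5 and 3 box-lengths from an edge of the box.
    outliers = [v for v in values
                if outer_lo <= v < inner_lo or inner_hi < v <= outer_hi]
    # Whiskers end at the smallest and largest values that are not outlying.
    inside = [v for v in values if inner_lo <= v <= inner_hi]
    return {"whiskers": (min(inside), max(inside)),
            "outliers": outliers,
            "extremes": extremes}

data = [2, 4, 4, 5, 6, 7, 8, 9, 11, 20, 30]  # hypothetical scores
result = classify(data)
print(result)
```

With these made-up scores the box runs from 4.5 to 10, so 20 lies between 1.5 and 3 box-lengths above the box (an outlier) and 30 lies more than 3 box-lengths above it (an extreme value).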
7. Normal Probability Curve Properties
• Mean, median and mode = 0 (standard normal curve).
• The mean, median and mode all
coincide, and there is perfect balance
between the right and left halves of the
curve.
• Within ±1 SD of the mean (the
middle two-thirds) = 68.27% of total
cases.
• Within ±2 SD of the mean =
about 95% (95.45%) of total cases.
• Within ±3 SD of the mean =
99.7% (nearly 100%) of total cases.
• Skewness = 0.
• Direction of skewness = when the tail of a
distribution spreads to the left, it is negatively
skewed; when the tail spreads to the right,
it is positively skewed.
• Peakedness = mesokurtic.
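The percentages above can be checked numerically. The sketch below uses the standard identity that the proportion of a normal distribution within k SD of the mean equals erf(k / √2); the function name `area_within` is made up for this illustration.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {area_within(k):.2%}")
```

This gives approximately 68.27%, 95.45% and 99.73% for 1, 2 and 3 SD.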
10. What is outlier ?
• Outliers are
observations with a
unique combination
of characteristics
identifiable as
distinctly different
from the other
observations.
Do you find outliers in the pictures?
11. Impact of outliers
Correlations between income and expenditure

                             Original data   After eliminating 99999
Pearson correlation          0.16988         0.50**
Sig. (2-tailed)              0.20646         0.00
N (income)                   60              58
N (income and expenditure)   57              55

**. Correlation is significant at the 0.01 level (2-tailed).
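The effect shown in the table can be reproduced in miniature. In the sketch below all numbers are hypothetical and are not the author's data; 99999 is treated as a code that was entered as if it were a real income value, and `pearson` and `drop_coded` are helper names made up for this illustration.

```python
import math

MISSING_CODE = 99999  # assumed to be a code, not a real income value

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def drop_coded(xs, ys, code=MISSING_CODE):
    """Remove every case in which either variable holds the code."""
    kept = [(x, y) for x, y in zip(xs, ys) if x != code and y != code]
    return [x for x, _ in kept], [y for _, y in kept]

income = [10, 20, 30, 40, 99999, 60]     # hypothetical, one coded case
expenditure = [8, 15, 26, 33, 41, 50]    # hypothetical

r_raw = pearson(income, expenditure)
r_clean = pearson(*drop_coded(income, expenditure))
print(f"with 99999: r = {r_raw:.2f}; without: r = {r_clean:.2f}")
```

A single coded case dominates the income variance, so the correlation computed on the raw data is much weaker than the one computed after the case is removed.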
13. 1. Procedural Error
This is a data entry error or a mistake in coding.
14. 2. Extraordinary event for which the researcher has
an explanation.
15. 3. Extraordinary event for which the researcher has no
explanation.
4. Observations that fall within the ordinary
range of values on each of the variables but
are unique in their combination of values
across the variables.
16. Is outlier harmful ?
• Outliers cannot be
categorized as either
beneficial or problematic,
but instead must be viewed
within the context of the
analysis and should be
evaluated by the types of
information they may
provide.
17. Can outlier be detected ?
• Non-robust statistics such as the correlation coefficient are seriously affected by outliers. Outlier
detection is therefore a prelude to item analysis and to testing the reliability and validity of a
psychological instrument using correlation coefficients.
• In univariate statistics, outliers can be detected by stem-and-leaf plots and box-whisker plots.
• In bivariate statistics, the scatter plot, and in multivariate statistics, Mahalanobis D², are
useful for outlier detection.
[Figure: box plot of DATA (N = 6).]
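A minimal sketch of Mahalanobis D² for the two-variable case, written in plain Python. The points are hypothetical, the function name is made up, and no cutoff for flagging a case is implied here because the slides do not give one. The last point lies within the ordinary range of each variable but has an unusual combination of values.

```python
def mahalanobis_d2(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    d2 = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2

# Hypothetical bivariate scores; the last case is the unusual combination.
points = [(1, 1.2), (2, 1.9), (3, 3.1), (4, 4.2), (5, 4.8),
          (6, 6.1), (7, 6.9), (8, 8.2), (9, 8.8), (3, 8)]
d2 = mahalanobis_d2(points)
for point, value in zip(points, d2):
    print(point, round(value, 2))
```

The case (3, 8) receives the largest D² even though neither 3 nor 8 is unusual on its own, which is why a univariate box plot would not flag it.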
18. The Information out of properties
The box-plot contains an impressive amount of information.
• From the median one can determine the central tendency or location.
• From the length of the box one can determine the spread, or
variability, of the observations.
• If the median is not in the centre of the box, the observed values are
skewed.
• If the median is closer to the bottom of the box than to the top, the
data are positively skewed.
• If the median is closer to the top of the box than to the bottom the
distribution is negatively skewed.
• The length of the tail is shown by the whiskers and the outlying and
extreme points.
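The skewness rule above can be sketched by comparing the two halves of the box. The data sets are hypothetical, the function name `box_skew` is made up, and the `inclusive` quantile method is an assumed percentile convention.

```python
from statistics import quantiles

def box_skew(values):
    """Read the direction of skew from the median's position in the box."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    lower_half = median - q1   # distance from bottom of box to median
    upper_half = q3 - median   # distance from median to top of box
    if lower_half < upper_half:
        return "positive"      # median closer to the bottom of the box
    if lower_half > upper_half:
        return "negative"      # median closer to the top of the box
    return "symmetric"

print(box_skew([1, 2, 2, 3, 3, 4, 6, 9, 15]))         # long upper tail
print(box_skew([1, 7, 10, 12, 13, 13, 14, 14, 15]))   # long lower tail
print(box_skew([1, 2, 3, 4, 5, 6, 7, 8, 9]))          # evenly spaced
```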
19. CASE STUDY ON
APPLICATION OF BOX-
WHISKER PLOT IN DETECTING
CHANGE IN LEARNING
PROCESS
20. Detecting change
in learning process
• Learning is the modification of
behaviour through practice and
experience.
• Change in the learning process can
usually be detected using
a learning curve.
• A learning curve is a graphical
representation of the changing
rate of learning for a given
activity or tool.
• Typically, the increase in
retention of information is
sharpest after the initial attempts,
and then gradually evens out,
meaning that less and less new
information is retained after each
repetition.
21. CASE STUDY
25 students were trained with 7 training
modules of Fast ForWord.
Results were analyzed in terms of box-
whisker plots.
22. Circus Sequence (CS)
• The participant develops
listening accuracy through
sweep sounds presented
at different frequencies and
durations, and with
different lengths of time
between sounds. The
frequencies and durations
of the sound sweeps
correspond to the rapid
transitions in the sounds of
the English language.
23. Results of CS
[Figure: Box & Whisker Plot (CS exercise, Treatments = 34). Y-axis: Percentage of Success; x-axis: trials T1 to T34. Legend: Min-Max, 25%-75%, Median value.]
Size: The box gradually becomes larger, indicating that more cases are included in the learning
competency group.
Location of median: The median moves upward with successive trials. This indicates growing learning
competency across trials.
Whiskers: The upper whisker gradually vanishes and the lower whisker moves upward. This indicates that
most cases achieved learning competency, though a few cases found it difficult to achieve.
After 100% achievement, the box size increases, indicating fluctuation of attention or the operation of other
intervening factors once the goal has been achieved.
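The quantities read off the plot (Min-Max, 25%-75%, median) can be tabulated per trial, as in the sketch below. The scores are hypothetical and are not the case-study data; the function name and the `inclusive` quantile method are assumptions made for illustration.

```python
from statistics import quantiles

def five_number_summary(scores):
    """Return (min, q1, median, q3, max), the values drawn in each box."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return min(scores), q1, median, q3, max(scores)

# Hypothetical percentage-of-success scores of five students on three trials.
trials = {
    "T1": [10, 20, 30, 40, 50],
    "T2": [30, 40, 50, 70, 80],
    "T3": [60, 80, 90, 100, 100],
}
summaries = {name: five_number_summary(s) for name, s in trials.items()}
medians = [summaries[name][2] for name in trials]
# The median "moves upward" if it never falls from one trial to the next.
median_rises = all(a <= b for a, b in zip(medians, medians[1:]))

for name, summary in summaries.items():
    print(name, summary)
print("median rises across trials:", median_rises)
```

Comparing these summaries trial by trial gives a numerical counterpart to the visual reading of box size, median location and whisker length.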
24. Old MacDonald’s Flying Farm
(OM)
• Students use the
computer mouse to
catch and hold a flying
animal. The animal
repeats a single syllable
several times, and
students must release
the animal when they
hear a change in the
syllable.
25. Results of OM
[Figure: Box & Whisker Plot (OM exercise, Treatment = 20). Y-axis: Percentage of Success; x-axis: trials T1 to T20. Legend: Min-Max, 25%-75%, Median value.]
26. Phonic Words (PW)
• Students see two
pictures representing
two similar words that
differ only by initial or
final consonant (“tack”
versus “tag”). When
students hear the word
representing one of the
pictures, they must
click the picture that
matches the word.
[Figure: Box & Whisker Plot (PW Exercise, Treatment = 20). Y-axis: Percentage of Success; x-axis: trials T1 to T20. Legend: Min-Max, 25%-75%, Median value.]
27. Compare relative effectiveness
of training modules
[Figure: three panels shown together for comparison: Box & Whisker Plot (PW Exercise, Treatment = 20), Box & Whisker Plot (OM exercise, Treatment = 20) and Box & Whisker Plot (CS exercise, Treatments = 34). Y-axis: Percentage of Success; x-axis: trials. Legend: Min-Max, 25%-75%, Median value.]
28. SUMMARY
• The box-whisker plot is a useful statistical tool to
detect outliers and to detect change in the
learning process.
• The box plot is an effective statistical tool to
compare the relative effectiveness of different
training modules.