This document provides an overview of SPSS (Statistical Package for the Social Sciences) software and quantitative data analysis. It discusses what SPSS is used for, how to access SPSS, and some basic definitions related to quantitative research such as variables, levels of measurement, and descriptive statistics. It also gives examples of entering and working with quantitative data in SPSS, including creating a codebook, defining variables, inputting data, and saving files. The overall summary is that SPSS is a tool for analyzing quantitative data through descriptive statistics and significance testing to answer research questions.
2. ∗ SPSS is a computer program for analysing
quantitative data.
∗ This can range from basic descriptive statistics such as
the mean, mode, median and range to powerful tests
of significance (i.e. whether we accept or reject a
hypothesis).
∗ What the data looks like, and what that means if
anything.
What is SPSS?
3. ∗ You can access SPSS on the vast majority of PCs at UCS: in these labs, on the
Waterfront PCs and on the library PCs
∗ Additionally, if you wish, you can borrow a copy of SPSS to install on your
own home PC or laptop. There are 16 copies in the library; you just need to
borrow the disc and input the licence code that comes with it (licences do
expire, and when they do you can just come and borrow a new
version of SPSS or obtain a new licence code)
∗ Overnight loan only, or you can bring your laptop in and install it there and then
∗ Windows version only, no Apple version
SPSS Access
4. Overview
∗ Why do numbers matter in
research design?
∗ Numbers allow you to do
two basic things:
- Count how often
“something”
happens
- Count how big an
issue “something” is
5. Overview
∗ Once you can count the
extent (how often) and
nature (for quantitative
research a numerical
descriptor of an
attribute) you can
already do some pretty
important things. You
can answer questions
such as:
∗ How common is an issue?
∗ For instance, are black
children over-represented in
care? Are black adults over-
represented in psychiatric
hospital?
∗ How serious is a particular
issue? Or how is it distributed
within a sample?
∗ For instance, how serious are
the concerns about children in
families allocated a social
worker?
6. Overview
Once you can count stuff you can start to answer other
important and interesting questions, for instance:
7. ∗ Students may often come to you with various questions about
SPSS and difficulties that they are having
∗ If you are lucky enough to catch them early on a lot of
unnecessary frustration and stress about analysing data can be
avoided.
∗ The most important thing a student can do before they even
consider methodology, methods or analysis is to have a clear
research question/aim and hypotheses in place that
conceptualise and operationalise the variables they wish to
study.
SPSS and Quantitative Data
8. Some Basic Definitions
A variable is the “thing” that you’re
interested in studying e.g. depression,
gender differences, social deprivation,
specific crime rates, levels of
emotionality (how emotional someone
is) or different types of food!
9. • Things like depression, gender differences,
social deprivation, specific crime rates, levels of
emotionality and food type, etc. are called
“variables” because they vary.
● Some people are more depressed than others
● Some people are men, and others are women
● Some Social policies may be more successful than
others
● We may see different crimes committed in different
contexts, areas
● Some people are less emotional than others
● Food types can range from pizza to hamburgers to
filet mignon, or might be Thai, Ethiopian, Polish or
American cuisine, etc., etc.
10. TO “CONCEPTUALISE” A VARIABLE MEANS
TO MAKE CLEAR WHAT YOU MEAN BY THE
VARIABLE….
• For example, for the variable “food type,”
you need to be clear about whether you
mean
• (1) vegetarian or meat, OR
• (2) breakfast, lunch or dinner foods, OR
• (3) Ethiopian, Thai or American foods, OR
• (4) something else!
11. TO “OPERATIONALISE” A VARIABLE IS
TO DECIDE HOW YOU WILL MEASURE
IT
• For example, if the variable you’re
interested in is depression:
● Will you ask people to rate themselves, and if
so, on what sort of a scale?
● Alternatively, will you measure depression by
facial expression? By some behaviour that you
observe? In some other way?
12. TO “OPERATIONALISE” A VARIABLE IS
TO DECIDE HOW YOU WILL MEASURE
IT
• If the variable you’re studying is intelligence & you
don’t think Exam scores are a good measure of
intelligence, what measure WILL you use?
• Asking these sorts of questions is completing the
process of “operationalising” your variables.
• Conceptualisation & Operationalisation are
necessary for a Quantitative approach
13. ∗ Exploratory
- Exploratory research is undertaken when few or no previous studies exist. The aim is to look for patterns, hypotheses or ideas that can be tested and will form the basis for further research.
- Typical research techniques would include case studies, observation and reviews of previous related studies and data.
- Data from exploratory studies tends to be qualitative.
∗ Descriptive
- Expands on the Exploratory
- Descriptive research can be used to identify and classify the elements or characteristics of the subject, e.g. number of days youth offenders remained out of trouble.
- Quantitative techniques are most often used to collect, analyse and summarise data.
∗ Causal/Relationship
- Causal and Relationship research focuses on being able to predict/hypothesise cause and effect between observed behaviours, or relationships between aspects of behaviour/society/crime rates.
- The idea is that Causal and Relationship research is moving a step beyond descriptive research and the quantitative data collected can be used and analysed in a manner that allows the researcher to infer a significant effect/difference or relationship
TYPES OF QUANTITATIVE RESEARCH
15. • The Quantitative approach sets out at the start of a study
with a research question and a hypothesis/prediction
• Hypotheses are formal statements of predictions derived
from evidence from earlier research and/or theory.
• The null hypothesis (H0) is a statement of ‘no
difference/effect/change’ between the variables
• The experimental hypothesis (H1) is a statement of
difference/relationships between variables
QUANTITATIVE DESIGNS AND
HYPOTHESES
16. • Experimental Hypothesis: Students who
study for tests in study groups will score
significantly better on their exams than
students who did not study in study groups
• Null Hypothesis: There will be no significant
difference in exam results between students
who do and do not study in study groups
EXAMPLE OF HYPOTHESIS
17. ∗ This clarity in the question and hypothesis can make life
markedly easier for yourselves and the student in the long run.
∗ However, I appreciate that this is not always easy, and more often
than not it will not be the case for you.
∗ So what I will be covering with you today is a brief introduction
to the SPSS interface and how we would go about
the initial basics of data entry and beginning to explore
descriptive data.
∗ If we can, I’ll also take you through examples of some basic
significance testing (otherwise I’ll put them up so they are available)
18. ∗ A light but important session.
∗ Going over the basics of how to input data, label your variables
so it is clear and how to create codebooks.
∗ It’s all about building up your confidence with the interface,
and developing good practice.
∗ It’s about doing the basics so as to avoid confusion later on,
e.g. inputting the data correctly for different types of analysis.
Data Entry and Descriptives
19. ∗ Hopefully you should be familiar with the idea of
descriptive data.
∗ As the name suggests they are what we use to describe
the data we have.
∗ There’s no point in knowing that the IQ scores between
two groups are significantly different if we don’t have a
way of describing the scores, and the difference.
∗ Measures of central tendency: Mean, mode, median etc.
∗ Measures of dispersion: Standard deviation etc.
Descriptive Stats
20. Levels of Measurement
∗ In 1946 Stevens proposed a theory of scales of
measurement.
∗ Nominal data (lowest level of measurement; categories only)
∗ Ordinal data (ranked, but the distance between points on the
scale is unknown)
∗ Interval data (points on scale equal distance
apart, no true zero)
∗ Ratio data (equal distance between points on
scale, plus a true zero)
21. Nominal
∗ Provides the least exact information
∗ Participants are placed in categories
∗ Data that is categorical e.g. gender, colours, shoe type, play behaviour
∗ Variable must fit into one category
∗ Measure of frequency
∗ Numbers may be used but only as category labels
∗ Central tendency is described using the mode
∗ Data is represented using a frequency table or bar
chart
22. Examples: Nominal Data
∗ Type of Bicycle
∗ Mountain bike, road bike, chopper, folding, BMX.
∗ Ethnicity
∗ White British, Afro-Caribbean, Asian, Chinese,
other, etc. (note problems with these categories).
∗ Smoking status
∗ smoker, non-smoker
23. Ordinal
∗ Simplest true scale, orders measurements along a continuum
∗ Represent rank position in a group e.g. 1st, 2nd, 3rd … 10th
∗ No information on difference between positions
∗ Central tendency is described in terms of the median
∗ Dispersion can be measured using the range or inter-quartile
range (middle 50% of the distribution)
24. Ordinal Data
∗ A type of categorical data in which order is
important.
∗ Class of degree: 1st class, 2:1, 2:2, 3rd class, fail
∗ Degree of illness: none, mild, moderate,
acute, chronic.
∗ Opinion of students about stats classes:
Very unhappy, unhappy, neutral, happy, ecstatic!
25. Interval and ratio variables
∗ According to Fielding & Gilbert (2000) these are often used
interchangeably, and incorrectly by social scientists (pg15)
∗ Interval, ordered categories, no inherent concept of zero (Clark 2004),
we can calculate meaningful distance between categories, few real
examples of interval variables in social sciences (Fielding & Gilbert
2000:15)
∗ Ratio. A meaningful zero amount (e.g. income), possible to calculate
ratios so also has the interval property (e.g. someone earning £20,000
earns twice as much as someone who earns £10,000) (Fielding & Gilbert
2000:15)
∗ Difference between interval and ratio usually not important for statistical
analysis (Fielding & Gilbert 2000:15)
26. Interval variables- Examples
∗ Fahrenheit temperature scale- Zero is
arbitrary- 40 Degrees is not twice as hot as 20
degrees.
∗ IQ tests. No such thing as Zero IQ. 120 IQ not
twice as intelligent as 60.
∗ Question- Can we assume that attitudinal data
represents real, quantifiable measured
categories? (ie. That ‘very happy’ is twice as
happy as plain ‘happy’ or that ‘Very unhappy’
means no happiness at all). Statisticians not in
agreement on this.
27. Ratio variables-Examples
∗ Can be discrete or continuous data.
∗ The distance between any two adjacent units of
measurement (intervals) is the same and there is a
meaningful zero point (Papadopoulos, 2001)
∗ Income- someone earning £20,000 earns twice as much
as someone who earns £10,000.
∗ Height
∗ Unemployment rate- measured as the number of
jobseekers as a percentage of the labour force
(Papadopoulos, 2001).
29. ∗ If you are still a little worried about your
understanding of Quantitative Data please see the
Key Information Handout in the Folder.
∗ By David Bowers (Learning Development)
∗ A reasonable summary of information about
quantitative data.
∗ Data types, appropriate measures of central tendency
etc.
Key Information Handout
30. ∗ Everything we do today is about good practice.
∗ Following the steps today, and developing correct
inputting skills, will save you lots of problems and
heartache later.
∗ SPSS is fussy when it comes to the way data is
entered.
Importance of Good Practice
31. ∗ As SPSS is a Quantitative Data analysis software you
often have to reduce information down to a
numerical state
∗ A Codebook allows you to keep a record of these
reductions and decisions
∗ A record of your own. Separate from SPSS.
∗ Electronic or on paper.
∗ A list of variables, full names, and how you have
coded data.
Codebook
32. ∗ The codes you give data to allow SPSS to analyse it.
∗ You can’t enter text so some variables need to be
converted.
E.g.
∗ Gender: Female may become 1, Male may become 2.
∗ Relationship Status: Single may become 1, Married 2,
Divorced 3, Widowed 4…
Coding
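The coding step can be sketched in a few lines of Python. This is only an illustration of the idea of a codebook, not something SPSS itself requires: the dictionary `CODEBOOK`, the variable name `relstatus` and the helper functions are hypothetical, and the codes are the ones suggested on the slide.

```python
# Hypothetical codebook: variable name -> {text response: numeric code}.
# The codes follow the slide (Female = 1, Male = 2, Single = 1, ...).
CODEBOOK = {
    "gender": {"Female": 1, "Male": 2},
    "relstatus": {"Single": 1, "Married": 2, "Divorced": 3, "Widowed": 4},
}


def encode(variable, response):
    """Return the numeric code for a text response, or None if it is not in the codebook."""
    return CODEBOOK[variable].get(response)


def decode(variable, code):
    """Reverse lookup: return the text label that a numeric code stands for."""
    for label, value in CODEBOOK[variable].items():
        if value == code:
            return label
    return None


# Made-up questionnaire answers, converted to the numbers you would type into SPSS.
raw = [("Female", "Married"), ("Male", "Single"), ("Female", "Widowed")]
coded = [(encode("gender", g), encode("relstatus", r)) for g, r in raw]
print(coded)  # [(1, 2), (2, 1), (1, 4)]
```

Keeping the mapping in one place is the point of the codebook: anyone reading the data later can turn a 2 back into "Male" without guessing.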
33. ∗ SPSS is fussy when it comes to the names you give variables.
∗ Can’t give them a full description in the main view.
∗ So you can give detailed labels in the special variable view.
∗ Along with a codebook it helps keep the information clear.
Labelling
34. ∗ Available on email that was circulated to you all
∗ File: Data Entry Exercise 1 - Optimism Data
∗ We’ll be creating a codebook, setting up SPSS
according to the codebook, and then entering the
data.
1st Exercise
35. ∗ Good habits
∗ Create a new Folder on your Desktop
∗ Right-click on Desktop> New > Folder > “SPSS”
New Data Folder
36. ∗ Start>All Programs>IBM SPSS Statistics 19.
∗ Depending on version may have a slightly different
name.
∗ GIVE IT TIME SPSS IS RENOWNED FOR TAKING AN
AGE TO OPEN UP – CLICKING AGAIN ONLY SLOWS IT
DOWN MORE AS IT’LL THEN TRY TO OPEN ANOTHER
SPSS WINDOW
Open SPSS
40. ∗ Rules for naming of variables
∗ Variable names:
∗ must be unique (i.e. each variable in a data set must
have a different name);
∗ must begin with a letter (not a number);
∗ cannot include full stops, spaces or other characters (!, ?
* ");
∗ cannot include words used as commands by SPSS (all,
ne, eq, to, le, lt, by, or, gt, and, not, ge, with)
Coding Data
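The naming rules above can be checked before you ever open SPSS. The sketch below is a rough Python check of the rules exactly as listed on the slide; the function `name_problems` is hypothetical, and treating letters, digits and underscores as the only allowed characters is a simplifying assumption rather than the full SPSS rule set.

```python
import re

# Reserved words exactly as listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}


def name_problems(name, existing=()):
    """Return a list of rule violations for a proposed variable name (empty list = OK)."""
    problems = []
    # Rule 1: must be unique within the data set (compared case-insensitively here).
    if name.lower() in {n.lower() for n in existing}:
        problems.append("not unique")
    # Rule 2: must begin with a letter, not a number.
    if not re.match(r"[A-Za-z]", name):
        problems.append("must begin with a letter")
    # Rule 3: no spaces, full stops or other characters (assumption: letters,
    # digits and underscores only).
    if not re.fullmatch(r"[A-Za-z0-9_]*", name):
        problems.append("contains a disallowed character")
    # Rule 4: must not be a word SPSS uses as a command.
    if name.lower() in RESERVED:
        problems.append("reserved word")
    return problems


for candidate in ["op1", "1op", "total score", "with"]:
    print(candidate, name_problems(candidate))
```

Running a codebook's variable names through a check like this catches most naming errors before data entry starts.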
41. Optimism scale items op1 to 4 Enter number circled 1 (strongly disagree)
to 5 (strongly agree)
Coding Data
42. ∗ Now we have a codebook to keep things clear we can
set up SPSS so it is ready for the data.
∗ SPSS has 3 views: Data, Variable and Output.
∗ By switching to Variable we can define the variables
we need.
Creating a data file and inputting
data
48. Values
Enter the relevant
value and label as per
your codebook, then
click add. When all
have been entered,
click OK
Define the meaning
of the values used in
the codebook
(Gender) and click
add for each.
50. Values
When entering Likert
data always use the
limits of the scale (1-5)
even if you know that
participants may not
have entered some
responses. You also
need to decide whether
you are going to just
enter the range or every
labelled point.
52. ∗ Data comes in different types.
∗ Categorical (Nominal in SPSS)
∗ Ordinal
∗ Scale/Interval (Scale in SPSS)
∗ Different types/measures suit different tests,
different measures of central tendency, different
forms of visualisation.
∗ Makes knowing what type of data you have KEY for
successful data analysis.
Measures
53. Measures
Scale refers to
interval/ratio level of
measurement - There is
some debate about data
type in relation to Likert
data … for our purposes,
leave this as Scale
Nominal refers to
categorical
55. ∗ Now you have the variables set up ready for the data you can
start to enter the actual data
∗ Go to the Data View
Inputting Data According to the
Codebook
60. ∗ You’ve saved the data so now it is ‘safe’
∗ You can have a play around with it and try a few
different things.
∗ Delete a case
∗ Insert a case between existing cases
∗ Delete a variable
∗ Insert a variable between existing variables
∗ Try during the workshop/at home so you get more
confident with SPSS.
Playing around with the data
61. ∗ Available on LearnUCS.
∗ Different experimental designs require a different style of
inputting.
∗ The structure you use will be different between Repeated
(Within-Group) and Independent (Between-Group)
experimental designs.
∗ Use the wrong structure and the analysis will fall down. It
will be meaningless at best.
2nd Exercise: Inputting Repeated and Independent Measures
62. ∗ So, to recap
∗ Repeated Measures. The same participants experience all
treatments/are in all the groups/conditions.
∗ If you wanted to investigate the effect of music on taking
an IQ test participants would experience the no music
condition, and the music condition.
∗ Hopefully with some counterbalancing.
Repeated Measures
65. ∗ Again to recap.
∗ Participants are split. One group will experience one
treatment/be in one group/condition.
∗ Another group will experience the other.
∗ Each condition will have a unique, non-shared, set of
participants.
Independent
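The difference between the two structures is easier to see side by side. In SPSS a repeated measures design is usually entered with one row per participant and one column per condition, while an independent design has one row per participant, a grouping variable and a single score column. The Python sketch below mimics both layouts with made-up IQ scores; the variable names (`iq_no_music`, `iq_music`, `condition`, `iq`) are hypothetical.

```python
from statistics import mean

# Repeated measures: the SAME three participants sit both conditions,
# so each row holds one participant and each condition gets its own column.
repeated = [
    {"id": 1, "iq_no_music": 100, "iq_music": 104},
    {"id": 2, "iq_no_music": 110, "iq_music": 108},
    {"id": 3, "iq_no_music": 96, "iq_music": 100},
]

# Independent measures: six DIFFERENT participants, each in one condition only.
# A grouping variable (1 = no music, 2 = music) says which condition the score is from.
independent = [
    {"id": 1, "condition": 1, "iq": 100},
    {"id": 2, "condition": 1, "iq": 110},
    {"id": 3, "condition": 1, "iq": 96},
    {"id": 4, "condition": 2, "iq": 104},
    {"id": 5, "condition": 2, "iq": 108},
    {"id": 6, "condition": 2, "iq": 100},
]

# The condition means are read from columns in one layout and from groups of rows in the other.
repeated_means = {
    "no music": mean(row["iq_no_music"] for row in repeated),
    "music": mean(row["iq_music"] for row in repeated),
}
independent_means = {
    label: mean(row["iq"] for row in independent if row["condition"] == code)
    for code, label in {1: "no music", 2: "music"}.items()
}
print(repeated_means)
print(independent_means)
```

The numbers are identical here on purpose: what changes between designs is where each score sits in the file, which is why entering data in the wrong structure breaks the analysis.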
72. ∗ A quick trick to show you.
∗ Good for those who aren’t fond of a screen full of
numbers.
∗ If you have coded your variables correctly there is
a button you can press that will make the
numbers in your data view appear as the names
coded.
∗ For example the 1s and 2s for gender could
appear as Female and Male.
Labelling Trick
74. ∗ Data Entry Exercise 1 – Optimism Data Input
∗ Data Entry Exercise 2 – Repeated and Independent
∗ Extra Data Entry Exercises
∗ Exercise 3 – Giving electric shocks
∗ Exercise 4 – Shooting people
∗ We’ve gone through 1 and 2 here. Try them on your
own.
∗ 3 and 4 for extra practice.
∗ Make sure you are comfortable with data input,
coding and labelling.
Exercises
75. ∗ The theory and step-by-step guide will be covered in the slides following immediately
below.
∗ If you complete the first exercise move onto exercise 2.
Descriptive Exercise 1: survey.sav
The data is from a survey of staff about stress and emotions.
Generate the frequencies for 1) marital status and 2) level of education
Descriptive Exercise 2: staffsurvey.sav
The data is from a staff survey with Likert scales for agreement and importance of
factors.
Generate appropriate descriptive statistics to answer the following questions:
(a) What percentage of the staff in this organisation are permanent employees? (Use the
variable employstatus.)
(b) What is the average length of service for staff in the organisation? (Use the variable service.)
(c) What percentage of respondents would recommend the organisation to others as a good
place to work? (Use the variable recommend.)
Lab Exercises
77. ∗ When you are trying to find your descriptive stats you
need to make sure you use the right ones.
∗ Certain types of data/measure, suit certain types of
measures of central tendency and dispersion.
∗ Use the wrong ones and your description of the
results will be confusing, wrong and won’t match
your inferential statistics.
Types of Variables & Descriptives
78. ∗ Also known as Nominal variables in SPSS.
∗ Data that has been classified and categorised.
∗ So gender, a participant will belong to a particular
category of gender.
∗ Marital Status.
∗ Anything that you can create a discrete classification
of. You can even take a scale variable like age, and
force it into non-overlapping categories (under 18, 18 – 24, 25 – 34
etc.).
Categorical Variables
79. ∗ Measure of Central tendency to use for Categorical
data is the mode.
∗ Frequency of occurrence or amount.
∗ So using gender as an example you would use the
mode.
∗ 2 of the sample might be male, and 8 female.
∗ Mode = Female.
∗ 20% male, 80% female
Categorical
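The gender example can be reproduced outside SPSS to see what a frequency table is doing. This is a minimal Python sketch using the slide's sample of 2 males and 8 females; the list `gender` simply stands in for a column of data.

```python
from collections import Counter

# The slide's sample: 2 males and 8 females.
gender = ["Male"] * 2 + ["Female"] * 8

counts = Counter(gender)                      # frequency of each category
mode = counts.most_common(1)[0][0]            # the most frequent category
percentages = {category: 100 * n / len(gender) for category, n in counts.items()}

print(counts)       # how many cases fall in each category
print(mode)         # Female
print(percentages)  # {'Male': 20.0, 'Female': 80.0}
```

Counts, percentages and the mode are the only summaries that make sense here, because the categories have no numerical meaning of their own.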
80. ∗ In SPSS you should use the Frequency option when
you want the descriptive stats for a categorical
variable.
∗ Go to Descriptive Exercise 1 on LearnUCS.
Categorical and Frequency
81. ∗ Save survey.sav to your SPSS folder on the Desktop
from LearnUCS
∗ Have a look at survey.sav questionnaire from
LearnUCS
∗ Open survey.sav dataset
Descriptive Exercise 1 - Survey
89. ∗ This is where graphs and the results from tests
(descriptive and inferential) will appear.
∗ Also notes about when you have saved and opened
files too.
∗ If you want to keep what is in the output you must
save it specifically.
∗ Saving the data/variable will not save what is in the
output, and vice versa.
Output pages
90. ∗ Aside from Categorical measures we also have
∗ Ordinal
∗ Scale/Interval (sometimes known as ratio too)
∗ These are also generally known as continuous
variables.
∗ Usually the mean or median are the measures of
central tendency used, and the standard deviation, or
standard error, the measure of dispersion.
Other measures
91. ∗ Ranked or ordered data. Sometimes Likert scales.
∗ Has some similarity to categorical data (You might
consider grade brackets to be categories; A, B, C, D,
etc).
∗ But importantly they are ranked, so there is meaning
to the position. A is better than B, B better than C and
so on.
∗ The median is used here.
∗ Central point with an equal amount above/below.
Ordinal
92. ∗ The median is used here.
∗ Central point with an equal amount above/below
∗ So if you had a collection of grades…
∗ 20 people had an A
∗ 10 had a B
∗ 10 had a C
∗ 10 had a D
∗ Then B would be the median grade, as 20 people had
higher, and 20 people had lower.
Ordinal
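The grade example can be checked with a short Python sketch. Grades are turned into rank positions so they can be ordered, and `median_low` is used (an assumption about how to handle an even number of cases) so that the answer is always one of the real categories rather than an average of two grades.

```python
from statistics import median_low

# Grades from lowest to highest; the list position is the rank.
ORDER = ["D", "C", "B", "A"]

# The slide's collection of grades: 20 As, 10 Bs, 10 Cs and 10 Ds.
grades = ["A"] * 20 + ["B"] * 10 + ["C"] * 10 + ["D"] * 10

ranks = [ORDER.index(g) for g in grades]   # convert each grade to its rank
median_grade = ORDER[median_low(ranks)]    # middle rank, converted back to a grade

print(median_grade)  # B
```

With 50 cases the middle falls between the 25th and 26th grades, and both of those are Bs, which matches the slide: 20 people scored higher and 20 scored lower.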
93. ∗ Imagine we wished to find the median for the highest
educational level attained by a population
∗ In descriptive exercise 1 (survey) we would click on ‘Analyze’
Using Explore to See the Median
94. Using Explore to See the Median
Select
‘Descriptive
Statistics’ and
then ‘Explore’
from the Drop-
down menus
95. Using Explore to See the Median
1. When the below box opens move
‘highest educ completed’ from the left
pane to the ‘Dependent List’ section
2. Click on
‘Statistics’ and
choose ‘Outliers’ and
‘Continue’
3. Click on ‘Plots’
and choose
‘Histograms’ and
‘Normality Plots with
tests’ and ‘Continue’
4. Click on ‘OK’
96. Using Explore to See the Median
The resulting ‘Output’ in the
Output window will show
you a number of descriptive
stats.
We can see the median is 4
for the ‘highest educ
completed’ which means
‘some additional training’ is
the median for the highest
education completed for 439
participants who took part in
the survey.
97. ∗ Interval – a scale with artificial limits, no true zero, and
usually some form of cap.
∗ Intervals are of equal size.
∗ IQ scores for example.
∗ Ratio – has a true zero, constant intervals and potentially
little or no cap.
∗ So timing scores on a task for example.
∗ SPSS doesn’t really differentiate between the two.
∗ Basically if it is a form of score it is likely to be scale.
Scale
98. ∗ The mean is the normal measure of central tendency,
and the measure of dispersion the standard deviation.
∗ So 5 people take a maths test.
∗ They score 10, 20, 18, 12 and 5.
∗ The average would be 13 (total/number of cases)
Scale
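The maths test example works out as follows in Python. The mean is computed exactly as described on the slide (total divided by number of cases); the standard deviation uses `statistics.stdev`, which is the sample version (dividing by n − 1).

```python
from statistics import stdev

# The five maths test scores from the slide.
scores = [10, 20, 18, 12, 5]

average = sum(scores) / len(scores)   # total / number of cases
spread = stdev(scores)                # sample standard deviation (n - 1)

print(average)           # 13.0
print(round(spread, 2))  # 6.08
```

A standard deviation of about 6 tells you that a typical score sits roughly 6 marks away from the mean of 13.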
99. ∗ In SPSS we just need the descriptive option, rather
than the frequency option.
∗ So for example if we wished to find the mean and
standard deviation for ‘age’, ‘total optimism’, ‘total
mastery’, ‘total perceived stress’ and ‘total perceived
control of internal states’ (PCOISS), for participants
who answered the survey we are using for exercise 1.
Scale Descriptives
104. ∗ Sometimes information will be left out of a questionnaire,
or the value lost, but you will still need to conduct an
analysis.
∗ What happens if someone doesn’t fill in the age box on a
questionnaire?
∗ Rather than get rid of all their data you can use the
‘Exclude cases pairwise’ option.
∗ It excludes the case (person) only if they are missing the
data required for the specific analysis. They will still be
included in any of the analyses for which they have the
necessary information.
Missing Data
105. ∗ Exclude cases listwise
∗ A more extreme option.
∗ If the participant is missing any data then this option
should remove them entirely from the analysis.
∗ A matter of judgement as to which to use.
Missing Data
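The difference between the two options shows up clearly on a tiny data set. The Python sketch below is an illustration only: the four cases and their values are made up, `None` stands in for a missing answer, and the two functions are hypothetical stand-ins for what the SPSS options do.

```python
from statistics import mean

# Made-up cases: participant 2 left age blank, participant 3 has no optimism score.
cases = [
    {"id": 1, "age": 23, "optimism": 18},
    {"id": 2, "age": None, "optimism": 21},
    {"id": 3, "age": 31, "optimism": None},
    {"id": 4, "age": 27, "optimism": 15},
]
VARIABLES = ["age", "optimism"]


def pairwise_mean(variable):
    """Drop a case only if it is missing the variable needed for THIS analysis."""
    return mean(c[variable] for c in cases if c[variable] is not None)


def listwise_mean(variable):
    """Drop a case from every analysis if it is missing ANY of the variables."""
    complete = [c for c in cases if all(c[v] is not None for v in VARIABLES)]
    return mean(c[variable] for c in complete)


print(pairwise_mean("age"), pairwise_mean("optimism"))   # uses 3 cases each
print(listwise_mean("age"), listwise_mean("optimism"))   # uses only cases 1 and 4
```

Pairwise keeps more data (three cases per variable here) but each statistic is based on a slightly different set of people; listwise keeps the same people throughout at the cost of a smaller sample.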
108. Adapted from Green, J. & D’Oliveira, M. (1999). Learning to use statistical tests in psychology. Buckingham, UK: Open University Press.
START
∗ Summarising Univariate Data?
- Descriptive statistics (mean, standard deviation, variance, etc)
∗ Relationships?
- Parametric: Pearson's r
- Non-param: Spearman's r
∗ Categorical & Frequency Data?
- Between: 1 or 2 sample Chi-square
- Within: McNemar
∗ Differences? How many Independent variables?
- One: How many experimental conditions?
  - Two: Within or Between participants in each condition?
    - Between: Parametric: Unrelated t-test. Non-param: Mann Whitney
    - Within: Parametric: Related t-test. Non-param: Wilcoxon
  - 3 or more: Within or Between participants in each condition?
    - Within: Parametric: Oneway Within Ss (Repeated measures) ANOVA. Non-param: Friedman or Page’s L Trend Test
    - Between: Parametric: Oneway Between Group ANOVA. Non-param: Kruskal-Wallis or Jonckheere Trend Test
- Two or more: Within or Between participants in each condition?
  - Within: Factorial Within Subjects (Repeated Measures) ANOVA
  - Both True: Factorial Mixed Design (Split-Plot) ANOVA
  - Between: Factorial Between Groups ANOVA
Flowchart for choosing basic
statistics
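The flowchart is really a series of questions, so it can be written as a small lookup function. The Python sketch below is one reading of the chart, not an official decision rule: the function `choose_test` and its argument names are hypothetical, and the pairing of Chi-square with between-participants data and McNemar with within-participants data is an interpretation of the chart's layout.

```python
def choose_test(aim, design=None, ivs=1, conditions=2, parametric=True):
    """Suggest a test from the flowchart.

    aim: "describe", "relationship", "frequency" or "difference"
    design: "within", "between" or (factorial designs only) "mixed"
    ivs: number of independent variables; conditions: number of conditions
    """
    if aim == "describe":
        return "Descriptive statistics (mean, standard deviation, variance, etc.)"
    if aim == "relationship":
        return "Pearson's r" if parametric else "Spearman's r"
    if aim == "frequency":
        return {"between": "Chi-square", "within": "McNemar"}[design]
    if aim != "difference":
        raise ValueError("aim must be describe, relationship, frequency or difference")

    if ivs >= 2:  # two or more independent variables: factorial ANOVA family
        return {
            "within": "Factorial within-subjects (repeated measures) ANOVA",
            "mixed": "Factorial mixed design (split-plot) ANOVA",
            "between": "Factorial between-groups ANOVA",
        }[design]

    if conditions < 2:
        raise ValueError("a difference needs at least two conditions")
    if conditions == 2:
        table = {
            ("between", True): "Unrelated t-test",
            ("between", False): "Mann Whitney",
            ("within", True): "Related t-test",
            ("within", False): "Wilcoxon",
        }
    else:  # three or more conditions
        table = {
            ("within", True): "One-way within-subjects (repeated measures) ANOVA",
            ("within", False): "Friedman or Page's L trend test",
            ("between", True): "One-way between-groups ANOVA",
            ("between", False): "Kruskal-Wallis or Jonckheere trend test",
        }
    return table[(design, parametric)]


# The study groups example: two separate groups of students, exam scores compared.
print(choose_test("difference", design="between", conditions=2))  # Unrelated t-test
```

Writing the chart out this way makes the point of the slide explicit: you cannot pick a test until you know your aim, your design, the number of conditions and whether the data suit a parametric test.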
109. ∗ Coolican, H. (2014). Research Methods and Statistics in Psychology (6th ed.). Hove, UK: Psychology Press.
∗ A good introduction to the quantitative statistics incorporated in the social
sciences. A comprehensive coverage of the statistics covered in research
methods at this level in a clear and comprehensive format.
∗ Pallant, J. (2013). SPSS: Survival Manual (5th ed.). Maidenhead, UK: Open University Press.
∗ A textbook that is of help with the statistical programme SPSS whatever
your level, as it takes you through the analysis in a step-by-step clear and
concise manner that allows you to learn while you put into practice.
∗ Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). London, UK: Sage.
∗ An easy to engage with text that covers research methods and statistics in
a fashion that makes it easy to read and follow.
Recommended Reading
110. ∗ You can use the below link to access the UCS library page that
has some useful videos showing how to use SPSS
∗ http://libguides.ucs.ac.uk/c.php?g=264784&p=1954991
∗ There is also a course that you can do (set up by Jen Versey our
Psychology technician and David Mullett from the library
support team)
∗ https://www.coursesites.com/webapps/Bb-sites-course-creation-BBLEA
383196_1
∗ There is always the IBM SPSS guide that you can access through
the help option in SPSS as a starting point.
Web Resources
111. Descriptive Statistics
∗ Descriptive statistics – are statistics that describe data. They
essentially summarise the data.
∗ They can be either numerical or graphic
∗ Numerical statistics come in 2 forms
∗ Measurement of central tendency
∗ Measurement of dispersion
112. Measure of Central Tendency
∗ There are three measures of central tendency; which one we use
depends on our level of measurement. They are:
∗ Mean
∗ Arithmetic average/mean. Sum of all scores divided by the
number of scores
∗ Median
∗ The score that falls in the exact centre of the distribution
(middlemost score)
∗ Mode
∗ The most common/frequently occurring score
113. ‘the mean’
∗ Formula for the mean is
$$\bar{x} = \frac{\sum x}{N}$$
where
$\bar{x}$ = the mean
$\sum$ = the sum of
$x$ = the scores
$N$ = the number of scores in set
∗ Advantages
∗ Powerful statistic used in estimating population parameters for significant
differences and correlations. Most sensitive, and works at an interval level.
∗ Disadvantages
∗ Can be overly sensitive causing it to easily distort due to outlier values
114. ‘the median’
∗ The measure of central tendency for ordinal data
∗ Shorthand may be Guilford’s (1956) Mdn
∗ It is the central value of a set
∗ A formula used to find the position of the median is
$$k = \frac{N + 1}{2}$$
∗ For odd number data sets this will reveal the central number
∗ For even number data sets this will reveal the two points of data that the median
falls between
∗ When you have a number of values the same in the data set you can use the same
method although it is not strictly correct. However, luckily for us as social scientists
there are statistical packages that will take care of this for us
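The position formula can be tried out in a few lines of Python. In this sketch k = (N + 1) / 2 gives the position of the median in the ordered scores; when k falls between two positions, the code takes the midpoint of the two neighbouring values, which is a common convention and an assumption here rather than something stated on the slide. The function name is hypothetical.

```python
def median_by_position(values):
    """Find the median using the position formula k = (N + 1) / 2."""
    ordered = sorted(values)
    k = (len(ordered) + 1) / 2
    if k == int(k):
        # Odd number of scores: k points straight at the central value.
        return ordered[int(k) - 1]
    # Even number of scores: k falls between two positions (e.g. 3.5),
    # so take the midpoint of the values either side of it.
    lower = ordered[int(k) - 1]
    upper = ordered[int(k)]
    return (lower + upper) / 2


print(median_by_position([10, 20, 18, 12, 5]))      # k = 3, the 3rd ordered score: 12
print(median_by_position([10, 20, 18, 12, 5, 25]))  # k = 3.5, between 12 and 18: 15.0
```

Note that the scores have to be put in order first; k is a position in the ordered list, not a value.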
115. ‘the mode’
∗ The measure of central tendency for nominal scale data. We are
unable to calculate mean and median with this type of data, but
we can see what occurred most often/highest frequency
∗ There can be two modes, which we call bi-modal
∗ Advantages
∗ Most typical, unaffected by extremes, can be more informative than mean
with discrete scales
∗ Disadvantages
∗ Does not account for differences between values, can’t be used in
estimates of population parameters, not all that useful for small sets of
data, for bi-modal two modal values reported, difficult to estimate
accurately when data grouped into class intervals
117. ‘the range’
∗ Report of the top/highest value and the bottom/lowest value
∗ To calculate what the range is (the difference between) you
subtract the lower value from the higher value and add 1
∗ Advantage
∗ Includes extremes, easy to calculate
∗ Disadvantages
∗ Can be distorted by extremes, can be unrepresentative of the distribution.
Doesn’t tell us whether values are close to or spaced out from the mean
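A quick Python sketch of the range calculation described above. The "add 1" step follows the slide and suits whole-number scores; be aware that many statistical packages report the plain difference between the highest and lowest value instead, so the function below (a hypothetical helper) offers both.

```python
def value_range(values, inclusive=True):
    """Range = highest value minus lowest value, plus 1 if using the slide's convention."""
    spread = max(values) - min(values)
    return spread + 1 if inclusive else spread


scores = [10, 20, 18, 12, 5]
print(value_range(scores))                   # 20 - 5 + 1 = 16
print(value_range(scores, inclusive=False))  # 20 - 5 = 15
```

Whichever convention you use, say so when you report it, and remember that a single extreme score changes the range completely.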
118. ‘the interquartile and semi-interquartile range’
∗ The interquartile range allows us a better insight into
how values fall in relation to the central tendency
∗ Instead of the full range, the interquartile range
represents the distance between the central 50%,
removing the bottom and top 25%. The values are
known as the 1st and 3rd quartiles or the 25th and 75th percentiles
119. Interquartile range
3 3 4 5 6 8 10 13 14 16 19
Q1 = 4, M (median) = 8, Q3 = 14
∗ The interquartile range is: Q3 – Q1
∗ Semi-interquartile is half of that: (Q3 – Q1) / 2
∗ Advantages
∗ Representative of central group of values, useful for ordinal data
∗ Disadvantages
∗ No account of extremes, inaccurate where there are large class
intervals
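The interquartile range for the eleven values on this slide can be checked in Python. The sketch uses `statistics.quantiles` with `method="exclusive"`, which is an assumption: for this data set it picks the 3rd, 6th and 9th ordered values (4, 8 and 14), but other quartile conventions, including those used by some statistics packages, can give slightly different answers.

```python
from statistics import quantiles

# The eleven ordered values from the slide.
data = [3, 3, 4, 5, 6, 8, 10, 13, 14, 16, 19]

# Cut the data into quarters: returns the 1st quartile, the median and the 3rd quartile.
q1, med, q3 = quantiles(data, n=4, method="exclusive")

iqr = q3 - q1        # interquartile range: Q3 - Q1
semi_iqr = iqr / 2   # semi-interquartile range: half of that

print(q1, med, q3)   # 4.0 8.0 14.0
print(iqr, semi_iqr) # 10.0 5.0
```

The full range of these values is 16 (19 − 3), yet the middle half of the data spans only 10, which shows how the interquartile range ignores the extremes.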
120. Standard deviation and variance
∗ These estimate from a sample how the values of a
population are distributed
∗ Standard deviation provides us with an average score
telling us how different the scores are from the mean
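∗ Formula for standard deviation (std, SD, stdev)
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Or
$$s = \sqrt{\frac{\sum d^2}{n - 1}}$$
where $d = x - \bar{x}$ is each score’s deviation from the mean and $n$ is the number of scores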
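The standard deviation formula can be followed step by step in Python. This sketch uses the sample version (dividing by n − 1) and five made-up scores chosen so the arithmetic is easy to check by hand; the function name `sample_variance_and_sd` is hypothetical.

```python
from math import sqrt


def sample_variance_and_sd(scores):
    """Work through the formula: deviations, squared deviations, variance, then SD."""
    n = len(scores)
    x_bar = sum(scores) / n                        # the mean
    deviations = [x - x_bar for x in scores]       # d = x - mean, one per score
    sum_of_squares = sum(d ** 2 for d in deviations)
    variance = sum_of_squares / (n - 1)            # average squared deviation (n - 1)
    return variance, sqrt(variance)                # SD is the square root of the variance


# Made-up scores: mean = 5, deviations = -3, -1, -1, 1, 4, squares sum to 28.
variance, sd = sample_variance_and_sd([2, 4, 4, 6, 9])
print(variance)      # 28 / 4 = 7.0
print(round(sd, 2))  # 2.65
```

The variance is in squared units, which is why the standard deviation (its square root) is the figure usually reported alongside the mean.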
Editor's Notes
Statistical Package/Program for Social Sciences
Or
Something Painful Students Suffer
Imp – Can use mode on categories as well but you lose important information in regards to the ranking.