A Step-by-Step Guide
to Analysis and Interpretation
Brian C. Cronk
Choosing the Appropriate Statistical Test
[Figure: decision tree for choosing the appropriate statistical test, branching on the number of independent variables and the number of levels of each independent variable.]
NOTE: Relevant section numbers are given in parentheses. For instance, "(6.9)" refers you to Section 6.9 in Chapter 6.
Notice
SPSS is a registered trademark of SPSS, Inc. Screen images © by SPSS, Inc. and Microsoft Corporation. Used with permission.
This book is not approved or sponsored by SPSS.
"Pyrczak Publishing" is an imprint of Fred Pyrczak, Publisher, A California Corporation.
Although the author and publisher have made every effort to ensure the accuracy and completeness of information contained in this book, we assume no responsibility for errors, inaccuracies, omissions, or any inconsistency herein. Any slights of people, places, or organizations are unintentional.
Project Director: Monica Lopez.
Consulting Editors: George Burruss, Jose L. Galvan, Matthew Giblin, Deborah M. Oh, Jack Petit, and Richard Rasor.
Editorial assistance provided by Cheryl Alcorn, Randall R. Bruce, Karen M. Disner, Brenda Koplin, Erica Simmons, and Sharon Young.
Cover design by Robert Kibler and Larry Nichols.
Printed in the United States of America by Malloy, Inc.
Copyright © 2008, 2006, 2004, 2002, 1999 by Fred Pyrczak, Publisher. All rights reserved. No portion of this book may be reproduced or transmitted in any form or by any means without the prior written permission of the publisher.
ISBN 1-884585-79-5
Table of Contents
Introduction to the Fifth Edition
    What's New?
    Audience
    Organization
    SPSS Versions
    Availability of SPSS
    Conventions
    Screenshots
    Practice Exercises
    Acknowledgments
Chapter 1 Getting Started
    1.1 Starting SPSS
    1.2 Entering Data
    1.3 Defining Variables
    1.4 Loading and Saving Data Files
    1.5 Running Your First Analysis
    1.6 Examining and Printing Output Files
    1.7 Modifying Data Files
Chapter 2 Entering and Modifying Data
    2.1 Variables and Data Representation
    2.2 Transformation and Selection of Data
Chapter 3 Descriptive Statistics
    3.1 Frequency Distributions and Percentile Ranks for a Single Variable
    3.2 Frequency Distributions and Percentile Ranks for Multiple Variables
    3.3 Measures of Central Tendency and Measures of Dispersion for a Single Group
    3.4 Measures of Central Tendency and Measures of Dispersion for Multiple Groups
    3.5 Standard Scores
Chapter 4 Graphing Data
    4.1 Graphing Basics
    4.2 The New SPSS Chart Builder
    4.3 Bar Charts, Pie Charts, and Histograms
    4.4 Scatterplots
    4.5 Advanced Bar Charts
    4.6 Editing SPSS Graphs
Chapter 5 Prediction and Association
    5.1 Pearson Correlation Coefficient
    5.2 Spearman Correlation Coefficient
    5.3 Simple Linear Regression
    5.4 Multiple Linear Regression
Chapter 6 Parametric Inferential Statistics
    6.1 Review of Basic Hypothesis Testing
    6.2 Single-Sample t Test
    6.3 Independent-Samples t Test
    6.4 Paired-Samples t Test
    6.5 One-Way ANOVA
    6.6 Factorial ANOVA
    6.7 Repeated-Measures ANOVA
    6.8 Mixed-Design ANOVA
    6.9 Analysis of Covariance
    6.10 Multivariate Analysis of Variance (MANOVA)
Chapter 7 Nonparametric Inferential Statistics
    7.1 Chi-Square Goodness of Fit
    7.2 Chi-Square Test of Independence
    7.3 Mann-Whitney U Test
    7.4 Wilcoxon Test
    7.5 Kruskal-Wallis H Test
    7.6 Friedman Test
Chapter 8 Test Construction
    8.1 Item-Total Analysis
    8.2 Cronbach's Alpha
    8.3 Test-Retest Reliability
    8.4 Criterion-Related Validity
Appendix A Effect Size
Appendix B Practice Exercise Data Sets
    Practice Data Set 1
    Practice Data Set 2
    Practice Data Set 3
Appendix C Glossary
Appendix D Sample Data Files Used in Text
    COINS.sav
    GRADES.sav
    HEIGHT.sav
    QUESTIONS.sav
    RACE.sav
    SAMPLE.sav
    SAT.sav
    Other Files
Appendix E Information for Users of Earlier Versions of SPSS
Appendix F Graphing Data with SPSS 13.0 and 14.0
Chapter 1
Getting Started
Section 1.1 Starting SPSS
Startup procedures for SPSS will differ slightly, depending on the exact configuration of the machine on which it is installed. On most computers, you can start SPSS by clicking on Start, then clicking on Programs, then on SPSS. On many installations, there will be an SPSS icon on the desktop that you can double-click to start the program.
[Screenshot: SPSS startup dialog box]
When SPSS is started, you may be presented with the dialog box to the left, depending on the options your system administrator selected for your version of the program. If you have the dialog box, click Type in data and OK, which will present a blank data window.¹
If you were not presented with the dialog box to the left, SPSS should open automatically with a blank data window.
The data window and the output window provide the basic interface for SPSS. A blank data window is shown below.
¹ Items that appear in the glossary are presented in bold. Italics are used to indicate menu items.
Section 1.2 Entering Data
One of the keys to success with SPSS is knowing how it stores and uses your data. To illustrate the basics of data entry with SPSS, we will use Example 1.2.1.
Example 1.2.1
A survey was given to several students from four different classes (Tues/Thurs mornings, Tues/Thurs afternoons, Mon/Wed/Fri mornings, and Mon/Wed/Fri afternoons). The students were asked whether or not they were "morning people" and whether or not they worked. This survey also asked for their final grade in the class (100% being the highest grade possible). The response sheets from two students are presented below:
Response Sheet 1
ID: 4593
Day of class: _ MWF  X TTh
Class time: _ Morning  X Afternoon
Are you a morning person? _ Yes  X No
Final grade in class: 85%
Do you work outside school? _ Full-time  _ Part-time  X No

Response Sheet 2
ID: 1901
Day of class: X MWF  _ TTh
Class time: X Morning  _ Afternoon
Are you a morning person? X Yes  _ No
Final grade in class: 83%
Do you work outside school? _ Full-time  X Part-time  _ No
Our goal is to enter the data from the two students into SPSS for use in future analyses. The first step is to determine the variables that need to be entered. Any information that can vary among participants is a variable that needs to be considered. Example 1.2.2 lists the variables we will use.
Example 1.2.2
ID
Day of class
Class time
Morning person
Final grade
Whether or not the student works outside school
In the SPSS data window, columns represent variables and rows represent participants. Therefore, we will be creating a data file with six columns (variables) and two rows (students/participants).
Section 1.3 Defining Variables
Before we can enter any data, we must first enter some basic information about each variable into SPSS. For instance, variables must first be given names that:
• begin with a letter;
• do not contain a space.
Thus, the variable name "Q7" is acceptable, while the variable name "7Q" is not. Similarly, the variable name "PRE_TEST" is acceptable, but the variable name "PRE TEST" is not. Capitalization does not matter, but variable names are capitalized in this text to make it clear when we are referring to a variable name, even if the variable name is not necessarily capitalized in screenshots.
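The two naming rules above can be illustrated with a minimal Python sketch (this is not SPSS syntax). The function name `is_valid_name` is hypothetical, and the check covers only the two rules stated in this section; SPSS itself may impose further restrictions that are not modeled here.

```python
def is_valid_name(name: str) -> bool:
    """Check only the two rules stated in the text:
    the name begins with a letter and contains no space."""
    return bool(name) and name[0].isalpha() and " " not in name


# The four names discussed in the text.
for candidate in ["Q7", "7Q", "PRE_TEST", "PRE TEST"]:
    verdict = "acceptable" if is_valid_name(candidate) else "not acceptable"
    print(f"{candidate!r}: {verdict}")
```

Under these assumptions, "Q7" and "PRE_TEST" pass, while "7Q" and "PRE TEST" fail, which matches the examples in the text.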
To define a variable, click on the Variable View tab at the bottom of the main screen. This will show you the Variable View window. To return to the Data View window, click on the Data View tab.
[Screenshot: Variable View window]
From the Variable View screen, SPSS allows you to create and edit all of the variables in your data file. Each column represents some property of a variable, and each row represents a variable. All variables must be given a name. To do that, click on the first empty cell in the Name column and type a valid SPSS variable name. The program will then fill in default values for most of the other properties.
One useful function of SPSS is the ability to define variable and value labels. Variable labels allow you to associate a description with each variable. These descriptions can describe the variables themselves or the values of the variables.
Value labels allow you to associate a description with each value of a variable. For example, for most procedures, SPSS requires numerical values. Thus, for data such as the day of the class (i.e., Mon/Wed/Fri and Tues/Thurs), we need to first code the values as numbers. We can assign the number 1 to Mon/Wed/Fri and the number 2 to Tues/Thurs. To help us keep track of the numbers we have assigned to the values, we use value labels.
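The idea of a value label, a numeric code paired with a readable description, can be sketched outside SPSS as a simple lookup table. The Python below is only an illustration; the names `DAY_LABELS` and `day_codes` are hypothetical, and the codes follow the assignment described above (1 for Mon/Wed/Fri, 2 for Tues/Thurs).

```python
# Value labels for the day-of-class variable: numeric code -> description.
DAY_LABELS = {1: "Mon/Wed/Fri", 2: "Tues/Thurs"}

# Codes for the two students in Example 1.2.1 (student 4593, then student 1901).
day_codes = [2, 1]

# The stored data are the numbers; the labels are looked up only for display.
day_display = [DAY_LABELS[code] for code in day_codes]
print(day_display)
```

The data themselves stay numeric, which is what most procedures need, while the labels keep the coding scheme readable.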
To assign value labels, click in the cell you want to assign values to in the Values column. This will bring up a small gray button (see arrow, below at left). Click on that button to bring up the Value Labels dialog box.
[Screenshot: Value Labels dialog box]
When you enter a value label, you must click Add after each entry. This will move the value and its associated label into the bottom section of the window. When all labels have been added, click OK to return to the Variable View window.
In addition to naming and labeling the variable, you have the option of defining the variable type. To do so, simply click on the Type, Width, or Decimals columns in the Variable View window. The default value is a numeric field that is eight digits wide with two decimal places displayed. If your data are more than eight digits to the left of the decimal place, they will be displayed in scientific notation (e.g., the number 2,000,000,000 will be displayed as 2.00E+09).² SPSS maintains accuracy beyond two decimal places, but all output will be rounded to two decimal places unless otherwise indicated in the Decimals column.
In our example, we will be using numeric variables with all of the default values.
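The scientific-notation display mentioned above can be reproduced with ordinary number formatting. The short Python sketch below is an illustration only and makes no claim about how SPSS formats values internally; the variable names are hypothetical.

```python
value = 2_000_000_000

# Two decimal places in scientific notation: 2.00 x 10^9.
formatted = f"{value:.2E}"
print(formatted)  # 2.00E+09

# The displayed form is only a presentation; the stored value is unchanged.
print(value == float(formatted))
```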
Practice Exercise
Create a data file for the six variables and two sample students presented in Example 1.2.1. Name your variables: ID, DAY, TIME, MORNING, GRADE, and WORK. You should code DAY as 1 = Mon/Wed/Fri, 2 = Tues/Thurs. Code TIME as 1 = morning, 2 = afternoon. Code MORNING as 0 = No, 1 = Yes. Code WORK as 0 = No, 1 = Part-Time, 2 = Full-Time. Be sure you enter value labels for the different variables. Note that because value labels are not appropriate for ID and GRADE, these are not coded. When done, your Variable View window should look like the screenshot below:
[Screenshot: Variable View window listing the six variables]
Click on the Data View tab to open the data-entry screen. Enter data horizontally, beginning with the first student's ID number. Enter the code for each variable in the appropriate column; to enter the GRADE variable value, enter the student's class grade.
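As a cross-check on the coding scheme from the practice exercise, the two response sheets from Example 1.2.1 can be written out as coded rows. This Python sketch is only an illustration of the layout (one row per participant, one column per variable); the names `COLUMNS`, `rows`, and `records` are hypothetical.

```python
# Column order matches the variable names from the practice exercise.
COLUMNS = ["ID", "DAY", "TIME", "MORNING", "GRADE", "WORK"]

# Codes: DAY 1 = Mon/Wed/Fri, 2 = Tues/Thurs; TIME 1 = morning, 2 = afternoon;
# MORNING 0 = No, 1 = Yes; WORK 0 = No, 1 = Part-Time, 2 = Full-Time.
rows = [
    [4593, 2, 2, 0, 85, 0],  # Response Sheet 1
    [1901, 1, 1, 1, 83, 1],  # Response Sheet 2
]

# Pair each value with its variable name, one record per participant.
records = [dict(zip(COLUMNS, row)) for row in rows]
for record in records:
    print(record)
```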
[Screenshot: Data View window with the two students' coded data]
² Depending upon your version of SPSS, it may be displayed as 2.0E+009.
The previous data window can be changed to look instead like the screenshot below by clicking on the Value Labels icon (see arrow). In this case, the cells display value labels rather than the corresponding codes. If data is entered in this mode, it is not necessary to enter codes, as clicking the button which appears in each cell as the cell is selected will present a drop-down list of the predefined labels. You may use either method, according to your preference.
[Screenshot: Data View window displaying value labels]
Instead of clicking the Value Labels icon, you may optionally toggle between views by clicking Value Labels under the View menu.
Section 1.4 Loading and Saving Data Files
Once you have entered your data, you will need to save it with a unique name for later use so that you can retrieve it when necessary.
Loading and saving SPSS data files works in the same way as most Windows-based software. Under the File menu, there are Open, Save, and Save As commands. SPSS data files have a ".sav" extension, which is added by default to the end of the filename. This tells Windows that the file is an SPSS data file.
Save Your Data
When you save your data file (by clicking File, then clicking Save or Save As to specify a unique name), pay special attention to where you save it. Most systems default to the location <C:\Program Files\SPSS>. You will probably want to save your data on a floppy disk, CD-R, or removable USB drive so that you can take the file with you.
Load Your Data
When you load your data (by clicking File, then clicking Open, then Data, or by clicking the open file folder icon), you get a similar window. This window lists all files with the ".sav" extension. If you have trouble locating your saved file, make sure you are looking in the right directory.
Practice Exercise
To be sure that you have mastered saving and opening data files, name your sample data file "SAMPLE" and save it to a removable storage medium. Once it is saved, SPSS will display the name of the file at the top of the data window. It is wise to save your work frequently, in case of computer crashes. Note that filenames may be upper- or lowercase. In this text, uppercase is used for clarity.
After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your data by selecting the "SAMPLE.sav" file you just created.
Section 1.5 Running Your First Analysis
Any time you open a data window, you can run any of the analyses available. To get started, we will calculate the students' average grade. (With only two students, you can easily check your answer by hand, but imagine a data file with 10,000 student records.)
The majority of the available statistical tests are under the Analyze menu. This menu displays all the options available for your version of the SPSS program (the menus in this book were created with SPSS Student Version 15.0). Other versions may have slightly different sets of options.
[Screenshot: the Analyze menu and its Descriptive Statistics submenu]
To calculate a mean (average), we are asking the computer to summarize our data set. Therefore, we run the command by clicking Analyze, then Descriptive Statistics, then Descriptives.
This brings up the Descriptives dialog box. Note that the left side of the box contains a list of all the variables in our data file. On the right is an area labeled Variable(s), where we can specify the variables we would like to use in this particular analysis.
[Screenshot: Descriptives dialog box]
We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking on it). To transfer it to the right window, click on the right arrow between the two windows. The arrow always points to the window opposite the highlighted item and can be used to transfer selected variables in either direction. Note that double-clicking on the variable name will also transfer the variable to the opposite window. Standard Windows conventions of "Shift" clicking or "Ctrl" clicking to select multiple variables can be used as well.
When we click on the OK button, the analysis will be conducted, and we will be ready to examine our output.
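With only two students, the result can be checked by hand, as the text suggests. The Python sketch below does the same check using the standard library; it is an illustration rather than a description of how SPSS computes its output, and it assumes that the standard deviation reported is the sample standard deviation (dividing by N − 1).

```python
import statistics

# Final grades of the two students from Example 1.2.1.
grades = [85, 83]

n = len(grades)
mean_grade = statistics.mean(grades)   # (85 + 83) / 2
sd_grade = statistics.stdev(grades)    # sample standard deviation (N - 1)

print(f"N = {n}")
print(f"Minimum = {min(grades):.2f}, Maximum = {max(grades):.2f}")
print(f"Mean = {mean_grade:.4f}")
print(f"Std. Deviation = {sd_grade:.5f}")
```

For these two grades the mean is 84 and the sample standard deviation is the square root of 2, about 1.41421.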
Section 1.6 Examining and Printing Output Files
After an analysis is performed, the output is placed in the output window, and the output window becomes the active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created. If you have run previous analyses and saved them, your output is added to the end of your previous output.
To switch back and forth between the data window and the output window, select the desired window from the Window menu bar (see arrow, below).
The output window is split into two sections. The left section is an outline of the output (SPSS refers to this as the "outline view"). The right section is the output itself.
[Screenshot: output window showing the Descriptives output]

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
GRADE                2    83.00     85.00     84.0000   1.41421
Valid N (listwise)   2

The section on the left of the output window provides an outline of the entire output window. All of the analyses are listed in the order in which they were conducted. Note that this outline can be used to quickly locate a section of the output. Simply click on the section you would like to see, and the right window will jump to the appropriate place.
Clicking on a statistical procedure also selects all of the output for that command. By pressing the Delete key, that output can be deleted from the output window. This is a quick way to be sure that the output window contains only the desired output. Output can also be selected and pasted into a word processor by clicking Edit, then Copy Objects to copy the output. You can then switch to your word processor and click Edit, then Paste.
To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you run a command, the output is added to the end of your previous output. Thus, you could be printing a very large output file containing information you may not want or need.
One way to ensure that your output window contains only the results of the current command is to create a new output window just before running the command. To do this, click File, then New, then Output. All your subsequent commands will go into your new output window.
Practice Exercise
Load the sample data file you created earlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE and print the output. Your output should look like the example on page 7. Next, select the data window and print it.
Section 1.7 Modifying Data Files
Once you have created a data file, it is really quite simple to add additional cases (rows/participants) or additional variables (columns). Consider Example 1.7.1.
Example 1.7.1
Two more students provide you with surveys. Their information is:
Response Sheet 3
ID: 8734
Day of class: _ MWF  X TTh
Class time: X Morning  _ Afternoon
Are you a morning person? _ Yes  X No
Final grade in class: 80%
Do you work outside school? _ Full-time  _ Part-time  X No

Response Sheet 4
ID: 1909
Day of class: X MWF  _ TTh
Class time: X Morning  _ Afternoon
Are you a morning person? X Yes  _ No
Final grade in class: 73%
Do you work outside school? _ Full-time  X Part-time  _ No
To add these data, simply place two additional rows in the Data View window (after loading your sample data). Notice that as new participants are added, the row numbers become bold. When done, the screen should look like the screenshot here.
New variables can also be added. For example, if the first two participants were given special training on time management, and the two new participants were not, the data file can be changed to reflect this additional information. The new variable could be called TRAINING (whether or not the participant received training), and it would be coded so that 0 = No and 1 = Yes. Thus, the first two participants would be assigned a "1" and the last two participants a "0." To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to update the data.
[Screenshot: Data View window after adding the TRAINING variable]

     ID        DAY          TIME        MORNING   GRADE   WORK        TRAINING
1    4593.00   Tue/Thu      afternoon   No        85.00   No          Yes
2    1901.00   Mon/Wed/     morning     Yes       83.00   Part-Time   Yes
3    8734.00   Tue/Thu      morning     No        80.00   No          No
4    1909.00   Mon/Wed/     morning     Yes       73.00   Part-Time   No
Adding data and adding variables are just logical extensions of the procedures we used to originally create the data file. Save this new data file. We will be using it again later in the book.
[Screenshot: Data View window after adding the two new participants, before TRAINING is added]

     ID        DAY          TIME        MORNING   GRADE   WORK
1    4593.00   Tue/Thu      afternoon   No        85.00   No
2    1901.00   Mon/Wed/     morning     Yes       83.00   Part-Time
3    8734.00   Tue/Thu      morning     No        80.00   No
4    1909.00   Mon/Wed/     morning     Yes       73.00   Part-Time
Practice Exercise
Follow the example above (where TRAINING is the new variable). Make the modifications to your SAMPLE.sav data file and save it.
Chapter 2
Entering and Modifying Data
In Chapter 1, we learned how to create a simple data file, save it, perform a basic analysis, and examine the output. In this section, we will go into more detail about variables and data.
Section 2.1 Variables and Data Representation
In SPSS, variables are represented as columns in the data file. Participants are represented as rows. Thus, if we collect 4 pieces of information from 100 participants, we will have a data file with 4 columns and 100 rows.
Measurement Scales
There are four types of measurement scales: nominal, ordinal, interval, and ratio. While the measurement scale will determine which statistical technique is appropriate for a given set of data, SPSS generally does not discriminate. Thus, we start this section with this warning: If you ask it to, SPSS may conduct an analysis that is not appropriate for your data. For a more complete description of these four measurement scales, consult your statistics text or the glossary in Appendix C.
Newer versions of SPSS allow you to indicate which types of data you have when you define your variable. You do this using the Measure column. You can indicate Nominal, Ordinal, or Scale (SPSS does not distinguish between interval and ratio scales).
[Screenshot: Measure column drop-down list with Scale, Ordinal, and Nominal options]
Look at the sample data file we created in Chapter 1. We calculated a mean for the variable GRADE. GRADE was measured on a ratio scale, and the mean is an acceptable summary statistic (assuming that the distribution is normal).
We could have had SPSS calculate a mean for the variable TIME instead of GRADE. If we did, we would get the output presented here.
[Screenshot: Descriptive Statistics output for the variable TIME]
The output indicates that the average TIME was 1.25. Remember that TIME was coded as an ordinal variable (1 = morning class, 2 = afternoon class). Thus, the mean is not an appropriate statistic for an ordinal scale, but SPSS calculated it anyway. The importance of considering the type of data cannot be overemphasized. Just because SPSS will compute a statistic for you does not mean that you should use it. Later in the text, when specific statistical procedures are discussed, the conditions under which they are appropriate will be addressed.
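The warning above can be made concrete with the four TIME codes from the sample data file (one afternoon student and three morning students). The Python sketch below is an illustration only; the names `TIME_LABELS` and `time_codes` are hypothetical. It shows that the arithmetic is easy to carry out even though the result describes no actual class time, and that a simple count of each category is the more meaningful summary.

```python
import statistics
from collections import Counter

TIME_LABELS = {1: "morning", 2: "afternoon"}

# TIME codes for participants 4593, 1901, 8734, and 1909.
time_codes = [2, 1, 1, 1]

# The mean can be computed, but 1.25 is neither "morning" nor "afternoon".
mean_code = statistics.mean(time_codes)
print(f"Mean of TIME codes: {mean_code}")

# Counting how many participants fall in each category is more informative.
counts = Counter(TIME_LABELS[code] for code in time_codes)
print(dict(counts))
```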
Missing Data
Often, participants do not provide complete data. For some students, you may have a pretest score but not a posttest score. Perhaps one student left one question blank on a survey, or perhaps she did not state her age. Missing data can weaken any analysis. Often, a single missing question can eliminate a subject from all analyses.

     q1      q2      total
1    2.00    2.00    4.00
2    3.00    1.00    4.00
3    4.00    3.00    7.00
4    2.00    .       .
5    1.00    2.00    3.00

If you have missing data in your data set, leave that cell blank. In the example to the left, the fourth subject did not complete Question 2. Note that the total score (which is calculated from both questions) is also blank because of the missing data for Question 2. SPSS represents missing data in the data window with a period (although you should not enter a period; just leave it blank).
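The way a missing answer carries through to a total score can be sketched in Python, using `None` to stand in for a blank cell. This is only an illustration of the behavior described above, not SPSS code; the function name `total_score` is hypothetical, and the scores are those of the five subjects in the example.

```python
from typing import Optional


def total_score(q1: Optional[float], q2: Optional[float]) -> Optional[float]:
    """Return q1 + q2, or None (missing) if either answer is missing."""
    if q1 is None or q2 is None:
        return None
    return q1 + q2


# (q1, q2) for each subject; the fourth subject skipped Question 2.
responses = [(2.0, 2.0), (3.0, 1.0), (4.0, 3.0), (2.0, None), (1.0, 2.0)]

totals = [total_score(q1, q2) for q1, q2 in responses]
print(totals)
```

The fourth total comes out as missing, so that subject would drop out of any analysis that uses the total.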
Section 2.2 Transformation and Selection of Data
We often have more data in a data file than we want to include in a specific analysis. For example, our sample data file contains data from four participants, two of whom received special training and two of whom did not. If we wanted to conduct an analysis using only the two participants who did not receive the training, we would need to specify the appropriate subset.
Selecting a Subset
We can use the Select Cases command to specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command, the dialog box below will appear.
You can specify which cases (participants) you want to select by using the selection criteria, which appear on the right side of the Select Cases dialog box.
By default, All cases will be selected. The most common way to select a subset is to click If condition is satisfied, then click on the button labeled If. This will bring up a new dialog box that allows you to indicate which cases you would like to use.
You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be selected. If the logical statement is false, that case will not be selected. For example, you can select all cases that were coded as Mon/Wed/Fri by entering the formula DAY = 1 in the upper-right part of the window. If DAY is 1, then the statement will be true, and SPSS will select the case. If DAY is anything other than 1, the statement will be false, and the case will not be selected. Once you have entered the logical statement, click Continue to return to the Select Cases dialog box. Then, click OK to return to the data window.
After you have selected the cases, the data window will change slightly. The cases that were not selected will be marked with a diagonal line through the case number. For example, for our sample data, the first and third cases are not selected. Only the second and fourth cases are selected for this subset.
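The selection logic described above, a true-or-false test applied to each case, can be sketched in Python. This is an illustration of the idea rather than of SPSS itself; the names `cases`, `is_selected`, and `selected_ids` are hypothetical, and only the ID and DAY values of the four sample participants are used.

```python
# ID and DAY code for the four sample participants
# (DAY: 1 = Mon/Wed/Fri, 2 = Tues/Thurs).
cases = [
    {"ID": 4593, "DAY": 2},
    {"ID": 1901, "DAY": 1},
    {"ID": 8734, "DAY": 2},
    {"ID": 1909, "DAY": 1},
]

# Evaluate the logical statement DAY = 1 for every case.
is_selected = [case["DAY"] == 1 for case in cases]

# Keep only the cases for which the statement is true.
selected_ids = [case["ID"] for case, keep in zip(cases, is_selected) if keep]

print(is_selected)
print(selected_ids)
```

As in the text, the statement is false for the first and third cases and true for the second and fourth.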
[Screenshot: Data View window with a subset selected; unselected cases are marked with a diagonal line through the case number]

     ID        DAY          TIME        MORNING   GRADE   WORK        TRAINING   FILTER_$
1    4593.00   Tue/Thu      afternoon   No        85.00   No          Yes        Not Selected
2    1901.00   Mon/Wed/     morning     Yes       83.00   Part-Time   Yes        Selected
3    8734.00   Tue/Thu      morning     No        80.00   No          No         Not Selected
4    1909.00   Mon/Wed/     morning     Yes       73.00   Part-Time   No         Selected
An additional variable will also be created in your data file. The new variable is called FILTER_$ and indicates whether a case was selected or not.
If we calculate a mean GRADE using the subset we just selected, we will receive the output at right. Notice that we now have a mean of 78.00 with a sample size (N) of 2 instead of 4.
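The figures for the subset can be verified by hand or with a few lines of Python. The sketch below is an illustration under the assumption that the standard deviation is the sample standard deviation (dividing by N − 1); the names `day_and_grade` and `subset_grades` are hypothetical.

```python
import statistics

# (DAY code, GRADE) for the four sample participants.
day_and_grade = [(2, 85), (1, 83), (2, 80), (1, 73)]

# Grades for the Mon/Wed/Fri subset only (DAY = 1).
subset_grades = [grade for day, grade in day_and_grade if day == 1]

subset_mean = statistics.mean(subset_grades)
subset_sd = statistics.stdev(subset_grades)  # sample standard deviation

print(f"N = {len(subset_grades)}")
print(f"Mean = {subset_mean:.4f}")
print(f"Std. Deviation = {subset_sd:.4f}")
```

With grades of 83 and 73, the mean is 78 and the sample standard deviation is the square root of 50, about 7.0711.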
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
GRADE                2    73.00     83.00     78.0000   7.0711
Valid N (listwise)   2
Be careful when you select subsets. The subset remains in effect until you run the command again and select all cases. You can tell if you have a subset selected because the bottom of the data window will indicate that a filter is on. In addition, when you examine your output, N will be less than the total number of records in your data set if a subset is selected. The diagonal lines through some cases will also be evident when a subset is selected. Be careful not to save your data file with a subset selected, as this can cause considerable confusion later.
Computing a New Variable
SPSS can also be used to compute a new variable or manipulate your existing variables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Q1, Q2, and Q3). The variables represent the number of points each participant received on three different questions. Now enter the data shown on the screen to the right. When done, save this data file as "QUESTIONS.sav." We will be using it again in later chapters.
Now you will calculate the total score for each subject. We could do this manually, but if the data file were large, or if there were a lot of questions, this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform and then click Compute Variable.
After clicking the Compute Variable command, we get the dialog box at right.
[Screenshot: Compute Variable dialog box]
The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word "total."
Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the two sides of an equation that SPSS will calculate. For example, total = q1 + q2 + q3 is the equation that is entered in the sample presented here (screenshot at left). Note that it is possible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, SPSS will create a new variable called TOTAL and make it equal to the sum of the three questions.
Save your data file again so that the new variable will be available for future sessions.
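The equation total = q1 + q2 + q3 amounts to adding one new column whose value is computed row by row. The Python sketch below illustrates this outside SPSS. Only the first participant's scores (3, 3, and 4, for a total of 10) are taken from the book's screenshot; the remaining three rows are hypothetical values invented for the illustration, and the name `participants` is likewise an assumption.

```python
participants = [
    {"q1": 3, "q2": 3, "q3": 4},  # first participant, as shown in the book
    {"q1": 4, "q2": 2, "q3": 3},  # hypothetical values
    {"q1": 2, "q2": 2, "q3": 3},  # hypothetical values
    {"q1": 1, "q2": 3, "q3": 1},  # hypothetical values
]

# Target variable on the left, numeric expression on the right.
for row in participants:
    row["total"] = row["q1"] + row["q2"] + row["q3"]

for row in participants:
    print(row)
```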
[Screenshot: data window showing Q1, Q2, Q3, and the new TOTAL variable; the first participant has scores of 3.00, 3.00, and 4.00 and a TOTAL of 10.00]
Recoding a Variable: Different Variable
SPSS can create a new variable based upon data from another variable. Say we want to split our participants on the basis of their total score. We want to create a variable called GROUP, which is coded 1 if the total score is low (less than or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recode into Different Variables.
This will bring up the Recode into Different Variables dialog box shown here. Transfer the variable TOTAL to the middle blank. Type "group" in the Name field under Output Variable. Click Change, and the middle blank will show that TOTAL is becoming GROUP, as shown below.
To help keep track of variables that have been recoded, it's a good idea to open the Variable View and enter "Recoded" in the Label column in the TOTAL row. This is especially useful with large datasets which may include many recoded variables.
Click Old and New Values. This will bring up the Recode dialog box. In this example, we have entered a 9 in the Range, value through HIGHEST field and a 2 in the Value field under New Value. When we click Add, the blank on the right displays the recoding formula. Now enter an 8 on the left in the Range, LOWEST through value blank and a 1 in the Value field under New Value. Click Add, then Continue. Click OK. You will be redirected to the data window. A new variable (GROUP) will have been added and coded as 1 or 2, based on TOTAL.
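The two recoding rules entered in the dialog box (LOWEST through 8 becomes 1, and 9 through HIGHEST becomes 2) can be written as a small function. The Python sketch below is an illustration only; the function name `recode_total` and the sample totals are hypothetical. A value falling between 8 and 9 matches neither rule, so the sketch returns `None` for it rather than guessing a group.

```python
from typing import Optional


def recode_total(total: float) -> Optional[int]:
    """Recode a total score into a group code using the two rules from the text."""
    if total <= 8:   # Range, LOWEST through 8 -> new value 1
        return 1
    if total >= 9:   # Range, 9 through HIGHEST -> new value 2
        return 2
    return None      # between 8 and 9: covered by neither rule


# Hypothetical total scores for four participants.
totals = [10, 9, 7, 5]
groups = [recode_total(t) for t in totals]
print(groups)
```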
Chapter 3
Descriptive Statistics
In Chapter 2, we discussed many of the options available in SPSS for dealing with
data. Now we will discuss ways to summarize our data. The procedures used to describe
and summarize data are called descriptive statistics.
Section 3.1 Frequency Distributions and Percentile Ranks
for a Single Variable
Description
The Frequencies command produces frequency distributions for the specified
variables. The output includes the number of occurrences, percentages, valid percentages,
and cumulative percentages. The valid percentages and the cumulative percentages comprise
only the data that are not designated as missing.
The Frequencies command is useful for describing samples where the mean is not
useful (e.g., nominal or ordinal scales). It is also useful as a method of getting the feel of
your data. It provides more information than just a mean and standard deviation and can
be useful in determining skew and identifying outliers. A special feature of the command
is its ability to determine percentile ranks.
Assumptions
Cumulative percentages and percentiles are valid only for data that are measured
on at least an ordinal scale. Because the output contains one line for each value of a
variable, this command works best on variables with a relatively small number of values.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases
in the sample of a particular value and the percentage of cases with that value. Thus,
conclusions drawn should relate only to describing the numbers or percentages of cases in the
sample. If the data are at least ordinal in nature, conclusions regarding the cumulative
percentage and/or percentiles can be drawn.
SPSS Data Format
The SPSS data file for obtaining frequency distributions requires only one variable,
and that variable can be of any type.
Creating a Frequency Distribution
To run the Frequencies command, click Analyze, then Descriptive Statistics,
then Frequencies. (This example uses the CARS.sav data file that comes with SPSS.
It is typically located at <C:\Program Files\SPSS\Cars.sav>.)
This will bring up the main dialog box. Transfer the variable for which you
would like a frequency distribution into the Variable(s) blank to the right. Be sure that
the Display frequency tables option is checked. Click OK to receive your output.
Note that the dialog boxes in newer versions of SPSS show both the type of
variable (the icon immediately left of the variable name) and the variable labels if
they are entered. Thus, the variable YEAR shows up in the dialog box as
Model Year (modulo 100).
Output for a Frequency Distribution
The output consists of two sections. The first section indicates the number of
records with valid data for each variable selected. Records with a blank score are listed as
missing. In this example, the data file contained 406 records. Notice that the variable label
is Model Year (modulo 100).

Statistics
Model Year (modulo 100)
N    Valid      405
     Missing      1

The second section of the output contains a cumulative frequency distribution for
each variable selected. At the top of the section, the variable label is given. The output
itself consists of five columns. The first column lists the values of the variable in sorted
order. There is a row for each value of your variable, and additional rows are added at the
bottom for the Total and Missing data. The second column gives the frequency of each
value, including missing values. The third column gives the percentage of all records
(including records with missing data) for each value. The fourth column, labeled Valid
Percent, gives the percentage of records (without including records with missing data) for
each value. If there were any missing values, these values would be larger than the values
in column three because the total number of records would have been reduced by the
number of records with missing values. The final column gives cumulative percentages.
Cumulative percentages indicate the percentage of records with a score equal to or smaller
than the current value. Thus, the last value is always 100%. These values are equivalent to
percentile ranks for the values listed.

Model Year (modulo 100)
                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid     70                  34        8.4          8.4               8.4
          71                  29        7.1          7.2              15.6
          72                  28        6.9          6.9              22.5
          73                  40        9.9          9.9              32.3
          74                  27        6.7          6.7              39.0
          75                  30        7.4          7.4              46.4
          76                  34        8.4          8.4              54.8
          77                  28        6.9          6.9              61.7
          78                  36        8.9          8.9              70.6
          79                  29        7.1          7.2              77.8
          80                  29        7.1          7.2              84.9
          81                  30        7.4          7.4              92.3
          82                  31        7.6          7.7             100.0
          Total              405       99.8        100.0
Missing   0 (Missing)          1         .2
Total                        406      100.0
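To make the relationship among the Percent, Valid Percent, and Cumulative Percent columns concrete, here is a minimal Python sketch (not SPSS syntax). It assumes the Model Year frequencies from the output described above, with one missing record; the variable names are illustrative only.

```python
# Frequencies for Model Year (modulo 100) as read from the output above.
counts = {70: 34, 71: 29, 72: 28, 73: 40, 74: 27, 75: 30, 76: 34,
          77: 28, 78: 36, 79: 29, 80: 29, 81: 30, 82: 31}
n_missing = 1                               # one record has no model year

n_valid = sum(counts.values())              # records with valid data
n_total = n_valid + n_missing               # all records in the file

rows = []
running = 0
for value in sorted(counts):
    freq = counts[value]
    running += freq
    rows.append((
        value,
        freq,
        100 * freq / n_total,               # Percent: base includes missing records
        100 * freq / n_valid,               # Valid Percent: base excludes missing records
        100 * running / n_valid,            # Cumulative Percent: running valid total
    ))

for value, freq, pct, valid_pct, cum_pct in rows:
    print(f"{value:>4} {freq:>5} {pct:7.1f} {valid_pct:7.1f} {cum_pct:7.1f}")
```

Because the valid base (405) is smaller than the total base (406), each Valid Percent is at least as large as the corresponding Percent, which is the point made in the paragraph above.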
Determining Percentile Ranks
The Frequencies command can be used to provide a number of descriptive
statistics, as well as a variety of percentile values (including quartiles, cut points, and
scores corresponding to a specific percentile rank).
To obtain either the descriptive or percentile functions of the Frequencies
command, click the Statistics button at the bottom of the main dialog box. Note that the
Central Tendency and Dispersion sections of this box are useful for calculating values,
such as the Median or Mode, which cannot be calculated with the Descriptives command
(see Section 3.3).
This brings up the Frequencies: Statistics dialog box. Check any additional
desired statistic by clicking on the blank next to it. For percentiles, enter the desired
percentile rank in the blank to the right of the Percentile(s) label. Then, click Add to add
it to the list of percentiles requested. Once you have selected all your required statistics,
click Continue to return to the main dialog box. Click OK.
Output for Percentile Ranks
The Statistics dialog box adds on to the previous output from the Frequencies
command. The new section of the output is shown below.
The output contains a row for each piece of information you requested. In the
example above, we checked Quartiles and asked for the 80th percentile. Thus, the output
contains rows for the 25th, 50th, 75th, and 80th percentiles.

Statistics
Model Year (modulo 100)
N             Valid        405
              Missing        1
Percentiles   25         73.00
              50         76.00
              75         79.00
              80         80.00
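As a rough cross-check of the percentile output, the following Python sketch rebuilds the 405 valid model-year scores from their frequencies and computes the quartiles and the 80th percentile. It assumes the (n + 1)p interpolation rule implemented by the standard library's "exclusive" method; that is an assumption made for this sketch, and SPSS may use a different rule in some cases.

```python
from statistics import quantiles

# Frequencies for Model Year (modulo 100), valid records only.
counts = {70: 34, 71: 29, 72: 28, 73: 40, 74: 27, 75: 30, 76: 34,
          77: 28, 78: 36, 79: 29, 80: 29, 81: 30, 82: 31}

# Expand the frequency table back into one sorted score per record.
values = sorted(v for v, f in counts.items() for _ in range(f))

# n=4 returns the three quartile cut points (25th, 50th, 75th percentiles).
quartiles = quantiles(values, n=4, method="exclusive")

# n=100 returns 99 cut points; index 79 is the 80th percentile.
p80 = quantiles(values, n=100, method="exclusive")[79]

print(quartiles, p80)
```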
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a frequency distribution table for
the mathematics skills scores. Determine the mathematics skills score at which the 60th
percentile lies.
Section 3.2 Frequency Distributions and Percentile Ranks
for Multiple Variables
Description
The Crosstabs command produces frequency distributions for multiple variables.
The output includes the number of occurrences of each combination of levels of each
variable. It is possible to have the command give percentages for any or all variables.
The Crosstabs command is useful for describing samples where the mean is not
useful (e.g., nominal or ordinal scales). It is also useful as a method for getting a feel for
your data.
Assumptions
Because the output contains a row or column for each value of a variable, this
command works best on variables with a relatively small number of values.
SPSS Data Format
The SPSS data file for the Crosstabs command requires two or more variables. Those
variables can be of any type.
Running the Crosstabs Command
This example uses the SAMPLE.sav data file, which you created in Chapter 1. To run
the procedure, click Analyze, then Descriptive Statistics, then Crosstabs. This will bring
up the main Crosstabs dialog box, below.
The dialog box initially lists all variables on the left and contains two blanks
labeled Row(s) and Column(s). Enter one variable (TRAINING) in the Row(s) box. Enter
the second (WORK) in the Column(s) box. To analyze more than two variables, you would
enter the third, fourth, etc., in the unlabeled area (just under the Layer indicator).
The Cells button allows you to specify percentages and other information to be
generated for each combination of values. Click Cells, and you will get the box at right.
For the example presented here, check Row, Column, and Total percentages. Then
click Continue. This will return you to the Crosstabs dialog box. Click OK to run the
analysis.
Interpreting Crosstabs Output
The output consists of a contingency table. Each level of WORK is given a column.
Each level of TRAINING is given a row. In addition, a row is added for total, and a
column is added for total.

TRAINING * WORK Crosstabulation
                                            WORK
                                     No      Part-Time     Total
TRAINING   Yes   Count                1          1            2
                 % within TRAINING  50.0%      50.0%       100.0%
                 % within WORK      50.0%      50.0%        50.0%
                 % of Total         25.0%      25.0%        50.0%
           No    Count                1          1            2
                 % within TRAINING  50.0%      50.0%       100.0%
                 % within WORK      50.0%      50.0%        50.0%
                 % of Total         25.0%      25.0%        50.0%
Total            Count                2          2            4
                 % within TRAINING  50.0%      50.0%       100.0%
                 % within WORK     100.0%     100.0%       100.0%
                 % of Total         50.0%      50.0%       100.0%

Each cell contains the number of participants (e.g., one participant received no
training and does not work; two participants received no training, regardless of
employment status).
The percentages for each cell are also shown. Row percentages add up to 100%
horizontally. Column percentages add up to 100% vertically. For example, of all the
individuals who had no training, 50% did not work and 50% worked part-time (using the
"% within TRAINING" row). Of the individuals who did not work, 50% had no training and
50% had training (using the "% within WORK" row).
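The three kinds of cell percentages differ only in the base they are divided by. The Python sketch below illustrates this with four (TRAINING, WORK) records chosen to be consistent with the counts in the contingency table above; the record list itself is an assumption made for illustration, not a copy of SAMPLE.sav.

```python
from collections import Counter

# One (TRAINING, WORK) pair per participant, consistent with the table counts.
records = [("Yes", "No"), ("Yes", "Part-Time"), ("No", "No"), ("No", "Part-Time")]

cells = Counter(records)                                   # count per cell
row_totals = Counter(training for training, _ in records)  # totals per TRAINING level
col_totals = Counter(work for _, work in records)          # totals per WORK level
n = len(records)                                           # grand total

table = {}
for (training, work), count in cells.items():
    table[(training, work)] = {
        "count": count,
        "pct_within_training": 100 * count / row_totals[training],  # row percentage
        "pct_within_work": 100 * count / col_totals[work],          # column percentage
        "pct_of_total": 100 * count / n,                            # total percentage
    }

for key in sorted(table):
    print(key, table[key])
```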
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a contingency table using the
Crosstabs command. Determine the number of participants in each combination of the
variables SEX and MARITAL. What percentage of participants is married? What
percentage of participants is male and married?
Section 3.3 Measures of Central Tendency and Measures of Dispersion
for a Single Group
Description
Measures of central tendency are values that represent a typical member of the
sample or population. The three primary types are the mean, median, and mode. Measures
of dispersion tell you the variability of your scores. The primary types are the range and
the standard deviation. Together, a measure of central tendency and a measure of
dispersion provide a great deal of information about the entire data set.
We will discuss these measures of central tendency and measures of dispersion in
the context of the Descriptives command. Note that many of these statistics can also be
calculated with several other commands (e.g., the Frequencies or Compare Means
commands are required to compute the mode or median; the Statistics option for the
Frequencies command is shown here).
Assumptions
Each measure of central tendency and measure of dispersion has different
assumptions associated with it. The mean is the most powerful measure of central tendency,
and it has the most assumptions. For example, to calculate a mean, the data must be
measured on an interval or ratio scale. In addition, the distribution should be normally
distributed or, at least, not highly skewed. The median requires at least ordinal data.
Because the median indicates only the middle score (when scores are arranged in order),
there are no assumptions about the shape of the distribution. The mode is the weakest
measure of central tendency. There are no assumptions for the mode.
The standard deviation is the most powerful measure of dispersion, but it, too, has
several requirements. It is a mathematical transformation of the variance (the standard
deviation is the square root of the variance). Thus, if one is appropriate, the other is also.
The standard deviation requires data measured on an interval or ratio scale. In addition,
the distribution should be normal. The range is the weakest measure of dispersion. To
calculate a range, the variable must be at least ordinal. For nominal scale data, the entire
frequency distribution should be presented as a measure of dispersion.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion.
Thus, when reporting a mean, you should also report a standard deviation. When
presenting a median, you should also state the range or interquartile range.
SPSS Data Format
Only one variable is required.
Running the Command
The Descriptives command will be the command you will most likely use for
obtaining measures of central tendency and measures of dispersion. This example uses the
SAMPLE.sav data file we have used in the previous chapters.
To run the command, click Analyze, then Descriptive Statistics, then Descriptives.
This will bring up the main dialog box for the Descriptives command. Any variables you
would like information about can be placed in the right blank by double-clicking them or
by selecting them, then clicking on the arrow.
By default, you will receive the N (number of cases/participants), the minimum
value, the maximum value, the mean, and the standard deviation. Note that some of these
may not be appropriate for the type of data you have selected.
If you would like to change the default statistics that are given, click Options in
the main dialog box. You will be given the Options dialog box presented here.
Reading the Output
The output for the Descriptives command is quite straightforward. Each type of
output requested is presented in a column, and each variable is given in a row. The output
presented here is for the sample data file. It shows that we have one variable (GRADE) and
that we obtained the N, minimum, maximum, mean, and standard deviation for this
variable.

Descriptive Statistics
                      N    Minimum   Maximum     Mean     Std. Deviation
grade                 4     73.00     85.00     80.2500      5.25198
Valid N (listwise)    4
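For readers who want to check these figures by hand, here is a small Python sketch of the same summary statistics. The four GRADE values are an assumption inferred from the sample output shown in this chapter; the sketch uses the sample (n - 1) forms of the variance and standard deviation.

```python
import statistics

# GRADE values inferred from the sample output in this chapter (an assumption).
grades = [85, 80, 83, 73]

summary = {
    "n": len(grades),
    "minimum": min(grades),
    "maximum": max(grades),
    "range": max(grades) - min(grades),
    "mean": statistics.mean(grades),
    "median": statistics.median(grades),
    "variance": statistics.variance(grades),   # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(grades),       # square root of the sample variance
}

for name, value in summary.items():
    print(f"{name:>9}: {value}")
```

The last two entries illustrate the point made under Assumptions: the standard deviation is simply the square root of the variance, so the two are appropriate under the same conditions.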
Practice Exercise
Using Practice Data Set 1 in Appendix B, obtain the descriptive statistics for the
age of the participants. What is the mean? The median? The mode? What is the standard
deviation? Minimum? Maximum? The range?
Section 3.4 Measures of Central Tendency and Measures of Dispersion
for Multiple Groups
Description
The measures of central tendency discussed earlier are often needed not only for
the entire data set, but also for several subsets. One way to obtain these values for subsets
would be to use the data-selection techniques discussed in Chapter 2 and apply the
Descriptives command to each subset. An easier way to perform this task is to use the Means
command. The Means command is designed to provide descriptive statistics for subsets
of your data.
Assumptions
The assumptions discussed in the section on Measures of Central Tendency and
Measures of Dispersion for a Single Group (Section 3.3) also apply to multiple groups.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion.
Thus, when giving a mean, you should also report a standard deviation. When presenting
a median, you should also state the range or interquartile range.
SPSS Data Format
Two variables in the SPSS data file are required. One represents the dependent
variable and will be the variable for which you receive the descriptive statistics. The
other is the independent variable and will be used in creating the subsets. Note that while
SPSS calls this variable an independent variable, it may not meet the strict criteria that
define a true independent variable (e.g., treatment manipulation). Thus, some SPSS
procedures refer to it as the grouping variable.
Running the Command
This example uses the SAMPLE.sav data file you created in Chapter 1. The Means
command is run by clicking Analyze, then Compare Means, then Means.
This will bring up the main dialog box for the Means command. Place the selected
variable in the blank field labeled Dependent List.
Place the grouping variable in the box labeled Independent List. In this example,
through use of the SAMPLE.sav data file, measures of central tendency and measures of
dispersion for the variable GRADE will be given for each level of the variable
MORNING.
By default, the mean, number of cases, and standard deviation are given. If you
would like additional measures, click Options and you will be presented with the dialog
box at right. You can opt to include any number of measures.
Reading the Output
The output for the Means command is split into two sections. The first section,
called a case processing summary, gives information about the data used. In our sample
data file, there are four students (cases), all of whom were included in the analysis.

Case Processing Summary
                                      Cases
                    Included          Excluded           Total
                    N    Percent      N    Percent       N    Percent
grade * morning     4    100.0%       0      .0%         4    100.0%
The second section of the output is the report from the Means command.

Report
GRADE
MORNING      Mean       N    Std. Deviation
No          82.5000     2       3.53553
Yes         78.0000     2       7.07107
Total       80.2500     4       5.25198

This report lists the name of the dependent variable at the top (GRADE). Every
level of the independent variable (MORNING) is shown in a row in the table. In this
example, the levels are 0 and 1, labeled No and Yes. Note that if a variable is labeled, the
labels will be used instead of the raw values.
The summary statistics given in the report correspond to the data, where the level
of the independent variable is equal to the row heading (e.g., No, Yes). Thus, two
participants were included in each row.
An additional row is added, named Total. That row contains the combined data,
and the values are the same as they would be if we had run the Descriptives command for
the variable GRADE.
Extension to More Than One Independent Variable
If you have more than one independent variable, SPSS can break down the output
even further. Rather than adding more variables to the Independent List section of the
dialog box, you need to add them in a different layer. Note that SPSS indicates with which
layer you are working.
If you click Next, you will be presented with Layer 2 of 2, and you can select a
second independent variable (e.g., TRAINING). Now, when you run the command (by
clicking OK), you will be given summary statistics for the variable GRADE by each level
of MORNING and TRAINING.
Your output will look like the output below. You now have two main sections (No
and Yes), along with the Total. Now, however, each main section is broken down into
subsections (No, Yes, and Total).

Report
GRADE
MORNING    TRAINING     Mean       N    Std. Deviation
No         Yes         85.0000     1         .
           No          80.0000     1         .
           Total       82.5000     2       3.53553
Yes        Yes         83.0000     1         .
           No          73.0000     1         .
           Total       78.0000     2       7.07107
Total      Yes         84.0000     2       1.41421
           No          76.5000     2       4.94975
           Total       80.2500     4       5.25198

The variable you used in Layer 1 (MORNING) is the first one listed, and it defines
the main sections. The variable you had in Layer 2 (TRAINING) is listed second. Thus, the
first row represents those participants who were not morning people and who received
training. The second row represents participants who were not morning people and did not
receive training. The third row represents the total for all participants who were not
morning people.
Notice that standard deviations are not given for all of the rows. This is because
there is only one participant per cell in this example. One problem with using many subsets
is that it increases the number of participants required to obtain meaningful results. See a
research design text or your instructor for more details.
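The logic of the Means report (split the cases by one or two grouping variables, then summarize each subset) can be sketched in Python as follows. The four cases are an assumption inferred from the reports above, and the helper name describe is hypothetical. The sketch also shows why no standard deviation appears for cells with a single participant.

```python
from collections import defaultdict
from statistics import mean, stdev

# (MORNING, TRAINING, GRADE) for the four cases, inferred from the reports above.
cases = [("No", "Yes", 85), ("No", "No", 80), ("Yes", "Yes", 83), ("Yes", "No", 73)]


def describe(grades):
    """Return (mean, N, sample standard deviation); the SD is None when N is 1."""
    sd = stdev(grades) if len(grades) > 1 else None   # one score has no spread to estimate
    return mean(grades), len(grades), sd


by_morning = defaultdict(list)   # one layer: subsets defined by MORNING
by_both = defaultdict(list)      # two layers: subsets defined by MORNING and TRAINING
for morning, training, grade in cases:
    by_morning[morning].append(grade)
    by_both[(morning, training)].append(grade)

report = {level: describe(g) for level, g in by_morning.items()}
layered = {levels: describe(g) for levels, g in by_both.items()}

print(report)
print(layered)
```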
Practice Exercise
Using Practice Data Set 1 in Appendix B, compute the mean and standard
deviation of ages for each value of marital status. What is the average age of the married
participants? The single participants? The divorced participants?
Section 3.5 Standard Scores
Description
Standard scores allow the comparison of different scales by transforming the scores
into a common scale. The most common standard score is the z-score. A z-score is based
on a standard normal distribution (e.g., a mean of 0 and a standard deviation of 1). A
z-score, therefore, represents the number of standard deviations above or below the mean
(e.g., a z-score of -1.5 represents a score 1½ standard deviations below the mean).
Assumptions
Z-scores are based on the standard normal distribution. Therefore, the
distributions that are converted to z-scores should be normally distributed, and the scales
should be either interval or ratio.
Drawing Conclusions
Conclusions based on z-scores consist of the number of standard deviations above
or below the mean. For example, a student scores 85 on a mathematics exam in a class that
has a mean of 70 and standard deviation of 5. The student's test score is 15 points above
the class mean (85 - 70 = 15). The student's z-score is 3 because she scored 3 standard
deviations above the mean (15 ÷ 5 = 3). If the same student scores 90 on a reading exam,
with a class mean of 80 and a standard deviation of 10, the z-score will be 1.0 because
she is one standard deviation above the mean. Thus, even though her raw score was
higher on the reading test, she actually did better in relation to other students on the
mathematics test because her z-score was higher on that test.
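The arithmetic in this example can be written as a one-line function. The following Python sketch is only an illustration of the calculation described above; the function name z_score is hypothetical.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


math_z = z_score(85, 70, 5)       # mathematics exam: 15 points above a mean of 70
reading_z = z_score(90, 80, 10)   # reading exam: 10 points above a mean of 80

print(math_z, reading_z)
print("Better relative performance:", "mathematics" if math_z > reading_z else "reading")
```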
SPSS Data Format
Calculating z-scores requires only a single variable in SPSS. That variable must be
numerical.
Running the Command
Computing z-scores is a component of the Descriptives command. To access it,
click Analyze, then Descriptive Statistics, then Descriptives. This example uses the sample
data file (SAMPLE.sav) created in Chapters 1 and 2.
This will bring up the standard dialog box for the Descriptives command. Notice
the checkbox in the bottom-left corner labeled Save standardized values as variables.
Check this box and move the variable GRADE into the right-hand blank. Then click OK to
complete the analysis. You will be presented with the standard output from the
Descriptives command. Notice that the z-scores are not listed. They were inserted into the
data window as a new variable.
Switch to the Data View window and examine your data file. Notice that a new
variable, called ZGRADE, has been added. When you asked SPSS to save standardized
values, it created a new variable with the same name as your old variable preceded by a Z.
The z-score is computed for each case and placed in the new variable.
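What the saved variable contains can be sketched in a few lines of Python. This is an illustration rather than a description of SPSS internals: the GRADE values are inferred from the sample output in this chapter, and the sketch assumes the sample (n - 1) standard deviation is used for standardizing.

```python
from statistics import mean, stdev

# GRADE values inferred from the sample output in this chapter (an assumption).
grades = [85, 80, 83, 73]

grade_mean = mean(grades)
grade_sd = stdev(grades)          # sample standard deviation (n - 1), assumed here

# One standardized value per case, analogous to the new ZGRADE column.
zgrade = [(g - grade_mean) / grade_sd for g in grades]

for g, z in zip(grades, zgrade):
    print(f"{g:>3}  {z: .4f}")
```

A quick sanity check on any standardized variable is that its mean is 0 and its standard deviation is 1.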
Reading the Output
After you conducted your analysis, the new variable was created. You can perform
any number of subsequent analyses on the new variable.
Practice Exercise
Using Practice Data Set 2 in Appendix B, determine the z-score that corresponds to
each employee's salary. Determine the mean z-scores for salaries of male employees and
female employees. Determine the mean z-score for salaries of the total sample.
Chapter 4
Graphing Data
Section 4.1 Graphing Basics
In addition to the frequency distributions, the measures of central tendency and
measures of dispersion discussed in Chapter 3, graphing is a useful way to summarize,
organize, and reduce your data. It has been said that a picture is worth a thousand words. In
the case of complicated data sets, this is certainly true.
With Version 15.0 of SPSS, it is now possible to make publication-quality graphs
using only SPSS. One important advantage of using SPSS to create your graphs instead of
other software (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus,
duplication is eliminated, and the chance of making a transcription error is reduced.
Section 4.2 The New SPSS Chart Builder
Data Set
For the graphing examples, we will use a new set of data. Enter the data below by
defining the three subject variables in the Variable View window: HEIGHT (in inches),
WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you create the variables,
designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure (in
the far-right column of the Variable View). Switch to the Data View to enter the data
values for the 16 participants. Now use the Save As command to save the file, naming it
HEIGHT.sav.

HEIGHT   WEIGHT   SEX
  66      150      1
  69      155      1
  73      160      1
  72      160      1
  68      150      1
  63      140      1
  74      165      1
  70      150      1
  66      110      2
  64      100      2
  60       95      2
  67      110      2
  64      105      2
  63      100      2
  67      110      2
  65      105      2
Make sure you have entered the data correctly by calculating a mean for each of
the three variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare
your results with those in the table below.

Descriptive Statistics
                      N    Minimum   Maximum     Mean      Std. Deviation
HEIGHT               16     60.00     74.00     66.9375       3.9067
WEIGHT               16     95.00    165.00    129.0625      26.3451
SEX                  16      1.00      2.00      1.5000        .5164
Valid N (listwise)   16
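The same data-entry check can be done outside SPSS. The Python sketch below recomputes the summary values for the three variables from the data listed in this section; the variable names are illustrative, and the sample (n - 1) standard deviation is assumed.

```python
from statistics import mean, stdev

# The 16 cases listed in this section (SEX: 1 = male, 2 = female).
height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weight = [150, 155, 160, 160, 150, 140, 165, 150,
          110, 100, 95, 110, 105, 100, 110, 105]
sex = [1] * 8 + [2] * 8

checks = {}
for name, column in (("HEIGHT", height), ("WEIGHT", weight), ("SEX", sex)):
    checks[name] = {
        "n": len(column),
        "minimum": min(column),
        "maximum": max(column),
        "mean": mean(column),
        "std_dev": round(stdev(column), 4),   # sample standard deviation, 4 decimals
    }

for name, stats in checks.items():
    print(name, stats)
```

If any value differs from the table above, the most likely cause is a typing error in one of the 16 rows.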
Chart Builder Basics
Make sure that the HEIGHT.sav data file you created above is open. In order to
use the chart builder, you must have a data file open.
New with Version 15.0 of SPSS is the Chart Builder command. This command is
accessed using Graphs, then Chart Builder in the submenu. This is a very versatile new
command that can make graphs of excellent quality.
When you first run the Chart Builder command, you will probably be presented
with the following dialog box:

    Before you use this dialog, measurement level should be set properly for each
    variable in your chart. In addition, if your chart contains categorical variables,
    value labels should be defined for each category.
    Press OK to define your chart.
    Press Define Variable Properties to set measurement level or define value labels
    for chart variables.
    [ ] Don't show this dialog again

This dialog box is asking you to ensure that your variables are properly defined.
Refer to Sections 1.3 and 2.1 if you had difficulty defining the variables used in creating
the dataset for this example, or to refresh your knowledge of this topic. Click OK.
The Chart Builder allows you to make any kind of graph that is normally used in
publication or presentation, and much of it is beyond the scope of this text. This text,
however, will go over the basics of the Chart Builder so that you can understand its
mechanics.
On the left side of the Chart Builder window are the four main tabs that let you
control the graphs you are making. The first one is the Gallery tab. The Gallery tab allows
you to choose the basic format of your graph.
For example, the screenshot here shows the different kinds of bar charts that the
Chart Builder can create.
After you have selected the basic form of graph that you want using the Gallery
tab, you simply drag the image from the bottom right of the window up to the main window
at the top (where it reads, "Drag a Gallery chart here to use it as your starting point").
Alternatively, you can use the Basic Elements tab to drag a coordinate system
(labeled Choose Axes) to the top window, then drag variables and elements into the
window.
The other tabs (Groups/Point ID and Titles/Footnotes) can be used for adding
other standard elements to your graphs.
The examples in this text will cover some of the basic types of graphs you can
make with the Chart Builder. After a little experimentation on your own, once you have
mastered the examples in the chapter, you will soon gain a full understanding of the
Chart Builder.
Section 4.3 Bar Charts, Pie Charts, and Histograms
Description
Bar charts, pie charts, and histograms represent the number of times each score
occurs through the varying heights of bars or sizes of pie pieces. They are graphical
representations of the frequency distributions discussed in Chapter 3.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases
in the sample with a particular value and the percentage of cases with that value. Thus,
conclusions drawn should relate only to describing the numbers or percentages for the
sample. If the data are at least ordinal in nature, conclusions regarding the cumulative
percentages and/or percentiles can also be drawn.
SPSS Data Format
You need only one variable to use this command.
Running the Command
The Frequencies command will produce graphical frequency distributions. Click
Analyze, then Descriptive Statistics, then Frequencies. You will be presented with the
main dialog box for the Frequencies command, where you can enter the variables for
which you would like to create graphs or charts. (See Chapter 3 for other options with this
command.)
Click the Charts button at the bottom to produce frequency distributions. This
will give you the Charts dialog box.
There are three types of charts available with this command: Bar charts, Pie
charts, and Histograms. For each type, the Y axis can be either a frequency count or a
percentage (selected with the Chart Values option).
You will receive the charts for any variables selected in the main Frequencies
command dialog box.
Output
The bar chart consists of a Y axis, representing the frequency, and an X axis,
representing each score. Note that the only values represented on the X axis are those
values with nonzero frequencies (61, 62, and 71 are not represented).
[Figure: bar chart of height (X axis: height; Y axis: frequency)]
The pie chart shows the percentage of the whole that is represented by each value.
[Figure: pie chart of height]
The Histogram command creates a grouped frequency distribution. The range of
scores is split into evenly spaced groups. The midpoint of each group is plotted on the
X axis, and the Y axis represents the number of scores for each group.
If you select With Normal Curve, a normal curve will be superimposed over the
distribution. This is very useful in determining if the distribution you have is
approximately normal. The distribution represented here is clearly not normal due to the
asymmetry of the values.
[Figure: histogram of height with normal curve (X axis: height)]
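The difference between a bar chart (one bar per observed score) and a histogram (one bar per evenly spaced group) can be sketched with the HEIGHT data in Python. The bin start of 60 and width of 5 are arbitrary choices for this illustration; SPSS selects its own bins, and the function name grouped is hypothetical.

```python
from collections import Counter

height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]

# Bar chart: one bar per score that actually occurs.
freq = Counter(height)
for value in sorted(freq):
    print(f"{value:>3} {'#' * freq[value]}")


def grouped(values, start, width):
    """Grouped frequency distribution keyed by the midpoint of each bin."""
    bins = Counter()
    for v in values:
        lower = start + width * ((v - start) // width)   # lower edge of v's bin
        bins[lower + width / 2] += 1                      # midpoint labels the bin
    return dict(sorted(bins.items()))


# Histogram: scores pooled into evenly spaced groups.
histogram = grouped(height, start=60, width=5)
print(histogram)
```

Scores with a frequency of zero never appear as keys in freq, which mirrors the way the bar chart omits them from the X axis.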
Practice Exercise
Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a
histogram that represents the mathematics skills scores and displays a normal curve, and a
bar chart that represents the frequencies for the variable AGE.
Section 4.4 Scatterplots
Description
Scatterplots (also called scattergrams or scatter diagrams) display two values for
each case with a mark on the graph. The X axis represents the value for one variable. The
Y axis represents the value for the second variable.
Assumptions
Both variables should be interval or ratio scales. If nominal or ordinal data are
used, be cautious about your interpretation of the scattergram.
SPSS Data Format
You need two variables to perform this command.
Running the Command
You can producescatterplotsby clicking Graphs, then Chart
Builder. (Note:You canalsousetheLegacyDialogs. For this method,
pleaseseeAppendixF.)
In Gallery Choose from: select Scatter/Dot. Then drag the Simple Scatter icon (top left) up to the main chart area as shown in the screenshot at left. Disregard the Element Properties window that pops up by choosing Close.
Next, drag the HEIGHT variable to the X-Axis area, and the WEIGHT variable to the Y-Axis area (remember that standard graphing conventions indicate that dependent variables should be Y and independent variables should be X. This would mean that we are trying to predict weights from heights). At this point, your screen should look like the example below. Note that your actual data are not shown, just a set of dummy values.
Click OK. You should get your new graph (next page) as Output.
Output
The output will consist of a mark for each participant at the appropriate X and Y levels.
Adding a Third Variable
Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To make it do so, select the Groups/Point ID tab in the Chart Builder. Click the Grouping/stacking variable option. Again, disregard the Element Properties window that pops up. Next, drag the variable SEX into the upper-right corner where it indicates Set Color. When this is done, your screen should look like the image at right. If you are not able to drag the variable SEX, it may be because it is not identified as nominal or ordinal in the Variable View window.
Click OK to have SPSS produce the graph.
Now our output will have two different sets of marks. One set represents the male participants, and the second set represents the female participants. These two sets will appear in two different colors on your screen. You can use the SPSS chart editor (see Section 4.6) to make them different shapes, as shown in the example below.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the relationship between SALARY and EDUCATION.
Section 4.5 Advanced Bar Charts
Description
Bar charts can be produced with the Frequencies command (see Section 4.3). Sometimes, however, we are interested in a bar chart where the Y axis is not a frequency. To produce such a chart, we need to use the Bar charts command.
SPSS Data Format
You need at least two variables to perform this command. There are two basic kinds of bar charts: those for between-subjects designs and those for repeated-measures designs. Use the between-subjects method if one variable is the independent variable and the other is the dependent variable. Use the repeated-measures method if you have a dependent variable for each value of the independent variable (e.g., you would have three variables for a design with three values of the independent variable). This normally occurs when you make multiple observations over time.
This example uses the GRADES.sav data file, which will be created in Chapter 6. Please see Section 6.4 for the data if you would like to follow along.
Running the Command
Open the Chart Builder by clicking Graphs, then Chart Builder. In the Gallery tab, select Bar. If you had only one independent variable, you would select the Simple Bar chart example (top left corner). If you have more than one independent variable (as in this example), select the Clustered Bar Chart example from the middle of the top row.
Drag the example to the top working area. Once you do, the working area should look like the screenshot below. (Note that you will need to open the data file you would like to graph in order to run this command.)
If you are using a repeated-measures design like our example here using GRADES.sav from Chapter 6 (three different variables representing the Y values that we want), you need to select all three variables (you can <Ctrl>-click them to select multiple variables) and then drag all three variable names to the Y-Axis area. When you do, you will be given the warning message above. Click OK.
Next, you will need to drag the INSTRUCT variable to the top right in the Cluster: set color area (see screenshot at left).
Note: The Chart Builder pays attention to the types of variables that you ask it to graph. If you are getting error messages or unusual results, be sure that your categorical variables are properly designated as Nominal in the Variable View tab (see Chapter 2, Section 2.1).
Output
[Figure: clustered bar chart output]
Practice Exercise
Use Practice Data Set 1 in Appendix B. Construct a clustered bar graph examining the relationship between MATHEMATICS SKILLS scores (as the dependent variable) and MARITAL STATUS and SEX (as independent variables). Make sure you classify both SEX and MARITAL STATUS as nominal variables.
Section 4.6 Editing SPSS Graphs
Whatever command you use to create your graph, you will probably want to do some editing to make it appear exactly as you want it to look. In SPSS, you do this in much the same way that you edit graphs in other software programs (e.g., Excel). After your graph is made, in the output window, select your graph (this will create handles around the outside of the entire object) and right-click. Then, click SPSS Chart Object, and click Open. Alternatively, you can double-click on the graph to open it for editing.
When you open the graph, the Chart Editor window and the corresponding Properties window will appear.
Once Chart Editor is open, you can easily edit each element of the graph. To select an element, just click on the relevant spot on the graph. For example, if you have added a title to your graph ("Histogram" in the example that follows), you may select the element representing the title of the graph by clicking anywhere on the title.
Once you have selected an element, you can tell whether the correct element is selected because it will have handles around it.
If the item you have selected is a text element (e.g., the title of the graph), a cursor will be present and you can edit the text as you would in a word processing program. If you would like to change another attribute of the element (e.g., the color or font size), use the Properties box. (Text properties are shown below.)
With a little practice, you can make excellent graphs using SPSS. Once your graph is formatted the way you want it, simply select File, Save, then Close.
Chapter 5
Prediction and Association
Section 5.1 Pearson Correlation Coefficient
Description
The Pearson correlation coefficient (sometimes called the Pearson product-moment correlation coefficient or simply the Pearson r) determines the strength of the linear relationship between two variables.
Assumptions
Both variables should be measured on interval or ratio scales (or a dichotomous nominal variable). If a relationship exists between them, that relationship should be linear. Because the Pearson correlation coefficient is computed with z-scores, both variables should also be normally distributed. If your data do not meet these assumptions, consider using the Spearman rho correlation coefficient instead.
SPSS Data Format
Two variables are required in your SPSS data file. Each subject must have data for both variables.
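The Assumptions paragraph notes that the Pearson r is computed with z-scores. As a rough illustration of what that means (this is not SPSS's own code, and the function name and data are made up), the Python sketch below converts each variable to z-scores using the sample standard deviation, multiplies the paired z-scores, and divides the sum by N - 1.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson r as the average product of paired z-scores.

    Uses the sample standard deviation, so the sum of products
    is divided by N - 1.
    """
    if len(x) != len(y):
        raise ValueError("Each subject must have data for both variables.")
    mean_x, mean_y = mean(x), mean(y)
    sd_x, sd_y = stdev(x), stdev(y)
    z_x = [(value - mean_x) / sd_x for value in x]
    z_y = [(value - mean_y) / sd_y for value in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / (len(x) - 1)


# Hypothetical scores for five subjects.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 1, 4, 3, 5]
print(round(pearson_r(x_scores, y_scores), 3))
```

Because every subject contributes one pair of z-scores, a subject with a missing value on either variable cannot be included, which is why the data format requires data for both variables.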
Running the Command
To select the Pearson correlation coefficient, click Analyze, then Correlate, then Bivariate (bivariate refers to two variables). This will bring up the Bivariate Correlations dialog box. This example uses the HEIGHT.sav data file entered at the start of Chapter 4.
Move at least two variables from the box at left into the box at right by using the transfer arrow (or by double-clicking each variable). Make sure that a check is in the Pearson box under Correlation Coefficients. It is acceptable to move more than two variables.
For our example, we will move all three variables over and click OK.
Reading the Output
The output consists of a correlation matrix. Every variable you entered in the command is represented as both a row and a column. We entered three variables in our command. Therefore, we have a 3 x 3 table. There are also three rows in each cell: the correlation, the significance level, and the N. If a correlation is significant at less than the .05 level, a single * will appear next to the correlation. If it is significant at the .01 level or lower, ** will appear next to the correlation. For example, the correlation in the output at right has a significance level of < .001, so it is flagged with ** to indicate that it is less than .01.
To read the correlations, select a row and a column. For example, the correlation between height and weight is determined through selection of the WEIGHT row and the HEIGHT column (.806). We get the same answer by selecting the HEIGHT row and the WEIGHT column. The correlation between a variable and itself is always 1, so there is a diagonal set of 1s.
Drawing Conclusions
The correlation coefficient will be between -1.0 and +1.0. Coefficients close to 0.0 represent a weak relationship. Coefficients close to 1.0 or -1.0 represent a strong relationship. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Please see Appendix A for a discussion of effect sizes for correlations.
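The rules of thumb above can be written as a small helper. This Python sketch is only an illustration (the function name is made up), and it assumes the cutoffs apply to the size of the coefficient regardless of sign, since a coefficient near -1.0 is described as strong.

```python
def describe_strength(r):
    """Label a correlation using the 0.3 and 0.7 rules of thumb.

    Assumption: the cutoffs are applied to the absolute value of r.
    """
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size < 0.3:
        return "weak"
    return "moderate"


# Coefficients that appear in this chapter's examples.
for r in (0.806, -0.644, 0.217):
    print(r, describe_strength(r))
```

The label describes strength only. Whether a correlation is significant is a separate question that depends on the number of participants.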
Phrasing a Significant Result
In the example above, we obtained a correlation of .806 between HEIGHT and WEIGHT. A correlation of .806 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:
Correlations
                                 height    weight    sex
height   Pearson Correlation     1         .806**    -.644**
         Sig. (2-tailed)                   .000      .007
         N                       16        16        16
weight   Pearson Correlation     .806**    1         -.968**
         Sig. (2-tailed)         .000                .000
         N                       16        16        16
sex      Pearson Correlation     -.644**   -.968**   1
         Sig. (2-tailed)         .007      .000
         N                       16        16        16
**. Correlation is significant at the 0.01 level (2-tailed).
A Pearson correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (r(14) = .806, p < .001), indicating a significant linear relationship between the two variables. Taller participants tend to weigh more.
The conclusion states the direction (positive), strength (strong), value (.806), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier).
Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. While most SPSS procedures give degrees of freedom, the correlation command gives only the N (the number of pairs). For a correlation, the degrees of freedom is N - 2.
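Because the correlation command reports only N, the degrees of freedom must be worked out by hand. The Python sketch below (a made-up helper, not an SPSS feature) computes N - 2 and also the t statistic that a test of the correlation is conventionally based on, t = r * sqrt(df / (1 - r^2)). Looking up the p value for that t requires a t distribution table or a statistics library and is not shown.

```python
import math


def correlation_t(r, n):
    """Return (degrees of freedom, t statistic) for a Pearson r from n pairs."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    return df, t


# The HEIGHT and WEIGHT example: r = .806 with N = 16 pairs.
df, t = correlation_t(0.806, 16)
print(f"r({df}) = .806, t = {t:.2f}")
```

For the HEIGHT and WEIGHT example, this gives 14 degrees of freedom, matching the value reported in the results statement.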
Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a correlation between ID and GRADE. If so, we get the output at right. The correlation has a significance level of .783. Thus, we could write the following in a results section (note that the degrees of freedom is N - 2):
A Pearson correlation was calculated examining the relationship between participants' ID numbers and grades. A weak correlation that was not significant was found (r(2) = .217, p > .05). ID number is not related to grade in the course.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the value of the Pearson correlation coefficient for the relationship between SALARY and YEARS OF EDUCATION.
Section 5.2 Spearman Correlation Coefficient
Description
The Spearman correlation coefficient determines the strength of the relationship between two variables. It is a nonparametric procedure. Therefore, it is weaker than the Pearson correlation coefficient, but it can be used in more situations.
Assumptions
Because the Spearman correlation coefficient functions on the basis of the ranks of data, it requires ordinal (or interval or ratio) data for both variables. They do not need to be normally distributed.
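To see what "functions on the basis of the ranks" means, the Python sketch below ranks each variable and then computes an ordinary Pearson correlation on the ranks. The function names and data are made up for illustration, and the choice to give tied scores the average of their ranks is a common convention assumed here rather than something stated in this section.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation from sums of squares and cross-products."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but not in a straight line.
print(round(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]), 3))
```

Only the ordering of the scores matters, which is why ordinal data are sufficient and normality is not required.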
Correlations
                               ID       GRADE
ID      Pearson Correlation    1.000    .217
        Sig. (2-tailed)                 .783
        N                      4        4
GRADE   Pearson Correlation    .217     1.000
        Sig. (2-tailed)        .783
        N                      4        4
SPSS Data Format
Two variables are required in your SPSS data file. Each subject must provide data for both variables.
Running the Command
Click Analyze, then Correlate, then Bivariate. This will bring up the main dialog box for Bivariate Correlations (just like the Pearson correlation). About halfway down the dialog box, there is a section for indicating the type of correlation you will compute. You can select as many correlations as you want. For our example, remove the check in the Pearson box (by clicking on it) and click on the Spearman box.
Use the variables HEIGHT and WEIGHT from our HEIGHT.sav data file (Chapter 4). This is also one of the few commands that allows you to choose a one-tailed test, if desired.
Reading the Output
The output is essentially the same as for the Pearson correlation. Each pair of variables has its correlation coefficient indicated twice. The Spearman rho can range from -1.0 to +1.0, just like the Pearson r.
The output indicates a correlation of .883 between HEIGHT and WEIGHT. Note the significance level of .000, shown in the "Sig. (2-tailed)" row. This is, in fact, a significance level of < .001. The actual alpha level rounds out to .000, but it is not zero.
Drawing Conclusions
The correlation will be between -1.0 and +1.0. Scores close to 0.0 represent a weak relationship. Scores close to 1.0 or -1.0 represent a strong relationship. Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
Correlations
                                                    HEIGHT    WEIGHT
Spearman's rho   HEIGHT   Correlation Coefficient   1.000     .883**
                          Sig. (2-tailed)                     .000
                          N                         16        16
                 WEIGHT   Correlation Coefficient   .883**    1.000
                          Sig. (2-tailed)           .000
                          N                         16        16
**. Correlation is significant at the .01 level (2-tailed).
Phrasing Results That Are Significant
In the example above, we obtained a correlation of .883 between HEIGHT and WEIGHT. A correlation of .883 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:
A Spearman rho correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (rho(14) = .883, p < .001), indicating a significant relationship between the two variables. Taller participants tend to weigh more.
The conclusion states the direction (positive), strength (strong), value (.883), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier). Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. For a correlation, the degrees of freedom is N - 2.
Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a Spearman rho correlation between ID and GRADE. If so, we would get the output at right. The correlation coefficient equals .000 and has a significance level of 1.000. Note that this value is rounded and is not, in fact, exactly 1.000. We could state the following in a results section:
A Spearman rho correlation coefficient was calculated for the relationship between a subject's ID number and grade. An extremely weak correlation that was not significant was found (rho(2) = .000, p > .05). ID number is not related to grade in the course.
Correlations
                                                   ID       GRADE
Spearman's rho   ID      Correlation Coefficient   1.000    .000
                         Sig. (2-tailed)                    1.000
                         N                         4        4
                 GRADE   Correlation Coefficient   .000     1.000
                         Sig. (2-tailed)           1.000
                         N                         4        4
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the strength of the relationship between salary and job classification by calculating the Spearman rho correlation.
Section 5.3 Simple Linear Regression
Description
Simple linear regression allows the prediction of one variable from another.
Assumptions
Simple linear regression assumes that both variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. Typically, both variables should be normally distributed. Dichotomous variables (variables with only two levels) are also acceptable as independent variables.
SPSS Data Format
Two variables are required in the SPSS data file. Each subject must contribute to both values.
Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this section). On the right are blocks for the dependent variable (the variable you are trying to predict), and the independent variable (the variable from which we are predicting).
We are interested in predicting someone's weight on the basis of his or her height. Thus, we should place the variable WEIGHT in the dependent variable block and the variable HEIGHT in the independent variable block. Then we can click OK to run the analysis.
Reading the Output
For simple linear regressions, we are interested in three components of the output. The first is called the Model Summary, and it occurs after the Variables Entered/Removed section. For our example, you should see this output.
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .806(a)   .649       .624                16.14801
a. Predictors: (Constant), height
R Square (called the coefficient of determination) gives you the proportion of the variance of your dependent variable (WEIGHT) that can be explained by variation in your independent variable (HEIGHT). Thus, 64.9% of the variation in weight can be explained by differences in height (taller individuals weigh more).
The standard error of estimate gives you a measure of dispersion for your prediction equation. When the prediction equation is used, 68% of the data will fall within one standard error of estimate of the predicted value. Just over 95% will fall within two standard errors. Thus, in the previous example, 95% of the time, our estimated weight will be within 32.296 pounds of being correct (i.e., 2 x 16.148 = 32.296).
ANOVA(b)
Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   6760.323         1    6760.323      25.926   .000(a)
   Residual     3650.614         14   260.758
   Total        10410.938        15
a. Predictors: (Constant), HEIGHT
b. Dependent Variable: WEIGHT
The second part of the output that we are interested in is the ANOVA summary table, as shown above. The important number here is the significance level in the rightmost column. If that value is less than .05, then we have a significant linear regression. If it is larger than .05, we do not.
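The numbers in the ANOVA summary table and the Model Summary are tied together by standard regression arithmetic. The Python sketch below (a made-up helper, offered as a check rather than as SPSS's procedure) starts from the sums of squares and degrees of freedom in the table and recovers R Square, F, and the standard error of the estimate.

```python
import math


def regression_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive R Square, F, and the standard error of the estimate."""
    ss_total = ss_regression + ss_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "r_square": ss_regression / ss_total,  # proportion of variance explained
        "f": ms_regression / ms_residual,      # ratio of the two mean squares
        "std_error": math.sqrt(ms_residual),   # standard error of the estimate
    }


# Values from the WEIGHT on HEIGHT example.
summary = regression_summary(6760.323, 3650.614, 1, 14)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these agree with the R Square of .649, the F of 25.926, and the standard error of 16.148 reported for this example.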
The final section of the output is the table of coefficients. This is where the actual prediction equation can be found.
Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta                        t        Sig.
1  (Constant)    -234.681    71.552                                        -3.280   .005
   height        5.434       1.067             .806                        5.092    .000
a. Dependent Variable: weight
In most texts, you learn that Y' = a + bX is the regression equation. Y' (pronounced "Y prime") is your dependent variable (primes are normally predicted values or dependent variables), and X is your independent variable. In SPSS output, the values of both a and b are found in the B column. The first value, -234.681, is the value of a (labeled Constant). The second value, 5.434, is the value of b (labeled with the name of the independent variable). Thus, our prediction equation for the example above is WEIGHT' = -234.681 + 5.434(HEIGHT). In other words, the average subject who is an inch taller than another subject weighs 5.434 pounds more. A person who is 60 inches tall should weigh -234.681 + 5.434(60) = 91.359 pounds. Given our earlier discussion of standard error of estimate, 95% of individuals who are 60 inches tall will weigh between 59.063 (91.359 - 32.296 = 59.063) and 123.655 (91.359 + 32.296 = 123.655) pounds.
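The worked example above can be repeated for any height with a few lines of code. This Python sketch is only an illustration: the constant and function names are made up, the coefficients are the ones from the example output, and the band uses the same "two standard errors" approximation as the text.

```python
INTERCEPT = -234.681   # a, the Constant row of the B column
SLOPE = 5.434          # b, the height row of the B column
STD_ERROR = 16.148     # standard error of the estimate


def predict_weight(height_inches):
    """Apply WEIGHT' = a + b(HEIGHT)."""
    return INTERCEPT + SLOPE * height_inches


def approximate_95_band(height_inches):
    """Predicted weight plus or minus two standard errors of the estimate."""
    predicted = predict_weight(height_inches)
    margin = 2 * STD_ERROR
    return predicted - margin, predicted + margin


low, high = approximate_95_band(60)
print(f"Predicted: {predict_weight(60):.3f} pounds")
print(f"Approximate 95% band: {low:.3f} to {high:.3f} pounds")
```

The equation should be used only for heights within the range of the data that produced it. A height of 40 inches, for example, yields a negative predicted weight.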
Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself.
Phrasing Results That Are Significant
In the examples above, we obtained an R Square of .649 and a regression equation of WEIGHT' = -234.681 + 5.434(HEIGHT). The ANOVA resulted in F = 25.926 with 1 and 14 degrees of freedom. The F is significant at the less than .001 level. Thus, we could state the following in a results section:
A simple linear regression was calculated predicting participants' weight based on their height. A significant regression equation was found (F(1,14) = 25.926, p < .001), with an R² of .649. Participants' predicted weight is equal to -234.68 + 5.43(HEIGHT) pounds when height is measured in inches. Participants' average weight increased 5.43 pounds for each inch of height.
The conclusion states the direction (increase), strength (.649), value (25.926), degrees of freedom (1,14), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included.
Phrasing Results That Are Not Significant
If the ANOVA is not significant (e.g., see the output at right), the section of the output labeled Sig. for the ANOVA will be greater than .05, and the regression equation is not significant. A results section might include the following statement:
A simple linear regression was calculated predicting participants' ACT scores based on their height. The regression equation was not significant (F(1,14) = 4.12, p > .05) with an R² of .227. Height is not a significant predictor of ACT scores.
[Output: Model Summary, ANOVA, and Coefficients tables for the regression predicting ACT from height. Legible values: R Square = .227, Std. Error of the Estimate = 3.06696, F = 4.120 with 1 and 14 degrees of freedom, Sig. = .062.]
Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.
Practice Exercise
Use Practice Data Set 2 in Appendix B. If we want to predict salary from years of education, what salary would you predict for someone with 12 years of education? What salary would you predict for someone with a college education (16 years)?
Section 5.4 Multiple Linear Regression
Description
The multiple linear regression analysis allows the prediction of one variable from several other variables.
Assumptions
Multiple linear regression assumes that all variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. All variables should be normally distributed. Dichotomous variables are also acceptable as independent variables.
SPSS Data Format
At least three variables are required in the SPSS data file. Each subject must contribute to all values.
Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this chapter). On the right side of the dialog box are blanks for the dependent variable (the variable you are trying to predict) and the independent variables (the variables from which you are predicting).
We are interested in predicting someone's weight based on his or her height and sex. We believe that both sex and height influence weight. Thus, we should place the dependent variable WEIGHT in the Dependent block and the independent variables HEIGHT and SEX in the Independent(s) block. Enter both in Block 1.
This will perform an analysis to determine if WEIGHT can be predicted from SEX and/or HEIGHT. There are several methods SPSS can use to conduct this analysis. These can be selected with the Method box. Method Enter, the most widely used, puts all variables in the equation, whether they are significant or not. The other methods use various means to enter only those variables that are significant predictors. Click OK to run the analysis.
Reading the Output
For multiple linear regression, there are three components of the output in which we are interested. The first is called the Model Summary, which is found after the Variables Entered/Removed section. For our example, you should get the Model Summary output shown below. R Square (called the coefficient of determination) tells you the proportion of the variance in the dependent variable (WEIGHT) that can be explained by variation in the independent variables (HEIGHT and SEX, in this case). Thus, 99.3% of the variation in weight can be explained by differences in height and sex (taller individuals weigh more, and men weigh more). Note that when a second variable is added, our R Square goes up from .649 to .993. The .649 was obtained using the Simple Linear Regression example in Section 5.3.
The Standard Error of the Estimate gives you a margin of error for the prediction equation. Using the prediction equation, 68% of the data will fall within one standard error of estimate of the predicted value. Just over 95% will fall within two standard errors of estimate. Thus, in the example above, 95% of the time, our estimated weight will be within 4.591 (2.296 x 2) pounds of being correct. In our Simple Linear Regression example in Section 5.3, this number was 32.296. Note the higher degree of accuracy.
The second part of the output that we are interested in is the ANOVA summary table. For more information on reading ANOVA tables, refer to the sections on ANOVA in Chapter 6. For now, the important number is the significance in the rightmost column. If that value is less than .05, we have a significant linear regression. If it is larger than .05, we do not.
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .997(a)   .993       .992                2.29571
a. Predictors: (Constant), sex, height
ANOVA(b)
Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   10342.424        2    5171.212      981.202   .000(a)
   Residual     68.514           13   5.270
   Total        10410.938        15
a. Predictors: (Constant), sex, height
b. Dependent Variable: weight
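The Model Summary also lists an Adjusted R Square, which this section does not discuss. As a side note, the usual textbook adjustment is 1 - (1 - R²)(N - 1)/(N - k - 1), where k is the number of independent variables. The Python sketch below (a made-up helper, and an assumption about the formula rather than a statement of SPSS internals) applies it to the sums of squares in the ANOVA tables.

```python
def adjusted_r_square(ss_regression, ss_residual, n_cases, n_predictors):
    """Textbook adjusted R Square from ANOVA sums of squares."""
    r_square = ss_regression / (ss_regression + ss_residual)
    penalty = (n_cases - 1) / (n_cases - n_predictors - 1)
    return 1 - (1 - r_square) * penalty


# Multiple regression example: 16 cases, 2 predictors (height and sex).
print(round(adjusted_r_square(10342.424, 68.514, 16, 2), 3))

# Simple regression example from Section 5.3: 16 cases, 1 predictor.
print(round(adjusted_r_square(6760.323, 3650.614, 16, 1), 3))
```

With these inputs, the results round to the .992 and .624 shown in the two Model Summary tables, which suggests the outputs follow this adjustment.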
The final section of output we are interested in is the table of coefficients. This is where the actual prediction equation can be found.
Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t         Sig.
1  (Constant)    47.138     14.843                                         3.176     .007
   height        2.101      .198               .312                        10.588    .000
   sex           -39.133    1.501              -.767                       -26.071   .000
a. Dependent Variable: weight
In most texts, you learn that Y' = a + bX is the regression equation. For multiple regression, our equation changes to Y' = B0 + B1X1 + B2X2 + ... + BzXz (where z is the number of independent variables). Y' is your dependent variable, and the Xs are your independent variables. The Bs are listed in a column. Thus, our prediction equation for the example above is WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT) (where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is in inches). In other words, the average difference in weight for participants who differ by one inch in height is 2.101 pounds. Males tend to weigh 39.133 pounds more than females. A female who is 60 inches tall should weigh 47.138 - 39.133(2) + 2.101(60) = 94.932 pounds. Given our earlier discussion of the standard error of estimate, 95% of females who are 60 inches tall will weigh between 90.341 (94.932 - 4.591 = 90.341) and 99.523 (94.932 + 4.591 = 99.523) pounds.
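The same arithmetic can be sketched in code for any combination of sex and height. In this Python illustration, the constant and function names are made up, the coefficients come from the example output, and SEX follows the coding given above (1 = Male, 2 = Female).

```python
CONSTANT = 47.138
B_SEX = -39.133
B_HEIGHT = 2.101
MARGIN_95 = 4.591  # two standard errors of the estimate, from the text

MALE, FEMALE = 1, 2  # coding used in the HEIGHT.sav example


def predict_weight(sex_code, height_inches):
    """Apply WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT)."""
    return CONSTANT + B_SEX * sex_code + B_HEIGHT * height_inches


female_60 = predict_weight(FEMALE, 60)
male_60 = predict_weight(MALE, 60)
print(f"Female, 60 inches: {female_60:.3f} pounds")
print(f"Male, 60 inches:   {male_60:.3f} pounds")
print(f"Approximate 95% band for the female: "
      f"{female_60 - MARGIN_95:.3f} to {female_60 + MARGIN_95:.3f}")
```

Holding height constant, the two predictions differ by exactly the SEX coefficient, which is how the 39.133-pound difference between males and females is read from the table.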
Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself. Multiple regression is generally much more powerful than simple linear regression. Compare our two examples.
With multiple regression, you must also consider the significance level of each independent variable. In the example above, the significance level of both independent variables is less than .001.
Phrasing Results That Are Significant
In our example, we obtained an R Square of .993 and a regression equation of WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT). The ANOVA resulted in F = 981.202 with 2 and 13 degrees of freedom. F is significant at the less than .001 level. Thus, we could state the following in a results section:
A multiple linear regression was calculated to predict participants' weight based on their height and sex. A significant regression equation was found (F(2,13) = 981.202, p < .001), with an R² of .993. Participants' predicted weight is equal to 47.138 - 39.133(SEX) + 2.101(HEIGHT), where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is measured in inches. Participants' weight increased 2.101 pounds for each inch of height, and males weighed 39.133 pounds more than females. Both sex and height were significant predictors.
The conclusion states the direction (increase), strength (.993), value (981.20), degrees of freedom (2,13), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included. Because there are multiple independent variables, we have noted whether or not each is significant.
Phrasing Results That Are Not Significant
If the ANOVA does not find a significant relationship, the Sig. section of the output will be greater than .05, and the regression equation is not significant. A results section for the output at right might include the following statement:
A multiple linear regression was calculated predicting participants' ACT scores based on their height and sex. The regression equation was not significant (F(2,13) = 2.511, p > .05) with an R² of .279. Neither height nor sex is a significant predictor of ACT scores.
Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .528   .279       .168                3.07525

a. Predictors: (Constant), sex, height

ANOVA

Model           Sum of Squares   df   Mean Square   F       Sig.
1  Regression   47.497           2    23.749        2.511   [illegible]
   Residual     122.941          13   9.457
   Total        170.438          15

a. Predictors: (Constant), sex, height
b. Dependent Variable: act

Coefficients

                Unstandardized Coefficients   Standardized Coefficients
Model           B             Std. Error      Beta                        t        Sig.
1  (Constant)   [illegible]   19.884                                      3.102    [illegible]
   height       -.576         .266            [illegible]                 -2.168   [illegible]
   sex          [illegible]   2.011           [illegible]                 -.562    [illegible]

a. Dependent Variable: act

Note that for results that are not significant, the ANOVA results and R² results are
given, but the regression equation is not.

Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the prediction equation for
predicting salary based on education, years of service, and sex. Which variables are
significant predictors? If you believe that men were paid more than women were, what
would you conclude after conducting this analysis?
Chapter 6
Parametric Inferential Statistics
Parametric statistical procedures allow you to draw inferences about populations
based on samples of those populations. To make these inferences, you must be able to
make certain assumptions about the shape of the distributions of the population samples.
Section 6.1 Review of Basic Hypothesis Testing
The Null Hypothesis
In hypothesis testing, we create two hypotheses that are mutually exclusive (i.e.,
both cannot be true at the same time) and all inclusive (i.e., one of them must be true). We
refer to those two hypotheses as the null hypothesis and the alternative hypothesis. The
null hypothesis generally states that any difference we observe is caused by random error.
The alternative hypothesis generally states that any difference we observe is caused by a
systematic difference between groups.
Type I and Type II Errors
All hypothesis testing attempts to draw conclusions about the real world
based on the results of a test (a statistical test, in this case). There are four possible
combinations of results (see the figure below).
Two of the possible results are correct test results. The other two results are
errors. A Type I error occurs when we reject a null hypothesis that is, in fact,
true, while a Type II error occurs when we fail to reject the null hypothesis that
is, in fact, false.
Significance tests determine the probability of making a Type I error. In
other words, after performing a series of calculations, we obtain the probability of
observing a result at least as extreme as ours if the null hypothesis were true. If that
probability is low, such as 5 or less in 100 (.05), by convention, we reject the null
hypothesis. In other words, we typically use the .05 level (or less) as the maximum
Type I error rate we are willing to accept.
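A small simulation can make the .05 convention concrete. The sketch below is our own illustration, not a procedure from SPSS: it repeatedly draws samples from a population in which the null hypothesis is true (a mean of 0), tests each sample with a two-tailed z test at the .05 level, and counts how often the null hypothesis is rejected. The z test with a known standard deviation of 1, the sample size, the number of trials, and the function name `type_i_error_rate` are all assumptions chosen to keep the example self-contained.

```python
import random
import statistics


def type_i_error_rate(trials=4000, n=25, z_crit=1.96, seed=1):
    """Estimate how often a true null hypothesis is rejected at the .05 level."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic for the sample mean, with a known population SD of 1.
        z = statistics.fmean(sample) * n ** 0.5
        if abs(z) > z_crit:            # two-tailed test at the .05 level
            rejections += 1            # rejecting a true null is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Proportion of true null hypotheses rejected: {rate:.3f}")
```

Every rejection in this simulation is a Type I error, so the printed proportion should fall close to .05, the maximum Type I error rate we agreed to accept.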
When there is a low probability of a Type I error, such as .05, we can state that the
significance test has led us to "reject the null hypothesis." This is synonymous with
saying that a difference is "statistically significant." For example, on a reading test, suppose
you found that a random sample of girls from a school district scored higher than a random
                                          REAL WORLD
TEST RESULT                               Null Hypothesis True   Null Hypothesis False
Reject the null hypothesis                Type I Error           No Error
Fail to reject the null hypothesis        No Error               Type II Error
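The four combinations in the figure can also be written as a simple lookup. The sketch below is only an illustration of the definitions given earlier in this section; the function name `classify_outcome` and its two arguments are hypothetical.

```python
def classify_outcome(null_is_true, null_rejected):
    """Name the outcome of a test, given the real world and the decision made."""
    if null_rejected:
        # Rejecting a true null hypothesis is a Type I error.
        return "Type I error" if null_is_true else "No error"
    # Failing to reject a false null hypothesis is a Type II error.
    return "No error" if null_is_true else "Type II error"


# Walk through all four cells of the figure.
for null_is_true in (True, False):
    for null_rejected in (True, False):
        outcome = classify_outcome(null_is_true, null_rejected)
        print(f"null true={null_is_true}, rejected={null_rejected}: {outcome}")
```

In practice we never know the "real world" argument, which is why we can only control the probability of each kind of error rather than know which cell we are in.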
  • 1. A Step-by-Step Guide to Analysis and Interpretotion Brian C. Cronk ll I L :,-
  • 2. ChoosingtheAppropriafeSfafistical lesf Ytrh.t b Yq I*l QraJoi? Dtfbsr h ProportdE Mo.s Tha 1 lnd€Fndont Varidl6 lldr Tho 2 L6Eb d li(bsxlq* Vdidb lhre Thsn 2 L€wls of Indop€nddtt Varisd€ f'bre Tha 'l Indopadqrl Vdbue '| Ind.Fddrt Vri*b fro.! Itn I l.doFfihnt Vdi.bb NOTE:Relevantsectionnumbersare giveninparentheses.Forinstance, '(6.9)"refersyouto Section6.9in Chapter6. I
  • 3. Notice SPSSis a registeredtrademarkof SPSS,Inc.Screenimages@by SPSS,Inc. andMicrosoftCorporation.Usedwith permission. Thisbookis not approvedor sponsoredby SPSS. "PyrczakPublishing"isanimprintof FredPyrczak,Publisher,A CaliforniaCorporation. Althoughtheauthorandpublisherhavemadeeveryefforttoensuretheaccuracyand completenessof informationcontainedin thisbook,weassumenoresponsibilityfor errors,inaccuracies,omissions,or anyinconsistencyherein.Any slightsof people, places,or organizationsareunintentional. ProjectDirector:MonicaLopez. ConsultingEditors:GeorgeBumrss,JoseL. Galvan,MatthewGiblin,DeborahM. Oh, JackPetit.andRichardRasor. Editdrialassistanceprovidedby CherylAlcorn,RandallR.Bruce,KarenM. Disner, BrendaKoplin,EricaSimmons,andSharonYoung. Coverdesignby RobertKiblerandLarryNichols. Printedin theUnitedStatesof AmericabyMalloy,Inc. Copyright@2008,2006,2004,2002,1999byFredPyrczak,Publisher.All rights reserved.No portionof thisbookmaybereproducedor transmittedin anyformorby any meanswithoutthepriorwrittenpermissionof thepublisher. rsBNl-884s85-79-5
  • 4. Tableof Contents IntroductiontotheFifthEdition What'sNew? Audience Organization SPSSVersions Availabilityof SPSS Conventions Screenshots PracticeExercises Acknowledgments'/ ChapterI GettingStarted Ll t.2 1.3 1.4 1.5 1.6 1.7 Chapter2 EnteringandModifyingData StartingSPSS EnteringData DefiningVariables LoadingandSavingDataFiles RunningYourFirstAnalysis ExaminingandPrintingOutputFiles Modi$ingDataFiles VariablesandDataRepresentation TransformationandSelectionof Data Chapter3 DescriptiveStatistics 3.1 3.2 3.3 3.4 3.5 Chapter4 GraphingData FrequencyDistributionsandpercentileRanksfor a singlevariable FrequencyDistributionsandpercentileRanksfor Multille variables Measuresof CentralTendencyandMeasuresof Dispersion foraSingleGroup Measuresof CentralTendencyandMeasuresof Dispersion for MultipleGroups StandardScores 4l 4l 43 45 49 2.1 ') ') v v v v vi vi vi vi vii vii I I I 2 5 6 8 ll ll t2 l7 t7 20 24 )7 29 29 29 3l 33 36 39 2l Chapter5 PredictionandAssociation 4.1 4.2 4.3 4.4 4.5 4.6 5.1 5.2 5.3 5.4 GraphingBasics TheNewSPSSChartBuilder BarCharts,PieCharts,andHistograms Scatterplots AdvancedBarCharts EditingSPSSGraphs PearsonCorrelation Coefficient SpearmanCorrelation Coefficient SimpleLinear Regression Multiple LinearRegression u,
  • 5. Chapter6 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 Chapter7 7.1 7.2 7.3 7.4 7.5 7.6 Chapter8 8.1 8.2 8.3 8.4 AppendixA AppendixB ParametricInferentialStatistics Reviewof BasicHypothesisTesting Single-Samplet Test Independent-SamplesI Test Paired-Samplest Test One-WayANOVA FactorialANOVA Repeated-MeasuresANOVA Mixed-DesignANOVA Analysisof Covariance MultivariateAnalysisof Variance(MANOVA) NonparametricInferentialStatistics Chi-SquareGoodnessof Fit Chi-SquareTestof Independence Mann-WhitneyUTest WilcoxonTest Kruskal-Wallis,F/Test FriedmanTest TestConstruction Item-TotalAnalysis Cronbach'sAlpha Test-RetestReliability Criterion-RelatedValidiw EffectSize PracticeExerciseDataSets PracticeDataSetI PracticeDataSet2 PracticeDataSet3 Glossary SampleDataFilesUsedin Text COINS.sav GRADES.sav HEIGHT.sav QUESTIONS.sav RACE.sav SAMPLE.sav SAT.sav OtherFiles Informationfor Usersof EarlierVersionsof SPSS GraphingDatawithSPSS13.0and14.0 53 53 )) 58 6l 65 69 72 75 79 8l 85 85 87 .90 93 95 97 99 99 100 l0l t02 103 r09 109 ll0 ll0 lt3AppendixC AppendixD AppendixE AppendixF tt7 n7 ll7 ll7 n7 l18 l18 lt8 lt8 l19 t2l tv
  • 6. ChapterI Section1.1 StartingSPSS ffi$t't**** ffi c rrnoitllttt (- lhoari{irgqrory r,Crcrt*rsrcq.,y urhgDd.b6.Wbrd (i lpanrnaridirgdataura f- Dml*ro* fe tf*E h lholifrra GettingStarted Startup proceduresfor SPSSwill differ slightly,dependingon the exactconfigurationof the machineon which it is installed.On most computers,you can start SPSSby clicking on Start, then clicking on Programs,then on SPSS. On many installations,therewill be an SPSSicon on the desktopthat you can double-clickto start theprogram. When SPSSis started,you may be pre- sentedwith the dialog box to the left, depending on theoptionsyour systemadministratorselected for your versionof the program.If you havethe dialog box, click Type in data and OK, which will presenta blankdata window.' If you were not presentedwith the dialog box to the left, SPSSshouldopenautomatically with a blankdata window. The data window and the output win- dow provide the basic interface for SPSS. A blankdata window is shownbelow. Section1.2 EnteringData One of the keys to success with SPSSis knowing how it stores and usesyour data.To illustratethe basicsof data entry with SPSS,we will useExample1.2.1. Example1.2.1 A surveywasgivento several students from four different classes (Tues/Thurs mom- ings, Tues/Thursafternoons, Mon/Wed/Fri mornings, and Mon/Wed/Fri afternoons). The students were asked r! *9*_r1_*9lt.:g H*n-g:fH"gxr__}rry".** rtlxlel&l *'.1rtlale| lgj'SlfilHl*lml sl el*l I ' Itemsthatappearin the glossaryarepresentedin bold. Italics areusedto indicatemenuitems.
  • 7. ChapterI GeningStarted whetheror not they were "morning people"and whetheror not they worked.This surveyalso askedfor their final gradein the class(100% being the highestgade possible).Theresponsesheetsfrom two studentsarepresentedbelow: ResponseSheetI ID: Dayof class: Classtime: Areyouamorningperson? Finalgradein class: Doyouworkoutsideschool? ResponseSheet2 ID: Dayof class: Classtime: Are you a morningperson? X Yes - No Finalgradein class: Dovouworkoutsideschool? 4593 MWF X TTh Morning X Aftemoon Yes X No 8s% Full-time Part{ime XNo l90l x MwF _ TTh X Morning - Afternoon 83% Full-time X Part-time No Our goal is to enterthe datafrom the two studentsinto SPSSfor usein future analyses.Thefirststepis to determinethevariablesthatneedto beentered.Any informa- tion thatcanvary amongparticipantsis a variablethatneedsto be considered.Example 1.2.2liststhevariableswewill use. Example1.2.2 ID Dayof class Classtime Morningperson Finalgrade Whetheror notthestudentworksoutsideschool In theSPSSdatawindow,columnsrepresentvariablesandrowsrepresentpartici- pants.Therefore,wewill becreatinga datafile with sixcolumns(variables)andtworows (students/participants). Section1.3 Defining Variables Beforewe canenteranydata,we mustfirst entersomebasicinformationabout eachvariableintoSPSS.Forinstance,variablesmustfirstbegivennamesthat: o beginwith aletter; o donotcontainaspace.
  • 8. ChapterI GettingStarted Thus, the variablename"Q7" is acceptable,while the variablename"7Q" is not. Similarly, the variable name "PRE_TEST" is acceptable,but the variable name "PRE TEST" is not. Capitalizationdoesnot matter,but variablenamesare capitalizedin this text to make it clear when we are referringto a variablename,even if the variable nameis not necessarilycapitalizedin screenshots. To definea variable.click on the VariableViewtabat thebottomofthemainscreen.ThiswillshowyoutheVari-@ able Viewwindow. To returnto theData Viewwindow. click on the Data View tab. Fb m u9* o*.*Trqll t!-.G q".E u?x !!p_Ip ,'lul*lEll r"l*l ulhl **l{,lrl EiliEltfil_sJelrl l .lt-*l*lr"$,c"x.l From the Variable Viewscreen,SPSSallows you to createandedit all of the vari- ablesin your datafile. Eachcolumn representssomepropertyof a variable,andeachrow representsa variable.All variablesmust be given a name.To do that, click on the first empty cell in the Name column and type a valid SPSSvariablename.The programwill thenfill in defaultvaluesfor mostof theotherproperties. Oneusefulfunctionof SPSSis theabilityto definevariableandvaluelabels.Vari- able labelsallow you to associatea descriptionwith eachvariable.Thesedescriptionscan describethevariablesthemselvesor thevaluesof thevariables. Value labelsallow you to associatea descriptionwith eachvalueof a variable.For example,for most procedures,SPSSrequiresnumericalvalues.Thus, for datasuchasthe day of the class(i.e., Mon/Wed/Fri and Tues/Thurs),we needto first code the valuesas numbers.We can assignthe numberI to Mon/Wed/Friand the number2to Tues/Thurs. To helpus keeptrackof thenumberswe haveassignedto thevalues,we usevaluelabels. To assignvaluelabels,click in the cell you want to assignvaluesto in the Values column.This will bring up a smallgraybutton(seeanow, below at left). Click on thatbut- ton to bring up theValue Labelsdialog box. 
When you enter a value label, you must click Add aftereachentry.This will J::::*.-,.Tl mOVe the value and itS associated label into the bottom section of the window. When all labels have been added, click OK to return to the Variable Viewwindow. iv*rl** --- v& 12 -Jil s*l !!+ | L.b.f ll6rhl|
  • 9. ChapterI GeningStarred In additionto namingandlabelingthevariable,you havetheoptionof definingthe variabletype.To do so,simply click on theType,Width,or Decimalscolumnsin the Vari- able Viewwindow. The defaultvalue is a numericfield that is eight digits wide with two decimalplacesdisplayed.If your dataaremorethaneightdigitsto the left of the decimal place,theywill be displayedin scientificnotation(e.g.,the number2,000,000,000will be displayedas2.00E+09).'SPSSmaintainsaccuracybeyondtwo decimalplaces,but all out- put will be roundedto two decimalplacesunlessotherwiseindicatedin the Decimals col- umn. In our example,we will beusingnumericvariableswith all of thedefaultvalues. Practice Exercise Createa datafile for the six variablesandtwo samplestudentspresentedin Exam- ple 1.2.1.Nameyour variables:ID, DAY, TIME, MORNING, GRADE, andWORK. You shouldcodeDAY as I : Mon/Wed/Fri,2 = Tues/Thurs.CodeTIME as I : morning,2 : afternoon.CodeMORNING as0 = No, I : Yes.CodeWORK as0: No, I : Part-Time,2 : Full-Time. Be sureyou entervalue labelsfor the different variables.Note that because valuelabelsarenot appropriatefor ID andGRADE, thesearenot coded.When done,your Variable Viewwindow shouldlook like thescreenshotbelow: J -rtrr,d r9"o'ldq${:ilpt"?- "*- .?-- {!,_q,ru.g Click on the Data Viewtab to openthe data-entryscreen.Enter datahorizontally, beginningwith the first student'sID number.Enterthecodefor eachvariablein theappro- priatecolumn;to entertheGRADE variablevalue,enterthestudent'sclassgrade. F.E*UaUar Qgtr Irrddn Anhna gnphr Ufrrs Hhdow E* *lgl dJl blblAl'ri-l-Etetmtototttrslglglqjglej ulFId't lr*lEl&lr6lglolrt' 2 Dependinguponyour versionof SPSS,it maybedisplayedas2.08 + 009.
  • 10. Chapter 1 Getting Started

    The previous data window can be changed to show labels by clicking on the Value Labels icon. In this case, the cells display value labels rather than the corresponding codes. If data are entered in this mode, it is not necessary to enter codes, as clicking the button that appears in each cell as the cell is selected will present a drop-down list of the predefined labels. You may use either method, according to your preference.

    Instead of clicking the Value Labels icon, you may optionally toggle between views by clicking Value Labels under the View menu.

    Section 1.4 Loading and Saving Data Files

    Once you have entered your data, you will need to save it with a unique name for later use so that you can retrieve it when necessary.

    Loading and saving SPSS data files works in the same way as most Windows-based software. Under the File menu, there are Open, Save, and Save As commands. SPSS data files have a ".sav" extension, which is added by default to the end of the filename. This tells Windows that the file is an SPSS data file.

    Save Your Data

    When you save your data file (by clicking File, then clicking Save or Save As to specify a unique name), pay special attention to where you save it. Most systems default to the location <C:\Program Files\SPSS>. You will probably want to save your data on a floppy disk, CD-R, or removable USB drive so that you can take the file with you.

    Load Your Data

    When you load your data (by clicking File, then clicking Open, then Data, or by clicking the open file folder icon), you get a similar window. This window lists all files with the ".sav" extension. If you have trouble locating your saved file, make sure you are looking in the right directory.
  • 11. Chapter 1 Getting Started

    Practice Exercise

    To be sure that you have mastered saving and opening data files, name your sample data file "SAMPLE" and save it to a removable storage medium. Once it is saved, SPSS will display the name of the file at the top of the data window. It is wise to save your work frequently, in case of computer crashes. Note that filenames may be upper- or lowercase. In this text, uppercase is used for clarity.

    After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your data by selecting the "SAMPLE.sav" file you just created.

    Section 1.5 Running Your First Analysis

    Any time you open a data window, you can run any of the analyses available. To get started, we will calculate the students' average grade. (With only two students, you can easily check your answer by hand, but imagine a data file with 10,000 student records.)

    The majority of the available statistical tests are under the Analyze menu. This menu displays all the options available for your version of the SPSS program (the menus in this book were created with SPSS Student Version 15.0). Other versions may have slightly different sets of options.

    To calculate a mean (average), we are asking the computer to summarize our data set. Therefore, we run the command by clicking Analyze, then Descriptive Statistics, then Descriptives.

    This brings up the Descriptives dialog box. Note that the left side of the box contains a list of all the variables in our data file. On the right is an area labeled Variable(s), where we can specify the variables we would like to use in this particular analysis.
  • 12. Chapter 1 Getting Started

    We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking on it). To transfer it to the right window, click on the right arrow between the two windows. The arrow always points to the window opposite the highlighted item and can be used to transfer selected variables in either direction. Note that double-clicking on the variable name will also transfer the variable to the opposite window. Standard Windows conventions of "Shift" clicking or "Ctrl" clicking to select multiple variables can be used as well.

    When we click on the OK button, the analysis will be conducted, and we will be ready to examine our output.

    Section 1.6 Examining and Printing Output Files

    After an analysis is performed, the output is placed in the output window, and the output window becomes the active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created. If you have run previous analyses and saved them, your output is added to the end of your previous output.

    To switch back and forth between the data window and the output window, select the desired window from the Window menu bar.

    The output window is split into two sections. The left section is an outline of the output (SPSS refers to this as the "outline view"). The right section is the output itself.

    Descriptive Statistics
                         N   Minimum   Maximum      Mean   Std. Deviation
    GRADE                2     83.00     85.00   84.0000          1.41421
    Valid N (listwise)   2

    The section on the left of the output window provides an outline of the entire output window. All of the analyses are listed in the order in which they were conducted. Note that this outline can be used to quickly locate a section of the output. Simply click on the section you would like to see, and the right window will jump to the appropriate place.
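Because the sample file holds only two students, the Descriptives output for GRADE can be checked by hand. The following is a minimal sketch using Python's standard `statistics` module, not SPSS itself; it assumes the two grades are 85 and 83 (the values in the sample data) and that the reported standard deviation is the sample standard deviation with an n - 1 denominator.

```python
import statistics

# The two GRADE values in the sample data file.
grades = [85.0, 83.0]

n = len(grades)
mean = statistics.mean(grades)
# stdev() is the sample standard deviation (n - 1 denominator).
sd = statistics.stdev(grades)

print(n, min(grades), max(grades), mean, round(sd, 5))
```

With two scores that are one point either side of the mean, the sample standard deviation is the square root of 2, which rounds to the 1.41421 reported in the output.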
  • 13. Chapter 1 Getting Started

    Clicking on a statistical procedure also selects all of the output for that command. By pressing the Delete key, that output can be deleted from the output window. This is a quick way to be sure that the output window contains only the desired output. Output can also be selected and pasted into a word processor by clicking Edit, then Copy Objects to copy the output. You can then switch to your word processor and click Edit, then Paste.

    To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you run a command, the output is added to the end of your previous output. Thus, you could be printing a very large output file containing information you may not want or need.

    One way to ensure that your output window contains only the results of the current command is to create a new output window just before running the command. To do this, click File, then New, then Output. All your subsequent commands will go into your new output window.

    Practice Exercise

    Load the sample data file you created earlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE and print the output. Your output should look like the example on page 7. Next, select the data window and print it.

    Section 1.7 Modifying Data Files

    Once you have created a data file, it is really quite simple to add additional cases (rows/participants) or additional variables (columns). Consider Example 1.7.1.

    Example 1.7.1

    Two more students provide you with surveys. Their information is:

    Response Sheet 3
    ID: 8734
    Day of class: TTh
    Class time: Morning
    Are you a morning person? No
    Final grade in class: 80%
    Do you work outside school? No

    Response Sheet 4
    ID: 1909
    Day of class: MWF
    Class time: Morning
    Are you a morning person? Yes
    Final grade in class: 73%
    Do you work outside school? Part-time
  • 14. Chapter 1 Getting Started

    To add these data, simply place two additional rows in the Data View window (after loading your sample data). Notice that as new participants are added, the row numbers become bold. When done, the data should look like this:

         ID        DAY           TIME        MORNING   GRADE   WORK
    1    4593.00   Tue/Thu       afternoon   No        85.00   No
    2    1901.00   Mon/Wed/Fri   morning     Yes       83.00   Part-Time
    3    8734.00   Tue/Thu       morning     No        80.00   No
    4    1909.00   Mon/Wed/Fri   morning     Yes       73.00   Part-Time

    New variables can also be added. For example, if the first two participants were given special training on time management, and the two new participants were not, the data file can be changed to reflect this additional information. The new variable could be called TRAINING (whether or not the participant received training), and it would be coded so that 0 = No and 1 = Yes. Thus, the first two participants would be assigned a "1" and the last two participants a "0." To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to update the data.

         ID        DAY           TIME        MORNING   GRADE   WORK        TRAINING
    1    4593.00   Tue/Thu       afternoon   No        85.00   No          Yes
    2    1901.00   Mon/Wed/Fri   morning     Yes       83.00   Part-Time   Yes
    3    8734.00   Tue/Thu       morning     No        80.00   No          No
    4    1909.00   Mon/Wed/Fri   morning     Yes       73.00   Part-Time   No

    Adding data and adding variables are just logical extensions of the procedures we used to originally create the data file. Save this new data file. We will be using it again later in the book.
  • 15. Chapter 1 Getting Started

    Practice Exercise

    Follow the example above (where TRAINING is the new variable). Make the modifications to your SAMPLE.sav data file and save it.
  • 16. Chapter 2 Entering and Modifying Data

    In Chapter 1, we learned how to create a simple data file, save it, perform a basic analysis, and examine the output. In this section, we will go into more detail about variables and data.

    Section 2.1 Variables and Data Representation

    In SPSS, variables are represented as columns in the data file. Participants are represented as rows. Thus, if we collect 4 pieces of information from 100 participants, we will have a data file with 4 columns and 100 rows.

    Measurement Scales

    There are four types of measurement scales: nominal, ordinal, interval, and ratio. While the measurement scale will determine which statistical technique is appropriate for a given set of data, SPSS generally does not discriminate. Thus, we start this section with this warning: If you ask it to, SPSS may conduct an analysis that is not appropriate for your data. For a more complete description of these four measurement scales, consult your statistics text or the glossary in Appendix C.

    Newer versions of SPSS allow you to indicate which types of data you have when you define your variable. You do this using the Measure column. You can indicate Nominal, Ordinal, or Scale (SPSS does not distinguish between interval and ratio scales).

    Look at the sample data file we created in Chapter 1. We calculated a mean for the variable GRADE. GRADE was measured on a ratio scale, and the mean is an acceptable summary statistic (assuming that the distribution is normal).

    We could have had SPSS calculate a mean for the variable TIME instead of GRADE. If we did, SPSS would produce Descriptive Statistics output for TIME.

    The output indicates that the average TIME was 1.25. Remember that TIME was coded as an ordinal variable (1 = morning class, 2 = afternoon class). The mean is not an appropriate statistic for an ordinal scale, but SPSS calculated it anyway. The importance of considering the type of data cannot be overemphasized.
    Just because SPSS will compute a statistic for you does not mean that you should use it.
  • 17. Chapter 2 Entering and Modifying Data

    Later in the text, when specific statistical procedures are discussed, the conditions under which they are appropriate will be addressed.

    Missing Data

    Often, participants do not provide complete data. For some students, you may have a pretest score but not a posttest score. Perhaps one student left one question blank on a survey, or perhaps she did not state her age. Missing data can weaken any analysis. Often, a single missing question can eliminate a subject from all analyses.

    If you have missing data in your data set, leave that cell blank. In the example below, the fourth subject did not complete Question 2. Note that the total score (which is calculated from both questions) is also blank because of the missing data for Question 2. SPSS represents missing data in the data window with a period (although you should not enter a period; just leave the cell blank).

    q1     q2     total
    2.00   2.00   4.00
    3.00   1.00   4.00
    4.00   3.00   7.00
    2.00   .      .
    1.00   2.00   3.00

    Section 2.2 Transformation and Selection of Data

    We often have more data in a data file than we want to include in a specific analysis. For example, our sample data file contains data from four participants, two of whom received special training and two of whom did not. If we wanted to conduct an analysis using only the two participants who did not receive the training, we would need to specify the appropriate subset.

    Selecting a Subset

    We can use the Select Cases command to specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command, the Select Cases dialog box will appear. You can specify which cases (participants) you want to select by using the selection criteria, which appear on the right side of the Select Cases dialog box.
  • 18. Chapter 2 Entering and Modifying Data

    By default, All cases will be selected. The most common way to select a subset is to click If condition is satisfied, then click on the button labeled If. This will bring up a new dialog box that allows you to indicate which cases you would like to use.

    You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be selected. If the logical statement is false, that case will not be selected. For example, you can select all cases that were coded as Mon/Wed/Fri by entering the formula DAY = 1 in the upper-right part of the window. If DAY is 1, then the statement will be true, and SPSS will select the case. If DAY is anything other than 1, the statement will be false, and the case will not be selected. Once you have entered the logical statement, click Continue to return to the Select Cases dialog box. Then, click OK to return to the data window.

    After you have selected the cases, the data window will change slightly. The cases that were not selected will be marked with a diagonal line through the case number. For example, for our sample data, the first and third cases are not selected. Only the second and fourth cases are selected for this subset.

    An additional variable will also be created in your data file. The new variable is called FILTER_$ and indicates whether a case was selected or not.
    If we calculate a mean GRADE using the subset we just selected, we will receive the output below. Notice that we now have a mean of 78.00 with a sample size (N) of 2 instead of 4.

    Descriptive Statistics
                         N   Minimum   Maximum      Mean   Std. Deviation
    GRADE                2     73.00     83.00   78.0000           7.0711
    Valid N (listwise)   2
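The selection logic can be mimicked outside SPSS to see why the subset mean is 78.00. The sketch below is plain Python, not SPSS syntax; the tuple layout and the `filter_flags` list (a stand-in for the FILTER_$ variable) are illustrative assumptions, while the IDs, DAY codes, and grades are the four cases from the sample data.

```python
import statistics

# (ID, DAY code, GRADE) for the four participants; DAY 1 = Mon/Wed/Fri, 2 = Tues/Thurs.
cases = [
    (4593, 2, 85.0),
    (1901, 1, 83.0),
    (8734, 2, 80.0),
    (1909, 1, 73.0),
]

# Analogue of the logical statement DAY = 1: 1 when the test is true, 0 when it is false.
filter_flags = [1 if day == 1 else 0 for _, day, _ in cases]

# Keep only the grades of the selected cases.
selected_grades = [grade for (_, _, grade), flag in zip(cases, filter_flags) if flag == 1]

print(filter_flags)
print(len(selected_grades), statistics.mean(selected_grades),
      round(statistics.stdev(selected_grades), 4))
```

The flags mark the second and fourth cases as selected, matching the diagonal lines drawn through the first and third case numbers.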
  • 19. Chapter 2 Entering and Modifying Data

    Be careful when you select subsets. The subset remains in effect until you run the command again and select all cases. You can tell if you have a subset selected because the bottom of the data window will indicate that a filter is on. In addition, when you examine your output, N will be less than the total number of records in your data set if a subset is selected. The diagonal lines through some cases will also be evident when a subset is selected. Be careful not to save your data file with a subset selected, as this can cause considerable confusion later.

    Computing a New Variable

    SPSS can also be used to compute a new variable or manipulate your existing variables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Q1, Q2, and Q3). The variables represent the number of points each participant received on three different questions. Now enter the data shown on the screen.

    [Screenshot: Data View showing the Q1, Q2, and Q3 scores for the four participants]

    When done, save this data file as "QUESTIONS.sav." We will be using it again in later chapters.

    Now you will calculate the total score for each subject. We could do this manually, but if the data file were large, or if there were a lot of questions, this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform and then click Compute Variable.

    After clicking the Compute Variable command, we get the Compute Variable dialog box. The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word "total."

    Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the two sides of an equation that SPSS will calculate.
  • 20. Chapter 2 Entering and Modifying Data

    For example, total = q1 + q2 + q3 is the equation that is entered in the sample presented here. Note that it is possible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, SPSS will create a new variable called TOTAL and make it equal to the sum of the three questions.

    Save your data file again so that the new variable will be available for future sessions.

    Recoding a Variable: Different Variable

    SPSS can create a new variable based upon data from another variable. Say we want to split our participants on the basis of their total score. We want to create a variable called GROUP, which is coded 1 if the total score is low (less than or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recode into Different Variables.
  • 21. Chapter 2 Entering and Modifying Data

    This will bring up the Recode into Different Variables dialog box. Transfer the variable TOTAL to the middle blank. Type "group" in the Name field under Output Variable. Click Change, and the middle blank will show that TOTAL is becoming GROUP.

    To help keep track of variables that have been recoded, it's a good idea to open the Variable View and enter "Recoded" in the Label column in the TOTAL row. This is especially useful with large datasets which may include many recoded variables.

    Click Old and New Values. This will bring up the Recode dialog box. In this example, we have entered a 9 in the Range, value through HIGHEST field and a 2 in the Value field under New Value. When we click Add, the blank on the right displays the recoding formula. Now enter an 8 on the left in the Range, LOWEST through value blank and a 1 in the Value field under New Value. Click Add, then Continue. Click OK. You will be redirected to the data window. A new variable (GROUP) will have been added and coded as 1 or 2, based on TOTAL.
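The two transformations described in this section, computing TOTAL as q1 + q2 + q3 and recoding it into GROUP, can be sketched in a few lines of Python. This is an illustration of the logic only, not SPSS syntax. The scores below are hypothetical and are not the values in QUESTIONS.sav; the function name `recode_group` is also an assumption, and the sketch assumes totals are whole numbers, so every total is either 8 or less or 9 or more.

```python
# Hypothetical scores for illustration only; these are NOT the QUESTIONS.sav values.
scores = [
    {"q1": 3, "q2": 3, "q3": 4},
    {"q1": 2, "q2": 2, "q3": 3},
]


def recode_group(total):
    """Return 1 for a low total (8 or less) and 2 for a high total (9 or more)."""
    return 1 if total <= 8 else 2


for row in scores:
    # Compute step: total = q1 + q2 + q3
    row["total"] = row["q1"] + row["q2"] + row["q3"]
    # Recode step: the new variable is derived from total; total itself is unchanged.
    row["group"] = recode_group(row["total"])

print([(row["total"], row["group"]) for row in scores])
```

As with Recode into Different Variables, the original variable is left intact and the recoded values go into a separate variable.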
  • 22. Chapter 3 Descriptive Statistics

    In Chapter 2, we discussed many of the options available in SPSS for dealing with data. Now we will discuss ways to summarize our data. The procedures used to describe and summarize data are called descriptive statistics.

    Section 3.1 Frequency Distributions and Percentile Ranks for a Single Variable

    Description

    The Frequencies command produces frequency distributions for the specified variables. The output includes the number of occurrences, percentages, valid percentages, and cumulative percentages. The valid percentages and the cumulative percentages comprise only the data that are not designated as missing.

    The Frequencies command is useful for describing samples where the mean is not useful (e.g., nominal or ordinal scales). It is also useful as a method of getting the feel of your data. It provides more information than just a mean and standard deviation and can be useful in determining skew and identifying outliers. A special feature of the command is its ability to determine percentile ranks.

    Assumptions

    Cumulative percentages and percentiles are valid only for data that are measured on at least an ordinal scale. Because the output contains one line for each value of a variable, this command works best on variables with a relatively small number of values.

    Drawing Conclusions

    The Frequencies command produces output that indicates both the number of cases in the sample with a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages of cases in the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentage and/or percentiles can be drawn.

    SPSS Data Format

    The SPSS data file for obtaining frequency distributions requires only one variable, and that variable can be of any type.
  • 23. Chapter 3 Descriptive Statistics

    Creating a Frequency Distribution

    To run the Frequencies command, click Analyze, then Descriptive Statistics, then Frequencies. (This example uses the CARS.sav data file that comes with SPSS. It is typically located at <C:\Program Files\SPSS\Cars.sav>.)

    This will bring up the main dialog box. Transfer the variable for which you would like a frequency distribution into the Variable(s) blank to the right. Be sure that the Display frequency tables option is checked. Click OK to receive your output.

    Note that the dialog boxes in newer versions of SPSS show both the type of variable (the icon immediately left of the variable name) and the variable labels if they are entered. Thus, the variable YEAR shows up in the dialog box as Model Year (modulo 100).

    Output for a Frequency Distribution

    The output consists of two sections. The first section indicates the number of records with valid data for each variable selected. Records with a blank score are listed as missing. In this example, the data file contained 406 records. Notice that the variable label is Model Year (modulo 100).

    Statistics
    Model Year (modulo 100)
    N   Valid     405
        Missing     1

    The second section of the output contains a cumulative frequency distribution for each variable selected. At the top of the section, the variable label is given. The output itself consists of five columns. The first column lists the values of the variable in sorted order. There is a row for each value of your variable, and additional rows are added at the bottom for the Total and Missing data. The second column gives the frequency of each value, including missing values.
    The third column gives the percentage of all records (including records with missing data) for each value. The fourth column, labeled Valid Percent, gives the percentage of records (without including records with missing data) for each value. If there were any missing values, these values would be larger than the values in column three because the total number of records would have been reduced by the number of records with missing values.

    Model Year (modulo 100)
                              Frequency   Percent   Valid Percent   Cumulative Percent
    Valid     70                     34       8.4             8.4                  8.4
              71                     29       7.1             7.2                 15.6
              72                     28       6.9             6.9                 22.5
              73                     40       9.9             9.9                 32.3
              74                     27       6.7             6.7                 39.0
              75                     30       7.4             7.4                 46.4
              76                     34       8.4             8.4                 54.8
              77                     28       6.9             6.9                 61.7
              78                     36       8.9             8.9                 70.6
              79                     29       7.1             7.2                 77.8
              80                     29       7.1             7.2                 84.9
              81                     30       7.4             7.4                 92.3
              82                     31       7.6             7.7                100.0
              Total                 405      99.8           100.0
    Missing   0 (Missing)             1        .2
    Total                           406     100.0
  • 24. Chapter 3 Descriptive Statistics

    The final column gives cumulative percentages. Cumulative percentages indicate the percentage of records with a score equal to or smaller than the current value. Thus, the last value is always 100%. These values are equivalent to percentile ranks for the values listed.

    Determining Percentile Ranks

    The Frequencies command can be used to provide a number of descriptive statistics, as well as a variety of percentile values (including quartiles, cut points, and scores corresponding to a specific percentile rank). To obtain either the descriptive or percentile functions of the Frequencies command, click the Statistics button at the bottom of the main dialog box.

    This brings up the Frequencies: Statistics dialog box. Note that the Central Tendency and Dispersion sections of this box are useful for calculating values, such as the Median or Mode, which cannot be calculated with the Descriptives command (see Section 3.3). Check any additional desired statistic by clicking on the blank next to it. For percentiles, enter the desired percentile rank in the blank to the right of the Percentile(s) label. Then, click Add to add it to the list of percentiles requested. Once you have selected all your required statistics, click Continue to return to the main dialog box. Click OK.

    Output for Percentile Ranks

    The Statistics dialog box adds on to the previous output from the Frequencies command. The new section of the output is shown below.

    Statistics
    Model Year (modulo 100)
    N             Valid       405
                  Missing       1
    Percentiles   25        73.00
                  50        76.00
                  75        79.00
                  80        80.00

    The output contains a row for each piece of information you requested. In the example above, we checked Quartiles and asked for the 80th percentile.
    Thus, the output contains rows for the 25th, 50th, 75th, and 80th percentiles.
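The relationship between the Percent, Valid Percent, and Cumulative Percent columns can be made concrete with a short sketch. It is written in Python, not SPSS; the function name `frequency_table` and the small data set are hypothetical, with `None` standing in for a missing value. It does not attempt to reproduce how SPSS computes percentile values.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, count, percent, valid percent, cumulative percent).

    None marks a missing value. Percent is based on all records;
    valid and cumulative percent are based on non-missing records only.
    """
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    running = 0
    for value in sorted(counts):
        count = counts[value]
        running += count
        rows.append((
            value,
            count,
            100 * count / len(values),
            100 * count / len(valid),
            100 * running / len(valid),
        ))
    return rows


# Hypothetical model-year data with one missing record.
years = [70, 70, 71, 72, 72, 72, None, 73]
for row in frequency_table(years):
    print(row)
```

Because the missing record is dropped from the denominator, each valid percentage is slightly larger than the corresponding overall percentage, and the cumulative column ends at 100.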
  • 25. Chapter 3 Descriptive Statistics

    Practice Exercise

    Using Practice Data Set 1 in Appendix B, create a frequency distribution table for the mathematics skills scores. Determine the mathematics skills score at which the 60th percentile lies.

    Section 3.2 Frequency Distributions and Percentile Ranks for Multiple Variables

    Description

    The Crosstabs command produces frequency distributions for multiple variables. The output includes the number of occurrences of each combination of levels of each variable. It is possible to have the command give percentages for any or all variables.

    The Crosstabs command is useful for describing samples where the mean is not useful (e.g., nominal or ordinal scales). It is also useful as a method for getting a feel for your data.

    Assumptions

    Because the output contains a row or column for each value of a variable, this command works best on variables with a relatively small number of values.

    SPSS Data Format

    The SPSS data file for the Crosstabs command requires two or more variables. Those variables can be of any type.

    Running the Crosstabs Command

    This example uses the SAMPLE.sav data file, which you created in Chapter 1. To run the procedure, click Analyze, then Descriptive Statistics, then Crosstabs. This will bring up the main Crosstabs dialog box.

    The dialog box initially lists all variables on the left and contains two blanks labeled Row(s) and Column(s). Enter one variable (TRAINING) in the Row(s) box. Enter the second (WORK) in the Column(s) box. To analyze more than two variables, you would enter the third, fourth, etc., in the unlabeled area (just under the Layer indicator).
  • 26. Chapter 3 Descriptive Statistics

    The Cells button allows you to specify percentages and other information to be generated for each combination of values. Click Cells, and you will get a dialog box listing the available options. For the example presented here, check Row, Column, and Total percentages. Then click Continue. This will return you to the Crosstabs dialog box. Click OK to run the analysis.

    Interpreting Crosstabs Output

    The output consists of a contingency table. Each level of WORK is given a column. Each level of TRAINING is given a row. In addition, a row is added for total, and a column is added for total.

    TRAINING * WORK Crosstabulation
                                               WORK
                                            No   Part-Time    Total
    TRAINING   Yes   Count                   1           1        2
                     % within TRAINING   50.0%       50.0%   100.0%
                     % within WORK       50.0%       50.0%    50.0%
                     % of Total          25.0%       25.0%    50.0%
               No    Count                   1           1        2
                     % within TRAINING   50.0%       50.0%   100.0%
                     % within WORK       50.0%       50.0%    50.0%
                     % of Total          25.0%       25.0%    50.0%
    Total            Count                   2           2        4
                     % within TRAINING   50.0%       50.0%   100.0%
                     % within WORK      100.0%      100.0%   100.0%
                     % of Total          50.0%       50.0%   100.0%

    Each cell contains the number of participants (e.g., one participant received no training and does not work; two participants received no training, regardless of employment status).

    The percentages for each cell are also shown. Row percentages add up to 100% horizontally. Column percentages add up to 100% vertically. For example, of all the individuals who had no training, 50% did not work and 50% worked part-time (using the "% within TRAINING" row). Of the individuals who did not work, 50% had no training and 50% had training (using the "% within WORK" row).

    Practice Exercise

    Using Practice Data Set 1 in Appendix B, create a contingency table using the Crosstabs command. Determine the number of participants in each combination of the variables SEX and MARITAL. What percentage of participants is married? What percentage of participants is male and married?
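The counts and the three kinds of percentages in the TRAINING by WORK table discussed in Section 3.2 can be reproduced with a small sketch. It uses Python's `collections.Counter` rather than SPSS, and the variable names are illustrative; the four (TRAINING, WORK) pairs are taken from the sample data file.

```python
from collections import Counter

# (TRAINING, WORK) for the four participants in the sample data file.
pairs = [("Yes", "No"), ("Yes", "Part-Time"), ("No", "No"), ("No", "Part-Time")]

cell_counts = Counter(pairs)
row_totals = Counter(training for training, _ in pairs)   # totals per TRAINING level
col_totals = Counter(work for _, work in pairs)            # totals per WORK level
n = len(pairs)

for (training, work), count in sorted(cell_counts.items()):
    print(
        training, work, count,
        f"row {100 * count / row_totals[training]:.1f}%",   # % within TRAINING
        f"col {100 * count / col_totals[work]:.1f}%",       # % within WORK
        f"total {100 * count / n:.1f}%",                    # % of Total
    )
```

Each cell's count is divided by a different denominator for each percentage: its row total, its column total, or the overall number of participants.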
    Section 3.3 Measures of Central Tendency and Measures of Dispersion for a Single Group

    Description

    Measures of central tendency are values that represent a typical member of the sample or population. The three primary types are the mean, median, and mode. Measures of dispersion tell you the variability of your scores. The primary types are the range and the standard deviation. Together, a measure of central tendency and a measure of dispersion provide a great deal of information about the entire data set.
  • 27. Chapter 3 Descriptive Statistics

    We will discuss these measures of central tendency and measures of dispersion in the context of the Descriptives command. Note that many of these statistics can also be calculated with several other commands (e.g., the Frequencies or Compare Means commands are required to compute the mode or median, for instance through the Statistics option of the Frequencies command).

    Assumptions

    Each measure of central tendency and measure of dispersion has different assumptions associated with it. The mean is the most powerful measure of central tendency, and it has the most assumptions. For example, to calculate a mean, the data must be measured on an interval or ratio scale. In addition, the distribution should be normally distributed or, at least, not highly skewed. The median requires at least ordinal data. Because the median indicates only the middle score (when scores are arranged in order), there are no assumptions about the shape of the distribution. The mode is the weakest measure of central tendency. There are no assumptions for the mode.

    The standard deviation is the most powerful measure of dispersion, but it, too, has several requirements. It is a mathematical transformation of the variance (the standard deviation is the square root of the variance). Thus, if one is appropriate, the other is also. The standard deviation requires data measured on an interval or ratio scale. In addition, the distribution should be normal. The range is the weakest measure of dispersion. To calculate a range, the variable must be at least ordinal. For nominal scale data, the entire frequency distribution should be presented as a measure of dispersion.
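To make the measures above concrete, here is a minimal sketch that computes them for the four GRADE values in the sample data file. It uses Python's standard `statistics` module rather than SPSS, and it assumes the sample (n - 1) forms of the variance and standard deviation.

```python
import math
import statistics

# GRADE values for the four participants in the sample data file.
grades = [85.0, 83.0, 80.0, 73.0]

mean = statistics.mean(grades)
median = statistics.median(grades)          # middle of the ordered scores
value_range = max(grades) - min(grades)
variance = statistics.variance(grades)      # sample variance (n - 1 denominator)
sd = statistics.stdev(grades)               # sample standard deviation

print(mean, median, value_range, round(variance, 4), round(sd, 5))
```

The standard deviation returned here is the square root of the variance, which is the relationship the Assumptions discussion relies on when it says that if one is appropriate, the other is also.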
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus, when reporting a mean, you should also report a standard deviation. When presenting a median, you should also state the range or interquartile range.
SPSS Data Format
Only one variable is required.
  • 28. Chapter 3 Descriptive Statistics
Running the Command
The Descriptives command will be the command you will most likely use for obtaining measures of central tendency and measures of dispersion. This example uses the SAMPLE.sav data file we have used in the previous chapters.
To run the command, click Analyze, then Descriptive Statistics, then Descriptives. This will bring up the main dialog box for the Descriptives command. Any variables you would like information about can be placed in the right blank by double-clicking them or by selecting them, then clicking on the arrow.
By default, you will receive the N (number of cases/participants), the minimum value, the maximum value, the mean, and the standard deviation. Note that some of these may not be appropriate for the type of data you have selected.
If you would like to change the default statistics that are given, click Options in the main dialog box. You will be given the Options dialog box presented here.
[Screenshot: Options dialog box for the Descriptives command]
Reading the Output
The output for the Descriptives command is quite straightforward. Each type of output requested is presented in a column, and each variable is given in a row. The output presented here is for the sample data file. It shows that we have one variable (GRADE) and that we obtained the N, minimum, maximum, mean, and standard deviation for this variable.

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
grade                4    73.00     85.00     80.2500   5.25198
Valid N (listwise)   4
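For readers who want to check the Descriptives output by hand, the following is a minimal sketch in Python rather than SPSS. It assumes the four GRADE values are 85, 80, 83, and 73 (the cell means listed in the Means report later in this chapter), and the function name describe is hypothetical. It also reports the median, mode, and range, which the Descriptives defaults do not include.

```python
import statistics

# Assumed GRADE values for the four students in SAMPLE.sav, read from the
# Means report later in the chapter; the order does not matter here.
grades = [85, 80, 83, 73]

def describe(scores):
    """Return the measures discussed in this section for one variable."""
    return {
        "n": len(scores),
        "minimum": min(scores),
        "maximum": max(scores),
        "range": max(scores) - min(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        # multimode returns every most-frequent value; when no score
        # repeats, all scores come back and the mode is not informative.
        "modes": statistics.multimode(scores),
        # Sample standard deviation (n - 1 in the denominator).
        "std_dev": statistics.stdev(scores),
    }

summary = describe(grades)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions, the mean and standard deviation should agree with the table above. The standard deviation uses n - 1 in the denominator, which is an assumption about how SPSS computes it.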
  • 29. Chapter 3 Descriptive Statistics
Practice Exercise
Using Practice Data Set 1 in Appendix B, obtain the descriptive statistics for the age of the participants. What is the mean? The median? The mode? What is the standard deviation? Minimum? Maximum? The range?
Section 3.4 Measures of Central Tendency and Measures of Dispersion for Multiple Groups
Description
The measures of central tendency discussed earlier are often needed not only for the entire data set, but also for several subsets. One way to obtain these values for subsets would be to use the data-selection techniques discussed in Chapter 2 and apply the Descriptives command to each subset. An easier way to perform this task is to use the Means command. The Means command is designed to provide descriptive statistics for subsets of your data.
Assumptions
The assumptions discussed in the section on Measures of Central Tendency and Measures of Dispersion for a Single Group (Section 3.3) also apply to multiple groups.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus, when giving a mean, you should also report a standard deviation. When presenting a median, you should also state the range or interquartile range.
SPSS Data Format
Two variables in the SPSS data file are required. One represents the dependent variable and will be the variable for which you receive the descriptive statistics. The other is the independent variable and will be used in creating the subsets. Note that while SPSS calls this variable an independent variable, it may not meet the strict criteria that define a true independent variable (e.g., treatment manipulation). Thus, some SPSS procedures refer to it as the grouping variable.
Running the Command
This example uses the SAMPLE.sav data file you created in Chapter 1.
The Means command is run by clicking Analyze, then Compare Means, then Means. This will bring up the main dialog box for the Means command. Place the selected variable in the blank field labeled Dependent List.
  • 30. Chapter 3 Descriptive Statistics
Place the grouping variable in the box labeled Independent List. In this example, through use of the SAMPLE.sav data file, measures of central tendency and measures of dispersion for the variable GRADE will be given for each level of the variable MORNING.
By default, the mean, number of cases, and standard deviation are given. If you would like additional measures, click Options and you will be presented with the dialog box shown here. You can opt to include any number of measures.
[Screenshot: Options dialog box for the Means command]
Reading the Output
The output for the Means command is split into two sections. The first section, called a case processing summary, gives information about the data used. In our sample data file, there are four students (cases), all of whom were included in the analysis.

Case Processing Summary
                                 Cases
                   Included        Excluded        Total
                   N    Percent    N    Percent    N    Percent
grade * morning    4    100.0%     0    .0%        4    100.0%
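The breakdown that the Means command performs can be sketched outside SPSS as a simple group-by. The example below is a hedged illustration in Python, not SPSS syntax: the (MORNING, GRADE) pairs are assumed from the Means report in this section, and the function name means_by_group is hypothetical.

```python
import statistics
from collections import defaultdict

# Assumed re-creation of the four SAMPLE.sav cases as (MORNING, GRADE)
# pairs, taken from the cell means in the Means report in this section.
cases = [("No", 85), ("No", 80), ("Yes", 83), ("Yes", 73)]

def means_by_group(rows):
    """Return {level: (mean, n, sample SD)} for each level, plus a Total row."""
    groups = defaultdict(list)
    for level, score in rows:
        groups[level].append(score)
    groups["Total"] = [score for _, score in rows]

    report = {}
    for level, scores in groups.items():
        # A standard deviation needs at least two scores; a cell with a
        # single case has none, so None is stored instead.
        sd = statistics.stdev(scores) if len(scores) > 1 else None
        report[level] = (statistics.mean(scores), len(scores), sd)
    return report

report = means_by_group(cases)
for level, (mean, n, sd) in report.items():
    print(level, mean, n, sd)
```

The Total row is computed from all cases together, which is why it matches what the Descriptives command gives for the same variable.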
  • 31. Chapter 3 Descriptive Statistics
The second section of the output is the report from the Means command. This report lists the name of the dependent variable at the top (GRADE). Every level of the independent variable (MORNING) is shown in a row in the table. In this example, the levels are 0 and 1, labeled No and Yes. Note that if a variable is labeled, the labels will be used instead of the raw values.

Report
GRADE
MORNING   Mean      N    Std. Deviation
No        82.5000   2    3.53553
Yes       78.0000   2    7.07107
Total     80.2500   4    5.25198

The summary statistics given in the report correspond to the data, where the level of the independent variable is equal to the row heading (e.g., No, Yes). Thus, two participants were included in each row.
An additional row is added, named Total. That row contains the combined data, and the values are the same as they would be if we had run the Descriptives command for the variable GRADE.
Extension to More Than One Independent Variable
If you have more than one independent variable, SPSS can break down the output even further. Rather than adding more variables to the Independent List section of the dialog box, you need to add them in a different layer. Note that SPSS indicates with which layer you are working. If you click Next, you will be presented with Layer 2 of 2, and you can select a second independent variable (e.g., TRAINING). Now, when you run the command (by clicking OK), you will be given summary statistics for the variable GRADE by each level of MORNING and TRAINING.
Your output will look like the output below. You now have two main sections (No and Yes), along with the Total. Now, however, each main section is broken down into subsections (No, Yes, and Total).

Report
GRADE
MORNING   TRAINING   Mean      N    Std. Deviation
No        Yes        85.0000   1
          No         80.0000   1
          Total      82.5000   2    3.53553
Yes       Yes        83.0000   1
          No         73.0000   1
          Total      78.0000   2    7.07107
Total     Yes        84.0000   2    1.41421
          No         76.5000   2    4.94975
          Total      80.2500   4    5.25198

The variable you used in Layer 1 (MORNING) is the first one listed, and it defines the main sections. The variable you had in Layer 2 (TRAINING) is listed second. Thus, the first row represents those participants who were not morning people and who received training. The second row represents participants who were not morning people and did not receive training. The third row represents the total for all participants who were not morning people.
  • 32. Chapter 3 Descriptive Statistics
Notice that standard deviations are not given for all of the rows. This is because there is only one participant per cell in this example. One problem with using many subsets is that it increases the number of participants required to obtain meaningful results. See a research design text or your instructor for more details.
Practice Exercise
Using Practice Data Set 1 in Appendix B, compute the mean and standard deviation of ages for each value of marital status. What is the average age of the married participants? The single participants? The divorced participants?
Section 3.5 Standard Scores
Description
Standard scores allow the comparison of different scales by transforming the scores into a common scale. The most common standard score is the z-score. A z-score is based on a standard normal distribution (i.e., a mean of 0 and a standard deviation of 1). A z-score, therefore, represents the number of standard deviations above or below the mean (e.g., a z-score of -1.5 represents a score 1.5 standard deviations below the mean).
Assumptions
Z-scores are based on the standard normal distribution. Therefore, the distributions that are converted to z-scores should be normally distributed, and the scales should be either interval or ratio.
Drawing Conclusions
Conclusions based on z-scores consist of the number of standard deviations above or below the mean. For example, a student scores 85 on a mathematics exam in a class that has a mean of 70 and standard deviation of 5. The student's test score is 15 points above the class mean (85 - 70 = 15). The student's z-score is 3 because she scored 3 standard deviations above the mean (15 ÷ 5 = 3).
If the same student scores 90 on a reading exam, with a class mean of 80 and a standard deviation of 10, the z-score will be 1.0 because she is one standard deviation above the mean. Thus, even though her raw score was higher on the reading test, she actually did better in relation to other students on the mathematics test because her z-score was higher on that test.
SPSS Data Format
Calculating z-scores requires only a single variable in SPSS. That variable must be numerical.
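The arithmetic in the exam example can be written as a one-line formula, z = (score - mean) / standard deviation. The sketch below, in Python rather than SPSS, applies it to the two exams and then to a whole variable. The function names z_score and standardize are hypothetical, and the use of the sample standard deviation (n - 1) when standardizing a variable is an assumption about what SPSS does.

```python
import statistics

def z_score(raw, mean, sd):
    """Standard deviations that raw lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

# The two exams from the worked example in this section.
math_z = z_score(85, mean=70, sd=5)
reading_z = z_score(90, mean=80, sd=10)
print(math_z, reading_z)

def standardize(scores):
    """Convert every score in one variable to a z-score.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [z_score(x, mean, sd) for x in scores]

# Standardized scores always have a mean of 0 and a standard deviation of 1.
z_values = standardize([85, 80, 83, 73])
print([round(z, 3) for z in z_values])
```

Because the mathematics z-score is larger than the reading z-score, the comparison reaches the same conclusion as the text: the student did better, relative to her classmates, on the mathematics exam.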
  • 33. Chapter 3 Descriptive Statistics
Running the Command
Computing z-scores is a component of the Descriptives command. To access it, click Analyze, then Descriptive Statistics, then Descriptives. This example uses the sample data file (SAMPLE.sav) created in Chapters 1 and 2.
This will bring up the standard dialog box for the Descriptives command. Notice the checkbox in the bottom-left corner labeled Save standardized values as variables. Check this box and move the variable GRADE into the right-hand blank. Then click OK to complete the analysis. You will be presented with the standard output from the Descriptives command. Notice that the z-scores are not listed. They were inserted into the data window as a new variable.
Switch to the Data View window and examine your data file. Notice that a new variable, called ZGRADE, has been added. When you asked SPSS to save standardized values, it created a new variable with the same name as your old variable preceded by a Z. The z-score is computed for each case and placed in the new variable.
Reading the Output
After you conducted your analysis, the new variable was created. You can perform any number of subsequent analyses on the new variable.
Practice Exercise
Using Practice Data Set 2 in Appendix B, determine the z-score that corresponds to each employee's salary. Determine the mean z-scores for salaries of male employees and female employees. Determine the mean z-score for salaries of the total sample.
  • 34. Chapter 4 Graphing Data
Section 4.1 Graphing Basics
In addition to the frequency distributions, the measures of central tendency and measures of dispersion discussed in Chapter 3, graphing is a useful way to summarize, organize, and reduce your data. It has been said that a picture is worth a thousand words. In the case of complicated data sets, this is certainly true.
With Version 15.0 of SPSS, it is now possible to make publication-quality graphs using only SPSS. One important advantage of using SPSS to create your graphs instead of other software (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus, duplication is eliminated, and the chance of making a transcription error is reduced.
Section 4.2 The New SPSS Chart Builder
Data Set
For the graphing examples, we will use a new set of data. Enter the data below by defining the three subject variables in the Variable View window: HEIGHT (in inches), WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you create the variables, designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure (in the far-right column of the Variable View). Switch to the Data View to enter the data values for the 16 participants. Now use the Save As command to save the file, naming it HEIGHT.sav.

HEIGHT   WEIGHT   SEX
66       150      1
69       155      1
73       160      1
72       160      1
68       150      1
63       140      1
74       165      1
70       150      1
66       110      2
64       100      2
60        95      2
67       110      2
64       105      2
63       100      2
67       110      2
65       105      2
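One way to guard against typing errors, sketched here in Python rather than SPSS, is to write the 16 cases down once and compare their means and standard deviations with the check values the book reports for this data set (means of 66.9375, 129.0625, and 1.5). The variable names are illustrative, and the third height is read as 73, the value consistent with the reported mean of 66.9375.

```python
import statistics

# The 16 cases of HEIGHT.sav, one position per participant.
heights = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weights = [150, 155, 160, 160, 150, 140, 165, 150,
           110, 100, 95, 110, 105, 100, 110, 105]
sexes = [1] * 8 + [2] * 8  # 1 = male, 2 = female

# Every variable must have one value per participant.
assert len(heights) == len(weights) == len(sexes) == 16

checks = {}
for name, values in (("HEIGHT", heights), ("WEIGHT", weights), ("SEX", sexes)):
    checks[name] = {
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample SD (n - 1)
    }
    print(name, checks[name])
```

A mean that does not match the check value usually points to a single mistyped case, so it is worth rereading the column before going further.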
  • 35. Chapter 4 Graphing Data
Make sure you have entered the data correctly by calculating a mean for each of the three variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare your results with those in the table below.

Descriptive Statistics
                     N     Minimum   Maximum   Mean       Std. Deviation
HEIGHT               16    60.00     74.00     66.9375    3.9067
WEIGHT               16    95.00     165.00    129.0625   26.3451
SEX                  16    1.00      2.00      1.5000     .5164
Valid N (listwise)   16

Chart Builder Basics
Make sure that the HEIGHT.sav data file you created above is open. In order to use the chart builder, you must have a data file open.
New with Version 15.0 of SPSS is the Chart Builder command. This command is accessed using Graphs, then Chart Builder in the submenu. This is a very versatile new command that can make graphs of excellent quality.
When you first run the Chart Builder command, you will probably be presented with the following dialog box:
[Dialog box: "Before you use this dialog, measurement level should be set properly for each variable in your chart. In addition, if your chart contains categorical variables, value labels should be defined for each category. Press OK to define your chart. Press Define Variable Properties to set measurement level or define value labels for chart variables."]
This dialog box is asking you to ensure that your variables are properly defined. Refer to Sections 1.3 and 2.1 if you had difficulty defining the variables used in creating the data set for this example, or to refresh your knowledge of this topic. Click OK.
The Chart Builder allows you to make any kind of graph that is normally used in publication or presentation, and much of it is beyond the scope of this text. This text, however, will go over the basics of the Chart Builder so that you can understand its mechanics.
On the left side of the Chart Builder window are the four main tabs that let you control the graphs you are making. The first one is the Gallery tab. The Gallery tab allows you to choose the basic format of your graph.
  • 36. Chapter 4 Graphing Data
For example, the screenshot here shows the different kinds of bar charts that the Chart Builder can create.
[Screenshot: Chart Builder window with the Gallery tab showing bar chart types]
After you have selected the basic form of graph that you want using the Gallery tab, you simply drag the image from the bottom right of the window up to the main window at the top (where it reads, "Drag a Gallery chart here to use it as your starting point").
Alternatively, you can use the Basic Elements tab to drag a coordinate system (labeled Choose Axes) to the top window, then drag variables and elements into the window.
The other tabs (Groups/Point ID and Titles/Footnotes) can be used for adding other standard elements to your graphs.
The examples in this text will cover some of the basic types of graphs you can make with the Chart Builder. After a little experimentation on your own, once you have mastered the examples in the chapter, you will soon gain a full understanding of the Chart Builder.
Section 4.3 Bar Charts, Pie Charts, and Histograms
Description
Bar charts, pie charts, and histograms represent the number of times each score occurs through the varying heights of bars or sizes of pie pieces. They are graphical representations of the frequency distributions discussed in Chapter 3.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the sample with a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages for the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentages and/or percentiles can also be drawn.
SPSS Data Format
You need only one variable to use this command.
  • 37. Chapter 4 Graphing Data
Running the Command
The Frequencies command will produce graphical frequency distributions. Click Analyze, then Descriptive Statistics, then Frequencies. You will be presented with the main dialog box for the Frequencies command, where you can enter the variables for which you would like to create graphs or charts. (See Chapter 3 for other options with this command.)
Click the Charts button at the bottom to produce frequency distributions. This will give you the Charts dialog box.
There are three types of charts available with this command: Bar charts, Pie charts, and Histograms. For each type, the Y axis can be either a frequency count or a percentage (selected with the Chart Values option).
You will receive the charts for any variables selected in the main Frequencies command dialog box.
Output
The bar chart consists of a Y axis, representing the frequency, and an X axis, representing each score. Note that the only values represented on the X axis are those values with nonzero frequencies (61, 62, and 71 are not represented).
[Bar chart: frequency of each height value]
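The counts behind such a bar chart can be reproduced with a simple tally. The following is a rough sketch in Python, not SPSS output; it uses the HEIGHT values entered for HEIGHT.sav earlier in this chapter and shows why heights of 61, 62, and 71 get no bar.

```python
from collections import Counter

# HEIGHT values (in inches) for the 16 participants in HEIGHT.sav.
heights = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]

counts = Counter(heights)
n = len(heights)

# One row per observed score: frequency count and percentage, the two
# options offered for the Y axis.
for score in sorted(counts):
    print(f"{score}: {counts[score]} ({100 * counts[score] / n:.1f}%)")

# Whole-inch scores inside the observed range that never occur.
missing = [s for s in range(min(heights), max(heights) + 1) if s not in counts]
print("not represented:", missing)
```

A histogram differs from this tally only in that neighboring scores are first pooled into evenly spaced groups before they are counted.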
  • 38. Chapter 4 Graphing Data
The pie chart shows the percentage of the whole that is represented by each value.
[Pie chart: height]
The Histogram command creates a grouped frequency distribution. The range of scores is split into evenly spaced groups. The midpoint of each group is plotted on the X axis, and the Y axis represents the number of scores for each group.
If you select With Normal Curve, a normal curve will be superimposed over the distribution. This is very useful in determining if the distribution you have is approximately normal. The distribution represented here is clearly not normal due to the asymmetry of the values.
[Histogram: height, with normal curve]
Practice Exercise
Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a histogram that represents the mathematics skills scores and displays a normal curve, and a bar chart that represents the frequencies for the variable AGE.
Section 4.4 Scatterplots
Description
Scatterplots (also called scattergrams or scatter diagrams) display two values for each case with a mark on the graph. The X axis represents the value for one variable. The Y axis represents the value for the second variable.
  • 39. Chapter 4 Graphing Data
Assumptions
Both variables should be interval or ratio scales. If nominal or ordinal data are used, be cautious about your interpretation of the scattergram.
SPSS Data Format
You need two variables to perform this command.
Running the Command
You can produce scatterplots by clicking Graphs, then Chart Builder. (Note: You can also use the Legacy Dialogs. For this method, please see Appendix F.)
In the Gallery tab, under Choose from, select Scatter/Dot. Then drag the Simple Scatter icon (top left) up to the main chart area as shown in the screenshot. Disregard the Element Properties window that pops up by choosing Close.
[Screenshot: Chart Builder with the Simple Scatter icon dragged to the chart area]
Next, drag the HEIGHT variable to the X-Axis area, and the WEIGHT variable to the Y-Axis area (remember that standard graphing conventions indicate that dependent variables should be Y and independent variables should be X; this would mean that we are trying to predict weights from heights). At this point, your screen should look like the example below. Note that your actual data are not shown, just a set of dummy values.
[Screenshot: Chart Builder with HEIGHT on the X axis and WEIGHT on the Y axis]
Click OK. You should get your new graph (next page) as Output.
  • 40. Chapter 4 Graphing Data
Output
The output will consist of a mark for each participant at the appropriate X and Y levels.
[Scatterplot: weight by height]
Adding a Third Variable
Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To make it do so, select the Groups/Point ID tab in the Chart Builder. Click the Grouping/stacking variable option. Again, disregard the Element Properties window that pops up. Next, drag the variable SEX into the upper-right corner where it indicates Set Color. When this is done, your screen should look like the image shown here. If you are not able to drag the variable SEX, it may be because it is not identified as nominal or ordinal in the Variable View window.
[Screenshot: Chart Builder with SEX dragged to the Set Color area]
Click OK to have SPSS produce the graph.
  • 41. Chapter 4 Graphing Data
Now our output will have two different sets of marks. One set represents the male participants, and the second set represents the female participants. These two sets will appear in two different colors on your screen. You can use the SPSS chart editor (see Section 4.6) to make them different shapes, as shown in the example below.
[Scatterplot: weight by height, with separate markers for each sex]
Practice Exercise
Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the relationship between SALARY and EDUCATION.
Section 4.5 Advanced Bar Charts
Description
Bar charts can be produced with the Frequencies command (see Section 4.3). Sometimes, however, we are interested in a bar chart where the Y axis is not a frequency. To produce such a chart, we need to use the Bar charts command.
SPSS Data Format
You need at least two variables to perform this command. There are two basic kinds of bar charts: those for between-subjects designs and those for repeated-measures designs. Use the between-subjects method if one variable is the independent variable and the other is the dependent variable. Use the repeated-measures method if you have a dependent variable for each value of the independent variable (e.g., you would have three variables for a design with three values of the independent variable). This normally occurs when you make multiple observations over time.
  • 42. Chapter 4 Graphing Data
This example uses the GRADES.sav data file, which will be created in Chapter 6. Please see Section 6.4 for the data if you would like to follow along.
Running the Command
Open the Chart Builder by clicking Graphs, then Chart Builder. In the Gallery tab, select Bar. If you had only one independent variable, you would select the Simple Bar chart example (top left corner). If you have more than one independent variable (as in this example), select the Clustered Bar Chart example from the middle of the top row.
Drag the example to the top working area. Once you do, the working area should look like the screenshot below. (Note that you will need to open the data file you would like to graph in order to run this command.)
[Screenshot: Chart Builder with the Clustered Bar Chart example in the working area]
If you are using a repeated-measures design like our example here using GRADES.sav from Chapter 6 (three different variables representing the Y values that we want), you need to select all three variables (you can <Ctrl>-click them to select multiple variables) and then drag all three variable names to the Y-Axis area. When you do, you will be given a warning message. Click OK.
[Screenshot: Chart Builder warning message]
  • 43. Chapter 4 Graphing Data
Next, you will need to drag the INSTRUCT variable to the top right in the Cluster: set color area (see screenshot).
[Screenshot: Chart Builder with INSTRUCT in the Cluster: set color area]
Note: The Chart Builder pays attention to the types of variables that you ask it to graph. If you are getting error messages or unusual results, be sure that your categorical variables are properly designated as Nominal in the Variable View tab (see Chapter 2, Section 2.1).
Output
[Clustered bar chart produced from GRADES.sav]
Practice Exercise
Use Practice Data Set 1 in Appendix B. Construct a clustered bar graph examining the relationship between MATHEMATICS SKILLS scores (as the dependent variable) and MARITAL STATUS and SEX (as independent variables). Make sure you classify both SEX and MARITAL STATUS as nominal variables.
  • 44. Chapter 4 Graphing Data
Section 4.6 Editing SPSS Graphs
Whatever command you use to create your graph, you will probably want to do some editing to make it appear exactly as you want it to look. In SPSS, you do this in much the same way that you edit graphs in other software programs (e.g., Excel). After your graph is made, in the output window, select your graph (this will create handles around the outside of the entire object) and right-click. Then, click SPSS Chart Object, and click Open. Alternatively, you can double-click on the graph to open it for editing.
When you open the graph, the Chart Editor window and the corresponding Properties window will appear.
[Screenshot: Chart Editor window and Properties window]
Once Chart Editor is open, you can easily edit each element of the graph. To select an element, just click on the relevant spot on the graph. For example, if you have added a title to your graph ("Histogram" in the example that follows), you may select the element representing the title of the graph by clicking anywhere on the title.
  • 45. Chapter 4 Graphing Data
Once you have selected an element, you can tell whether the correct element is selected because it will have handles around it.
If the item you have selected is a text element (e.g., the title of the graph), a cursor will be present and you can edit the text as you would in a word processing program. If you would like to change another attribute of the element (e.g., the color or font size), use the Properties box. (Text properties are shown below.)
[Screenshot: Properties box showing text properties]
With a little practice, you can make excellent graphs using SPSS. Once your graph is formatted the way you want it, simply select File, Save, then Close.
  • 46. Chapter 5 Prediction and Association
Section 5.1 Pearson Correlation Coefficient
Description
The Pearson correlation coefficient (sometimes called the Pearson product-moment correlation coefficient or simply the Pearson r) determines the strength of the linear relationship between two variables.
Assumptions
Both variables should be measured on interval or ratio scales (or a dichotomous nominal variable). If a relationship exists between them, that relationship should be linear. Because the Pearson correlation coefficient is computed with z-scores, both variables should also be normally distributed. If your data do not meet these assumptions, consider using the Spearman rho correlation coefficient instead.
SPSS Data Format
Two variables are required in your SPSS data file. Each subject must have data for both variables.
Running the Command
To select the Pearson correlation coefficient, click Analyze, then Correlate, then Bivariate (bivariate refers to two variables). This will bring up the Bivariate Correlations dialog box. This example uses the HEIGHT.sav data file entered at the start of Chapter 4.
Move at least two variables from the box at left into the box at right by using the transfer arrow (or by double-clicking each variable). Make sure that a check is in the Pearson box under Correlation Coefficients. It is acceptable to move more than two variables.
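The coefficient that this command reports can also be sketched by hand. The example below is a minimal Python illustration, not the SPSS implementation: it applies the usual product-moment formula to the HEIGHT and WEIGHT values entered in Chapter 4, and the function name pearson_r is hypothetical. The degrees of freedom are computed as N - 2, the rule this section gives for a correlation.

```python
import math

# HEIGHT (inches) and WEIGHT (pounds) for the 16 participants in HEIGHT.sav.
heights = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weights = [150, 155, 160, 160, 150, 140, 165, 150,
           110, 100, 95, 110, 105, 100, 110, 105]

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("each subject must have data for both variables")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(heights, weights)
df = len(heights) - 2  # degrees of freedom for a correlation: N - 2
print(f"r({df}) = {r:.3f}")
```

The sketch does not compute a significance level. That step requires the t distribution, which SPSS handles for you.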
  • 47. Chapter 5 Prediction and Association
For our example, we will move all three variables over and click OK.
Reading the Output
The output consists of a correlation matrix. Every variable you entered in the command is represented as both a row and a column. We entered three variables in our command. Therefore, we have a 3 x 3 table. There are also three rows in each cell: the correlation, the significance level, and the N. If a correlation is significant at less than the .05 level, a single * will appear next to the correlation. If it is significant at the .01 level or lower, ** will appear next to the correlation. For example, the correlation between height and weight in the output below has a significance level of < .001, so it is flagged with ** to indicate that it is less than .01.

Correlations
                                height     weight     sex
height   Pearson Correlation    1          .806**     -.644**
         Sig. (2-tailed)                   .000       .007
         N                      16         16         16
weight   Pearson Correlation    .806**     1          -.968**
         Sig. (2-tailed)        .000                  .000
         N                      16         16         16
sex      Pearson Correlation    -.644**    -.968**    1
         Sig. (2-tailed)        .007       .000
         N                      16         16         16
**. Correlation is significant at the 0.01 level (2-tailed).

To read the correlations, select a row and a column. For example, the correlation between height and weight is determined through selection of the WEIGHT row and the HEIGHT column (.806). We get the same answer by selecting the HEIGHT row and the WEIGHT column. The correlation between a variable and itself is always 1, so there is a diagonal set of 1s.
Drawing Conclusions
The correlation coefficient will be between -1.0 and +1.0. Coefficients close to 0.0 represent a weak relationship. Coefficients close to 1.0 or -1.0 represent a strong relationship. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Please see Appendix A for a discussion of effect sizes for correlations.
Phrasing a Significant Result
In the example above, we obtained a correlation of .806 between HEIGHT and WEIGHT. A correlation of .806 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:
  • 48. Chapter 5 Prediction and Association
    A Pearson correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (r(14) = .806, p < .001), indicating a significant linear relationship between the two variables. Taller participants tend to weigh more.
The conclusion states the direction (positive), strength (strong), value (.806), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier).
Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. While most SPSS procedures give degrees of freedom, the correlation command gives only the N (the number of pairs). For a correlation, the degrees of freedom is N - 2.
Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a correlation between ID and GRADE. If so, we get the output below. The correlation has a significance level of .783. Thus, we could write the following in a results section (note that the degrees of freedom is N - 2):

Correlations
                              ID       GRADE
ID      Pearson Correlation   1.000    .217
        Sig. (2-tailed)                .783
        N                     4        4
GRADE   Pearson Correlation   .217     1.000
        Sig. (2-tailed)       .783
        N                     4        4

    A Pearson correlation was calculated examining the relationship between participants' ID numbers and grades. A weak correlation that was not significant was found (r(2) = .217, p > .05). ID number is not related to grade in the course.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the value of the Pearson correlation coefficient for the relationship between SALARY and YEARS OF EDUCATION.
Section 5.2 Spearman Correlation Coefficient
Description
The Spearman correlation coefficient determines the strength of the relationship between two variables. It is a nonparametric procedure. Therefore, it is weaker than the Pearson correlation coefficient, but it can be used in more situations.
Assumptions
Because the Spearman correlation coefficient functions on the basis of the ranks of data, it requires ordinal (or interval or ratio) data for both variables. They do not need to be normally distributed.
SPSS Data Format

Two variables are required in your SPSS data file. Each subject must provide data for both variables.

Running the Command

Click Analyze, then Correlate, then Bivariate. This will bring up the main dialog box for Bivariate Correlations (just like the Pearson correlation). About halfway down the dialog box, there is a section for indicating the type of correlation you will compute. You can select as many correlations as you want. For our example, remove the check in the Pearson box (by clicking on it) and click on the Spearman box.

Use the variables HEIGHT and WEIGHT from our HEIGHT.sav data file (Chapter 4). This is also one of the few commands that allow you to choose a one-tailed test, if desired.

Reading the Output

The output is essentially the same as for the Pearson correlation. Each pair of variables has its correlation coefficient indicated twice. The Spearman rho can range from -1.0 to +1.0, just like the Pearson r.

Correlations

                                                    HEIGHT    WEIGHT
Spearman's rho   HEIGHT   Correlation Coefficient   1.000     .883**
                          Sig. (2-tailed)                     .000
                          N                         16        16
                 WEIGHT   Correlation Coefficient   .883**    1.000
                          Sig. (2-tailed)           .000
                          N                         16        16

**. Correlation is significant at the .01 level (2-tailed).

The output listed above indicates a correlation of .883 between HEIGHT and WEIGHT. Note the significance level of .000, shown in the "Sig. (2-tailed)" row. This is, in fact, a significance level of < .001. The actual value rounds to .000, but it is not zero.

Drawing Conclusions

The correlation will be between -1.0 and +1.0. Scores close to 0.0 represent a weak relationship. Scores close to 1.0 or -1.0 represent a strong relationship. Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Generally, correlations greater than 0.7 (in absolute value) are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
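Because the Spearman coefficient works on ranks, it can be sketched as a Pearson correlation computed on the ranked data, followed by the rule-of-thumb strength labels given above. The Python below is only an illustration under stated assumptions: the heights and weights are hypothetical values (not the HEIGHT.sav data), and the function names are invented for this example rather than taken from SPSS.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def describe_strength(coefficient):
    """Label a correlation using the rule of thumb from the text."""
    size = abs(coefficient)
    if size > 0.7:
        return "strong"
    if size < 0.3:
        return "weak"
    return "moderate"


# Hypothetical heights (inches) and weights (pounds); NOT the HEIGHT.sav data.
heights = [60, 62, 64, 66, 68, 70, 72, 74]
weights = [105, 120, 115, 140, 150, 145, 170, 180]

rho = spearman_rho(heights, weights)
df = len(heights) - 2  # degrees of freedom for a correlation are N - 2
print(f"rho({df}) = {rho:.3f} ({describe_strength(rho)})")
```

The sketch reports only the coefficient and its degrees of freedom; the significance level still has to come from the SPSS output.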
Phrasing Results That Are Significant

In the example above, we obtained a correlation of .883 between HEIGHT and WEIGHT. A correlation of .883 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:

A Spearman rho correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (rho(14) = .883, p < .001), indicating a significant relationship between the two variables. Taller participants tend to weigh more.

The conclusion states the direction (positive), strength (strong), value (.883), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier). Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. For a correlation, the degrees of freedom is N - 2.

Phrasing Results That Are Not Significant

Using our SAMPLE.sav data set from the previous chapters, we could calculate a Spearman rho correlation between ID and GRADE. If so, we would get the following output:

Correlations

                                                   ID        GRADE
Spearman's rho   ID      Correlation Coefficient   1.000     .000
                         Sig. (2-tailed)                     1.000
                         N                         4         4
                 GRADE   Correlation Coefficient   .000      1.000
                         Sig. (2-tailed)           1.000
                         N                         4         4

The correlation coefficient equals .000 and has a significance level of 1.000. Note that this value is rounded and is not, in fact, exactly 1.000. We could state the following in a results section:

A Spearman rho correlation coefficient was calculated for the relationship between a subject's ID number and grade. An extremely weak correlation that was not significant was found (rho(2) = .000, p > .05). ID number is not related to grade in the course.

Practice Exercise

Use Practice Data Set 2 in Appendix B. Determine the strength of the relationship between salary and job classification by calculating the Spearman rho correlation.

Section 5.3 Simple Linear Regression

Description

Simple linear regression allows the prediction of one variable from another.

Assumptions

Simple linear regression assumes that both variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. Typically, both variables should be normally distributed. Dichotomous variables (variables with only two levels) are also acceptable as independent variables.

SPSS Data Format

Two variables are required in the SPSS data file. Each subject must contribute to both values.

Running the Command

Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this section). On the right are blocks for the dependent variable (the variable you are trying to predict) and the independent variable (the variable from which we are predicting).

We are interested in predicting someone's weight on the basis of his or her height. Thus, we should place the variable WEIGHT in the dependent variable block and the variable HEIGHT in the independent variable block. Then we can click OK to run the analysis.

Reading the Output

For simple linear regressions, we are interested in three components of the output. The first is called the Model Summary, and it occurs after the Variables Entered/Removed section. For our example, you should see this output:

Model Summary

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .806(a)  .649       .624                16.14801

a. Predictors: (Constant), height

R Square (called the coefficient of determination) gives you the proportion of the variance of your dependent variable (WEIGHT) that can be explained by variation in your independent variable (HEIGHT). Thus, 64.9% of the variation in weight can be explained by differences in height (taller individuals weigh more).

The standard error of estimate gives you a measure of dispersion for your prediction equation. When the prediction equation is used, 68% of the data will fall within
one standard error of estimate of the predicted value. Just over 95% will fall within two standard errors. Thus, in the previous example, 95% of the time, our estimated weight will be within 32.296 pounds of being correct (i.e., 2 x 16.148 = 32.296).

The second part of the output that we are interested in is the ANOVA summary table, shown below. The important number here is the significance level in the rightmost column. If that value is less than .05, then we have a significant linear regression. If it is larger than .05, we do not.

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   6760.323         1    6760.323      25.926   .000(a)
   Residual     3650.614         14   260.758
   Total        10410.938        15

a. Predictors: (Constant), HEIGHT
b. Dependent Variable: WEIGHT

The final section of the output is the table of coefficients. This is where the actual prediction equation can be found.

Coefficients(a)

                Unstandardized Coefficients   Standardized Coefficients
Model           B           Std. Error        Beta                        t        Sig.
1  (Constant)   -234.681    71.552                                        -3.280   .005
   height       5.434       1.067             .806                        5.092    .000

a. Dependent Variable: weight

In most texts, you learn that Y' = a + bX is the regression equation. Y' (pronounced "Y prime") is your dependent variable (primes are normally predicted values or dependent variables), and X is your independent variable. In SPSS output, the values of both a and b are found in the B column. The first value, -234.681, is the value of a (labeled Constant). The second value, 5.434, is the value of b (labeled with the name of the independent variable). Thus, our prediction equation for the example above is WEIGHT' = -234.681 + 5.434(HEIGHT). In other words, the average subject who is an inch taller than another subject weighs 5.434 pounds more. A person who is 60 inches tall should weigh -234.681 + 5.434(60) = 91.359 pounds. Given our earlier discussion of standard error of estimate, 95% of individuals who are 60 inches tall will weigh between 59.063 (91.359 - 32.296 = 59.063) and 123.655 (91.359 + 32.296 = 123.655) pounds.
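The arithmetic in the preceding paragraph can be checked with a few lines of Python. This is a minimal sketch, not SPSS functionality: the constants are copied from the B column and the Model Summary, the function names are made up for illustration, and the "plus or minus two standard errors" band is the approximation described in the text.

```python
# Values copied from the SPSS output discussed above.
INTERCEPT = -234.681            # a, the "(Constant)" row of the B column
SLOPE = 5.434                   # b, the "height" row of the B column
STD_ERROR_OF_ESTIMATE = 16.148  # from the Model Summary


def predict_weight(height_inches):
    """Apply the prediction equation WEIGHT' = a + b(HEIGHT)."""
    return INTERCEPT + SLOPE * height_inches


def approximate_95_band(predicted, std_error=STD_ERROR_OF_ESTIMATE):
    """Return the predicted value minus and plus two standard errors."""
    margin = 2 * std_error
    return predicted - margin, predicted + margin


predicted = predict_weight(60)
low, high = approximate_95_band(predicted)
print(f"Predicted weight at 60 inches: {predicted:.3f} pounds")
print(f"Approximate 95% band: {low:.3f} to {high:.3f} pounds")
```

Changing the argument to `predict_weight` gives the prediction for any other height in inches.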
Drawing Conclusions

Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself.

Phrasing Results That Are Significant

In the example above, we obtained an R Square of .649 and a regression equation of WEIGHT' = -234.681 + 5.434(HEIGHT). The ANOVA resulted in F = 25.926 with 1 and 14 degrees of freedom. The F is significant at the less than .001 level. Thus, we could state the following in a results section:

A simple linear regression was calculated predicting participants' weight based on their height. A significant regression equation was found (F(1,14) = 25.926, p < .001), with an R² of .649. Participants' predicted weight is equal to -234.68 + 5.43(HEIGHT) pounds when height is measured in inches. Participants' average weight increased 5.43 pounds for each inch of height.

The conclusion states the direction (increase), strength (.649), value (25.926), degrees of freedom (1,14), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included.

Phrasing Results That Are Not Significant

If the ANOVA is not significant (e.g., see the output below), the section of the output labeled Sig. for the ANOVA will be greater than .05, and the regression equation is not significant.

Model Summary

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .477(a)  .227       .172                3.06696

a. Predictors: (Constant), height

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F       Sig.
1  Regression   38.750           1    38.750        4.120   .062(a)
   Residual     131.688          14   9.406
   Total        170.438          15

a. Predictors: (Constant), height
b. Dependent Variable: act

Coefficients(a)

                Unstandardized Coefficients   Standardized Coefficients
Model           B           Std. Error        Beta                        t        Sig.
1  (Constant)   49.351      13.590                                        3.631    .003
   height       -.411       .203              -.477                       -2.030   .062

a. Dependent Variable: act

A results section might include the following statement:

A simple linear regression was calculated predicting participants' ACT scores based on their height. The regression equation was not significant (F(1,14) = 4.12, p > .05) with an R² of .227. Height is not a significant predictor of ACT scores.

Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise

Use Practice Data Set 2 in Appendix B. If we want to predict salary from years of education, what salary would you predict for someone with 12 years of education? What salary would you predict for someone with a college education (16 years)?
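For readers who want to see where the a and b in Y' = a + bX come from, the sketch below fits a least-squares line by hand and uses it to predict at 12 and 16 years of education. The data are hypothetical values invented for this illustration (they are NOT Practice Data Set 2), and the function names are not part of SPSS, so the numbers should not be read as the answer to the exercise.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept (the "Constant")
    return a, b


def r_square(x, y, a, b):
    """Proportion of variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_residual / ss_total


# Hypothetical data: years of education and salary in thousands of dollars.
years_of_education = [10, 12, 14, 16, 18]
salary = [30, 35, 37, 45, 48]

a, b = fit_simple_regression(years_of_education, salary)
print(f"SALARY' = {a:.2f} + {b:.2f}(YEARS)")
print(f"R Square = {r_square(years_of_education, salary, a, b):.3f}")
for years in (12, 16):
    print(f"Predicted salary for {years} years: {a + b * years:.1f}")
```

With the real practice data, the same two predictions are obtained by plugging 12 and 16 into the equation read from the B column of the SPSS Coefficients table.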
Section 5.4 Multiple Linear Regression

Description

The multiple linear regression analysis allows the prediction of one variable from several other variables.

Assumptions

Multiple linear regression assumes that all variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. All variables should be normally distributed. Dichotomous variables are also acceptable as independent variables.

SPSS Data Format

At least three variables are required in the SPSS data file. Each subject must contribute to all values.

Running the Command

Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this chapter). On the right side of the dialog box are blanks for the dependent variable (the variable you are trying to predict) and the independent variables (the variables from which you are predicting).

We are interested in predicting someone's weight based on his or her height and sex. We believe that both sex and height influence weight. Thus, we should place the dependent variable WEIGHT in the Dependent block and the independent variables HEIGHT and SEX in the Independent(s) block. Enter both in Block 1.

This will perform an analysis to determine if WEIGHT can be predicted from SEX and/or HEIGHT. There are several methods SPSS can use to conduct this analysis. These can be selected with the Method box. Method Enter, the most widely
used, puts all variables in the equation, whether they are significant or not. The other methods use various means to enter only those variables that are significant predictors.

Click OK to run the analysis.

Reading the Output

For multiple linear regression, there are three components of the output in which we are interested. The first is called the Model Summary, which is found after the Variables Entered/Removed section. For our example, you should get the output below.

Model Summary

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .997(a)  .993       .992                2.29571

a. Predictors: (Constant), sex, height

R Square (called the coefficient of determination) tells you the proportion of the variance in the dependent variable (WEIGHT) that can be explained by variation in the independent variables (HEIGHT and SEX, in this case). Thus, 99.3% of the variation in weight can be explained by differences in height and sex (taller individuals weigh more, and men weigh more). Note that when a second variable is added, our R Square goes up from .649 to .993. The .649 was obtained using the Simple Linear Regression example in Section 5.3.

The Standard Error of the Estimate gives you a margin of error for the prediction equation. Using the prediction equation, 68% of the data will fall within one standard error of estimate of the predicted value. Just over 95% will fall within two standard errors of estimate. Thus, in the example above, 95% of the time, our estimated weight will be within 4.591 (2.296 x 2) pounds of being correct. In our Simple Linear Regression example in Section 5.3, this number was 32.296. Note the higher degree of accuracy.

The second part of the output that we are interested in is the ANOVA summary table. For more information on reading ANOVA tables, refer to the sections on ANOVA in Chapter 6. For now, the important number is the significance in the rightmost column. If that value is less than .05, we have a significant linear regression. If it is larger than .05, we do not.

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   10342.424        2    5171.212      981.202   .000(a)
   Residual     68.514           13   5.270
   Total        10410.938        15

a. Predictors: (Constant), sex, height
b. Dependent Variable: weight

The final section of output we are interested in is the table of coefficients. This is where the actual prediction equation can be found.
Coefficients(a)

                Unstandardized Coefficients   Standardized Coefficients
Model           B           Std. Error        Beta                        t         Sig.
1  (Constant)   47.138      14.843                                        3.176     .007
   height       2.101       .198              .312                        10.588    .000
   sex          -39.133     1.501             -.767                       -26.071   .000

a. Dependent Variable: weight

In most texts, you learn that Y' = a + bX is the regression equation. For multiple regression, our equation changes to Y' = B0 + B1X1 + B2X2 + ... + BzXz (where z is the number of independent variables). Y' is your dependent variable, and the Xs are your independent variables. The Bs are listed in a column. Thus, our prediction equation for the example above is WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT) (where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is in inches). In other words, the average difference in weight for participants who differ by one inch in height is 2.101 pounds. Males tend to weigh 39.133 pounds more than females. A female who is 60 inches tall should weigh 47.138 - 39.133(2) + 2.101(60) = 94.932 pounds. Given our earlier discussion of the standard error of estimate, 95% of females who are 60 inches tall will weigh between 90.341 (94.932 - 4.591 = 90.341) and 99.523 (94.932 + 4.591 = 99.523) pounds.

Drawing Conclusions

Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself. Multiple regression is generally much more powerful than simple linear regression. Compare our two examples.

With multiple regression, you must also consider the significance level of each independent variable. In the example above, the significance level of both independent variables is less than .001.
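The general form Y' = B0 + B1X1 + ... + BzXz can be sketched as a small function that multiplies each coefficient by the matching predictor value and adds the constant. The snippet below is an illustration only: the coefficients are copied from the B column above, SEX uses the coding stated in the text (1 = Male, 2 = Female), and the `predict` helper is a hypothetical name, not an SPSS feature.

```python
def predict(constant, coefficients, values):
    """Evaluate Y' = B0 + B1*X1 + ... + Bz*Xz for one case."""
    return constant + sum(coefficients[name] * values[name] for name in coefficients)


# Unstandardized coefficients (B column) from the Coefficients table.
CONSTANT = 47.138
COEFFICIENTS = {"HEIGHT": 2.101, "SEX": -39.133}
STD_ERROR_OF_ESTIMATE = 2.29571  # from the Model Summary

# SEX is coded 1 = Male, 2 = Female; HEIGHT is in inches.
female_60 = predict(CONSTANT, COEFFICIENTS, {"HEIGHT": 60, "SEX": 2})
male_60 = predict(CONSTANT, COEFFICIENTS, {"HEIGHT": 60, "SEX": 1})
margin = 2 * STD_ERROR_OF_ESTIMATE  # approximate 95% margin

print(f"Female, 60 inches: {female_60:.3f} pounds")
print(f"Male, 60 inches:   {male_60:.3f} pounds")
print(f"Approximate 95% band for the female: "
      f"{female_60 - margin:.3f} to {female_60 + margin:.3f} pounds")
```

Holding height constant, the two predictions differ by exactly the SEX coefficient, which is how the "males weigh 39.133 pounds more" statement is read off the table.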
Phrasing Results That Are Significant

In our example, we obtained an R Square of .993 and a regression equation of WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT). The ANOVA resulted in F = 981.202 with 2 and 13 degrees of freedom. F is significant at the less than .001 level. Thus, we could state the following in a results section:
A multiple linear regression was calculated to predict participants' weight based on their height and sex. A significant regression equation was found (F(2,13) = 981.202, p < .001), with an R² of .993. Participants' predicted weight is equal to 47.138 - 39.133(SEX) + 2.101(HEIGHT), where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is measured in inches. Participants' weight increased 2.101 pounds for each inch of height, and males weighed 39.133 pounds more than females. Both sex and height were significant predictors.

The conclusion states the direction (increase), strength (.993), value (981.20), degrees of freedom (2,13), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included. Because there are multiple independent variables, we have noted whether or not each is significant.

Phrasing Results That Are Not Significant

If the ANOVA does not find a significant relationship, the Sig. section of the output will be greater than .05, and the regression equation is not significant.

[SPSS output for this example: Model Summary, ANOVA, and Coefficients tables for predicting ACT from height and sex]

A results section for this output might include the following statement:

A multiple linear regression was calculated predicting participants' ACT scores based on their height and sex. The regression equation was not significant (F(2,13) = 2.511, p > .05) with an R² of .279. Neither height nor sex is a significant predictor of ACT scores.

Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise

Use Practice Data Set 2 in Appendix B. Determine the prediction equation for predicting salary based on education, years of service, and sex. Which variables are significant predictors? If you believe that men were paid more than women were, what would you conclude after conducting this analysis?
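The Model Summary and ANOVA tables in Sections 5.3 and 5.4 are tied together by simple arithmetic, which the following sketch reproduces from the sums of squares alone. It assumes the usual textbook formulas (R Square as regression over total sum of squares, the standard adjusted R Square correction, F as the ratio of mean squares, and the standard error of estimate as the square root of the residual mean square); the function name is hypothetical, and small differences in the last decimal come from rounding in the printed output.

```python
from math import sqrt


def regression_summary(ss_regression, ss_residual, n_cases, n_predictors):
    """Rebuild Model Summary and ANOVA statistics from the sums of squares."""
    ss_total = ss_regression + ss_residual
    df_regression = n_predictors
    df_residual = n_cases - n_predictors - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_square = ss_regression / ss_total
    adjusted = 1 - (1 - r_square) * (n_cases - 1) / df_residual
    return {
        "df": (df_regression, df_residual),
        "r_square": r_square,
        "adjusted_r_square": adjusted,
        "f": ms_regression / ms_residual,
        "std_error_of_estimate": sqrt(ms_residual),
    }


# Sums of squares copied from the two ANOVA tables in this chapter (N = 16).
simple = regression_summary(6760.323, 3650.614, n_cases=16, n_predictors=1)
multiple = regression_summary(10342.424, 68.514, n_cases=16, n_predictors=2)

for label, result in (("height only", simple), ("height and sex", multiple)):
    df1, df2 = result["df"]
    print(f"{label}: F({df1},{df2}) = {result['f']:.2f}, "
          f"R Square = {result['r_square']:.3f}, "
          f"Adjusted = {result['adjusted_r_square']:.3f}, "
          f"Std. Error = {result['std_error_of_estimate']:.3f}")
```

Running both examples side by side makes the comparison suggested in the text concrete: adding SEX raises R Square from about .649 to about .993 and shrinks the standard error of estimate from about 16.1 to about 2.3 pounds.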
Chapter 6 Parametric Inferential Statistics

Parametric statistical procedures allow you to draw inferences about populations based on samples of those populations. To make these inferences, you must be able to make certain assumptions about the shape of the distributions of the population samples.

Section 6.1 Review of Basic Hypothesis Testing

The Null Hypothesis

In hypothesis testing, we create two hypotheses that are mutually exclusive (i.e., both cannot be true at the same time) and all inclusive (i.e., one of them must be true). We refer to those two hypotheses as the null hypothesis and the alternative hypothesis. The null hypothesis generally states that any difference we observe is caused by random error. The alternative hypothesis generally states that any difference we observe is caused by a systematic difference between groups.

Type I and Type II Errors

All hypothesis testing attempts to draw conclusions about the real world based on the results of a test (a statistical test, in this case). There are four possible combinations of results (see the figure below).

Two of the possible results are correct test results. The other two results are errors. A Type I error occurs when we reject a null hypothesis that is, in fact, true, while a Type II error occurs when we fail to reject the null hypothesis that is, in fact, false.

Significance tests determine the probability of making a Type I error. In other words, after performing a series of calculations, we obtain the probability of getting results like ours if the null hypothesis were true. If that probability is low, such as 5 or less in 100 (.05), by convention, we reject the null hypothesis. In other words, we typically use the .05 level (or less) as the maximum Type I error rate we are willing to accept.
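The decision rule and the four possible outcomes described above can be written out as a short Python sketch. The function names are invented for illustration, and the rule assumes the conventional .05 cutoff mentioned in the text; in practice the true state of the null hypothesis is unknown, so the second function is only a way of naming the cells of the outcome table.

```python
ALPHA = 0.05  # conventional maximum Type I error rate


def reject_null(p_value, alpha=ALPHA):
    """Decision rule: reject the null hypothesis when p is at or below alpha."""
    return p_value <= alpha


def classify_outcome(null_is_true, null_rejected):
    """Name the outcome for one combination of real world and test result."""
    if null_is_true and null_rejected:
        return "Type I error"
    if not null_is_true and not null_rejected:
        return "Type II error"
    return "No error"


# Enumerate the four combinations of real-world state and test result.
for null_is_true in (True, False):
    for null_rejected in (True, False):
        outcome = classify_outcome(null_is_true, null_rejected)
        print(f"null true: {null_is_true}, null rejected: {null_rejected} -> {outcome}")

# Applying the rule to a significance level of .062 leaves the null in place.
print(reject_null(0.062))
```

Lowering `ALPHA` makes Type I errors less likely at the cost of rejecting the null hypothesis less often.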
                                          REAL WORLD
TEST RESULT                      Null Hypothesis True    Null Hypothesis False
Reject the null hypothesis       Type I Error            No Error
Fail to reject the null          No Error                Type II Error

When there is a low probability of a Type I error, such as .05, we can state that the significance test has led us to "reject the null hypothesis." This is synonymous with saying that a difference is "statistically significant." For example, on a reading test, suppose you found that a random sample of girls from a school district scored higher than a random