Chapter 1
Getting Started
Section 1.1 Starting SPSS
Startup procedures for SPSS will differ slightly, depending on the exact configuration of the machine on which it is installed. On most computers, you can start SPSS by clicking on Start, then clicking on Programs, then on SPSS. On many installations, there will be an SPSS icon on the desktop that you can double-click to start the program.
When SPSS is started, you may be presented with the dialog box to the left, depending on the options your system administrator selected for your version of the program. If you have the dialog box, click Type in data and OK, which will present a blank data window.
If you were not presented with the dialog box to the left, SPSS should open automatically with a blank data window.
The data window and the output window provide the basic interface for SPSS. A blank data window is shown below.
Note: Items that appear in the glossary are presented in bold. Italics are used to indicate menu items.
Section 1.2 Entering Data
One of the keys to success with SPSS is knowing how it stores and uses your data. To illustrate the basics of data entry with SPSS, we will use Example 1.2.1.
Example 1.2.1
A survey was given to several students from four different classes (Tues/Thurs mornings, Tues/Thurs afternoons, Mon/Wed/Fri mornings, and Mon/Wed/Fri afternoons). The students were asked whether or not they were "morning people" and whether or not they worked. This survey also asked for their final grade in the class (100% being the highest grade possible). The response sheets from two students are presented below:
Response Sheet 1
ID:                            4593
Day of class:                  _ MWF   X TTh
Class time:                    _ Morning   X Afternoon
Are you a morning person?      _ Yes   X No
Final grade in class:          85%
Do you work outside school?    _ Full-time   _ Part-time   X No
Response Sheet 2
ID:                            1901
Day of class:                  X MWF   _ TTh
Class time:                    X Morning   _ Afternoon
Are you a morning person?      X Yes   _ No
Final grade in class:          83%
Do you work outside school?    _ Full-time   X Part-time   _ No
Our goal is to enter the data from the two students into SPSS for use in future analyses. The first step is to determine the variables that need to be entered. Any information that can vary among participants is a variable that needs to be considered. Example 1.2.2 lists the variables we will use.
Example 1.2.2
ID
Day of class
Class time
Morning person
Final grade
Whether or not the student works outside school
In the SPSS data window, columns represent variables and rows represent participants. Therefore, we will be creating a data file with six columns (variables) and two rows (students/participants).
Section 1.3 Defining Variables
Before we can enter any data, we must first enter some basic information about each variable into SPSS. For instance, variables must first be given names that:
• begin with a letter;
• do not contain a space.
Thus, the variable name "Q7" is acceptable, while the variable name "7Q" is not. Similarly, the variable name "PRE_TEST" is acceptable, but the variable name "PRE TEST" is not. Capitalization does not matter, but variable names are capitalized in this text to make it clear when we are referring to a variable name, even if the variable name is not necessarily capitalized in screenshots.
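The two naming rules above can be expressed as a short check. This is a minimal Python sketch, not part of SPSS: the function name is_valid_variable_name is hypothetical, and it tests only the two rules stated here, whereas SPSS itself may enforce additional restrictions.

```python
def is_valid_variable_name(name: str) -> bool:
    """Check only the two rules stated in the text.

    Assumption: SPSS may apply further rules (for example, length
    limits or reserved words) that are not modeled in this sketch.
    """
    if not name:
        return False
    starts_with_letter = name[0].isalpha()
    has_no_space = " " not in name
    return starts_with_letter and has_no_space


# The four names discussed in the text.
for candidate in ["Q7", "7Q", "PRE_TEST", "PRE TEST"]:
    verdict = "acceptable" if is_valid_variable_name(candidate) else "not acceptable"
    print(f"{candidate!r}: {verdict}")
```

The loop reports "Q7" and "PRE_TEST" as acceptable and the other two names as not acceptable, in line with the examples above.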
To define a variable, click on the Variable View tab at the bottom of the main screen. This will show you the Variable View window. To return to the Data View window, click on the Data View tab.
From the Variable View screen, SPSS allows you to create and edit all of the variables in your data file. Each column represents some property of a variable, and each row represents a variable. All variables must be given a name. To do that, click on the first empty cell in the Name column and type a valid SPSS variable name. The program will then fill in default values for most of the other properties.
One useful function of SPSS is the ability to define variable and value labels. Variable labels allow you to associate a description with each variable. These descriptions can describe the variables themselves or the values of the variables.
Value labels allow you to associate a description with each value of a variable. For example, for most procedures, SPSS requires numerical values. Thus, for data such as the day of the class (i.e., Mon/Wed/Fri and Tues/Thurs), we need to first code the values as numbers. We can assign the number 1 to Mon/Wed/Fri and the number 2 to Tues/Thurs. To help us keep track of the numbers we have assigned to the values, we use value labels.
To assign value labels, click in the cell you want to assign values to in the Values column. This will bring up a small gray button (see arrow, below at left). Click on that button to bring up the Value Labels dialog box.
When you enter a value label, you must click Add after each entry. This will move the value and its associated label into the bottom section of the window. When all labels have been added, click OK to return to the Variable View window.
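Conceptually, a set of value labels is a lookup table from numeric codes to descriptions. The following Python sketch illustrates that idea with the DAY codes assigned above. The names DAY_LABELS, label_for, and code_for are hypothetical and are not SPSS features.

```python
# Codes and labels for DAY, as assigned in the text.
DAY_LABELS = {1: "Mon/Wed/Fri", 2: "Tues/Thurs"}


def label_for(code, labels):
    """Return the label for a code, or the code itself if no label is defined."""
    return labels.get(code, code)


def code_for(label, labels):
    """Reverse lookup: find the numeric code that carries a given label."""
    for code, text in labels.items():
        if text == label:
            return code
    raise KeyError(f"no code is labeled {label!r}")


print(label_for(1, DAY_LABELS))
print(code_for("Tues/Thurs", DAY_LABELS))
```

The data keep the numeric codes; the labels only change how those codes are presented.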
In addition to naming and labeling the variable, you have the option of defining the variable type. To do so, simply click on the Type, Width, or Decimals columns in the Variable View window. The default value is a numeric field that is eight digits wide with two decimal places displayed. If your data are more than eight digits to the left of the decimal place, they will be displayed in scientific notation (e.g., the number 2,000,000,000 will be displayed as 2.00E+09). SPSS maintains accuracy beyond two decimal places, but all output will be rounded to two decimal places unless otherwise indicated in the Decimals column.
In our example, we will be using numeric variables with all of the default values.
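The difference between a stored value and its displayed form can be illustrated outside SPSS. The Python sketch below uses ordinary string formatting as an analogy only; the value 84.12345 is hypothetical, and SPSS's own display rules are governed by the Width and Decimals settings described above.

```python
big_value = 2_000_000_000
grade = 84.12345  # hypothetical value with more than two decimal places

# Two decimals in scientific notation, similar to the display described above.
print(f"{big_value:.2E}")

# The stored value keeps its precision; only the displayed text is rounded.
displayed = f"{grade:.2f}"
print(displayed, grade)
```

Rounding for display does not alter the underlying number, which is the point the text makes about accuracy beyond two decimal places.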
Practice Exercise
Create a data file for the six variables and two sample students presented in Example 1.2.1. Name your variables: ID, DAY, TIME, MORNING, GRADE, and WORK. You should code DAY as 1 = Mon/Wed/Fri, 2 = Tues/Thurs. Code TIME as 1 = morning, 2 = afternoon. Code MORNING as 0 = No, 1 = Yes. Code WORK as 0 = No, 1 = Part-Time, 2 = Full-Time. Be sure you enter value labels for the different variables. Note that because value labels are not appropriate for ID and GRADE, these are not coded. When done, your Variable View window should look like the screenshot below:
Click on the Data View tab to open the data-entry screen. Enter data horizontally, beginning with the first student's ID number. Enter the code for each variable in the appropriate column; to enter the GRADE variable value, enter the student's class grade.
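To make the row-and-column layout concrete, the two response sheets can be written out as coded rows. This is a Python sketch of the layout only, assuming the coding scheme from the Practice Exercise; the names VARIABLES, rows, and students are illustrative.

```python
VARIABLES = ["ID", "DAY", "TIME", "MORNING", "GRADE", "WORK"]

# One row per student, one value per variable, using the codes from the
# Practice Exercise (e.g., DAY: 1 = Mon/Wed/Fri, 2 = Tues/Thurs).
rows = [
    [4593, 2, 2, 0, 85, 0],  # Response Sheet 1
    [1901, 1, 1, 1, 83, 1],  # Response Sheet 2
]

# Pair each value with its variable name, row by row.
students = [dict(zip(VARIABLES, row)) for row in rows]

for student in students:
    print(student)

print(len(VARIABLES), "columns,", len(students), "rows")
```

Each row is entered left to right, which mirrors entering data horizontally in the Data View window.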
Note: Depending upon your version of SPSS, a number shown in scientific notation, such as 2.00E+09, may instead be displayed as 2.0E+009.
The previous data window can be changed to look instead like the screenshot below by clicking on the Value Labels icon (see arrow). In this case, the cells display value labels rather than the corresponding codes. If data is entered in this mode, it is not necessary to enter codes, as clicking the button which appears in each cell as the cell is selected will present a drop-down list of the predefined labels. You may use either method, according to your preference.
Instead of clicking the Value Labels icon, you may optionally toggle between views by clicking Value Labels under the View menu.
Section 1.4 Loading and Saving Data Files
Once you have entered your data, you will need to save it with a unique name for later use so that you can retrieve it when necessary.
Loading and saving SPSS data files works in the same way as most Windows-based software. Under the File menu, there are Open, Save, and Save As commands. SPSS data files have a ".sav" extension, which is added by default to the end of the filename. This tells Windows that the file is an SPSS data file.
Save Your Data
When you save your data file (by clicking File, then clicking Save or Save As to specify a unique name), pay special attention to where you save it. Most systems default to the location <C:\Program Files\SPSS>. You will probably want to save your data on a floppy disk, CD-R, or removable USB drive so that you can take the file with you.
Load Your Data
When you load your data (by clicking File, then clicking Open, then Data, or by clicking the open file folder icon), you get a similar window. This window lists all files with the ".sav" extension. If you have trouble locating your saved file, make sure you are looking in the right directory.
Practice Exercise
To be sure that you have mastered saving and opening data files, name your sample data file "SAMPLE" and save it to a removable storage medium. Once it is saved, SPSS will display the name of the file at the top of the data window. It is wise to save your work frequently, in case of computer crashes. Note that filenames may be upper- or lowercase. In this text, uppercase is used for clarity.
After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your data by selecting the "SAMPLE.sav" file you just created.
Section 1.5 Running Your First Analysis
Any time you open a data window, you can run any of the analyses available. To get started, we will calculate the students' average grade. (With only two students, you can easily check your answer by hand, but imagine a data file with 10,000 student records.)
The majority of the available statistical tests are under the Analyze menu. This menu displays all the options available for your version of the SPSS program (the menus in this book were created with SPSS Student Version 15.0). Other versions may have slightly different sets of options.
To calculate a mean (average), we are asking the computer to summarize our data set. Therefore, we run the command by clicking Analyze, then Descriptive Statistics, then Descriptives.
This brings up the Descriptives dialog box. Note that the left side of the box contains a list of all the variables in our data file. On the right is an area labeled Variable(s), where we can specify the variables we would like to use in this particular analysis.
We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking on it). To transfer it to the right window, click on the right arrow between the two windows. The arrow always points to the window opposite the highlighted item and can be used to transfer selected variables in either direction. Note that double-clicking on the variable name will also transfer the variable to the opposite window. Standard Windows conventions of "Shift" clicking or "Ctrl" clicking to select multiple variables can be used as well.
When we click on the OK button, the analysis will be conducted, and we will be ready to examine our output.
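With only two students, the result can be checked by hand, as suggested earlier. The Python sketch below performs that check using the standard statistics module. It is an independent calculation under the assumption that the standard deviation is the sample (n - 1) form, not a reproduction of SPSS's internals.

```python
import statistics

grades = [85, 83]  # GRADE for the two sample students

n = len(grades)
mean_grade = statistics.mean(grades)
# statistics.stdev uses the sample (n - 1) formula.
sd_grade = statistics.stdev(grades)

print(f"N = {n}, Minimum = {min(grades)}, Maximum = {max(grades)}")
print(f"Mean = {mean_grade:.4f}, Std. Deviation = {sd_grade:.5f}")
```

For these two grades the mean is 84 and the sample standard deviation is the square root of 2, about 1.41421.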
Section 1.6 Examining and Printing Output Files
After an analysis is performed, the output is placed in the output window, and the output window becomes the active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created. If you have run previous analyses and saved them, your output is added to the end of your previous output.
To switch back and forth between the data window and the output window, select the desired window from the Window menu bar (see arrow, below).
The output window is split into two sections. The left section is an outline of the output (SPSS refers to this as the "outline view"). The right section is the output itself.
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
GRADE                2    83.00     85.00     84.0000   1.41421
Valid N (listwise)   2
The section on the left of the output window provides an outline of the entire output window. All of the analyses are listed in the order in which they were conducted. Note that this outline can be used to quickly locate a section of the output. Simply click on the section you would like to see, and the right window will jump to the appropriate place.
Clicking on a statistical procedure also selects all of the output for that command. By pressing the Delete key, that output can be deleted from the output window. This is a quick way to be sure that the output window contains only the desired output. Output can also be selected and pasted into a word processor by clicking Edit, then Copy Objects to copy the output. You can then switch to your word processor and click Edit, then Paste.
To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you run a command, the output is added to the end of your previous output. Thus, you could be printing a very large output file containing information you may not want or need.
One way to ensure that your output window contains only the results of the current command is to create a new output window just before running the command. To do this, click File, then New, then Output. All your subsequent commands will go into your new output window.
Practice Exercise
Load the sample data file you created earlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE and print the output. Your output should look like the example on page 7. Next, select the data window and print it.
Section 1.7 Modifying Data Files
Once you have created a data file, it is really quite simple to add additional cases (rows/participants) or additional variables (columns). Consider Example 1.7.1.
Example 1.7.1
Two more students provide you with surveys. Their information is:
Response Sheet 3
ID:                            8734
Day of class:                  _ MWF   X TTh
Class time:                    X Morning   _ Afternoon
Are you a morning person?      _ Yes   X No
Final grade in class:          80%
Do you work outside school?    _ Full-time   _ Part-time   X No
Response Sheet 4
ID:                            1909
Day of class:                  X MWF   _ TTh
Class time:                    X Morning   _ Afternoon
Are you a morning person?      X Yes   _ No
Final grade in class:          73%
Do you work outside school?    _ Full-time   X Part-time   _ No
To add these data, simply place two additional rows in the Data View window (after loading your sample data). Notice that as new participants are added, the row numbers become bold. When done, the screen should look like the screenshot here.
New variables can also be added. For example, if the first two participants were given special training on time management, and the two new participants were not, the data file can be changed to reflect this additional information. The new variable could be called TRAINING (whether or not the participant received training), and it would be coded so that 0 = No and 1 = Yes. Thus, the first two participants would be assigned a "1" and the last two participants a "0." To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to update the data.
     ID        DAY           TIME        MORNING   GRADE   WORK        TRAINING
1    4593.00   Tue/Thu       afternoon   No        85.00   No          Yes
2    1901.00   Mon/Wed/Fri   morning     Yes       83.00   Part-Time   Yes
3    8734.00   Tue/Thu       morning     No        80.00   No          No
4    1909.00   Mon/Wed/Fri   morning     Yes       73.00   Part-Time   No
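Adding cases and adding a variable are two different operations on the same table: the first appends rows, the second gives every row a new column. The Python sketch below illustrates the distinction with the four participants, assuming the coding from the Practice Exercise; the list students and the helper list training_codes are illustrative names.

```python
# Existing cases from the earlier practice exercise (numeric codes).
students = [
    {"ID": 4593, "DAY": 2, "TIME": 2, "MORNING": 0, "GRADE": 85, "WORK": 0},
    {"ID": 1901, "DAY": 1, "TIME": 1, "MORNING": 1, "GRADE": 83, "WORK": 1},
]

# Adding cases: append one new row per participant.
students.append({"ID": 8734, "DAY": 2, "TIME": 1, "MORNING": 0, "GRADE": 80, "WORK": 0})
students.append({"ID": 1909, "DAY": 1, "TIME": 1, "MORNING": 1, "GRADE": 73, "WORK": 1})

# Adding a variable: give every row a value for the new column.
# TRAINING: 0 = No, 1 = Yes; only the first two participants were trained.
training_codes = [1, 1, 0, 0]
for student, code in zip(students, training_codes):
    student["TRAINING"] = code

for student in students:
    print(student)
```

After both steps the table has four rows and seven columns.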
Adding data and adding variables are just logical extensions of the procedures we used to originally create the data file. Save this new data file. We will be using it again later in the book.
Chapter 2
Entering and Modifying Data
In Chapter 1, we learned how to create a simple data file, save it, perform a basic analysis, and examine the output. In this section, we will go into more detail about variables and data.
Section 2.1 Variables and Data Representation
In SPSS, variables are represented as columns in the data file. Participants are represented as rows. Thus, if we collect 4 pieces of information from 100 participants, we will have a data file with 4 columns and 100 rows.
Measurement Scales
There are four types of measurement scales: nominal, ordinal, interval, and ratio. While the measurement scale will determine which statistical technique is appropriate for a given set of data, SPSS generally does not discriminate. Thus, we start this section with this warning: If you ask it to, SPSS may conduct an analysis that is not appropriate for your data. For a more complete description of these four measurement scales, consult your statistics text or the glossary in Appendix C.
Newer versions of SPSS allow you to indicate which types of data you have when you define your variable. You do this using the Measure column. You can indicate Nominal, Ordinal, or Scale (SPSS does not distinguish between interval and ratio scales).
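Because the software will not stop an inappropriate analysis, it can help to record the measurement level of each variable and consult it before choosing a summary statistic. The Python sketch below shows one way to do that. The mapping from level to statistic follows a common textbook convention and is an assumption of this sketch, as are the names SUGGESTED_SUMMARY and suggested_summary.

```python
# Assumed convention: which summary of central tendency is usually
# suggested for each of the three Measure settings.
SUGGESTED_SUMMARY = {
    "nominal": "mode",
    "ordinal": "median",
    "scale": "mean",  # Scale covers both interval and ratio data
}


def suggested_summary(measure):
    """Look up the suggested summary statistic for a Measure setting."""
    return SUGGESTED_SUMMARY[measure.lower()]


print(suggested_summary("Scale"))
print(suggested_summary("Ordinal"))
print(suggested_summary("Nominal"))
```

A lookup like this does not replace judgment; it only makes the measurement level an explicit part of the decision.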
Look at the sample data file we created in Chapter 1. We calculated a mean for the variable GRADE. GRADE was measured on a ratio scale, and the mean is an acceptable summary statistic (assuming that the distribution is normal).
We could have had SPSS calculate a mean for the variable TIME instead of GRADE. If we did, we would get the output presented here.
The output indicates that the average TIME was 1.25. Remember that TIME was coded as an ordinal variable (1 = morning class, 2 = afternoon class). The mean is not an appropriate statistic for an ordinal scale, but SPSS calculated it anyway. The importance of considering the type of data cannot be overemphasized. Just because SPSS will compute a statistic for you does not mean that you should use it. Later in the text, when specific statistical procedures are discussed, the conditions under which they are appropriate will be addressed.
Missing Data
Often, participants do not provide complete data. For some students, you may have a pretest score but not a posttest score. Perhaps one student left one question blank on a survey, or perhaps she did not state her age. Missing data can weaken any analysis. Often, a single missing question can eliminate a subject from all analyses.
If you have missing data in your data set, leave that cell blank. In the example below, the fourth subject did not complete Question 2. Note that the total score (which is calculated from both questions) is also blank because of the missing data for Question 2. SPSS represents missing data in the data window with a period (although you should not enter a period; just leave it blank).
     q1     q2     total
1    2.00   2.00   4.00
2    3.00   1.00   4.00
3    4.00   3.00   7.00
4    2.00   .      .
5    1.00   2.00   3.00
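The way a single blank propagates into a computed total can be sketched in a few lines. The Python example below uses the five subjects from the missing-data example, with None standing in for a blank cell; the function name total_or_missing is hypothetical, and the sketch illustrates the idea rather than SPSS's actual implementation.

```python
# q1 and q2 scores for five subjects; None marks a blank (missing) cell.
scores = [(2, 2), (3, 1), (4, 3), (2, None), (1, 2)]


def total_or_missing(q1, q2):
    """Return q1 + q2, or None if either score is missing."""
    if q1 is None or q2 is None:
        return None
    return q1 + q2


totals = [total_or_missing(q1, q2) for q1, q2 in scores]
print(totals)

# Only complete cases can be used in an analysis of the totals.
complete = [t for t in totals if t is not None]
print(len(complete), "of", len(totals), "subjects have a total score")
```

The fourth subject contributes no total, so an analysis of the totals rests on four subjects rather than five.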
Section 2.2 Transformation and Selection of Data
We often have more data in a data file than we want to include in a specific analysis. For example, our sample data file contains data from four participants, two of whom received special training and two of whom did not. If we wanted to conduct an analysis using only the two participants who did not receive the training, we would need to specify the appropriate subset.
Selecting a Subset
We can use the Select Cases command to specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command, the dialog box below will appear.
You can specify which cases (participants) you want to select by using the selection criteria, which appear on the right side of the Select Cases dialog box.
By default, All cases will be selected. The most common way to select a subset is to click If condition is satisfied, then click on the button labeled If. This will bring up a new dialog box that allows you to indicate which cases you would like to use.
You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be selected. If the logical statement is false, that case will not be selected. For example, you can select all cases that were coded as Mon/Wed/Fri by entering the formula DAY = 1 in the upper-right part of the window. If DAY is 1, then the statement will be true, and SPSS will select the case. If DAY is anything other than 1, the statement will be false, and the case will not be selected. Once you have entered the logical statement, click Continue to return to the Select Cases dialog box. Then, click OK to return to the data window.
After you have selected the cases, the data window will change slightly. The cases that were not selected will be marked with a diagonal line through the case number. For example, for our sample data, the first and third cases are not selected. Only the second and fourth cases are selected for this subset.
     ID        DAY           TIME        MORNING   GRADE   WORK        TRAINING   FILTER_$
1    4593.00   Tue/Thu       afternoon   No        85.00   No          Yes        Not Selected
2    1901.00   Mon/Wed/Fri   morning     Yes       83.00   Part-Time   Yes        Selected
3    8734.00   Tue/Thu       morning     No        80.00   No          No         Not Selected
4    1909.00   Mon/Wed/Fri   morning     Yes       73.00   Part-Time   No         Selected
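The selection rule amounts to evaluating one logical statement per case and recording the result. The Python sketch below applies the statement DAY = 1 to the four sample participants. The key name FILTER and its 1/0 coding are assumptions of this sketch, chosen only to show the idea.

```python
students = [
    {"ID": 4593, "DAY": 2, "GRADE": 85},
    {"ID": 1901, "DAY": 1, "GRADE": 83},
    {"ID": 8734, "DAY": 2, "GRADE": 80},
    {"ID": 1909, "DAY": 1, "GRADE": 73},
]

# Evaluate the logical statement DAY = 1 for every case.
# 1 = selected, 0 = not selected (an assumed coding for this sketch).
for student in students:
    student["FILTER"] = 1 if student["DAY"] == 1 else 0

selected_ids = [s["ID"] for s in students if s["FILTER"] == 1]
print(selected_ids)
```

The unselected cases stay in the data; they are only flagged, which is why the subset can later be undone by selecting all cases again.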
An additional variable will also be created in your data file. The new variable is called FILTER_$ and indicates whether a case was selected or not.
If we calculate a mean GRADE using the subset we just selected, we will receive the output below. Notice that we now have a mean of 78.00 with a sample size (N) of 2 instead of 4.
Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
GRADE                2    73.00     83.00     78.0000   7.0711
Valid N (listwise)   2
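The subset statistics can be verified by hand in the same way as the full-sample mean. The Python sketch below filters the four grades on DAY = 1 and summarizes what remains. It is an independent check that assumes the sample (n - 1) standard deviation, not a description of how SPSS computes its output.

```python
import statistics

# (DAY, GRADE) pairs for the four participants; DAY 1 = Mon/Wed/Fri.
cases = [(2, 85), (1, 83), (2, 80), (1, 73)]

# Keep only the grades of cases for which DAY = 1 is true.
subset = [grade for day, grade in cases if day == 1]

n = len(subset)
mean_grade = statistics.mean(subset)
sd_grade = statistics.stdev(subset)  # sample (n - 1) formula
print(f"N = {n}, Mean = {mean_grade:.4f}, Std. Deviation = {sd_grade:.4f}")
```

A sample size smaller than expected in a result like this is the quickest sign that a filter is still in effect.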
Be careful when you select subsets. The subset remains in effect until you run the command again and select all cases. You can tell if you have a subset selected because the bottom of the data window will indicate that a filter is on. In addition, when you examine your output, N will be less than the total number of records in your data set if a subset is selected. The diagonal lines through some cases will also be evident when a subset is selected. Be careful not to save your data file with a subset selected, as this can cause considerable confusion later.
Computing a New Variable
SPSS can also be used to compute a new variable or manipulate your existing variables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Q1, Q2, and Q3). The variables represent the number of points each participant received on three different questions. Now enter the data shown on the screen to the right. When done, save this data file as "QUESTIONS.sav." We will be using it again in later chapters.
Now you will calculate the total score for each subject. We could do this manually, but if the data file were large, or if there were a lot of questions, this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform and then click Compute Variable.
After clicking the Compute Variable command, we get the dialog box at right.
The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word "total."
Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the two sides of an equation that SPSS will calculate. For example, total = q1 + q2 + q3 is the equation that is entered in the sample presented here (screenshot at left). Note that it is possible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, SPSS will create a new variable called TOTAL and make it equal to the sum of the three questions.
[Screenshot: data window with the new TOTAL variable; the first participant's row reads 3.00, 3.00, 4.00, 10.00.]
Save your data file again so that the new variable will be available for future sessions.
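The expression total = q1 + q2 + q3 is applied to every case in turn. The Python sketch below shows that row-by-row evaluation. Only the first participant's scores (3, 3, 4) come from the example data; the second case is hypothetical and is included only to show the expression applied to more than one row.

```python
def compute_total(case):
    """Mimic the expression total = q1 + q2 + q3 for a single case."""
    return case["q1"] + case["q2"] + case["q3"]


cases = [
    {"q1": 3, "q2": 3, "q3": 4},  # first participant in the example data
    {"q1": 1, "q2": 2, "q3": 3},  # hypothetical case, for illustration only
]

# Evaluate the expression once per row and store it as a new variable.
for case in cases:
    case["total"] = compute_total(case)

for case in cases:
    print(case)
```

Computing the new variable this way avoids the arithmetic slips that creep in when totals are added by hand.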
Recoding a Variable: Different Variable
SPSS can create a new variable based upon data from another variable. Say we want to split our participants on the basis of their total score. We want to create a variable called GROUP, which is coded 1 if the total score is low (less than or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recode into Different Variables.
This will bring up the Recode into Different Variables dialog box shown here. Transfer the variable TOTAL to the middle blank. Type "group" in the Name field under Output Variable. Click Change, and the middle blank will show that TOTAL is becoming GROUP, as shown below.
To help keep track of variables that have been recoded, it's a good idea to open the Variable View and enter "Recoded" in the Label column in the TOTAL row. This is especially useful with large datasets which may include many recoded variables.
Click Old and New Values. This will bring up the Recode dialog box. In this example, we have entered a 9 in the Range, value through HIGHEST field and a 2 in the Value field under New Value. When we click Add, the blank on the right displays the recoding formula. Now enter an 8 on the left in the Range, LOWEST through value blank and a 1 in the Value field under New Value. Click Add, then Continue. Click OK. You will be redirected to the data window. A new variable (GROUP) will have been added and coded as 1 or 2, based on TOTAL.
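The two recoding rules can be written as a small function. This Python sketch mirrors the ranges entered above (LOWEST through 8 becomes 1, 9 through HIGHEST becomes 2). The function name recode_total is hypothetical, and returning None for values that match neither range, or that are missing, is an assumption made for this illustration.

```python
def recode_total(total):
    """Recode TOTAL into GROUP: 1 = low (8 or less), 2 = high (9 or more)."""
    if total is None:
        return None  # a missing total stays missing
    if total <= 8:
        return 1
    if total >= 9:
        return 2
    return None  # a value between 8 and 9 matches neither range


for total in [10, 9, 8, 5]:
    print(total, "->", recode_total(total))
```

With whole-number totals every case falls into one of the two groups; the final return matters only if fractional totals are possible.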
Chapter 3
Descriptive Statistics
In Chapter 2, we discussed many of the options available in SPSS for dealing with data. Now we will discuss ways to summarize our data. The procedures used to describe and summarize data are called descriptive statistics.
Section 3.1 Frequency Distributions and Percentile Ranks for a Single Variable
Description
The Frequencies command produces frequency distributions for the specified variables. The output includes the number of occurrences, percentages, valid percentages, and cumulative percentages. The valid percentages and the cumulative percentages comprise only the data that are not designated as missing.
The Frequencies command is useful for describing samples where the mean is not useful (e.g., nominal or ordinal scales). It is also useful as a method of getting the feel of your data. It provides more information than just a mean and standard deviation and can be useful in determining skew and identifying outliers. A special feature of the command is its ability to determine percentile ranks.
Assumptions
Cumulative percentages and percentiles are valid only for data that are measured on at least an ordinal scale. Because the output contains one line for each value of a variable, this command works best on variables with a relatively small number of values.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the sample of a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages of cases in the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentage and/or percentiles can be drawn.
SPSS Data Format
The SPSS data file for obtaining frequency distributions requires only one variable, and that variable can be of any type.
Creating a Frequency Distribution
To run the Frequencies command, click Analyze, then Descriptive Statistics, then Frequencies. (This example uses the CARS.sav data file that comes with SPSS. It is typically located at <C:\Program Files\SPSS\Cars.sav>.)
This will bring up the main dialog box. Transfer the variable for which you would like a frequency distribution into the Variable(s) blank to the right. Be sure that the Display frequency tables option is checked. Click OK to receive your output.
Note that the dialog boxes in newer versions of SPSS show both the type of variable (the icon immediately left of the variable name) and the variable labels if they are entered. Thus, the variable YEAR shows up in the dialog box as Model Year (modulo 100).
Output for a Frequency Distribution
The output consists of two sections. The first section indicates the number of records with valid data for each variable selected. Records with a blank score are listed as missing. In this example, the data file contained 406 records. Notice that the variable label is Model Year (modulo 100).
The second section of the output contains a cumulative frequency distribution for each
variable selected. At the top of the section, the variable label is given. The output
itself consists of five columns. The first column lists the values of the variable in
sorted order. There is a row for each value of your variable, and additional rows are
added at the bottom for the Total and Missing data. The second column gives the frequency
of each value, including missing values. The third column gives the percentage of all
records (including records with missing data) for each value. The fourth column, labeled
Valid Percent, gives the percentage of records (without including records with missing
data) for each value. If there are any missing values, these values will be larger than
the values in column three because the total number of records is reduced by the number
of records with missing values. The final column gives cumulative percentages. Cumulative
percentages indicate the percentage of records with a score equal to or smaller than the
current value. Thus, the last value is always 100%. These values are equivalent to
percentile ranks for the values listed.

Model Year (modulo 100)

                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid     70                    34       8.4             8.4                  8.4
          71                    29       7.1             7.2                 15.6
          72                    28       6.9             6.9                 22.5
          73                    40       9.9             9.9                 32.3
          74                    27       6.7             6.7                 39.0
          75                    30       7.4             7.4                 46.4
          76                    34       8.4             8.4                 54.8
          77                    28       6.9             6.9                 61.7
          78                    36       8.9             8.9                 70.6
          79                    29       7.1             7.2                 77.8
          80                    29       7.1             7.2                 84.9
          81                    30       7.4             7.4                 92.3
          82                    31       7.6             7.7                100.0
          Total                405      99.8           100.0
Missing   0 (Missing)            1        .2
Total                          406     100.0
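The arithmetic behind the five columns can be sketched in a few lines of Python. This is only an illustration of the calculations described above, not the procedure SPSS itself runs; the counts are copied from the frequency table, and the function name frequency_table is our own.

```python
# Counts per model year, copied from the frequency table; one record is missing.
year_counts = {70: 34, 71: 29, 72: 28, 73: 40, 74: 27, 75: 30, 76: 34,
               77: 28, 78: 36, 79: 29, 80: 29, 81: 30, 82: 31}
n_missing = 1


def frequency_table(counts, missing):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent)."""
    n_valid = sum(counts.values())      # records with a valid score
    n_total = n_valid + missing         # all records, including missing
    rows = []
    running = 0
    for value in sorted(counts):        # values listed in sorted order
        freq = counts[value]
        running += freq
        rows.append((value,
                     freq,
                     100 * freq / n_total,      # Percent: missing records in the base
                     100 * freq / n_valid,      # Valid Percent: missing records excluded
                     100 * running / n_valid))  # Cumulative Percent
    return rows


table = frequency_table(year_counts, n_missing)
for value, freq, pct, valid_pct, cum_pct in table:
    print(f"{value:>5} {freq:>9} {pct:>8.1f} {valid_pct:>14.1f} {cum_pct:>19.1f}")
```

Because the Valid Percent column divides by 405 rather than 406, its entries are never smaller than the corresponding Percent entries.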
Determining Percentile Ranks
The Frequencies command can be used to provide a number of descriptive statistics, as
well as a variety of percentile values (including quartiles, cut points, and scores
corresponding to a specific percentile rank).
To obtain either the descriptive or percentile functions of the Frequencies command,
click the Statistics button at the bottom of the main dialog box. This brings up the
Frequencies: Statistics dialog box. Note that the Central Tendency and Dispersion
sections of this box are useful for calculating values, such as the Median or Mode, which
cannot be calculated with the Descriptives command (see Section 3.3).
Check any additional desired statistic by clicking on the blank next to it. For
percentiles, enter the desired percentile rank in the blank to the right of the
Percentile(s) label. Then, click Add to add it to the list of percentiles requested. Once
you have selected all your required statistics, click Continue to return to the main
dialog box. Click OK.
Output for Percentile Ranks
The Statistics dialog box adds on to the previous output from the Frequencies command.
The new section of the output is shown below.
The output contains a row for each piece of information you requested. In the example
above, we checked Quartiles and asked for the 80th percentile. Thus, the output contains
rows for the 25th, 50th, 75th, and 80th percentiles.

Statistics
Model Year (modulo 100)
N             Valid        405
              Missing        1
Percentiles   25         73.00
              50         76.00
              75         79.00
              80         80.00
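To see where such percentile values come from, the following Python sketch rebuilds the 405 valid scores from the frequency table and looks up the requested percentiles. It assumes one common definition of a percentile (the score at position (n + 1) x p / 100 in the sorted data, interpolating between neighbors); whether this matches the exact rule SPSS applies is an assumption, although it reproduces the four values in the output above. The function name percentile is ours.

```python
# Counts per model year, copied from the frequency table in this section.
year_counts = {70: 34, 71: 29, 72: 28, 73: 40, 74: 27, 75: 30, 76: 34,
               77: 28, 78: 36, 79: 29, 80: 29, 81: 30, 82: 31}

# Expand the counts into one sorted list of the 405 valid scores.
values = [year for year in sorted(year_counts) for _ in range(year_counts[year])]


def percentile(sorted_values, p):
    """Score at the p-th percentile, using position (n + 1) * p / 100 (assumed rule)."""
    n = len(sorted_values)
    pos = (n + 1) * p / 100.0           # 1-based position in the sorted scores
    if pos < 1:
        return sorted_values[0]
    if pos >= n:
        return sorted_values[-1]
    lower = int(pos)                    # whole part of the position
    frac = pos - lower                  # fractional part, used to interpolate
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)


for p in (25, 50, 75, 80):
    print(p, percentile(values, p))
```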
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a frequency distribution table for the
mathematics skills scores. Determine the mathematics skills score at which the 60th
percentile lies.
Section 3.2 Frequency Distributions and Percentile Ranks
for Multiple Variables
Description
The Crosstabs command produces frequency distributions for multiple variables. The output
includes the number of occurrences of each combination of levels of each variable. It is
possible to have the command give percentages for any or all variables.
The Crosstabs command is useful for describing samples where the mean is not useful
(e.g., nominal or ordinal scales). It is also useful as a method for getting a feel for
your data.
Assumptions
Because the output contains a row or column for each value of a variable, this command
works best on variables with a relatively small number of values.
SPSS Data Format
The SPSS data file for the Crosstabs command requires two or more variables. Those
variables can be of any type.
Running the Crosstabs Command
This example uses the SAMPLE.sav data file, which you created in Chapter 1. To run the
procedure, click Analyze, then Descriptive Statistics, then Crosstabs. This will bring up
the main Crosstabs dialog box.
The dialog box initially lists all variables on the left and contains two blanks labeled
Row(s) and Column(s). Enter one variable (TRAINING) in the Row(s) box. Enter the second
(WORK) in the Column(s) box. To analyze more than two variables, you would enter the
third, fourth, etc., in the unlabeled area (just under the Layer indicator).
The Cells button allows you to specify percentages and other information to be generated
for each combination of values. Click Cells, and you will get the Cells dialog box.
For the example presented here, check Row, Column, and Total percentages. Then click
Continue. This will return you to the Crosstabs dialog box. Click OK to run the analysis.

TRAINING * WORK Crosstabulation

                                           WORK
                                      No     Part-Time     Total
TRAINING   Yes     Count                1            1         2
                   % within TRAINING   50.0%     50.0%    100.0%
                   % within WORK       50.0%     50.0%     50.0%
                   % of Total          25.0%     25.0%     50.0%
           No      Count                1            1         2
                   % within TRAINING   50.0%     50.0%    100.0%
                   % within WORK       50.0%     50.0%     50.0%
                   % of Total          25.0%     25.0%     50.0%
Total              Count                2            2         4
                   % within TRAINING   50.0%     50.0%    100.0%
                   % within WORK      100.0%    100.0%    100.0%
                   % of Total          50.0%     50.0%    100.0%

Interpreting Crosstabs Output
The output consists of a contingency table. Each level of WORK is given a column. Each
level of TRAINING is given a row. In addition, a row is added for total, and a column is
added for total.
Each cell contains the number of participants (e.g., one participant received no training
and does not work; two participants received no training, regardless of employment
status).
The percentages for each cell are also shown. Row percentages add up to 100%
horizontally. Column percentages add up to 100% vertically. For example, of all the
individuals who had no training, 50% did not work and 50% worked part-time (using the
"% within TRAINING" row). Of the individuals who did not work, 50% had no training and
50% had training (using the "% within WORK" row).
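The counts and the three kinds of percentages in a contingency table can be reproduced with a short Python sketch. The four records below are hypothetical: they are simply one set of (TRAINING, WORK) pairs that yields the counts in the Crosstabs output, and the function name cell_percentages is our own.

```python
from collections import Counter

# Hypothetical records, one (TRAINING, WORK) pair per participant,
# chosen so that the counts match the Crosstabs output above.
records = [("Yes", "No"), ("Yes", "Part-Time"), ("No", "No"), ("No", "Part-Time")]

cell_counts = Counter(records)                            # count per combination
row_totals = Counter(training for training, _ in records) # totals per TRAINING level
col_totals = Counter(work for _, work in records)         # totals per WORK level
n = len(records)


def cell_percentages(training, work):
    """Count and row, column, and total percentages for one cell."""
    count = cell_counts[(training, work)]
    return {
        "count": count,
        "% within TRAINING": 100 * count / row_totals[training],  # row percentage
        "% within WORK": 100 * count / col_totals[work],          # column percentage
        "% of Total": 100 * count / n,
    }


for training in ("Yes", "No"):
    for work in ("No", "Part-Time"):
        print(training, work, cell_percentages(training, work))
```

Row percentages divide by the row total, column percentages by the column total, and total percentages by the number of participants, which is why the same cell count of 1 appears as 50%, 50%, and 25%.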
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a contingency table using the Crosstabs
command. Determine the number of participants in each combination of the variables SEX
and MARITAL. What percentage of participants is married? What percentage of participants
is male and married?
Section 3.3 Measures of Central Tendency and Measures of Dispersion
for a Single Group
Description
Measures of central tendency are values that represent a typical member of the sample or
population. The three primary types are the mean, median, and mode. Measures of
dispersion tell you the variability of your scores. The primary types are the range and
the standard deviation. Together, a measure of central tendency and a measure of
dispersion provide a great deal of information about the entire data set.
We will discuss these measures of central tendency and measures of dispersion in the
context of the Descriptives command. Note that many of these statistics can also be
calculated with several other commands (e.g., the Frequencies or Compare Means commands
are required to compute the mode or median, using options such as the Statistics button
of the Frequencies command).
Assumptions
Each measure of central tendency and measure of dispersion has different assumptions
associated with it. The mean is the most powerful measure of central tendency, and it has
the most assumptions. For example, to calculate a mean, the data must be measured on an
interval or ratio scale. In addition, the distribution should be normally distributed or,
at least, not highly skewed. The median requires at least ordinal data. Because the
median indicates only the middle score (when scores are arranged in order), there are no
assumptions about the shape of the distribution. The mode is the weakest measure of
central tendency. There are no assumptions for the mode.
The standard deviation is the most powerful measure of dispersion, but it, too, has
several requirements. It is a mathematical transformation of the variance (the standard
deviation is the square root of the variance). Thus, if one is appropriate, the other is
also. The standard deviation requires data measured on an interval or ratio scale. In
addition, the distribution should be normal. The range is the weakest measure of
dispersion. To calculate a range, the variable must be at least ordinal. For nominal
scale data, the entire frequency distribution should be presented as a measure of
dispersion.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus,
when reporting a mean, you should also report a standard deviation. When presenting a
median, you should also state the range or interquartile range.
SPSS Data Format
Only one variable is required.
Running the Command
The Descriptives command will be the command you will most likely use for obtaining
measures of central tendency and measures of dispersion. This example uses the SAMPLE.sav
data file we have used in the previous chapters.
To run the command, click Analyze, then Descriptive Statistics, then Descriptives. This
will bring up the main dialog box for the Descriptives command. Any variables you would
like information about can be placed in the right blank by double-clicking them or by
selecting them, then clicking on the arrow.
By default, you will receive the N (number of cases/participants), the minimum value, the
maximum value, the mean, and the standard deviation. Note that some of these may not be
appropriate for the type of data you have selected.
If you would like to change the default statistics that are given, click Options in the
main dialog box. You will be given the Options dialog box.
Reading the Output
The output for the Descriptives command is quite straightforward. Each type of output
requested is presented in a column, and each variable is given in a row. The output
presented here is for the sample data file. It shows that we have one variable (GRADE)
and that we obtained the N, minimum, maximum, mean, and standard deviation for this
variable.
Descriptive Statistics

                       N   Minimum   Maximum      Mean   Std. Deviation
grade                  4     73.00     85.00   80.2500          5.25198
Valid N (listwise)     4
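The default Descriptives statistics are easy to check by hand or with a few lines of Python. The sketch below assumes the four GRADE values are 85, 80, 83, and 73, which is our reading of the single-participant cells in the Means report in Section 3.4; it uses the sample standard deviation (n - 1 in the denominator), which reproduces the value in the output.

```python
import statistics

# Assumed GRADE values for the four participants in SAMPLE.sav
# (read from the single-participant cells of the Section 3.4 report).
grades = [85, 80, 83, 73]

summary = {
    "N": len(grades),
    "Minimum": min(grades),
    "Maximum": max(grades),
    "Mean": statistics.mean(grades),
    "Std. Deviation": statistics.stdev(grades),  # sample SD, n - 1 denominator
}

for name, value in summary.items():
    print(f"{name:<15}{value}")
```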
Practice Exercise
Using Practice Data Set 1 in Appendix B, obtain the descriptive statistics for the age of
the participants. What is the mean? The median? The mode? What is the standard deviation?
Minimum? Maximum? The range?
Section 3.4 Measures of Central Tendency and Measures of Dispersion
for Multiple Groups
Description
The measures of central tendency discussed earlier are often needed not only for the
entire data set, but also for several subsets. One way to obtain these values for subsets
would be to use the data-selection techniques discussed in Chapter 2 and apply the
Descriptives command to each subset. An easier way to perform this task is to use the
Means command. The Means command is designed to provide descriptive statistics for
subsets of your data.
Assumptions
The assumptions discussed in the section on Measures of Central Tendency and Measures of
Dispersion for a Single Group (Section 3.3) also apply to multiple groups.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus,
when giving a mean, you should also report a standard deviation. When presenting a
median, you should also state the range or interquartile range.
SPSS Data Format
Two variables in the SPSS data file are required. One represents the dependent variable
and will be the variable for which you receive the descriptive statistics. The other is
the independent variable and will be used in creating the subsets. Note that while SPSS
calls this variable an independent variable, it may not meet the strict criteria that
define a true independent variable (e.g., treatment manipulation). Thus, some SPSS
procedures refer to it as the grouping variable.
Running the Command
This example uses the SAMPLE.sav data file you created in Chapter 1. The Means command is
run by clicking Analyze, then Compare Means, then Means.
This will bring up the main dialog box for the Means command. Place the selected variable
in the blank field labeled Dependent List.
Place the grouping variable in the box labeled Independent List. In this example, through
use of the SAMPLE.sav data file, measures of central tendency and measures of dispersion
for the variable GRADE will be given for each level of the variable MORNING.
By default, the mean, number of cases, and standard deviation are given. If you would
like additional measures, click Options and you will be presented with the Options dialog
box. You can opt to include any number of measures.
Reading the Output
The output for the Means command is split into two sections. The first section, called a
case processing summary, gives information about the data used. In our sample data file,
there are four students (cases), all of whom were included in the analysis.
Case Processing Summary

                                        Cases
                     Included          Excluded           Total
                     N    Percent      N    Percent       N    Percent
grade * morning      4    100.0%       0    .0%           4    100.0%
The second section of the output is the report from the Means command.

Report
GRADE
MORNING       Mean   N   Std. Deviation
No         82.5000   2          3.53553
Yes        78.0000   2          7.07107
Total      80.2500   4          5.25198

This report lists the name of the dependent variable at the top (GRADE). Every level of
the independent variable (MORNING) is shown in a row in the table. In this example, the
levels are 0 and 1, labeled No and Yes. Note that if a variable is labeled, the labels
will be used instead of the raw values.
The summary statistics given in the report correspond to the data, where the level of the
independent variable is equal to the row heading (e.g., No, Yes). Thus, two participants
were included in each row.
An additional row is added, named Total. That row contains the combined data, and the
values are the same as they would be if we had run the Descriptives command for the
variable GRADE.
Extension to More Than One Independent Variable
If you have more than one independent variable, SPSS can break down the output even
further. Rather than adding more variables to the Independent List section of the dialog
box, you need to add them in a different layer. Note that SPSS indicates with which layer
you are working.
If you click Next, you will be presented with Layer 2 of 2, and you can select a second
independent variable (e.g., TRAINING). Now, when you run the command (by clicking OK),
you will be given summary statistics for the variable GRADE by each level of MORNING and
TRAINING.
Your output will look like the output below. You now have two main sections (No and Yes),
along with the Total. Now, however, each main section is broken down into subsections
(No, Yes, and Total).

Report
GRADE
MORNING   TRAINING      Mean   N   Std. Deviation
No        Yes        85.0000   1                .
          No         80.0000   1                .
          Total      82.5000   2          3.53553
Yes       Yes        83.0000   1                .
          No         73.0000   1                .
          Total      78.0000   2          7.07107
Total     Yes        84.0000   2          1.41421
          No         76.5000   2          4.94975
          Total      80.2500   4          5.25198

The variable you used in Level 1 (MORNING) is the first one listed, and it defines the
main sections. The variable you had in Level 2 (TRAINING) is listed second. Thus, the
first row represents those participants who were not morning people and who received
training. The second row represents participants who were not morning people and did not
receive training. The third row represents the total for all participants who were not
morning people.
Notice that standard deviations are not given for all of the rows. This is because there
is only one participant per cell in this example. One problem with using many subsets is
that it increases the number of participants required to obtain meaningful results. See a
research design text or your instructor for more details.
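The breakdown that the Means command performs can be illustrated with a small Python sketch. The four cases below are taken from the layered report (each single-participant cell gives one participant's grade); the helper names describe and group_by are our own. The sketch also shows why no standard deviation appears for cells with one participant: the sample standard deviation is undefined for a single score.

```python
import statistics
from collections import defaultdict

# (MORNING, TRAINING, GRADE) for the four sample participants,
# read from the single-participant cells of the layered report.
cases = [("No", "Yes", 85), ("No", "No", 80), ("Yes", "Yes", 83), ("Yes", "No", 73)]


def describe(values):
    """Mean, N, and sample standard deviation (None when fewer than two scores)."""
    sd = statistics.stdev(values) if len(values) > 1 else None
    return statistics.mean(values), len(values), sd


def group_by(cases, key):
    """Collect GRADE values per group and summarize each group."""
    groups = defaultdict(list)
    for case in cases:
        groups[key(case)].append(case[2])
    return {name: describe(values) for name, values in groups.items()}


by_morning = group_by(cases, key=lambda c: c[0])       # one independent variable
by_both = group_by(cases, key=lambda c: (c[0], c[1]))  # MORNING layered with TRAINING

print(by_morning)
print(by_both)
```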
Practice Exercise
Using Practice Data Set 1 in Appendix B, compute the mean and standard deviation of ages
for each value of marital status. What is the average age of the married participants?
The single participants? The divorced participants?
Section 3.5 Standard Scores
Description
Standard scores allow the comparison of different scales by transforming the scores into
a common scale. The most common standard score is the z-score. A z-score is based on a
standard normal distribution (i.e., a mean of 0 and a standard deviation of 1). A
z-score, therefore, represents the number of standard deviations above or below the mean
(e.g., a z-score of -1.5 represents a score 1.5 standard deviations below the mean).
Assumptions
Z-scores are based on the standard normal distribution. Therefore, the distributions that
are converted to z-scores should be normally distributed, and the scales should be either
interval or ratio.
Drawing Conclusions
Conclusions based on z-scores consist of the number of standard deviations above or below
the mean. For example, a student scores 85 on a mathematics exam in a class that has a
mean of 70 and standard deviation of 5. The student's test score is 15 points above the
class mean (85 - 70 = 15). The student's z-score is 3 because she scored 3 standard
deviations above the mean (15 / 5 = 3). If the same student scores 90 on a reading exam,
with a class mean of 80 and a standard deviation of 10, the z-score will be 1.0 because
she is one standard deviation above the mean. Thus, even though her raw score was higher
on the reading test, she actually did better in relation to other students on the
mathematics test because her z-score was higher on that test.
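The calculation in this example is simply the raw score minus the mean, divided by the standard deviation. A minimal Python sketch of the two comparisons follows; the function name z_score is our own.

```python
def z_score(raw_score, mean, standard_deviation):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw_score - mean) / standard_deviation


math_z = z_score(85, mean=70, standard_deviation=5)       # (85 - 70) / 5
reading_z = z_score(90, mean=80, standard_deviation=10)   # (90 - 80) / 10

print("Mathematics z-score:", math_z)
print("Reading z-score:", reading_z)
print("Better relative performance on mathematics:", math_z > reading_z)
```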
SPSS Data Format
Calculating z-scores requires only a single variable in SPSS. That variable must be
numerical.
Running the Command
Computing z-scores is a component of the Descriptives command. To access it, click
Analyze, then Descriptive Statistics, then Descriptives. This example uses the sample
data file (SAMPLE.sav) created in Chapters 1 and 2.
This will bring up the standard dialog box for the Descriptives command. Notice the
checkbox in the bottom-left corner labeled Save standardized values as variables. Check
this box and move the variable GRADE into the right-hand blank. Then click OK to complete
the analysis. You will be presented with the standard output from the Descriptives
command. Notice that the z-scores are not listed. They were inserted into the data window
as a new variable.
Switch to the Data View window and examine your data file. Notice that a new variable,
called ZGRADE, has been added. When you asked SPSS to save standardized values, it
created a new variable with the same name as your old variable preceded by a Z. The
z-score is computed for each case and placed in the new variable.
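What ends up in a variable like ZGRADE can be sketched in Python as follows. Two assumptions are made here: that the four GRADE values are 85, 80, 83, and 73 (our reading of the Section 3.4 report), and that the standardization uses the sample standard deviation reported by the Descriptives command.

```python
import statistics

grades = [85, 80, 83, 73]        # assumed GRADE values from SAMPLE.sav
mean = statistics.mean(grades)
sd = statistics.stdev(grades)    # sample SD (n - 1), assumed to be the one used

# One standardized value per case, in the same order as the original scores.
zgrade = [(grade - mean) / sd for grade in grades]

for grade, z in zip(grades, zgrade):
    print(f"GRADE {grade}  ZGRADE {z:.5f}")
```

A standardized variable always has a mean of 0 and a standard deviation of 1, which is a quick way to check the new column.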
Reading the Output
After you conducted your analysis, the new variable was created. You can perform any
number of subsequent analyses on the new variable.
Practice Exercise
Using Practice Data Set 2 in Appendix B, determine the z-score that corresponds to each
employee's salary. Determine the mean z-scores for salaries of male employees and female
employees. Determine the mean z-score for salaries of the total sample.
Chapter 4
Graphing Data
Section 4.1 Graphing Basics
In addition to the frequency distributions, the measures of central tendency and measures
of dispersion discussed in Chapter 3, graphing is a useful way to summarize, organize,
and reduce your data. It has been said that a picture is worth a thousand words. In the
case of complicated data sets, this is certainly true.
With Version 15.0 of SPSS, it is now possible to make publication-quality graphs using
only SPSS. One important advantage of using SPSS to create your graphs instead of other
software (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus,
duplication is eliminated, and the chance of making a transcription error is reduced.
Section 4.2 The New SPSS Chart Builder
Data Set
For the graphing examples, we will use a new set of data. Enter the data below by
defining the three subject variables in the Variable View window: HEIGHT (in inches),
WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you create the variables,
designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure (in the
far-right column of the Variable View). Switch to the Data View to enter the data values
for the 16 participants. Now use the Save As command to save the file, naming it
HEIGHT.sav.

HEIGHT   WEIGHT   SEX
    66      150     1
    69      155     1
    73      160     1
    72      160     1
    68      150     1
    63      140     1
    74      165     1
    70      150     1
    66      110     2
    64      100     2
    60       95     2
    67      110     2
    64      105     2
    63      100     2
    67      110     2
    65      105     2
Make sure you have entered the data correctly by calculating a mean for each of the three
variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare your
results with those in the table below.
Descriptive Statistics

                       N   Minimum   Maximum       Mean   Std. Deviation
HEIGHT                16     60.00     74.00    66.9375           3.9067
WEIGHT                16     95.00    165.00   129.0625          26.3451
SEX                   16      1.00      2.00     1.5000            .5164
Valid N (listwise)    16
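If you want an independent check of your data entry outside SPSS, the same five statistics can be computed with a short Python sketch. The three lists reproduce the HEIGHT.sav values given in this section; the sample standard deviation (n - 1) is used to match the table.

```python
import statistics

# The 16 cases from HEIGHT.sav, in the order listed in this section.
height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weight = [150, 155, 160, 160, 150, 140, 165, 150, 110, 100, 95, 110, 105, 100, 110, 105]
sex = [1] * 8 + [2] * 8    # 1 = male, 2 = female

for name, values in (("HEIGHT", height), ("WEIGHT", weight), ("SEX", sex)):
    print(f"{name:<8}"
          f"N={len(values):<4}"
          f"Min={min(values):<7}"
          f"Max={max(values):<7}"
          f"Mean={statistics.mean(values):<10.4f}"
          f"SD={statistics.stdev(values):.4f}")
```

If any printed value differs from the table, recheck the corresponding column for a typing error.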
Chart Builder Basics
Make sure that the HEIGHT.sav data file you created above is open. In order to use the
chart builder, you must have a data file open.
New with Version 15.0 of SPSS is the Chart Builder command. This command is accessed
using Graphs, then Chart Builder in the submenu. This is a very versatile new command
that can make graphs of excellent quality.
When you first run the Chart Builder command, you will probably be presented with the
following dialog box:

    Before you use this dialog, measurement level should be set properly for each
    variable in your chart. In addition, if your chart contains categorical variables,
    value labels should be defined for each category.
    Press OK to define your chart.
    Press Define Variable Properties to set measurement level or define value labels for
    chart variables.

This dialog box is asking you to ensure that your variables are properly defined. Refer
to Sections 1.3 and 2.1 if you had difficulty defining the variables used in creating the
data set for this example, or to refresh your knowledge of this topic. Click OK.
The Chart Builder allows you to make any kind of graph that is normally used in
publication or presentation, and much of it is beyond the scope of this text. This text,
however, will go over the basics of the Chart Builder so that you can understand its
mechanics.
On the left side of the Chart Builder window are the four main tabs that let you control
the graphs you are making. The first one is the Gallery tab. The Gallery tab allows you
to choose the basic format of your graph.
For example, selecting Bar in the Gallery shows the different kinds of bar charts that
the Chart Builder can create.
After you have selected the basic form of graph that you want using the Gallery tab, you
simply drag the image from the bottom right of the window up to the main window at the
top (where it reads, "Drag a Gallery chart here to use it as your starting point").
Alternatively, you can use the Basic Elements tab to drag a coordinate system (labeled
Choose Axes) to the top window, then drag variables and elements into the window.
The other tabs (Groups/Point ID and Titles/Footnotes) can be used for adding other
standard elements to your graphs.
The examples in this text will cover some of the basic types of graphs you can make with
the Chart Builder. After a little experimentation on your own, once you have mastered the
examples in the chapter, you will soon gain a full understanding of the Chart Builder.
Section 4.3 Bar Charts, Pie Charts, and Histograms
Description
Bar charts, pie charts, and histograms represent the number of times each score occurs
through the varying heights of bars or sizes of pie pieces. They are graphical
representations of the frequency distributions discussed in Chapter 3.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the
sample with a particular value and the percentage of cases with that value. Thus,
conclusions drawn should relate only to describing the numbers or percentages for the
sample. If the data are at least ordinal in nature, conclusions regarding the cumulative
percentages and/or percentiles can also be drawn.
SPSS Data Format
You need only one variable to use this command.
Running the Command
The Frequencies command will produce graphical frequency distributions. Click Analyze,
then Descriptive Statistics, then Frequencies. You will be presented with the main dialog
box for the Frequencies command, where you can enter the variables for which you would
like to create graphs or charts. (See Chapter 3 for other options with this command.)
Click the Charts button at the bottom to produce frequency distributions. This will give
you the Charts dialog box.
There are three types of charts available with this command: Bar charts, Pie charts, and
Histograms. For each type, the Y axis can be either a frequency count or a percentage
(selected with the Chart Values option).
You will receive the charts for any variables selected in the main Frequencies command
dialog box.
Output
The bar chart consists of a Y axis, representing the frequency, and an X axis,
representing each score. Note that the only values represented on the X axis are those
values with nonzero frequencies (61, 62, and 71 are not represented).
The pie chart shows the percentage of the whole that is represented by each value.
The Histogram command creates a grouped frequency distribution. The range of scores is
split into evenly spaced groups. The midpoint of each group is plotted on the X axis, and
the Y axis represents the number of scores for each group.
If you select With Normal Curve, a normal curve will be superimposed over the
distribution. This is very useful in determining if the distribution you have is
approximately normal. The distribution represented here is clearly not normal due to the
asymmetry of the values.
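The difference between the bar chart (one bar per observed score) and the histogram (one bar per evenly spaced group) can be sketched in Python using the HEIGHT values. The group width of 5 inches and the starting point of 60 are illustrative assumptions only; SPSS chooses its own grouping. The function name grouped_frequencies is ours.

```python
from collections import Counter

# HEIGHT values from HEIGHT.sav (Section 4.2).
height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]

# Bar chart: one frequency per score that actually occurs in the data.
bar_counts = Counter(height)


def grouped_frequencies(values, start, width):
    """Histogram: count the scores in evenly spaced groups, keyed by group midpoint."""
    counts = Counter()
    for value in values:
        index = int((value - start) // width)          # which group the score falls in
        midpoint = start + index * width + width / 2   # value plotted on the X axis
        counts[midpoint] += 1
    return dict(sorted(counts.items()))


histogram = grouped_frequencies(height, start=60, width=5)  # width is an assumption

print("Scores with nonzero frequency:", sorted(bar_counts))
print("Grouped frequencies by midpoint:", histogram)
```

Scores that never occur (61, 62, and 71) are absent from the bar chart counts, while the histogram still covers the whole range through its groups.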
Practice Exercise
Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a
histogram that represents the mathematics skills scores and displays a normal curve, and
a bar chart that represents the frequencies for the variable AGE.
Section 4.4 Scatterplots
Description
Scatterplots (also called scattergrams or scatter diagrams) display two values for each
case with a mark on the graph. The X axis represents the value for one variable. The Y
axis represents the value for the second variable.
Assumptions
Both variables should be interval or ratio scales. If nominal or ordinal data are used,
be cautious about your interpretation of the scattergram.
SPSS Data Format
You need two variables to perform this command.
Running the Command
You can produce scatterplots by clicking Graphs, then Chart Builder. (Note: You can also
use the Legacy Dialogs. For this method, please see Appendix F.)
In the Gallery tab, under Choose from, select Scatter/Dot. Then drag the Simple Scatter
icon (top left) up to the main chart area. Disregard the Element Properties window that
pops up by choosing Close.
Next, drag the HEIGHT variable to the X-Axis area, and the WEIGHT variable to the Y-Axis
area. (Remember that standard graphing conventions indicate that dependent variables
should be Y and independent variables should be X. This would mean that we are trying to
predict weights from heights.) Note that at this point your actual data are not shown in
the chart preview, just a set of dummy values.
Click OK. You should get your new graph as output.
Output
The output will consist of a mark for each participant at the appropriate X and Y levels.
Adding a Third Variable
Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To
make it do so, select the Groups/Point ID tab in the Chart Builder. Click the
Grouping/stacking variable option. Again, disregard the Element Properties window that
pops up. Next, drag the variable SEX into the upper-right corner where it indicates Set
Color. If you are not able to drag the variable SEX, it may be because it is not
identified as nominal or ordinal in the Variable View window.
Click OK to have SPSS produce the graph.
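Conceptually, setting a color variable splits the cases into one series of (X, Y) points per category. The Python sketch below shows that split for the HEIGHT.sav data; it only prepares the two series and does not draw anything, and the labels dictionary is our own shorthand for the value labels of SEX.

```python
# The 16 cases from HEIGHT.sav (Section 4.2).
height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weight = [150, 155, 160, 160, 150, 140, 165, 150, 110, 100, 95, 110, 105, 100, 110, 105]
sex = [1] * 8 + [2] * 8

labels = {1: "male", 2: "female"}   # value labels for SEX

# One list of (HEIGHT, WEIGHT) points per level of the grouping variable.
series = {}
for h, w, s in zip(height, weight, sex):
    series.setdefault(labels[s], []).append((h, w))

for label, points in series.items():
    print(label, points)
```

Each series would then be drawn with its own color (or, after editing, its own marker shape).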
Now our output will have two different sets of marks. One set represents the male
participants, and the second set represents the female participants. These two sets will
appear in two different colors on your screen. You can use the SPSS chart editor (see
Section 4.6) to make them different shapes.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the
relationship between SALARY and EDUCATION.
Section 4.5 Advanced Bar Charts
Description
Bar charts can be produced with the Frequencies command (see Section 4.3). Sometimes,
however, we are interested in a bar chart where the Y axis is not a frequency. To produce
such a chart, we need to use the Bar charts command.
SPSS Data Format
You need at least two variables to perform this command. There are two basic kinds of bar
charts: those for between-subjects designs and those for repeated-measures designs. Use
the between-subjects method if one variable is the independent variable and the other is
the dependent variable. Use the repeated-measures method if you have a dependent variable
for each value of the independent variable (e.g., you would have three variables for a
design with three values of the independent variable). This normally occurs when you make
multiple observations over time.
This example uses the GRADES.sav data file, which will be created in Chapter 6. Please
see Section 6.4 for the data if you would like to follow along.
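To make the idea of a bar chart whose Y axis is not a frequency concrete, here is a small Python sketch of a between-subjects case. Because the GRADES.sav data are not given in this chapter, the sketch instead uses the HEIGHT.sav data from Section 4.2, with SEX as the independent variable and WEIGHT as the dependent variable; the text-based bars (one # per 5 pounds) are only a stand-in for a real chart.

```python
import statistics

# WEIGHT and SEX from HEIGHT.sav (Section 4.2); 1 = male, 2 = female.
weight = [150, 155, 160, 160, 150, 140, 165, 150, 110, 100, 95, 110, 105, 100, 110, 105]
sex = [1] * 8 + [2] * 8

# Bar height = mean of the dependent variable within each level of the independent variable.
means = {
    label: statistics.mean([w for w, s in zip(weight, sex) if s == code])
    for code, label in ((1, "male"), (2, "female"))
}

for label, mean_weight in means.items():
    bar = "#" * round(mean_weight / 5)     # one # per 5 pounds, for display only
    print(f"{label:<7}{bar} {mean_weight:.3f}")
```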
Running the Command
Open the Chart Builder by clicking Graphs, then Chart Builder. In the Gallery tab, select
Bar. If you had only one independent variable, you would select the Simple Bar chart
example (top left corner). If you have more than one independent variable (as in this
example), select the Clustered Bar Chart example from the middle of the top row.
Drag the example to the top working area. (Note that you will need to open the data file
you would like to graph in order to run this command.)
If you are using a repeated-measures design like our example here using
GRADES.sav from Chapter 6 (three different variables representing the Y values that we
want), you need to select all three variables (you can <Ctrl>-click them to select multiple
variables) and then drag all three variable names to the Y-Axis area. When you do, you will
be given the warning message above. Click OK.
Next, you will need to drag the INSTRUCT variable to the top right in the
Cluster: set color area (see screenshot at left).
Note: The Chart Builder pays attention to the types of variables that you ask it to
graph. If you are getting error messages or unusual results, be sure that your categorical
variables are properly designated as Nominal in the Variable View tab (see Chapter 2,
Section 2.1).
Output
[Figure: clustered bar chart produced by the command]
Practice Exercise
Use Practice Data Set 1 in Appendix B. Construct a clustered bar graph examining
the relationship between MATHEMATICS SKILLS scores (as the dependent variable)
and MARITAL STATUS and SEX (as independent variables). Make sure you classify
both SEX and MARITAL STATUS as nominal variables.
Section 4.6 Editing SPSS Graphs
Whatever command you use to create your graph, you will probably want to do
some editing to make it appear exactly as you want it to look. In SPSS, you do this in
much the same way that you edit graphs in other software programs (e.g., Excel). After
your graph is made, in the output window, select your graph (this will create handles
around the outside of the entire object) and right-click. Then, click SPSS Chart Object,
and click Open. Alternatively, you can double-click on the graph to open it for editing.
When you open the graph, the Chart Editor window and the corresponding
Properties window will appear.
Once Chart Editor is open, you can easily edit each element of the graph. To select
an element, just click on the relevant spot on the graph. For example, if you have added a
title to your graph ("Histogram" in the example that follows), you may select the element
representing the title of the graph by clicking anywhere on the title.
Once you have selected an element, you can tell whether the correct element is
selected because it will have handles around it.
If the item you have selected is a text element (e.g., the title of the graph), a cursor
will be present and you can edit the text as you would in a word processing program. If
you would like to change another attribute of the element (e.g., the color or font size), use
the Properties box. (Text properties are shown below.)
With a little practice, you can make excellent graphs using SPSS. Once your graph
is formatted the way you want it, simply select File, Save, then Close.
Chapter 5
Prediction and Association
Section 5.1 Pearson Correlation Coefficient
Description
The Pearson correlation coefficient (sometimes called the Pearson product-moment
correlation coefficient or simply the Pearson r) determines the strength of the linear
relationship between two variables.
Assumptions
Both variables should be measured on interval or ratio scales (or a dichotomous
nominal variable). If a relationship exists between them, that relationship should be linear.
Because the Pearson correlation coefficient is computed with z-scores, both variables
should also be normally distributed. If your data do not meet these assumptions, consider
using the Spearman rho correlation coefficient instead.
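The remark that the Pearson coefficient is computed with z-scores can be illustrated outside SPSS. The following Python sketch is only an illustration and is not how SPSS is implemented; the function name pearson_r and the sample data are hypothetical and are not taken from HEIGHT.sav. It converts each variable to z-scores and averages the products of the paired z-scores.

```python
import math


def pearson_r(x, y):
    """Pearson r computed as the mean product of paired z-scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population standard deviations (divide by n), so that r is simply
    # the mean of the z-score products.
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variance")
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / n


# Hypothetical heights (inches) and weights (pounds), for illustration only.
heights = [60, 62, 65, 68, 70]
weights = [110, 120, 140, 155, 170]
r = pearson_r(heights, weights)
print(round(r, 3))
```

Every subject must have a value on both variables, which is why the sketch rejects lists of unequal length.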
SPSS Data Format
Two variables are required in your SPSS data file. Each subject must have data for
both variables.
Running the Command
To select the Pearson correlation coefficient, click Analyze, then Correlate, then
Bivariate (bivariate refers to two variables). This will bring up the Bivariate Correlations
dialog box. This example uses the HEIGHT.sav data file entered at the start of Chapter 4.
Move at least two variables from the box at left into the box at right by using the
transfer arrow (or by double-clicking each variable). Make sure that a check is in the
Pearson box under Correlation Coefficients. It is acceptable to move more than two
variables.
For our example, we will move all three variables over and click OK.
Reading the Output
The output consists of a correlation matrix. Every variable you entered in the
command is represented as both a row and a column. We entered three variables in our
command. Therefore, we have a 3 x 3 table. There are also three rows in each cell: the
correlation, the significance level, and the N. If a correlation is significant at less than the
.05 level, a single * will appear next to the correlation. If it is significant at the .01 level or
lower, ** will appear next to the correlation. For example, the correlation in the example
output has a significance level of < .001, so it is flagged with ** to indicate that it is less
than .01.
To read the correlations, select a row and a column. For example, the correlation
between height and weight is determined through selection of the WEIGHT row and the
HEIGHT column (.806). We get the same answer by selecting the HEIGHT row and the
WEIGHT column. The correlation between a variable and itself is always 1, so there is a
diagonal set of 1s.
Correlations
                               height     weight     sex
height   Pearson Correlation   1          .806**     -.644**
         Sig. (2-tailed)                  .000       .007
         N                     16         16         16
weight   Pearson Correlation   .806**     1          -.968**
         Sig. (2-tailed)       .000                  .000
         N                     16         16         16
sex      Pearson Correlation   -.644**    -.968**    1
         Sig. (2-tailed)       .007       .000
         N                     16         16         16
**. Correlation is significant at the 0.01 level (2-tailed).
Drawing Conclusions
The correlation coefficient will be between -1.0 and +1.0. Coefficients close to 0.0
represent a weak relationship. Coefficients close to 1.0 or -1.0 represent a strong
relationship. Generally, correlations greater than 0.7 are considered strong. Correlations
less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
Significant correlations are flagged with asterisks. A significant correlation
indicates a reliable relationship, but not necessarily a strong correlation. With enough
participants, a very small correlation can be significant. Please see Appendix A for a
discussion of effect sizes for correlations.
Phrasing a Significant Result
In the example above, we obtained a correlation of .806 between HEIGHT and
WEIGHT. A correlation of .806 is a strong positive correlation, and it is significant at the
.001 level. Thus, we could state the following in a results section:
    A Pearson correlation coefficient was calculated for the relationship between
    participants' height and weight. A strong positive correlation was found
    (r(14) = .806, p < .001), indicating a significant linear relationship between
    the two variables. Taller participants tend to weigh more.
The conclusion states the direction (positive), strength (strong), value (.806),
degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a
statement of direction is included (taller is heavier).
Note that the degrees of freedom given in parentheses is 14. The output indicates an
N of 16. While most SPSS procedures give degrees of freedom, the correlation command
gives only the N (the number of pairs). For a correlation, the degrees of freedom is N - 2.
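The reporting conventions described in this section (degrees of freedom of N - 2, the strength labels, and reporting a displayed significance of .000 as p < .001) can be collected in a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, and the p value of .0001 passed in for the height and weight example is an assumed stand-in, because SPSS displays only .000.

```python
def describe_strength(r):
    """Label a correlation using the cutoffs given in this section."""
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size < 0.3:
        return "weak"
    return "moderate"


def format_p(p):
    """Report p the way the example results sections do."""
    if p < 0.001:
        return "p < .001"
    if p < 0.05:
        return "p < .05"
    return "p > .05"


def drop_leading_zero(value):
    """Format a coefficient to three decimals without the leading zero."""
    text = f"{value:.3f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def correlation_report(r, n, p):
    """Build a string such as 'r(14) = .806, p < .001 (strong)'."""
    df = n - 2  # degrees of freedom for a correlation
    return f"r({df}) = {drop_leading_zero(r)}, {format_p(p)} ({describe_strength(r)})"


# r and N come from the example output; p = 0.0001 is an assumed value.
print(correlation_report(0.806, 16, 0.0001))
```

The strength label is appended only for convenience; in a results section it would be worked into the sentence, as in the example above.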
Phrasing ResultsThat Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a
correlation between ID and GRADE. If so, we get the output at right. The correlation has a
significance level of .783. Thus, we could write the following in a results section (note that
the degrees of freedom is N - 2):
    A Pearson correlation was calculated examining the relationship between
    participants' ID numbers and grades. A weak correlation that was not
    significant was found (r(2) = .217, p > .05). ID number is not related to grade
    in the course.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the value of the Pearson
correlation coefficient for the relationship between SALARY and YEARS OF EDUCATION.
Section 5.2 Spearman Correlation Coefficient
Description
The Spearman correlation coefficient determines the strength of the relationship
between two variables. It is a nonparametric procedure. Therefore, it is weaker than the
Pearson correlation coefficient, but it can be used in more situations.
Assumptions
Because the Spearman correlation coefficient functions on the basis of the ranks of
data, it requires ordinal (or interval or ratio) data for both variables. They do not need to
be normally distributed.
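To show what "functions on the basis of the ranks of data" means, the Python sketch below ranks each variable (tied values share the average of their ranks) and then computes an ordinary Pearson correlation on the ranks. This is a common way to define Spearman's rho; it is offered as an illustration only and is not a description of SPSS internals. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson r from sums of squares; undefined if either list is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A relationship that always increases but is not linear still gives rho = 1.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```

Because only the ordering of the scores matters, the shape of the distributions does not, which is why normality is not assumed.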
Correlations
                              ID        GRADE
ID      Pearson Correlation   1.000     .217
        Sig. (2-tailed)                 .783
        N                     4         4
GRADE   Pearson Correlation   .217      1.000
        Sig. (2-tailed)       .783
        N                     4         4
SPSS Data Format
Two variables are required in your SPSS data file. Each subject must provide data
for both variables.
Running the Command
Click Analyze, then Correlate, then Bivariate. This will bring up the main dialog
box for Bivariate Correlations (just like the Pearson correlation). About halfway down
the dialog box, there is a section for indicating the type of correlation you will compute.
You can select as many correlations as you want. For our example, remove the check in
the Pearson box (by clicking on it) and click on the Spearman box.
Use the variables HEIGHT and WEIGHT from our HEIGHT.sav data file
(Chapter 4). This is also one of the few commands that allows you to choose a one-tailed
test, if desired.
Reading the Output
The output is essentially the same as for the Pearson correlation. Each pair of
variables has its correlation coefficient indicated twice. The Spearman rho can range from
-1.0 to +1.0, just like the Pearson r.
Correlations
                                                   HEIGHT    WEIGHT
Spearman's rho   HEIGHT   Correlation Coefficient  1.000     .883**
                          Sig. (2-tailed)                    .000
                          N                        16        16
                 WEIGHT   Correlation Coefficient  .883**    1.000
                          Sig. (2-tailed)          .000
                          N                        16        16
**. Correlation is significant at the .01 level (2-tailed).
The output listed above indicates a correlation of .883 between HEIGHT and
WEIGHT. Note the significance level of .000, shown in the "Sig. (2-tailed)" row. This is,
in fact, a significance level of < .001. The actual alpha level rounds out to .000, but it is
not zero.
Drawing Conclusions
The correlation will be between -1.0 and +1.0. Scores close to 0.0 represent a weak
relationship. Scores close to 1.0 or -1.0 represent a strong relationship. Significant
correlations are flagged with asterisks. A significant correlation indicates a reliable
relationship, but not necessarily a strong correlation. With enough participants, a very small
correlation can be significant. Generally, correlations greater than 0.7 are considered
strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7
are considered moderate.
Phrasing Results That Are Significant
In the example above, we obtained a correlation of .883 between HEIGHT and
WEIGHT. A correlation of .883 is a strong positive correlation, and it is significant at the
.001 level. Thus, we could state the following in a results section:
    A Spearman rho correlation coefficient was calculated for the relationship
    between participants' height and weight. A strong positive correlation was
    found (rho(14) = .883, p < .001), indicating a significant relationship
    between the two variables. Taller participants tend to weigh more.
The conclusion states the direction (positive), strength (strong), value (.883),
degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a
statement of direction is included (taller is heavier). Note that the degrees of freedom given
in parentheses is 14. The output indicates an N of 16. For a correlation, the degrees of
freedom is N - 2.
Phrasing ResultsThat Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a
Spearman rho correlation between ID and GRADE. If so, we would get the output below.
Correlations
                                                  ID        GRADE
Spearman's rho   ID      Correlation Coefficient  1.000     .000
                         Sig. (2-tailed)                    1.000
                         N                        4         4
                 GRADE   Correlation Coefficient  .000      1.000
                         Sig. (2-tailed)          1.000
                         N                        4         4
The correlation coefficient equals .000 and has a significance level of 1.000. Note
that though this value is rounded up and is not, in fact, exactly 1.000, we could state the
following in a results section:
    A Spearman rho correlation coefficient was calculated for the relationship
    between a subject's ID number and grade. An extremely weak correlation
    that was not significant was found (rho(2) = .000, p > .05). ID number is not
    related to grade in the course.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the strength of the relationship
between salary and job classification by calculating the Spearman rho correlation.
Section 5.3 Simple Linear Regression
Description
Simple linear regression allows the prediction of one variable from another.
Assumptions
Simple linear regression assumes that both variables are interval- or ratio-scaled.
In addition, the dependent variable should be normally distributed around the prediction
line. This, of course, assumes that the variables are related to each other linearly.
Typically, both variables should be normally distributed. Dichotomous variables (variables
with only two levels) are also acceptable as independent variables.
SPSS Data Format
Two variables are required in the SPSS data file. Each subject must contribute to
both values.
Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog
box for Linear Regression. On the left side of the dialog box is a list of the variables in
your data file (we are using the HEIGHT.sav data file from the start of this section). On
the right are blocks for the dependent variable (the variable you are trying to predict), and
the independent variable (the variable from which we are predicting).
We are interested in predicting someone's weight on the basis of his or her height.
Thus, we should place the variable WEIGHT in the dependent variable block and the
variable HEIGHT in the independent variable block. Then we can click OK to run the
analysis.
Reading the Output
For simple linear regressions, we are interested in three components of the output.
The first is called the Model Summary, and it occurs after the Variables Entered/Removed
section. For our example, you should see this output.
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .806(a)    .649       .624                16.14801
a. Predictors: (Constant), height
R Square (called the coefficient of determination) gives you the proportion of the
variance of your dependent variable (WEIGHT) that can be explained by variation in your
independent variable (HEIGHT). Thus, 64.9% of the variation in weight can be explained
by differences in height (taller individuals weigh more).
The standard error of estimate gives you a measure of dispersion for your
prediction equation. When the prediction equation is used, 68% of the data will fall within
one standard error of estimate (predicted) value. Just over 95% will fall within two
standard errors. Thus, in the previous example, 95% of the time, our estimated weight will be
within 32.296 pounds of being correct (i.e., 2 x 16.148 = 32.296).
ANOVA(b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   6760.323         1    6760.323      25.926   .000(a)
    Residual     3650.614         14   260.758
    Total        10410.938        15
a. Predictors: (Constant), HEIGHT
b. Dependent Variable: WEIGHT
The second part of the output that we are interested in is the ANOVA summary
table, as shown above. The important number here is the significance level in the rightmost
column. If that value is less than .05, then we have a significant linear regression. If it is
larger than .05, we do not.
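The Model Summary and ANOVA tables are tied together arithmetically, and the following Python sketch illustrates this using the sums of squares and degrees of freedom from the example output. It relies on standard regression identities (R Square as the regression sum of squares over the total, F as the ratio of the mean squares, and the standard error of the estimate as the square root of the residual mean square) rather than anything stated in the SPSS output itself; the function name is hypothetical.

```python
import math


def regression_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive R Square, F, and the standard error of the estimate."""
    ss_total = ss_regression + ss_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "r_square": ss_regression / ss_total,
        "f": ms_regression / ms_residual,
        "std_error_of_estimate": math.sqrt(ms_residual),
    }


# Sums of squares and df copied from the ANOVA table for HEIGHT and WEIGHT.
summary = regression_summary(6760.323, 3650.614, 1, 14)
for name, value in summary.items():
    print(name, round(value, 3))
```

The three results should agree, within rounding, with the .649, 25.926, and 16.148 reported in the output.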
The final section of the output is the table of coefficients. This is where the actual
prediction equation can be found.
Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model              B           Std. Error        Beta                        t         Sig.
1   (Constant)     -234.681    71.552                                        -3.280    .005
    height         5.434       1.067             .806                        5.092     .000
a. Dependent Variable: weight
In most texts, you learn that Y' = a + bX is the regression equation. Y' (pronounced
"Y prime") is your dependent variable (primes are normally predicted values or dependent
variables), and X is your independent variable. In SPSS output, the values of both a
and b are found in the B column. The first value, -234.681, is the value of a (labeled
Constant). The second value, 5.434, is the value of b (labeled with the name of the
independent variable). Thus, our prediction equation for the example above is WEIGHT' =
-234.681 + 5.434(HEIGHT). In other words, the average subject who is an inch taller than
another subject weighs 5.434 pounds more. A person who is 60 inches tall should weigh
-234.681 + 5.434(60) = 91.359 pounds. Given our earlier discussion of standard error of
estimate, 95% of individuals who are 60 inches tall will weigh between 59.063 (91.359 -
32.296 = 59.063) and 123.655 (91.359 + 32.296 = 123.655) pounds.
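The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the function and parameter names are hypothetical, and the default values are the coefficients and standard error of the estimate copied from the example output.

```python
def predict_weight(height_inches, intercept=-234.681, slope=5.434):
    """Apply the prediction equation WEIGHT' = a + b(HEIGHT)."""
    return intercept + slope * height_inches


def approximate_95_range(predicted, std_error_of_estimate=16.148):
    """Range of two standard errors of the estimate around a prediction."""
    margin = 2 * std_error_of_estimate
    return predicted - margin, predicted + margin


predicted = predict_weight(60)
low, high = approximate_95_range(predicted)
print(round(predicted, 3), round(low, 3), round(high, 3))
```

Changing the height argument gives the predicted weight, and the surrounding range, for any other height in inches.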
Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant
prediction equation was obtained, (b) the direction of the relationship, and (c) the equation
itself.
Phrasing Results That Are Significant
In the example above, we obtained an R Square of .649 and a regression equation
of WEIGHT' = -234.681 + 5.434(HEIGHT). The ANOVA resulted in F = 25.926 with 1
and 14 degrees of freedom. The F is significant at the less than .001 level. Thus, we could
state the following in a results section:
    A simple linear regression was calculated predicting participants' weight
    based on their height. A significant regression equation was found (F(1,14) =
    25.926, p < .001), with an R² of .649. Participants' predicted weight is equal
    to -234.68 + 5.43(HEIGHT) pounds when height is measured in inches.
    Participants' average weight increased 5.43 pounds for each inch of height.
The conclusion states the direction (increase), strength (.649), value (25.926),
degrees of freedom (1,14), and significance level (< .001) of the regression. In addition, a
statement of the equation itself is included.
Phrasing ResultsThatAre Not Significant
If the ANOVA is not significant (e.g., see the output below), the section of the
output labeled Sig. for the ANOVA will be greater than .05, and the regression equation is
not significant.
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .477(a)    .227       .172                3.06696
a. Predictors: (Constant), height
ANOVA(b)
Model            Sum of Squares   df   Mean Square   F       Sig.
1   Regression   38.750           1    38.750        4.120   .062(a)
    Residual     131.688          14   9.406
    Total        170.438          15
a. Predictors: (Constant), height
b. Dependent Variable: act
Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model              B           Std. Error        Beta                        t         Sig.
1   (Constant)     49.351      13.590                                        3.631     .003
    height         -.411       .203              -.477                       -2.030    .062
a. Dependent Variable: act
A results section might include the following statement:
    A simple linear regression was calculated predicting participants' ACT
    scores based on their height. The regression equation was not significant
    (F(1,14) = 4.12, p > .05) with an R² of .227. Height is not a significant
    predictor of ACT scores.
Note that for results that are not significant, the ANOVA results and R² results are
given, but the regression equation is not.
Practice Exercise
Use Practice Data Set 2 in Appendix B. If we want to predict salary from years of
education, what salary would you predict for someone with 12 years of education? What
salary would you predict for someone with a college education (16 years)?
Section 5.4 Multiple Linear Regression
Description
The multiple linear regression analysis allows the prediction of one variable from
several other variables.
Assumptions
Multiple linear regression assumes that all variables are interval- or ratio-scaled.
In addition, the dependent variable should be normally distributed around the prediction
line. This, of course, assumes that the variables are related to each other linearly. All
variables should be normally distributed. Dichotomous variables are also acceptable as
independent variables.
SPSS Data Format
At least three variables are required in the SPSS data file. Each subject must
contribute to all values.
Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog
box for Linear Regression. On the left side of the dialog box is a list of the variables in
your data file (we are using the HEIGHT.sav data file from the start of this chapter). On
the right side of the dialog box are blanks for the dependent variable (the variable you are
trying to predict) and the independent variables (the variables from which you are
predicting).
We are interested in predicting someone's weight based on his or her height and
sex. We believe that both sex and height influence weight. Thus, we should place the
dependent variable WEIGHT in the Dependent block and the independent variables
HEIGHT and SEX in the Independent(s) block. Enter both in Block 1.
This will perform an analysis to determine if WEIGHT can be predicted from
SEX and/or HEIGHT. There are several methods SPSS can use to conduct this analysis.
These can be selected with the Method box. Method Enter, the most widely used, puts all
variables in the equation, whether they are significant or not. The other methods use
various means to enter only those variables that are significant predictors. Click OK to run
the analysis.
Reading the Output
For multiple linear regression, there are three components of the output in which
we are interested. The first is called the Model Summary, which is found after the
Variables Entered/Removed section. For our example, you should get the output below.
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .997(a)    .993       .992                2.29571
a. Predictors: (Constant), sex, height
R Square (called the coefficient of determination) tells you the proportion of the
variance in the dependent variable (WEIGHT) that can be explained by variation in the
independent variables (HEIGHT and SEX, in this case). Thus, 99.3% of the variation in
weight can be explained by differences in height and sex (taller individuals weigh more,
and men weigh more). Note that when a second variable is added, our R Square goes up
from .649 to .993. The .649 was obtained using the Simple Linear Regression example in
Section 5.3.
The Standard Error of the Estimate gives you a margin of error for the prediction
equation. Using the prediction equation, 68% of the data will fall within one standard
error of estimate (predicted) value. Just over 95% will fall within two standard errors of
estimates. Thus, in the example above, 95% of the time, our estimated weight will be
within 4.591 (2.296 x 2) pounds of being correct. In our Simple Linear Regression
example in Section 5.3, this number was 32.296. Note the higher degree of accuracy.
The second part of the output that we are interested in is the ANOVA summary
table. For more information on reading ANOVA tables, refer to the sections on ANOVA in
Chapter 6. For now, the important number is the significance in the rightmost column. If
that value is less than .05, we have a significant linear regression. If it is larger than .05, we
do not.
ANOVA(b)
Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   10342.424        2    5171.212      981.202   .000(a)
    Residual     68.514           13   5.270
    Total        10410.938        15
a. Predictors: (Constant), sex, height
b. Dependent Variable: weight
The final section of output we are interested in is the table of coefficients. This is
where the actual prediction equation can be found.
Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model              B           Std. Error        Beta                        t          Sig.
1   (Constant)     47.138      14.843                                        3.176      .007
    height         2.101       .198              .312                        10.588     .000
    sex            -39.133     1.501             -.767                       -26.071    .000
a. Dependent Variable: weight
In most texts, you learn that Y' = a + bX is the regression equation. For multiple
regression, our equation changes to Y' = B0 + B1X1 + B2X2 + ... + BzXz (where z is the
number of independent variables). Y' is your dependent variable, and the Xs are your
independent variables. The Bs are listed in a column. Thus, our prediction equation for the
example above is WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT) (where SEX is
coded as 1 = Male, 2 = Female, and HEIGHT is in inches). In other words, the average
difference in weight for participants who differ by one inch in height is 2.101 pounds.
Males tend to weigh 39.133 pounds more than females. A female who is 60 inches tall
should weigh 47.138 - 39.133(2) + 2.101(60) = 94.932 pounds. Given our earlier
discussion of the standard error of estimate, 95% of females who are 60 inches tall will
weigh between 90.341 (94.932 - 4.591 = 90.341) and 99.523 (94.932 + 4.591 = 99.523)
pounds.
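As with the simple regression, the multiple regression prediction can be verified with a short Python sketch. The function and parameter names are hypothetical; the default coefficients are copied from the table of coefficients above, and SEX is assumed to be coded 1 = Male, 2 = Female as stated in the text.

```python
def predict_weight(height_inches, sex_code,
                   intercept=47.138, b_sex=-39.133, b_height=2.101):
    """Apply WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT)."""
    if sex_code not in (1, 2):
        raise ValueError("sex_code must be 1 (male) or 2 (female)")
    return intercept + b_sex * sex_code + b_height * height_inches


female_60 = predict_weight(60, 2)
male_60 = predict_weight(60, 1)
margin = 2 * 2.29571  # two standard errors of the estimate
print(round(female_60, 3), round(female_60 - margin, 3), round(female_60 + margin, 3))
print(round(male_60 - female_60, 3))
```

The difference between the two predictions at the same height equals the SEX coefficient, which is why the text can say that males tend to weigh 39.133 pounds more than females.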
Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant
prediction equation was obtained, (b) the direction of the relationship, and (c) the equation
itself. Multiple regression is generally much more powerful than simple linear regression.
Compare our two examples.
With multiple regression, you must also consider the significance level of each
independent variable. In the example above, the significance level of both independent
variables is less than .001.
Phrasing Results That Are Significant
In our example, we obtained an R Square of .993 and a regression equation of
WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT). The ANOVA resulted in F =
981.202 with 2 and 13 degrees of freedom. F is significant at the less than .001 level.
Thus, we could state the following in a results section:
    A multiple linear regression was calculated to predict participants' weight
    based on their height and sex. A significant regression equation was found
    (F(2,13) = 981.202, p < .001), with an R² of .993. Participants' predicted
    weight is equal to 47.138 - 39.133(SEX) + 2.101(HEIGHT), where SEX is
    coded as 1 = Male, 2 = Female, and HEIGHT is measured in inches.
    Participants increased 2.101 pounds for each inch of height, and males
    weighed 39.133 pounds more than females. Both sex and height were
    significant predictors.
The conclusion states the direction (increase), strength (.993), value (981.20),
degrees of freedom (2,13), and significance level (< .001) of the regression. In addition, a
statement of the equation itself is included. Because there are multiple independent
variables, we have noted whether or not each is significant.
Phrasing ResultsThat Are Not Significant
If the ANOVA does not find a significant relationship, the Sig. section of the
output will be greater than .05, and the regression equation is not significant.
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .528(a)    .279       .168                3.07525
a. Predictors: (Constant), sex, height
ANOVA(b)
Model            Sum of Squares   df   Mean Square   F       Sig.
1   Regression   47.494           2    23.747        2.511   .120(a)
    Residual     122.944          13   9.457
    Total        170.438          15
a. Predictors: (Constant), sex, height
b. Dependent Variable: act
Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model              B             Std. Error      Beta                        t         Sig.
1   (Constant)     [illegible]   19.884                                      3.102     .007
    height         -.576         .266            -.668                       -2.168    [illegible]
    sex            [illegible]   2.011           -.296                       -.962     .354
a. Dependent Variable: act
A results section for the output above might include the following statement:
    A multiple linear regression was calculated predicting participants' ACT
    scores based on their height and sex. The regression equation was not
    significant (F(2,13) = 2.511, p > .05) with an R² of .279. Neither height nor
    sex is a significant predictor of ACT scores.
Note that for results that are not significant, the ANOVA results and R² results are
given, but the regression equation is not.
Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the prediction equation for
predicting salary based on education, years of service, and sex. Which variables are
significant predictors? If you believe that men were paid more than women were, what
would you conclude after conducting this analysis?
Chapter 6
Parametric Inferential Statistics
Parametric statistical procedures allow you to draw inferences about populations
based on samples of those populations. To make these inferences, you must be able to
make certain assumptions about the shape of the distributions of the population samples.
Section 6.1 Review of Basic Hypothesis Testing
The Null Hypothesis
In hypothesis testing, we create two hypotheses that are mutually exclusive (i.e.,
both cannot be true at the same time) and all inclusive (i.e., one of them must be true). We
refer to those two hypotheses as the null hypothesis and the alternative hypothesis. The
null hypothesis generally states that any difference we observe is caused by random error.
The alternative hypothesis generally states that any difference we observe is caused by a
systematic difference between groups.
Type I and Type II Errors
All hypothesis testing attempts to draw conclusions about the real world based on
the results of a test (a statistical test, in this case). There are four possible combinations of
results (see the figure at right).
Two of the possible results are correct test results. The other two results are errors.
A Type I error occurs when we reject a null hypothesis that is, in fact, true, while a
Type II error occurs when we fail to reject the null hypothesis that is, in fact, false.
Significance tests determine the probability of making a Type I error. In other
words, after performing a series of calculations, we obtain a probability that the null
hypothesis is true. If there is a low probability, such as 5 or less in 100 (.05), by
convention, we reject the null hypothesis. In other words, we typically use the .05 level (or
less) as the maximum Type I error rate we are willing to accept.
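The decision rule and the four possible outcomes described above can be written out as a small Python sketch. It is purely illustrative: the function names are hypothetical, and in practice we never know whether the null hypothesis is actually true, so classify_outcome is a thought experiment rather than something that can be computed from data.

```python
ALPHA = 0.05  # conventional maximum Type I error rate


def reject_null(p_value, alpha=ALPHA):
    """Reject the null hypothesis when the probability is alpha or less."""
    return p_value <= alpha


def classify_outcome(null_is_true, rejected):
    """Name the outcome for each combination of reality and test result."""
    if null_is_true and rejected:
        return "Type I error"
    if not null_is_true and not rejected:
        return "Type II error"
    return "No error"


# Walk through all four combinations of reality and test result.
for null_is_true in (True, False):
    for rejected in (True, False):
        print(null_is_true, rejected, classify_outcome(null_is_true, rejected))
```

Lowering ALPHA makes a Type I error less likely, at the cost of rejecting the null hypothesis less often.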
When there is a low probability of a Type I error, such as .05, we can state that the
significance test has led us to "reject the null hypothesis." This is synonymous with
saying that a difference is "statistically significant." For example, on a reading test, suppose
you found that a random sample of girls from a school district scored higher than a random
                                    REAL WORLD
TEST RESULT                         Null Hypothesis True    Null Hypothesis False
Null hypothesis rejected            Type I Error            No Error
Null hypothesis not rejected        No Error                Type II Error