(Manual spss)
1.
2. The SPSS 16 for Windows icon should be on the Start menu. If
you are using a
computer in a lab, it is common for the icon to be placed in a
folder. If you customize
your computer, all you have to do to start SPSS is to point to
the SPSS 16 icon on the
desktop and double click. Then wait while SPSS loads.
3. Log in to SPSS
There are two ways to launch the SPSS program. One is
to simply click on the SPSS icon shown in
red letters on your desktop. If you cannot find the icon,
you can click Start on the bottom of your
screen, then Program Files, and then SPSS. Or if you are
not sure whether the computer you are
using has SPSS, click Start, then Find, then Files or
Folders, then type “SPSS.” When the SPSS
window launches, a dialogue box will pop up as shown
below. You have several choices; you can
either start a tutorial, type in new data, or open an
existing file.
6. Input Data
If you want to start from scratch and enter data manually in SPSS, select the “Type
in Data” option
from the Open dialogue box. A blank window with a spreadsheet appears. You can
click on any cell
and enter numbers. If you want to enter characters, you need to define the
variables as a string first.
It is recommended that you define the variables first even if they contain
numbers. Note there are
two tabs on the bottom-left corner of the SPSS window. One is the data
spreadsheet and the other
is the sheet where users define and annotate variables. To open a file, you can click
File, then Open,
then Data (File/Open/Data). A dialogue box should appear. You need to do two
things to open
your file. First, you need to locate the directory of your file. In this example, it is in
C:/Program
Files/SPSS. Then choose the correct file type, select “Cars.sav”, and click Open. You
should have a
window filled with data.
7.
8. 1.) Creating a data file.
Open SPSS: --> Start, Programs, SPSS. The initial
window (center of the screen) will be asking you if
you want to open an existing file; close that for
now by clicking the "Cancel" button.
9. What you will be looking at is the Data window; one of
three windows generally used when working with SPSS.
Data View is used to input and access data. The Variable View is
used to specify the details of each variable in the data file.
10. Name is used to type a short or abbreviated name of the
variable; this will appear as the column name when in
Data
View.
Type allows you to specify the data type of the variable
(e.g. numeric, string, or date).
Width refers to the number of digits or characters
stored for each value of this variable.
11. Decimals refers to how many places to the right of the decimal you
would like displayed in Data View.
Label is used to type a description of this variable (i.e. non-
abbreviated). The Label will appear in Data View if one holds his
or her cursor over the Name at the top of the column.
Values are used to assign names to each value of the variable (i.e.
what will each number refer to).
Missing allows the user to specify how missing values are coded for
recognition by SPSS.
Columns allows the user to specify the display width of the column (in Data
View) for this variable.
Alignment allows the user to specify the left, center, or right
alignment of data within the column of this variable.
Measure allows the user to specify the level of measurement; here
SPSS uses Nominal, Ordinal, and Scale (which refers to both
Interval and Ratio).
Role can also be used to specify the type of variable (input, target,
both, none, partition, split).
12. An example for creating and setting up a data file.
1. Click on the Variable View tab at the bottom of the spreadsheet.
2. Click on the first row under Name.
3. Type the word “ID” (this will stand for the Identification number of
each participant).
4. Press <enter>
5. Click on the cell under the Decimals column and type a zero (0).
6. Click on the cell under the Label column.
7. Type “Participant Identification”
8. Click on cell below the Measure column and select Nominal.
9. Click on the Name cell of the next variable.
10. Type “IV” (this will stand for Independent Variable [or
condition]).
11. Press <enter>
12. Click on the cell under the Decimals column and type a zero (0).
13. Click on the cell under the Label column
14. Type “Condition”
15. Click on the Values cell.
13. 16. You will have to click the definition button (…) in the cell. A new
window will open.
17. Type 1 in the Value box, and then click on the Value Label box.
18. Type “Control” and click Add.
19. Repeat steps 17 – 18 using the value “2” and the value label
“Experimental”.
20. Click OK.
21. Click on the cell under Measure, then select Nominal.
22. Click on the Name cell of the next variable.
23. Type “DV” (this will stand for Dependent Variable).
24. Click on the cell under the Decimals column and type a zero (0).
25. Click on the cell under the label column.
26. Type “Number Correct”.
14. Using the Data View tab will open the data spreadsheet. It is
time to enter the data. The variable names that were typed
under the Name column in the Variable View should be at
the top of the first three columns. In the Data View, each row
represents data for one participant. Data should be entered
under each variable for each participant. To enter data simply
position the cursor in the appropriate cell and type the
number. Pressing the “enter” key will move the highlighted
position down one row. Pressing the “tab” key after entering a
value will move the position over one column to the right. So,
the user can either enter all the values for one variable at a
time by using “enter” or all the variables for one participant
can be entered by using “tab.”
15. Now enter the following data for 12 participants with
the first 6 in the control condition and the second 6
in the experimental condition. Their number correct
(from the top): 10, 8, 14, 12, 11, 13, 22, 23, 22, 19, 20, 24.
Notice that when you hold the cursor over the column
headings, the Label for that column is displayed.
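For readers who want to check their entries outside SPSS, the data file built in the steps above can be mirrored in a few lines of Python. This is only a sketch: the record layout and the `group_mean` helper are illustrative assumptions and not part of SPSS; the codes and scores are the ones given above.

```python
from statistics import mean

# Value labels for the IV column, as defined in Variable View.
VALUE_LABELS = {1: "Control", 2: "Experimental"}

# Number correct for the 12 participants, in entry order.
scores = [10, 8, 14, 12, 11, 13, 22, 23, 22, 19, 20, 24]

# One record per participant (one row in Data View):
# ID, IV (condition code) and DV (number correct).
data = [
    {"ID": i + 1, "IV": 1 if i < 6 else 2, "DV": score}
    for i, score in enumerate(scores)
]

def group_mean(rows, code):
    """Mean DV for participants whose IV equals the given code (hypothetical helper)."""
    return mean(row["DV"] for row in rows if row["IV"] == code)

for code, label in VALUE_LABELS.items():
    print(f"{label}: mean number correct = {group_mean(data, code):.2f}")
```

Comparing the two printed means with the SPSS descriptives is a quick way to catch a mistyped value.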
16. Also notice that when you click on the Value Labels button (shown
below), the Value Labels (names) are displayed instead of the
Values (numbers).
17. An icon next to each variable
provides information about data
type and level of measurement.
19. You will be presented with the following screen:
20. Transfer the variable that needs to be tested for normality
into the Dependent List: box by either dragging and
dropping it or using the button.
Click the button. You will be presented with the following screen:
22. Click the button. Change the options so that you are
presented with the following screen:
23. Click the button.
Click the button.
Output
Shapiro-Wilk Test of Normality
The above table presents the results from two well-known tests of
normality, namely the Kolmogorov-Smirnov Test and the Shapiro-
Wilk Test. The Shapiro-Wilk Test is more appropriate for small
sample sizes (< 50 samples), but can also handle sample sizes as
large as 2000. For this reason, we will use the Shapiro-Wilk test as
our numerical means of assessing normality.
24. We can see from the above table that for the "Beginner",
"Intermediate" and "Advanced" Course Group the dependent
variable, "Time", was normally distributed. How do we know this?
If the Sig. value of the Shapiro-Wilk Test is greater than 0.05, the
data do not deviate significantly from a normal distribution and
are treated as normal. If it is below 0.05, the data significantly
deviate from a normal distribution.
26. Now you should have a smaller window open, highlight/select
"Time to Accelerate from 0 to 60 (sec) [accel]" and use the arrow
to put it into the variables box.
27. Next, click on "Options..." and select the descriptive statistics you
want (typically mean, standard deviation, variance, range,
standard error (S.E.) of the mean, minimum and maximum, as
well as kurtosis and skewness). Then click "Continue".
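The statistics ticked in the Options dialog can be reproduced by hand as a cross-check. The sketch below is a minimal illustration using Python's standard library; the `describe` function and the sample values are assumptions made up for this example, and skewness and kurtosis are left out to keep it short.

```python
import math
from statistics import mean, stdev, variance

def describe(values):
    """Return common descriptive statistics for a list of numbers (hypothetical helper)."""
    n = len(values)
    sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "n": n,
        "mean": mean(values),
        "std_dev": sd,
        "variance": variance(values),
        "range": max(values) - min(values),
        "se_mean": sd / math.sqrt(n),  # standard error of the mean
        "minimum": min(values),
        "maximum": max(values),
    }

# Hypothetical acceleration times in seconds, invented for illustration only.
accel = [12.0, 11.5, 11.0, 12.0, 10.5, 10.0, 9.0, 8.5]
for name, value in describe(accel).items():
    print(f"{name}: {value:.3f}")
```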
28.
29. Method 2:
go to Analyze, Descriptive Statistics, and then Frequencies...
30. Now you should have a smaller window open, highlight/select "Time
to Accelerate from 0 to 60 (sec) [accel]" and use the arrow to put it into
the variables box.
31. Next, click on "Statistics..." and select all the statistics specified
earlier, as well as quartiles; then click "Continue".
32. Next, click on "Charts..." and select Histograms and Show normal
curve on histogram. Then click "Continue" and then click "OK".
33. You should now see some output similar to that below. You'll
notice the output table containing all the descriptive statistics is
smaller and easier to read than the one provided by the
Descriptive Statistics function above.
34. There are four benefits to using the Frequencies function for
gathering descriptive statistics. First, you can get more
descriptive statistics (quartiles). Second, you can get a
graphical display of the variable (histogram for continuous
variables and bar graph for categorical variables). Third, you
get a frequencies table. Fourth, the descriptive statistics
table is smaller and easier to read with the Frequencies function.
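To make the first and third benefits concrete, here is a small sketch of a frequency table and quartiles computed with Python's standard library. The data are invented for illustration, and the quartile definition is the default of `statistics.quantiles`; other software conventions may give slightly different quartile values.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical acceleration times in seconds, invented for illustration only.
accel = [8.5, 9.0, 10.0, 10.5, 11.0, 11.5, 12.0, 12.0]

# Frequency table: each distinct value with its count and percentage.
counts = Counter(accel)
for value in sorted(counts):
    percent = 100 * counts[value] / len(accel)
    print(f"{value:>5}: {counts[value]} ({percent:.1f}%)")

# Quartiles (25th, 50th and 75th percentiles) using the "exclusive" method.
q1, q2, q3 = quantiles(accel, n=4, method="exclusive")
print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
```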
35. Method 3:
The Explore Function for getting descriptive statistics by
group
With the Explore Example data file open in the Data window, go to
Analyze, Descriptive Statistics, and then Explore...
36. Next, pick your dependent variable; in this example we'll use the variable "total
score on blame scale [bt]". Highlight it and move it to the Dependent List: box.
Then, pick your independent variable; in this example we'll use the grouping
variable "GENDER [sex]". Highlight it and move it to the Factor List: box. Then
click on the Statistics... button.
37. Now we can specify what we want to get. Check Descriptives, M-estimators,
Outliers, and Percentiles. Then click the Continue button. Next, click on the
Plots button and select Histogram and Normality plots with tests. Then click the
Continue button. Then click the OK button.
38. You should see some output similar to that displayed below.
39. Parametric tests
t-tests in SPSS.
The t-tests are used to determine if there exists a significant
difference between means. There are traditionally three
types of t-tests: the seldom used one sample t-test, the
dependent samples t-test, and the independent samples
t-test.
(1). One sample t-test is used to determine if the sample
mean is different from some constant value; typically
assumed to be a population mean.
First, we'll test whether or not our sample mean (in this case
age) is significantly different from zero. Begin by importing
the data, then click on Analyze, Compare Means, One-
Sample T test...
40.
41. Next, highlight the Age variable and use the arrow to move it into
the Test Variable(s): box.
Next, click the OK button to complete the t-test.
42. The output provides two tables. The first offers descriptive statistics for the
variable we tested (Age), which include the number of cases/observations, mean,
standard deviation, and standard error. The second table provides the actual t-test
output, where we see that our sample's age (M = 21.04, SD = 1.85) was significantly
different from zero, t(53) = 83.440, p < .001. As you might imagine, this is not
terribly useful information. A more informative test is
whether or not our sample mean is significantly different from a specified value. SPSS
allows us to specify a value in the One Sample T Test dialog.
43. Again, click on Analyze, Compare Means, One-Sample T test...
44. Notice the previous run is still specified (i.e. the Age variable is
already in the Test Variable(s): box).
Next, we want to specify a value, say 20 which might represent
the mean of all undergraduate college students. We simply type
the value in the Test Value: box.
Then click the OK button to complete the t test.
45. Here, we see that our sample's age (M = 21.04, SD = 1.85) was
significantly different from 20, t(53) = 4.113, p < .001.
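The t values in both runs follow from the formula t = (M - test value) / (SD / sqrt(n)). The sketch below recomputes them from the summary statistics reported above; the function name is an assumption, and n = 54 is inferred from the reported 53 degrees of freedom. Because M and SD are rounded to two decimals, the results agree with the SPSS output only approximately.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, test_value=0.0):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - test_value) / standard_error
    return t, n - 1

# Summary values reported above: M = 21.04, SD = 1.85; df = 53 implies n = 54.
t_zero, df = one_sample_t(21.04, 1.85, 54, test_value=0)
t_twenty, _ = one_sample_t(21.04, 1.85, 54, test_value=20)
print(f"t({df}) against 0:  {t_zero:.2f}")
print(f"t({df}) against 20: {t_twenty:.2f}")
```

The p-value still has to come from the t distribution with 53 degrees of freedom, which SPSS reports in the Sig. (2-tailed) column.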
46. (2). Dependent samples t-test is used to determine if the difference
between two related sample means is different from zero. It is known
by many names: dependent samples t test, paired samples t test.
Example
A new fitness program is devised for obese people. Each
participant's weight was measured before and after the program
to see if the fitness program is effective in reducing their
weights.
In this example, our null hypothesis is that the program is not
effective, i.e., there is no difference between the weight
measured before and after the program. The alternative
hypothesis is that the program is effective and the weight
measured after is less than the weight measured before the
program.
In the data, the first column is the weight measured before the
program and the second column is the weight after.
47.
48. Select "Analyze -> Compare Means -> Paired-Samples T Test".
A new window opens. Drag the variables "Before" and "After" from the list on the left to
the pair 1 variable 1 and variable 2 slots respectively, as shown below. Then click "OK".
51. We can now interpret the result.
From A, since the p-value is 0.472, which is greater than 0.05, we fail to
reject the null hypothesis and conclude that there is no evidence that the
fitness program is effective at the 5% significance level.
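The paired-samples t statistic is simply a one-sample t-test on the before-minus-after differences. The sketch below shows the calculation; the `paired_t` helper and the weights are hypothetical and are not the data used in the slides.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on the differences before - after."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical weights in kg, invented for illustration only.
before = [95.0, 102.5, 88.0, 110.0, 97.5, 105.0]
after = [94.0, 103.0, 87.5, 108.5, 98.0, 104.0]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.3f}")
```

A positive t here means the mean weight before is higher than the mean weight after; whether that difference is significant depends on the p-value SPSS reports.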
52. Independent samples t-test is used to test whether or not two independent
sample means are significantly different from one another. It is the most
commonly used of the t tests.
For example, suppose one is interested in learning whether differences exist
between females and males in mathematics scores. Data for such a comparison
are presented below
53. The first column, labeled Math_Scores contains individual
student mathematics scores. The second column,
labeled Sex, identifies whether the student is male or
female. As noted in Figure 1, females are coded as 1 and
males are coded as 2. One may be curious why numbers
are used to represent sex when letters such as F and M
should suffice. Often statistical programs, such as SPSS,
are programmed to work with numbers rather than
letters or other symbols. That is the case for the
independent samples t-test command in SPSS.
54. To help users more easily recognize which students are females and males,
one may opt to provide Value Labels for the Sex variable. Figure 1 above
shows the "Data View" of the SPSS spreadsheet. Note the tab at the
bottom of the spreadsheet labeled "Data View." Next to that tab is a
second tab labeled "Variable View." Click on "Variable View" to access the
variable characteristics section of the spreadsheet.
55. The label "Female" is added for a Sex value = 1, and the label
"Male" is added for those students with a Sex value = 2. One
must click on the "Add" button to complete adding each value
label for sex.
56. To access the Independent Samples t-test
select "Analyze"
select "Compare Means"
select "Independent Samples t-test"
57. The Independent Samples t-test pop-up window will appear. Select
the dependent variable (the quantitative variable, mathematics scores in this
example) and move it to the "Test Variable(s)" box,
and move the grouping variable (the categorical independent variable, sex in
this example) to the "Grouping Variable" box.
58. Now the groups must be defined so SPSS correctly compares
the two groups of interest (if more than two groups are
present in the data). In the current example there are only
two groups; to define these
click on "Define Groups" button,
identify which group will serve as Group 1 (in this case
Females--coded 1-- were selected), then
identify which group will serve as Group 2 (males coded
2).
Click "Continue" then click "OK" to run the t-test and
obtain results.
59.
60. Results
Interesting to note is the Levene's Test for Equality of Variances. This tests the assumption
that our two groups have approximately equal variances, sometimes called the
homogeneity of variance assumption.
In the current example, the Levene's test indicates we do not have significantly different
variances between our two groups, which is what we want to see as this supports the
assumption.
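When the equal-variances assumption holds, the t statistic is based on a pooled variance. The following sketch shows that calculation under stated assumptions: the `independent_t` helper and the scores are hypothetical, and the result corresponds only to the equal-variances case, not to the unequal-variances alternative.

```python
import math
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t, df) for the equal-variances independent-samples t-test."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Hypothetical mathematics scores, invented for illustration only.
females = [78, 85, 92, 70, 88, 81]  # Sex = 1
males = [75, 80, 68, 90, 72, 77]    # Sex = 2
t, df = independent_t(females, males)
print(f"t({df}) = {t:.3f}")
```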
61. Analysis of Variance (ANOVA) in
SPSS.
The ANOVA family of analyses is used for testing
whether or not a significant difference exists
between more than two groups. There are many
forms of ANOVA, which allow it to be used in a
variety of situations. The simplest is the oneway
ANOVA, which is used for testing the effect of one
independent variable with multiple groups on one
continuous or nearly continuous dependent
variable. The oneway name implies one
independent variable.
62. The dependent variable should be measured at the interval or
ratio level (i.e., it is continuous). Examples of
variables that meet this criterion include revision time
(measured in hours).
The independent variable should consist of two or more
categorical, independent groups.
63. Example
A manager wants to raise the productivity at his company by increasing the speed at which
his employees can use a particular spreadsheet program. As he does not have the skills in-
house, he employs an external agency which provides training in this spreadsheet program.
They offer 3 courses: a beginner, intermediate and advanced course. He is unsure which
course is needed for the type of work they do at his company, so he sends 10 employees on
the beginner course, 10 on the intermediate and 10 on the advanced course. When they all
return from the training, he gives them a problem to solve using the spreadsheet program,
and times how long it takes them to complete the problem. He then compares the three
courses (beginner, intermediate, advanced) to see if there are any differences in the average
time it took to complete the problem.
In SPSS, we separated the groups for analysis by creating a grouping variable called
Course (i.e., the independent variable), and gave the beginners course a value of "1", the
intermediate course a value of "2" and the advanced course a value of "3". Time to
complete the set problem was entered under the variable name Time (i.e., the
dependent variable).
64. Click Analyze > Compare Means > One-Way ANOVA... on the top menu as shown below.
65. You will be presented with the following screen:
Transfer the dependent variable (Time) into the Dependent List: box and the
independent variable (Course) into the Factor: box using the appropriate buttons (or
drag-and-drop the variables into the boxes), as indicated in the diagram below.
67. Click the button.
Click the button. Tick the Descriptive checkbox in the
–Statistics– area, as shown below:
68. Click the button.
Click the button.
Descriptives Table
The descriptives table (see below) provides some very useful descriptive statistics,
including the mean, standard deviation and 95% confidence intervals for the dependent
variable (Time) for each separate group (Beginners, Intermediate and Advanced), as well as
when all groups are combined (Total).
69. This is the table that shows the output of the ANOVA analysis
and whether we have a statistically significant difference
between our group means. We can see that the significance
level is 0.021 (p = .021), which is below 0.05 and, therefore,
there is a statistically significant difference in the mean
length of time to complete the spreadsheet problem between
the different courses taken. This is great to know, but we do
not know which of the specific groups differed. Luckily, we
can find this out in the Multiple Comparisons Table which
contains the results of post-hoc tests.
70. Multiple Comparisons Table
From the results so far, we know that there are significant differences between the groups
as a whole. The table below, Multiple Comparisons, shows which groups differed from
each other. The Tukey post-hoc test is generally the preferred test for conducting post-hoc
tests on a one-way ANOVA, but there are many others. We can see from the table below
that there is a significant difference in time to complete the problem between the group
that took the beginner course and the intermediate course (p = 0.046), as well as between
the beginner course and advanced course (p = 0.034). However, there were no differences
between the groups that took the intermediate and advanced course (p = 0.989).
71. Reporting the output of the one-way ANOVA
There was a statistically significant difference between
groups as determined by one-way ANOVA (F(2,27) = 4.467,
p = .021). A Tukey post-hoc test revealed that the time to
complete the problem was statistically significantly lower
after taking the intermediate (23.6 ± 3.3 min, p = .046) and
advanced (23.4 ± 3.2 min, p = .034) course compared to the
beginners course (27.2 ± 3.0 min). There were no
statistically significant differences between the
intermediate and advanced groups (p = .989).
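The F ratio and its degrees of freedom can be recomputed from raw data as a check on the ANOVA table. The sketch below is a minimal illustration: `one_way_anova` is a hypothetical helper and the completion times are invented, so the F value will not match the 4.467 reported above. Only the degrees of freedom (2 and 27 for three groups of 10) are expected to agree.

```python
from statistics import mean

def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical completion times in minutes (10 per course), invented for illustration.
beginner = [28, 31, 25, 27, 30, 26, 24, 29, 27, 25]
intermediate = [23, 26, 21, 24, 27, 22, 20, 25, 24, 23]
advanced = [22, 25, 21, 24, 26, 23, 19, 25, 24, 22]
f, df1, df2 = one_way_anova(beginner, intermediate, advanced)
print(f"F({df1}, {df2}) = {f:.3f}")
```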
72. Factorial ANOVA
The Factorial ANOVA is an extension of the Oneway
situation where the design is composed of more than
one independent variable, each with two or more
groups (sometimes called multi-way ANOVA). The
major benefit of factorial ANOVA is the ability to
investigate interactions among the independent
variables. The Factorial ANOVA is still considered a
univariate analysis (as opposed to a multivariate
analysis) because it deals with only one dependent
variable (whereas the multivariate ANOVA deals with
multiple dependent variables).
73. Two-way ANOVA
The two-way ANOVA is used when there are two independent
variables and multiple observations for each combination of their groups.
Example
A researcher was interested in whether an individual's
interest in politics was influenced by their level of
education and gender. They recruited a random sample of
participants to their study and asked them about their
interest in politics, which they scored from 0 to 100, with
higher scores indicating a greater interest in politics. The
researcher then divided the participants by gender
(Male/Female) and then again by level of education
(School/College/University). Therefore, the dependent
variable was "interest in politics", and the two independent
variables were "gender" and "education".
74. In SPSS, we separated the individuals into their appropriate groups by using two
columns representing the two independent variables, and labelled them Gender
and Edu_Level. For Gender, we coded "males" as 1 and "females" as 2, and for
Edu_Level, we coded "school" as 1, "college" as 2 and "university" as 3. The
participants' interest in politics – the dependent variable – was entered under the
variable name, Int_Politics. The setup for this example can be seen below:
75. Click Analyze > General Linear Model > Univariate... on
the top menu,
76. You will be presented with the Univariate dialogue box
77. Transfer the dependent variable, Int_Politics, into the Dependent
Variable: box, and transfer both independent variables, Gender and
Edu_Level, into the Fixed Factor(s): box. You can do this by drag-and-
dropping the variables into the respective boxes or by using the
button. If you are using older versions of SPSS you will need to use the
latter method. You will end up with a screen similar to that shown
below:
78. Click on the button. You will be presented with
the Univariate: Profile Plots dialogue box
79. Transfer the independent variable, Edu_Level, from the Factors:
box into the Horizontal Axis: box, and transfer the other
independent variable, Gender, into the Separate Lines: box. You
will be presented with the following screen:
80. Click the button. You will see that
"Edu_Level*Gender" has been added to the Plots: box
Click the button. This will return you to the Univariate
dialogue box.
81. Click the button. You will be presented with the
Univariate: Post Hoc Multiple Comparisons for Observed Means.
82. Transfer Edu_Level from the Factor(s): box to the Post Hoc Tests for:
box. This will make the –Equal Variances Assumed– area become
active (lose the "grey sheen") and present you with some choices for
which post hoc test to use. For this example, we are going to select
Tukey, which is a good, all-round post hoc test.
Note: You only need to transfer independent variables
that have more than two groups into the Post Hoc
Tests for: box. This is why we do not transfer Gender.
83. Click the button to return to the Univariate dialogue box.
Click the button. This will present you with the Univariate: Options
dialogue box.
84. Transfer Gender, Edu_Level and Gender*Edu_Level from the
Factor(s) and Factor Interactions: box into the Display Means for:
box. In the –Display– area, tick the Descriptive Statistics option.
Click the button to return to the Univariate dialogue box.
85. Click the button to generate the output
You can find appropriate descriptive statistics for when you report the results of your two-
way ANOVA in the aptly named "Descriptive Statistics" table, as shown below:
This table is very useful because it provides the mean and standard deviation for each
combination of the groups of the independent variables (what is sometimes referred to as
each "cell" of the design). In addition, the table provides "Total" rows, which allows means
and standard deviations for groups only split by one independent variable, or none at all, to
be known. This might be more useful if you do not have a statistically significant
interaction.
86. The actual result of the two-way ANOVA – namely, whether either of the two
independent variables or their interaction is statistically significant – is shown in
the Tests of Between-Subjects Effects table.
The particular rows we are interested in are the "Gender", "Edu_Level" and
"Gender*Edu_Level" rows.
87. These rows inform us whether our independent variables (the "Gender" and
"Edu_Level" rows) and their interaction (the "Gender*Edu_Level" row) have a
statistically significant effect on the dependent variable
We can see from the above table that there was no statistically significant
difference in mean interest in politics between males and females (p =
.207), but there were statistically significant differences between
educational levels (p < .0005).
88. When you have a statistically significant interaction, reporting the
main effects can be misleading.
If you do not have a statistically significant interaction, you might
interpret the Tukey post hoc test results for the different levels of
education, which can be found in the Multiple Comparisons
table.
89. You can see from the above table that there is some repetition of the results,
but regardless of which row we choose to read from, we are interested in the
differences between (1) School and College, (2) School and University, and (3)
College and University. From the results, we can see that there is a statistically
significant difference between all three different educational levels (p < .0005).
Reporting the results of a two-way ANOVA
A two-way ANOVA was conducted that examined the effect of
gender and education level on interest in politics. There was a
statistically significant interaction between the effects of gender
and education level on interest in politics, F (2, 54) = 4.643, p =
.014.
Post hoc comparisons showed that males were significantly more interested in politics
than females when educated to university level (p = .002), but there
were no differences between genders when educated to school (p =
.465) or college level (p = .793).
90. Multivariate analysis of variance
It is used when there are two or more dependent variables.
√ Analyze
√ General Linear Model
√ Multivariate
91. After clicking on Multivariate, the following screen will appear. You
will send your dependent variables to the Dependent Variables box
and your independent variable to the Fixed Factor box.
92. Now send the four dependent variables (i.e., Verbal 1 through
Verbal 4) over one at a time to the Dependent Variables screen.
93.
94. Then send over the independent variable, Reading Group
Membership, to the Fixed Factor box. Then click on Options.
95. We will use this screen to obtain descriptive statistics of
our four dependent variables for each of our three
reading groups
96. To obtain the information just mentioned, you will need to click on:
Descriptive Statistics
Estimates of Effect Size
Homogeneity tests
102. Next we’ll have to define the factor of study. From the table
above we have 3 levels of treatment (i.e. time 1,2,3), and we’ll call
the factor TIME (instead of factor1). Click “Add” after giving the
name and number of levels:
103. Now we click “Define” and we’re all set. Now on the box we’ll need to highlight
our 3 variables of interest (time1-3), and move them over to the “Within
Subjects Variables” area by clicking on the arrow between where the variables
are on the left and where they’re going on the right.
104. Lastly, we are going to select a few options. Click the “Options”
button. Here check the “Descriptive Statistics” box and the
“Estimates of effect size” box. Click “Continue”. Click “OK”. SPSS
will produce the output
105. To perform the repeated-measures ANOVA in SPSS, click on Analyze, then
General Linear Model, and then Repeated Measures
106. In the resulting Repeated Measures dialog, you must
specify the number of factors and the number of levels for each factor. In
this case, the single factor is the time the algebra test was taken, and
there are three levels: at the beginning of the course, immediately after
the course, and six months after the course. You can accept the default
label of factor1, or change it to a more descriptive one. We will use
"Time" as the label for our factor, and specify that there are three levels
107. Non-parametric statistics
Non-parametric statistics are methods for comparing properties of the data,
or population, that do not rely on the typical
parameters of mean, variance, standard deviation, etc.
109. In the Chi-Square goodness of fit test, sample data is divided
into intervals. Then the numbers of points that fall
into each interval are compared with the expected
numbers of points in each interval.
The Chi-Square test of Independence is used to
determine if there is a significant relationship between
two nominal (categorical) variables. The frequency of
one nominal variable is compared with different values
of the second nominal variable.
110. To test if there is an association between two nominal
variables
In SPSS you just indicate that one variable (the
independent one) should come in the row, and the
other variable (the dependent one) should come in the
column of the cross table. Then you ask for row
percentages and the Chi-square statistic.
111. Example
Final year psychology students were asked about their
career plans. 12 females and 26 males said they would
like to work in the field of clinical psychology, while
24 females and 8 males said they preferred the area of
organisational psychology. We want to investigate if
there is any relationship between gender and career
preference.
In this example, our null hypothesis is that there is no
relationship between gender and career preference.
Our alternative hypothesis is that there is a
relationship between gender and career preference
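The counts in this example are enough to compute the Pearson chi-square statistic by hand. The sketch below does so with a hypothetical helper, `chi_square_2x2`; it uses no continuity correction, and it relies on the fact that with 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).

```python
import math

def chi_square_2x2(table):
    """Return (chi2, p) for a 2x2 table of counts using the Pearson chi-square test."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed_count - expected) ** 2 / expected
    # Closed form for the upper tail of the chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Rows: Female, Male. Columns: clinical, organisational (counts from the example).
observed = [[12, 24],
            [26, 8]]
chi2, p = chi_square_2x2(observed)
print(f"chi-square(1) = {chi2:.2f}, p = {p:.4f}")
```

The p-value comes out well below .001, so a p-value displayed as .000 in SPSS output is better reported as p < .001 than as exactly zero.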
112. Two variables that are ordinal or nominal (categorical
data).
There are two or more groups in each variable.
115. It does not matter which variable we select as rows and which as
columns. For illustration purposes, we select "Gender" as "Row(s)"
and "Career_Preference" as "Column(s)".
116. Now click "Statistics" on the right. A new window pops out. Make
sure that the "Chi-square" box at the top is checked. Click
"Continue".
117. Click "Cells" on the right. A new window pops out. You can check the box
"Observed", "Expected", "Row", "Column" and "Total" if you want to extract more
information from crosstab. Click "Continue". The window will then be closed.
Now click "OK" in the original window.
118.
119. The results now pop out in the "Output" window.
We can now interpret the result.
120. From A in the third table, since the p-value is displayed as .000 (i.e. p < .001), we can reject the
null hypothesis and conclude that there is a relationship between
gender and career preference at the 5% significance level. From the
second table, it appears that males tend to prefer the area of
clinical psychology and females tend to prefer the field of
organisational psychology.
121. The second option. You create a nominal variable as usual, but also
create a frequency or count variable which will contain the number
of cases belonging to each category. This means that each row will
not represent a different participant, but instead a different category.
If you use this method, you must tell SPSS that the numbers in the
frequency variable are not scores for individual participants, but overall
counts. To do this, go to the Data menu > Weight Cases…, and transfer
across the variable that contains the frequencies or counts to the
Frequency Variable box. Click on OK.
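The idea behind weighting cases can be sketched as follows: each summary row stands for as many participants as its count. The counts are those of the gender and career preference example; the variable names are made up for illustration, and this is not how SPSS stores the data internally.

```python
from collections import Counter

# One row per category combination, plus a count column (the "second option").
summary_rows = [
    ("Female", "Clinical", 12),
    ("Male", "Clinical", 26),
    ("Female", "Organisational", 24),
    ("Male", "Organisational", 8),
]

# Weighting by the count is equivalent to repeating each row `count` times,
# which gives back one row per participant (the first option).
cases = [(gender, career)
         for gender, career, count in summary_rows
         for _ in range(count)]

# Tallying the expanded cases recovers the original counts.
tally = Counter(cases)
print(len(cases), tally[("Male", "Clinical")])
```

Both layouts describe the same 70 participants, which is why the chi-square result does not depend on which one you enter.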
123. The Wilcoxon Signed-Rank Test in SPSS
In SPSS we need to have two variables representing the
before and after measurements.
The Wilcoxon signed-rank test can be found in
Analyze/Nonparametric Tests/2 Related Samples…
124. In the next dialogue box for the nonparametric two dependent samples tests we need to
define the paired observations. Enter X as variable 1 of the first pair and Y as variable 2 of
the first pair.
125. We also need to select the Test Type. The Wilcoxon Signed
Rank Test is marked by default
126. The Wilcoxon signed-rank test output contains only two tables.
The first table contains all statistics that are required to calculate the
Wilcoxon signed ranks test’s W. These are the sample size and the
sum of ranks. It also includes the mean rank, which is not necessary
to calculate the W-value but helps with the interpretation of the
data.
127. In our example we see that 20 × 2 observations were made for X
and Y. The Wilcoxon signed-rank test answers the question of whether the difference
is significantly different from zero, and thus whether the observed
difference in mean ranks (4.5 vs. 10.65) can also be found in the
general population.
The answer to this question is in the second table,
which contains the test of significance statistics.
The SPSS output contains the z-value of -3.472. The test value z is
approximately normally distributed for large samples (n > 10),
so that p = 0.001.
Thus we can reject the null hypothesis that both samples are from
the same population, and we might assume that the novel teaching
method caused a significant increase in literacy scores.
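The step from z = -3.472 to p = 0.001 can be checked against the standard normal distribution. The sketch below assumes the two-tailed normal approximation; the helper name `two_tailed_p` is made up for illustration.

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))

z_value = -3.472  # z reported in the SPSS output above
p_value = two_tailed_p(z_value)
print(round(p_value, 3))
```

Rounded to three decimals, this agrees with the p = 0.001 reported on the slide.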
128. Mann-Whitney U Test
Used to compare differences between two independent groups
when the dependent variable is either ordinal or
continuous, but not normally distributed.
131. Select the dependent variable of interest from the list at the
left by clicking on it, and then move it into the Test Variable
List by clicking on the upper arrow button.
Select the independent variable of interest from the list at
the left by clicking on it, and then move it into the
Grouping Variable box by clicking on the lower arrow
button.
132. Next, we must define the groups of the independent variable. Click
on the Define Groups button that is just below the Grouping
Variable box. The Two Independent Samples: Define Groups dialog
box appears:
Enter the value that corresponds to one level of the independent variable in the
Group 1 box and the value that corresponds to the other level of the independent
variable in the Group 2 box.
133. Click on the Continue button in the Two Independent Samples: Define Groups
dialog box. The Two-Independent Samples Test dialog box should be on top now.
Make sure that the Mann-Whitney U option is selected in the Test Type frame.
That is, there should be a check mark in the box to the left of Mann-Whitney
U:
134. Click on the Options button. The Two-Independent-
Samples: Options dialog box appears:
Select the Descriptive statistics option by clicking in the box to the
left of Descriptives if it does not already have a check mark in it:
135. Click on the Continue button in the Two-Independent-Samples:
Options dialog box. Click on OK in the Two-Independent-Samples
Tests box to perform the Mann-Whitney U test. The SPSS output
viewer will appear. It should contain three sections:
The first section gives the descriptive statistics for the dependent variable and
(less usefully) for the independent variable. In this example, there were 31
people (N) who responded to the PLANNER question. They gave a mean
response of 2.42 (between AGREE and UNDECIDED) with a standard deviation
of 1.43 (although this number may not be meaningful in this example).
136. The second section of the output shows the number (N) of people
in each condition (8 people do not intend to get a Ph.D. or Psy.D in
psychology and 23 people do) and the mean rank and sum of ranks
for each group (useful if you were calculating the U statistic by
hand.)
137. In this example, the Mann-Whitney U value is 92.0. There are
two p values given -- one on the row labeled Asymp. Sig (2-
Tailed) and the other on the row labeled Exact Sig. [2*(1-
tailed Sig.)]. Typically, we will use the Exact significance,
although if the sample size is large, the asymptotic
significance value can be used to gain a little statistical
power.
Decide whether to reject H0. We will use the exact p value. It
is a two-tailed p value, but we have a one-tailed test. So we
need to divide the two-tailed p value by 2 to get the one-
tailed p value: 1.000 / 2 = .500. Since the exact p value is
greater than the specified level (.05), we fail to reject H0.
Thus, we have insufficient evidence to conclude that people
who intend to get a Ph.D. or Psy.D. in psychology are more
likely to use a day planner or calendar than the people who
do not intend to get a Ph.D. or Psy.D. in psychology.
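The reported values can be cross-checked by hand. With group sizes 8 and 23, the value of U expected under H0 is 8 × 23 / 2 = 92, which is exactly the observed U, so a two-tailed p of 1.000 is what one would expect. The sketch below uses the normal approximation without a tie correction, which is an assumption; SPSS's "Exact Sig." is computed differently, although both give 1.000 here.

```python
import math

n1, n2 = 8, 23      # group sizes from the second section of the output
u_observed = 92.0   # Mann-Whitney U from the output

# Under H0, U has mean n1*n2/2 and (ignoring ties) this standard deviation.
u_mean = n1 * n2 / 2
u_sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z = (u_observed - u_mean) / u_sd
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))

# Halving is only appropriate when the observed difference lies in the
# direction predicted by the one-tailed hypothesis.
p_one_tailed = p_two_tailed / 2

print(u_mean, z, p_two_tailed, p_one_tailed)
```

The one-tailed value of 0.5 matches the 1.000 / 2 = .500 calculation above.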
138. "Kruskal-Wallis Test"
The Kruskal-Wallis test is the nonparametric test
equivalent to the one-way ANOVA, and an extension
of the Mann-Whitney U test to allow the comparison
of more than two independent groups.
140. There are two ways to transfer your variables. You can either drag-and-drop each
variable into the respective box, or highlight the variable by using the cursor and
click the arrow button. Make sure that the Kruskal-Wallis H checkbox is ticked in the
Test Type box.
142. Press the Define Range... button and type "1" into the Minimum box
and "3" into the Maximum box. This defines the range of the
values for the categories of the independent variable. In this
case, there are 3 groups/categories, called Drug A, Drug B and
Drug C. If there had been 4 groups, but you did not want to
include the first group in the analysis, you would have entered "2"
and "4" into the Minimum and Maximum boxes, respectively
(assuming you ordered the groups numerically).
143. Click the Continue button.
Click the Options... button. Tick the Descriptive checkbox if
you want descriptives and/or the Quartiles checkbox if you want
quartiles. You will be presented with the following if you
select Descriptive.
144. Click the Continue button.
Click the OK button.
We can report that there was a statistically significant difference
between the different drug treatments.
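To show what the Kruskal-Wallis H statistic measures, here is a minimal hand calculation. The scores are HYPOTHETICAL values invented for this sketch (the slides do not give the drug data), chosen without ties so that no tie correction is needed. The closed-form p-value holds only for 2 degrees of freedom, that is, for three groups.

```python
import math

# HYPOTHETICAL scores for three drug groups (not the data behind the slides).
groups = {
    "Drug A": [2.1, 3.4, 1.9, 2.8],
    "Drug B": [4.5, 5.1, 3.9, 4.8],
    "Drug C": [6.2, 5.9, 7.1, 6.6],
}

# Rank all observations together (1 = smallest); this toy data has no ties.
pooled = sorted((value, name)
                for name, values in groups.items()
                for value in values)
rank_sums = {name: 0 for name in groups}
for rank, (_, name) in enumerate(pooled, start=1):
    rank_sums[name] += rank

n_total = len(pooled)
# H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)
h_stat = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(values) for name, values in groups.items()
) - 3 * (n_total + 1)

df = len(groups) - 1
p_value = math.exp(-h_stat / 2)  # chi-square upper tail, valid only for df = 2
print(rank_sums, round(h_stat, 3), df, round(p_value, 4))
```

H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom, which corresponds to the "Chi-Square", "df" and "Asymp. Sig." rows in the SPSS output.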
145. Friedman Test
The Friedman test is the non-parametric alternative to the one-
way ANOVA with repeated measures.
One group that is measured on three or more
different occasions. The group is a random sample from
the population.
Your dependent variable should be measured at the
ordinal or continuous level. Examples of ordinal
variables include Likert scales (e.g., a 7-point scale
from strongly agree through to strongly disagree).
146. Click Analyze > Nonparametric Tests > Legacy Dialogs > K Related Samples... on
the top menu, as shown below:
147. You will be presented with the Tests for Several Related
Samples dialogue box
148. Transfer the dependent variables none, classical and dance to the
Test Variables: box by using the arrow button or by dragging-and-
dropping the variables into the box. You will end up with the
following screen:
149. Make sure that Friedman is selected in the –Test Type–
area.
Click the Statistics... button. You will be presented with the following
Several Related Samples: Statistics dialogue box, as shown
below:
Click the Continue button. This will return you to the Tests for Several Related
Samples dialogue box.
150. Click the OK button to run the Friedman test.
The Descriptive Statistics table will be produced if you selected
the Quartiles option.
151. The Ranks table shows the mean rank for each of the related
groups
The Friedman test compares the mean ranks between the related
groups and indicates how the groups differed, and it is included for
this reason. However, you are not very likely to actually report these
values in your results section, but most likely will report the
median value for each related group.
152. The table above provides the test statistic (χ2) value ("Chi-
square"), degrees of freedom ("df") and the significance
level ("Asymp. Sig."), which is all we need to report the
result of the Friedman test. From our example, we can see
that there is an overall statistically significant difference
between the mean ranks of the related groups.
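The mean ranks and the chi-square value in these tables can be illustrated with a small hand calculation. The scores below are HYPOTHETICAL values invented for this sketch (five participants measured under the none, classical and dance conditions); they are not the data behind the slides. Values are ranked within each participant, there are no ties, and the closed-form p-value holds only for 2 degrees of freedom.

```python
import math

# HYPOTHETICAL scores: one list per condition, one position per participant.
scores = {
    "none":      [4, 5, 3, 6, 4],
    "classical": [6, 7, 5, 5, 6],
    "dance":     [7, 6, 8, 9, 8],
}
conditions = list(scores)
n = len(scores["none"])   # number of participants
k = len(conditions)       # number of related conditions

# Rank the k conditions within each participant (1 = lowest score).
rank_sums = {c: 0 for c in conditions}
for i in range(n):
    ordered = sorted(conditions, key=lambda c: scores[c][i])
    for rank, c in enumerate(ordered, start=1):
        rank_sums[c] += rank

mean_ranks = {c: rank_sums[c] / n for c in conditions}

# Friedman chi-square: 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1)
chi2 = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums.values()) \
    - 3 * n * (k + 1)
df = k - 1
p_value = math.exp(-chi2 / 2)  # chi-square upper tail, valid only for df = 2
print(mean_ranks, round(chi2, 2), df, round(p_value, 4))
```

The `mean_ranks` values correspond to the Ranks table, and `chi2`, `df` and `p_value` correspond to the "Chi-square", "df" and "Asymp. Sig." entries of the Test Statistics table.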
153. Chart Builder
We can also use graphs to visualize the statistical
relationships between variables. For example, we
want to discover if there is any relationship between
miles per gallon and car weight.
Enter variables into the X- and
Y-axis boxes by selecting the variable and clicking the
arrow to the left of the box.
154. The Gallery includes many different predefined charts, which are organized
by chart type.
Example: bar chart.
155. Icons representing the available bar charts in the Gallery appear in the
dialog box. The pictures should provide enough information to identify
the specific chart type.
If you need more information, you can also display a ToolTip description of the
chart by pausing your cursor over an icon.
Click Bar if it is not selected
156. Drag the icon for the simple bar chart onto the "canvas,"
which is the large area above the Gallery. The Chart
Builder displays a preview of the chart on the canvas.
Note that the data used to draw the chart are not your
actual data. They are example data.
157. The drop zone for the x axis is required. The variable in this drop
zone controls where the bars appear on the x axis.
Depending on the type of chart you are creating, you may also need a variable in the y axis
drop zone. For example, when you want to display a summary statistic of another variable
(such as mean of salary), you need a variable in the y axis drop zone. Scatterplots also
require a variable in the y axis. In that case, the drop zone identifies the dependent variable.
158. Now drag Job satisfaction from the Variables list to
the x axis drop zone.
159. The Element Properties window allows you to change the properties of the various chart
elements.
161. Return to the Chart Builder dialog box and drag Household income in thousands from the
Variables list to the y axis drop zone.
162. You can also add titles and footnotes to the chart.
► Click the Titles/Footnotes tab.
163. The title appears on the canvas with the label T1.
The bar chart reveals that respondents who are more satisfied with their jobs tend to have
higher household incomes.
165. Interpretation: Depending on the significance value, we retain or reject the null
hypothesis.
If the Sig. value is greater than 0.05, the data do not deviate significantly
from a normal distribution. If it is below 0.05, the data significantly deviate from a normal
distribution.