Chi-square tests are great to show if distributions differ or i.docx

Chi-square tests are great to show if distributions differ or if
two variables interact in producing outcomes. What are some
examples of variables that you might want to check using the
chi-square tests? What would these results tell you?
Data
See comments at the right of the data set.

ID  Salary  Compa  Midpoint  Age  Performance Rating  Service  Gender  Raise  Degree  Gender1  Grade
 8      23  1.000        23   32                  90        9       1    5.8       0        F      A
10      22  0.956        23   30                  80        7       1    4.7       0        F      A
11      23  1.000        23   41                 100       19       1    4.8       0        F      A
14      24  1.043        23   32                  90       12       1      6       0        F      A
15      24  1.043        23   32                  80        8       1    4.9       0        F      A
23      23  1.000        23   36                  65        6       1    3.3       1        F      A
26      24  1.043        23   22                  95        2       1    6.2       1        F      A
31      24  1.043        23   29                  60        4       1    3.9       0        F      A
35      24  1.043        23   23                  90        4       1    5.3       1        F      A
36      23  1.000        23   27                  75        3       1    4.3       1        F      A
37      22  0.956        23   22                  95        2       1    6.2       1        F      A
42      24  1.043        23   32                 100        8       1    5.7       0        F      A
 3      34  1.096        31   30                  75        5       1    3.6       0        F      B
18      36  1.161        31   31                  80       11       1    5.6       1        F      B
20      34  1.096        31   44                  70       16       1    4.8       1        F      B
39      35  1.129        31   27                  90        6       1    5.5       1        F      B
 7      41  1.025        40   32                 100        8       1    5.7       0        F      C
13      42  1.050        40   30                 100        2       1    4.7       1        F      C
22      57  1.187        48   48                  65        6       1    3.8       0        F      D
24      50  1.041        48   30                  75        9       1    3.8       1        F      D
45      55  1.145        48   36                  95        8       1    5.2       0        F      D
17      69  1.210        57   27                  55        3       1      3       0        F      E
48      65  1.140        57   34                  90       11       1    5.3       1        F      E
28      75  1.119        67   44                  95        9       1    4.4       1        F      F
43      77  1.149        67   42                  95       20       1    5.5       1        F      F
19      24  1.043        23   32                  85        1       0    4.6       1        M      A
25      24  1.043        23   41                  70        4       0      4       0        M      A
40      25  1.086        23   24                  90        2       0    6.3       0        M      A
 2      27  0.870        31   52                  80        7       0    3.9       0        M      B
32      28  0.903        31   25                  95        4       0    5.6       0        M      B
34      28  0.903        31   26                  80        2       0    4.9       1        M      B
16      47  1.175        40   44                  90        4       0    5.7       0        M      C
27      40  1.000        40   35                  80        7       0    3.9       1        M      C
41      43  1.075        40   25                  80        5       0    4.3       0        M      C
 5      47  0.979        48   36                  90       16       0    5.7       1        M      D
30      49  1.020        48   45                  90       18       0    4.3       0        M      D
 1      58  1.017        57   34                  85        8       0    5.7       0        M      E
 4      66  1.157        57   42                 100       16       0    5.5       1        M      E
12      60  1.052        57   52                  95       22       0    4.5       0        M      E
33      64  1.122        57   35                  90        9       0    5.5       1        M      E
38      56  0.982        57   45                  95       11       0    4.5       0        M      E
44      60  1.052        57   45                  90       16       0    5.2       1        M      E
46      65  1.140        57   39                  75       20       0    3.9       1        M      E
47      62  1.087        57   37                  95        5       0    5.5       1        M      E
49      60  1.052        57   41                  95       21       0    6.6       0        M      E
50      66  1.157        57   38                  80       12       0    4.6       0        M      E
 6      76  1.134        67   36                  70       12       0    4.5       1        M      F
 9      77  1.149        67   49                 100       10       0      4       1        M      F
21      76  1.134        67   43                  95       13       0    6.3       1        M      F
29      72  1.074        67   52                  95        5       0    5.4       0        M      F

Comments:
The ongoing question that the weekly assignments will focus on is: Are males and females paid the same for equal work (under the Equal Pay Act)?
Note: to simplify the analysis, we will assume that jobs within each grade comprise equal work.

The column labels in the table mean:
ID - Employee sample number
Salary - Salary in thousands
Age - Age in years
Performance Rating - Appraisal rating (Employee evaluation score)
Service - Years of service (rounded)
Gender: 0 = male, 1 = female
Midpoint - salary grade midpoint
Raise - percent of last raise
Grade - job/pay grade
Degree (0 = BSBA, 1 = MS)
Gender1 (Male or Female)
Compa - salary divided by midpoint

Week 1. Measurement and Description - chapters 1 and 2

1. Measurement issues. Data, even numerically coded variables, can be one of 4 levels - nominal, ordinal, interval, or ratio. It is important to identify which level a variable is, as this impacts the kind of analysis we can do with the data. For example, descriptive statistics such as means can only be done on interval or ratio level data.
   a. Please list under each label the variables in our data set that belong in each group.
      Nominal    Ordinal    Interval    Ratio
   b. For each variable that you did not call ratio, why did you make that decision?

2. The first step in analyzing data sets is to find some summary descriptive statistics for key variables. For salary, compa, age, performance rating, and service, find the mean, standard deviation, and range for 3 groups: overall sample, Females, and Males.
   You can use either the Data Analysis Descriptive Statistics tool or the Fx =average and =stdev functions. (The range must be found using the difference between the =max and =min functions with Fx functions.)
   Note: Place data to the right; if you use Descriptive Statistics, place that to the right as well.

                                Salary   Compa   Age   Perf. Rat.   Service
   Overall   Mean
             Standard Deviation
             Range
   Female    Mean
             Standard Deviation
             Range
   Male      Mean
             Standard Deviation
             Range

3. What is the probability for a:                                   Probability
   a. Randomly selected person being a male in grade E?
   b. Randomly selected male being in grade E?
      Note: part b is the same as asking, given a male, what is the probability of being in grade E?
   c. Why are the results different?

4. For each group (overall, females, and males) find:               Overall   Female   Male
   a. The value that cuts off the top 1/3 salary in each group.
   b. The z score for each value:
   c. The normal curve probability of exceeding this score:
   d. What is the empirical probability of being at or exceeding this salary value?
   e. The value that cuts off the top 1/3 compa in each group.
   f. The z score for each value:
   g. The normal curve probability of exceeding this score:
   h. What is the empirical probability of being at or exceeding this compa value?
   i. How do you interpret the relationship between the data sets? What do they mean about our equal pay for equal work question?

5. What conclusions can you make about the issue of male and female pay equality? Are all of the results consistent? What is the difference between the salary and compa measures of pay?
   Conclusions from looking at salary results:
   Conclusions from looking at compa results:
   Do both salary measures show the same results?
   Can we make any conclusions about equal pay for equal work yet?
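
As a rough illustration of Week 1 questions 2 and 3, the sketch below computes the mean, sample standard deviation, and range of salary for each group, then the two grade E probabilities. It uses Python's standard library instead of the Excel tools the assignment names. The salary lists and the count of males in grade E are transcribed from the data set above; the helper name `describe` is made up for this example.

```python
from statistics import mean, stdev

# Salaries (in thousands) transcribed from the data set, split by Gender1.
female = [23, 22, 23, 24, 24, 23, 24, 24, 24, 23, 22, 24, 34, 36, 34, 35,
          41, 42, 57, 50, 55, 69, 65, 75, 77]
male = [24, 24, 25, 27, 28, 28, 47, 40, 43, 47, 49, 58, 66, 60, 64, 56,
        60, 65, 62, 60, 66, 76, 77, 76, 72]


def describe(values):
    """Return (mean, sample standard deviation, range).

    Mirrors Excel's =average, =stdev, and =max minus =min.
    """
    return mean(values), stdev(values), max(values) - min(values)


summary = {
    "Overall": describe(female + male),
    "Female": describe(female),
    "Male": describe(male),
}
for group, (m, s, r) in summary.items():
    print(f"{group:8s} mean={m:.2f} sd={s:.2f} range={r}")

# Question 3: joint versus conditional probability.
males_in_e = 10                           # males with Grade E in the data set
n_total = len(female) + len(male)
p_male_and_e = males_in_e / n_total       # random person is a male in grade E
p_e_given_male = males_in_e / len(male)   # random male is in grade E
print(f"P(male and E)={p_male_and_e:.2f}  P(E | male)={p_e_given_male:.2f}")
```

The two probabilities share a numerator and differ only in the denominator (all 50 employees versus the 25 males), which is the point of question 3c. The same `describe` call can be repeated for compa, age, performance rating, and service.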
Week 2. Testing means

In questions 2 and 3, be sure to include the null and alternate hypotheses you will be testing.
In the first 3 questions use alpha = 0.05 in making your decisions on rejecting or not rejecting the null hypothesis.

1. Below are 2 one-sample t-tests comparing male and female average salaries to the overall sample mean.
   (Note: a one-sample t-test in Excel can be performed by selecting the 2-sample unequal variance t-test and making the second variable = Ho value -- see column S.)
   Based on our sample, how do you interpret the results and what do these results suggest about the population means for male and female average salaries?

   Males                          Females
   Ho: Mean salary = 45           Ho: Mean salary = 45
   Ha: Mean salary =/= 45         Ha: Mean salary =/= 45

   Note: While both results below are actually from Excel's t-Test: Two-Sample Assuming Unequal Variances, having no variance in the Ho variable makes the calculations default to the one-sample t-test outcome - we are tricking Excel into doing a one-sample test for us.

                                  Male            Ho     Female            Ho
   Mean                           52              45     38                45
   Variance                       316             0      334.6666666667    0
   Observations                   25              25     25                25
   Hypothesized Mean Difference   0                      0
   df                             24                     24
   t Stat                         1.9689038266           -1.9132063573
   P(T<=t) one-tail               0.0303078503           0.0338621184
   t Critical one-tail            1.7108820799           1.7108820799
   P(T<=t) two-tail               0.0606157006           0.0677242369
   t Critical two-tail            2.0638985616           2.0638985616
   Conclusion                     Do not reject Ho;      Do not reject Ho;
                                  mean equals 45         mean equals 45

   For each test (Males, Females):
   Is this a 1 or 2 tail test?
   - why?
   P-value is:
   Is P-value > 0.05?
   Why do we not reject Ho?
   Interpretation:

   Helper columns from the worksheet (Ho value and female salaries for question 1; male and female compas for Q3):

   Ho   Female   Male    Female
   45   34       1.017   1.096
   45   41       0.870   1.025
   45   23       1.157   1.000
   45   22       0.979   0.956
   45   23       1.134   1.000
   45   42       1.149   1.050
   45   24       1.052   1.043
   45   24       1.175   1.043
   45   69       1.043   1.210
   45   36       1.134   1.161
   45   34       1.043   1.096
   45   57       1.000   1.187
   45   23       1.074   1.000
   45   50       1.020   1.041
   45   24       0.903   1.043
   45   75       1.122   1.119
   45   24       0.903   1.043
   45   24       0.982   1.043
   45   23       1.086   1.000
   45   22       1.075   0.956
   45   35       1.052   1.129
   45   24       1.140   1.043
   45   77       1.087   1.149
   45   55       1.052   1.145
   45   65       1.157   1.140

2. Based on our sample data set, perform a 2-sample t-test to see if the population male and female average salaries could be equal to each other.
   (Since we have not yet covered testing for variance equality, assume the data sets have statistically equal variances.)
   Ho:
   Ha:
   Test to use:
   Place B43 in Outcome range box.
   P-value is:
   Is P-value < 0.05?
   Reject or do not reject Ho:
   If the null hypothesis was rejected, what is the effect size value:
   Meaning of effect size measure:
   Interpretation:
   b. Since the one and two tail t-test results provided different outcomes, which is the proper/correct approach to comparing salary equality? Why?

3. Based on our sample data set, can the male and female compas in the population be equal to each other? (Another 2-sample t-test.)
   Ho:
   Ha:
   Statistical test to use:
   Place B75 in Outcome range box.
   What is the p-value:
   Is P-value < 0.05?
   Reject or do not reject Ho:
   If the null hypothesis was rejected, what is the effect size value:
   Meaning of effect size measure:
   Interpretation:

4. Since performance is often a factor in pay levels, is the average Performance Rating the same for both genders?
   Ho:
   Ha:
   Test to use:
   Place B106 in Outcome range box.
   What is the p-value:
   Is P-value < 0.05?
   Do we REJ or Not reject the null?
   If the null hypothesis was rejected, what is the effect size value:
   Meaning of effect size measure:
   Interpretation:

5. If the salary and compa mean tests in questions 2 and 3 provide different results about male and female salary equality, which would be more appropriate to use in answering the question about salary equity? Why?
   What are your conclusions about equal pay at this point?
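
The t statistics in the question 1 output can be reproduced from the summary rows alone. The sketch below does that in plain Python and also shows the equal-variance (pooled) two-sample statistic that question 2 asks for. The means, variances, and critical value are copied from the Excel output above; the function names are invented for this example, and Cohen's d is used here only as one common effect size measure, which may differ from the one the course text defines.

```python
from math import sqrt


def one_sample_t(sample_mean, sample_var, n, mu0):
    """t statistic for Ho: population mean equals mu0."""
    return (sample_mean - mu0) / sqrt(sample_var / n)


def pooled_two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t statistic and Cohen's d (assumed effect size)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    d = (mean1 - mean2) / sqrt(pooled_var)
    return t, d


# Two-tail critical value for df = 24, alpha = 0.05, as reported by Excel above.
T_CRIT_TWO_TAIL_DF24 = 2.0638985616

t_male = one_sample_t(52, 316, 25, 45)
t_female = one_sample_t(38, 334.6666666667, 25, 45)
for label, t in (("Male", t_male), ("Female", t_female)):
    decision = "reject Ho" if abs(t) > T_CRIT_TWO_TAIL_DF24 else "do not reject Ho"
    print(f"{label}: t = {t:.4f} -> {decision}")

# Question 2: male versus female mean salary, assuming equal variances.
t_pooled, cohens_d = pooled_two_sample_t(52, 316, 25, 38, 334.6666666667, 25)
print(f"Pooled t = {t_pooled:.4f} (df = 48), Cohen's d = {cohens_d:.4f}")
```

Comparing |t| with the two-tail critical value gives the same decision as checking whether the two-tail p-value is below 0.05.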
Week 3

At this point we know the following about male and female salaries.
a. Male and female overall average salaries are not equal in the population.
b. Male and female overall average compas are equal in the population, but males are a bit more spread out.
c. The male and female salary ranges are almost the same, as are their ages and service.
d. Average performance ratings per gender are equal.
Let's look at some other factors that might influence pay - education (degree) and performance ratings.

1. Last week, we found that average performance ratings do not differ between males and females in the population. Now we need to see if they differ among the grades. Is the average performance rating the same for all grades?
   (Assume variances are equal across the grades for this ANOVA.)
   Input table columns: A  B  C  D  E  F
   Null Hypothesis:
   Alt. Hypothesis:
   Place B17 in Outcome range box.
   Interpretation:
   What is the p-value:
   Is P-value < 0.05?
   Do we REJ or Not reject the null?
   If the null hypothesis was rejected, what is the effect size value (eta squared):
   Meaning of effect size measure:
   What does that decision mean in terms of our equal pay question:

2. While it appears that average salaries per each grade differ, we need to test this assumption. Is the average salary the same for each of the grade levels?
   (Assume equal variance, and use the Analysis ToolPak function ANOVA.)
   Use the input table to the right to list salaries under each grade level.
   Input table columns: A  B  C  D  E  F
   Null Hypothesis:
   Alt. Hypothesis:
   Place B55 in Outcome range box.
   What is the p-value:
   Is P-value < 0.05?
   Do you reject or not reject the null hypothesis:
   If the null hypothesis was rejected, what is the effect size value (eta squared):
   Meaning of effect size measure:
   Interpretation:

3. The table and analysis below demonstrate a 2-way ANOVA with replication. Please interpret the results.

   Ho: Average compas by gender are equal
   Ha: Average compas by gender are not equal
   Ho: Average compas are equal for each degree
   Ha: Average compas are not equal for each degree
   Ho: Interaction is not significant
   Ha: Interaction is significant

   Input data (compa):
             BA      MA
   Male      1.017   1.157
             0.870   0.979
             1.052   1.134
             1.175   1.149
             1.043   1.043
             1.074   1.134
             1.020   1.000
             0.903   1.122
             0.982   0.903
             1.086   1.052
             1.075   1.140
             1.052   1.087
   Female    1.096   1.050
             1.025   1.161
             1.000   1.096
             0.956   1.000
             1.000   1.041
             1.043   1.043
             1.043   1.119
             1.210   1.043
             1.187   1.000
             1.043   0.956
             1.043   1.129
             1.145   1.149

   Perform analysis:

   Anova: Two-Factor With Replication

   SUMMARY     BA             MA             Total
   Male
   Count       12             12             24
   Sum         12.349         12.9           25.249
   Average     1.0290833333   1.075          1.0520416667
   Variance    0.006686447    0.0065198182   0.0068660417
   Female
   Count       12             12             24
   Sum         12.791         12.787         25.578
   Average     1.0659166667   1.0655833333   1.06575
   Variance    0.006102447    0.0042128106   0.004933413
   Total
   Count       24             24
   Sum         25.14          25.687
   Average     1.0475         1.0702916667
   Variance    0.0064703478   0.0051561286

   ANOVA
   Source of Variation   SS             df   MS             F              P-value        F crit
   Sample                0.0022550208   1    0.0022550208   0.3834821171   0.5389389507   4.0617064601   (This is the row variable or gender.)
   Columns               0.0062335208   1    0.0062335208   1.0600539609   0.3088295633   4.0617064601   (This is the column variable or Degree.)
   Interaction           0.0064171875   1    0.0064171875   1.0912877664   0.3018915062   4.0617064601
   Within                0.25873675     44   0.0058803807
   Total                 0.2736424792   47

   Interpretation:
   For Ho: Average compas by gender are equal  Ha: Average compas by gender are not equal
   What is the p-value:
   Is P-value < 0.05?
   Do you reject or not reject the null hypothesis:
   If the null hypothesis was rejected, what is the effect size value (eta squared):
   Meaning of effect size measure:

   For Ho: Average salaries are equal for all grades  Ha: Average salaries are not equal for all grades
   What is the p-value:
   Is P-value < 0.05?
   Do you reject or not reject the null hypothesis:
   If the null hypothesis was rejected, what is the effect size value (eta squared):
   Meaning of effect size measure:

   For Ho: Interaction is not significant  Ha: Interaction is significant
   What is the p-value:
   Do you reject or not reject the null hypothesis:
   If the null hypothesis was rejected, what is the effect size value (eta squared):
   Meaning of effect size measure:

   What do these decisions mean in terms of our equal pay question:

4. Many companies consider the grade midpoint to be the "market rate" - what is needed to hire a new employee. Does the company, on average, pay its existing employees at or above the market rate?
   Input columns: Midpoint  Salary
   Null Hypothesis:
   Alt. Hypothesis:
   Statistical test to use:
   Place the cursor in B160 for correl.
   What is the p-value:
   Is P-value < 0.05?
   Do we REJ or Not reject the null?
   If the null hypothesis was rejected, what is the effect size value:
   Since the effect size was not discussed in this chapter, we do not have a formula for it - it differs from the non-paired t.
   Meaning of effect size measure: NA
   Interpretation:

5. Using the results up through this week, what are your conclusions about gender equal pay for equal work at this point?
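
The F column of the two-way ANOVA table in question 3 follows from the SS and df columns, and eta squared can be read off the same table. The sketch below recomputes both. The sums of squares, degrees of freedom, and F crit are copied from the Excel output above; eta squared is taken here as SS(effect) / SS(total), which is a common definition but an assumption about what the course text intends, and the dictionary labels are illustrative.

```python
# Sums of squares and degrees of freedom copied from the ANOVA output above.
effects = {
    "Sample (gender)": (0.0022550208, 1),
    "Columns (degree)": (0.0062335208, 1),
    "Interaction": (0.0064171875, 1),
}
ss_within, df_within = 0.25873675, 44
ss_total = 0.2736424792
F_CRIT = 4.0617064601  # reported by Excel for alpha = 0.05

# Mean square within is the denominator of every F ratio in the table.
ms_within = ss_within / df_within

results = {}
for source, (ss, df) in effects.items():
    f_stat = (ss / df) / ms_within   # F = MS(effect) / MS(within)
    eta_sq = ss / ss_total           # assumed definition: share of total variation
    results[source] = (f_stat, eta_sq)
    decision = "reject Ho" if f_stat > F_CRIT else "do not reject Ho"
    print(f"{source:18s} F = {f_stat:.4f}  eta^2 = {eta_sq:.4f}  -> {decision}")
```

Comparing F with F crit gives the same decision as comparing the P-value column with 0.05.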
Week 4. Confidence Intervals and Chi Square (Chs 11 - 12)

For questions 3 and 4 below, be sure to list the null and alternate hypothesis statements. Use .05 for your significance level in making your decisions.
For full credit, you need to also show the statistical outcomes - either the Excel test result or the calculations you performed.

1. Using our sample data, construct a 95% confidence interval for the population's mean salary for each gender. Interpret the results. How do they compare with the findings in the week 2 one sample t-test outcomes (Question 1)?

             Mean   St error   t value   Low to High
   Males
   Females

   <Reminder: standard error is the sample standard deviation divided by the square root of the sample size.>
   Interpretation:

2. Using our sample data, construct a 95% confidence interval for the mean salary difference between the genders in the population. How does this compare to the findings in week 2, question 2?

   Difference   St Err.   T value   Low to High

   Can the means be equal? (Yes/No)
   Why?
   How does this compare to the week 2, question 2 result (2 sample t-test)?
   a. Why is using a two sample tool (t-test, confidence interval) a better choice than using 2 one-sample techniques when comparing two samples?

3. We found last week that degrees do not impact compa values within the population. This does not mean that degrees are distributed evenly across the grades and genders. Do males and females have the same distribution of degrees by grade?
   (Note: while technically the sample size might not be large enough to perform this test, ignore this limitation for this exercise.)
   What are the hypothesis statements:
   Ho:
   Ha:
   Note: You can either use the Excel Chi-related functions or do the calculations manually.

   Data input tables - graduate degrees by gender and grade level

   OBSERVED      A   B   C   D   E   F   Total
   M Grad
   Fem Grad
   Male Und
   Female Und
   Sum =

   Do manual calculations per cell here (if desired)
                 A   B   C   D   E   F
   M Grad
   Fem Grad
   Male Und
   Female Und

   EXPECTED      A   B   C   D   E   F
   M Grad
   Fem Grad
   Male Und
   Female Und
   For this exercise - ignore the requirement for a correction for expected values less than 5.

   Interpretation:
   What is the value of the chi square statistic:
   What is the p-value associated with this value:
   Is the p-value < 0.05?
   Do you reject or not reject the null hypothesis:
   If you rejected the null, what is the Cramer's V correlation:
   What does this correlation mean?
   What does this decision mean for our equal pay question:

4. Based on our sample data, can we conclude that males and females are distributed across grades in a similar pattern within the population?
   What are the hypothesis statements:
   Ho:
   Ha:

   Do manual calculations per cell here (if desired)
                     A   B   C   D   E   F
   OBS COUNT - m
   OBS COUNT - f
   Sum =
   EXPECTED          A   B   C   D   E   F
   M
   F

   What is the value of the chi square statistic:
   What is the p-value associated with this value:
   Is the p-value < 0.05?
   Do you reject or not reject the null hypothesis:
   If you rejected the null, what is the Phi correlation:
   What does this correlation mean?
   What does this decision mean for our equal pay question:

5. How do you interpret these results in light of our question about equal pay for equal work?
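
To make the manual chi-square calculation in question 4 concrete, here is a minimal sketch in plain Python. The observed counts are tallied from the Gender1 and Grade columns of the data set; the critical value 11.0705 is the standard table value for alpha = 0.05 with 5 degrees of freedom, used instead of a p-value because the standard library has no chi-square distribution function. Variable names are illustrative.

```python
from math import sqrt

grades = ["A", "B", "C", "D", "E", "F"]
# Observed employee counts per grade, tallied from the data set.
observed = {
    "Male": [3, 3, 3, 2, 10, 4],
    "Female": [12, 4, 2, 3, 2, 2],
}

n = sum(sum(row) for row in observed.values())
col_totals = [sum(col) for col in zip(*observed.values())]

chi_sq = 0.0
for row in observed.values():
    row_total = sum(row)
    for obs, col_total in zip(row, col_totals):
        # Expected count under independence: row total * column total / n
        expected = row_total * col_total / n
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(grades) - 1)
CHI_SQ_CRIT_DF5 = 11.0705  # table value, alpha = 0.05, df = 5

# Cramer's V; with only two rows it equals the Phi value sqrt(chi_sq / n).
cramers_v = sqrt(chi_sq / (n * min(len(observed) - 1, len(grades) - 1)))

decision = "reject Ho" if chi_sq > CHI_SQ_CRIT_DF5 else "do not reject Ho"
print(f"chi-square = {chi_sq:.4f}, df = {df}, V = {cramers_v:.4f} -> {decision}")
```

Several expected counts here fall below 5 (for example 2.5 in grades C and D), which is the limitation the assignment says to ignore for this exercise. Question 3 uses the same loop with four rows (M Grad, Fem Grad, Male Und, Female Und) instead of two.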
Week 5. Correlation and Regression

1. Create a correlation table for the variables in our data set. (Use the Analysis ToolPak or StatPlus:mac LE function Correlation.)
   a. Reviewing the data levels from week 1, what variables can be used in a Pearson's Correlation table (which is what Excel produces)?
   b. Place table here (C8 in Output range box):
   c. Using r = approximately .28 as the significant r value (at p = 0.05) for a correlation between 50 values, what variables are significantly related to Salary? To compa?
   d. Looking at the above correlations - both significant or not - are there any surprises - by that I mean any relationships you expected to be meaningful and are not, and vice-versa?
   e. Does this help us answer our equal pay for equal work question?

2. Below is a regression analysis for salary being predicted/explained by the other variables in our sample (Midpoint, age, performance rating, service, gender, and degree variables). (Note: since salary and compa are different ways of expressing an employee’s salary, we do not want to have both used in the same regression.)
   Please interpret the findings.

   Ho: The regression equation is not significant.
   Ha: The regression equation is significant.
   Ho: The regression coefficient for each variable is not significant.
   Ha: The regression coefficient for each variable is significant.
   Note: technically we have one for each input variable. Listing it this way to save space.

   Sal
   SUMMARY OUTPUT

   Regression Statistics
   Multiple R           0.9915590747
   R Square             0.9831893985
   Adjusted R Square    0.9808437332
   Standard Error       2.6575925726
   Observations         50

   ANOVA
                df   SS                 MS               F                Significance F
   Regression   6    17762.2996738743   2960.383278979   419.1516111294   1.8121523852609E-36
   Residual     43   303.7003261257     7.062798282
   Total        49   18066

                        Coefficients    Standard Error   t Stat          P-value                Lower 95%       Upper 95%      Lower 95.0%     Upper 95.0%
   Intercept            -1.7496212123   3.6183676583     -0.4835388157   0.6311664899           -9.0467550427   5.547512618    -9.0467550427   5.547512618
   Midpoint             1.2167010505    0.0319023509     38.1382881163   8.66416336978111E-35   1.1523638283    1.2810382727   1.1523638283    1.2810382727
   Age                  -0.0046280102   0.065197212      -0.0709847876   0.9437389875           -0.1361107191   0.1268546987   -0.1361107191   0.1268546987
   Performance Rating   -0.0565964405   0.0344950678     -1.6407110971   0.1081531819           -0.1261623747   0.0129694936   -0.1261623747   0.0129694936
   Service              -0.0425003573   0.0843369821     -0.5039350033   0.6168793519           -0.2125820912   0.1275813765   -0.2125820912   0.1275813765
   Gender               2.420337212     0.8608443176     2.8115852804    0.0073966188           0.684279192     4.156395232    0.684279192     4.156395232
   Degree               0.2755334143    0.7998023048     0.3445019009    0.732148119            -1.3374216547   1.8884884833   -1.3374216547   1.8884884833

   Note: since Gender and Degree are expressed as 0 and 1, they are considered dummy variables and can be used in a multiple regression equation.

   Interpretation:
   For the Regression as a whole:
   What is the value of the F statistic:
   What is the p-value associated with this value:
   Is the p-value < 0.05?
   Do you reject or not reject the null hypothesis:
   What does this decision mean for our equal pay question:

   For each of the coefficients:     Intercept   Midpoint   Age   Perf. Rat.   Service   Gender   Degree
   What is the coefficient's p-value for each of the variables:
   Is the p-value < 0.05?
   Do you reject or not reject each null hypothesis:
   What are the coefficients for the significant variables?
   Using only the significant variables, what is the equation?   Salary =
   Is gender a significant factor in salary:
   If so, who gets paid more with all other things being equal?
   How do we know?

3. Perform a regression analysis using compa as the dependent variable and the same independent variables as used in question 2. Show the result, and interpret your findings by answering the same questions.
   Note: be sure to include the appropriate hypothesis statements.
   Regression hypotheses
   Ho:
   Ha:
   Coefficient hypotheses (one to stand for all the separate variables)
   Ho:
   Ha:
   Put C94 in output range box

   Interpretation:
   For the Regression as a whole:
   What is the value of the F statistic:
   What is the p-value associated with this value:
   Is the p-value < 0.05?
   Do you reject or not reject the null hypothesis:
   What does this decision mean for our equal pay question:

   For each of the coefficients:     Intercept   Midpoint   Age   Perf. Rat.   Service   Gender   Degree
   What is the coefficient's p-value for each of the variables:
   Is the p-value < 0.05?
   Do you reject or not reject each null hypothesis:
   What are the coefficients for the significant variables?
   Using only the significant variables, what is the equation?   Compa =
   Is gender a significant factor in compa:
   If so, who gets paid more with all other things being equal?
   How do we know?

4. Based on all of your results to date, do we have an answer to the question of whether males and females are paid equally for equal work?
   If so, which gender gets paid more? How do we know?
   Which is the best variable to use in analyzing pay practices - salary or compa? Why?
   What is most interesting or surprising about the results we got doing the analysis during the last 5 weeks?

5. Why did the single factor tests and analysis (such as t and single factor ANOVA tests on salary equality) not provide a complete answer to our salary equality question?
   What outcomes in your life or work might benefit from a multiple regression examination rather than a simpler one variable test?
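
The columns of the regression output in question 2 are linked by simple ratios, which can help when reading it. The sketch below, in plain Python, recomputes F as MS Regression / MS Residual and each t Stat as coefficient / standard error, then lists which coefficients have a p-value below 0.05. All numbers are copied from the SUMMARY OUTPUT above; the dictionary layout and variable names are just for illustration, and nothing here refits the regression itself.

```python
ALPHA = 0.05

# (coefficient, standard error, p-value) copied from the SUMMARY OUTPUT above.
coefficients = {
    "Intercept": (-1.7496212123, 3.6183676583, 0.6311664899),
    "Midpoint": (1.2167010505, 0.0319023509, 8.66416336978111e-35),
    "Age": (-0.0046280102, 0.065197212, 0.9437389875),
    "Performance Rating": (-0.0565964405, 0.0344950678, 0.1081531819),
    "Service": (-0.0425003573, 0.0843369821, 0.6168793519),
    "Gender": (2.420337212, 0.8608443176, 0.0073966188),
    "Degree": (0.2755334143, 0.7998023048, 0.732148119),
}

# Overall F statistic: mean square regression over mean square residual.
ss_regression, df_regression = 17762.2996738743, 6
ss_residual, df_residual = 303.7003261257, 43
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# Each t Stat is the coefficient divided by its standard error.
t_stats = {name: coef / se for name, (coef, se, _) in coefficients.items()}

# Variables whose coefficient p-value falls below alpha.
significant = [name for name, (_, _, p) in coefficients.items() if p < ALPHA]

print(f"F = {f_stat:.4f}")
for name, t in t_stats.items():
    flag = "significant" if name in significant else "not significant"
    print(f"{name:18s} t = {t:8.4f}  ({flag})")
```

The same checks apply to the compa regression in question 3 once its output has been produced.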
