Pell Grant (Questions 1 – 2)
For all schools which reported the statistic in 2013, the mean (in percentage) of students with a Pell grant was 52.9.
A random sample of 19 schools in 2015 had an average of 47.6 with a standard deviation of 23. Conduct a
hypothesis test to determine if the average (in percentage) of students with a Pell grant changed from the 2013
levels.
1. Choose the correct hypotheses for this test.
a. 𝐻0: x̄ = 47.6 ; 𝐻𝑎: x̄ […] 47.6
b. 𝐻0: […] = 52.9 ; 𝐻𝑎: […] 52.9
c. 𝐻0: […] = 23 ; 𝐻𝑎: […] 23
d. 𝐻0: […] = 52.9 ; 𝐻𝑎: […] 52.9
2. Calculate the value of the test statistic
a. t = 1.04
b. χ² = 26.36
c. t = -1.00
d. z = 5.30
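For Question 2, the population standard deviation is not given, so a one-sample t statistic, t = (x̄ − μ0) / (s / √n), is the natural choice. A minimal Python sketch using the numbers from the problem statement (the variable names are illustrative and not part of the original exam):

```python
import math

# Values taken from the problem statement
mu_0 = 52.9    # 2013 mean (percent), the hypothesized value
x_bar = 47.6   # 2015 sample mean (percent)
s = 23.0       # sample standard deviation
n = 19         # sample size

# One-sample t statistic with n - 1 degrees of freedom
standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error

print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
```

The printed value can be compared directly with the options above.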
Home Basketball Attendance (Questions 3 – 8)
Attendance at home basketball games at a university varies throughout the season. The standard deviation for
attendance over the past 3 seasons is easily calculated from online information and is found to be σ = 2055. An
employee in the athletics office wonders if attendance at games during the 1990s had a different standard deviation.
Because it is more difficult to find the information, the employee decides to randomly sample 15 games from the
1990s and calculates the sample standard deviation to be s = 1874. Is this evidence that the variance in attendance in
the past was different from 4223025?
3. Choose the correct null and alternative hypotheses for this test.
a. 𝐻0: 𝑠² = 1874 ; 𝐻𝑎: 𝑠² ≠ 1874
b. 𝐻0: 𝜎² = 3,511,876 ; 𝐻𝑎: 𝜎² < 3,511,876
c. 𝐻0: 𝜎 = 2055 ; 𝐻𝑎: 𝜎 > 2055
d. 𝐻0: 𝜎² = 4,223,025 ; 𝐻𝑎: 𝜎² ≠ 4,223,025
4. Let α = 0.10. What is the rejection rule?
a. Reject H0 if s < 2005
b. Reject H0 if χ² < 8.547
c. Reject H0 if χ² > 21.064 or if χ² < 7.790
d. Reject H0 if χ² < 6.571 or if χ² > 23.685
5. What is the test statistic?
a. χ² = 12.76
b. χ² = 18.04
c. χ² = 23.93
d. χ² = 11.64
6. What is the p-value for this test?
a. 0.635 b. 0.730 c. 0.365 d. 1.270
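For Questions 5 and 6, the chi-square statistic for a variance test is χ² = (n − 1)s² / σ0², with n − 1 degrees of freedom. The sketch below computes it with the Python standard library only. Because the degrees of freedom here are even (14), the upper-tail probability has a closed form as a finite sum; the helper name `chi2_upper_tail_even_df` is hypothetical, and doubling the smaller tail is one common convention for a two-sided p-value, assumed here.

```python
import math

sigma0_sq = 2055 ** 2   # hypothesized variance (4,223,025)
s = 1874                # sample standard deviation from the 15 sampled games
n = 15
df = n - 1

# Test statistic for a one-sample variance test
chi2_stat = df * s ** 2 / sigma0_sq

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("closed form only valid for even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

upper = chi2_upper_tail_even_df(chi2_stat, df)
lower = 1 - upper
p_value = 2 * min(lower, upper)   # two-sided: double the smaller tail (assumed convention)

print(f"chi-square = {chi2_stat:.2f}, lower tail = {lower:.3f}, "
      f"upper tail = {upper:.3f}, two-sided p = {p_value:.3f}")
```

Note that the two one-tail areas also appear among the options in Question 6, so it matters that the alternative hypothesis is two-sided.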
7. What assumptions must be true for any inferences we make from this hypothesis test to be valid?
a. The population must be approximately normal.
b. The sample size must be sufficiently large.
c. Both a and b must be true.
d. No assumptions need to be made.
8. Use the sample information to calculate a 90% confidence interval for the standard deviation of attendance for
home games in the 1990s.
a. 1107.7 ≤ σ ≤ 3992.9
b. 2075840 ≤ σ² ≤ 7482310
c. 1440.8 ≤ σ ≤ 2735.4
d. 1078 ≤ σ ≤ 2670
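For Question 8, a confidence interval for σ comes from inverting the chi-square statistic: (n − 1)s² / χ²(upper) ≤ σ² ≤ (n − 1)s² / χ²(lower), then taking square roots. The sketch below assumes the table values 6.571 and 23.685 for the 5th and 95th percentiles of a chi-square distribution with 14 degrees of freedom; these are hard-coded rather than computed.

```python
import math

s = 1874
n = 15
df = n - 1

# Assumed chi-square table values for df = 14 (90% interval, 5% in each tail)
chi2_lower_crit = 6.571
chi2_upper_crit = 23.685

# Interval for the variance, then for the standard deviation
var_low = df * s ** 2 / chi2_upper_crit
var_high = df * s ** 2 / chi2_lower_crit
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"{var_low:.0f} <= sigma^2 <= {var_high:.0f}")
print(f"{sd_low:.1f} <= sigma <= {sd_high:.1f}")
```

The question asks for the standard deviation, so the interval for σ² is only an intermediate step.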
San Francisco Public Library Employees (Questions 9 – 12)
In 2008, the average monthly contribution made to the retirement account for employees at the Public
Library department in the city of San Francisco was $750. We are interested to see whether this amount has
increased. If the contributions have not increased on average, management has plans to give a salary
increment across the board. A random sample of 20
of these
Public Library employees’ contributions gives a
mean monthly contribution of $817, and a standard deviation of
$482. We set up a hypothesis test as
follows:
𝐻0: 𝜇 = 750 vs 𝐻𝑎: 𝜇 > 750
9. What will be the value of the test statistic?
a. Z = 1.975 b. T = 1.96 c. T = 0.622 d. T = -0.622
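For Question 9, the same one-sample t formula applies, t = (x̄ − μ0) / (s / √n), since only the sample standard deviation is known. A short sketch with the stated values (variable names are illustrative):

```python
import math

mu_0 = 750    # 2008 average monthly contribution (hypothesized mean)
x_bar = 817   # sample mean
s = 482       # sample standard deviation
n = 20        # sample size

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
print(f"T = {t_stat:.3f} with {n - 1} degrees of freedom")
```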
10. What is the p-value and corresponding decision for this test
at a 5% significance level?
a. P-value is 0.271, we fail to reject 𝐻0
b. P-value is 0.271, we reject 𝐻0
c. P-value is 0.0733, we fail to reject 𝐻0
d. P-value is 0.0733, we reject 𝐻0
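For Question 10, the p-value is the upper-tail area of a t distribution with 19 degrees of freedom beyond the observed statistic. The sketch below approximates that area by integrating the t density with Simpson's rule, using only the standard library; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and in practice a statistics package or a t table would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n = 20
t_stat = (817 - 750) / (482 / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.3f}: {decision}")
```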
11. Suppose we repeat the hypothesis test with an updated
sample and new sample size of 68 employees this
month and calculate a p-value of 0.028. If α = 0.05, what
conclusion should we draw?
a. Reject H0.
b. Do not reject H0.
c. Reject Ha.
d. Do not reject Ha.
12. What is the impact of a Type I error in this context?
a. Management believes the contribution has increased when it
has and did not give the across-the-board salary
increment.
b. Management believes the contribution has increased when it
has not and did not give the across-the-board salary
increment.
c. Management believes the contribution has increased when it
has and gave the across-the-board salary increment.
d. Management believes the contribution has increased when it
has not and gave the across-the-board salary
increment.
13. Nationally, 65.4% of post-secondary institutions are
classified as two-year schools. I take a random sample of
80 institutions in California and find that 68.8% of these
institutions are classified as two-year schools. Steve
sees this and thinks that the proportion of two-year schools in
California must be higher than in the rest of the
country. His friend, Emma, calculates the probability of
observing a 𝑝̂ ≥ 0.688 if the proportion in California
is the same as the rest of the country to be 0.261 and argues that
we don’t have enough reason to support
Steve’s thought. Assuming Emma calculated the probability
correctly, who should we believe?
a. We should believe Steve because the sample of 80 California
schools was a random sample.
b. We should believe Steve because the difference between
0.688 and 0.654 is less than 0.05.
c. We should believe Emma because observing a sample
proportion of 0.688 or more when the true
proportion is 0.654 will happen about 26% of the time when we take
a random sample like this.
d. We should believe Emma because the difference between
0.688 and 0.261 is more than 0.05.
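Emma's probability in Question 13 can be checked with the normal approximation to the sampling distribution of a sample proportion, where the standard error under the null hypothesis is √(p0(1 − p0)/n). A minimal sketch, assuming that approximation is what she used:

```python
import math

p0 = 0.654      # national proportion of two-year schools
p_hat = 0.688   # sample proportion in California
n = 80          # sample size

# Standard error under the null hypothesis and the resulting z-score
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Upper-tail probability P(Z >= z) via the complementary error function
p_value = 0.5 * math.erfc(z / math.sqrt(2))
print(f"z = {z:.3f}, P(p_hat >= {p_hat}) = {p_value:.3f}")
```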
University Endowments (Questions 14 – 16)
Private colleges and universities rely on money contributed by
individuals and corporations for their operating
expenses. Much of this money is invested in a fund called an
endowment, and the college spends only the interest
earned by the fund. A recent survey of eight private colleges
revealed the following endowments (in millions of
dollars): 75.1, 53, 249.9, 497.2, 114.4, 167.8, 110.1, and 224.8.
14. What value would you use as an unbiased point estimate for
the standard deviation of the endowments of all private
colleges?
a. 143.247 b. 133.995 c. 141.1 d. 444.2
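For Question 14, the usual point estimate is the sample standard deviation, which divides by n − 1 rather than n. The sketch below computes both versions with Python's `statistics` module so the difference is visible; treating the n − 1 version as the intended answer is an assumption about what the question means by "unbiased".

```python
import statistics

# Endowments in millions of dollars, from the problem statement
endowments = [75.1, 53, 249.9, 497.2, 114.4, 167.8, 110.1, 224.8]

sample_sd = statistics.stdev(endowments)        # divides by n - 1
population_sd = statistics.pstdev(endowments)   # divides by n

print(f"sample SD (n - 1): {sample_sd:.3f}")
print(f"population-style SD (n): {population_sd:.3f}")
```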
15. Compute the p-value to test the hypothesis that the true
population variance of endowments made to private
universities is different from 19600 at the 5% significance level.
a. 0.041 b. 0.605 c. 0.395 d. 0.791
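For Question 15, the statistic is again χ² = (n − 1)s² / σ0², now with 7 degrees of freedom. The sketch below evaluates the chi-square CDF through the power series for the regularized lower incomplete gamma function, using only the standard library. The helper name `chi2_cdf` is hypothetical, and doubling the smaller tail for the two-sided p-value is an assumed convention.

```python
import math
import statistics

def chi2_cdf(x, df, terms=200):
    """P(X <= x) for a chi-square variable, via a power series (x > 0)."""
    a, half = df / 2, x / 2
    term = 1 / math.gamma(a + 1)
    total = term
    for k in range(1, terms):
        term *= half / (a + k)   # ratio of consecutive series terms
        total += term
    return math.exp(-half + a * math.log(half)) * total

endowments = [75.1, 53, 249.9, 497.2, 114.4, 167.8, 110.1, 224.8]
n = len(endowments)
sigma0_sq = 19600

chi2_stat = (n - 1) * statistics.variance(endowments) / sigma0_sq
cdf = chi2_cdf(chi2_stat, n - 1)
p_value = 2 * min(cdf, 1 - cdf)   # two-sided: double the smaller tail (assumed)

print(f"chi-square = {chi2_stat:.3f}, two-sided p = {p_value:.3f}")
```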
16. From Question 15 above, state your conclusion.
a. We reject the null and conclude that the true variance is
significantly different from 19600
b. We fail to reject the null and conclude that the true
variance is significantly different from 19600
c. We reject the null and conclude that the true variance is not
significantly different from 19600
d. We fail to reject the null and conclude that we do not have
enough evidence to suggest that the true
variance is significantly different from 19600
17. You are interested in purchasing a new car. One of the many
points you wish to consider is the resale
value of the car after 5 years. Since you are particularly
interested in a certain foreign sedan, you decide to
estimate the resale value of this car with a 99% confidence
interval. You manage to obtain data on 17
recently resold 5-year-old foreign sedans of the same model.
These 17 cars were resold at an average price
of $12,110 with a standard deviation of $600. Suppose that the
interval is calculated to be (11685,12535). A
friend of yours believes that the true average resale price for a
5-year-old car is significantly less than
$13,000. What will you conclude based on your
research?
a. I agree with my friend with a 99% level of confidence since
my entire confidence interval is
below 13,000
b. I agree with my friend because $13,000 is such a high amount
c. I cannot find enough reason to agree with my friend
d. I totally disagree with my friend on this issue.
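The interval quoted in Question 17 can be reproduced as x̄ ± t* · s / √n. The sketch below assumes the table value t* = 2.921 for 99% confidence with 16 degrees of freedom (hard-coded, not computed) and then checks whether $13,000 lies above the whole interval.

```python
import math

x_bar = 12110.0   # average resale price
s = 600.0         # sample standard deviation
n = 17            # number of resold cars
t_crit = 2.921    # assumed table value: 99% confidence, df = 16

margin = t_crit * s / math.sqrt(n)
ci_low, ci_high = x_bar - margin, x_bar + margin

claimed = 13000
entire_interval_below = ci_high < claimed

print(f"99% CI: ({ci_low:.0f}, {ci_high:.0f})")
print(f"Entire interval below {claimed}: {entire_interval_below}")
```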
18. It costs you $10 to draw a sample of size n=1 and measure
the attribute of interest. You have a budget of
$1,500. Do you have enough funds to estimate the population
mean for the attribute of interest with a 95%
confidence interval that is 6 units in width? Assume σ = 14.
a. Yes, because the required sample size is 84
b. No, because the required sample size is 84
c. Yes, because the required sample size is 221
d. No, because the required sample size is 160
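For Question 18, a width of 6 units means a margin of error of 3, and the required sample size is n = (z · σ / E)², rounded up. A minimal sketch using the standard library's `NormalDist` for the critical value (variable names are illustrative):

```python
import math
from statistics import NormalDist

sigma = 14             # assumed population standard deviation
width = 6              # desired full width of the interval
confidence = 0.95
cost_per_sample = 10   # dollars per observation
budget = 1500          # dollars

margin = width / 2
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96

# Round up, since a fractional observation is not possible
n_required = math.ceil((z * sigma / margin) ** 2)
total_cost = n_required * cost_per_sample
affordable = total_cost <= budget

print(f"n = {n_required}, cost = ${total_cost}, within budget: {affordable}")
```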
19. Choose the value of the Pearson's Correlation Coefficient (r)
that best describes the two plots.
a. I: -0.23, II: 0.812.
b. I: -0.812, II: 0.23.
c. I: 0.812, II: -0.23.
d. I: 0.812, II: 1.23.
e. I: 0.188, II: -0.23.
20. Consider the first and second exam scores of the 10 students
listed below. Calculate the Pearson's
correlation coefficient for the dataset below and interpret what
that means.
a. The correlation is -0.403. There is a moderate positive linear
association between Exam 1 and Exam 2.
b. The correlation is -0.403. There is a moderate negative linear
association between Exam 1 and Exam 2.
c. The correlation is 0.403. There is a moderate positive linear
association between Exam 1 and Exam 2.
d. The correlation is 0.403. There is a moderate negative linear
association between Exam 1 and Exam 2.
e. The correlation is 0.403. There is a perfect positive linear
association between Exam 1 and Exam 2.
21. You work for a company in the marketing department. Your
manager has tasked you with forecasting
sales by month for the next year. You notice that over the past
12 months sales have consistently gone up
in a linear fashion, so you decide to run a regression on the
company's sales history. If 10 months are
sampled and the regression output is given below, report the
regression equation.
a. (time) = 20.139*(sales) + 680.829
b. (time) = 680.829*(sales) + 20.139
c. (sales) = 680.829*(time) + 20.139
d. (sales) = 20.139*(time) + 680.829
e. (sales) = 20.139*(time)
22. While attempting to measure its risk exposure for the
upcoming year, an insurance company notices a
trend between the age of a customer and the number of claims
per year. It appears that the number of
claims keeps going up as customers age. After performing a
regression, they find that the relationship is
(claims per year) = 0.442*(age) + 2.642. If a customer is 39.621
years old and they make an average of
11.974 claims per year, what is the residual?
a. -27.647 b. 27.647 c. 8.18 d. 19.467 e. -8.18
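For Question 22, a residual is the observed value minus the value predicted by the regression equation. A short sketch with the stated coefficients (the function name `predicted_claims` is hypothetical):

```python
def predicted_claims(age):
    """Fitted line from the problem statement: claims = 0.442 * age + 2.642."""
    return 0.442 * age + 2.642

age = 39.621
observed = 11.974

predicted = predicted_claims(age)
residual = observed - predicted   # observed minus predicted

print(f"predicted = {predicted:.3f}, residual = {residual:.2f}")
```

A negative residual means the customer makes fewer claims than the line predicts for that age.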
23. Suppose that in a certain neighborhood, the cost of a home
is proportional to the size of the home in
square feet. If the regression equation quantifying this
relationship is found to be (cost) = 78.504*(size) +
871.128, what does the slope indicate?
a. When cost increases by 1 dollar, size increases by 871.128
square feet.
b. When size increases by 1 square foot, cost increases by
78.504 dollars.
c. When size increases by 1 square foot, cost increases by
871.128 dollars.
d. When cost increases by 1 dollar, size increases by 78.504
square feet.
e. We are not given the dataset, so we cannot make an
interpretation.
24. While attempting to measure its risk exposure for the
upcoming year, an insurance company notices a
trend between the age of a customer and the number of claims
per year. It appears that the number of
claims keeps going up as customers age. After performing a
regression, they find that the relationship is
(claims per year) = 0.444*(age) + 4.034. If a customer is 47.276
years old and they make an average of
14.507 claims per year, the residual is -10.518. Interpret this
residual in terms of the problem.
a) The number of claims per year is 14.507 claims less than
what we would expect.
b) The age is 10.518 years larger than what we would
expect.
c) The age is 10.518 years less than what we would
expect.
d) The number of claims per year is 10.518 claims greater than
what we would expect.
e) The number of claims per year is 10.518 claims less than
what we would expect.
25. Suppose that for a typical FedEx package delivery, the cost
of the shipment is a function of the weight of
the package measured in ounces. You want to try to predict the
cost of a typical shipment given package
dimensions. If 10 packages in a city are sampled and the
regression output is given below, what can we
conclude about the slope of weight?
a) The slope is equal to 0.
b) The slope significantly differs from 0.
c) Since we are not given the dataset, we do not have enough
information to determine if the slope differs
from 0.
d) The slope is 0.893 and therefore differs from 0.
e) Not enough evidence was found to conclude the slope differs
significantly from 0.
26. You work for a parts manufacturing company and are tasked
with exploring the wear lifetime of a certain
bearing. You gather data on oil viscosity used and load. You see
the regression output given below. What
is the multiple regression equation?
a) (lifetime) = 7.868*(viscosity) + 0.025*(load) + 89.107
b) (lifetime) = 7.868*(load) + 0.025*(viscosity) + 89.107
c) (lifetime) = 7.868*(viscosity) + 0.025*(load)
d) (lifetime) = 1.057*(viscosity) + 0.086*(load) + 89.107
e) We do not have enough information to determine the
regression equation.
27. Suppose the sales (1000s of $) of a fast food restaurant are a
linear function of the number of competing
outlets within a 5 mile radius and the population (1000s of
people) within a 1 mile radius. The regression
equation quantifying this relation is (sales) =
1.933*(competitors) + 6.138*(population) + 6.445. Suppose
the sales of a store that has 6.737
competitors and a population of 9.746 thousand
people within a 1 mile radius are 50.18 (1000s of $). What is the
residual?
a) 29.1086
b) We do not know the observations in the data set, so we
cannot answer that question.
c) 72.5516
d) -29.1086
e) 22.6636
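For Question 27, the residual is again observed minus predicted, with the prediction now coming from a two-predictor equation. A minimal sketch using the stated coefficients (the function name `predicted_sales` is hypothetical):

```python
def predicted_sales(competitors, population):
    """Fitted equation from the problem statement, sales in 1000s of dollars."""
    return 1.933 * competitors + 6.138 * population + 6.445

competitors = 6.737
population = 9.746      # 1000s of people within a 1 mile radius
observed_sales = 50.18  # 1000s of dollars

predicted = predicted_sales(competitors, population)
residual = observed_sales - predicted

print(f"predicted = {predicted:.4f}, residual = {residual:.4f}")
```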
28. Suppose that a researcher wants to predict the weight of
female college athletes based on their height
and percent body fat. If a sample is taken and the following
regression table is produced, interpret the
slope of the percent body fat variable.
a) When percent body fat increases by 0.92 percent, weight
increases by 1 pound, holding all other variables
constant.
b) We do not have enough information to say.
c) When percent body fat decreases by 1 percent, weight
increases by 0.92 pounds, holding all other
variables constant.
d) When percent body fat increases by 1 percent, weight
decreases by 0.92 pounds, holding all other
variables constant.
e) When percent body fat increases by 1 percent, weight
increases by 0.92 pounds, holding all other variables
constant.
29. A trucking company considered a multiple regression model
for relating the dependent variable of total
daily travel time for one of its drivers (hours) to the predictors
distance traveled (miles) and the number of
deliveries made. After taking a random sample, a multiple
regression was performed, and the equation
is (time) = 0.03*(distance) + 1.02*(deliveries) - 1.305. Suppose
for a given driver's day, he is scheduled to
drive 206.609 miles and make 13.27 stops. Suppose it took him
10.862 hours to complete the trip yielding
a residual of 7.567. What is the best interpretation of this
residual?
a) The number of hours taken is 7.567 hours larger than what
we would expect.
b) The number of hours taken is 10.862 hours larger than what
we would expect.
c) The number of deliveries is 7.567 less than what we would
expect.
d) The number of hours taken is 7.567 hours less than what we
would expect.
e) The number of deliveries is 7.567 larger than what we would
expect.
30. Cardiorespiratory fitness is widely recognized as a major
component of overall physical well-being. Direct
measurement of maximal oxygen uptake (VO2max) is the single
best measure of such fitness, but direct
measurement is time-consuming and expensive. It is therefore
desirable to have a prediction equation for
VO2max in terms of easily obtained quantities. A sample is
taken, and variables measured are age (years),
time necessary to walk 1 mile (mins), and heart rate at the end
of the walk (bpm) in addition to the VO2
max uptake. The following output is from a multiple regression.
Based on the F-test alone, what is the
correct conclusion about the regression slopes?
a) At least one of the regression slopes does not equal zero.
b) All the regression slopes are equal to zero.
c) We did not find significant evidence to conclude that at least
one slope differs from zero.
d) We do not have the dataset, therefore, we are unable to make
a conclusion about the slopes.
e) All the regression slopes do not equal zero.