2. Welcome to Success Formula Question Pool
Disclaimers
• All slides and their materials are the property of Success Formula
• You get exclusive, free personal access once you buy the course the slides are made for
• The slides are individually marked, and Success Formula can track which users they belong to
• No part of this slide deck may be reproduced, distributed, or transmitted (hereafter in this slide
referred to together as "Shared") in any form or by any means, including sharing the material on
platforms such as StudyDrive
• If slides are shared, Success Formula can pursue legal action against the sharing party in line
with European and Dutch law (copyright laws)
Error Bounty
• If you find any mistake in this slide deck, let us know and we will refund you the cost of the slides
• Only the first person indicating the mistake gets the refund
3. Answers
Question
Some people seem to like Breaking Bad, others like Prison Break. What is the percentage of people that
watch TV?
A. The Walking Dead
B. Depends on the year
C. All of them
D. Answer D because it is the best answer
Answer: C
Introduction question (slide legend: question topic, the question, difficulty, answers, correct answer)
6. Answers
Question
Florian wants to show Julian a new magic trick. As part of the trick, Julian has to pull a card out of a 52
card deck, 3 times in a row, each time keeping the card before pulling the next one. There are 26 red
cards and 26 black cards.
Which statement is incorrect?
A. The probability that out of the three chosen cards, there is at least one red card or at least one black
card is equal to 1
B. The outcome of the 2nd trial will influence the outcome of the 3rd trial
C. The probability of picking a queen of hearts equals the probability of picking a queen of hearts
given that in the previous trial Julian picked a 7 of spades
D. The sample space is all the possible combinations of cards that can be drawn in a sample of 3
Answer: C
1. Probability Theory
7. 1E. Probability Theory
Question
Florian wants to show Julian a new magic trick. As part of the trick, Julian has to pull a card out of a 52
card deck, 3 times in a row, each time keeping the card before pulling the next one. There are 26 red
cards and 26 black cards. Which statement is incorrect?
Solution
A. Correct. Every card in the deck is either red or black, so the three chosen cards certainly
include at least one red card or at least one black card. The probability is therefore equal to 1
B. Correct. Every time Julian picks a card, he does not put it back, meaning that the outcome of each
trial influences the next one (the events become dependent)
C. Incorrect. P(QH) = P(QH/7S) would be correct only if the events were independent, in other
words, if Julian put his chosen card back in the deck after every trial. Without replacement,
P(QH) = 1/52 while P(QH/7S) = 1/51
D. Correct. Julian picks 3 cards in total, so any possible combination that he can make with 3 cards is
included in the sample space
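The comparison in statement C can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
from fractions import Fraction

# Probability that a draw from the full 52-card deck is the queen of hearts.
p_qh = Fraction(1, 52)

# After the 7 of spades has been drawn and kept, 51 cards remain
# and the queen of hearts is still one of them.
p_qh_given_7s = Fraction(1, 51)

# The two values differ, so the draws are dependent.
print(p_qh, p_qh_given_7s, p_qh == p_qh_given_7s)
```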
8. Answers
Question
Suppose that 2 dice are rolled at the same time. Calculate the following probabilities:
• P(A): The sum of the two numbers is equal to 1
• P(B): The sum of the two numbers is equal to 5
• P(C): The sum of the two numbers is less than 13
A. P(A) = 0.5, P(B) = 0.23, P(C) = 0
B. P(A) = 0, P(B) = 0.111, P(C) = 1
C. P(A) = 1, P(B) = 0.12, P(C) = 0
D. The probabilities cannot be calculated
Answer: B
2. Probability Theory
9. 2E. Probability Theory
Question
Suppose that 2 dice are rolled at the same time.
Calculate the following probabilities:
• P(A): The sum of the two numbers is equal to 1
• P(B): The sum of the two numbers is equal to 5
• P(C): The sum of the two numbers is less than 13
Sample Space:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
Solution
No combination resulting from rolling 2 dice at the same time can give a sum equal to 1,
since dice do not have the number 0.
• The smallest possible sum is 2, resulting from the combination (1,1)
• P(A) = 0
To calculate P(B), we need to identify the combinations in our sample space that yield a sum
of 5. There are 4 such combinations: (1,4), (2,3), (3,2) and (4,1).
• We can use the general formula
• P(event) = number of outcomes in the event / total number of outcomes
• P(B) = 4/36 = 1/9 ≈ 0.111
The combination with the largest sum is (6,6), with a sum of 12.
• This means that all possible combinations yield a sum lower than 13
• P(C) is the probability of the entire sample space
• P(C) = 1
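The counting argument above can be sketched in a few lines by enumerating the 36 outcomes. The helper name `probability` is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Share of outcomes in the sample space for which event(outcome) is true."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_a = probability(lambda o: sum(o) == 1)   # impossible event
p_b = probability(lambda o: sum(o) == 5)   # (1,4), (2,3), (3,2), (4,1)
p_c = probability(lambda o: sum(o) < 13)   # the entire sample space

print(p_a, p_b, p_c)  # 0 1/9 1
```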
10. Answers
Question
An experiment has four mutually exclusive outcomes, A, B, C, and D. If P(A) = 0.33, P(B) = 0.17, P(C) =
0.43, P(D) = 0.07, which of the following statements must be true?
A. All of the events are independent with each other
B. The marginal probability of A equals the conditional probability of A given D
C. The joint probability of C and B is equal to 0
D. None of the alternatives is correct
Answer: C
3. Probability Theory
11. 3E. Probability Theory
Question
An experiment has four mutually exclusive outcomes, A, B, C, and D. If P(A) = 0.33, P(B) = 0.17, P(C) =
0.43, P(D) = 0.07, which of the following statements must be true?
Solution
A. Incorrect. Since all 4 events are mutually exclusive, they cannot happen at the same
time. Thus, we know that our events must be dependent on each other.
B. Incorrect. This is only the case when the 2 events are independent of one another
[P(A) = P(A/B)]. Here A and D are mutually exclusive, so P(A/D) = 0 while P(A) = 0.33.
C. Correct. Our events are mutually exclusive, meaning that they cannot happen at the same time.
[P(C AND B) = 0]
D. Incorrect. C is the correct statement.
12. Answers
Question
Suppose we conduct a random experiment and two events, A and B are independent. Which of the
following rules can we use to prove the relationship between A and B?
A. P(A and B) = 0
B. P(A and B) = P(A) x P(B/A)
C. P(A or B) = P(A) + P(B) − P(A and B)
D. P(A) = P(A/B)
Answer: D
4. Probability Theory
13. 4E. Probability Theory
Question
Suppose we conduct a random experiment and two events, A and B are independent. Which of the
following rules can we use to prove the relationship between A and B?
Solution
A. Incorrect. P(A and B) = 0 is the rule for spotting disjoint events. It shows that the two events cannot
happen at the same time.
B. Incorrect. P(A and B) = P(A) x P(B/A) is the general multiplication rule
C. Incorrect. P(A or B) = P(A) + P(B) − P(A and B) is the general addition rule
D. Correct. P(A) = P(A/B) is a rule for spotting independent events, showing that the probability of
event A is not influenced by the occurrence of event B
14. Answers
Question
A recent survey showed that 45% of Success Formula students prefer to visit Tapijn park to relax after a
long day of studying. Also, 27% of UM students like to go to both Tapijn park and the city center to
relax. Finally, the survey showed that 40% of students said that they don't visit the city center for some
time off. Based on the above data, determine the following probabilities:
a. P(A): the probability that a randomly selected UM student visits Tapijn given that he/she also
visits the city center
b. P(B): the probability that a randomly selected UM student visits Tapijn or visits the city center
A. P(A) = 0.45, P(B) = 0.27
B. P(A) = 0.88, P(B) = 0
C. P(A) = 0.18, P(B) = 0.85
D. P(A) = 0.45, P(B) = 0.78
Answer: D
5. Probability Theory
15. 5E. Probability Theory
Question
A recent survey showed that 45% of Success
Formula students prefer to visit Tapijn park to
relax after a long day of studying. Also, 27% of
UM students both like to go to Tapijn park and
the city center to relax. Finally, the survey
showed that 40% of students said that they don't
visit the city center for some time off. Based on
the above data, determine the following
probabilities:
a. P(A): the probability that a randomly
selected UM student visits Tapijn given
that he/she also visits the city center
b. P(B): the probability that a randomly
selected UM student visits Tapijn or
visits the city center
P(Tapijn) = 0.45
P(Tapijn AND City) = 0.27
P(City') = 0.4
P(City) = 1 − P(City')
P(City) = 1 − 0.4 = 0.6
Solution
→ For P(A) we are looking for P(Tapijn/City)
→ We can first check whether these 2 events are independent
• P(A AND B) = P(A) × P(B) → rule for spotting independence
• 0.27 = 0.45 × 0.6
• 0.27 = 0.27 → Tapijn and City are independent
• P(Tapijn/City) = P(Tapijn)
• P(A) = 0.45
→ For P(B) we want P(Tapijn OR City)
→ The joint probability of these events is not equal to 0, thus the events are non-disjoint
→ We can use the general addition rule
• P(B) = P(Tapijn) + P(City) − P(Tapijn AND City)
• P(B) = 0.45 + 0.6 − 0.27
• P(B) = 0.78
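The same steps can be reproduced numerically. This is a small sketch using only the three figures given in the question; the variable names are illustrative.

```python
import math

p_tapijn = 0.45          # P(Tapijn)
p_both = 0.27            # P(Tapijn AND City)
p_city = 1 - 0.40        # complement rule: P(City) = 1 - P(City')

# Independence check: does the joint probability equal the product of the marginals?
independent = math.isclose(p_both, p_tapijn * p_city)

# Conditional probability and general addition rule.
p_a = p_both / p_city                  # P(Tapijn/City)
p_b = p_tapijn + p_city - p_both       # P(Tapijn OR City)

print(independent, round(p_a, 2), round(p_b, 2))
```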
16. Answers
Question
Suppose one runs a random experiment with 3 events (A, B, C). Events A and B are disjoint, C is
independent of A and dependent with B. P(B) = 0.3, P(C/B) = 0.135, P(C/A) =0.48, P(C and A) = 0.16.
Calculate the following probabilities:
a. P(C)
b. P(A and B)
c. P(B or C)
d. P(A or B)
A. P(C) = 0.48, P(A and B) = 0, P(B or C) = 0.74, P(A or B) = 0.63
B. P(C) = 0.48, P(A and B) = 0.0405, P(B or C) = 0.78, P(A or B) = 0
C. P(C) = 0.48, P(A and B) = 0, P(B or C) = 0.63, P(A or B) = 0.74
D. P(C) = 0.48, P(A and B) = 0.73, P(B or C) = 0.86, P(A or B) = 0.63
Answer: A
6. Probability Theory
17. 6E. Probability Theory
Question
Suppose one runs a random experiment with 3
events (A, B, C). Events A and B are disjoint, C is
independent of A and dependent with B. P(B) =
0.3, P(C/B) = 0.135, P(C/A) =0.48, P(C and A) =
0.16. Calculate the following probabilities:
a. P(C)
b. P(A and B)
c. P(B or C)
d. P(A or B)
Graph (diagram of Event A, Event B and Event C)
Solution
Since events A and C are independent we can say:
• P(C) = P(C/A)
• P(C) = 0.48
We know that events A and B are disjoint and we also see that there is no intersection in the graph:
• P(A and B) = 0
P(B or C) = P(B) + P(C) − P(B and C)
• We do not have P(B and C) but we can find it using the multiplication rule
• P(B and C) = P(B) x P(C/B) = 0.3 x 0.135 = 0.0405
• P(B or C) = 0.3 + 0.48 − 0.0405 ≈ 0.74
Since A and B are disjoint events we will use the special form of the addition rule:
• P(A or B) = P(A) + P(B)
• We can calculate P(A) using the multiplication rule for independent events
• P(C and A) = P(A) x P(C)
• → P(A) = 0.16/0.48 ≈ 0.33
P(A or B) = 0.33 + 0.3 = 0.63
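The chain of rules used above can be written out as a short calculation. This is a sketch under the assumptions stated in the question (A and B disjoint, A and C independent); the variable names are illustrative.

```python
# Values given in the question.
p_b = 0.3
p_c_given_b = 0.135
p_c_given_a = 0.48
p_c_and_a = 0.16

p_c = p_c_given_a                  # A and C independent, so P(C) = P(C/A)
p_a_and_b = 0.0                    # A and B are disjoint
p_b_and_c = p_b * p_c_given_b      # general multiplication rule
p_b_or_c = p_b + p_c - p_b_and_c   # general addition rule
p_a = p_c_and_a / p_c              # from P(C and A) = P(A) x P(C)
p_a_or_b = p_a + p_b               # addition rule for disjoint events

print(round(p_c, 2), p_a_and_b, round(p_b_or_c, 2), round(p_a_or_b, 2))
```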
18. Answers
Question
Remco decides to investigate which Dutch delicacy is most preferred by students in Maastricht. He
writes down his results in the following table. Calculate the following probabilities:
1. The probability that we randomly select a student who likes fries, given that they are a male
2. The probability that we randomly select a student who is a female, given they like fries
3. The probability that the student likes bitterballen
A. P(1) = 66.67%, P(2) = 34.78%, P(3) = 32.5%
B. P(1) = 20%, P(2) = 66.67%, P(3) = 17.5%
C. P(1) = 34.78%, P(2) = 33.33%, P(3) = 32.5%
D. P(1) = 34.78%, P(2) = 23.52%, P(3) = 17.5%
Answer: C
7. Probability Theory
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200
19. 7E. Probability Theory
Question
Remco decides to investigate which Dutch
delicacy is most preferred by students in
Maastricht. He writes down his results in the
following table. Calculate the following
probabilities:
1. The probability that we randomly select a
student who likes fries, given that they are a
male
2. The probability that we randomly select a
student who is a female, given they like fries
3. The probability that the student likes
bitterballen
Solution
P(1) = P(Fries/Male)
• It is a conditional probability, so we are not working within the entire sample space
• The condition indicates the denominator
• P(1) = 40/115 = 34.78%
P(2) = P(Female/Fries)
• P(2) = 20/60 = 33.33%
P(3) = P(Bitterballen)
• It is the marginal probability within the entire sample space
• P(3) = 65/200 = 32.5%
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200
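As a cross-check, the three probabilities can be computed directly from the table counts. This is a minimal sketch; the dictionary layout and the helper `column_total` are assumptions made for the example.

```python
# Counts from the table, by sex and preferred delicacy.
counts = {
    "Male":   {"Fries": 40, "Bitterballen": 35, "Stroopwaffles": 40},
    "Female": {"Fries": 20, "Bitterballen": 30, "Stroopwaffles": 35},
}

total = sum(sum(row.values()) for row in counts.values())  # 200 students

def column_total(snack):
    """Number of students (male and female) who prefer the given snack."""
    return sum(row[snack] for row in counts.values())

# Conditional probabilities: the condition sets the denominator.
p1 = counts["Male"]["Fries"] / sum(counts["Male"].values())   # P(Fries/Male)
p2 = counts["Female"]["Fries"] / column_total("Fries")        # P(Female/Fries)
# Marginal probability: evaluated over the entire sample space.
p3 = column_total("Bitterballen") / total                     # P(Bitterballen)

print(f"{p1:.2%} {p2:.2%} {p3:.2%}")
```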
20. Answers
Question
Refer to the table from the previous question. Which of the following statements is correct:
A. The probability P(Bitterballen/Female) is not evaluated across the entire sample space
B. The events of picking randomly someone that is a female and of picking randomly someone who
likes stroopwaffles are disjoint
C. The marginal probability of P(Fries) is equal to the conditional probability of P(Fries/Male)
D. The events of randomly picking a male and randomly picking someone that likes stroopwaffles are
independent
Answer: A
8. Probability Theory
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200
21. 8E. Probability Theory
Question
Refer to the table from the previous question.
Which of the following statements is correct:
Solution
A. Correct. P(Bitterballen/Female) is not evaluated across the entire sample space.
Conditional probabilities are evaluated across a subset of the entire sample space, in this
case across the subset of females.
B. Incorrect. We can see from the table that there are females who prefer stroopwaffles
(n=35), so these 2 events can happen at the same time (not disjoint)
C. Incorrect. P(Fries) ≠ P(Fries/Male)
P(Fries) = 60/200 = 0.3
P(Fries/Male) = 40/115 ≈ 0.35
D. Incorrect. P(Male) ≠ P(Male/Stroopwaffles)
P(Male) = 115/200 = 0.575
P(Male/Stroopwaffles) = 40/80 = 0.5
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200
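The independence check in statement D can be sketched as follows, using only counts from the table. The variable names are illustrative.

```python
import math

total = 200
male_total = 115
stroop_total = 80
male_and_stroop = 40

p_male = male_total / total                          # marginal probability
p_male_given_stroop = male_and_stroop / stroop_total  # conditional probability

# Independence would require P(Male) = P(Male/Stroopwaffles).
independent = math.isclose(p_male, p_male_given_stroop)

print(p_male, p_male_given_stroop, independent)
```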
22. Answers
Question
The probability of meeting someone who wears eyeglasses randomly in the street is 0.55. When
meeting 4 random people, what is the probability that the number of people that you meet wearing
eyeglasses is 3 or higher?
A. P(X ≥ 3) = 0.392
B. P(X ≥ 3) = 0.346
C. P(X ≥ 3) = 0.092
D. The probability cannot be calculated because we do not have the sample size
Answer: A
9. Probability Theory
23. 9E. Probability Theory
Question
The probability of meeting someone who wears
eyeglasses randomly in the street is 0.55. When
meeting 4 random people, what is the probability
that the number of people that you meet wearing
eyeglasses is 3 or higher?
Solution
Tree diagram: four successive meetings, each branching into G (wears glasses, probability 0.55) and NG (no glasses, probability 0.45)
24. 9E. Probability Theory
Find the Right Combinations
Since we are looking for the probability of meeting 3 or more people with glasses
in our sample of 4, the right combinations are the following:
• G-G-G-G
• G-G-G-NG
• G-G-NG-G
• G-NG-G-G
• NG-G-G-G
Calculate the Probabilities
We need to calculate the probabilities using multiplication for each of the
combinations:
• G-G-G-G → 0.55 x 0.55 x 0.55 x 0.55 = 0.092
• G-G-G-NG → 0.55 x 0.55 x 0.55 x 0.45 = 0.075
• G-G-NG-G → 0.55 x 0.55 x 0.45 x 0.55 = 0.075
• G-NG-G-G → 0.55 x 0.45 x 0.55 x 0.55 = 0.075
• NG-G-G-G → 0.45 x 0.55 x 0.55 x 0.55 = 0.075
Sum Them Up
We need to add all of the probabilities we just calculated to find the overall
probability of meeting 3 or more people with glasses [P(X ≥ 3)]
• 0.092 + 0.075 + 0.075 + 0.075 + 0.075 = 0.392
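The enumeration above is equivalent to summing binomial probabilities for 3 and 4 successes. The sketch below uses the binomial formula, which the slides do not name explicitly; the helper `binomial_pmf` is an assumption of this example. Without intermediate rounding the result is about 0.391, while the slide's 0.392 comes from rounding each term first.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.55

# P(X >= 3) = P(X = 3) + P(X = 4)
p_at_least_3 = sum(binomial_pmf(k, n, p) for k in (3, 4))

print(round(p_at_least_3, 3))
```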
25. Answers
Question
Given the following probability distribution, what is the approximate variance of X?
A. 4.05
B. -1.66
C. 7.38
D. 15.52
Answer: D
10. Probability Theory
X P(x)
0 0.4
1 0.8
2 0.32
3 0.15
4 0.54
26. 10E. Probability Theory
Question
Given the following probability distribution, what is the variance of X?
X P(x)
0 0.4
1 0.8
2 0.32
3 0.15
4 0.54
Solution
→ First, we need to calculate the expected value in order to use it in the formula for the variance:
• µ_X = Σ P(x) · x = 0 x 0.4 + 1 x 0.8 + 2 x 0.32 + 3 x 0.15 + 4 x 0.54 = 4.05
→ We can now calculate the variance using the formula σ_X² = Σ P(x) · (x − µ_X)²
• σ_X² = 0.4(0 − 4.05)² + 0.8(1 − 4.05)² + 0.32(2 − 4.05)² + 0.15(3 − 4.05)² + 0.54(4 − 4.05)²
σ_X² = (6.56) + (7.44) + (1.34) + (0.17) + (0.00135)
σ_X² ≈ 15.51, which matches answer D (15.52) up to rounding
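The two formulas can be applied mechanically as in the sketch below. The table values are taken verbatim from the slide; note that they sum to 2.21 rather than 1, so the code reproduces the slide's arithmetic rather than describing a valid probability distribution.

```python
# x -> P(x), copied as printed on the slide (these values do not sum to 1).
distribution = {0: 0.4, 1: 0.8, 2: 0.32, 3: 0.15, 4: 0.54}

total_probability = sum(distribution.values())

# Expected value: sum of x * P(x).
mean = sum(x * p for x, p in distribution.items())

# Variance: sum of P(x) * (x - mean)^2.
variance = sum(p * (x - mean) ** 2 for x, p in distribution.items())

print(round(total_probability, 2), round(mean, 2), round(variance, 2))
```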
28. Answers
Question
Thomas takes a standardized test as part of his university application. Standardized tests allow
comparisons to be made regarding student achievement. When he received his results, he was told that
he scored -0.28 in terms of Z-scores. However, he is not sure whether that is a good or bad result.
Given that the test scores are normally distributed, what can he conclude from the result?
A. He did better than half of the participants
B. He did worse than half of the participants
C. He did worse than 28% of the participants
D. Nothing can be said because we do not have the standard deviation and the mean
Answer: B
1. Probability Distribution
29. 1E. Probability Distribution
Question
Thomas takes a standardized test as part of his university application. Standardized tests allow
comparisons to be made regarding student achievement. When he received his results, he was told that
he scored -0.28 in terms of Z-scores. However, he is not sure whether that is a good or bad result. Given
that the test scores are normally distributed, what can he conclude from the result?
Solution
→ Thomas has a Z-score of -0.28, which means that he scored 0.28 standard deviations below
the mean. The negative sign indicates the direction relative to the mean. In a normal distribution the mean
has 50% of the scores below it and 50% of the scores above it. Since Thomas is on the left side, we can
say that he performed worse than 50% of the test takers.
31. Answers
Question
Lea decides to investigate the average income distribution in her hometown. She observes that the
majority of households have a low to middle income and a small minority with a high-income.
Which of the following statements is correct?
A. Scores located within 1 standard deviation to the left and right of the mean make up 68% of the
entire data set
B. A household with an income of 2.3 standard deviations above the mean is in the top 2.5% of the
population
C. The variable in question is a discrete variable
D. None of the above statements is correct
Answer: D
2. Probability Distribution
32. 2E. Probability Distribution
Question
Lea decides to investigate the average income distribution in her hometown. She observes that the
majority of households have a low to middle income and a small minority with a high-income.
Which of the following statements is correct?
Solution
→ From the description, we can understand that the distribution of average income is right-skewed
rather than normal.
→ Alternatives A) and B) are wrong because they rely on the rule of thumb (68%-95%-99.7%), which
can only be used for normal distributions
→ Alternative C) is wrong because the variable of average income can take infinitely many possible
values, thus the variable is continuous
33. Answers
Question
Alexandra decides to measure extraversion scores of students at Success Formula. The scores are well
modeled by a normal distribution with a mean of 72 and a standard deviation of 14. What is the
probability of a randomly selected person to score between 66 and 76 for extraversion?
A. 28.05%
B. 61.41%
C. 32.98%
D. 40.82%
Answer: A
3. Probability Distribution
34. 3E. Probability Distribution
Question
Alexandra decides to measure extraversion scores of students at Success Formula. The scores are well
modeled by a normal distribution with a mean of 72 and a standard deviation of 14. What is the
probability of a randomly selected person to score between 66 and 76 for extraversion?
Solution
Calculate the z-scores: z1 = (76 − 72)/14 = 0.29 and z2 = (66 − 72)/14 = −0.43
Look up probabilities in z-table: z1 = 0.29 → 61.41% and z2 = −0.43 → 33.36%
Calculate the probability that the score is between 66 and 76: 61.41% − 33.36% = 28.05%
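The table lookup can be cross-checked with Python's standard library, which evaluates the normal CDF without rounding the z-scores to two decimals. For that reason the sketch below gives roughly 27.8% rather than exactly 28.05%; the variable names are illustrative.

```python
from statistics import NormalDist

# Extraversion scores: normal with mean 72 and standard deviation 14.
scores = NormalDist(mu=72, sigma=14)

# P(66 < X < 76) = left-sided probability at 76 minus left-sided probability at 66.
p_between = scores.cdf(76) - scores.cdf(66)

print(f"{p_between:.2%}")
```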
35. Answers
Question
Suppose that Alexandra measures extraversion scores for a different population with a mean of 80 and
a standard deviation of 9. What is the probability that a randomly selected person scores higher than
91?
A. 73.89%
B. 11.12%
C. 40.57%
D. 55.63%
Answer: B
4. Probability Distribution
36. 4E. Probability Distribution
Question
Suppose that Alexandra measures extraversion scores for a different population with a mean of 80 and
a standard deviation of 9. What is the probability that a randomly selected person scores higher than
91?
Solution
Calculate the z-score: z1 = (x − µ)/σ = (91 − 80)/9 = 1.22
Look up the probability in the z-table: z1 = 1.22 → 0.8888 (this is the left-sided probability)
Calculate the probability that the score is higher than 91 (right-sided probability):
1 − 0.8888 = 0.1112 → 11.12%
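A right-sided probability is one minus the left-sided CDF value, which can be sketched as below. Because the z-score is not rounded to 1.22 first, the result (about 11.1%) differs slightly from the table-based 11.12%.

```python
from statistics import NormalDist

# Extraversion scores in the second population: mean 80, standard deviation 9.
scores = NormalDist(mu=80, sigma=9)

# P(X > 91) = 1 - P(X <= 91)
p_above = 1 - scores.cdf(91)

print(f"{p_above:.2%}")
```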
37. Answers
Question
According to the Central Limit Theorem:
A. The sample distribution becomes normal if there is a sufficient sample size (n>25)
B. The sampling distribution becomes normal only when the population distribution is normal
C. Regardless of the shape of the population distribution, the sampling distribution will always be
normal
D. As the sample size increases, the sample mean and standard deviation will be closer in value to the
population mean µ and standard deviation σ
Answer: D
5. Probability Distribution
38. 5E. Probability Distribution
Questions
According to the Central Limit Theorem:
Solution
A. Incorrect. It is not the sample distribution that approaches normality when there is a sufficiently
large sample. It is the sampling distribution.
B. Incorrect. The sampling distribution is indeed normal when the population distribution is normal,
but it can also approach normality whenever the sample size is sufficiently large, regardless of the
population's shape
C. Incorrect. The sampling distribution is not always normal. For a small sample size, it has a similar
shape to the population distribution and is not necessarily normal. For a large sample size, it becomes
approximately normal
D. Correct. As the sample size becomes larger, the sample mean and the sample variance
become approximately equal to those of the population.
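A small simulation can illustrate the behaviour of the sampling distribution for a skewed population. This is only a sketch: the exponential population, the sample size of 40, the 5000 repetitions and the seed are all assumptions chosen for the illustration, not values from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 40               # size of each sample
repetitions = 5000   # number of samples drawn

# Right-skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

# The sample means should centre on the population mean (1),
# with a spread close to the standard error sigma / sqrt(n).
centre = mean(sample_means)
spread = stdev(sample_means)
expected_standard_error = 1 / sqrt(n)

print(round(centre, 3), round(spread, 3), round(expected_standard_error, 3))
```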
39. Answers
Question
Maja plans to study the effects of Omega-3 supplements on antisocial behaviour. She develops a
measurement which will be filled by her participants before and after a 2-month long trial during which
subjects will be taking daily omega-3 supplements. However, she has trouble recruiting a high number
of participants.
Given that the sample size is not large enough, which of the following statements is incorrect:
A. The sample mean is a biased estimator of the population mean
B. The shape of the sampling distribution will be similar to that of the population distribution
C. The standard error will probably be too high
D. There is a high risk of unreliable statements about population parameters
Answer: A
6. Probability Distribution
40. 6E. Probability Distribution
Question
Maja plans to study the effects of Omega-3 supplements on antisocial behaviour. She develops a
measurement which will be filled by her participants before and after a 2-month long trial during which
subjects will be taking daily omega-3 supplements. However, she has trouble recruiting a high number
of participants.
Given that the sample size is not large enough, which of the following statements is incorrect:
Solution
A. This statement is incorrect. Bias does not depend on the size of the sample. We might have an
inaccurate estimate, but if we are using the right estimator for the population parameter, the estimate is
still unbiased. An estimate will be biased if the estimator is not the appropriate one (e.g., no random
sample)
B. Correct. Since Maja has a small sample size, the sampling distribution has a similar shape to the
population distribution and is not necessarily a normal one.
C. Correct. Based on the C.L.T., the smaller the sample size, the greater the standard error
D. Correct. Larger sample sizes allow more reliable statements about population parameters,
compared to small sample sizes.
41. 6E. Probability Distribution
Estimator
Something that is used in statistics to estimate some fact about a population.
→ The sample mean is an estimator of the population mean.
Bias
Bias = the difference between the expected value of the estimator and the true
value of the parameter
→ The sample mean x̄ of a simple random sample is always unbiased.
Efficiency
The accuracy of the sample mean.
→ The larger the sample size, the smaller the standard error.
→ The smaller the standard error, the more efficient the estimate.
42. Answers
Question
Alexithymia is a personality trait which features inability to describe, identify and experience
emotions. In a population of people with borderline alexithymia, emotional intelligence scores have a
mean of 57 and a standard deviation of 15. The population distribution is skewed to the right. Darian
takes a simple random sample of 32. What is the probability that our sample mean will be between 55
and 60?
A. 74.86%
B. 13.11%
C. 64.42%
D. The probability cannot be calculated because the population distribution is skewed
Answer: C
7. Probability Distribution
43. 7E. Probability Distribution
Question
Alexithymia is a personality trait which features
inability to describe, identify and experience
emotions. In a population of people with
borderline alexithymia, emotional intelligence
scores have a mean of 57 and a standard
deviation of 15. The population distribution is
skewed to the right. Darian takes a simple random
sample of 32. What is the probability that our
sample mean will be between 55 and 60?
µ = 57
σ = 15
n = 32
→ Central Limit Theorem applies (n > 25)
Solution
→ Calculate Z-scores
z1 = (x̄ − µ) / (σ/√n) = (60 − 57) / (15/√32) = 1.13
z2 = (x̄ − µ) / (σ/√n) = (55 − 57) / (15/√32) = −0.75
→ Look up probabilities in z-table
z1 = 1.13 → 87.08%
z2 = −0.75 → 22.66%
→ Calculate the probability that the sample mean is
between 55 and 60:
87.08% − 22.66% = 64.42%
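The same calculation can be sketched by modelling the sampling distribution of the mean as normal with standard error σ/√n, as the Central Limit Theorem step above assumes. Without rounding the z-scores, the result is roughly 64.6% instead of the table-based 64.42%.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 57, 15, 32

# Sampling distribution of the sample mean: centre mu, spread sigma / sqrt(n).
standard_error = sigma / sqrt(n)
sampling_distribution = NormalDist(mu=mu, sigma=standard_error)

# P(55 < sample mean < 60)
p_between = sampling_distribution.cdf(60) - sampling_distribution.cdf(55)

print(round(standard_error, 3), f"{p_between:.2%}")
```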
44. Answers
Question
A certain variable follows a normal population distribution. The population mean is equal to 23.48 and
the standard deviation equal to 4.657. The probability that the sample mean is higher than 24 equals
25.14%.
Calculate the sample size.
A. 49
B. 24
C. 36
D. The sample size cannot be calculated
Answer: C
8. Probability Distribution
45. 8E. Probability Distribution
Question
A certain variable follows a normal population
distribution. The population mean is equal to
23.48 and the standard deviation equal to 4.657.
The probability that the sample mean is higher
than 24 equals 25.14%.
Calculate the sample size.
µ = 23.48
σ = 4.657
P(x̄ > 24) = 25.14%
Solution
→ We need to find the Z-score for which the probability of having a sample mean
higher than 24 equals 25.14%
• Since it is a right-sided probability, we need to subtract it from 1 (the table gives
left-sided probabilities)
• 1 − 0.2514 = 0.7486
• We can find 0.7486 in the table; it corresponds to a z-score of 0.67
→ We can use the Z-formula
z = (x̄ − µ) / (σ/√n)
0.67 = (24 − 23.48) / (4.657/√n) = 0.52 / (4.657/√n)
0.67 = (0.52 × √n) / 4.657
√n = (0.67 × 4.657) / 0.52 = 6
n = 6² = 36
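The algebra above can be checked by solving for n numerically. This sketch uses the inverse normal CDF from the standard library in place of the z-table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 23.48, 4.657
x_bar = 24
right_tail = 0.2514

# z-score whose left-sided probability is 1 - 0.2514 = 0.7486 (about 0.67).
z = NormalDist().inv_cdf(1 - right_tail)

# Rearranging z = (x_bar - mu) / (sigma / sqrt(n)) for sqrt(n).
sqrt_n = z * sigma / (x_bar - mu)
n = round(sqrt_n ** 2)

print(round(z, 2), round(sqrt_n, 2), n)
```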
46. Answers
Question
Eero develops a new brand of cherry soda and he has decided on a specific bottle design. The contents
of soda bottles are normally distributed with a mean of 400 and a standard deviation of 7. There is an
8.38% chance that the average contents of a 4-pack will exceed how many ml?
A. 400.12
B. 404.83
C. 407.31
D. 400.60
Answer: B
9. Probability Distribution
47. 9E. Probability Distribution
Question
Eero develops a new brand of cherry soda and he has decided on a specific bottle design. The contents
of soda bottles are normally distributed with a mean of 400 and a standard deviation of 7. There is an
8.38% chance that the average contents of a 4-pack will exceed how many ml?
Solution
→ We know that the contents of the soda bottles are normally distributed, thus we can use the Z-table
→ P(x̄ > ?) = 8.38% (right-sided probability) → 1 − 0.0838 = 0.9162 → Z = 1.38
Z = (x̄ − µ) / (σ/√n)
1.38 = (x̄ − 400) / (7/√4)
4.83 + 400 = x̄
x̄ = 404.83
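Working backwards from a tail probability to a threshold can be sketched as follows, again using the inverse normal CDF instead of the z-table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 400, 7, 4
right_tail = 0.0838

# z-score with left-sided probability 1 - 0.0838 = 0.9162 (about 1.38).
z = NormalDist().inv_cdf(1 - right_tail)

# Rearranging Z = (x_bar - mu) / (sigma / sqrt(n)) for x_bar.
threshold = mu + z * sigma / sqrt(n)

print(round(z, 2), round(threshold, 2))
```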
48. Answers
Question
Leonie wishes to investigate homelessness experiences in Maastricht. However, there is no list of
homeless people in the city. She decides to use instead a non-random sampling method known as
snowball sampling. Leonie meets one homeless person who participates in her research and also puts
her in contact with other homeless people in the area that they know. Using this method she is able to
gather 178 participants.
Which of the following statements pertaining to the population mean estimator is true?
A. The estimator is unbiased and efficient
B. The estimator is unbiased and not efficient
C. The estimator is biased and efficient
D. The estimator is biased and not efficient
Answer: C
10. Probability Distribution
49. 10E. Probability Distribution
Question
Leonie wishes to investigate homelessness experiences in Maastricht. However, there is no list of
homeless people in the city. She decides to use instead a non-random sampling method known as
snowball sampling. Leonie meets one homeless person who participates in her research and also puts
her in contact with other homeless people in the area that they know. Using this method she is able to
gather 178 participants.
Which of the following statements pertaining to the population mean estimator is true?
Solution
→ Leonie is using a non-random sampling method, meaning that her sample is not random. This can
lead to Leonie using an inappropriate estimator for the population mean, which would make her
estimator biased. "Bias" has nothing to do with the sample size
→ Leonie has a sample size of 178 participants, which is a sufficiently large sample (C.L.T.). Thus, her
estimator for the population mean will indeed be efficient. As the sample size increases, the
standard error decreases
51. Answers
Question
A researcher claims that he was able to develop a drug that enhances human attention. He will test this
hypothesis by recruiting 80 individuals with Attention Deficit Disorder (ADD). He divides evenly his
sample into 2 groups and makes sure that the groups are matched in their attention levels. He
continues by administering the drug only in group 1, keeping group 2 as a control. Finally, all
participants across both groups have to complete an Attention Test, with higher scores indicating
worse attention.
What is the researcher's null and alternative hypothesis?
A. H0: µ1 = µ2, Hα: µ1 ≠ µ2
B. H0: µ1 ≠ µ2, Hα: µ1 < µ2
C. H0: µ1 = µ2, Hα: µ1 > µ2
D. H0: µ1 = µ2, Hα: µ1 < µ2
Answer: D
1. Hypothesis Testing
52. 1E. Hypothesis Testing
Question
A researcher claims that he was able to develop a drug that enhances human attention. He will test this
hypothesis by recruiting 80 individuals with Attention Deficit Disorder (ADD). He divides evenly his
sample into 2 groups and makes sure that the groups are matched in their attention levels. He
continues by administering the drug only in group 1, keeping group 2 as a control. Finally, all
participants across both groups have to complete an Attention Test, with higher scores indicating
worse attention.
What is the researcher's null and alternative hypothesis?
Solution
A. Incorrect. The alternative hypothesis indicates a two-sided test (Hα: µ1 ≠ µ2). The researcher wants to test
the hypothesis that the drug enhances human attention, so we are looking for a one-sided test.
B. Incorrect. The null hypothesis always suggests that there is no significant relationship between our data.
In this case, it is the hypothesis that the drug will not have an effect on the mean of group 1 (H0: µ1 = µ2)
C. Incorrect. The alternative hypothesis states that the mean of group 1 should be higher than that of group
2 after the drug administration. However, higher scores mean worse attention levels. Since the researcher
expects that the drug is beneficial, we should be expecting that group 1 has better attention levels than
group 2, thus lower scores
D. Correct. The alternative hypothesis claims that group 2 will have worse attention relative to group 1, as
seen from their higher test scores
53. Answers
Question
Refer back to the example in question one. The researcher is informed that the population of people
with ADD is skewed to the right. Which of the following statements is correct?
A. The researcher can still test his hypothesis because normality is not a necessary condition
B. The researcher can still test his hypothesis because his sample size is large enough
C. The researcher cannot test his hypothesis because there is no normality in the population
D. The researcher cannot test his hypothesis because his sample size is not large enough
Answer: B
2. Hypothesis Testing
54. 2E. Hypothesis Testing
Question
Refer back to the example in question one. The researcher is informed that the population of people
with ADD is skewed to the right. Which of the following statements is correct?
Solution
A. Incorrect. In order to test our hypothesis, the sampling distribution of the mean has to be
(approximately) normal, so normality does matter
B. Correct. The researcher can indeed do the test because he has a large enough sample size, meaning
that the central limit theorem applies (= the sampling distribution approximates a normal
distribution as the sample size gets larger, regardless of the population distribution)
C. Incorrect. Since the central limit theorem applies, we do not need to worry about the skewed
population distribution
D. Incorrect. The sample size is large enough: each group has 40 participants, and the cut-off for the
central limit theorem to apply is n ≥ 25
55. Answers
Question
Florian believes that a new Artificial Intelligence teaching method can influence student ratings
compared to using human tutors. He is however unsure about what this influence can look like because,
despite the AI's greater efficiency, students might still prefer human interaction during their tutorials.
Florian then takes an SRS of 27 students from a population of students with a mean rating of µ = 30.2 and
a standard deviation of σ = 16. The sample of students take a lesson from the AI system and then give it a
rating with a mean of 24.5. The significance level is 5%.
Can Florian conclude that the mean rating of the AI system is significantly different from the mean of
the normal method?
A. Yes, we reject the null hypothesis with the p-value of 0.0322
B. Yes, we reject the null hypothesis with the p-value of 0.0644
C. No, we cannot reject the null hypothesis with the p-value of 0.0322
D. No, we cannot reject the null hypothesis with the p-value of 0.0644
Answer: D
3. Hypothesis Testing
56. 3E. Hypothesis Testing
Question
Florian believes that a new Artificial Intelligence teaching method can influence student ratings
compared to using human tutors. He is however unsure about what this influence can look like because,
despite the AI's greater efficiency, students might still prefer human interaction during their tutorials.
Florian then takes an SRS of 27 students from a population of students with a mean rating of µ = 30.2
and a standard deviation of σ = 16. The sample of students take a lesson from the AI system and then
give it a rating with a mean of 24.5. The significance level is 5%.
Can Florian conclude that the mean rating of the AI system is significantly different from the mean of
the normal method?
Data
H0: µ1 = µ2
Hα: µ1 ≠ µ2 (2-tailed test)
α = 0.05
µ = 30.2
σ = 16
n = 27
x̄ = 24.5
Solution
➢ The sample size is large enough (n = 27), so we can continue with the test
➢ We can use the Z formula to calculate the Zobs
$$Z_{obs} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{24.5 - 30.2}{16 / \sqrt{27}} = -1.85$$
➢ Using the Z-table we see that a Zobs of -1.85 corresponds to a one-tailed p-value of 0.0322
➢ Since we have a 2-tailed test, we need to double our p-value
p-value × 2
0.0322 × 2 = 0.0644
➢ We can then compare our p-value to the alpha
0.0644 > 0.05
➢ The p-value is larger than α, thus the null hypothesis cannot be rejected
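The Z-test above can be checked with a short Python sketch. It uses only the standard library; the function name `z_test_two_sided` is illustrative and not part of the course material. Because it uses the exact normal distribution instead of a rounded Z-table, the p-value can differ from the table value in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu, sigma, n):
    """Return (z_obs, two-sided p-value) for a one-sample z-test."""
    standard_error = sigma / sqrt(n)
    z_obs = (sample_mean - mu) / standard_error
    # Area in one tail beyond |z_obs|, doubled because the test is two-tailed.
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value


# Values taken from the question: x-bar = 24.5, mu = 30.2, sigma = 16, n = 27.
z_obs, p_value = z_test_two_sided(sample_mean=24.5, mu=30.2, sigma=16, n=27)
alpha = 0.05
reject_h0 = p_value < alpha
print(f"z = {z_obs:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The decision matches the slide: the two-sided p-value is above 0.05, so H0 is not rejected.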
57. Answers
Question
Suppose that for a two-sided test, an experimenter decides to have a significance level of 0.10.
Which of the following statements is incorrect?
A. The Z-critical is going to be equal to ±1.65
B. The probability of a type 1 error is equal to 10%
C. If the null hypothesis is rejected at this level, then it will also be rejected at α = 0.05
D. With the current significance level, there is a lower probability of not rejecting a false null
hypothesis compared to a significance level of 0.05
Answer: C
4. Hypothesis Testing
58. 4E. Hypothesis Testing
Question
Suppose that for a two-sided test, an experimenter decides to have a significance level of 0.10.
Which of the following statements is incorrect?
Solution
A. Correct. In case of a two-sided test with α = 10%, the Z-critical becomes ±1.65
B. Correct. The probability of a type 1 error is always equal to the significance level of the study
• Type 1 error = α = 10%
C. Incorrect. If the null hypothesis is rejected at α = 10%, it does not necessarily mean that it
will be rejected at α = 5%
• E.g., a p-value equal to 0.07 is smaller than 0.10, however it is not smaller than 0.05. Thus, the
H0 would be rejected at α = 10% but not at α = 5%
D. Correct. By increasing the significance level, we make the decision criterion more lenient,
making a type 2 error less likely. However, we simultaneously increase the risk of a false
positive, that is, rejecting a true null hypothesis
[Figure: normal curve with 90% in the centre and 5% in each tail]
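As a rough illustration of how the significance level drives both the critical value and the decision, the following Python sketch computes two-sided critical Z values and compares one p-value against two levels. The helper names and the p-value of 0.07 are hypothetical, chosen only for the example.

```python
from statistics import NormalDist


def z_critical_two_sided(alpha):
    """Positive critical Z for a two-sided test: alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def rejects(p_value, alpha):
    """H0 is rejected when the p-value is smaller than alpha."""
    return p_value < alpha


z_10 = z_critical_two_sided(0.10)  # the slides round this to 1.65
z_05 = z_critical_two_sided(0.05)  # the familiar 1.96

p_example = 0.07  # hypothetical p-value, not taken from the slides
print(f"alpha = 0.10: Zc = ±{z_10:.3f}, reject: {rejects(p_example, 0.10)}")
print(f"alpha = 0.05: Zc = ±{z_05:.3f}, reject: {rejects(p_example, 0.05)}")
```

A result that is significant at 10% is therefore not automatically significant at a stricter level.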
59. Answers
Question
A questionnaire has been constructed to measure the level of psychopathy for incarcerated individuals.
The population is normally distributed with a mean of 44 and a standard deviation of 12. A researcher
wants to check the hypothesis that the population mean is different, so she draws a SRS of 23
individuals. The sample mean is 53.
What are the boundaries of a 90% confidence interval based on this specific sample?
A. [48.87, 57.13]
B. [48.14, 56.90]
C. [43.89, 54.96]
D. [49.63, 52.47]
Answer: A
5. Hypothesis Testing
60. 5E. Hypothesis Testing
Question
A questionnaire has been constructed to measure the level of psychopathy for incarcerated individuals.
The population is normally distributed with a mean of 44 and a standard deviation of 12. A researcher
wants to check the hypothesis that the population mean is different, so she draws a SRS of 23
individuals. The sample mean is 53.
What are the boundaries of a 90% confidence interval based on this specific sample?
Solution
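H0: µ = 44
Hα: µ ≠ 44
µ = 44
σ = 12
n = 23
x̄ = 53
Zc = 1.65 (because it is a 90% CI)
$$\bar{x}_{obs} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$$
$$53 \pm 1.65 \times \frac{12}{\sqrt{23}}$$
$$53 - 1.65 \times \frac{12}{\sqrt{23}} = 53 - 1.65 \times 2.5 = 48.87$$
$$53 + 1.65 \times \frac{12}{\sqrt{23}} = 53 + 1.65 \times 2.5 = 57.13$$
[48.87, 57.13]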
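The interval can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_confidence_interval` is an assumption made for the example, and the critical value 1.65 is the rounded table value used on the slide.

```python
from math import sqrt


def z_confidence_interval(sample_mean, sigma, n, z_critical):
    """Return (lower, upper) for sample_mean ± z_critical * sigma / sqrt(n)."""
    margin = z_critical * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the question: x-bar = 53, sigma = 12, n = 23, 90% CI.
lower, upper = z_confidence_interval(sample_mean=53, sigma=12, n=23, z_critical=1.65)
print(f"[{lower:.2f}, {upper:.2f}]")
```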
61. Answers
Question
Suppose we have a 95% Confidence Interval [37.2, 42.5].
Calculate the sample mean and the standard error
A. x̄ = 40.05, SE = 3.39
B. x̄ = 38.74, SE = 4.63
C. x̄ = 39.85, SE = 1.35
D. x̄ = 41.40, SE = 2.22
Answer: C
6. Hypothesis Testing
62. 6E. Hypothesis Testing
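Question
Suppose we have a 95% Confidence Interval [37.2, 42.5].
Calculate the sample mean and the standard error.
α = 5%
Zc = 1.96
CI [37.2, 42.5]
$$\bar{x} \pm Z_c \times \frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad \bar{x} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}$$
Sample Mean
➢ The lower boundary gives:
$$37.2 = \bar{x} - 1.96 \times \frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad 1.96 \times \frac{\sigma}{\sqrt{n}} = \bar{x} - 37.2$$
➢ Substitute this into the upper boundary:
$$42.5 = \bar{x} + \left(1.96 \times \frac{\sigma}{\sqrt{n}}\right)$$
$$42.5 = \bar{x} + \bar{x} - 37.2$$
$$2\bar{x} = 42.5 + 37.2$$
$$2\bar{x} = 79.7$$
$$\bar{x} = \frac{79.7}{2} = 39.85$$
Standard Error
➢ Confidence interval:
$$\bar{x}_{obs} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$$
➢ From the previous calculations we can see that:
$$1.96 \times \frac{\sigma}{\sqrt{n}} = \bar{x} - 37.2$$
➢ We already found the sample mean, so we can use it to calculate the fraction:
$$1.96 \times \frac{\sigma}{\sqrt{n}} = 39.85 - 37.2$$
$$\frac{\sigma}{\sqrt{n}} = \frac{2.65}{1.96}$$
$$\frac{\sigma}{\sqrt{n}} = 1.35$$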
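The same reasoning can be written as a small Python sketch: the sample mean is the midpoint of the interval, and the standard error is the half-width divided by the critical value. The function name `mean_and_se_from_ci` is illustrative only.

```python
def mean_and_se_from_ci(lower, upper, z_critical):
    """Recover the sample mean and standard error from a symmetric CI."""
    sample_mean = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2        # half-width = z_critical * SE
    standard_error = margin / z_critical
    return sample_mean, standard_error


# 95% CI [37.2, 42.5] with Zc = 1.96, as in the question.
sample_mean, standard_error = mean_and_se_from_ci(37.2, 42.5, z_critical=1.96)
print(f"mean = {sample_mean:.2f}, SE = {standard_error:.2f}")
```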
63. Answers
Question
Going back to the example of the previous question, what can be said about the null hypothesis, given
that the population mean is equal to 36.05?
A. The null hypothesis is accepted
B. The null hypothesis is rejected
C. The null hypothesis cannot be rejected
D. Nothing can be said about the null hypothesis with the current data
Answer: B
7. Hypothesis Testing
64. 7E. Hypothesis Testing
Question
Going back to the example of the previous question, what can be said about the null hypothesis, given
that the population mean is equal to 36.05?
Solution
A. Incorrect. When doing a hypothesis test, we can either reject the null hypothesis or fail to reject
the null hypothesis, but we can never accept the null hypothesis. We cannot conclude that the null
hypothesis is true merely because we did not find evidence to reject it
B. Correct. We can see that for our 2-tailed test, the population mean (36.05) is not included within the
range of the 95% CI [37.2, 42.5], so the null hypothesis is rejected
C. Incorrect. Since the population mean is not included in the confidence interval, the null hypothesis
is rejected
D. Incorrect. Statement B is correct.
65. 7E. Hypothesis Testing
Confidence
Interval
➢ A confidence interval is an interval estimate of µ.
➢ It gives the range of values within which the population mean probably falls
$$\bar{X} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$$
Interpretation
Example: 95% Confidence Interval
➢ If we drew samples repeatedly and built a confidence interval from each, 95% of those intervals
would contain the population mean µ
Hypothesis
Testing
➢ We can use the confidence interval to see if the null hypothesis is rejected or
not for a two-tailed test
➢ If the population mean from the null hypothesis is located inside the interval,
then the null hypothesis cannot be rejected because the specific value is a
possible population mean
➢ If the population mean from the null hypothesis is not located inside the
interval, the null hypothesis is rejected
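The decision rule for a two-tailed test can be sketched in Python as follows. The function name `ci_test_decision` is hypothetical; the numbers come from the preceding question (95% CI [37.2, 42.5], hypothesised mean 36.05).

```python
def ci_test_decision(hypothesised_mean, lower, upper):
    """Two-tailed decision: reject H0 only if the hypothesised mean lies outside the CI."""
    if lower <= hypothesised_mean <= upper:
        return "cannot reject H0"
    return "reject H0"


decision = ci_test_decision(36.05, lower=37.2, upper=42.5)
print(decision)
```

Note that the function never returns "accept H0": a value inside the interval is only one of many plausible population means.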
66. Answers
Question
Tobias investigates the effects of participative leadership on satisfaction levels among employees. The
sample mean is equal to 73.8. The boundaries of the 95% confidence interval are [71.4, 76.5].
Calculate the margin of error and the standard error.
A. ME = 5.7, SE = 1.22
B. ME = 2.4, SE = 1.22
C. ME = 2.9, SE = 3.91
D. ME = 2.4, SE = 4.75
Answer: B
8. Hypothesis Testing
67. 8E. Hypothesis Testing
Question
Tobias investigates the effects of participative leadership on satisfaction levels among employees. The
sample mean is equal to 73.8. The boundaries of the 95% confidence interval are [71.4, 76.5].
Calculate the margin of error and the standard error.
Solution
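x̄ = 73.8
95% C.I → [71.4, 76.5]
Zcritical = 1.96
Margin of error:
$$\bar{x} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$$
$$\bar{x} - Z_c \times \frac{\sigma}{\sqrt{n}} = 71.4$$
$$Z_c \times \frac{\sigma}{\sqrt{n}} = \bar{x} - 71.4 = 73.8 - 71.4$$
$$Z_c \times \frac{\sigma}{\sqrt{n}} = 2.4$$
Standard error:
$$Z_c \times \frac{\sigma}{\sqrt{n}} = 2.4$$
$$\frac{\sigma}{\sqrt{n}} = \frac{2.4}{Z_c} = \frac{2.4}{1.96} = 1.22$$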
68. Answers
Question
Kian is the HR manager for Success Formula. He noticed that the employees have lately been under more
stress than usual, so he decides to evaluate their stress levels using a measurement scale (fewer points =
less stress). On average, the 26 employees had a stress score of 83 with a standard deviation of 17. Kian
then decided to implement a mindfulness program with the goal of reducing stress scores by 8 points.
The significance level is 5%.
What is the power of the test, given that the mindfulness program works as Kian was expecting?
A. 0.7734
B. 0.2266
C. 0.6066
D. 0.7123
Answer: A
9. Hypothesis Testing
69. 9E. Hypothesis Testing
Question
Kian is the HR manager for Success Formula. He noticed that the employees have lately been under more
stress than usual, so he decides to evaluate their stress levels using a measurement scale (fewer points =
less stress). On average, the 26 employees had a stress score of 83 with a standard deviation of 17. Kian
then decided to implement a mindfulness program with the goal of reducing stress scores by 8 points.
The significance level is 5%.
What is the power of the test, given that the mindfulness program works as Kian was expecting?
H0: µ = 83
Hα: µ < 83
Zc = -1.65
α = 0.05
n = 26
σ = 17
µ = 83
µ (new) = 75
Answer
➢ Find the critical value
$$Z_c = \frac{\bar{X}_c - \mu}{\sigma / \sqrt{n}}$$
$$-1.65 = \frac{\bar{X}_c - 83}{17 / \sqrt{26}}$$
$$-5.49 = \bar{X}_c - 83 \quad\Rightarrow\quad \bar{X}_c = 77.51$$
➢ Solve for Z
$$Z = \frac{\bar{X}_c - \mu(new)}{\sigma / \sqrt{n}}$$
$$Z = \frac{77.51 - 75}{17 / \sqrt{26}} = 0.75$$
➢ Find the β
• The test is left-sided, so H0 is rejected when the sample mean falls below 77.51. Using the Z-table,
the area below Z = 0.75 is 0.7734. This is the probability of rejecting H0 when µ = 75
• β is the remaining area above Z = 0.75, where H0 is not rejected: β = 1 - 0.7734 = 0.2266
➢ To calculate the power we use the formula:
Power = 1 - β
Power = 1 - 0.2266 = 0.7734
70. 9E. Hypothesis Testing
Type II
Error
➢ Definition: We fail to reject a false null hypothesis
➢ Measured by β
➢ Calculation:
• Find the critical value where H0 would be rejected.
• $Z_c = \frac{\bar{X}_c - \mu_0}{\sigma / \sqrt{n}}$ → solve for $\bar{X}_c$
• $Z = \frac{\bar{X}_c - \mu_a}{\sigma / \sqrt{n}}$ → solve for Z, then look up the probability of the non-rejection region (this is β)
Power
➢ Definition: The probability that we are able to reject a false null hypothesis
➢ Calculation:
• Power = 1 - β
Illustration
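The calculation steps for β and power can be sketched in Python for a left-sided Z-test, using the numbers of the mindfulness question (µ0 = 83, true mean 75, σ = 17, n = 26, α = 0.05). The function name is an assumption for this example. In a left-sided test H0 is rejected when the sample mean falls below the critical mean, so the power is the area below that critical mean under the alternative distribution, and β is the area above it. Exact normal quantiles are used instead of the rounded table value -1.65, so results differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def power_left_sided_z(mu_null, mu_true, sigma, n, alpha):
    """Return (critical_mean, beta, power) for a left-sided one-sample z-test."""
    standard_error = sigma / sqrt(n)
    z_critical = NormalDist().inv_cdf(alpha)          # negative, e.g. about -1.645
    critical_mean = mu_null + z_critical * standard_error
    # Position of the critical mean under the alternative distribution.
    z_under_alternative = (critical_mean - mu_true) / standard_error
    power = NormalDist().cdf(z_under_alternative)     # P(reject H0 | mu = mu_true)
    beta = 1 - power                                  # P(fail to reject | mu = mu_true)
    return critical_mean, beta, power


critical_mean, beta, power = power_left_sided_z(
    mu_null=83, mu_true=75, sigma=17, n=26, alpha=0.05
)
print(f"critical mean = {critical_mean:.2f}, beta = {beta:.4f}, power = {power:.4f}")
```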
71. Answers
Question
Suppose Micheal is conducting an experiment on fear conditioning. He uses a sample of 65 participants
and a significance level of 5%. Before he begins, he wants to make sure that the probability of rejecting
a true null hypothesis is as small as possible.
Which of the following statements is correct?
A. He should increase his sample size
B. He should increase the effect size
C. He should increase the significance level
D. None of the above
Answer: D
10. Hypothesis Testing
72. 10E. Hypothesis Testing
Question
Suppose Micheal is conducting an experiment on fear conditioning. He uses a sample of 65 participants
and a significance level of 5%. Before he begins, he wants to make sure that the probability of rejecting
a true null hypothesis is as small as possible.
Which of the following statements is correct?
Solution
A. Incorrect. By increasing the sample size, we decrease the standard error and thus the probability of
not rejecting a false null hypothesis (Type II error). The probability of a Type I error stays equal to α
B. Incorrect. Increasing the effect size is difficult in real life since researchers do not have any control
over it. Theoretically, the higher the effect size, the lower the probability of failing to reject a false null
hypothesis (Type II error)
C. Incorrect. By increasing the significance level, it becomes easier to reject a null hypothesis. We
increase the probability of rejecting a true H0 (Type I error)
D. Correct. None of the above alternatives is correct. Rejecting a true null hypothesis is the Type I error and its
probability is measured by α. We can reduce the probability by reducing α, but this increases the
probability of a Type II error (not recommended)
74. Answers
Question
A randomly drawn sample of 60 university students undergo exam training. Before the training, their
mean score on a practice exam was 68. After the training, their mean score improved by 7 points. What
(t-)test would you employ to check if the exam training had a significant effect?
A. One-sample t-test
B. Paired samples t-test
C. Independent samples t-test
D. Two-sample t-test
Answer: B
1. T-tests
75. 1E. T-tests
Question
A randomly drawn sample of 60 university students undergo exam training. Before the training, their
mean score on a practice exam was 68. After the training, their mean score improved by 7 points. What
(t-)test would you employ to check if the exam training had a significant effect?
Solution
A. Incorrect, we compare two dependent samples, not one sample against a population value.
B. Correct, the groups are paired since we test the same sample twice (before and after exam training).
C. Incorrect, the two groups are not independent, they are dependent.
D. Incorrect, a two-sample t-test is an independent samples t-test. The groups were dependent, not
independent.
76. Answers
Question
When testing a null hypothesis about a single population mean, a t-test is usually performed rather
than a z-test. A t-test is more likely to be employed because…
A. A t-test has more power than a z-test, leading to a more reliable result.
B. Quantitative variables can only be analysed with t-tests.
C. Z-tests are more prone to type I errors, which are to be avoided.
D. In practice, the standard deviation of a population is rarely known.
Answer: D
2. T-tests
77. 2E. T-tests
T-tests
When to use a t-test?
When we can't use z-scores because σ (population standard deviation) is unknown
• We have to estimate both parameters.
• We use an extra estimate (Sx)
• T-distribution is more dispersed relative to the z-distribution
• T-test is always less powerful than a z-test
Z-tests
A z-test measures how many standard errors our sample mean (x̄) lies from the hypothesized
value of the population mean (µ).
• Makes use of the z-distribution
• More powerful than a t-test
• Most times cannot be used, since in reality we do not know much about the
parameters of the population
78. Answers
Question
A researcher is interested in the effect of wearing red lipstick on the score at minigolf. They ask 40
people to wear red lipstick while playing 18 holes on the minigolf court. 70 people played the same 18
holes without wearing red lipstick. The dependent variable is the obtained score after the 18 holes (a
lower score is considered to be better). The red lipstick condition had a mean score of 47.5 and a
standard deviation of 4.3. The no-red lipstick condition had a mean score of 62 and a standard
deviation of 9.2.
Which test should the researcher use to test the null hypothesis that the score at minigolf is not
affected by wearing red lipstick?
A. An independent samples t-test, assuming unequal population variances.
B. An independent samples t-test, assuming equal population variances.
C. A paired samples t-test.
D. A one-sample t-test.
Answer: A
3. T-tests
79. 3E. T-tests
Question
A researcher is interested in the effect of wearing red lipstick on the score at minigolf. They ask 40
people to wear red lipstick while playing 18 holes on the minigolf court. 70 people played the same 18
holes without wearing red lipstick. The dependent variable is the obtained score after the 18 holes (a
lower score is considered to be better). The red lipstick condition had a mean score of 47.5 and a
standard deviation of 4.3. The no-red lipstick condition had a mean score of 62 and a standard
deviation of 9.2.
Which test should the researcher use to test the null hypothesis that the score at minigolf is not
affected by wearing red lipstick?
Solution
A. Correct. The 2 groups are independent, and we compare their samples. The goal of the test is to
check if the 2 samples come from populations with equal means. We see that the rule of thumb
(Smallest SD × 2 > Bigger SD) does not hold (4.3 × 2 = 8.6 < 9.2) and the groups don't have equal
sample sizes. This means we have to do the t-test without assuming equal variances
B. Incorrect. We cannot assume equal variances because the rule of thumb is violated and the group
sizes are not equal
C. Incorrect. A paired samples t-test requires matched groups or a within-subject design.
D. Incorrect. A one-sample t-test is used when we have 1 population and want to check if its mean is
equal to a specific value.
80. 3E. T-tests
| Assumption | T-Test Concerned | How to Determine | What if Violated |
|---|---|---|---|
| Normality | All T-Tests | 1. Histogram of sample scores looks normal. 2. Sample size is large (Central Limit Theorem) | Can't do T-test |
| Quantitative | All T-Tests | Dependent variable is quantitative | Can't do T-test |
| Dependent Groups | Paired T-Test | The groups are matched | Two-Sample T-test |
| Independent Groups | Two-Sample T-Test | Two separate groups are measured | Paired T-test |
| Equal Variance | Two-Sample T-Test | 1. One sample SD is not 2x bigger than the other (Rule of Thumb). 2. Levene's Test is not significant. 3. The sample sizes are equal | Two-Sample T-test not assuming equal variance has to be used (less powerful) |
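The rule of thumb for the equal variances assumption is easy to express in code. The sketch below is only an illustration (the function name is made up for this example) and applies the rule to the standard deviation pairs that appear in the T-test questions of this section.

```python
def equal_variances_plausible(sd_a, sd_b):
    """Rule of thumb: the smaller SD times 2 should exceed the bigger SD."""
    smaller, bigger = sorted((sd_a, sd_b))
    return smaller * 2 > bigger


# SD pairs taken from the questions in this section.
examples = {
    "minigolf (4.3 vs 9.2)": (4.3, 9.2),
    "Ritalin (1.9463 vs 2.2824)": (1.9463, 2.2824),
    "memory recall (2.61 vs 3.38)": (2.61, 3.38),
    "ice cream (4.2 vs 1.9)": (4.2, 1.9),
}
for label, (sd_a, sd_b) in examples.items():
    print(label, "->", equal_variances_plausible(sd_a, sd_b))
```

The rule is a quick screening device; Levene's test, where available, is the formal check.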
81. Answers
Question
The effect of Ritalin on test performance is tested. 31 participants received a Ritalin pill while another
31 participants received a placebo. The test performance is assumed to be good if the score on the test
is high. The null hypothesis is that exam performance is the same both under Ritalin and placebo, while
the alternative hypothesis is that Ritalin leads to better test performance. The table below presents the
group statistics, computed by SPSS (equal variances assumed).
What statement is incorrect?
A. The means of the two populations are very similar. However, a visual inspection of the group
statistics is not enough to reject the null hypothesis.
B. The equal variances assumption is violated, thus we should not interpret the test
C. The equal variances assumption is not violated, thus we can interpret the test
D. During the t-test, we should compute the weighted average of the two standard deviations
Answer: B
4. T-tests
| Test score, condition | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| placebo | 31 | 10.1182 | 1.9463 | .1699 |
| Ritalin | 31 | 10.9374 | 2.2824 | .4099 |
82. 4E. T-tests
Question
The effect of Ritalin on test performance is tested. 31 participants received a Ritalin pill while another
31 participants received a placebo. The test performance is assumed to be good if the score on the test
is high. The null hypothesis is that exam performance is the same both under Ritalin and placebo, while
the alternative hypothesis is that Ritalin leads to better test performance. The table below presents the
group statistics, computed by SPSS.
What statement is incorrect?
Solution
A. Correct. Sample means are random variables, meaning they change depending on the sample. Thus,
in order to be able to make conclusions about the populations, we need to test whether the
difference between the means is indeed significant.
B. Incorrect. The equal variances assumption is not violated. We can check this using the rule of
thumb (biggest SD < smallest SD × 2)
C. Correct. Using the rule of thumb, we can see that the smallest SD multiplied by 2
(1.9463 × 2 = 3.8926) is bigger than the bigger SD (Ritalin group, 2.2824), thus the assumption is not violated
D. Correct. Since the equal variances assumption is not violated, the 2 standard deviations estimate
the same population standard deviation. By computing their weighted average (pooled SD), we have
the best estimate of σ
| Test score, condition | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| placebo | 31 | 10.1182 | 1.9463 | .1699 |
| Ritalin | 31 | 10.9374 | 2.2824 | .4099 |
83. 4E. T-test
Checking
Equal Variances
Assumption
We can use 2 ways to check for the assumption
1. Rule of Thumb
- Smaller SD × 2 should be larger than the Bigger SD
2. Levene's Test
- If the test is significant, the variances are unequal (H0: σ1² = σ2²)
Violation of
Assumption
If this assumption is violated, we can continue with the t-test if the sample sizes
of both samples are approximately equal
Special case
If there is a violation AND the samples differ in size, we can do the t-test
but only with the following formula:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
If H0: µ1 = µ2 → (µ1 - µ2) = 0
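As a sketch of the special-case formula, the Python function below computes the t statistic without pooling the variances. The function name is illustrative. The example values are those of the minigolf question earlier in this section (red lipstick: mean 47.5, SD 4.3, n = 40; no lipstick: mean 62, SD 9.2, n = 70); the slides do not work this statistic out, so no reference value from the deck is available.

```python
from math import sqrt


def t_unequal_variances(mean1, mean2, sd1, sd2, n1, n2, null_difference=0.0):
    """t statistic for two independent samples, equal variances not assumed."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - null_difference) / standard_error


# Under H0: mu1 = mu2 the null difference is 0.
t_obs = t_unequal_variances(47.5, 62, 4.3, 9.2, n1=40, n2=70)
print(f"t = {t_obs:.2f}")
```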
84. Answers
Question
Natalia is a memory researcher and as part of her pilot study, she wishes to test the differences in
memory recall between severe anxiety patients and controls. She suspects that anxiety patients will
have different memory recall scores compared to controls. After a memory test, she compares the
scores of the groups. The anxiety group has a mean of 12.6 and a standard deviation of 3.38. The
control group has a mean of 13.4 and a standard deviation of 2.61. There are 70 participants in total,
equally divided into the 2 groups.
What can Natalia conclude about the null hypothesis?
A. The null hypothesis is not rejected with 0.10 ≤ p-value ≤ 0.15
B. The null hypothesis is rejected with 0.01 ≤ p-value ≤ 0.05
C. The null hypothesis is not rejected with 0.20 ≤ p-value ≤ 0.30
D. The null hypothesis is rejected with 0.02 ≤ p-value ≤ 0.025
Answer: C
5. T-tests
85. 5E. T-tests
Question
Natalia is a memory researcher and as part of her
pilot study, she wishes to test the differences in
memory recall between severe anxiety patients
and controls. She suspects that anxiety patients
will have different memory recall scores
compared to controls. After a memory test, she
compares the scores of the groups. The control
group has a mean of 13.4 and a standard
deviation of 2.61. The anxiety group has a mean
of 12.6 and a standard deviation of 3.38. There
are 70 participants in total, equally divided into
the 2 groups.
What can Natalia conclude about the null hypothesis?
H0: µ1 = µ2
Hα: µ1 ≠ µ2
n1 = n2 = 35
x̄1 = 13.4
x̄2 = 12.6
s1 = 2.61
s2 = 3.38
Solution
➢ Check the equal variances assumption with the rule of thumb
Bigger SD < Smallest SD × 2
3.38 < 2.61 × 2
3.38 < 5.22 (True)
→ Equal variances assumed
➢ Since equal variances are assumed, we need to calculate the pooled standard deviation
$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}}$$
$$s_p = \sqrt{\frac{34 \times 2.61^2 + 34 \times 3.38^2}{34 + 34}} = 3.02$$
➢ Next, we need to calculate the Tobs
$$T = \frac{\bar{x}_1 - \bar{x}_2}{s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$T = \frac{13.4 - 12.6}{3.02 \times \sqrt{\frac{1}{35} + \frac{1}{35}}}$$
$$T = \frac{0.8}{3.02 \times 0.24} = 1.11$$
➢ Using the t-table we see that the one-tailed p-value is between 0.10 and 0.15. For a 2-tailed
test, we need to double these values
0.20 ≤ p-value ≤ 0.30
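The pooled standard deviation and the observed t can be verified with a short Python sketch. The function names are illustrative assumptions; the inputs are the values from the question. The p-value range still has to be read from a t-table, since the Python standard library has no t-distribution.

```python
from math import sqrt


def pooled_sd(sd1, sd2, n1, n2):
    """Weighted average of the two sample variances, returned as an SD."""
    numerator = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    return sqrt(numerator / ((n1 - 1) + (n2 - 1)))


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """t statistic for two independent samples, equal variances assumed."""
    sp = pooled_sd(sd1, sd2, n1, n2)
    return (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))


sp = pooled_sd(2.61, 3.38, 35, 35)
t_obs = pooled_t(13.4, 12.6, 2.61, 3.38, 35, 35)
df = 35 + 35 - 2
print(f"sp = {sp:.2f}, t = {t_obs:.2f}, df = {df}")
```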
86. Answers
Question
An ice cream company has two new potential flavours ready for the market. They developed a tastiness
scale scored from 0 to 30. 40 volunteers tasted flavour A and another 25 volunteers tasted flavour B.
The obtained values are: x̄A = 22.8, x̄B = 28, sA = 4.2 and sB = 1.9.
What is the 95% Confidence Interval corresponding to this t-test?
A. [-6.52, -3.88]
B. [-6.34, -4.59]
C. [-6.50, -4.0]
D. [-7.29, -3.91]
Answer: A
6. T-tests
87. 6E. T-tests
Question
An ice cream company has two new potential
flavours ready for the market. They developed a
tastiness scale scored from 0 to 30. 40 volunteers
tasted flavour A and another 25 volunteers tasted
flavour B. The obtained values are: x̄A = 22.8,
x̄B = 28, sA = 4.2 and sB = 1.9.
What is the 95% Confidence Interval
corresponding to this t-test?
nA = 40
nB = 25
x̄A = 22.8
x̄B = 28
sA = 4.2
sB = 1.9
Solution
➢ We are dealing with 2 independent groups, thus we should have an independent samples t-test
➢ We have to decide if the assumption of equal variances is violated, in order to use the
correct formulas
Smallest SD × 2 > Bigger SD
1.9 × 2 > 4.2
3.8 > 4.2 (not true)
➢ The equal variances assumption is violated, thus we use the special case of the t-test
$$(\bar{x}_1 - \bar{x}_2) \pm t \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$(22.8 - 28) \pm 1.711 \times \sqrt{\frac{4.2^2}{40} + \frac{1.9^2}{25}}$$
$$-5.2 \pm 1.711 \times \sqrt{0.5854}$$
$$-5.2 \pm 1.711 \times 0.77$$
$$-5.2 \pm 1.32$$
[-6.52, -3.88]
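The interval calculation can be sketched in Python as below. The function name is an assumption for this example, and the critical value 1.711 is simply the one used on the slide, passed in as a parameter. The slide rounds the standard error to 0.77 before multiplying, so the unrounded result differs from the slide in the second decimal.

```python
from math import sqrt


def ci_unequal_variances(mean1, mean2, sd1, sd2, n1, n2, t_critical):
    """CI for the difference of two means, equal variances not assumed."""
    difference = mean1 - mean2
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = t_critical * standard_error
    return difference - margin, difference + margin


# Flavour A vs flavour B, with the critical value quoted on the slide.
lower, upper = ci_unequal_variances(22.8, 28, 4.2, 1.9, 40, 25, t_critical=1.711)
print(f"[{lower:.2f}, {upper:.2f}]")
```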
88. 6E. T-tests
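Confidence Interval: General Formula
$$\text{Observed } \bar{X} \pm t_c \times \text{Standard Error}$$
Example: Two-Sample T-Test
N = 20 (both conditions), x̄A = -2.1, x̄B = -3.5, sA = 2.05 and sB = 1.89. What is the 95% CI?
$$(\bar{x}_A - \bar{x}_B) \pm t_c \times \left(s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$$
$$s_p = \sqrt{\frac{19 \times 2.05^2 + 19 \times 1.89^2}{19 + 19}} = 1.97$$
$$1.4 \pm 2.04 \times \left(1.97 \times \sqrt{\frac{1}{20} + \frac{1}{20}}\right) = [0.13; 2.67]$$
Standard Errors of The Different T-Tests
One-Sample T-test
$$\frac{s}{\sqrt{n}}$$
Paired Sample T-test
$$\frac{s_d}{\sqrt{n}}$$
Two-Sample T-test
$$s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Pooled Standard Deviation:
$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}}$$
Two-Sample T-test
Equal variance not assumed
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$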
89. Answers
Question
Suppose we are testing the null hypothesis that the population mean is equal to a specific value and the
test is right sided. Refer to the SPSS output.
Which of the following statements is correct?
A. The null hypothesis is rejected for a significance level of 2.5%
B. The null hypothesis is not rejected for a significance level of 5%
C. The degrees of freedom were found by taking the smallest sample size and subtracting 1
D. None of the alternatives is correct
Answer: A
7. T-tests
Test Value = 570
| | t | df | Sig. (2-tailed) | Mean Difference | 95% Confidence Interval Lower | 95% Confidence Interval Upper |
|---|---|---|---|---|---|---|
| Test score | 2.139 | 29 | 0.041 | 20.333 | 0.89 | 39.77 |
90. 7E. T-tests
Question
Suppose we are testing the null hypothesis that the population mean is equal to a specific value and the
test is right sided. Refer to the SPSS output.
Which of the following statements is correct?
Solution
A. Correct. The SPSS output gives the p-value for a two-sided test. However, we have a one-tailed test
(a right-sided test means that the alternative hypothesis has the (>) symbol). Since the observed t is
positive, i.e. in the direction of the alternative hypothesis, we need to divide
the p-value by two (0.041/2 = 0.0205). We can now see that the corrected p-value is smaller than
0.025, thus the H0 is rejected at α = 2.5%
B. Incorrect. The corrected p-value is smaller than 0.05 as well. Thus, the H0 is also rejected at
α = 5%.
C. Incorrect. Since we have a one-sample t-test, the formula for the degrees of freedom is N-1. It is for
an independent samples t-test not assuming equal variances that we take the smallest n and
subtract 1 for the df
D. Incorrect. A is the correct one
Test Value = 570
| | t | df | Sig. (2-tailed) | Mean Difference | 95% Confidence Interval Lower | 95% Confidence Interval Upper |
|---|---|---|---|---|---|---|
| Test score | 2.139 | 29 | 0.041 | 20.333 | 0.89 | 39.77 |
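Converting the two-tailed significance reported by SPSS into a one-tailed p-value can be sketched as follows. The helper `one_sided_p` is hypothetical. It halves the two-tailed value only when the observed t points in the direction of the alternative hypothesis; otherwise the one-tailed p-value is one minus that half.

```python
def one_sided_p(p_two_sided, t_obs, alternative):
    """One-tailed p-value from a two-tailed one; alternative is 'greater' or 'less'."""
    half = p_two_sided / 2
    if alternative == "greater":
        in_predicted_direction = t_obs > 0
    elif alternative == "less":
        in_predicted_direction = t_obs < 0
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return half if in_predicted_direction else 1 - half


# Right-sided test with the values from the SPSS output: t = 2.139, Sig. = 0.041.
p_one = one_sided_p(0.041, t_obs=2.139, alternative="greater")
print(f"one-sided p = {p_one:.4f}")
print("reject at 2.5%:", p_one < 0.025)
print("reject at 5%:", p_one < 0.05)
```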
91. Answers
Question
A researcher wants to test whether ethnic background influences IQ scores of Dutch primary school
children. They draw a sample of 50 children with grandparents of Turkish origin and another 50
children with Dutch grandparents. Each child of Turkish descent is matched for age and sex with a Dutch
one. The group data is summarized in the table below.
A paired sample t-test was used to test this hypothesis. Which of the following tests could have yielded
the same result?
| Group | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Turkish | 98.657 | 50 | 10.0023 | 1.6523 |
| Dutch | 103.203 | 50 | 14.5602 | 2.2436 |
A. An independent t-test, assuming equal population variances.
B. An independent t-test, assuming unequal population variances.
C. A one-sample t-test, conducted for the difference in IQ score between matched children.
D. None of the answers above.
Answer: C
8. T-tests
92. 8E. T-tests
Question
A researcher wants to test whether ethnic background influences IQ scores of Dutch primary school
children. They draw a sample of 50 children with grandparents of Turkish origin and another 50
children with Dutch grandparents. Each child of Turkish descent is matched for age and sex with a Dutch
one. The group data is summarized in the table below.
A paired sample t-test was used to test this hypothesis. Which of the following tests could have yielded
the same result?
Solution
A. Incorrect, the two groups are matched, so they are dependent, not independent.
B. Incorrect, the two groups are matched, so they are dependent, not independent.
C. Correct, a paired samples t-test checks whether the mean of the differences between matched
scores is zero. The 2 tests have the same calculations, thus if one finds the
differences within each pair and then performs a one-sample t-test on those differences, they would
get the same result.
D. Incorrect. Answer is C
| Group | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Turkish | 98.657 | 50 | 10.0023 | 1.6523 |
| Dutch | 103.203 | 50 | 14.5602 | 2.2436 |
93. Answers
Question
Inspect the given output.
Which answer is correct?
A. Levene's Test is not significant, therefore equal variances can be assumed.
B. The Tobs is equal to -2.845
C. According to the t-table, the null hypothesis is rejected
D. All answers are correct.
Answer: D
9. T-tests
[SPSS output not reproduced]
94. 9E. T-tests
Question
Inspect the given output.
Which answer is correct?
Solution
A. Correct. Levene's Test has the null hypothesis that the population variances are equal (σ1² = σ2²).
Since we can see that the p-value is a lot larger than 0.05 (p-value = 0.582), we can say that the null
hypothesis is not rejected and that there is no violation of the equal variances assumption
B. Correct. We can calculate the Tobs by dividing the Mean difference (x̄1 - x̄2 = -14.00) by the Std.
Error difference ($s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 4.92$). This will give us -2.845
C. Correct. The null hypothesis in this case is rejected because the value 0 is not located in the 95% CI,
meaning that 0 is not a plausible value for the population difference between the 2 groups
D. Correct. Statements A, B and C are all correct.
[SPSS output not reproduced]
95. Answers
Question
Florian is the GM of Success Formula and has recently heard that colour can influence learning
performances and outcomes. He was informed that research has shown that the colour blue leads to
better performances in tests and better recall. The classes at SF however are painted in white. Florian
decides to test if indeed the colour blue leads to better results compared to white. He gathers 38
students and assigns them to 2 groups. The groups are matched together in regards to skill, age,
motivation and more. One group takes the class in a room painted white, while the second group in a
room painted blue. The test score means afterwards are compared. The population distribution of
difference scores is normal.
Florian gets the following SPSS output. Which statement is correct?
10. T-tests
Paired Differences
| | Mean | Std. Deviation | Std. Error Mean | 95% CI Lower | 95% CI Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1: White - Blue | -.579 | 2.524 | .579 | -1.795 | .637 | -1.000 | 18 | .331 |
96. Answers
Question
Florian is the GM of Success Formula and has recently heard that colour can influence learning
performances and outcomes. He was informed that research has shown that the colour blue leads to
better performances in tests and better recall. The classes at SF however are painted in white. Florian
decides to test if indeed the colour blue leads to better results compared to white. He gathers 38
students and assigns them to 2 groups. The groups are matched together in regards to skill, age,
motivation and more. One group takes the class in a room painted white, while the second group in a
room painted blue. The test score means afterwards are compared. The population distribution of
difference scores is normal.
Florian gets the following SPSS output. Which statement is correct?
95
A. There is a probability of 0.331 that the H0 is true
B. The researcher might be making a Type I error
C. The researcher might be making a Type II error.
D. Since the TOBS is not located within the 95% CI, the null hypothesis can be rejected
Answer: C
10. T-tests
97. 10E. T-tests
Question
Florian is the GM of Success Formula and has recently heard that colour can influence learning
performances and outcomes. He was informed that research has shown that the colour blue leads to
better performances in tests and better recall. The classes at SF however are painted in white. Florian
decides to test if indeed the colour blue leads to better results compared to white. He gathers 38
students and assigns them to 2 groups. The groups are matched together in regards to skill, age,
motivation and more. One group takes the class in a room painted white, while the second group in a
room painted blue. The test score means afterwards are compared. The population distribution of
difference scores is normal.
Florian gets the following SPSS output (next slide). Which statement is correct?
96
Solution
A. Incorrect. The p-value is 0.331 and it is defined as the probability that our data (or more extreme
data) would have occurred, given that the null hypothesis is true. The p-value does not give the
probability that H0 is true. It is the conditional probability with the condition that H0 is true
B. Incorrect. A Type 1 error is defined as rejecting a true null hypothesis. However, our p-value is larger
than 0.05, thus we did not reject the null hypothesis in the first place. The probability that we are
making a Type 1 error in this case is 0%.
C. Correct. A Type 2 error is defined as not rejecting a false null hypothesis. Since the p-value is larger
than our significance level, we did not reject H0, but there is always the chance that H0 is in fact
false and we made an error.
D. Incorrect. When using the CI to see whether H0 is rejected for a paired samples t-test, we need
to check whether the value 0 is located in the interval, not the Tobs. This is because the null
hypothesis states that there is no difference.
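The entries of the Paired Differences output are linked to each other, which is a quick way to check a reading of the table. The sketch below recomputes the standard error, the t-value and the 95% CI from the mean and standard deviation of the difference scores. It assumes 19 matched pairs (38 students, df = 18) and takes the critical value 2.101 from a standard t-table; the function name `paired_t_summary` is only illustrative.

```python
import math

# Values read from the Paired Differences output (White - Blue)
mean_diff = -0.579   # mean of the difference scores
sd_diff = 2.524      # standard deviation of the difference scores
n_pairs = 19         # assumption: 38 students matched into 19 pairs (df = 18)

# Critical t-value for df = 18 and 95% confidence, from a standard t-table
T_CRIT_DF18 = 2.101


def paired_t_summary(mean_diff, sd_diff, n_pairs, t_crit):
    """Return (standard error, t statistic, CI lower bound, CI upper bound)."""
    se = sd_diff / math.sqrt(n_pairs)
    t_obs = mean_diff / se
    margin = t_crit * se
    return se, t_obs, mean_diff - margin, mean_diff + margin


se, t_obs, lower, upper = paired_t_summary(mean_diff, sd_diff, n_pairs, T_CRIT_DF18)
print(f"SE = {se:.3f}, t = {t_obs:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")

# The decision rule from answer D: H0 is rejected only if 0 lies outside the CI
zero_inside_ci = lower <= 0 <= upper
print("0 inside CI:", zero_inside_ci)
```

Small deviations in the third decimal compared with the SPSS output come from rounding of the inputs.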
98. 10E. T-tests
Type I Error
The null hypothesis is true but we reject it.
→ Measured by α
Graphical Illustration
Type II Error
The null hypothesis is false but we fail to reject it.
→ Measured by β
100. Answers
Question
ANOVA assumes the following statistical model: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, in which $Y_{ij}$ denotes the score
of person j in group i.
Choose the incorrect statement from below:
99
A. $\mu_1 = Y_{ij} - \varepsilon_{ij}$ represents the mean of group 1
B. $\varepsilon_{ij}$ has a different value for each individual participant, regardless of treatment effects.
C. $\mu$ is a variable effect, specific to each participant.
D. If there is no treatment effect, $\alpha_i$ is equal among all participants.
Answer: C
1. ANOVA
101. 1E. ANOVA
Question
ANOVA assumes the following statistical model: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, in which $Y_{ij}$ denotes the score
of person j in group i.
Choose the incorrect statement from below:
100
Solution
A. Correct. The difference between an individual score and the group mean indicates the
unexplained variation caused by factors that are not controlled. It can be written as
$\varepsilon_{ij} = Y_{ij} - \mu - \alpha_i = Y_{ij} - \mu_i$, which rearranges to $\mu_i = Y_{ij} - \varepsilon_{ij}$.
B. Correct. Individual differences are uncontrollable factors that cause the scores of participants
within the same group to diverge. For each participant, regardless of the treatment effects, the
individual differences/residual factors are different.
C. Incorrect. $\mu$ is a constant effect. It refers to the factors that are the same in all conditions. It stays
the same for each subject.
D. Correct. If there is no treatment effect, $\alpha_i = 0$ for every group (and therefore for all participants).
102. 1E. ANOVA
Main Formula
$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
$Y_{ij}$: Participant j in group i
$\mu$: Constant effect
$\alpha_i$: Effect of group i
$\varepsilon_{ij}$: Effect of remaining factors of participant j in group i (error)
Sum of Squares
$\sum (Y_{ij} - \bar{Y})^2 = \sum n_i (\bar{Y}_i - \bar{Y})^2 + \sum (Y_{ij} - \bar{Y}_i)^2$
Total sum of squares (TSS) = Between group sum of squares (SSG) + Within group sum of squares (SSE)
103. 1E. ANOVA
Example
3 different conditions with 3 participants each
G1 G2 G3
P1 1 3 5
P2 2 4 6
P3 3 5 7
Preparation
What is the mean of each group?
$\bar{Y}_1 = (1+2+3)/3 = 2$
$\bar{Y}_2 = (3+4+5)/3 = 4$
$\bar{Y}_3 = (5+6+7)/3 = 6$
What is the total mean?
$\bar{Y} = (2+4+6)/3 = 4$
SSG (Between Groups)
$SSG = \sum_i n_i (\bar{Y}_i - \bar{Y})^2$
SSG = 3*(2-4)² + 3*(4-4)² + 3*(6-4)²
SSG = 24
Tip: Alternative notation of $\alpha_i = \mu_i - \mu$
Here $\mu_i = \bar{Y}_i$ (mean of single group) and $\mu = \bar{Y}$ (total mean).
SSW (Within Groups)
$SSW = \sum (Y_{ij} - \bar{Y}_i)^2$
SSW = (1-2)² + (2-2)² + (3-2)² + (3-4)² + … + (7-6)²
SSW = 6
Tip: Alternative notation of $\varepsilon_{ij} = Y_{ij} - \mu_i$
Here $\mu_i$ is the same as $\bar{Y}_i$. Both describe the mean of a single group.
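The sum-of-squares decomposition in this example can be checked with a few lines of Python. This is a minimal sketch that uses the scores from the data table (G1 = 1, 2, 3; G2 = 3, 4, 5; G3 = 5, 6, 7); the function name `sums_of_squares` is illustrative.

```python
# Scores per condition, taken from the example data table
groups = {"G1": [1, 2, 3], "G2": [3, 4, 5], "G3": [5, 6, 7]}


def sums_of_squares(groups):
    """Return (SSG, SSW, TSS) for a dict mapping group name -> list of scores."""
    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    ssg = 0.0  # between-group sum of squares
    ssw = 0.0  # within-group sum of squares
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        ssg += len(scores) * (group_mean - grand_mean) ** 2
        ssw += sum((y - group_mean) ** 2 for y in scores)

    # Total sum of squares, computed directly from the grand mean
    tss = sum((y - grand_mean) ** 2 for y in all_scores)
    return ssg, ssw, tss


ssg, ssw, tss = sums_of_squares(groups)
print(f"SSG = {ssg}, SSW = {ssw}, TSS = {tss}")
```

Computing TSS directly and comparing it with SSG + SSW is a useful sanity check on any hand calculation.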
104. Answers
Question
Participants were asked to memorise a list of words. They were divided into several groups, each using a
different memorization technique. 60 minutes later, the experimenter assessed how many words they
could still remember (the dependent variable RECALL in the output). Which statement is correct?
103
A. The experimental setting had 3 conditions.
B. The total variance equals 4.91
C. The ANOVA test is significant (α = 5%).
D. All answers are correct.
Answer: D
2. ANOVA
Sum of Squares Mean Square
Between Groups 41.566 20.783
Within Groups 41.850 2.790
Total 83.416
105. 2E. ANOVA
104
Question Solution
A. Correct. The degrees of freedom between groups are given by the formula $k - 1$.
→ Degrees of freedom for "between groups" are equal to "number of groups minus 1" (k-1). In our
case we had 3 conditions, so df = (3-1) = 2
B. Correct. The total variance can be found by the formula
$MS_{total} = \frac{SST}{N-1} = \frac{83.416}{17} = 4.91$
C. Correct. The ANOVA SPSS output has a p-value of 0.006 for F = 7.447. The p-value is
smaller than the significance level of 5%, thus the test is significant.
D. Yes, they are all correct.
Participants were asked to memorise a list of
words. They were divided into several groups,
each using a different memorization technique.
60 minutes later, the experimenter assessed how
many words they could still remember (the
dependent variable RECALL in the output).
Which statement is correct?
Sum of Squares Mean Square
Between Groups 41.566 20.783
Within Groups 41.850 2.790
Total 83.416
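The numbers in this solution can be reproduced from the sums of squares alone. The sketch below is an illustration under two assumptions taken from the solution text: df between = 2 (three conditions) and df total = N - 1 = 17 (the divisor used for the total variance).

```python
# Sums of squares as shown in the output
ssg, sse, sst = 41.566, 41.850, 83.416

# Assumptions taken from the solution text
df_between = 2                      # k - 1 with k = 3 conditions
df_total = 17                       # N - 1, the divisor used for the total variance
df_within = df_total - df_between   # N - k

msg = ssg / df_between              # mean square between groups
mse = sse / df_within               # mean square within groups (error)
f_obs = msg / mse                   # observed F-value
total_variance = sst / df_total     # total variance (answer B)

print(f"MSG = {msg:.3f}, MSE = {mse:.3f}, F = {f_obs:.3f}")
print(f"Total variance = {total_variance:.2f}")
```

The recomputed F differs from the reported 7.447 only in the third decimal, which is consistent with rounding in the printed mean squares.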
106. Answers
Question
A sample of n= 35 participants was randomly selected from UM students pool. A baseline assessment
rated their arachnophobia. After undergoing 2 sessions of exposure therapy (to spiders), their
arachnophobia was measured again with the same scale. The researcher wants to see if the 2 sessions of
exposure therapy had a significant effect.
Should an ANOVA test be performed on this data set?
105
A. Yes, the normality assumption holds since the sample size is big enough.
B. Yes, the equal variances assumption is met because 35 participants were tested both times.
C. No, the independence assumption is violated.
D. Yes, the data is quantitative as their phobia is rated on a scale.
Answer: C
3. ANOVA
107. 3E. ANOVA
Answers
A sample of n= 35 participants was randomly selected from UM students pool. A baseline assessment
rated their arachnophobia. After undergoing 2 sessions of exposure therapy (to spiders), their
arachnophobia was measured again with the same scale. The researcher wants to see if the 2 sessions of
exposure therapy had a significant effect.
Should an ANOVA test be performed on this data set?
106
Solution
A. The statement about normality holds, but the main criterion for an ANOVA (independent groups) is
violated. Thus, an ANOVA is not the suitable test here.
B. Incorrect. The same sample is tested twice (baseline and after exposure). We are not comparing
independent groups.
C. Correct. The same sample is tested twice (baseline and after exposure). We are not comparing
independent groups.
D. The statement about the data holds, but the main criterion for an ANOVA (independent groups) is
violated. Thus, an ANOVA is not the suitable test here.
108. Answers
Question
An experiment on the effect of listening to music on information retention is performed. A total sample
of 75 is divided into three equally large groups. All three groups are asked to memorize a list of words
while either (a) listening to Vivaldi, (b) listening to AC/DC, or (c) listening to crickets singing.
An analysis of variance is performed. It is concluded that the null hypothesis cannot be rejected.
What statement is correct?
107
A. MSG and MSE are both unbiased estimators of the error variance.
B. Since the null hypothesis is true, then the difference between groups is as large as difference within
groups.
C. There is no group effect.
D. All are correct
Answer: D
4. ANOVA
109. 4E. ANOVA
Question
An experiment on the effect of listening to music on information retention is performed. A total sample
of 75 is divided into three equally large groups. All three groups are asked to memorize a list of words
while either (a) listening to Vivaldi, (b) listening to AC/DC, or (c) listening to crickets singing.
An analysis of variance is performed. It is concluded that the null hypothesis cannot be rejected.
What statement is correct?
108
Solution
A. Correct. When H0 is true (here it was not rejected), any difference between groups is caused by
uncontrolled factors (error). This means that MSG is an unbiased estimator of the error variance.
MSE is an unbiased estimator of the error variance in any case.
B. Correct. The difference between groups is measured by MSG while the difference within groups is
measured by MSE. In the case of a true null hypothesis, both MSE and MSG are unbiased estimators
of the error variance, thus MSE and MSG are approximately equal.
C. Correct. The H0 for ANOVA states that the means of all groups are equal, meaning that there is no
treatment effect.
D. Correct
110. 4E. ANOVA
109
Unbiased Estimator
• MSE is an unbiased estimator of error variance.
Pooled Variance
• Since we already have the assumption that all populations have equal variance,
we can take the (weighted) average of the estimates.
$s_p^2 = \frac{(n_1 - 1) \times s_1^2 + (n_2 - 1) \times s_2^2 + \dots + (n_k - 1) \times s_k^2}{(n_1 - 1) + (n_2 - 1) + \dots + (n_k - 1)}$
Conclusion
• $MSE = s_p^2$
• Accurate and efficient error estimate.
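That MSE equals the pooled variance can be illustrated numerically. The sketch below uses small made-up data (three groups of three scores) and the standard library's `statistics.variance` for the sample variance of each group; the function names are illustrative.

```python
from statistics import mean, variance

# Illustrative data: three groups with three scores each
groups = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]


def pooled_variance(groups):
    """Weighted average of the group sample variances, weights n_i - 1."""
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return numerator / denominator


def mean_square_error(groups):
    """MSE = SSE / (N - k), with SSE the within-group sum of squares."""
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    n_total = sum(len(g) for g in groups)
    return sse / (n_total - len(groups))


sp2 = pooled_variance(groups)
mse = mean_square_error(groups)
print(f"pooled variance = {sp2}, MSE = {mse}")
```

Both routes divide the same within-group sum of squares by N - k, which is why the two numbers coincide.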
111. 4E. ANOVA
Random Variables
MSG and MSE count as random variables.
MSE and MSG as Estimators of Error Variance
If there is no group effect ($H_0$ true), MSE as well as MSG count as unbiased estimators of the error
variance.
Relation of MSE and MSG
MSE is the error (or noise)
MSG is the error + the effect of the group.
If $H_0$ is true and there is no effect of the group,
MSE and MSG will be approximately equal.
Another way to phrase this would be: the
difference between groups is as large as the
difference within groups.
110
112. Answers
Question
Synesthesia is a perceptual phenomenon in which there is an experience of 2 sensory/cognitive
pathways. Synesthesia has been linked to enhanced memory skills due to the increased associations
available. Anton wonders if there is a difference in memory recall between different synesthesia types.
He gathers 120 participants and within his sample, there are 4 different synesthesia types. Each group
has an equal number of participants. After a memorization period, Anton gives his participants a
memory test. Following an ANOVA, SSG = 167.91 and SSE = 1760.88
What can be concluded?
111
A. H0 not rejected with p-value > 0.05
B. H0 rejected with 0.025 ≤ p-value ≤ 0.05
C. H0 rejected with 0.01 ≤ p-value ≤ 0.025
D. H0 not rejected because Fobs < Fcritical
Answer: C
5. ANOVA
113. 5E. ANOVA
Question
Synesthesia is a perceptual phenomenon in
which there is an experience of 2
sensory/cognitive pathways. Synesthesia has
been linked to enhanced memory skills due to
the increased associations available. Anton wonders if
there is a difference in memory recall between
different synesthesia types. He gathers 120
participants and within his sample, there are 4
different synesthesia types. Each group has an
equal number of participants. After a
memorization period, Anton gives his
participants a memory test. Following an
ANOVA, SSG = 167.91 and SSE = 1760.88
What can be concluded?
112
Solution
→ Calculate the degrees of freedom
$df(G) = k - 1 = 4 - 1 = 3$
$df(E) = N - k = 120 - 4 = 116$
→ Calculate the Mean Squares
$MS(G) = \frac{SSG}{df(G)} = \frac{167.91}{3} = 55.97$
$MS(E) = \frac{SSE}{df(E)} = \frac{1760.88}{116} = 15.18$
→ Calculate the F-value
$F = \frac{MS(G)}{MS(E)} = \frac{55.97}{15.18} = 3.687$
→ By taking a look at the F-table we see that for α = 0.05, $F_c(3, 116) = 2.70$, which means the
null hypothesis is rejected: $F_{obs} > F_c$
→ On the next pages we see that for α = 0.01, $F_c = 3.98$. Since $F_{obs} < 3.98$, the p-value is larger
than 0.01:
$0.01 \le p\text{-value} \le 0.025$
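The calculation steps above can be sketched as a small helper. It only needs SSG, SSE, the number of groups and the total sample size. The critical values are the ones quoted in the solution (read from an F-table), not computed here; the function name `one_way_f` is illustrative.

```python
def one_way_f(ssg, sse, k, n_total):
    """Return (df_G, df_E, MSG, MSE, F) for a one-way ANOVA."""
    df_g = k - 1
    df_e = n_total - k
    msg = ssg / df_g
    mse = sse / df_e
    return df_g, df_e, msg, mse, msg / mse


df_g, df_e, msg, mse, f_obs = one_way_f(167.91, 1760.88, k=4, n_total=120)
print(f"df = ({df_g}, {df_e}), MSG = {msg:.2f}, MSE = {mse:.2f}, F = {f_obs:.3f}")

# Critical values as quoted in the solution (from an F-table)
F_CRIT_05 = 2.70
F_CRIT_01 = 3.98
reject_at_05 = f_obs > F_CRIT_05
reject_at_01 = f_obs > F_CRIT_01
print("reject at 5%:", reject_at_05, "| reject at 1%:", reject_at_01)
```

Comparing the observed F against several tabled critical values is how the p-value is bracketed when no software output is available.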
114. Answers
Question
Based on the ANOVA output, which of the following statements are correct?
113
A. The scores on the dependent variable likely vary due to residual effects only.
B. The scores on the dependent variable likely vary due to residual effects and group effect.
C. The scores on the dependent variable likely vary due to group effect only
D. The scores on the dependent variable likely do not vary due to residual effects nor due
to the group effect.
Answer: B
6. ANOVA
Sum of Squares df Mean Square F Sig
Between Groups 126 1 126 4.4843 ?
Within Groups 1630 58 28.1034
Total 1756 59
115. 6E. ANOVA
Question
Based on the ANOVA output, which of the following statements are correct?
114
Solution
→ Using the F-table, we can see that for α = 0.05, $F_c(1, 58) = 4.03$
→ The Fobs is bigger than the Fc, meaning that the null hypothesis is rejected
→ There is an overall treatment effect, thus not all group means are the same
→ However, error cannot be controlled for, so it is always there
Scores likely vary due to treatment/group effect AND error/residual factors
Sum of Squares df Mean Square F Sig
Between Groups 126 1 126 4.4843 ?
Within Groups 1630 58 28.1034
Total 1756 59
116. Answers
Question
Maja conducted a study with 5 conditions and 30 participants in total have been recruited.
Choose the correct statement:
115
A. F = 21.801, not significant
B. F = 17.474, not significant
C. F = 19.625, significant
D. F = 18.926, significant
Answer: D
7. ANOVA
Sum of Squares df Mean Square F Sig
Between Groups ? ? ? ? ?
Within Groups 2244.500 ? ?
Total 9041.367 ?
117. 7E. ANOVA
Question
Maja conducted a study with 5 conditions and 30
participants in total have been recruited.
Choose the correct statement:
116
Solution
1) Calculate the SS(G):
$SST = SSG + SSE$
$SSG = SST - SSE$
$SSG = 9041.367 - 2244.5 = 6796.867$
2) Calculate degrees of freedom:
$df(G) = k - 1 = 5 - 1 = 4$
$df(E) = N - k = 30 - 5 = 25$
$df(T) = N - 1 = 30 - 1 = 29$
3) Calculate Mean squares:
$MS(G) = \frac{SSG}{df(G)} = \frac{6796.867}{4} = 1699.217$
$MS(E) = \frac{SSE}{df(E)} = \frac{2244.5}{25} = 89.780$
4) Calculate F-value:
$F = \frac{MSG}{MSE} = \frac{1699.217}{89.780} = 18.926$
5) Use the F-table to reach your decision:
$F_c(4, 25) = 2.76 \rightarrow F_{obs} > F_c \rightarrow$ significant
Sum of Squares df Mean Square F Sig
Between Groups ? ? ? ? ?
Within Groups 2244.500 ? ?
Total 9041.367 ?
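Filling in an incomplete ANOVA table always follows the same five steps, so it lends itself to a short helper. The sketch below completes the table from SST, SSE, the number of conditions and the sample size; the function name `complete_anova_table` and the dictionary keys are illustrative, and the critical value 2.76 is the one quoted from the F-table in the solution.

```python
def complete_anova_table(sst, sse, k, n_total):
    """Fill in the missing cells of a one-way ANOVA table."""
    ssg = sst - sse                  # step 1: SSG = SST - SSE
    df_g = k - 1                     # step 2: degrees of freedom
    df_e = n_total - k
    df_t = n_total - 1
    msg = ssg / df_g                 # step 3: mean squares
    mse = sse / df_e
    return {
        "SSG": ssg, "SSE": sse, "SST": sst,
        "df_G": df_g, "df_E": df_e, "df_T": df_t,
        "MSG": msg, "MSE": mse,
        "F": msg / mse,              # step 4: F-value
    }


table = complete_anova_table(sst=9041.367, sse=2244.5, k=5, n_total=30)
for name, value in table.items():
    print(f"{name:>5}: {value:.3f}")

F_CRIT_4_25 = 2.76                   # step 5: critical value quoted from the F-table
significant = table["F"] > F_CRIT_4_25
print("significant:", significant)
```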
118. Answers
Question
Micheal is a sports enthusiast. He wants to investigate which form of exercise leads to better
concentration. He recruits 75 participants and assigns them randomly to 3 groups (cardio, weights,
crossfit). He later measures their concentration levels and compares the means of the groups.
Given that Micheal ended up rejecting the null hypothesis, which of the following is correct?
117
A. There is no difference in concentration levels between groups
B. Micheal can confidently say that cardio is better than weights
C. Micheal needs an extra statistical analysis
D. There is no treatment effect
Answer: C
8. ANOVA
119. 8E. ANOVA
Question
Micheal is a sports enthusiast. He wants to investigate which form of exercise leads to better
concentration. He recruits 75 participants and assigns them randomly to 3 groups (cardio, weights,
crossfit). He later measures their concentration levels and compares the means of the groups.
Given that Micheal ended up rejecting the null hypothesis, which of the following is correct?
118
Solution
A. Incorrect. The null hypothesis states that all group means are the same (no treatment effect). By
rejecting the null hypothesis we can confidently say that not all group means are the same.
B. Incorrect. By rejecting the null hypothesis, we know that not all group means are the same, however
we do not know where the difference is exactly (i.e., between which groups).
C. Correct. If we want to uncover the exact nature of the group difference, we need to conduct
multiple comparisons.
D. Incorrect. The null hypothesis was rejected, thus there is a treatment effect.
120. Answers
Question
Micheal did conduct multiple comparisons to examine the differences between groups. What can be
concluded based on the SPSS output?
119
9. ANOVA
Dependent Variable: Concentration scores
LSD
(I) Group | (J) Group | Mean Difference | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound
Cardio | Weights | 0.1762 | 0.5102 | 0.730 | -.8338 | 1.1861
Cardio | Crossfit | 1.4606 | 0.5470 | 0.009 | .3778 | 2.5435
Weights | Cardio | -.1762 | 0.5102 | 0.730 | -1.1861 | .8338
Weights | Crossfit | 1.2844 | 0.5696 | 0.026 | .1569 | 2.4119
Crossfit | Cardio | -1.4606 | 0.5470 | 0.009 | -2.5435 | -.3778
Crossfit | Weights | -1.2844 | 0.5696 | 0.026 | -2.4119 | -.1569
121. Answers
Question
Micheal did conduct multiple comparisons to examine the differences between groups. What can be
concluded based on the SPSS output?
120
A. There are 2 statistically significant comparisons
B. There is 1 statistically significant comparison
C. All three comparisons are statistically significant
D. None of the comparisons reaches significance
Answer: B
9. ANOVA
122. 9E. ANOVA
Question
Micheal did conduct multiple comparisons to
examine the differences between groups. What
can be concluded based on the SPSS output?
121
Family-wise Type 1 error
In multiple comparisons, the α-values of the individual
comparisons add up. Hence, the chance of
making at least one Type I Error increases
Solution
→ While the output does show 2 comparisons
that reach significance (cardio-crossfit,
weights-crossfit), no Bonferroni correction
has been applied for the family-wise Type 1
error.
→ By applying the Bonferroni correction
(multiply p-value by number of comparisons, here 3),
we see that only the comparison between
cardio and crossfit remains significant
(0.009 × 3 = 0.027 < 0.05, whereas 0.026 × 3 = 0.078 > 0.05)
Bonferroni Correction
1. Multiply p-value by number of comparisons
Or
2. Divide significance level by number of
comparisons
Number of comparisons: k(k-1)/2
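The Bonferroni correction can be applied mechanically to the Sig. column of the LSD output. The following sketch uses the three unique pairwise p-values from that output and the "multiply the p-value" variant; the function name `bonferroni` is illustrative.

```python
# Unique pairwise comparisons and their unadjusted p-values (Sig. column)
p_values = {
    ("Cardio", "Weights"): 0.730,
    ("Cardio", "Crossfit"): 0.009,
    ("Weights", "Crossfit"): 0.026,
}

k = 3                                  # number of groups
n_comparisons = k * (k - 1) // 2       # k(k-1)/2 pairwise comparisons


def bonferroni(p_values, alpha=0.05):
    """Return {pair: (adjusted p-value, significant?)} using p * m, capped at 1."""
    m = len(p_values)
    return {
        pair: (min(p * m, 1.0), p * m < alpha)
        for pair, p in p_values.items()
    }


adjusted = bonferroni(p_values)
for pair, (p_adj, sig) in adjusted.items():
    print(f"{pair[0]} vs {pair[1]}: adjusted p = {p_adj:.3f}, significant = {sig}")

significant_pairs = [pair for pair, (_, sig) in adjusted.items() if sig]
```

Dividing α by the number of comparisons (0.05 / 3) and comparing the unadjusted p-values against that threshold gives the same decisions.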
123. Answers
Question
Given that the groups have equal sample sizes and the following output, which statement is correct?
122
A. The normality assumption was violated, so the test should not have been done
B. An independent samples t-test could be done instead of ANOVA
C. MSE is smaller than MSG, hence the treatment effects are significant
D. If the test is significant, multiple comparisons are the necessary next step
Answer: B
10. ANOVA
Sum of Squares df Mean Square F Sig
Between Groups 126 1 126 4.4843 ?
Within Groups 1630 58 28.1034
Total 1756 59
124. 10E. ANOVA
Question
Given that the groups have equal sample sizes, which statement is correct, given the following output?
123
Solution
A. Incorrect. We can see that our sample size is 60 (N-1 = 59 → N = 60). Given that each group has 30
participants, the CLT can be applied, thus the test is robust against a normality violation.
B. Correct. Since we have just 2 groups, an independent samples t-test would be equivalent to this
ANOVA.
C. Incorrect. It might be that MSE is smaller than MSG, so that F is bigger than 1, but we always have to
rely on the p-value, which tells us whether the result is actually significant.
D. Incorrect. Since we only have two groups, if the test is significant, we can immediately tell between
which groups there is a difference, thus it is not necessary to conduct multiple comparisons.
However, if we want to see what the difference looks like, we can continue with them.
Sum of Squares df Mean Square F Sig
Between Groups 126 1 126 4.4843 ?
Within Groups 1630 58 28.1034
Total 1756 59
126. Answers
Question
Florian is the new general manager at Success Formula, replacing Michalina. Success Formula offers
courses in Psychology, Business Economics and Law. During the time Michalina was GM, 60% of the
student population at SF attended Business Economics courses, 25% Psychology courses and 15% Law
courses. After an intense marketing campaign, Florian believes that this year, things will be different.
In a simple random sample of 275 students, 145 of them chose B/E courses, 75 chose psychology and
55 chose law. Based on the data, Florian wants to test whether the population distribution of field
choice has changed or is the same as during Michalina's reign as GM.
Does the result from the sample give sufficient evidence?
125
A. No, the null hypothesis is not rejected with the observed value of the test statistic equal to 1.23
B. Yes, the null hypothesis is rejected with the observed value of the test statistic equal to 7.57
C. No, the null hypothesis is not rejected with the observed value of the test statistic equal to 2.50
D. Yes, the null hypothesis is rejected with the observed value of the test statistic equal to 9.93
Answer: B
1. Proportions and Entire Distributions
127. 1E. Proportions and Entire Distributions
Question
Florian is the new general manager at Success Formula, replacing Michalina. Success Formula offers
courses in Psychology, Business Economics and Law. During the time Michalina was GM, 60% of the
student population at SF attended Business Economics courses, 25% Psychology courses and 15% Law
courses. After an intense marketing campaign, Florian believes that this year, things will be different.
In a simple random sample of 275 students, 145 of them chose B/E courses, 75 chose psychology and
55 chose law. Based on the data, Florian wants to test whether the population distribution of field
choice has changed or is the same as during Michalina's reign as GM.
Does the result from the sample give sufficient evidence?
126
Solution
→ We see that we have only 1 variable (count of students) which has more than 2 levels (3)
→ We want to see how well the sample distribution fits a specific model
→ We have to use the χ² Goodness of Fit Test
128. 1E. Proportions and Entire Distributions
Data
Model: BE (60%) - Psy (25%) - Law (15%)
N = 275
Observed counts (students): Business/Economics 145, Psychology 75, Law 55
H0: Distribution within sample fits the model
Ha: Distribution within sample does not fit the model
Solution
→ Calculate Expected Counts [$E_i = N \times p_i$]
• B/E: 275 x 0.6 = 165
• Psy: 275 x 0.25 = 68.75
• Law: 275 x 0.15 = 41.25
→ Calculate the chi-square
$\chi^2 = \sum \frac{(OC - EC)^2}{EC}$
$\chi^2 = \frac{(145 - 165)^2}{165} + \frac{(75 - 68.75)^2}{68.75} + \frac{(55 - 41.25)^2}{41.25}$
$\chi^2 = 2.42 + 0.57 + 4.58$
$\chi^2 = 7.57$
→ Check the χ² table for the p-value (df = 3 - 1 = 2)
$0.02 \le p\text{-value} \le 0.025$
We see that the p-value is lower than 0.05, thus the H0 that the distribution within the
sample fits the model is rejected.
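The goodness-of-fit calculation can be reproduced in a few lines. This sketch computes the expected counts and the chi-square statistic for the course data; the critical value 5.99 (α = 0.05, df = 2) is taken from a standard chi-square table, and the function name `chi_square_gof` is illustrative.

```python
# Observed counts in the sample and the model proportions under H0
observed = {"Business/Economics": 145, "Psychology": 75, "Law": 55}
model = {"Business/Economics": 0.60, "Psychology": 0.25, "Law": 0.15}


def chi_square_gof(observed, model):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    n = sum(observed.values())
    expected = {category: n * p for category, p in model.items()}
    chi2 = sum(
        (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
    )
    return expected, chi2, len(observed) - 1


expected, chi2, df = chi_square_gof(observed, model)
print("expected counts:", expected)
print(f"chi-square = {chi2:.3f}, df = {df}")

CHI2_CRIT_05_DF2 = 5.99   # tabled critical value for alpha = 0.05, df = 2
reject_h0 = chi2 > CHI2_CRIT_05_DF2
print("reject H0:", reject_h0)
```

Summing the unrounded contributions gives a value that can differ from the hand calculation in the last decimal.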
129. 1E. Proportions and Entire Distributions
When to Use
Data type: categorical data
→ Check how well a proposed proportion
distribution fits with an observed one.
$H_0$: The distribution within the sample fits our expectation
$H_a$: The distribution within the sample does not fit our expectation
Formula
$\chi^2 = \sum \frac{(Obs - Exp)^2}{Exp}$
Assumptions
• Categorical Data
• Expected Counts > 5
EC = N*p(e)
Degrees of Freedom
df = # of cells - 1
Example: Nationality of class
Dutch 0.2
German 0.5
Belgian 0.2
French 0.1
df = 4-1 = 3
128
130. Answers
Question
Andreia has been researching the effectiveness of dialectical behavior therapy (DBT), a type of
cognitive behavioural therapy, for the development of healthy ways to cope with stress and emotion
regulation. She wonders whether DBT has different efficiency levels for different types of populations.
She decides to take two samples, one of people exhibiting eating disorders and one of people with
substance use disorders. After several sessions, Andreia and her team, note for each subject if there was
improvement or not. Andreia is the first researcher to conduct such a study, so she does not know how
the different disorders can have an effect on improvement.
What can be concluded?
129
2. Proportions and Entire Distributions
Disorder | Improvement: Yes | Improvement: No | Total
Eating Disorders | 148 | 112 | 260
Substance use Disorders | 173 | 102 | 275
Total | 321 | 214 | 535
131. Answers
Question
Andreia has been researching the effectiveness of dialectical behavior therapy (DBT), a type of
cognitive behavioural therapy, for the development of healthy ways to cope with stress and emotion
regulation. She wonders whether DBT has different efficiency levels for different types of populations.
She decides to take two samples, one of people exhibiting eating disorders and one of people with
substance use disorders. After several sessions, Andreia and her team, note for each subject if there was
improvement or not. Andreia is the first researcher to conduct such a study, so she does not know how
the different disorders can have an effect on improvement.
What can be concluded?
130
A. The null hypothesis is not rejected with the observed value of the test statistic equal to 0.98
B. The null hypothesis is rejected with the observed value of the test statistic equal to 1.36
C. The null hypothesis is not rejected with the observed value of the test statistic equal to -1.36
D. The null hypothesis is rejected with the observed value of the test statistic equal to -2.71
Answer: C
2. Proportions and Entire Distributions
132. 2E. Proportions and Entire Distributions
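Data
→ We now compare 2 independent samples
→ The dependent variable is dichotomous
→ We have to use a 2 proportion z-test
$H_0: p_1 = p_2$
$H_a: p_1 \neq p_2$
$\hat{p}_1 = \frac{x_1}{n_1} = \frac{148}{260} = 0.57$
$\hat{p}_2 = \frac{x_2}{n_2} = \frac{173}{275} = 0.63$
$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{148 + 173}{260 + 275} = 0.6$
Solution
→ Calculate the Z
$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})} \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
$Z = \frac{0.57 - 0.63}{\sqrt{0.6(1 - 0.6)} \times \sqrt{\frac{1}{260} + \frac{1}{275}}}$
$Z = \frac{-0.06}{0.49 \times 0.09} = -1.36$
(The intermediate values are rounded here. Carrying full precision gives $Z \approx -1.41$, which leads
to the same conclusion.)
→ Look at the Z-table for the p-value
P-value(z = -1.36) = 0.0869
→ Double the p-value since it is a two-tailed test
2 x 0.0869 = 0.1738 > 0.05
The null hypothesis cannot be rejected.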
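The two-proportion z-test can also be run directly on the counts, which avoids rounding the intermediate values. The sketch below is an illustration only: it uses the pooled proportion for the standard error and derives the two-sided p-value from the standard normal CDF via `math.erf`. Because nothing is rounded along the way, the z-value differs somewhat from the hand calculation, while the decision stays the same. The function name `two_proportion_z` is illustrative.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Return (z statistic, two-sided p-value) for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Lower-tail probability of -|z| under the standard normal distribution
    tail = 0.5 * (1 + math.erf(-abs(z) / math.sqrt(2)))
    return z, 2 * tail


# Improvement counts: eating disorders vs substance use disorders
z, p_two_sided = two_proportion_z(148, 260, 173, 275)
print(f"z = {z:.3f}, two-sided p-value = {p_two_sided:.4f}")

reject_h0 = p_two_sided < 0.05
print("reject H0:", reject_h0)
```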
133. 2E. Proportions and Entire Distributions
When to Use
Comparing the proportion of two groups
(categorical data).
$H_0: p_1 = p_2$
$H_a: p_1 \neq p_2$ (two-sided)
$H_a: p_1 < p_2$ or $H_a: p_1 > p_2$ (one-sided)
Assumptions:
• Categorical variables
→ dichotomous
• Independent groups
• Normality
- always violated
- Central Limit Theorem
Formulas and Application
Z-score: $Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE}$
Estimate: $\hat{p}_1 - \hat{p}_2$
SE (for z-test): $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_1} + \frac{\hat{p}(1 - \hat{p})}{n_2}}$
Confidence Interval
$\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
132
134. Answers
Question
Refer back to the previous question. What is the 95% confidence interval?
133
A. [0.063, 0.015]
B. [-0.143, 0.023]
C. [-0.053, 0.090]
D. [1.678, 3.683]
Answer: B
3. Proportions and Entire Distributions
135. 3E. Proportions and Entire Distributions
Question
Refer back to the previous question. What is the 95% confidence interval?
134
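Solution
$\hat{p}_1 - \hat{p}_2 \pm z^* \times \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
$0.57 - 0.63 \pm 1.96 \times \sqrt{\frac{0.57 \times 0.43}{260} + \frac{0.63 \times 0.37}{275}}$
$-0.06 \pm 1.96 \times 0.042$
$[-0.143, 0.023]$ (bounds computed from the unrounded proportions)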
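The interval can be recomputed from the raw counts as a check on the arithmetic. This sketch uses the unpooled standard error from the confidence interval formula and z* = 1.96 for 95% confidence; the function name `diff_proportion_ci` is illustrative. The interval has to be centred on the observed difference of about -0.06, which is a quick plausibility check for any answer option.

```python
import math


def diff_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Return (lower, upper) bounds of the CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: each sample keeps its own proportion
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


lower, upper = diff_proportion_ci(148, 260, 173, 275)
print(f"95% CI for p1 - p2: [{lower:.3f}, {upper:.3f}]")

# The interval contains 0, in line with not rejecting H0 in the z-test
contains_zero = lower <= 0 <= upper
print("contains 0:", contains_zero)
```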
136. Answers
Question
Nik wants to see if there is an association between the presence of neuroscientific evidence (1=no, 2=yes)
and juror verdicts (not guilty=1, not guilty due to insanity=2, guilty=3).
What can be concluded based on the table?
135
4. Proportion and Entire Distribution
Verdict | Neuroscientific Evidence: No | Neuroscientific Evidence: Yes | Total
Not Guilty | 32 | 29 | 61
Not Guilty due to insanity | 55 | 61 | 116
Guilty | 10 | 13 | 23
Total | 97 | 103 | 200
137. Answers
Question
Nik wants to see if there is an association between the presence of neuroscientific evidence (1=no, 2=yes)
and juror verdicts (not guilty=1, not guilty due to insanity=2, guilty=3).
What can be concluded based on the table?
136
A. The null hypothesis is not rejected with the observed value of the test statistic equal to 0.67
B. The null hypothesis is rejected with the observed value of the test statistic equal to 1.30
C. The null hypothesis is not rejected with the observed value of the test statistic equal to 0.20
D. The null hypothesis is rejected with the observed value of the test statistic equal to 0.65
Answer: A
4. Proportion and Entire Distribution
138. 4E. Proportion and Entire Distribution
Data
→ We want to study the relationship of two
categorical variables
→ We use a contingency table
→ We use the chi-square test for contingency
tables
Expected Counts:
$EC = \frac{\text{total row} \times \text{total column}}{N}$
Observed counts (expected counts in parentheses):
Verdict | No | Yes
Not Guilty | 32 (29.585) | 29 (31.415)
Not Guilty due to Insanity | 55 (56.26) | 61 (59.740)
Guilty | 10 (11.155) | 13 (11.845)
Solution
→ Calculate the chi-square
$\chi^2 = \sum \frac{(OC - EC)^2}{EC}$
$\chi^2 = \frac{(32 - 29.585)^2}{29.585} + \frac{(55 - 56.26)^2}{56.26} + \frac{(10 - 11.155)^2}{11.155} + \frac{(29 - 31.415)^2}{31.415} + \frac{(61 - 59.740)^2}{59.740} + \frac{(13 - 11.845)^2}{11.845}$
$\chi^2 = 0.197 + 0.028 + 0.119 + 0.186 + 0.026 + 0.113$
$\chi^2 = 0.669 \approx 0.67$
→ Calculate df
$df = (\#rows - 1) \times (\#columns - 1) = (3 - 1) \times (2 - 1) = 2$
→ Check the p-value
The p-value looks to be greater than 0.25, thus
the null hypothesis cannot be rejected.
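The expected counts and the chi-square statistic for a contingency table follow directly from the row and column totals. The sketch below recomputes them for the verdict table (rows: Not Guilty, Not Guilty due to Insanity, Guilty; columns: evidence No, Yes); the function name `chi_square_independence` is illustrative.

```python
# Observed counts: rows are verdicts, columns are evidence (No, Yes)
observed = [
    [32, 29],   # Not Guilty
    [55, 61],   # Not Guilty due to Insanity
    [10, 13],   # Guilty
]


def chi_square_independence(observed):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # total row x total column / N
            expected_row.append(exp)
            chi2 += (obs - exp) ** 2 / exp
        expected.append(expected_row)

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return expected, chi2, df


expected, chi2, df = chi_square_independence(observed)
for row in expected:
    print([round(value, 3) for value in row])
print(f"chi-square = {chi2:.3f}, df = {df}")
```

With a statistic this far below any common critical value for df = 2, the decision does not depend on the exact table lookup.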