Probability Assignment Help

For any help regarding Statistics Assignment Help
Visit : https://www.statisticsassignmenthelp.com/,
Email - support@statisticsassignmenthelp.com
or call us at - +1 678 648 4277

Problem 1. Confident coin.
Recall the confident coin from pset 7. It was spun on its edge 250 times and came up heads 140 times. Our null
hypothesis was that the coin was fair and the two-sided p-value came out as 0.066.
(a) In this problem we want you to make one plot for each of (i) θ = 0.5, (ii) θ = 0.3, (iii) θ = 0.1.
Your plot should include:
• the pmf p(x|θ) of the binomial(250, θ) distribution
• the pdf of the normal distribution with the same mean and variance
• the pdf of the normal distribution with the same mean (as the binomial distribution) and the rule-of-thumb variance.
The rule-of-thumb variance for a Bernoulli(θ) distribution is 1/4, i.e. we use the variance for θ = 0.5 no matter what
the true value of θ is. So the rule-of-thumb variance for a binomial(n, θ) is just n/4.
In each case, how close is each of the normal distributions to the binomial distribution? How do the
two normal distributions differ? Based on your plots, for what range of θ do you think the rule-of-thumb normal distribution
is a reasonable approximation for binomial(n, θ) with large n?
(b) Find 80% and 95% confidence intervals for θ using the polling rule-of-thumb.
(c) Find an 80% posterior probability interval for θ using a flat prior, i.e. beta(1, 1). Center your interval between the
0.1 and 0.9 quantiles. Hint: Use qbeta(p, a, b).
Problem 2. Height.
Suppose μ is the average height of a college male. You measure the heights (in inches) of twenty college men, getting data
x₁, ..., x₂₀, with sample mean x̄ = 69.55 in. and sample variance s² = 14.26 in². Suppose that the xᵢ are drawn from a
normal distribution with unknown mean μ and unknown variance σ².
(a) Using x̄ and s², construct a 90% t-confidence interval for μ.
(b) Now suppose you are told that the height of a college male is normally distributed with standard deviation 3.77 in.
Construct a 90% z-confidence interval for μ.
(c) In (b), how many people in total would you need to measure to bring the width of the 90% z-confidence interval
down to 1 inch?
(d) Consider again the case of unknown variance in (a). Based on this sample variance of 14.26 in², how many people
in total should you expect to need to measure to bring the width of the 90% t-confidence interval down to 1 inch? Is it
guaranteed that this number will be sufficient? Explain your reasoning.
Problem 3. Soda.
Consider a machine that is known to fill soda cans with amounts that follow a normal distribution with (unknown)
mean μ and standard deviation σ = 3 mL. We measure the volume of soda in a sample of bottles and obtain the
following data (in mL):
352, 351, 361, 353, 352, 358, 360, 358, 359
(a) Construct a precise 95% confidence interval for the mean μ.
(b) Now construct a 98% confidence interval for the mean μ.
(c) Suppose now that σ is not known. Redo parts (a) and (b), and compare your answers to those above.
Problem 4. Cholesterol.
In a study on cholesterol levels a sample of 12 men and women was chosen. The plasma cholesterol levels (mmol/L) of the
subjects were as follows:
6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8.
(a) Estimate the variance of the plasma cholesterol levels with a 95% confidence interval.
(b) What assumptions did you make about the sample in order to make your estimate?
Problem 5. Candy.
A large candy manufacturer produces packs of candy targeted to weigh 52 grams. A quality control manager was concerned that the
variation in the actual weights was larger than acceptable. That is, he was concerned that some packs weighed significantly less than 52
grams and some weighed significantly more than 52 grams. In order to estimate σ, the standard deviation of the weights of the
nominal 52-gram packs, he took a random sample of n = 10 packs off of the factory line. The random sample yielded a sample variance
of 4.2 grams².
(a) Use the random sample to derive a 95% confidence interval for σ. Note: the interval is for σ, not σ².
(b) What assumptions did you make about the sample in order to make your estimate?

Problem 1. (a) We have X ∼ binomial(n, θ), so E(X) = nθ and Var(X) = nθ(1 − θ). The rule-of-thumb variance is
just n/4. So the distributions being plotted are
binomial(250, θ), N(250θ, 250θ(1 − θ)), N(250θ, 250/4).
Note: the whole range is from 0 to 250, but we only plotted the parts where the graphs were not essentially 0.
We notice that for each θ the blue dots (the binomial pmf) lie very close to the red curve (the normal pdf with matching
variance). So the N(nθ, nθ(1 − θ)) distribution is quite close to the binomial(n, θ) distribution for each of the values of θ
considered. In fact, by the Central Limit Theorem this holds for any fixed θ strictly between 0 and 1 once n is large enough.
For θ = 0.5 the rule-of-thumb gives the exact variance. For θ = 0.3 the rule-of-thumb
approximation is very good: it has a smaller peak and slightly fatter tails. For θ = 0.1 the rule-of-thumb approximation
breaks down and is not very good.
In summary we can say two things about the rule-of-thumb approximation:
1. It is good for θ near 0.5 and breaks down for extreme values of θ.
2. Since the rule-of-thumb overestimates the variance (the rule-of-thumb graphs are shorter and wider), it gives us a
confidence interval that is larger than is strictly necessary. That is, a 95% rule-of-thumb interval actually has a greater
than 95% confidence level.
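The comparison described above can also be checked numerically without the plots. The following is a minimal sketch in Python (the original used plots, so the helper names `binom_pmf` and `max_gap` and the choice of "largest pointwise gap" as a closeness measure are assumptions made here for illustration). It measures how far each normal pdf is from the binomial(250, θ) pmf.

```python
import math
from statistics import NormalDist

N = 250  # number of spins, from the problem statement

def binom_pmf(k, n, theta):
    """Exact binomial(n, theta) pmf at k."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

def max_gap(theta, variance):
    """Largest |pmf(k) - normal pdf(k)| over k = 0..N for N(N*theta, variance).

    This closeness measure is an illustrative choice, not taken from the source.
    """
    normal = NormalDist(mu=N * theta, sigma=math.sqrt(variance))
    return max(abs(binom_pmf(k, N, theta) - normal.pdf(k)) for k in range(N + 1))

gaps = {}
for theta in (0.5, 0.3, 0.1):
    matching = max_gap(theta, N * theta * (1 - theta))  # same mean and variance
    thumb = max_gap(theta, N / 4)                       # rule-of-thumb variance
    gaps[theta] = (matching, thumb)
    print(f"theta={theta}: matching-variance gap={matching:.5f}, "
          f"rule-of-thumb gap={thumb:.5f}")
```

For θ = 0.5 the two gaps coincide because the variances are equal; for θ = 0.1 the rule-of-thumb gap should be several times the matching-variance gap, which is the breakdown noted in the summary.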
(b) Using the rule-of-thumb approximation, we know that x̄ is approximately N(θ, 1/(4n)).
For an 80% confidence interval, we have α = 0.2 so
$z_{\alpha/2}$ = qnorm(0.9, 0, 1) = 1.2815.
So the 80% confidence interval for θ is given by
$$\left[\bar{x} - \frac{z_{0.1}}{2\sqrt{n}},\ \bar{x} + \frac{z_{0.1}}{2\sqrt{n}}\right] = [0.5195,\ 0.6005]$$

For the 95% confidence interval, we use the rule-of-thumb that $z_{0.025} \approx 2$. So the confidence interval is
$$\left[\bar{x} - \frac{1}{\sqrt{n}},\ \bar{x} + \frac{1}{\sqrt{n}}\right] = [0.497,\ 0.623]$$
It’s okay to have used the exact value of $z_{0.025}$. This gives a confidence interval:
$$\left[\bar{x} - \frac{1.96}{2\sqrt{n}},\ \bar{x} + \frac{1.96}{2\sqrt{n}}\right] = [0.498,\ 0.622]$$
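The three rule-of-thumb intervals can be reproduced with a short Python sketch. It uses `statistics.NormalDist().inv_cdf` from the standard library in place of R's `qnorm`; the function name `rule_of_thumb_ci` is a label chosen here, not something from the original solution.

```python
import math
from statistics import NormalDist

n, heads = 250, 140      # data from the problem statement
x_bar = heads / n        # observed proportion of heads, 0.56

def rule_of_thumb_ci(x_bar, n, z):
    """Interval x_bar +/- z / (2 sqrt(n)), i.e. standard deviation 1/2 per spin."""
    half = z / (2 * math.sqrt(n))
    return x_bar - half, x_bar + half

z_80 = NormalDist().inv_cdf(0.90)                  # plays the role of qnorm(0.9, 0, 1)
ci_80 = rule_of_thumb_ci(x_bar, n, z_80)
ci_95_thumb = rule_of_thumb_ci(x_bar, n, 2.0)      # z_{0.025} rounded to 2
ci_95_exact = rule_of_thumb_ci(x_bar, n, NormalDist().inv_cdf(0.975))

for label, ci in [("80%", ci_80), ("95% (z = 2)", ci_95_thumb), ("95% (exact z)", ci_95_exact)]:
    print(f"{label}: [{ci[0]:.4f}, {ci[1]:.4f}]")
```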
(c) With prior Beta(1, 1), if we observe x heads then the posterior is Beta(x + 1, 250 + 1 − x). In our case x = 140. So,
using R we get the 80% posterior probability interval:
prob interval = [qbeta(0.1, 141, 111), qbeta(0.9, 141, 111)]
= [0.51937, 0.5995]
This is quite close to the 80% confidence interval. Though the two intervals have very different technical meanings, we
see that they are consistent (and numerically close). Both give a type of estimate of θ.
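For readers without R, the posterior quantiles can be approximated using only the Python standard library. The sketch below is not how `qbeta` works internally; it is a deliberately simple stand-in (Simpson's rule for the Beta cdf plus bisection) and the helper names `beta_cdf` and `beta_quantile` are assumptions introduced here.

```python
import math

def beta_cdf(x, a, b, steps=2000):
    """Beta(a, b) cdf at x via composite Simpson's rule (assumes a, b > 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(u):
        if u <= 0.0 or u >= 1.0:
            return 0.0  # the density vanishes at the endpoints when a, b > 1
        return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3

def beta_quantile(p, a, b):
    """Invert the cdf by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x, n = 140, 250
a, b = x + 1, n - x + 1          # posterior Beta(141, 111) under the flat prior
lower = beta_quantile(0.1, a, b)
upper = beta_quantile(0.9, a, b)
print(f"80% posterior interval: [{lower:.4f}, {upper:.4f}]")
```

The result should agree with the `qbeta` values quoted above to about three decimal places.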
Problem 2. (a) We have n = 20 and α = 0.1 so
$t_{\alpha/2}$ = qt(0.95, 19) = 1.7291.
Thus the 90% t-confidence interval is given by
$$\left[\bar{x} - t_{\alpha/2}\cdot\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\cdot\frac{s}{\sqrt{n}}\right] = [68.08993,\ 71.01007]$$
Given that the sample mean and variance are only reported to 2 decimal places, the extra digits are spurious precision.
It is worth noting that to the given precision (rounding outward) the 90% confidence interval is [68.08, 71.02]. (The
problem did not ask you to do this.)
(b) We have $z_{\alpha/2}$ = qnorm(0.95) = 1.6448.
So the 90% z-confidence interval is given by
$$\left[\bar{x} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right] = [68.16339,\ 70.93661]$$
As in part (a), taking the precision of the mean into account we get the interval [68.16, 70.94].
(c) We need n such that $2 \cdot z_{0.05} \cdot \sigma/\sqrt{n} = 1$. So $n = (2 \cdot z_{0.05} \cdot \sigma)^2 = 153.8$. Since you
need a whole number of people the answer is n = 154.
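Parts (b) and (c) use the same formula read in two directions: from n to a width, and from a target width back to n. A minimal Python sketch follows, with `NormalDist().inv_cdf` standing in for R's `qnorm`; the variable names are illustrative choices.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 69.55, 3.77, 20     # values from the problem statement
z = NormalDist().inv_cdf(0.95)        # z_{0.05}, the upper 5% point of N(0, 1)

# (b) 90% z-confidence interval for the mean
half = z * sigma / math.sqrt(n)
ci_90 = (x_bar - half, x_bar + half)

# (c) solve 2 * z * sigma / sqrt(n) = target_width for n, then round up
target_width = 1.0
n_exact = (2 * z * sigma / target_width) ** 2
n_needed = math.ceil(n_exact)

print(f"90% z-interval: [{ci_90[0]:.2f}, {ci_90[1]:.2f}]")
print(f"n for width {target_width}: {n_exact:.1f}, so measure {n_needed} people")
```

Rounding up rather than to the nearest integer matters here: any n below the exact solution leaves the interval wider than the target.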
(d) We need to find n so that $2 \cdot t_{n-1,\,0.05} \cdot s/\sqrt{n} = 1$. Because the critical value $t_{n-1,\,0.05}$ depends on
n, the only way to find the right n is by systematically checking different values of n.
n = 157
t05 = qt(0.95, n-1) = 1.6547
width = 2*sqrt(s2)*t05/sqrt(n) = 0.99736 (very close to 1).
(Our actual code used a ‘for loop’ to run through the values n = 130 to n = 180 and print the width to the screen for
each n.)

We find n = 157 is the first value of n where the width of the 90% interval is less than 1. This is not guaranteed to be
sufficient. In an actual experiment the value of s² won’t necessarily equal 14.26. If it happens to be smaller, then the 90%
t-confidence interval will have width less than 1; if it happens to be larger, the width can exceed 1 even with n = 157.
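The search over n can be sketched in Python as follows. The standard library has no Student t quantile, so this sketch substitutes a simple numerical one (Simpson's rule for the cdf plus bisection); the helpers `t_cdf`, `t_quantile` and `width` are assumptions introduced here and are not the code the solution refers to.

```python
import math

S2 = 14.26  # sample variance from the problem statement

def t_cdf(x, df, steps=200):
    """Student t cdf for x >= 0, via Simpson's rule on [0, x] plus symmetry."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for p > 0.5 by bisection; stands in for R's qt(p, df)."""
    lo, hi = 0.0, 10.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def width(n, s2=S2):
    """Width of the 90% t-confidence interval if the sample variance stays at s2."""
    return 2 * math.sqrt(s2) * t_quantile(0.95, n - 1) / math.sqrt(n)

# Same range of n as the loop described in the text
first_n = next(n for n in range(130, 181) if width(n) < 1)
print(first_n, round(width(first_n), 5))
```

Calling `width(157, s2=16.0)` shows the caveat in the answer: with a larger sample variance the same n no longer brings the width under 1.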
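Problem 3. (a) The sample mean is x̄ = 356. Since $z_{0.025}$ = 1.96, σ = 3 and n = 9, the 95% confidence interval is
$$\left[\bar{x} - z_{0.025}\cdot\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{0.025}\cdot\frac{\sigma}{\sqrt{n}}\right] = [354.04,\ 357.96]$$
(b) We have $z_{0.01}$ = qnorm(0.99) = 2.33. So the 98% confidence interval is
$$\left[\bar{x} - z_{0.01}\cdot\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{0.01}\cdot\frac{\sigma}{\sqrt{n}}\right] = [353.67,\ 358.33].$$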
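Both z-intervals come from one formula with a different critical value, which a small Python sketch makes explicit. The function `z_interval` is a name chosen for this illustration; `NormalDist().inv_cdf` is used where the solution uses `qnorm`.

```python
import math
from statistics import NormalDist, mean

volumes = [352, 351, 361, 353, 352, 358, 360, 358, 359]  # data in mL
sigma = 3.0                                              # known standard deviation
n = len(volumes)
x_bar = mean(volumes)

def z_interval(confidence):
    """Two-sided z-confidence interval for the mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half = z * sigma / math.sqrt(n)
    return x_bar - half, x_bar + half

ci_95 = z_interval(0.95)
ci_98 = z_interval(0.98)
print(f"x_bar = {x_bar}")
print(f"95%: [{ci_95[0]:.2f}, {ci_95[1]:.2f}]")
print(f"98%: [{ci_98[0]:.2f}, {ci_98[1]:.2f}]")
```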
(c) The sample variance is
$$s^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1} = \texttt{var([352 351 361 353 352 358 360 358 359])} = 15.5$$
Since n = 9 the number of degrees of freedom for the t-statistic is 8.
Redo (a): $t_{8,\,0.025}$ = qt(0.975, 8) = 2.306. So the 95% confidence interval is
$$\left[\bar{x} - t_{8,\,0.025}\cdot\frac{s}{\sqrt{n}},\ \bar{x} + t_{8,\,0.025}\cdot\frac{s}{\sqrt{n}}\right] \approx [352.97,\ 359.03].$$
Redo (b): $t_{8,\,0.01}$ = qt(0.99, 8) = 2.896. So the 98% confidence interval is
$$\left[\bar{x} - t_{8,\,0.01}\cdot\frac{s}{\sqrt{n}},\ \bar{x} + t_{8,\,0.01}\cdot\frac{s}{\sqrt{n}}\right] \approx [352.20,\ 359.80].$$
These intervals are larger than the corresponding intervals from parts (a) and (b). The new uncertainty regarding the
value of σ means we need larger intervals to achieve the same level of confidence. This is reflected in the fact that the t
distribution has fatter tails than the normal distribution.
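The t-intervals can be checked with the Python standard library once the critical values are known. In this sketch the two critical values are copied from the solution above (Python's standard library has no equivalent of `qt`), and the name `t_interval` is an illustrative choice.

```python
import math
from statistics import mean, variance

volumes = [352, 351, 361, 353, 352, 358, 360, 358, 359]  # data in mL
n = len(volumes)
x_bar = mean(volumes)
s2 = variance(volumes)   # sample variance, n - 1 in the denominator

# Critical values quoted in the text: qt(0.975, 8) and qt(0.99, 8)
T_CRIT = {0.95: 2.306, 0.98: 2.896}

def t_interval(confidence):
    """Two-sided t-confidence interval for the mean with unknown sigma."""
    half = T_CRIT[confidence] * math.sqrt(s2 / n)
    return x_bar - half, x_bar + half

ci_95 = t_interval(0.95)
ci_98 = t_interval(0.98)
print(f"s^2 = {s2}")
print(f"95%: [{ci_95[0]:.2f}, {ci_95[1]:.2f}]")
print(f"98%: [{ci_98[0]:.2f}, {ci_98[1]:.2f}]")
```

Here s is about 3.94, larger than the previously assumed σ = 3, so the intervals widen for two reasons: a larger spread estimate and a larger critical value.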
Problem 4. (a) This is similar to problem 3c. We assume the data is normally distributed with unknown mean μ and
variance σ². We have the number of data points n = 12. Using Matlab we find
data = [6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8];
$$\bar{x} = \frac{\sum_i x_i}{n} = \texttt{mean(data)} = 6.1917$$
$$s^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1} = \texttt{var(data)} = 0.15356$$
$c_{0.025}$ = qchisq(0.975, 11) = 21.920
$c_{0.975}$ = qchisq(0.025, 11) = 3.8157
So the 95% confidence interval is
$$\left[\frac{(n-1)\cdot s^2}{c_{0.025}},\ \frac{(n-1)\cdot s^2}{c_{0.975}}\right] = [0.077060,\ 0.442683].$$
s² is our point estimate for σ² and the confidence interval is our range estimate with 95% confidence.
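The arithmetic for the variance interval can be reproduced in a few lines of Python. The chi-square critical values are copied from the solution above (the standard library has no `qchisq`), and the constant names `C_UPPER` and `C_LOWER` are labels introduced for this sketch.

```python
from statistics import mean, variance

data = [6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8]  # mmol/L
n = len(data)
x_bar = mean(data)
s2 = variance(data)      # sample variance, n - 1 in the denominator

# Critical values quoted in the text for 11 degrees of freedom
C_UPPER = 21.920         # c_{0.025} = qchisq(0.975, 11)
C_LOWER = 3.8157         # c_{0.975} = qchisq(0.025, 11)

# Dividing by the larger critical value gives the lower endpoint
ci_var = ((n - 1) * s2 / C_UPPER, (n - 1) * s2 / C_LOWER)
print(f"x_bar = {x_bar:.4f}, s^2 = {s2:.5f}")
print(f"95% interval for the variance: [{ci_var[0]:.5f}, {ci_var[1]:.5f}]")
```

Note that the interval is not symmetric around s², because the chi-square distribution is skewed.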
(b) We have assumed that the plasma cholesterol levels are independent and normally distributed. This might not be a
good assumption because cholesterol for men and women might follow different distributions. We’d have to do further
exploration to understand this.
Problem 5. (a) We have n = 10 and s² = 4.2. Assuming that the weights are normally distributed with mean μ = 52 and
variance σ², we know that
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_9.$$
We have
$c_{0.025}$ = qchisq(0.975, 9) = 19.023
$c_{0.975}$ = qchisq(0.025, 9) = 2.7004
The 95% confidence interval for σ is given by
$$\left[\sqrt{\frac{(n-1)s^2}{c_{0.025}}},\ \sqrt{\frac{(n-1)s^2}{c_{0.975}}}\right] = [1.4096,\ 3.7414]$$
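Since the interval is for σ rather than σ², the only change from a variance interval is a square root on each endpoint. A minimal Python sketch, using the critical values quoted in the solution (the constant names are illustrative):

```python
import math

n, s2 = 10, 4.2          # sample size and sample variance from the problem

# Critical values quoted in the text for 9 degrees of freedom
C_UPPER = 19.023         # c_{0.025} = qchisq(0.975, 9)
C_LOWER = 2.7004         # c_{0.975} = qchisq(0.025, 9)

# Interval for the variance first, then take square roots for sigma
var_lo = (n - 1) * s2 / C_UPPER
var_hi = (n - 1) * s2 / C_LOWER
ci_sigma = (math.sqrt(var_lo), math.sqrt(var_hi))
print(f"95% interval for sigma: [{ci_sigma[0]:.4f}, {ci_sigma[1]:.4f}]")
```

Taking square roots preserves the confidence level because the square root is increasing, so σ² lies in the variance interval exactly when σ lies in the square-rooted one.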
(b) In order to use a χ² confidence interval we assumed that the weights of the packs of candy are independent and
normally distributed with mean 52 and variance σ².
