10. Sampling and hypothesis

  1. 1. Sampling and Hypothesis Testing
  2. 2. Census & Sample: In the census (complete enumeration) survey method, data are collected for each and every unit of the population (universe), which is the complete set of items of interest in a particular situation. A sample is a portion chosen from the population.
  3. 3. Sampling: The study of a sample is called sampling. In a sampling technique, only a part of the population is studied instead of every unit, and conclusions are drawn on that basis for the entire universe. It is a process of learning about the population on the basis of a sample drawn from it.
  4. 4. Methods of Sampling: Probability (random) methods: simple (unrestricted), stratified, systematic, cluster/multistage. Non-probability (non-random) methods: judgment (purposive), quota, convenience.
  5. 5. Statistic & Parameter: A statistic describes a characteristic of a sample; it is a value obtained from the study of a sample, such as the mean, median or standard deviation. A parameter describes a characteristic of a population; it is a value obtained from the population, such as the mean, median or standard deviation.
  6. 6. Population and sample compared:

      | | Population | Sample |
      | --- | --- | --- |
      | Definition | Collection of items being considered | Part or portion of the population chosen for study |
      | Properties | Parameters | Statistics |
      | Symbol: size | $N$ | $n$ |
      | Symbol: mean | $\mu$ | $\bar{x}$ |
      | Symbol: standard deviation | $\sigma$ | $s$ |
  7. 7. Sampling Error: The difference between the result inferred about the population from studying a sample and the result of a census of the whole population. It is the error that arises from drawing inferences about the population on the basis of a few observations (sampling). Two types: biased errors and unbiased errors.
  8. 8. Sampling error is non-existent in a complete enumeration survey. Non-sampling errors are errors that occur in acquiring, recording or tabulating statistical data and that cannot be ascribed to sampling error. They may arise in either a census or a sample.
  9. 9. Statistical hypothesis: A statement about a population parameter. It is an assertion or assumption that we make about a population parameter, which may or may not be valid, but which is used as a basis for reasoning. Hypothesis testing: A process of testing a statement or belief about a population parameter using information collected from one or more samples.
  10. 10. Hypothesis Testing: Null hypothesis: It states that there is no real difference between the sample and the population in the particular matter under consideration. Denoted by H0.
  11. 11. Null hypothesis • States that the "null" condition exists • There is nothing new happening • The old standard is correct • The old theory is still true • The system is in control. Key phrase: the difference is not significant.
  12. 12. Alternative hypothesis: The alternative hypothesis is complementary to the null hypothesis and specifies those values that the researcher believes to hold true. Denoted by Ha. The two hypotheses are such that if one is accepted, the other is rejected.
  13. 13. Alternative hypothesis • The new theory is true • Something is happening • There are new standards • The system is out of control. Key phrase: the difference is significant, i.e. the results of the experiment are unlikely to be due to chance, so the null hypothesis is rejected.
  14. 14. Caselets: Flour packaged by a manufacturer is supposed to weigh 40 ounces on average. The manufacturer wants to test the packaging process. Null hypothesis: the average weight of the packages is 40 ounces (no problem). Alternative hypothesis: the average is not 40 ounces (the process is out of control).
  15. 15. A company has found that the mean lifetime of its fluorescent light bulbs is 1600 hrs. Because of technical improvements, officials believe that the lifetime of the bulbs is now greater than 1600 hrs. Null hypothesis: the lifetime of the bulbs is still 1600 hrs (old idea). Alternative hypothesis: the lifetime of the bulbs is greater than 1600 hrs (new theory).
  16. 16. • You are investigating the effects of a new pain reliever. • You hope the new drug relieves pain longer than the leading pain reliever. Null hypothesis: the new pain reliever is no better than the leading pain reliever. Alternative hypothesis: the new pain reliever lasts longer than the leading pain reliever.
  17. 17. • An automobile manufacturer claims a new model gets at least 27 miles per gallon. A consumer group disputes this claim and would like to show that the mean miles per gallon is lower (H0: $\mu \ge 27$ and Ha: $\mu < 27$). • A freezer is set to cool food to 10°. If the temperature is higher, the food could spoil, and if it is lower, the freezer is wasting energy. Random freezers are selected and tested as they come off the assembly line. The assembly line is stopped if there is any evidence to suggest improper cooling (H0: $\mu = 10$ and Ha: $\mu \ne 10$).
  18. 18. Level of Significance: The validity of H0 is tested against that of Ha at a certain level of significance. The risk with which an experimenter rejects or retains a null hypothesis depends on the significance level adopted. A 5% level means the probability of rejecting the null hypothesis when it is true is 0.05. The level is denoted by $\alpha$ and is specified before the samples are drawn.
  19. 19. Figure (two-tailed test): a central acceptance region with a rejection (critical) region in each tail. Accept the null hypothesis if the sample statistic falls in the acceptance region; reject it if the sample statistic falls in either of the two rejection regions.
  20. 20. Errors in hypothesis testing: Type I error: reject H0 when it is true; P{Reject H0 | H0 is true} = $\alpha$. Type II error: accept H0 when it is false; P{Accept H0 | Ha is true} = $\beta$.
  21. 21. Decision table:

      | Decision | H0 is true | H0 is false |
      | --- | --- | --- |
      | Accept H0 | Correct decision, with confidence $(1 - \alpha)$ | Type II error ($\beta$) |
      | Reject H0 | Type I error ($\alpha$) | Correct decision $(1 - \beta)$ |

      P{Reject a lot when it is good} = $\alpha$ (producer's risk). P{Accept a lot when it is bad} = $\beta$ (consumer's risk).
  22. 22. Note: 1. We would like $\alpha$ and $\beta$ to be as small as possible. 2. For a fixed sample size, $\alpha$ and $\beta$ are inversely related: reducing one increases the other. 3. Usually $\alpha$ is set in advance (as 0.01 or 0.05). 4. $1 - \alpha$ is known as the confidence coefficient or degree of confidence. 5. $1 - \beta$ is the power of the statistical test: a measure of the ability of a hypothesis test to reject a false null hypothesis. 6. Regardless of the outcome of a hypothesis test, we never really know for sure whether we have made the correct decision.
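The trade-off between $\alpha$ and $\beta$ in the note above can be sketched numerically. This is a minimal illustration for a right-tailed z-test with known $\sigma$; the function name `power_right_tailed` and all the numbers (means, $\sigma$, $n$) are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def power_right_tailed(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a right-tailed z-test of H0: mu = mu0."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu0 + STD_NORMAL.inv_cdf(1 - alpha) * se
    # beta = P(sample mean <= cutoff) when the true mean is mu_true.
    beta = NormalDist(mu_true, se).cdf(cutoff)
    return 1 - beta


# Hypothetical numbers, chosen only for illustration.
for alpha in (0.10, 0.05, 0.01):
    power = power_right_tailed(mu0=100, mu_true=103, sigma=10, n=25, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={1 - power:.3f}  power={power:.3f}")
```

With the sample size held fixed, lowering $\alpha$ raises $\beta$ (and lowers the power), which is the inverse relationship stated in point 2.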
  23. 23. Two-tailed Test: The alternative hypothesis states that the population parameter may be either less than or greater than the value stated in H0. H0: $\mu = \mu_0$, Ha: $\mu \ne \mu_0$. The rejection region is located in both tails.
  24. 24. One-tailed Test: The alternative hypothesis states that the population parameter differs from the value stated in H0 in one particular direction. H0: $\mu \le \mu_0$, Ha: $\mu > \mu_0$ (right-tailed test). H0: $\mu \ge \mu_0$, Ha: $\mu < \mu_0$ (left-tailed test). The critical region is located in only one tail of the sampling distribution.
  25. 25. One-tailed Test (figures): right (upper) tail critical region; left (lower) tail critical region.
  26. 26. Critical values (significant values): The value of the test statistic that separates the critical (rejection) region from the acceptance region. It depends on 1) the level of significance and 2) the type of tail.
  27. 27. Standard Normal Distribution (figure): the central area between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $(1 - \alpha)$, and each tail has area $\alpha/2$.
  28. 28. Summary of critical values of the sample statistic z, by level of significance:

      | Rejection region | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.01$ | $\alpha = 0.005$ |
      | --- | --- | --- | --- | --- |
      | One-tailed | 1.28 | 1.645 | 2.33 | 2.58 |
      | Two-tailed | 1.645 | 1.96 | 2.58 | 2.81 |
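The critical values listed above can be reproduced from the inverse of the standard normal distribution function. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=False):
    """Critical value of z for a given level of significance."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01, 0.005):
    one = critical_z(alpha)
    two = critical_z(alpha, two_tailed=True)
    print(f"alpha={alpha:<6} one-tailed={one:.3f}  two-tailed={two:.3f}")
```

For a left-tailed test the critical value is the negative of the one-tailed value.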
  29. 29. Sampling Distribution of a Statistic: The probability distribution of all possible values a statistic may assume when computed from random samples of the same size drawn from a specified population.
  30. 30. • Draw k samples of size n from a given finite population of size N. • Compute some statistic, such as the mean or variance, for each of these k samples. • The set of values of the statistic so obtained (one for each sample) constitutes the sampling distribution of the statistic.
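The construction described above can be tried on a tiny example. The sketch below is an illustration under stated assumptions: the population values are hypothetical, sampling is done with replacement, and every possible sample of size n is enumerated rather than drawing k random samples, so the result is exact and repeatable.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8, 10]  # hypothetical finite population, N = 5
n = 2                          # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# One value of the statistic (here the mean) per sample:
# together these values form the sampling distribution of the mean.
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)
standard_error = pstdev(sample_means)

print("number of samples:      ", len(samples))
print("population mean:        ", mean(population))
print("mean of sample means:   ", mean_of_means)
print("sd of sample means:     ", standard_error)
print("population sd / sqrt(n):", pstdev(population) / sqrt(n))
```

Under these assumptions the mean of the sample means equals the population mean, and their standard deviation equals the population standard deviation divided by $\sqrt{n}$. Sampling without replacement from a small population would need a finite population correction, which this sketch does not cover.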
  31. 31. Properties of the Sampling Distribution of the Mean: The arithmetic mean of the sampling distribution of means is equal to the mean of the population from which the samples were drawn: $\mu_{\bar{x}} = \mu$. • For large samples, the sampling distribution of means is approximately normal, irrespective of the distribution of the universe.
  32. 32. The sampling distribution of the mean has a standard deviation (the standard error) equal to the population standard deviation divided by the square root of the sample size: $S.E.(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. Standard error: the standard deviation of the sampling distribution of a statistic about the population parameter is known as its standard error.
  33. 33. Test of Significance for a Single Mean (Large Samples): If $x_i$, $i = 1, 2, \dots, n$, is a random sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$, then the sample mean is normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$: $\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
  34. 34. For large samples, the standard normal variate is $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$. In general, test statistic = (value of sample statistic - value of hypothesized population parameter) / (standard error of statistic).
  35. 35. Procedure in hypothesis testing: 1) Formulate the hypotheses. 2) Set a suitable significance level. 3) Select the test criterion. 4) Compute z. 5) Make the decision.
  36. 36. Caselet: A company manufacturing automobile tyres finds that tyre life is normally distributed with a mean of 40,000 km and a standard deviation of 3000 km. It is believed that a change in the production process will result in a better product, and the company has developed a new tyre. A sample of 100 new tyres has been selected. The company has found that the mean life of these new tyres is 40,900 km. Can it be concluded that the new tyre is significantly better than the old one, using a significance level of 0.01?
  37. 37. Solution: Null hypothesis H0: $\mu = 40{,}000$. Alternative hypothesis Ha: $\mu > 40{,}000$. Level of significance: $\alpha = 0.01$. Test criterion: z-test. Computation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3000}{\sqrt{100}} = 300$, so $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{40{,}900 - 40{,}000}{300} = 3$.
  38. 38. At the 0.01 level, the one-tailed critical value of z is 2.33, and $z_{cal} = 3$. Since $z_{cal} > z_{tab}$, the computed value falls in the rejection region, so we reject the null hypothesis; that is, the alternative hypothesis that $\mu$ is greater than 40,000 km is accepted.
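The tyre caselet can be checked in a few lines. This is a minimal sketch of the large-sample z-test for a single mean using the figures from the caselet; the function name `one_sample_z` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z statistic for a single mean with known population sd."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


alpha = 0.01
z_cal = one_sample_z(sample_mean=40_900, mu0=40_000, sigma=3_000, n=100)
z_tab = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value

reject_h0 = z_cal > z_tab
print(f"z_cal={z_cal:.2f}  z_tab={z_tab:.2f}  reject H0: {reject_h0}")
```

The same function applies to the practice caselets that follow; only the comparison changes with the tail (use `z_cal < -z_tab` for a left-tailed test and `abs(z_cal)` against the two-tailed critical value otherwise).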
  39. 39. Caselet: An ambulance service claims that it takes, on average, 8.9 minutes to reach its destination on emergency calls. To check this claim, the agency that licenses ambulance services timed 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.8 minutes. Does this constitute evidence, at the 1% level of significance, that the figure claimed is not right? Hint: H0: $\mu = 8.9$; Ha: $\mu \ne 8.9$; $z_{cal} \approx 1.57$; H0 accepted.
  40. 40. A random sample of boots worn by 40 combat soldiers in a desert region showed an average life of 1.08 yrs with a standard deviation of 0.05 yrs. Under standard conditions, the boots are known to have an average life of 1.28 yrs. Is there reason to assert, at a level of significance of 0.05, that use in the desert causes the mean life of such boots to decrease? Hint: H0: $\mu = 1.28$; Ha: $\mu < 1.28$; $z_{cal} \approx -25.3$; H0 rejected.
  41. 41. Hinton Press hypothesizes that the average life of its largest web press is 14,500 hrs. They know that the standard deviation of press life is 2100 hrs. From a sample of 25 presses, the company finds a sample mean of 13,000 hrs. At a 0.01 significance level, should the company conclude that the average life of the presses is less than the hypothesized 14,500 hours? Ans: H0 rejected.
  42. 42. ABC company is engaged in packaging a superior quality tea in jars of 500 gm each. The company is of the view that as long as the jars contain 500 gm of tea, the process is in control. The standard deviation is 50 gm. A sample of 225 jars is taken at random and the sample average is found to be 510 gm. Has the process gone out of control? Hint: H0: $\mu = 500$; Ha: $\mu \ne 500$; $z_{cal} = 3$; H0 rejected.
  43. 43. Caselet: American Theaters knows that a certain hit movie ran an average of 84 days in each city, and the corresponding standard deviation was 10 days. The manager of the southeastern district was interested in comparing the movie's popularity in his region with that in all of American's other theaters. He randomly chose 75 theaters in his region and found that they ran the movie an average of 81.5 days.
  44. 44. State appropriate hypotheses for testing whether there was a significant difference in the length of the picture's run between theaters in the southeastern district and all of American's other theaters. At a 1% significance level, test these hypotheses. (Ans: Accept H0)
  45. 45. Test of significance of the difference of means: Here, we study two populations. Let $\bar{x}_1$ be the mean of a sample of size $n_1$ from a population with mean $\mu_1$ and variance $\sigma_1^2$, and let $\bar{x}_2$ be the mean of an independent random sample of size $n_2$ from another population with mean $\mu_2$ and variance $\sigma_2^2$. Then $\bar{x}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right)$ and $\bar{x}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$.
  46. 46. The mean of the sampling distribution of the difference between sample means is $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$. The standard deviation of the sampling distribution of the difference between the sample means is called the standard error of the difference between two means: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.
  47. 47. When $n > 30$ (large samples), the test statistic is $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$. Under H0: $\mu_1 = \mu_2$, this reduces to $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$.
  48. 48. The means of two single large samples of 1000 and 2000 members are 67.5 inches and 68.0 inches respectively. Can the samples be regarded as drawn from the same population with standard deviation 2.5 inches? Test at the 5% level of significance. Sol: $n_1 = 1000$, $n_2 = 2000$, $\bar{x}_1 = 67.5$ inches, $\bar{x}_2 = 68.0$ inches, $\sigma_1 = \sigma_2 = 2.5$ inches.
  49. 49. H0: $\mu_1 = \mu_2$ (the samples are drawn from the same population). Ha: $\mu_1 \ne \mu_2$ (two-tailed). Level of significance: $\alpha = 0.05$. Z-test: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{67.5 - 68.0}{2.5\sqrt{\frac{1}{1000} + \frac{1}{2000}}} \approx -5.16$.
  50. 50. $z_{cal} \approx -5.16$ and $z_{tab} = 1.96$ (at 5%, two-tailed). As $|z_{cal}| > z_{tab}$, the calculated value lies in the rejection region, so we reject the null hypothesis; that is, the samples cannot be regarded as drawn from the same population with standard deviation 2.5.
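The two-sample calculation above can be reproduced as follows. This is a minimal sketch of the large-sample z-test for the difference of two means under H0: $\mu_1 = \mu_2$, using the figures from the worked example; the function name `two_sample_z` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """z statistic for the difference of two means under H0: mu1 = mu2."""
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


alpha = 0.05
z_cal = two_sample_z(mean1=67.5, mean2=68.0, sd1=2.5, sd2=2.5, n1=1000, n2=2000)
z_tab = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

reject_h0 = abs(z_cal) > z_tab
print(f"z_cal={z_cal:.2f}  z_tab={z_tab:.2f}  reject H0: {reject_h0}")
```

When the population standard deviations are unknown, the practice problems that follow substitute the sample standard deviations for `sd1` and `sd2`, which is the usual large-sample approximation.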
  51. 51. In a survey of buying habits, 400 women shoppers are chosen at random in supermarket 'A', located in a certain section of the city. Their average weekly food expenditure is Rs 250 with a standard deviation of Rs 40. For 400 women shoppers chosen at random in supermarket 'B' in another section of the city, the average weekly food expenditure is Rs 220 with a standard deviation of Rs 55. Test at the 1% level of significance whether the average weekly food expenditures of the two populations of shoppers are equal. Ans: H0: $\mu_1 = \mu_2$; Ha: $\mu_1 \ne \mu_2$ (two-tailed); H0 rejected.
  52. 52. The average hourly wage of a sample of 150 workers in plant 'A' was Rs 2.56 with a standard deviation of Rs 1.08. The average hourly wage of a sample of 200 workers in plant 'B' was Rs 2.87 with a standard deviation of Rs 1.28. Can an applicant safely assume that the hourly wages paid by plant 'B' are higher than those paid by plant 'A'? Ans: H0: $\mu_1 = \mu_2$; Ha: $\mu_1 < \mu_2$ (left-tailed); H0 rejected.
  53. 53. In 1993, the Financial Accounting Standards Board (FASB) was considering a proposal to require companies to report the potential effect of employees' stock options on earnings per share (EPS). A random sample of 41 high-tech firms revealed that the new proposal would reduce EPS by an average of 13.8%, with a s.d. of 18.9%. A random sample of 35 producers of consumer goods showed that the proposal would reduce EPS by 9.1% on average, with a s.d. of 8.7%. On the basis of these samples, is it reasonable to conclude (at the 10% level of significance) that the FASB proposal will cause a greater reduction in EPS for high-tech firms than for producers of consumer goods? Ans: H0: $\mu_1 = \mu_2$; Ha: $\mu_1 > \mu_2$ (right-tailed); H0 rejected.
  54. 54. Q: Two independent samples of observations were collected. For the first sample of 60 elements, the mean was 86 and the standard deviation 6. The second sample of 75 elements had a mean of 82 and a standard deviation of 9. Compute the standard error of the difference between the means. Using $\alpha = 0.01$, test whether the two samples can reasonably be considered to have come from populations with the same mean.
  55. 55. A potential buyer wants to decide which of two brands of electric bulbs he should buy, as he has to buy them in bulk. As a specimen, he buys 100 bulbs of each of the two brands, A and B. On using these bulbs, he finds that brand A has a mean life of 1200 hrs with a standard deviation of 50 hrs and brand B has a mean life of 1150 hrs with a s.d. of 40 hrs. Do the two brands differ significantly in quality? (Use $\alpha = 0.05$.) Ans: H0: $\mu_1 = \mu_2$; Ha: $\mu_1 \ne \mu_2$ (two-tailed); H0 rejected.
  56. 56. Test of Significance for a Single Proportion: Suppose we take a sample of $n$ persons from a population and $x$ of these persons possess a particular characteristic (say, educated). Then the sample proportion is $p = x/n$. The population proportion is denoted by $P$.
  57. 57. The mean of the sampling distribution of proportions is $\mu_p = P$. The standard deviation of the sampling distribution of proportions is the standard error of the proportion: $\sigma_p = \sqrt{\frac{PQ}{n}} = \sqrt{\frac{P(1 - P)}{n}}$, where $Q = 1 - P$. The standard normal variable is $z = \frac{p - P}{\sqrt{PQ/n}}$.
  58. 58. Q: A company manufactures superior quality diaries, which are primarily meant for senior executives in the corporate world. It claims that 75% of the executives employed in Delhi use its diaries. A random sample of 800 executives was taken, and it was found that 570 executives were using its diary when the survey was undertaken. Verify the company's claim, using the 5% level of significance.
  59. 59. H0: $P = 0.75$. Ha: $P \ne 0.75$. Level of significance: $\alpha = 0.05$. With $p = 570/800 = 0.7125$, $z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.7125 - 0.75}{\sqrt{\frac{0.75(1 - 0.75)}{800}}} \approx -2.45$.
  60. 60. $z_{cal} = -2.45$ and $z_{tab} = 1.96$ (at 5%, two-tailed). As $|z_{cal}| > z_{tab}$, the calculated value lies in the rejection region and H0 is rejected. This implies that the claim of the company is exaggerated and is not supported by our test.
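The diary example can be verified with a short script. This is a minimal sketch of the z-test for a single proportion using the figures from the example; the function name `one_proportion_z` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for a single proportion under H0: P = p0."""
    p_hat = successes / n                    # sample proportion p = x/n
    standard_error = sqrt(p0 * (1 - p0) / n)  # sqrt(PQ/n) under H0
    return (p_hat - p0) / standard_error


alpha = 0.05
z_cal = one_proportion_z(successes=570, n=800, p0=0.75)
z_tab = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

reject_h0 = abs(z_cal) > z_tab
print(f"z_cal={z_cal:.2f}  z_tab={z_tab:.2f}  reject H0: {reject_h0}")
```

Note that the standard error is computed from the hypothesized proportion `p0`, not from the sample proportion, which matches the formula on the preceding slide.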
  61. 61. Fifty people were attacked by the disease and only 45 survived. Will you reject the hypothesis that the survival rate, if attacked by the disease, is 85% in favor of the hypothesis that it is more, at the 5% level? Hint: $p = 45/50 = 0.90$; H0: $P = 0.85$; Ha: $P > 0.85$ (right-tailed); accept H0.
  62. 62. A ketchup manufacturer is in the process of deciding whether to produce a new extra-spicy brand. The company's marketing research department used a national telephone survey of 6000 households and found that the extra-spicy ketchup would be purchased by 335 of them. A much more extensive study made 2 years ago showed that 5% of the households would purchase the brand then. At a 2% significance level, should the company conclude that there is an increased interest in the extra-spicy flavor? Hint: $p = 335/6000 \approx 0.0558$; H0: $P = 0.05$; Ha: $P > 0.05$; H0 rejected.
  63. 63. A manufacturer claims that at least 95% of the equipment he supplied to a factory conformed to the specification. An examination of a sample of 200 pieces of equipment revealed that 18 were faulty. Test the claim of the manufacturer. Hint: H0: $P = 0.95$; Ha: $P < 0.95$; $p = 1 - 18/200 = 0.91$; H0 rejected.