Lennox argues that the problem lies not with statistical methods but with misleading training for non-statisticians. The talk aims to establish that statistics is not just a set of numerical procedures but a distinctive way of thinking about and solving problems. Real-world examples demonstrate the pitfalls of "procedural" statistics and show that non-statisticians can succeed by approaching statistical challenges the same way they approach problems in their own field of expertise, drawing on the statistical expertise available at the laboratory as necessary.

Published in:
Data & Analytics


- 1. LLNL-PRES-670181. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC. Everything Wrong with Statistics (and How to Fix It). Kristin P. Lennox, Director of Statistical Consulting. July 29, 2015.
- 2. Crisis! Essay (open access, freely available online): […] factors that influence this problem and some corollaries thereof. Modeling the Framework for False Positive Findings. Several methodologists have pointed out [9–11] that the high rate of nonreplication (lack of confirmation) of research discoveries is a consequence of the convenient, yet ill-founded strategy of claiming conclusive research findings solely on the basis of a single study assessed by formal statistical significance, typically for a p-value less than 0.05. Research is not most appropriately represented and summarized by p-values, but, unfortunately, there is a widespread notion that medical research articles should be interpreted based only on p-values. Research findings are defined here as any relationship reaching […] is characteristic of the field and can vary a lot depending on whether the field targets highly likely relationships or searches for only one or a few true relationships among thousands and millions of hypotheses that may be postulated. Let us also consider, for computational simplicity, circumscribed fields where either there is only one true relationship (among many that can be hypothesized) or the power is similar to find any of the several existing true relationships. The pre-study probability of a relationship being true is R/(R + 1). The probability of a study finding a true relationship reflects the power 1 − β (one minus the Type II error rate). The probability of claiming a relationship when none truly exists reflects the Type I error rate, α. Assuming that c relationships are being probed in the field, the expected values of the 2 × 2 table are given in Table 1. After a research finding has been claimed based on achieving formal statistical significance, the post-study probability that it is true is the positive predictive value, PPV.
Why Most Published Research Findings Are False. John P. A. Ioannidis. Summary: There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may […] It can be proven that most claimed research findings are false.
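
The quantities named in the excerpt (R, α, β, PPV) can be tied together with a few lines of arithmetic. The sketch below is not taken from the slides. It assumes the simplest version of the framework, with no bias and a single study: if c relationships are probed, about c·R/(R + 1) are true and c/(R + 1) are not, so the share of significant findings that are true is (1 − β)R / ((1 − β)R + α). The function names and the example values of R are illustrative assumptions.

```python
def pre_study_probability(R: float) -> float:
    """Probability that a probed relationship is true, given the ratio R
    of true relationships to no relationships in the field."""
    return R / (R + 1)


def positive_predictive_value(R: float, alpha: float = 0.05, beta: float = 0.2) -> float:
    """Post-study probability that a statistically significant finding is true.

    Assumes no bias and a single study (a simplification).
    alpha is the Type I error rate; beta is the Type II error rate,
    so the power is 1 - beta.
    """
    true_positives = (1 - beta) * R   # proportional to true relationships that reach significance
    false_positives = alpha           # proportional to null relationships that reach significance
    return true_positives / (true_positives + false_positives)


# Illustrative values of R (assumptions, not figures from the talk).
for R in (1.0, 0.1, 0.01):
    print(f"R = {R:5.2f}  pre-study = {pre_study_probability(R):.3f}  "
          f"PPV = {positive_predictive_value(R):.3f}")
```

With the same α and power, the PPV falls sharply as R shrinks, which is the sense in which a field that tests many unlikely hypotheses can produce mostly false "significant" findings.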
- 3. What’s going on? Statistics is popular and important! Statisticians are rare. Statistics training isn’t working.
- 4. STAT 101 is Procedural. 1. Check your data type. 2. Select inference method. 3. Calculate required sample statistics. 4. Look up critical values. … N. Report result.
- 5. Real Statistics Isn’t
- 6. Comprehensive Plan for Reform of All Statistics. 1) Show the problems with “cookbook statistics”. 2) Demonstrate real statistical thinking. 3) Help as needed.
- 7. Golden Rules of Statistics (What Statisticians REALLY Do): Know thy problem. Know thy tools. Know thy data.
- 8. Know Thy Problem. STAT 101: Determine the appropriate analysis by looking at the data, e.g. two numeric variables = linear regression.
- 9. Know Thy Problem. STAT 101: Determine the appropriate analysis by looking at the data, e.g. two numeric variables = linear regression. Appropriate data AND appropriate analysis depend on the real-world problem.
- 10. The Million Dollar Binomial Distribution
- 11. Know Thy Tools. STAT 101: Statistical methods are selected according to their appropriateness to the data and the correctness of their assumptions. STAT 101: Statistical procedures, used correctly, yield unambiguous results.
- 12. Know Thy Tools. STAT 101: Statistical methods are selected according to their appropriateness to the data and the correctness of their assumptions. STAT 101: Statistical procedures, used correctly, yield unambiguous results. Statistical models work the same way that other scientific and engineering models work. Their validity depends on context, and they may be open to interpretation.
- 13. Statistical Methods are Based on Models. $y_i = b_0 + b_1 x_i + \varepsilon_i$. [Figures: scatter plot of y (0 to 150) against x (0 to 50); density curve plotted over −4 to 4.]
- 14. Statistical Methods are Based on Models. $y_i = b_0 + b_1 x_i + \varepsilon_i$. [Figures: scatter plot of y (0 to 150) against x (0 to 50); density curve plotted over −5 to 15.]
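
One way to read these two slides is that the same straight-line model can sit on top of very different error distributions. The sketch below illustrates that idea and does not reconstruct the slides' data: the coefficients, sample size, and both noise distributions are assumptions. It fits the line by ordinary least squares and then inspects the residuals, which is where a departure from the assumed error model shows up.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1 * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def skewness(values):
    """Sample skewness: close to 0 for symmetric data, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def simulate(noise, n=2000, b0=10.0, b1=3.0, seed=0):
    """Simulate y = b0 + b1*x + noise, fit the line, and return
    the estimated slope and the skewness of the residuals."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 50) for _ in range(n)]
    ys = [b0 + b1 * x + noise(rng) for x in xs]
    est_b0, est_b1 = fit_line(xs, ys)
    residuals = [y - (est_b0 + est_b1 * x) for x, y in zip(xs, ys)]
    return est_b1, skewness(residuals)


def normal_noise(rng):
    return rng.gauss(0, 10)                 # symmetric errors, sd 10


def skewed_noise(rng):
    return rng.expovariate(1 / 10) - 10     # mean 0, sd 10, long right tail


for name, noise in (("normal", normal_noise), ("skewed", skewed_noise)):
    slope, skew = simulate(noise)
    print(f"{name:>6} errors: slope = {slope:.3f}, residual skewness = {skew:.2f}")
```

In this setup the slope estimate is similar in both cases, but any statement that leans on the shape of the error distribution (for example, a prediction interval or an extreme percentile) would be affected by the skew.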
- 15. A Wise Man Once Said… “Essentially, all models are wrong, but some are useful.” – George E. P. Box
- 16. How to Evaluate Explosives Safety. A METHOD FOR OBTAINING AND ANALYZING SENSITIVITY DATA. W. J. DIXON, University of Oregon, and A. M. MOOD, Iowa State College. The standard method of dealing with sensitivity of dosage-mortality data is the probit technique developed by Bliss and Fisher. This paper provides an alternative technique based on a special system for obtaining such data. It has some advantages when observations must be taken on individuals rather than groups of individuals, and it may be preferred in certain other situations. INTRODUCTION. Experimental investigations often deal with continuous variables which cannot be measured in practice. For example, in testing the sensitivity of explosives to shock, a common procedure is to drop a weight on specimens of the same explosive mixture from various heights. There are heights at which some specimens will explode, and others will not, and it is assumed that those which will not explode would explode were the weight dropped from a sufficiently greater height. It is supposed, therefore, that there is a critical height associated with each specimen, and that the specimen will explode when the weight is dropped from a greater height and will not explode when the weight is dropped from a lesser height. The population of specimens is thus characterized by a continuous variable, the critical height, which cannot be measured. All one can do is select some height arbitrarily and determine whether the critical height for a given specimen is less than or greater than the selected height. This situation arises in many fields of research. Thus in testing insecticides, a critical dose is associated with each insect, but one cannot […]
[Figure: Up-and-Down Test Demo. Horizontal axis: Test (0 to 60); vertical axis: Normalized Height (−2 to 2); each test is plotted as an x or an o.]
- 17. How NOT to Evaluate Explosives Safety. [The slide repeats the Dixon and Mood excerpt and the Up-and-Down Test Demo figure from slide 16.] “…the up and down method is particularly effective for estimating the mean. It is not a good method for estimating small or large percentage points (for example, the height at which 99 per cent of specimens explode) unless normality of the distribution is assured.” – Dixon and Mood
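
A small simulation can show why the up-and-down design concentrates information near the mean. The sketch below makes several assumptions that are not in the slides: critical heights are standard normal (matching the "normalized height" axis), the step size is 0.5, and the mean is estimated crudely as the average tested height after a short burn-in, which is not the estimator derived by Dixon and Mood.

```python
import random
import statistics


def up_and_down(critical_height, start, step, n_tests):
    """Run one up-and-down sequence.

    critical_height() returns the hidden critical height of a fresh specimen.
    After an explosion the next drop is one step lower; after a
    non-explosion it is one step higher.
    Returns a list of (height, exploded) pairs.
    """
    height = start
    record = []
    for _ in range(n_tests):
        exploded = height > critical_height()
        record.append((height, exploded))
        height += -step if exploded else step
    return record


def estimate_mean(seed, n_tests=60, start=2.0, step=0.5, burn_in=10):
    """Crude estimate of the mean critical height (true value 0 here):
    the average tested height after discarding the first few drops."""
    rng = random.Random(seed)
    record = up_and_down(lambda: rng.gauss(0.0, 1.0), start, step, n_tests)
    return statistics.fmean([h for h, _ in record[burn_in:]])


rng = random.Random(1)
record = up_and_down(lambda: rng.gauss(0.0, 1.0), start=2.0, step=0.5, n_tests=60)
far_out = sum(1 for h, _ in record if abs(h) >= 2.0)
print(f"tests at |height| >= 2: {far_out} of {len(record)}")

estimates = [estimate_mean(seed) for seed in range(200)]
print(f"average estimate of the mean over 200 runs: {statistics.fmean(estimates):.3f}")
```

Because the sequence keeps returning to heights where explosions and non-explosions are about equally likely, very few drops land near the 1% or 99% points. Any estimate of those percentiles therefore rests on the assumed distribution rather than on data, which is the caution in the Dixon and Mood quote.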
- 18. A Note on Statistical Significance (the following statements reflect only the author’s opinion, and should not be construed to reflect those of LLNL, the Applied Statistics Group, or any other person, statistician or not, living or dead). • There isn’t anything wrong with p-values … but p = 0.0501 is the same as p = 0.0499. • There isn’t anything wrong with statistical hypothesis testing … but it isn’t the right tool for making all decisions. These procedures aren’t broken. They are misused. This does not mean that you should keep using them.
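
To see how little separates p = 0.0501 from p = 0.0499, it helps to convert each back to a test statistic. The sketch below assumes a two-sided z-test, a choice made only for illustration; the slide does not name a test.

```python
from statistics import NormalDist


def z_for_two_sided_p(p: float) -> float:
    """Absolute z statistic that yields the given two-sided p-value
    under a standard normal reference distribution."""
    return NormalDist().inv_cdf(1 - p / 2)


for p in (0.0499, 0.0501):
    print(f"p = {p:.4f}  ->  |z| = {z_for_two_sided_p(p):.4f}")
```

The two statistics differ only in the third decimal place, so treating one result as a discovery and the other as a null result reflects the threshold, not the evidence.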
- 19. Know Thy Data. Parametric models are (of course) sensitive to assumptions, but purely data-driven approaches are far more robust to “cookbook” approaches.
- 20. Know Thy Data. Parametric models are (of course) sensitive to assumptions, but purely data-driven approaches are far more robust to “cookbook” approaches. There are multiple cautions and caveats when using “big data” approaches. The most important is that you have to start with the right data.
- 21. Jackie’s Improbable Sister. Jackie is a girl in a family with two children. What is the probability that Jackie has a sister? A. 1/2 B. 1/3 C. 0 or 1, but we don’t know which
- 22. Jackie’s Improbable Sister. Jackie is a girl in a family with two children. What is the probability that Jackie has a sister? A. 1/2 B. 1/3. How did we find Jackie?
- 23. Option A: 1/2. 1) Pick a two-child family at random. 2) Pick a child from the family at random.
- 24. Option A: 1/2. 1) Pick a two-child family at random. 2) Pick a child from the family at random. Two girls have sisters and two girls have brothers.
- 25. Option B: 1/3. 1) Pick a two-child family with at least one girl at random. 2) Report one girl’s name for each family.
- 26. Option B: 1/3. 1) Pick a two-child family with at least one girl at random. 2) Report one girl’s name for each family. Of three possible families, only one has girls with sisters.
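
The two sampling schemes for finding Jackie can be checked by simulation. The sketch below assumes that each child is independently a girl or a boy with probability 1/2; the function names and sample size are illustrative. Option A is read as "keep the draw only when the randomly picked child turns out to be a girl", which is how a girl named Jackie would come out of that scheme.

```python
import random


def random_family(rng):
    """Two children, each independently 'G' or 'B' with equal probability (assumption)."""
    return [rng.choice("GB"), rng.choice("GB")]


def option_a(n_families, rng):
    """Pick a two-child family at random, then a child at random.
    Among draws where that child is a girl, return the share with a sister."""
    girls = with_sister = 0
    for _ in range(n_families):
        children = random_family(rng)
        picked = rng.randrange(2)
        if children[picked] == "G":
            girls += 1
            if children[1 - picked] == "G":
                with_sister += 1
    return with_sister / girls


def option_b(n_families, rng):
    """Pick a two-child family with at least one girl and report one girl per family.
    Return the share of reported girls who have a sister."""
    families = with_sister = 0
    for _ in range(n_families):
        children = random_family(rng)
        if "G" in children:
            families += 1
            if children == ["G", "G"]:
                with_sister += 1
    return with_sister / families


rng = random.Random(0)
print(f"Option A: {option_a(100_000, rng):.3f}")
print(f"Option B: {option_b(100_000, rng):.3f}")
```

The same question gets two different answers because the two schemes sample different populations: Option A samples girls, while Option B samples families.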
- 27. Real (and Expensive) Problem. 1948, 1954, 2013. [Slide shows a cropped 2013 news page.] EPIDEMIOLOGY: When Google got flu wrong. US outbreak foxes a leading web-based method for tracking seasonal flu. By Declan Butler. When influenza hit early and hard in the United States this year, it quietly claimed an unacknowledged victim: one of the cutting-edge techniques being used to monitor the outbreak. A comparison with traditional surveillance data showed that Google Flu Trends, which estimates prevalence from flu-related Internet searches, had drastically overestimated peak flu levels. The glitch is no more than a tempo[…] complement, but not substitute for, traditional epidemiological surveillance networks. “It is hard to think today that one can provide disease surveillance without existing systems,” says Alain-Jacques Valleron, an epidemiologist at the Pierre and Marie Curie University in Paris, and founder of France’s Sentinelles monitoring network. “The new systems depend too much on old existing ones to be able to live without them,” he adds. This year’s US flu season started around November and seems to have peaked just after […] [The remaining column of the article is cut off at the slide edge.] Photo caption: The latest US influenza season is more severe and has caused more deaths than usual.
- 28. To summarize…
- 29. Don’t:
- 30. Do: Know thy problem. Know thy tools. Know thy data.
- 31. Do: Know thy problem. Know thy tools. Know thy data.
- 32. The LLNL Statistical Consulting Service provides up to 4 hours of assistance free of charge for LLNL projects. When in doubt: stats-consulting@llnl.gov https://data-analytics.llnl.gov/statistical_consultants Thank you!
- 33. Image sources: Wikipedia: Betty Crocker Cookbook, Salk Polio Vaccine. Wikipedia (CC BY-SA 3.0): George Box. Harry S. Truman Library: Bernard Dickmann with Harry S. Truman. Library of Congress: Chicago Tribune Headline. WPClipart: Plain Unicorn. LLNL: NIF, Drop Hammer, Sigma the Statistics Unicorn.
