This document discusses the role and importance of statistics in scientific research. It begins by defining statistics as the science of learning from data and communicating uncertainty. Statistics are important for summarizing, analyzing, and drawing inferences from data in research studies. They also allow researchers to effectively present their findings and support their conclusions. The document then describes how statistics are used and are important in many fields of scientific research like biology, economics, physics, and more. It also provides examples of statistical terms commonly used in research studies and some common misuses of statistics.
Role of Statistics in Scientific Research
1. Role of Statistics
in
Scientific Research
A Special Topic Discussion By:
Waruna Kodituwakku & Harsha Perera
2. "Statistics is the grammar of science."
Karl Pearson
"If your experiment needs statistics, you
ought to have done a better experiment...."
Ernest Rutherford
3. Questions
What is statistics?
Why Study Statistics?
What is the importance of statistics in scientific
research?
What is the role of statistics in research?
Describe the importance of statistics in different
fields of study
4. Questions Contd.
What are the statistical terms used in research
studies?
What are the misuses of statistics in a
research?
How to effectively present statistical findings using
tools such as tables, graphs, etc.?
6. “Statistics is the science of learning from data, and of
measuring, controlling, and communicating uncertainty;
and it thereby provides the navigation essential for
controlling the course of scientific and societal advances”
(Davidian and Louis, 2012)
“Statistics is the study of the collection, analysis,
interpretation, presentation and organization of data”.
(Dodge, Y. (2006) The Oxford Dictionary of Statistical
Terms)
7. Statistical methods can be used to summarize or describe a
collection of data; this is called descriptive statistics.
In addition, patterns in the data may be modeled in a way
that accounts for randomness and uncertainty in the
observations, and are then used to draw inferences about the
process or population being studied; this is called inferential
statistics.
8. "Applied statistics" contains descriptive statistics and the application of
inferential statistics.
“Theoretical statistics” consists of the logical arguments underlying the justification
of approaches to statistical inference.
“Mathematical statistics” contains the manipulation of probability
distributions necessary for deriving results related to methods of estimation
and various aspects of computational statistics and the design of
experiments.
10. • Knowledge in statistics provides you with the
necessary tools and conceptual foundations in
quantitative reasoning to extract information
intelligently from this sea of data.
• Statistical methods and analyses are often used to
communicate research findings and to support
hypotheses and give credibility to research
methodology and conclusions.
• It is important for researchers and also consumers of
research to understand statistics so that they can be
informed, evaluate the credibility and usefulness of
information, and make appropriate decisions.
11. What is the importance
of statistics in
scientific research?
12. • Statistics plays a vital role in research. For example,
statistics can be used in data collection, analysis,
interpretation, explanation and presentation. The use of
statistics guides researchers toward proper
characterization, summarization, presentation and
interpretation of research results.
• Statistics provides a framework for research: how to go
about the research, whether to consider a sample or the
whole population, which techniques to use in data collection
and observation, and how to describe the data
(for example, using measures of central tendency).
14. • Statistics is very important when it comes to drawing the
conclusions of the research.
• In this respect, the major purposes of statistics
are to help us understand and
describe phenomena in our world and to help
us draw reliable conclusions about those
phenomena.
15. What is the role of
statistics in
scientific research?
18. Statistics has an important role in determining the
current state of per capita income,
unemployment, population growth rate, housing,
schooling, medical facilities, etc. in a country.
Statistics now holds a central position in almost
every field, such as Industry, Commerce, Trade, Physics,
Chemistry, Economics, Mathematics, Biology,
Botany, Psychology, Astronomy, Information
Technology, etc., so the application of statistics is very
wide.
19. Specialties have evolved to apply statistical theory and methods to
various disciplines. So there are different fields of application of
statistics. Some of those are described below.
• Astrostatistics is the discipline that applies statistical analysis to
the understanding of astronomical data.
• Biostatistics is a branch of biology that studies biological
phenomena and observations by means of statistical analysis, and
includes medical statistics.
• Econometrics is a branch of economics that applies statistical
methods to the empirical study of economic theories and
relationships.
• Business analytics is a rapidly developing business process that
applies statistical methods to data sets to develop new insights
and understanding of business performance & opportunities.
20. • Environmental statistics is the application of statistical methods to
environmental science. Weather, climate, air and water quality are
included, as are studies of plant and animal populations.
• Statistical mechanics is the application of probability theory, which
includes mathematical tools for dealing with large populations, to the
field of mechanics, which is concerned with the motion of particles or
objects when subjected to a force.
• Statistical physics is one of the fundamental theories of physics, and
uses methods of probability theory in solving physical problems.
• Actuarial science is the discipline that applies mathematical and
statistical methods to assess risk in the insurance and finance industries.
21. What are the statistical terms
used in research studies?
22. • Population sampling
• Generalization
• Hypothesis
• Significance tests
• Correlation
• Regression analysis
• Correlation coefficient (r)
• Dependent and Independent variable
• Elasticity
• Standard deviation
• Factor analysis
• t-Test
• Q-Test
• Chi-squared test
• Mann–Whitney U-test
• Wilcoxon matched pairs test
• Fisher's exact test
• One-way ANOVA
23. What are the misuses of
statistics in a research?
24. • Discarding unfavorable data
• Loaded questions
• Overgeneralization
• Biased samples
• Misreporting or misunderstanding of estimated error
• False causality
• Proof of the null hypothesis
• Confusing statistical significance with practical significance
• Misleading graphs
• Data manipulation
• Suspect samples
• Ambiguous averages
26. References
Bu.edu, 'Why Study Statistics » Statistics » Boston University', 2014. [Online]. Available: http://www.bu.edu/stat/undergraduate-program-information/why-study-statistics/. [Accessed: 23 Oct. 2014].
Statistics.laerd.com, 'Understanding Descriptive and Inferential Statistics', 2014. [Online]. Available: https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php. [Accessed: 27 Oct. 2014].
Kent, J., 'Why Statistics Is Important To A Research?', 2013. [Online]. Available: http://education.blurtit.com/335945/why-statistics-is-important-to-a-research. [Accessed: 24 Oct. 2014].
Journalist's Resource, 'Statistical terms used in research studies', 2014. [Online]. Available: http://journalistsresource.org/skills/research/statistics-for-journalists#. [Accessed: 24 Oct. 2014].
Theanalysisfactor.com, 'Factor Analysis: A Short Introduction, Part 1', 2014. [Online]. Available: http://www.theanalysisfactor.com/factor-analysis-1-introduction/. [Accessed: 25 Oct. 2014].
Stattrek.com, 'Regression Example', 2014. [Online]. Available: http://stattrek.com/regression/regression-example.aspx. [Accessed: 26 Oct. 2014].
Ats.ucla.edu, 'What statistical analysis should I use? Statistical analyses using Stata', 2014. [Online]. Available: http://www.ats.ucla.edu/stat/stata/whatstat/whatstat.htm. [Accessed: 26 Oct. 2014].
Editor's Notes
Population sampling
As we discussed in previous discussion sessions, population sampling can be defined as the procedure of taking a subset of a larger population. Inferential statistics seeks to make predictions about a population based on the results observed in a sample of that population.
Generalization
Generalization is the attempt to extend the results of a sample to a population. When generalizing, sampling variation must be taken into account: even if the sample is selected completely at random, different samples from the same population will give somewhat different results. This is why the concept of 'margin of error' is needed.
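The margin of error mentioned above can be sketched for a sample proportion. This is a minimal illustration that assumes a simple random sample and the usual normal approximation (z = 1.96 for roughly 95% confidence); the poll numbers and the function name are hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes simple random sampling and a sample large enough for the
    normal approximation; z = 1.96 corresponds to about 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 520 of 1,000 respondents answer "yes".
n = 1000
p_hat = 520 / n
moe = margin_of_error(p_hat, n)
print(f"estimate = {p_hat:.3f} +/- {moe:.3f}")
```

A larger sample shrinks the margin of error, but only in proportion to the square root of the sample size.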
Hypothesis
A quantitative research process starts with a hypothesis, and the research design is based on this hypothesis. A hypothesis is defined as a proposed explanation for a phenomenon. Well-designed research tests the null hypothesis and rejects it only if the evidence against it is strong enough.
Significance tests
Significance tests are used to determine the probability of obtaining a research result at least as extreme as the one observed if the null hypothesis were true. This probability is called the p-value.
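As a small illustration of a p-value, the sketch below computes an exact one-sided binomial p-value. The scenario (16 heads in 20 coin flips, with the null hypothesis that the coin is fair) and the function name are assumptions made for the example.

```python
from math import comb

def binomial_p_value(k, n, p0=0.5):
    """One-sided p-value: P(X >= k) when X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

# Hypothetical experiment: 16 heads in 20 flips; null hypothesis: fair coin.
p_value = binomial_p_value(16, 20)
print(f"p-value = {p_value:.4f}")
```

A small p-value says the observed result would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.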
Correlation
When two variables move together, they are said to be correlated. Correlation can be either positive, where one variable rises as the other rises, or negative, where the two variables move in opposite directions.
Regression analysis
Regression analysis is used to determine whether there is a relationship between two or more variables and how strong that relationship is. In its simplest form, it involves plotting data points on an X/Y graph and fitting a trend line to the distribution of the data.
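The trend line described above can be sketched with ordinary least squares for a single predictor. The data (hours studied versus test score) and the function name are hypothetical; real analyses would also examine residuals and uncertainty in the fitted coefficients.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```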
Correlation coefficient (r)
The correlation coefficient is used to indicate the degree to which two quantitative variables are related. The most commonly used correlation coefficient is the Pearson coefficient.
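A minimal sketch of the Pearson coefficient, computed directly from its definition. The data and the function name are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship.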
Causation
Causation, or a cause-and-effect relationship, exists when a change in one variable produces a change in another. Unlike correlation, causation flows in only one direction.
Dependent and Independent variable
In a causation relationship, the factor that drives change is the independent variable. The variable that is driven is the dependent variable.
Elasticity
Elasticity is mainly used in economics research. It measures how much a change in one variable affects another.
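One common way to compute elasticity is the midpoint (arc) formula, sketched below for price elasticity of demand. The choice of the midpoint formula, the prices and quantities, and the function name are all assumptions made for illustration.

```python
def arc_elasticity(q1, q2, p1, p2):
    """Midpoint (arc) elasticity: percent change in quantity divided by
    percent change in price, each measured relative to the midpoint."""
    pct_change_q = (q2 - q1) / ((q1 + q2) / 2)
    pct_change_p = (p2 - p1) / ((p1 + p2) / 2)
    return pct_change_q / pct_change_p

# Hypothetical market: price rises from 10 to 12, quantity sold falls from 100 to 80.
elasticity = arc_elasticity(q1=100, q2=80, p1=10, p2=12)
print(f"elasticity = {elasticity:.2f}")
```

A magnitude above 1 is usually read as elastic demand: the quantity changes proportionally more than the price.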
Standard deviation
Standard deviation measures how far values typically lie from the group's mean, giving insight into how much variation there is within a group of values.
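A short sketch of the sample standard deviation, written out from its formula (with the n - 1 divisor). The data values and the function name are hypothetical.

```python
import math

def sample_std(values):
    """Sample standard deviation: square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Hypothetical measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sample_std(data)
print(f"mean = {sum(data) / len(data):.1f}, standard deviation = {sd:.3f}")
```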
Dependent t-test
A data analysis procedure that assesses whether the means of two related groups are statistically different from each other.
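The dependent (paired) t-test can be sketched as a one-sample test on the per-subject differences. The before/after measurements and the function name are hypothetical, and the sketch stops at the t statistic and degrees of freedom; the p-value would come from a t-distribution table or a statistics package.

```python
import math

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for a paired (dependent) t-test."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical resting heart rates of five people before and after training.
before = [72, 75, 68, 80, 77]
after = [70, 71, 66, 79, 72]
t_stat, df = paired_t_statistic(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```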
t-Test
The t-test is used to calculate the confidence interval of a measurement when the population standard deviation is not known. It is also used to compare two averages: it shows whether two samples differ when the data are continuous and normally distributed, by comparing the means of the two groups, relative to their standard deviations, to see whether there is a statistically significant difference between them. The t-test corrects for the uncertainty in the sample standard deviation (s) caused by taking a small number of samples.
Q-Test
This test is used to determine if there is a statistical basis for removing a data point from a data set.
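The Q-test can be sketched as Dixon's Q statistic: the gap between the suspect value and its nearest neighbour, divided by the range of the data. The data and the function name are hypothetical. The critical value 0.412 (10 observations, 90% confidence) is quoted from commonly published Q tables and is an assumption that should be checked against the table you actually use.

```python
def dixon_q(values):
    """Return (suspect_value, Q) for the most extreme point in the data."""
    data = sorted(values)
    data_range = data[-1] - data[0]
    gap_low = data[1] - data[0]      # gap below the rest of the data
    gap_high = data[-1] - data[-2]   # gap above the rest of the data
    if gap_low >= gap_high:
        return data[0], gap_low / data_range
    return data[-1], gap_high / data_range

# Hypothetical replicate measurements.
measurements = [0.189, 0.167, 0.187, 0.183, 0.186,
                0.182, 0.181, 0.184, 0.181, 0.177]
suspect, q = dixon_q(measurements)

# Assumed critical value for n = 10 at 90% confidence (verify in a Q table).
Q_CRITICAL = 0.412
print(f"suspect = {suspect}, Q = {q:.3f}, reject = {q > Q_CRITICAL}")
```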
Chi-squared test
The chi-squared test is used with categorical data to see whether any difference in frequencies between sets of results is due to chance. The chi-square test assumes the expected value of each cell is five or higher.
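A minimal sketch of the chi-squared test for a 2x2 table of counts, including the expected-frequency check mentioned above. The counts and the function name are hypothetical. The p-value shortcut used here is valid only for one degree of freedom, which is the case for a 2x2 table.

```python
import math

def chi_squared_2x2(table):
    """Chi-squared statistic, p-value and expected counts for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected

# Hypothetical counts: rows are two treatments, columns are improved / not improved.
observed = [[30, 20], [15, 35]]
chi2, p_value, expected = chi_squared_2x2(observed)
smallest_expected = min(min(row) for row in expected)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, smallest expected = {smallest_expected}")
```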
Mann–Whitney U-test
The Mann–Whitney U-test is similar to the t-test. It is used when comparing ordinal data that are not normally distributed. Measurements must be at least ordinal (able to be ranked), and the two groups must be independent of each other. The Mann–Whitney U-test could be used to test the effectiveness of an antihistamine tablet compared to a spray in a group of people with hay fever. To do this, you would split the group in half, then give each half a different treatment and ask each person how effective they thought it was. The test could be used to see whether there is a difference in the perceived efficacy of the two treatments.
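The tablet-versus-spray comparison above can be sketched by computing the U statistic from ranks. The ratings (on an assumed 1 to 5 scale) and the function names are hypothetical, and the sketch stops at U; judging significance requires a U table or a statistics package.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ranks = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(smaller + (equal + 1) / 2)
    return ranks

def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Hypothetical effectiveness ratings (1 = poor, 5 = excellent).
tablet = [3, 4, 2, 5, 4]
spray = [2, 1, 3, 2, 1]
u = mann_whitney_u(tablet, spray)
print(f"U = {u}")
```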
Standard error and 95 per cent confidence limits
The standard error and 95 per cent confidence limits allow us to gauge how representative of the real world population the data are.
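A minimal sketch of the standard error of the mean and approximate 95% confidence limits. The heights are hypothetical, and the multiplier 1.96 assumes a normal approximation; for a sample this small, a critical value from the t-distribution would give somewhat wider limits.

```python
import math
import statistics

# Hypothetical sample of heights in centimetres.
heights_cm = [168, 172, 165, 180, 175, 169, 171, 177, 174, 170]

n = len(heights_cm)
mean = statistics.mean(heights_cm)
std_error = statistics.stdev(heights_cm) / math.sqrt(n)

z = 1.96  # normal approximation for about 95% confidence (an assumption)
lower, upper = mean - z * std_error, mean + z * std_error
print(f"mean = {mean:.1f}, SE = {std_error:.2f}, 95% limits = ({lower:.1f}, {upper:.1f})")
```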
Spearman’s rank correlation coefficient
The Spearman’s rank correlation coefficient tests the relationship between two variables in a dataset; for example, is a person’s weight related to their height? If there is a statistically significant relationship, you can reject the null hypothesis, which may be that there is no link between the two variables.
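The height and weight question above can be sketched with Spearman's rank correlation. This version uses the shortcut formula that is only valid when there are no tied values, so the sketch rejects ties explicitly; the data and the function names are hypothetical.

```python
def simple_ranks(values):
    """Ranks from 1 upward; this sketch assumes there are no tied values."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch assumes no tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum(
        (rx - ry) ** 2 for rx, ry in zip(simple_ranks(xs), simple_ranks(ys))
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical heights (cm) and weights (kg) of six people.
heights = [150, 160, 165, 172, 180, 185]
weights = [52, 58, 70, 66, 80, 77]
rho = spearman_rho(heights, weights)
print(f"rho = {rho:.3f}")
```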
Wilcoxon matched pairs test
Like the Mann–Whitney U-test, this test is used for ordinal data that are not normally distributed, but here there is a link between the two datasets. For example, when people are asked to rank how hungry they feel before a meal and again after they have eaten, the same person provides both answers, so the datasets are not independent.
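The hunger example above can be sketched by computing the Wilcoxon signed-rank statistic. The ratings (on an assumed 1 to 10 scale) and the function names are hypothetical. Pairs with no change are dropped, which is one common convention, and the sketch stops at the statistic W rather than a p-value.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ranks = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(smaller + (equal + 1) / 2)
    return ranks

def wilcoxon_signed_rank(before, after):
    """Return (W, number of non-zero differences) for paired data."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), len(diffs)

# Hypothetical hunger ratings (1 = not hungry, 10 = very hungry).
hunger_before = [8, 7, 9, 6, 8, 7]
hunger_after = [3, 4, 5, 6, 2, 8]
w, n_pairs = wilcoxon_signed_rank(hunger_before, hunger_after)
print(f"W = {w} from {n_pairs} non-zero differences")
```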
Fisher's exact test
The Fisher's exact test is used when you want to conduct a chi-square test, but one or more of your cells has an expected frequency of less than five. Remember that the chi-square test assumes that each cell has an expected frequency of five or more, but the Fisher's exact test has no such assumption and can be used regardless of how small the expected frequency is.
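A minimal sketch of Fisher's exact test for a 2x2 table, using the hypergeometric distribution. The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one, which is one common convention; the counts and the function name are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small tolerance guards against floating-point rounding in the comparison.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small table where every expected count is below five.
observed = [[3, 1], [1, 3]]
p_value = fisher_exact_two_sided(observed)
print(f"p = {p_value:.4f}")
```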
One-way ANOVA
A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable with two or more categories and a normally distributed interval dependent variable, and you wish to test for differences in the means of the dependent variable broken down by the levels of the independent variable.
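The comparison of group means can be sketched by computing the one-way ANOVA F statistic, the ratio of between-group to within-group variance. The three groups and the function name are hypothetical, and the sketch stops at F and its degrees of freedom; the p-value would come from an F table or a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical yields under three levels of a treatment.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```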
Discarding unfavorable data
This is one of the main misuses that occur in research. Researchers tend to discard data that are unfavorable to their research study.
Loaded questions
The answers to surveys can often be manipulated by wording the question in such a way as to induce a prevalence towards a certain answer from the respondent. For example, in polling support for a war, the following two questions are likely to produce results skewed in opposite directions, even though both ask about support for the same war:
Do you support the attempt by the USA to bring freedom and democracy to other places in the world?
Do you support the unprovoked military action by the USA?
Overgeneralization
Overgeneralization is a fallacy occurring when a statistic about a particular population is asserted to hold among members of a group for which the original population is not a representative sample.
False causality
The fallacy is that an event or action is claimed to influence another to which it is not reasonably related. Ex.: "There were many strangers in the room, so naturally they began to argue."
Suspect Samples
"Three out of four doctors surveyed recommend brand Misleadatron." If only 4 doctors were surveyed, the results could have been obtained by chance alone; however, if 100 doctors were surveyed, the results might be quite different. But probably not the desired one.
Ambiguous Averages
"There are four commonly used measures that are loosely called averages. They are the mean, median, mode, and midrange." For the same data set, these averages can differ markedly. People who know this can, without lying, select the one measure of average that lends the most evidence to support their position.