This document discusses factors that influence the selection of appropriate statistical techniques. It begins by explaining the main statistical methods used in analysis: descriptive statistics, inferential statistics, and regression analysis. It then discusses three main factors that influence technique selection: the study aim and objectives, the nature of observations as paired or unpaired, and the type and distribution of data. Specific statistical tests are recommended based on different data characteristics, such as whether data is continuous and normally distributed. The impacts of selecting the wrong statistical technique are also reviewed.
3. Selection of Statistical Methods
• To select an appropriate statistical method, one needs to know:
• The assumptions and conditions of the statistical methods
• The main methods used in statistical analysis
• Descriptive statistics: summarize data using indices such as the mean and SD
• Inferential statistics: draw conclusions from data using statistical tests such as the t-test, chi-square test, and ANOVA
Prabesh Ghimire, MPH 3
4. Factors influencing selection of statistical technique
• Aim and Objective of the study
• Nature of observations: Paired or unpaired
• Type and distribution of the data used
5. Factors influencing selection of statistical technique
Aim and Objective of the study
• Statistical technique depends on the aim and objective of the study
Objective | Technique
To describe something / to find out the prevalence | Descriptive analysis (mean, median, SD, percentage)
To find out the association… / relationship | Chi-square test / other tests..
To find out the predictors / risk factors of outcome variable | Regression analysis
To compare the effectiveness of two different drugs | Independent sample t-test
6. Factors influencing selection of statistical technique
Nature of Observations: Paired or unpaired
• Paired: the same subjects are assessed at different time points or using different methods
• Unpaired (independent): each group has different participants
• When data are paired: paired sample t-test / Wilcoxon signed-rank test
• When data are unpaired: independent sample t-test / Mann-Whitney U test
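The difference between the paired and unpaired calculations can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the blood pressure readings are hypothetical, not taken from the slides, and only the t statistics are computed (no p-values).

```python
import math
import statistics


def independent_t(a, b):
    """Pooled-variance (Student) t statistic for two independent groups."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(before, after):
    """Paired t statistic: a one-sample t-test on within-subject differences."""
    diffs = [x - y for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Hypothetical readings for the same five subjects at two time points
before = [120, 132, 128, 141, 125]
after = [118, 129, 127, 137, 123]

t_ind = independent_t(before, after)  # ignores the pairing
t_pair = paired_t(before, after)      # uses the pairing
print(f"independent t = {t_ind:.3f}, paired t = {t_pair:.3f}")
```

With these made-up readings the paired statistic is much larger, because pairing removes the between-subject variability from the standard error.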
7. Factors influencing selection of statistical technique
Type and distribution of data used
• For the same objective, the choice of statistical test varies with the data type.
• For nominal, ordinal and discrete data: non-parametric methods
• For continuous data: parametric as well as non-parametric methods
• In regression analysis
• For categorical outcome:
• Dependent variable has two categories: binary logistic regression
• Dependent variable has more than two categories: ordinal logistic regression (ordered categories) / multinomial logistic regression (unordered categories)
• For continuous outcome: linear regression
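The regression rules above can be written as a small lookup function. This is only a sketch of the slide's decision rule; the function name, its parameters, and the use of an `ordered` flag to separate ordinal from multinomial logistic regression are assumptions made for illustration.

```python
def choose_regression(outcome_type, n_categories=None, ordered=False):
    """Return the regression model suggested for a given outcome variable.

    outcome_type: "continuous" or "categorical" (hypothetical labels).
    n_categories: number of outcome categories, for categorical outcomes.
    ordered: whether the categories have a natural order.
    """
    if outcome_type == "continuous":
        return "linear regression"
    if outcome_type == "categorical":
        if n_categories is None or n_categories < 2:
            raise ValueError("a categorical outcome needs at least two categories")
        if n_categories == 2:
            return "binary logistic regression"
        if ordered:
            return "ordinal logistic regression"
        return "multinomial logistic regression"
    raise ValueError(f"unknown outcome type: {outcome_type!r}")


print(choose_regression("categorical", n_categories=2))
print(choose_regression("categorical", n_categories=4, ordered=True))
print(choose_regression("continuous"))
```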
8. Factors influencing selection of statistical technique
Type and distribution of data used: CONTINUOUS VARIABLE
• If a continuous variable follows a normal distribution, the mean is the representative measure
• For non-normal data, the median is the most appropriate measure of the data set
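A short example shows why the median is preferred for skewed data. The values below are hypothetical lengths of hospital stay, chosen so that one extreme observation pulls the mean away from the bulk of the data.

```python
import statistics

# Hypothetical length-of-stay values (days); one extreme value skews the data
stays = [2, 3, 3, 4, 4, 5, 5, 6, 60]

mean_stay = statistics.mean(stays)      # pulled upward by the value 60
median_stay = statistics.median(stays)  # middle value, unaffected by the outlier

print(f"mean = {mean_stay:.1f} days, median = {median_stay} days")
```

Here the mean is above 10 days although eight of the nine patients stayed 6 days or fewer, while the median of 4 days describes the typical patient.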
9. Factors influencing selection of statistical technique
Type and distribution of data used
• We want to compare the hemoglobin level between treatment and control groups
• If hemoglobin level follows a normal distribution: independent sample t-test
• If it follows a non-normal distribution: Mann-Whitney U test
10. Factors influencing selection of statistical technique
Type and distribution of data used
• We want to compare the hemoglobin level of women before and after intervention
• If hemoglobin level follows a normal distribution: paired sample t-test
• If it follows a non-normal distribution: Wilcoxon signed-rank test
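The two hemoglobin examples combine the pairing and distribution factors into a four-way choice, which can be sketched as a small function. The function name and its boolean parameters are hypothetical; how normality is assessed is left outside the sketch.

```python
def choose_two_group_test(paired: bool, normal: bool) -> str:
    """Pick a two-sample test from the study design and the data distribution."""
    if paired:
        return "paired sample t-test" if normal else "Wilcoxon signed-rank test"
    return "independent sample t-test" if normal else "Mann-Whitney U test"


# Treatment vs. control groups, hemoglobin not normally distributed
print(choose_two_group_test(paired=False, normal=False))
# Same women before and after the intervention, hemoglobin normally distributed
print(choose_two_group_test(paired=True, normal=True))
```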
11. Impacts of wrong selection of statistical technique
12. Impact of wrong selection..
• In a study, the systolic blood pressure (Mean ± SD) of the control and intervention groups was:
• Control: (126.45 ± 8.85, n1 = 20)
• Intervention: (121.85 ± 5.96, n2 = 20)
On independent sample t-test, the result showed
• The mean difference between the two groups was not statistically significant (p = 0.061)
On paired sample t-test, the result showed
• The mean difference was statistically significant (p = 0.011)
• The same data lead to opposite conclusions, so the test must match the study design
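The independent sample result can be checked from the summary statistics alone. The sketch below reads the slide's figures as mean ± SD of 126.45 ± 8.85 and 121.85 ± 5.96 with 20 subjects per group, assumes a pooled-variance t-test, and compares the statistic with the two-sided 5% critical value for 38 degrees of freedom taken from standard t tables. The paired result cannot be reproduced this way because it needs the individual differences.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


t_stat, df = t_from_summary(126.45, 8.85, 20, 121.85, 5.96, 20)

# Two-sided 5% critical value of the t distribution with 38 df (from tables)
T_CRIT_05_DF38 = 2.024
significant = abs(t_stat) > T_CRIT_05_DF38

print(f"t = {t_stat:.3f}, df = {df}, significant at 5%: {significant}")
```

The statistic of about 1.93 falls just short of the critical value, which agrees with the reported p = 0.061.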
13. Further reading
• Khusainova, R. M., Shilova, Z. V., & Curteva, O. V. (2016). Selection of appropriate statistical methods for research results processing. International Electronic Journal of Mathematics Education, 11(1), 303-315.
• Mishra, P., Pandey, C. M., Singh, U., Keshri, A., & Sabaretnam, M. (2019). Selection of appropriate statistical methods for data analysis. Annals of Cardiac Anaesthesia, 22(3), 297.
15. Data Presentation
• Data: set of facts
• Data are collected in raw format
• Raw data should be summarized, processed, and analyzed
• Methods of presentation must be determined according to:
• Data format
• Method of analysis to be used
• Information to be emphasized
16. Ways of data presentation
• Three broad ways:
• As text
• In tabular form
• In graphical forms
17. Text Presentation
• Method of explaining results and trends in textual form
• Data are presented in paragraphs or sentences
• Data, which are often numbers and figures, are better presented in tables and graphs
• Interpretations, by contrast, are better stated in text
• If there are only a few variables, the data can be presented in text alone
• For example, the majority of diabetic patients enrolled in the study were male (80%) compared to female (20%).
18. Text Presentation: Basic Rules
• Do not explain all the data available in the table or graph
• Only important points and results are to be highlighted in the text
• Avoid jargon and vague, subjective expressions
• Example: "Remarkably decreased", "was extremely high" and "obviously lower"
• Exact values in the data will show just how remarkable, how extreme or how obvious the findings are.
19. Table Presentation
• Most widely used in academic research
• Data are presented in rows and columns
• Can present both qualitative and quantitative information
• Can accurately present information that cannot be presented with a graph.
• Example: a number such as 132.145 can be accurately represented in a table
• Information with different units can be presented together.
20. Table Presentation
• Interpretation of information takes longer in tables than in graphs
• Tables are not appropriate for studying data trends
21. Basic rules for table presentation
Ideally every table should:
• Be self-explanatory
• Present values with the same number of decimal places in all its cells (standardization)
• Include a title stating what is being described and where, as well as the number of observations (n)
• Have a structure formed by three horizontal lines, defining the table heading and the end of the table at its lower border
22. Basic rules for table presentation
• Not have vertical lines at its lateral borders
• Provide additional information in a table footer, when needed
• Be inserted into a document only after being mentioned in the text
• Be numbered with Arabic numerals
• Have numbers aligned right and text aligned left
• Fit the window
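Several of these rules (a title with n, a fixed number of decimal places, three horizontal lines, no vertical borders, right-aligned numbers and left-aligned text) can be illustrated with a small plain-text formatter. This is a sketch only: the function name, its parameters, and the hemoglobin figures are hypothetical, and it assumes the first column holds text and every other column holds numbers.

```python
def format_table(title, header, rows, decimals=2):
    """Render a plain-text table with three horizontal rules and no vertical lines."""
    # Format every numeric cell with the same number of decimal places
    cells = [[f"{value:.{decimals}f}" for value in row[1:]] for row in rows]
    text_width = max(len(header[0]), *(len(row[0]) for row in rows))
    num_widths = [
        max(len(name), *(len(c[i]) for c in cells))
        for i, name in enumerate(header[1:])
    ]

    def line(label, values):
        # Text is left-aligned, numbers (and their headings) are right-aligned
        return label.ljust(text_width) + "".join(
            "  " + v.rjust(w) for v, w in zip(values, num_widths)
        )

    head = line(header[0], header[1:])
    rule = "-" * len(head)
    body = [line(row[0], c) for row, c in zip(rows, cells)]
    return "\n".join([title, rule, head, rule, *body, rule])


# Hypothetical summary data for illustration
table = format_table(
    "Table 1. Hemoglobin level (g/dL) by study group (n = 40)",
    ["Group", "Mean", "SD"],
    [["Control", 11.25, 1.3], ["Intervention", 12.1, 0.95]],
)
print(table)
```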
24. Graphical Presentation
• Graphs simplify complex information by using images and emphasizing data patterns or trends
• Useful for summarizing, explaining or exploring quantitative data.
25. Different Types of Graphical Presentation
• Bar Graphs
• Histogram
• Scatter Plot
• Pie Chart
• Box and Whisker Plot
• Stem and Leaf Plot
26. Basic Rules for Graphical Presentation
Graphs should
• Include, below the figure, a title providing all relevant information;
• Be referred to as figures in the text;
• Identify figure axes by the variables under analysis;
• Quote the source which provided the data, if required;
• Demonstrate the scale being used; and
• Be self-explanatory.
27. Data Presentation
• Presentation of categorical variables
• Table, bar graph or pie chart
• To assess the relationship between two variables, a contingency table may be used
• Presentation of numerical variables
• Table, histogram, frequency polygon, scatter plot
29. Concept
• Inferential statistical methods fall into two categories:
• Parametric and
• Nonparametric.
30. Parametric Methods
• Statistical methods that are used to compare means are called parametric
• Two key parameters: mean and standard deviation
• Used with continuous (interval or ratio) data
• Parametric tests rely on the assumption that the variable is continuous and follows an approximately normal distribution.
• Examples: all types of t-test, F test, Pearson's correlation coefficient, linear regression
31. Parametric Methods
• Student's t-test is used to compare the means between two groups
• The F test (one-way ANOVA, repeated measures ANOVA) is used to compare means among three or more groups.
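The F statistic of a one-way ANOVA is the ratio of the between-group to the within-group mean square. The sketch below computes it with the standard library only; the function name and the three small groups are hypothetical, and no p-value is derived.

```python
import statistics


def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA across k independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical measurements in three groups
f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f} with 2 and 6 degrees of freedom")
```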
32. Non-Parametric Methods
• Alternative to parametric tests for data with skewness or extreme asymmetry, especially in small samples
• Statistical methods used to compare quantities other than means (e.g., medians, mean ranks, proportions) are called non-parametric methods.
• Applied to ordinal or nominal data
33. Non-Parametric Methods
• When data are continuous with a non-normal distribution, or are of any type other than continuous, nonparametric methods are used.
• Fortunately, the most frequently used parametric methods have nonparametric counterparts.
34. Non-Parametric Methods
• Nonparametric methods are useful when the assumptions of a parametric test are violated; the nonparametric alternative can then serve as a backup analysis
• Examples:
• Mann Whitney U test
• Wilcoxon test
• Kruskal-Wallis H test
• Median test
• Friedman test
• Log linear regression
• Spearman rank correlation coefficient
• Pearson's Chi-square test
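Most of these tests work on ranks rather than on the raw values. As an illustration, the sketch below computes the Mann-Whitney U statistic from average ranks using only the standard library; the function names and the sample values are hypothetical, and the step from U to a p-value (exact tables or a normal approximation) is not shown.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical measurements in two independent groups
u = mann_whitney_u([1.1, 2.3, 2.9, 3.5], [3.0, 4.2, 5.1, 6.4])
print(f"U = {u}")
```

A small U means the two groups barely overlap in rank, which is the evidence the test uses in place of a difference in means.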
35. Parametric Vs Non-Parametric Methods
Basis of Comparison | Parametric Test | Non-Parametric Test
Scale of Measurement | Interval/Ratio | Nominal/Ordinal
Distribution | Normal | Normal or not
Variance | Equal variance | Different variance
Sample size | Large | Small
Selection | Random sample | Random/ Non-random
Power | More Power | Less Power