Intro_BiostatPG.ppt

Introduction to Biostatistics
Dr. Karunambigai.M
Public Health Sciences Department
KI University
Data
• Research is any process by which information is
systematically and carefully gathered for the purpose of
answering questions, examining ideas, or testing
theories.
• Numerical information collected as part of any research
is called data. Depending on the nature of the problem,
the data may relate to individuals, families, houses,
villages, etc.
• The data collected are known as observations. The
individual subjects upon whom the data are collected
are known as statistical units.
Variables
• The characteristics or events that are measured on a
subject in a research study are called variables
because they vary (i.e., they take different values in
different subjects, or vary from one subject to
another).
• Variables are measured according to two broad
types of measurement scales: Numerical &
Categorical (otherwise known as Quantitative &
Qualitative).
Types of datasets and their measures
• Population - dataset consisting of all outcomes,
measurements, or responses of interest.
• Sample - dataset which is a subset of the
population.
• Parameter - a numerical measurement made
using the population.
• Statistic - a numerical measurement made using
a sample.
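The distinction between a parameter and a statistic can be sketched in a few lines of Python. The blood pressure values below are invented for illustration, and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are assumptions introduced here, not part of the original slides.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) of all 12 members
# of a very small community. The values are invented for illustration.
population = [118, 122, 130, 125, 140, 135, 110, 128, 132, 121, 138, 127]

# Parameter: a numerical measurement computed from the whole population.
parameter_mean = statistics.mean(population)

# Sample: a subset of the population, drawn here by simple random sampling.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=5)

# Statistic: the same measurement computed from the sample only.
statistic_mean = statistics.mean(sample)

print("Parameter (population mean):", parameter_mean)
print("Statistic (sample mean):", statistic_mean)
```

The sample mean will usually differ somewhat from the population mean; that gap is what inferential methods try to quantify.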
Properties of Measurement
• Difference - Different numerals denote different
values the variable can take
• Magnitude - One value can be said to be
more or less than another
• Equal Appearing Interval - Successive numerals
are separated by equal distances from the preceding and succeeding
numbers
• True Zero - Zero has an absolute meaning
Level of Measurement (Measurement Scale)
• Nominal
• Ordinal
• Interval
• Ratio
The measurement levels are considered in the
following hierarchy:
(LOWEST) Nominal - Ordinal - Interval - Ratio (HIGHEST)
Nominal Scale
• Numbers serve as labels.
• Numbers are used only for identification, in one-to-one
correspondence with the objects.
• The only permissible operation is counting.
• Statistical analysis is based on frequency counts,
such as percentages and the mode.
Example: gender, religion, locality, party
affiliation, etc.
Ordinal Scale
• A ranking scale: numbers are assigned to indicate the relative
extent to which an object possesses some
characteristic.
• It can be determined whether an object has more or less
of a characteristic than another object, but not how
much more or less.
• Any series of numbers can be assigned, provided it preserves
the ordered relationship among the objects.
• In addition to the counting operations of the nominal scale, this
scale permits statistics based on percentiles, quartiles, and the
median.
Example: social class, severity of a behavior disorder
Interval Scale
• The distance between adjacent points on the scale is fixed and equal.
• It allows comparison of the differences between
objects.
• Addition and subtraction of scale values
are meaningful.
• The zero point and the unit of measurement are
arbitrary.
• In addition to the statistical techniques applied to
nominal and ordinal data, the arithmetic mean and
standard deviation can be used.
Example: Temperature (Fahrenheit or Celsius)
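A short Python sketch can show why differences are meaningful on an interval scale while ratios are not. The three readings are arbitrary illustrative values, and the helper name `celsius_to_fahrenheit` is an assumption introduced here.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32

# Three illustrative readings in Celsius, equally spaced.
a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Equal differences stay equal after changing the unit and the zero point.
equal_steps_c = (b_c - a_c) == (c_c - b_c)
equal_steps_f = (b_f - a_f) == (c_f - b_f)

# Ratios are not preserved, because the zero point is arbitrary.
ratio_c = b_c / a_c  # 20 / 10
ratio_f = b_f / a_f  # 68 / 50

print("Equal steps in Celsius:", equal_steps_c)
print("Equal steps in Fahrenheit:", equal_steps_f)
print("Ratio in Celsius:", ratio_c, "| ratio in Fahrenheit:", ratio_f)
```

Because the ratio changes with the unit, a statement such as "20 °C is twice as hot as 10 °C" has no fixed meaning, whereas "the two steps are equal" holds in both units.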
Ratio Scale
• Possesses all the properties of the nominal, ordinal, and
interval scales.
• Has an absolute zero point.
• It is meaningful to calculate ratios of scale values.
• All statistical techniques can be applied.
Examples: Income, age, weight, height, and so on
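The hierarchy of scales can be illustrated by the summary statistics each one permits. This is a minimal sketch using Python's standard `statistics` module; all data values and variable names are invented for illustration.

```python
import statistics
from collections import Counter

# Nominal: blood group labels. Only counting is meaningful (frequencies, mode).
blood_groups = ["A", "O", "B", "O", "AB", "O", "A"]
counts = Counter(blood_groups)
mode_group = statistics.mode(blood_groups)

# Ordinal: severity coded 1 < 2 < 3. The order matters, the distances do not,
# so the median is a suitable summary. median_low always returns an observed value.
severity_labels = {1: "mild", 2: "moderate", 3: "severe"}
severity = [1, 3, 2, 2, 1, 3, 2]
median_severity = severity_labels[statistics.median_low(severity)]

# Interval: body temperature in Celsius. Mean and standard deviation apply.
temps_c = [36.5, 37.0, 38.2, 36.8, 37.5]
mean_temp = statistics.mean(temps_c)
sd_temp = statistics.stdev(temps_c)

# Ratio: weight in kg has a true zero, so ratios of values are meaningful.
weight_a, weight_b = 50.0, 75.0
weight_ratio = weight_b / weight_a  # b weighs 1.5 times as much as a

print(counts, mode_group, median_severity, mean_temp, sd_temp, weight_ratio)
```

Each scale also supports the statistics of the scales below it; for example, the mode could be reported for the temperature data as well.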
Categorical variables
• Observations can be placed into one of two (dichotomous) or
more (polychotomous) categories.
• Examples of dichotomous categorical variables:
Male / Female; Pregnant / Not pregnant;
Smoker / Non-smoker; Married / Single
• However, many classifications require more than two
categories. For example, Married / Single / Divorced /
Separated / Widowed; Blood group: A / B / AB / O;
Religion: Hindu / Christian / Muslim, etc. There is no
ordering of the categories.
• These are examples of the nominal scale, in which the
values fall into unordered categories or classes.
Categorical variables
• Often, however, the categories have a natural order, as with the
stages of cancer and social class.
• Example: degree of smoking can be
divided into non-smokers / ex-smokers / light
smokers / heavy smokers. This is an example of an
ordinal scale.
• In ordinal scales, the categories bear an ordered
relationship to one another.
Numerical variables
• Also called quantitative or interval variables. They are
expressed as integers, fractions, or decimals, in which
equal distances exist between successive values. Age,
systolic and diastolic blood pressure, and height are
examples of continuous variables.
• Numerical variables can be further divided into discrete and
continuous. A discrete numerical variable can take only
separated values over a range: the values differ by a fixed
amount, and no intermediate values are possible.
• Examples of discrete numerical variables are the number of
children, the number of ectopic heartbeats, etc.
Numerical variables
• Data that represent measurable quantities but are
not restricted to taking on specified values such as
integers are known as continuous data.
• If the values of the measurement can take any number in
a range, the data are said to be continuous.
• The difference between any two possible data values
can be arbitrarily small. Common examples include height,
weight, temperature, etc.
• Continuous data can be reduced to several
categories (for example, by grouping ages into age bands).
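Reducing continuous data to categories can be sketched as follows. The cut points, labels, and the function name `age_group` are hypothetical choices made only for illustration; real studies would use bands appropriate to the research question.

```python
from bisect import bisect_right

# Hypothetical cut points (in years) and labels, chosen only for illustration.
cut_points = [18, 40, 65]
labels = ["child", "young adult", "middle-aged", "older adult"]

def age_group(age):
    """Map a continuous age in years to an ordered category.

    An age equal to a cut point falls into the higher category.
    """
    return labels[bisect_right(cut_points, age)]

ages = [4.5, 17.9, 18.0, 39.2, 64.99, 65.0, 80.3]
groups = [age_group(a) for a in ages]

for a, g in zip(ages, groups):
    print(a, "->", g)
```

Grouping turns a continuous variable into an ordinal one, which simplifies tabulation but discards the information about where within a band each value lies.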
Discrete vs. Continuous Data
Discrete data - gaps between possible values
Continuous data - no gaps between possible values
Derived Variables
• Used to measure disease in epidemiological studies.
• Rate, ratio, and proportion.
Ratio: quantifies the magnitude of one occurrence or
condition relative to another.
- Expresses the relationship between two numbers
Example: The ratio of males to females in Ethiopia
Proportion: quantifies occurrences in relation to the
population in which these occurrences take place
- Often expressed as a percentage
Example: The proportion of all births that were male
Derived Variables…
• Rate: expresses the probability or risk of disease in
a defined population over a specified period
of time.
Considered to be a basic measure of disease
occurrence.
Example: The number of newly diagnosed breast
cancer cases per 100,000 women per year.
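The three derived variables differ mainly in what goes into the denominator. The following Python sketch works through each one; every count in it is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# All counts below are hypothetical, chosen only to make the arithmetic easy.
male_births = 510
female_births = 490
total_births = male_births + female_births

# Ratio: one quantity relative to another; the numerator is NOT part of
# the denominator.
sex_ratio = male_births / female_births

# Proportion: the numerator IS part of the denominator, so the result lies
# between 0 and 1 and can be expressed as a percentage.
proportion_male = male_births / total_births
percent_male = 100 * proportion_male

# Rate: new events in a defined population over a specified period of time,
# scaled here to "per 100,000 women per year".
new_cases = 120
women_at_risk = 400_000
period_years = 1
rate_per_100k = new_cases * 100_000 / (women_at_risk * period_years)

print("Sex ratio (M:F):", round(sex_ratio, 3))
print("Proportion male:", proportion_male, "=", percent_male, "%")
print("Rate per 100,000 women per year:", rate_per_100k)
```

With these made-up numbers the ratio is about 1.04 males per female, the proportion of male births is 51%, and the rate is 30 new cases per 100,000 women per year.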
Data collection
• There are two sources of data:
• Primary Data
Data measured or collected first hand by the investigator or
the user, directly from the source.
• Secondary Data
Data gathered or compiled from published and
unpublished sources or files.
Planning & Measuring
Planning:
• Identify the source and elements of the data.
• Decide whether to use a sample or a census.
• If sampling is preferred, decide on the sample size,
selection method, etc.
• Decide on the measurement procedure.
• Set up the necessary organizational structure.
Measuring:
• There are different methods of collecting the data.
Methods of collecting primary data
• Survey method
- The investigator makes personal contact with the
informants, either directly or indirectly, and collects the
data (e.g., telephone interviews, mail questionnaires).
- The information collected is more reliable/accurate.
• Experimental method
- Determines whether, and in what manner, variables are
related to each other.
- Carried out by large-scale organizations with R&D departments
to determine cause-and-effect relationships.
- Example: studying the effect of fertilizer on a crop.
Methods of collecting primary data…
• Observation method
- The investigator observes the overall nature of the event
and collects the required data.
- Devices used include automatic recorders, motion pictures,
etc.
- Example: an individual researching the growth of plants or the
behavior of bats observes closely and records the
required information.
- Gives more accurate results and supplementary
information, but is costly and time-consuming.
Secondary data sources
• Official publications of Government
• Publications of research institutions
• Professional bodies
• Economic, trade, and scientific journals
When the source is secondary data, check:
• The type and objective of the situation in which the data were collected.
• That the purpose for which the data were collected is
compatible with the present problem.
• That the nature and classification of the data are appropriate to the
problem.
• That there are no biases or misreporting in the published
data.
Note: Data that are primary for one investigator may be secondary for
another.
Descriptive Vs Inferential Statistics
Depending on how data are used, statistics is
sometimes divided into two main areas or
branches.
• Descriptive Statistics:
- Is concerned with summary calculations, graphs, charts,
and tables.
- Generally characterizes or describes a set of data
elements by graphically displaying the information or
describing its central tendency and how it is
distributed.
• Inferential Statistics:
- Consists of generalizing from samples to populations,
performing estimations and hypothesis tests, determining
relationships among variables, and making predictions.
- Requires statistical techniques based on probability
theory.
• Example: the following are the numbers of malaria patients
treated in a hospital each year from 2001 to 2005: 3645; 4568; 5432; 6751;
7369
If we calculate the average number of malaria patients per year from 2001 to 2005, our
work belongs to the domain of descriptive statistics.
If we predict the number of malaria patients in the year 2015 to be 9917,
our work belongs to the domain of inferential statistics.
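The malaria example can be worked through in Python. The descriptive part (the average) follows directly from the figures above. The predictive part is only an illustrative assumption: the slides do not say how the 2015 figure of 9917 was obtained, so this sketch fits a simple least-squares trend line and does not attempt to reproduce that number. The function name `predict` is hypothetical.

```python
import statistics

years = [2001, 2002, 2003, 2004, 2005]
patients = [3645, 4568, 5432, 6751, 7369]

# Descriptive statistics: summarize the data actually observed.
mean_patients = statistics.mean(patients)

# A simple step beyond description (assumed method, not the slides' method):
# fit a least-squares straight line to the yearly counts.
mean_year = statistics.mean(years)
slope = (
    sum((x - mean_year) * (y - mean_patients) for x, y in zip(years, patients))
    / sum((x - mean_year) ** 2 for x in years)
)

def predict(year):
    """Predicted patient count for a given year under the fitted trend line."""
    return mean_patients + slope * (year - mean_year)

print("Average patients per year, 2001-2005:", mean_patients)
print("Fitted increase per year:", slope)
print("Trend-line value for 2006:", predict(2006))
```

The average is 5553 patients per year. A full inferential analysis would also attach a measure of uncertainty to any prediction, which this sketch omits.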
Thank You