- 1. CHAPTER ONE INTRODUCTION AND DEFINITION OF STATISTICS
- 2. INTRODUCTION • The word statistics has two meanings; that is, Statistics can be defined in two senses. • Statistics defined in the plural sense (as data) • In this sense, statistics refers to numerical facts, figures or statistical data, i.e., the raw data themselves, such as statistics of births, statistics of students, statistics of imports and exports, etc. • Statistics defined as a method (singular sense) • The second meaning of statistics refers to the science or discipline of study. • In this sense of the word, Statistics is defined as the science of collecting, presenting, analyzing and interpreting numerical data to make decisions.
- 3. Characteristics of statistics • Not all numerical data are statistics. • Some of the characteristics that numerical data must possess in order to be called statistics are given below. – Statistics should be numerically expressed – They should be aggregates of facts – They should be collected in a systematic manner – They should be collected for a predetermined purpose – They should be placed in relation to each other – They should be enumerated or estimated according to reasonable standards of accuracy – Statistics should be affected to a marked extent by a multiplicity of causes • In general, it can be said that all statistics are numerical facts, but not all numerical facts are statistics.
- 4. BASIC TERMINOLOGIES IN STATISTICS 1. Data: In Statistics, all conclusions are based on facts, and the first step in any statistical investigation is to collect a set of related observations from which conclusions may be drawn. • The related observations that form this set are known as Data. • The word data comes from the Latin singular “datum”, meaning fact. 2. Population: The complete collection of individuals, objects, or measurements that have a characteristic in common, or the totality of related observations in a given study, is described as a population. • The population that is being studied is also called the target population. • A population can be finite (limited in size) or infinite (unrestricted). 3. Sample: A subgroup of the population that will be studied in detail is called a Sample.
- 5. 4. Parameters: statistical measures obtained from population data. These measures may include the mean, variance, standard deviation, etc., and are denoted by μ, σ², σ, etc., respectively. 5. Sample Statistic: a number computed from sample data. Sample statistics are denoted by lower case letters of the alphabet, such as x̄ (the sample mean), s² (the sample variance), etc. 6. Variable: a characteristic under study that assumes different values for different elements. Variables are most often denoted by the letters X, Y, Z, etc. Example 1.3: Height, weight, age, income, expenditure, grade, intelligence, sex, color, etc. 7. Quantitative Variable: a variable that can be expressed numerically, such as height, weight, age, income, expenditure, grade, family size, number of students in a class, etc. 8. Qualitative Variable: a variable that cannot assume a numerical value but can be classified into two or more nonnumeric categories, such as the gender of a person, the language in which a book is written, hair color, and so on.
- 6. • Discrete Variable: a variable whose values are countable, such as family size, number of students in a class, etc. Its values are obtained by counting. • Continuous Variable: a variable that can theoretically assume any numerical value between two given values, such as height or weight. • Observation or Measurement: The value of a variable for an element is called an Observation or Measurement. • Data Set: A data set is a collection of observations on one or more variables. • Sample Frame: A list of the entire population from which items can be selected to form a sample is referred to as a sample frame.
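The distinction between a population parameter and a sample statistic can be illustrated with a minimal sketch. The population below is invented for illustration, and the names `mu`, `sigma_sq`, `x_bar` and `s_sq` are conventional choices assumed here, not notation taken from the slides.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of an imaginary club.
random.seed(42)  # fixed seed so the sketch is reproducible
population = [random.randint(18, 60) for _ in range(1000)]

# Parameters are computed from the whole population.
mu = statistics.fmean(population)            # population mean
sigma_sq = statistics.pvariance(population)  # population variance (divides by N)

# A sample is a subgroup of the population, here drawn at random.
sample = random.sample(population, 50)

# Sample statistics are computed from the sample only.
x_bar = statistics.fmean(sample)    # sample mean
s_sq = statistics.variance(sample)  # sample variance (divides by n - 1)

print(f"parameters: mean = {mu:.2f}, variance = {sigma_sq:.2f}")
print(f"statistics: mean = {x_bar:.2f}, variance = {s_sq:.2f}")
```

The sample values will generally be close to, but not equal to, the population values; that gap is what inferential methods try to quantify.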
- 7. 1.4. CLASSIFICATION OF STATISTICS 1. Descriptive Statistics consists of the collection, organization, presentation and analysis of numerical data. • It is concerned with describing certain characteristics of a set of data (usually a sample): what shape it has, what value the data tend to cluster (converge) around, how much variation is present in the data, and so forth. • In short, Descriptive Statistics describes the nature or characteristics of data without drawing conclusions or making generalizations. • Example 1.4: The average age of athletes who participated in the London Marathon was 25 years; 80% of the employees of the company are male; the marks of 50 students in a Statistics course range from 30 to 85. These are some examples of Descriptive Statistics.
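As a rough illustration of descriptive summaries like those in Example 1.4, the sketch below computes an average, a range and a percentage. All data values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical marks of 10 students (illustrative values only).
marks = [30, 45, 52, 58, 60, 64, 71, 77, 80, 85]

mean_mark = statistics.mean(marks)      # value the data cluster around
median_mark = statistics.median(marks)  # middle value of the ordered data
mark_range = (min(marks), max(marks))   # lowest and highest mark
std_dev = statistics.stdev(marks)       # variation around the mean

# Hypothetical employee records, summarised as a percentage.
employees = ["M", "M", "F", "M", "M"]
percent_male = 100 * employees.count("M") / len(employees)

print(f"mean = {mean_mark}, median = {median_mark}, range = {mark_range}")
print(f"standard deviation = {std_dev:.2f}, male employees = {percent_male}%")
```

Each figure only describes the data at hand; nothing is claimed about students or employees outside these lists.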
- 8. 2. Inferential Statistics • Inferential Statistics is concerned with drawing conclusions (inferences) about specific characteristics of a population based on information obtained from samples. It includes performing hypothesis tests, determining relationships among variables, and making predictions. • Example 1.5: The result obtained from the analysis of the income of 100 randomly selected citizens in Ethiopia suggests that the average per capita income of a citizen in Ethiopia is 30 Birr. • The average income of all families in Ethiopia can be estimated from figures obtained from a few hundred families.
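One common way to draw such an inference is an interval estimate of the population mean. The sketch below is a minimal example under stated assumptions: the incomes are simulated, not real figures for Ethiopia, and the interval uses the large-sample normal approximation, a technique chosen here for illustration rather than one named in the slides.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible
# Hypothetical incomes (in Birr) of 100 randomly selected citizens.
sample_incomes = [random.gauss(3000, 800) for _ in range(100)]

n = len(sample_incomes)
x_bar = statistics.fmean(sample_incomes)  # sample mean
s = statistics.stdev(sample_incomes)      # sample standard deviation

# Standard error of the mean and an approximate 95% confidence interval.
standard_error = s / math.sqrt(n)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
lower = x_bar - z * standard_error
upper = x_bar + z * standard_error

print(f"estimated population mean: {x_bar:.0f} Birr")
print(f"approximate 95% interval: {lower:.0f} to {upper:.0f} Birr")
```

The interval is a statement about the unobserved population mean, which is what separates this from a purely descriptive summary of the 100 observations.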
- 9. 1.5 STEPS OF STATISTICAL INVESTIGATION 1. Collection of data 2. Organization of data 3. Presentation 4. Analysis of Data 5. Interpretation of data
- 10. 1. Collection of data • This is the process of obtaining measurements or counts and constitutes the first step in a statistical investigation. • In general, information pertinent to the underlying investigation is collected. • Valid conclusions can only result from properly collected data; if the data are faulty, the conclusions drawn can never be reliable. • Hence, utmost care must be exercised in collecting data because they form the foundation of statistical analysis.
- 11. 2. Organization of data • Collected data have to be organized in a suitable form so that one can gain a general understanding of the information gathered. A large mass of figures collected from surveys frequently needs organization. • The first step in organizing a group of data is editing. The collected data must be edited very carefully so that omissions, inconsistencies, irrelevant answers and wrong computations in the returns from a survey can be corrected or adjusted. • The next step is to classify the data. The purpose of data classification is to arrange the data according to some common characteristics possessed by the items constituting the data. • The last step in data organization is tabulation. The purpose of tabulation is to arrange the data in columns and rows so that the data are presented clearly.
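The three organization steps (editing, classification, tabulation) can be sketched as follows. The survey returns, the validity rule and the class boundaries are all hypothetical choices made for this illustration.

```python
from collections import Counter

# Hypothetical raw survey returns: (respondent id, reported family size).
raw_returns = [
    (1, "4"), (2, "3"), (3, ""), (4, "5"), (5, "abc"),
    (6, "3"), (7, "4"), (8, "-2"), (9, "4"),
]

def edit(returns):
    """Editing: drop omissions and answers that are not positive whole numbers."""
    cleaned = []
    for respondent, answer in returns:
        try:
            size = int(answer)
        except ValueError:
            continue  # blank or non-numeric answer
        if size > 0:
            cleaned.append((respondent, size))
    return cleaned

def classify(size):
    """Classification: group family sizes by a common characteristic."""
    return "small (1-3)" if size <= 3 else "large (4+)"

edited = edit(raw_returns)
table = Counter(classify(size) for _, size in edited)

# Tabulation: arrange the classified data in rows and columns.
print(f"{'Family size':<15}{'Frequency':>10}")
for category in ("small (1-3)", "large (4+)"):
    print(f"{category:<15}{table[category]:>10}")
```

In this sketch faulty returns are simply discarded; in practice an editor might instead correct them or follow up with the respondent.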
- 12. 3. Presentation • The main purpose of data presentation is to facilitate statistical analysis. • This can be done by displaying the data using graphs and diagrams. 4. Analysis of Data • This is the extraction of summarized and comprehensible numerical descriptions of the data. These measures in turn give a far better understanding of the nature of the data. • The purpose of analyzing data is to extract information useful for decision making. • The most commonly used methods of statistical analysis are measures of central tendency, measures of variation, correlation, regression, estimation and hypothesis testing.
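Two of the analysis methods listed above, correlation and regression, can be sketched from their textbook formulas. The paired observations are hypothetical, and the least-squares line is one standard form of regression assumed here for illustration.

```python
import math
import statistics

# Hypothetical paired observations: hours studied (x) and exam mark (y).
x = [2, 4, 6, 8, 10]
y = [45, 55, 60, 70, 80]

x_bar = statistics.fmean(x)
y_bar = statistics.fmean(y)

# Sums of squares and cross-products of deviations from the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Pearson correlation coefficient: strength of the linear relationship.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares regression line: y = intercept + slope * x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"r = {r:.3f}, fitted line: y = {intercept} + {slope} * x")
```

A value of r near 1 indicates a strong positive linear relationship in these data; it says nothing by itself about cause and effect.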
- 13. 5. Interpretation of data • Interpretation is the task of drawing conclusions from the analysis of the data; it usually involves formulating predictions about a large collection of objects from information available for a small collection of similar objects. • This step usually involves making decisions about a large collection of objects (the population) based on information gathered from a small collection of similar objects (the sample). • The interpretation of data is a difficult task and requires a high degree of skill and experience.
- 14. 1.6 APPLICATIONS OF STATISTICS • The study of statistics has become more popular than ever during the past three decades or so. • The increasing availability of computers and statistical software packages has enlarged the role of statistics as a tool for empirical research. • As a result, statistics is used for research in almost all professions, from medicine to sports. Today, college students in almost all disciplines are required to take at least one statistics course. • The various tools of statistics are used to solve problems in everyday life, in research, in marketing, in planning, in production and quality control, and in other areas. • Statistics is widely used in all areas of human knowledge and widely applied in a variety of disciplines such as business, economics and research.
- 15. 1.6.1 LIMITATIONS OF STATISTICS i. It does not deal with individual values. ii. It cannot deal with qualitative characteristics; statistics deals with quantitative characteristics. iii. Statistical conclusions are not universally true. iv. Statistical interpretation requires a high degree of skill and understanding of the subject. v. Statistics can be misused. Statistics can be used to establish wrong conclusions and, therefore, should be used only by experts.
- 16. 1.6.2 USES OF STATISTICS • To present facts in a definite form • Statistics facilitates comparisons • Statistics gives guidance in the formulation of suitable policies • Prediction • Statistical methods are very helpful in formulating and testing hypotheses and in developing new theories • Statistics in the sciences
- 17. 1.7 SCALES OF MEASUREMENT • Measurement can be defined as the assignment of numbers to objects and events according to logically acceptable rules. • The number system is highly logical and offers many possibilities for further logical manipulation. • A measurement scale should possess the following attributes to allow for these logical manipulations. • Magnitude: the quantity in which the attribute exists in various instances of the phenomenon. • Equal intervals: the magnitude of the attribute represented by a unit of measurement on the scale is equal regardless of where on the scale the unit falls. • Absolute zero point: a value indicating that nothing at all of the attribute being measured exists at that point.
- 18. Types of Measurement • Nominal Scale (Classificatory Scale) • It refers to the simple classification of objects or items into discrete groups that do not bear any magnitude relationships to one another. • “Nominal” stands for the “name” of a category. The nominal scale of measurement is used for qualitative rather than quantitative data: blue, green, red; male, female; marital status (married, single, divorced, etc.); professional classification; geographic classification; and so on.
- 19. • Ordinal Scale or Ranking Scale • An ordinal scale is first of all a nominal scale, but its data elements may also be ordered according to their relative size or quality, and most people would agree with the order in which the categories are placed. • In an ordinal scale, inequalities have a meaning: the inequality signs ‘<’ and ‘>’ may take any meaning such as “stronger than”, “softer than”, “weaker than”, etc. • An ordinal scale reflects only magnitude and does not possess the attribute of equal intervals or an absolute zero point.
- 20. • Interval Scale • The interval scale possesses two of the three important requirements of a good measurement scale, i.e., magnitude and equal intervals, but lacks a real or absolute zero point. • An interval scale is one that provides equal intervals from an arbitrary origin. An interval scale not only orders objects according to the amount of the attribute they represent, but also establishes equal intervals between the units of measure. Equal differences in the numbers represent equal differences in the attribute being measured.
- 21. • Ratio Scale • The scale of measurement that has all three attributes (magnitude, equal intervals and an absolute zero point) is called a ratio scale. Addition, subtraction, multiplication and division of the numbers are all appropriate. • Example 1.12: Physical measurements such as height and weight; the number of students in various classes; the number of books possessed by the students of a class, etc.
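The relationship between the four scales and the three attributes (magnitude, equal intervals, absolute zero point) can be summarized in a small lookup, sketched below. The class and function names are hypothetical, and the mapping from attributes to meaningful operations is a common reading of the scale definitions rather than something stated in full in the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scale:
    name: str
    magnitude: bool
    equal_intervals: bool
    absolute_zero: bool

# Each scale adds one attribute to the scale before it.
SCALES = {
    "nominal": Scale("nominal", False, False, False),
    "ordinal": Scale("ordinal", True, False, False),
    "interval": Scale("interval", True, True, False),
    "ratio": Scale("ratio", True, True, True),
}

def identify_scale(magnitude, equal_intervals, absolute_zero):
    """Return the name of the scale with exactly these attributes."""
    for scale in SCALES.values():
        attributes = (scale.magnitude, scale.equal_intervals, scale.absolute_zero)
        if attributes == (magnitude, equal_intervals, absolute_zero):
            return scale.name
    raise ValueError("no measurement scale has this combination of attributes")

def meaningful_operations(scale_name):
    """List the operations assumed to be meaningful on the given scale."""
    scale = SCALES[scale_name]
    operations = ["classification (same / different)"]
    if scale.magnitude:
        operations.append("ordering (< and >)")
    if scale.equal_intervals:
        operations.append("addition and subtraction")
    if scale.absolute_zero:
        operations.append("multiplication and division")
    return operations

for name in SCALES:
    print(f"{name:<9}", ", ".join(meaningful_operations(name)))
```

For instance, `identify_scale(True, True, False)` returns `"interval"`, matching the description of a scale with magnitude and equal intervals but no absolute zero point.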