Datasets are collections of numeric data that can be analyzed using statistical software, while statistics are organized and interpreted numeric data, usually displayed in tables. Data are the raw materials from which statistics are created; statistical analysis of the data can show relationships between variables. Datasets contain individual cases, such as people or households, and can be cross-sectional, time series, or longitudinal. To find a dataset, consider who might collect the desired type of data, look for publications that cite the dataset, and determine whether it is freely available, accessible through a library subscription, available for purchase, or obtainable by request from the researcher.
6. What is Data?
• Data are raw ingredients from which statistics are created.
• Statistical analysis can be performed on data to show
relationships among the variables collected.
• Through secondary data analysis, many different researchers
can re-use the same data set for different purposes.
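As a minimal sketch of what "showing relationships among the variables" can look like, the Python snippet below computes a Pearson correlation between two variables. The variable names and all values are invented for illustration and do not come from any real data set; correlation is only one of many possible statistical analyses.

```python
from math import sqrt

# Hypothetical raw data: one row per respondent (values invented for illustration).
rows = [
    {"years_schooling": 10, "weekly_hours_reading": 2},
    {"years_schooling": 12, "weekly_hours_reading": 3},
    {"years_schooling": 14, "weekly_hours_reading": 3},
    {"years_schooling": 16, "weekly_hours_reading": 5},
    {"years_schooling": 18, "weekly_hours_reading": 6},
]

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# The statistic is created from the raw data, not stored in it.
r = pearson(
    [row["years_schooling"] for row in rows],
    [row["weekly_hours_reading"] for row in rows],
)
print(f"correlation: {r:.2f}")
```

Another researcher could re-use the same rows to answer a different question, which is the idea behind secondary data analysis.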
7. Aggregate Data
• Higher-level data that have been compiled
from smaller units of data.
• Examples: inflation rate, consumer price
index, demographic data for a city or state
8. Microdata
• Data directly observed or collected from a
specific unit of observation.
• Contain individual cases, usually individual
people, or in the case of Census data,
individual households
– Examples:
• Census: the unit of observation is typically an
individual, a household, or a family.
• Survey or poll: the responses of a single respondent
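The difference between microdata and aggregate data can be sketched in a few lines of Python. The household records, field names, and state labels below are hypothetical and invented for illustration; the point is only that an aggregate table is compiled from the smaller units.

```python
from collections import defaultdict

# Hypothetical microdata: one record per household (values invented for illustration).
households = [
    {"household_id": 1, "state": "A", "size": 3},
    {"household_id": 2, "state": "A", "size": 1},
    {"household_id": 3, "state": "B", "size": 4},
    {"household_id": 4, "state": "B", "size": 2},
    {"household_id": 5, "state": "B", "size": 3},
]

def aggregate_by_state(records):
    """Compile household-level records into one summary row per state."""
    totals = defaultdict(lambda: {"households": 0, "people": 0})
    for record in records:
        totals[record["state"]]["households"] += 1
        totals[record["state"]]["people"] += record["size"]
    return {
        state: {**t, "mean_household_size": t["people"] / t["households"]}
        for state, t in totals.items()
    }

# Aggregate data: the individual households are no longer visible.
state_table = aggregate_by_state(households)
for state, summary in sorted(state_table.items()):
    print(state, summary)
```

Note that the step only runs in one direction: the state table can be built from the household records, but the household records cannot be recovered from the state table.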
9. Datasets
• A data set or study is
made up of the raw data
file and any related files,
usually the codebook
and setup files.
• Most data sets require at
least basic statistical or
spreadsheet programs to
use.
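To illustrate why the codebook travels with the raw data file, here is a small Python sketch that decodes numeric codes using a codebook. The file contents, variable names, and code values are all hypothetical; real codebooks come in many formats and usually need to be read by hand or with the setup files supplied for a statistical package.

```python
import csv
import io

# Hypothetical raw data file: values are stored as codes, not labels.
raw_file = io.StringIO("id,sex,employed\n1,1,2\n2,2,1\n3,2,9\n")

# Hypothetical codebook: maps each variable's codes to their meaning.
codebook = {
    "sex": {"1": "male", "2": "female"},
    "employed": {"1": "yes", "2": "no", "9": "missing"},
}

def decode(raw, codes):
    """Replace coded values with labels; leave uncoded variables unchanged."""
    decoded = []
    for row in csv.DictReader(raw):
        decoded.append(
            {var: codes.get(var, {}).get(value, value) for var, value in row.items()}
        )
    return decoded

records = decode(raw_file, codebook)
for record in records:
    print(record)
```

Without the codebook, a value such as 9 in the `employed` column could easily be mistaken for a real answer rather than a missing-value code.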
10. Types of data
• Cross-Sectional data are
collected only once.
• Time Series data track the same
variable over time.
• Longitudinal Studies are
surveys that are conducted
repeatedly, in which the same
group of respondents is
surveyed each time.
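The three types differ mainly in how observations are spread over units and time periods. The Python sketch below is a rough heuristic under that assumption, not a formal definition: the function name, field names, and all values are invented for illustration, and real studies often need a closer reading of their documentation.

```python
def describe_design(records):
    """Guess the data type from which units appear in which periods (rough heuristic)."""
    periods = sorted({r["period"] for r in records})
    if len(periods) == 1:
        return "cross-sectional"  # collected only once
    units_by_period = {
        p: frozenset(r["unit"] for r in records if r["period"] == p) for p in periods
    }
    unit_sets = set(units_by_period.values())
    if len(unit_sets) == 1:
        same_units = next(iter(unit_sets))
        # One series followed over time vs. the same group surveyed each time.
        return "time series" if len(same_units) == 1 else "longitudinal"
    return "other"  # e.g. different respondents in each period

# Hypothetical examples (all values invented for illustration).
cross_section = [
    {"unit": "respondent_1", "period": 2020, "value": 3},
    {"unit": "respondent_2", "period": 2020, "value": 5},
]
series = [
    {"unit": "price_index", "period": 2018, "value": 100.0},
    {"unit": "price_index", "period": 2019, "value": 102.0},
    {"unit": "price_index", "period": 2020, "value": 103.5},
]
panel = [
    {"unit": "respondent_1", "period": 2019, "value": 3},
    {"unit": "respondent_2", "period": 2019, "value": 5},
    {"unit": "respondent_1", "period": 2020, "value": 4},
    {"unit": "respondent_2", "period": 2020, "value": 5},
]

for name, data in [("cross_section", cross_section), ("series", series), ("panel", panel)]:
    print(name, "->", describe_design(data))
```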
12. 1. Think about who might
collect the data.
• Could it have been collected by a government
agency?
• A nonprofit or nongovernmental organization?
• A private business or industry group?
• Academic researchers?
13. 2. Look for publications that use
the kind of data you’re looking for
and that cite the dataset
In other words, are the data you
want mentioned in scholarly
articles, government reports,
or some other source?
14. 3. Once you know that what you want
exists, it's time to hunt it down.
• Is it freely available on the web?
• Or part of a package to which the
library already subscribes?
• Is it something we can buy? (And is
it within the library's budget and
can the purchase be made quickly
enough to fit your timeframe?)
• Can it be requested directly from the
researcher?