DATA ANALYSIS
Marcelo Augusto A. Cosgayon
DATA ANALYSIS
defined as the process of systematically searching and
arranging interview transcripts, observation notes, or
other non-textual materials that the researcher
accumulates to increase the understanding of the
phenomenon.
DATA ANALYSIS
Qualitative research yields mainly unstructured text-
based data in the form of:
Interview transcripts
Observation notes
Diary entries
Records
DATA ANALYSIS
Data analysis in qualitative research is more of a
dynamic, intuitive, and creative process of inductive
reasoning, thinking, and theorizing.
In contrast to quantitative research, which uses statistical
methods, qualitative research focuses on the exploration
of values, meanings, beliefs, thoughts, experiences, and
feelings characteristic of the phenomenon under
investigation.
DATA ANALYSIS
The process of analyzing qualitative data predominantly involves
coding or categorizing the data. Basically, it involves making
sense of huge amounts of data by reducing the volume of raw
information, followed by identifying significant patterns, and finally
drawing meaning from data, and subsequently building a logical
chain of evidence.
SCALES OF MEASUREMENT
Data can be classified as being on one of four (4)
scales:
1. Nominal
2. Ordinal
3. Interval
4. Ratio
SCALES OF MEASUREMENT
Nominal Scale
Nominal variables (also called categorical variables) can
be placed into categories. They don’t have a numeric
value and so cannot be added, subtracted, divided or
multiplied. They also have no order; if they appear to
have an order then these are ordinal variables instead
SCALES OF MEASUREMENT
Nominal Scale
The nominal scale of measurement only satisfies the
identity property of measurement. Values assigned to
variables represent a descriptive category, but have no
inherent numerical value with respect to magnitude.
SCALES OF MEASUREMENT
Nominal Scale
Gender is an example of a variable that is measured on a
nominal scale. Individuals may be classified as "male" or
"female", but neither value represents more or less
"gender" than the other. Religion and political affiliation
are other examples of variables that are normally
measured on a nominal scale
SCALES OF MEASUREMENT
Ordinal Scale
The ordinal scale contains things that you can place in
order. For example, hottest to coldest, lightest to
heaviest, richest to poorest. Basically, if you can rank
data by 1st, 2nd, 3rd place (and so on), then you have data
that’s on an ordinal scale
SCALES OF MEASUREMENT
Ordinal Scale
The ordinal scale has the property of both identity and
magnitude. Each value on the ordinal scale has a unique
meaning, and it has an ordered relationship to every
other value on the scale
SCALES OF MEASUREMENT
Ordinal Scale
An example of an ordinal scale in action would be the results
of a horse race, reported as “win”, “place”, and “show”. The
rank order in which horses finished the race is known. The
horse that won finished ahead of the horse that placed, and
the horse that placed finished ahead of the horse that
showed. However, we cannot tell from this ordinal scale
whether it was a close race or whether the winning horse won
by a mile
SCALES OF MEASUREMENT
Interval Scale
An interval scale has ordered numbers with meaningful
divisions.
Temperature is on the interval scale: a difference of 10
degrees between 90 and 100 means the same as 10 degrees
between 150 and 160. Compare that to high school ranking
(which is ordinal), where the difference between 1st and 2nd
might be .01 and between 10th and 11th .5. If you have
meaningful divisions, you have something on the interval
scale
SCALES OF MEASUREMENT
Interval Scale
The interval scale of measurement has the properties of identity, magnitude, and
equal intervals.
• A perfect example of an interval scale is the Fahrenheit scale to measure
temperature. The scale is made up of equal temperature units, so that the
difference between 40 and 50 degrees Fahrenheit is equal to the difference
between 50 and 60 degrees Fahrenheit.
• With an interval scale, you know not only whether different values are bigger or
smaller, you also know how much bigger or smaller they are. For example,
suppose it is 60 degrees Fahrenheit on Monday and 70 degrees on Tuesday. You
know not only that it was hotter on Tuesday, you also know that it was 10 degrees
hotter.
SCALES OF MEASUREMENT
Ratio Scale
The ratio scale is exactly the same as the interval scale
with one major difference: zero is meaningful. For
example, a height of zero is meaningful (it means you
don’t exist). Compare that to a temperature of zero on the
Celsius or Fahrenheit scale: it is a valid reading, but it
does not mean that there is no temperature (in the Celsius
scale it is simply the freezing point of water).
SCALES OF MEASUREMENT
Ratio Scale
The ratio scale of measurement satisfies all four of the
properties of measurement: identity, magnitude, equal
intervals, and a minimum value of zero.
The weight of an object would be an example of a ratio
scale. Each value on the weight scale has a unique
meaning, weights can be rank ordered, units along the
weight scale are equal to one another, and the scale has a
minimum value of zero.
TYPES OF DATA ANALYSIS
1. Content analysis
2. Narrative analysis
3. Discourse analysis
4. Framework analysis
5. Grounded theory
TYPES OF DATA ANALYSIS
Content analysis
This refers to the process of categorizing verbal or
behavioural data to classify, summarize, and tabulate the
data.
TYPES OF DATA ANALYSIS
Narrative analysis.
This method involves the reformulation of stories
presented by respondents, taking into account the
context of each case and different experiences of each
respondent. In other words, narrative analysis is the
revision of primary qualitative data by the researcher.
TYPES OF DATA ANALYSIS
Discourse analysis
A method of analysis of naturally occurring talk and all
types of written text.
TYPES OF DATA ANALYSIS
Framework analysis
This is a more advanced method that consists of several
stages such as familiarization, identifying a thematic
framework, coding, charting, mapping, and interpretation.
TYPES OF DATA ANALYSIS
Grounded theory
This method of qualitative data analysis starts with an
analysis of a single case to formulate a theory. Then,
additional cases are examined to see if they contribute to
the theory.
OTHER TYPES OF DATA ANALYSIS
1. Phenomenological Method
2. Ethnographic Model
3. Grounded Theory Method
4. Case Study Model
5. Historical Model
6. Narrative Model
TYPES OF DATA ANALYSIS
Phenomenological Method
Describing how any one participant experiences a specific event is the goal
of the phenomenological method of research.
This method utilizes interviews, observation and surveys to gather
information from subjects. Phenomenology is highly concerned with how
participants feel about things during an event or activity. Businesses use this
method to develop processes to help sales representatives effectively close
sales using styles that fit their personality.
TYPES OF DATA ANALYSIS
Ethnographic Model
The ethnographic model is one of the most popular and widely recognized
methods of qualitative research; it immerses the researcher in a culture that is
unfamiliar to them.
The goal is to learn and describe the culture's characteristics much the same
way anthropologists observe the cultural challenges and motivations that
drive a group. This method often immerses the researcher as a subject for
extended periods of time. In a business model, ethnography is central to
understanding customers. Testing products personally or in beta groups
before releasing them to the public is an example of ethnographic research.
TYPES OF DATA ANALYSIS
Grounded Theory Method
The grounded theory method tries to explain why a course of action evolved
the way it did.
Grounded theory looks at large subject numbers. Theoretical models are
developed based on existing data in existing modes of genetic, biological, or
psychological science. Businesses use grounded theory when conducting
user or satisfaction surveys that target why consumers use company
products or services. This data helps companies maintain customer
satisfaction and loyalty.
TYPES OF DATA ANALYSIS
Case Study Model
Unlike grounded theory, the case study model provides an in-depth look at
one test subject. The subject can be a person or family, business or
organization, or a town or city.
Data is collected from various sources and compiled using the details to
create a bigger conclusion. Businesses often use case studies when
marketing to new clients to show how their business solutions solve a
problem for the subject
TYPES OF DATA ANALYSIS
Historical Model
The historical method of qualitative research describes past events in order
to understand present patterns and anticipate future choices.
This model answers questions based on a hypothetical idea and then uses
resources to test the idea for any potential deviations. Businesses can use
historical data of previous ad campaigns and the targeted demographic and
split-test it with new campaigns to determine the most effective campaign.
TYPES OF DATA ANALYSIS
Narrative Model
The narrative model occurs over extended periods of time and compiles
information as it happens.
Like a story narrative, it takes subjects at a starting point and reviews
situations as obstacles or opportunities occur, although the final narrative
doesn't always remain in chronological order. Businesses use the narrative
method to define buyer personas and use them to identify innovations that
appeal to a target market
MAIN APPROACHES TO DATA
ANALYSIS
Deductive Approach
Inductive Approach
APPROACHES TO DATA ANALYSIS
Deductive Approach
The deductive approach involves analyzing qualitative data based
on a structure that is predetermined by the researcher.
In this case, a researcher can use the questions as a guide for
analyzing the data. This approach is quick and easy and can be
used when a researcher has a fair idea about the likely responses
that will be received from the sample population
APPROACHES TO DATA ANALYSIS
Inductive Approach
The inductive approach, by contrast, is not based on a
predetermined structure or set of ground rules or framework.
It is a more time-consuming and more thorough approach to
qualitative data analysis. The inductive approach is often used when a
researcher has very little or no idea of the research phenomenon.
VARIABLE DESCRIPTIVE ANALYSIS
1. Univariate Analysis – contains a single variable
2. Bivariate Analysis – contains two variables
3. Multivariate Analysis – contains multiple variables
VARIABLE DESCRIPTIVE ANALYSIS
Univariate Analysis
the simplest form of data analysis where the data being
analyzed contains only one variable.
Because there is only a single variable, it does not deal
with causes or relationships. The main purpose of
univariate analysis is to describe the data and find
patterns that exist within it
VARIABLE DESCRIPTIVE ANALYSIS
Univariate Analysis
Univariate data can be described by looking at the mean,
mode, median, range, variance, maximum, minimum,
quartiles, and standard deviation.
Additionally, univariate data can be displayed with
frequency distribution tables, bar charts, histograms,
frequency polygons, and pie charts.
VARIABLE DESCRIPTIVE ANALYSIS
Bivariate Analysis
used to find out if there is a relationship between two
different variables.
VARIABLE DESCRIPTIVE ANALYSIS
Bivariate Analysis
Something as simple as creating a scatterplot by plotting
one variable against another on a Cartesian plane (think
X and Y axis) can sometimes give a picture of what the
data is trying to indicate.
If the data seems to fit a line or curve, then there is a
relationship or correlation between the two variables
VARIABLE DESCRIPTIVE ANALYSIS
Multivariate Analysis
the analysis of three or more variables to determine
relationship
VARIABLE DESCRIPTIVE ANALYSIS
Multivariate Analysis
• Additive Tree
• Canonical Correlation Analysis
• Cluster Analysis
• Correspondence Analysis / Multiple Correspondence Analysis
• Factor Analysis
• Generalized Procrustean Analysis
• MANOVA
• Multidimensional Scaling
• Multiple Regression Analysis
• Partial Least Square Regression
• Principal Component Analysis / Regression / PARAFAC
• Redundancy Analysis
CHARACTERISTICS OF DATA
Frequency Distribution - a tabular representation of a survey data set used
to organize and summarize the data. Specifically, it is a list of either
qualitative or quantitative values that a variable takes in a data set and the
associated number of times each value occurs (frequencies)
Measures of Central Tendency - a summary statistic that represents the
center point or typical value of a dataset. In statistics, the three most
common measures of central tendency are the mean, median, and mode.
Each of these measures calculates the location of the central point using a
different method.
FREQUENCY DISTRIBUTION
There are four important characteristics of frequency
distribution:
1. Measures of central tendency and location (mean, median,
mode)
2. Measures of dispersion (range, variance, standard deviation)
3. The extent of symmetry/asymmetry (skewness)
4. The flatness or peakedness (kurtosis)
FREQUENCY DISTRIBUTION
Frequency distribution
tells how frequencies are distributed over values.
Frequency distributions are mostly used for
summarizing categorical variables
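As a minimal sketch, a frequency distribution for a categorical variable can be tabulated with Python's standard library. The `responses` list is made-up illustration data and does not come from the slides.

```python
from collections import Counter

# Hypothetical survey answers (illustration only).
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

# Count how many times each value occurs.
frequencies = Counter(responses)
total = sum(frequencies.values())

# Print a simple frequency table with relative frequencies.
for value, count in frequencies.most_common():
    print(f"{value:<10} {count:>3} {count / total:6.1%}")
```

The same counts could feed a bar chart or pie chart, which are the usual displays for this kind of summary.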
MEASURES OF CENTRAL TENDENCY
There are three main measures of central tendency:
1. Mode
2. Median
3. Mean
Each of these measures describes a different indication of the
typical or central value in the distribution. The mode is the most
commonly occurring value in a distribution
MEASURES OF CENTRAL TENDENCY
Mode
- the value or values that occur most frequently in a list
of numbers. Unlike the median and mean, the mode
is about the frequency of occurrence. There can be more
than one mode or no mode at all; it all depends on the
data set itself.
MEASURES OF CENTRAL TENDENCY
Median
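- the middle number when the values are listed in order
from least to greatest. With an even number of values, the
median is the average of the two middle numbers.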
MEASURES OF CENTRAL TENDENCY
Mean
– refers to the average. To calculate the mean, add
together all of the numbers in your data set. Then divide
that sum by the number of addends
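The three measures can be computed with Python's `statistics` module. This is only a sketch; the `scores` list is hypothetical data chosen so that the count of values is even.

```python
import statistics

scores = [2, 3, 3, 5, 7, 8]  # hypothetical data, even number of values

mean = statistics.mean(scores)        # sum of the values divided by how many there are
median = statistics.median(scores)    # average of the two middle values, 3 and 5
modes = statistics.multimode(scores)  # list of every most-frequent value

# A data set can have more than one mode.
two_modes = statistics.multimode([1, 1, 2, 2])
```

`multimode` returns a list, which fits the point above that a data set may have several modes.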
VARIANCE IN DATA
The variance (σ²) is a measure of how far each value in the
data set is from the mean.
Variance is a measure of how spread out a data set is. It's
useful when creating statistical models, since unusually low
variance in a model's errors on the data it was fitted to
can be a sign that you are over-fitting your data.
Here is how it is defined: Subtract the mean from each
value in the data. This gives you a measure of the distance
of each value from the mean. Square each of these distances
and average the squares.
VARIANCE IN DATA
Calculating Variance of a Sample
• Write down your sample data set
• Write down the sample variance formula
• Calculate the mean of the sample
• Subtract the mean from each data point
• Square each result
• Find the sum of the squared values
• Divide by n - 1, where n is the number of data points.
VARIANCE IN DATA
Write down your sample data
set
X
X1 17
X2 15
X3 23
X4 7
X5 9
X6 13
VARIANCE IN DATA
Write down the sample
variance formula
s² = Σ (xᵢ − x̄)² / (n − 1)
The variance of a data set tells
you how spread out the data
points are. The closer the
variance is to zero, the more
closely the data points are
clustered together.
VARIANCE IN DATA
Calculate the mean of the
sample
The symbol x̄ or "x-bar" refers to
the mean of a sample. Calculate
this as you would any mean: add
all the data points together, then
divide by the number of data
points.
VARIANCE IN DATA
Subtract the mean from each
data point
your answers should add up to
zero. This is due to the definition
of mean, since the negative
answers (distance from mean to
smaller numbers) exactly cancel
out the positive answers
(distance from mean to larger
numbers)
VARIANCE IN DATA
Square each result
This means the "average deviation"
will always be zero as well, so that
doesn't tell anything about how
spread out the data is. To solve this
problem, find the square of each
deviation. This will make them all
positive numbers, so the negative
and positive values no longer cancel
out to zero
VARIANCE IN DATA
Find the sum of the squared
values
∑ tells you to sum the value of
the following term for each
value of xᵢ.
Because (xᵢ − x̄)² is already
calculated, all you need to do is
add the results.
VARIANCE IN DATA
Divide by n - 1, where n is the
number of data points
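The steps above can be traced on the sample data set from the earlier slide (17, 15, 23, 7, 9, 13). The sketch below follows the steps one by one and then cross-checks the result against Python's `statistics.variance`; the variable names are illustrative.

```python
import math
import statistics

data = [17, 15, 23, 7, 9, 13]  # the sample from the slides

n = len(data)
mean = sum(data) / n                       # 84 / 6 = 14.0
deviations = [x - mean for x in data]      # 3, 1, 9, -7, -5, -1 (sum to zero)
squared = [d ** 2 for d in deviations]     # 9, 1, 81, 49, 25, 1
sample_variance = sum(squared) / (n - 1)   # 166 / 5 = 33.2
sample_std = math.sqrt(sample_variance)    # standard deviation, same units as the data

# Cross-check against the standard library's sample variance.
assert math.isclose(sample_variance, statistics.variance(data))
```

Taking the square root of the variance gives the standard deviation, which is discussed next.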
VARIANCE VS. STANDARD DEVIATION
Variance is a numerical value that describes the variability of
observations from its arithmetic mean
Standard deviation is a measure of dispersion of observations
within a data set
Variance is nothing but an average of squared deviations. On the
other hand, the standard deviation is the root mean square
deviation
VARIANCE VS. STANDARD DEVIATION
A variance of zero indicates that all of the data values are identical. A
high variance indicates that the data points are very spread out from
the mean, and from one another. Variance is the average of the
squared distances from each point to the mean.
Standard deviation is a number used to tell how measurements for a
group are spread out from the average (mean), or expected value. A
low standard deviation means that most of the numbers are close to
the average. A high standard deviation means that the numbers are
more spread out.
RELATIONSHIPS BETWEEN
VARIABLES
The statistical relationship between two variables is
referred to as their correlation.
A correlation could be positive, meaning
both variables move in the same direction, or negative,
meaning that when the value of one variable increases,
the value of the other variable decreases
ASPECTS OF ASSOCIATION
BETWEEN VARIABLES
Association between two variables means the values of
one variable relate in some way to the values of the
other.
Association is usually measured by correlation for two
continuous variables and by cross tabulation and a Chi-
square test for two categorical variables.
ASPECTS OF ASSOCIATION
BETWEEN VARIABLES
Chi Square Test
A statistical method for assessing the goodness of fit
between observed values and those expected theoretically.
Commonly used for testing relationships between
categorical variables. The null hypothesis of the Chi-Square
test is that no relationship exists between the categorical
variables in the population; they are independent.
ASPECTS OF ASSOCIATION
BETWEEN VARIABLES
Chi Square Test
The test statistic is χ²c = Σ (Oᵢ − Eᵢ)² / Eᵢ
The subscript “c” is the degrees of freedom.
“O” is your observed value and “E” is your expected value.
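As a hedged illustration, the statistic can be computed for a cross tabulation of two categorical variables. The function name `chi_square_statistic` and the 2×2 `table` are assumptions made for this sketch. Expected counts are derived under the null hypothesis of independence (row total × column total / grand total).

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two variables were independent.
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows are two groups, columns are two answers.
table = [[20, 30], [30, 20]]
chi2, dof = chi_square_statistic(table)
```

The sketch stops at the statistic. Deciding significance requires comparing it with a chi-square critical value for the given degrees of freedom (3.841 for one degree of freedom at the 0.05 level).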
MEASURES OF ASSOCIATION
BETWEEN VARIABLES
The measures of association refer to a wide variety of
coefficients (including bivariate correlation and
regression coefficients) that measure the strength and
direction of the relationship between variables;
these measures of strength, or association, can be
described in several ways, depending on the analysis.
MEASURES OF ASSOCIATION
BETWEEN VARIABLES
For measures of association, a value of zero signifies that no
relationship exists.
In a correlation analysis, if the coefficient (r) has a value of
one or negative one, it signifies a perfect positive or negative
relationship between the variables of interest.
In regression analyses, if the standardized beta weight (β)
has a value of one, it also signifies a perfect relationship between
the variables of interest.
STATISTICAL MEASURES OF
RELATIONSHIPS
1. Correlational Coefficient
2. Linear Regression
3. Multiple Regression
4. Discriminant Analysis
5. Factor Analysis
STATISTICAL MEASURES OF
RELATIONSHIPS
Correlational Coefficient
a measure of the relationship between two variables or sets of
data. It is expressed in the form of a coefficient with +1.00
indicating a perfect positive correlation; -1.00 indicating a
perfect inverse correlation; 0.00 indicating a complete
lack of a relationship.
STATISTICAL MEASURES OF
RELATIONSHIPS
Correlational Coefficient
• Pearson's Product Moment Coefficient (r) is the most often
used and most precise coefficient, and is generally used with
continuous variables
• Spearman Rank Order Coefficient (ρ) is a form of the Pearson's
Product Moment Coefficient that can be used with ordinal or
ranked data
• Phi Correlation Coefficient is a form of the Pearson's Product
Moment Coefficient that can be used with dichotomous variables
(e.g. pass/fail, male/female)
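A small sketch can show how Pearson's r and Spearman's coefficient relate: Spearman's coefficient is Pearson's r applied to ranks. The helper names (`pearson_r`, `ranks`, `spearman_rho`) and the data are hypothetical, and the ranking helper assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest). Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))


hours = [1, 2, 3, 4, 5]
scores = [1, 4, 9, 16, 25]   # rises with hours, but not along a straight line

r = pearson_r(hours, scores)       # high, but below 1
rho = spearman_rho(hours, scores)  # 1.0, because the rank order agrees perfectly
```

The data are deliberately monotonic but curved, so the rank-based coefficient reaches 1 while Pearson's r does not.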
STATISTICAL MEASURES OF
RELATIONSHIPS
Linear Regression
the use of correlation coefficients to plot a line illustrating the linear relationship of
two variables X and Y. It is based on the slope of the line which is represented by
the formula :
Y = a + bX
where
• Y = dependent variable
• X = independent variable
• b = slope of the line
• a = constant or Y intercept
Regression is used extensively in making predictions based on finding unknown Y
values from known X values
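The constants a and b in Y = a + bX can be estimated by least squares. The sketch below uses the standard closed-form estimates (b is the sum of cross-deviations divided by the sum of squared X deviations, and a is the mean of Y minus b times the mean of X). The function name `fit_line` and the data are illustrative only.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for Y = a + bX."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx        # slope of the line
    a = my - b * mx      # constant or Y intercept
    return a, b


x = [1, 2, 3, 4]
y = [3, 5, 7, 9]         # hypothetical data lying exactly on Y = 1 + 2X

a, b = fit_line(x, y)
predicted = a + b * 5    # predict an unknown Y from a known X
```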
STATISTICAL MEASURES OF
RELATIONSHIPS
Multiple Regression
the same as regression except that it attempts to predict Y
from two or more independent X variables. The formula for
multiple regression is an extension of the linear regression
formula:
Y = a + b1 X1 + b2 X2 + ....
Multiple regression is used extensively in making predictions
based on finding unknown Y values from known X values
STATISTICAL MEASURES OF
RELATIONSHIPS
Discriminant Analysis
analogous to multiple regression, except that the criterion
variable consists of two categories rather than a
continuous range of values
STATISTICAL MEASURES OF
RELATIONSHIPS
Factor Analysis
often used when a large number of correlations have
been explored in a given study; it is a means of grouping
certain variables into clusters or factors that are
moderately to highly correlated with each other
ANALYZING DIFFERENCES WITHIN
THE DATA
1. T-Test
2. Matched Pairs T-Test
3. Analysis of Variance (ANOVA)
ANALYZING DIFFERENCES WITHIN
THE DATA
T-Test
A t-test is used to determine if the scores of two groups
differ on a single variable. A t-test is designed to test for
the differences in mean scores
Note: A t-test is appropriate only when comparing two sets of scores. It is useful in
analyzing scores of two groups of participants on a particular variable or in
analyzing scores of a single group of participants on two variables.
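For two separate groups, one common form of the test statistic is the pooled-variance (Student's) t. The sketch below assumes equal variances in the two groups. The function name `independent_t` and the scores are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = statistics.mean(group_a), statistics.mean(group_b)
    v1, v2 = statistics.variance(group_a), statistics.variance(group_b)

    # Weighted average of the two sample variances.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    # Difference in means divided by its standard error.
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t_stat, dof = independent_t(group_a, group_b)
```

The statistic is then compared with a t distribution with the returned degrees of freedom. That lookup is left out of this sketch.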
ANALYZING DIFFERENCES WITHIN
THE DATA
Matched Pairs T-Test
This type of t-test could be used to determine if the
scores of the same participants in a study differ under
different conditions
Note: A matched pairs t-test is appropriate only when looking at paired data, where
each score in one condition is matched with a score from the same participant in
the other condition.
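For the matched pairs case, the test works on the per-participant differences. A minimal sketch, with a hypothetical function name `paired_t` and made-up before/after scores:

```python
import math
import statistics


def paired_t(before, after):
    """Matched pairs t statistic and degrees of freedom."""
    # One difference per participant.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


before = [10, 12, 9, 11]   # same four participants, first condition
after = [12, 15, 10, 14]   # same four participants, second condition
t_stat, dof = paired_t(before, after)
```

Because each participant serves as their own control, only the differences enter the calculation.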
ANALYZING DIFFERENCES WITHIN
THE DATA
Analysis of Variance (ANOVA)
The ANOVA (analysis of variance) is a statistical test which
makes a single, overall decision as to whether a significant
difference is present among three or more sample means (Levin
484). An ANOVA is similar to a t-test. However, the ANOVA can
also test multiple groups to see if they differ on one or more
variables. The ANOVA can be used to test between-groups and
within-groups differences.
ANALYZING DIFFERENCES WITHIN
THE DATA
Analysis of Variance (ANOVA)
One-Way ANOVA: This tests a group or groups to
determine if there are differences on a single set of scores
Multivariate ANOVA (MANOVA): This tests a group or groups
to determine if there are differences on two or
more variables
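The one-way case can be sketched by hand: the F statistic is the between-groups mean square divided by the within-groups mean square. The function name `one_way_anova_f` and the three groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the F statistic and (between, within) degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)

    # Between-groups variation: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variation: values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, dof = one_way_anova_f(groups)
```

A large F means the group means differ by more than the spread inside the groups would suggest.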
MULTIVARIATE ANALYSIS
Multivariate analysis is used to study more complex sets of data
than what univariate analysis methods can handle. This type of
analysis is almost always performed with software
(e.g. SPSS or SAS), as working with even the smallest of data sets
can be overwhelming by hand.
Multivariate analysis can reduce the likelihood of Type I errors.
Sometimes, univariate analysis is preferred as multivariate techniques
can result in difficulty interpreting the results of the test. For example,
group differences on a linear combination of dependent variables in
MANOVA can be unclear. In addition, multivariate analysis is usually
unsuitable for small sets of data.
MULTIVARIATE ANALYSIS
There are more than 20 different ways to perform multivariate analysis, depending
on the type of data and the objectives of the research. For single data sets there are
several choices:
1. Additive Trees, Multidimensional Scaling, and Cluster Analysis are
appropriate for when the rows and columns in a data table represent the same
units and the measure is either a similarity or a distance
2. Principal Component Analysis (PCA) decomposes a data table with
correlated measures into a new set of uncorrelated measures
3. Correspondence Analysis is similar to PCA. However, it applies to
contingency tables
MULTIVARIATE ANALYSIS
• Additive Tree
• Canonical Correlation Analysis
• Cluster Analysis
• Correspondence Analysis / Multiple Correspondence Analysis
• Factor Analysis
• Generalized Procrustean Analysis
• Independent Component Analysis
• MANOVA
• Multidimensional Scaling
• Multiple Regression Analysis
• Partial Least Square Regression
• Principal Component Analysis / Regression / PARAFAC
• Redundancy Analysis
MULTIVARIATE ANALYSIS
Additive Tree
a general way to represent clusters of data in a graph. It
is used when the data table is composed of rows and
columns that represent the same units; the measure must
be a distance or a similarity.
MULTIVARIATE ANALYSIS
Additive Tree
A “tree” is a finite, connected graph where any two nodes are connected by
exactly one path. The additive tree is a similar technique to cluster analysis. Both
techniques have the “leaves” of the tree representing units. Where the
additive tree differs is that the distance is graphically represented by the
distance of those units on the tree.
MULTIVARIATE ANALYSIS
Additive Tree
Cluster Analysis creates the clusters but does not create
a graph that represents the results. An additional
limitation of hierarchical cluster analysis is that objects in
the same cluster must be exactly the same distance from
each other, and the distances between clusters must be
larger than the “within clusters” distance. Additive trees
do not have these limitations
MULTIVARIATE ANALYSIS
Canonical Correlation Analysis
one way to find associations between two data sets. Like the
Correlation Coefficient, CCA measures the relationship between
variables. Where Canonical Correlation Analysis differs is that it is
specifically used to find the relationships between two
sets of variables
MULTIVARIATE ANALYSIS
Canonical Correlation Analysis
appropriate to use in the same situations as you might
use multiple regression analysis, but when you have multiple
intercorrelated outcome variables.
CCA is not recommended for small data sets.
MULTIVARIATE ANALYSIS
Canonical Correlation Analysis
The purpose of Canonical Correlation Analysis is to explain the
variability within and between sets through identification of
several sets of canonical variates. Canonical variates are new
variables formed by making a linear combination of two or more
variables from the data sets. When running CCA, you choose
weights that maximize the correlation between these sets of
variates.
MULTIVARIATE ANALYSIS
Cluster Analysis
Clustering in statistics refers to how data is gathered (“clustered”) by factors
like:
• Age.
• Household size.
• Income.
• Education level.
MULTIVARIATE ANALYSIS
Cluster Analysis
Clusters can be based on factors like:
Distance-based clustering. Items are sorted based on their
proximity (or distance). For example, cancer cases might be
clustered together if they are in the same geographic location
Conceptual clustering. Items are grouped by factors that
items have in common. For example, cancer clusters could be
grouped by “people who work in manufacturing”
MULTIVARIATE ANALYSIS
Cluster Analysis
Clustering Types (continued):
Hierarchical Clustering This is a more complex approach to
clustering used in data mining. Basically, each item is given its
own cluster. A pair of clusters is joined based on similarities,
giving one less cluster. This process is repeated until all items
are clustered. The dendrogram is a graph that shows
hierarchical clusters
Probabilistic Clustering. Data is clustered using algorithms
which connect items using distances or densities. This is
usually performed by a computer
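The hierarchical procedure described above (start with one cluster per item, repeatedly join the closest pair) can be sketched in a few lines. This sketch assumes one-dimensional data and single linkage, meaning the distance between two clusters is the distance between their closest members. The function name `agglomerate` and the `ages` list are hypothetical.

```python
def agglomerate(points, target_clusters):
    """Single-linkage agglomerative clustering of 1-D points."""
    clusters = [[p] for p in points]   # each item is given its own cluster

    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # join the closest pair
        del clusters[j]                           # giving one less cluster

    return clusters


ages = [21, 23, 24, 40, 42, 65]
result = agglomerate(ages, 3)
```

Recording the order and distance of each join is what a dendrogram draws. Stopping at a chosen number of clusters, as here, corresponds to cutting the dendrogram at one level.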
MULTIVARIATE ANALYSIS
Cluster Analysis
Clustering Types (continued):
Ward’s method: uses minimum variance in each step to create
relatively small, even-sized clusters
MULTIVARIATE ANALYSIS
Cluster Analysis
Clustering Types:
Exclusive Clustering. Each item can only belong in a single
cluster. It cannot belong in another cluster
Fuzzy Clustering: Data points are assigned a probability of
belonging to one or more clusters
Overlapping Clustering. Each item can belong to more than
one cluster
MULTIVARIATE ANALYSIS
Correspondence Analysis/Multiple Correspondence
Analysis
a descriptive/exploratory technique designed
to analyze simple two-way and multi-way tables
containing some measure of correspondence between
the rows and columns
MULTIVARIATE ANALYSIS
Factor Analysis
a way to take a mass of data and shrink it to a smaller
data set that is more manageable and more
understandable. It is a way to find hidden patterns, show
how those patterns overlap, and show what
characteristics are seen in multiple patterns. It is also
used to create a set of variables, called dimensions, for
similar items in the set.
MULTIVARIATE ANALYSIS
Factor Analysis
It can be a very useful tool for complex sets of data involving
psychological studies, socioeconomic status and other involved
concepts. A “factor” is a set of observed variables that have
similar response patterns; they are associated with a hidden
variable (called a latent variable) that is not directly
measured. Factors are listed according to factor loadings, or how
much variation in the data they can explain.
MULTIVARIATE ANALYSIS
Factor Analysis
Types:
1. Exploratory Factor Analysis is used if the researcher does not
have any idea about the structure of the data or the number of
dimensions that exist in a set of variables
2. Confirmatory Factor Analysis is used for verification as long
as there is a specific idea about the data structure or the
number of dimensions in a set of variables
MULTIVARIATE ANALYSIS
Generalized Procrustean Analysis
a way to compare two sets of configurations, or shapes. Originally
developed to match two solutions from Factor Analysis, the
technique was extended to Generalized Procrustes Analysis so
that more than two shapes could be compared. The shapes are
aligned to a target shape or to each other.
GPA uses geometric transformations (i.e. isotropic rescaling,
reflection, rotation, or translation) of matrices to compare the sets
of data
MULTIVARIATE ANALYSIS
Independent Component Analysis
used in statistics and signal processing to express a
multivariate function by its hidden factors or
subcomponents. These component signals are
independent non-Gaussian signals, and the intention is
that these independent subcomponents accurately
represent the composite signal
MULTIVARIATE ANALYSIS
Multivariate Analysis of Variance (MANOVA)
Analysis of variance (ANOVA) tests for differences
between means. MANOVA is just an ANOVA with several
dependent variables.
It is similar to many other tests and experiments in that its
purpose is to find out if the response variable (i.e. your
dependent variable) is changed by manipulating the
independent variable.
MULTIVARIATE ANALYSIS
Multivariate Analysis of Variance (MANOVA)
The test helps to answer many research questions, including:
Do changes to the independent variables have statistically
significant effects on dependent variables?
What are the interactions among dependent variables?
What are the interactions among independent variables?
MULTIVARIATE ANALYSIS
Multidimensional Scaling
a visual representation of distances or
dissimilarities between sets of objects.
“Objects” can be colors, faces, map coordinates, political
persuasion, or any kind of real or conceptual stimuli. Objects that
are more similar (or have shorter distances) are closer together
on the graph than objects that are less similar (or have longer
distances). As well as interpreting dissimilarities as distances on a
graph, MDS can also serve as a dimension reduction technique
for high-dimensional data
MULTIVARIATE ANALYSIS
Multiple Regression Analysis
used to see if there is a statistically significant relationship
between a dependent variable and a set of predictor
variables. It’s used to find trends in those sets of data.
Multiple regression analysis is almost the same as simple
linear regression. The only difference between simple
linear regression and multiple regression is in the number
of predictors (“x” variables) used in the regression
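As a minimal sketch of regression with more than one predictor, the code below fits an intercept and one coefficient per "x" variable by solving the normal equations with Gaussian elimination. The helper names and the small data set are assumptions for illustration; in practice a statistics package would also report standard errors and significance tests, which this sketch omits.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def multiple_regression(rows, ys):
    """Return [intercept, b1, b2, ...] minimizing the sum of squared errors."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # leading 1 for the intercept
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(p)]
    return solve_linear_system(XtX, Xty)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2,
# so the fit should recover those three numbers.
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coef = multiple_regression(rows, ys)
```

With a single predictor column the same function reduces to simple linear regression, which mirrors the point that the two differ only in the number of predictors.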
MULTIVARIATE ANALYSIS
Partial Least Squares Regression
builds on least-squares regression, which is described here. If the
data show a linear relationship between the X and Y variables, the
line that best fits this linear relationship needs to be found.
That line is called a Regression Line and has the equation
ŷ = a + bx
The Least Squares Regression Line is the line that makes the
vertical distances from the data points to the regression line as
small as possible. It’s called “least squares” because the line of
best fit is the one that minimizes the sum of the squared errors
(the squared vertical distances)
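The intercept a and slope b of the least-squares line ŷ = a + bx have a closed form: b is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and a is chosen so the line passes through the point of means. The sketch below applies that formula to a small made-up data set; the function name and the numbers are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y_hat = a + b*x minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b

# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
```

For these points the slope works out to 0.6 and the intercept to 2.2, and the residuals sum to zero, which is a property of any least-squares line fitted with an intercept.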
MULTIVARIATE ANALYSIS
Parallel Factor Analysis (PARAFAC)
is a generalization of Principal Component Analysis to
higher-order arrays. It is useful for exploratory data
analysis on very particular sets of data, for example if you
have three-way data. Where PARAFAC differs from
Principal Component Analysis is that PARAFAC produces
unique components
MULTIVARIATE ANALYSIS
Principal Component Analysis
is a tool that has two main purposes:
1. To find variability in a data set and
2. To reduce the dimensions of the data set
Reducing dimensions means that redundancy in the data is
eliminated; this can make patterns in the data set clearer.
Therefore, Principal Component Analysis is a good tool to use
when redundancies are suspected in a data set. Redundancy doesn’t
mean that the variables are identical; it means that there is a
strong correlation between them
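For two variables, the principal components can be written out directly from the 2x2 covariance matrix, which makes the link between correlation and redundancy concrete. The sketch below assumes two strongly correlated made-up variables; the function name and data are illustrative, and real data sets with more variables would use a numerical eigenvalue routine instead of the closed form.

```python
import math

def pca_2d(xs, ys):
    """Return the two component variances and the unit direction of the first component."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Eigenvector belonging to the larger eigenvalue.
    if sxy != 0:
        vx, vy = sxy, lam1 - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (lam1, lam2), (vx / norm, vy / norm)

# Hypothetical, strongly correlated variables (y is roughly 2*x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
(lam1, lam2), direction = pca_2d(xs, ys)
explained = lam1 / (lam1 + lam2)  # share of total variance on the first component
```

Because the two variables are almost perfectly correlated, nearly all of the variance falls on the first component, so one dimension summarizes the data with little loss.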
MULTIVARIATE ANALYSIS
Principal Component Regression
based on Principal Component Analysis. It is used when the data
set exhibits multicollinearity, meaning that although least squares
estimates are unbiased, their variances are large, so the estimates
may be far from the actual value. PCR adds some bias to the
regression model and reduces standard error.
The first step in PCR is the same as in Principal Component
Analysis: identify the principal components. Regression is then
performed on those components
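The two steps can be sketched for the smallest interesting case: two nearly collinear predictors, one retained component. The code below is an assumption-laden illustration (made-up data, hypothetical function names, closed-form 2x2 eigen-decomposition) rather than a general implementation.

```python
import math

def first_component(c1, c2):
    """Unit direction of the first principal component of two centered variables."""
    n = len(c1)
    s11 = sum(a * a for a in c1) / (n - 1)
    s22 = sum(b * b for b in c2) / (n - 1)
    s12 = sum(a * b for a, b in zip(c1, c2)) / (n - 1)
    lam1 = (s11 + s22) / 2 + math.sqrt(((s11 - s22) / 2) ** 2 + s12 ** 2)
    if s12 != 0:
        vx, vy = s12, lam1 - s11
    else:
        vx, vy = (1.0, 0.0) if s11 >= s22 else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

def pcr_one_component(x1, x2, y):
    """Regress y on the first principal component of (x1, x2).

    Returns (intercept, b1, b2) expressed on the original predictors.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [a - m1 for a in x1]
    c2 = [b - m2 for b in x2]
    # Step 1: identify the principal component and project the data onto it.
    w1, w2 = first_component(c1, c2)
    scores = [w1 * a + w2 * b for a, b in zip(c1, c2)]
    # Step 2: simple least-squares regression of y on the component scores.
    slope = sum(s * (v - my) for s, v in zip(scores, y)) / sum(s * s for s in scores)
    b1, b2 = slope * w1, slope * w2
    intercept = my - b1 * m1 - b2 * m2
    return intercept, b1, b2

# Hypothetical data: x2 is almost a copy of x1 (multicollinearity).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.0]
y = [2.0, 4.1, 6.1, 7.9, 10.2]
intercept, b1, b2 = pcr_one_component(x1, x2, y)
fitted = [intercept + b1 * a + b2 * b for a, b in zip(x1, x2)]
```

Dropping the second component discards the direction in which the two predictors barely differ, which is where ordinary least squares estimates become unstable. The coefficients on the two predictors therefore come out similar in size instead of swinging in opposite directions.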
MULTIVARIATE ANALYSIS
Redundancy Analysis
the constrained version of Principal Components Analysis. Constrained
means that the components are restricted to linear combinations of the
independent variables. The resulting reduction of dimensions is what
leads to more understandable results.
Redundancy Analysis is a way to summarize linear relationships in a set of
dependent variables that are influenced by a set of independent variables.
• Linear Regression is first applied to represent Y as a function of X.
• PCA is then applied to the matrix of fitted values to provide a visual
representation.
MULTIVARIATE ANALYSIS

Data Analysis

  • 2. DATA ANALYSIS defined as the process of systematically searching and arranging interview transcripts, observation notes, or other non-textual materials that the researcher accumulates to increase the understanding of the phenomenon.
  • 3. DATA ANALYSIS Qualitative research yields mainly unstructured text- based data in the form of: Interview transcripts Observation notes Diary entries Records
  • 4. DATA ANALYSIS Data analysis in qualitative research is more of a dynamic, intuitive, and creative process of inductive reasoning, thinking, and theorizing. In contrast to quantitative research, which uses statistical methods, qualitative research focuses on the exploration of values, meanings, beliefs, thoughts, experiences, and feelings characteristic of the phenomenon under investigation.
  • 5. DATA ANALYSIS The process of analyzing qualitative data predominantly involves coding or categorizing the data. Basically, it involves making sense of huge amounts of data by reducing the volume of raw information, followed by identifying significant patterns, and finally drawing meaning from data, and subsequently building a logical chain of evidence.
  • 6. SCALES OF MEASUREMENT Data can be classified as being on one of four (4) scales: 1. Nominal 2. Ordinal 3. Interval 4. Ratio
  • 7. SCALES OF MEASUREMENT Nominal Scale Nominal variables (also called categorical variables) can be placed into categories. They don’t have a numeric value and so cannot be added, subtracted, divided or multiplied. They also have no order; if they appear to have an order then these are ordinal variables instead
  • 8. SCALES OF MEASUREMENT Nominal Scale The nominal scale of measurement only satisfies the identity property of measurement. Values assigned to variables represent a descriptive category, but have no inherent numerical value with respect to magnitude.
  • 9. SCALES OF MEASUREMENT Nominal Scale Gender is an example of a variable that is measured on a nominal scale. Individuals may be classified as "male" or "female", but neither value represents more or less "gender" than the other. Religion and political affiliation are other examples of variables that are normally measured on a nominal scale
  • 10. SCALES OF MEASUREMENT Ordinal Scale The ordinal scale contains things that you can place in order. For example, hottest to coldest, lightest to heaviest, richest to poorest. Basically, if you can rank data by 1st, 2nd, 3rd place (and so on), then you have data that’s on an ordinal scale
  • 11. SCALES OF MEASUREMENT Ordinal Scale The ordinal scale has the property of both identity and magnitude. Each value on the ordinal scale has a unique meaning, and it has an ordered relationship to every other value on the scale
  • 12. SCALES OF MEASUREMENT Ordinal Scale An example of an ordinal scale in action would be the results of a horse race, reported as “win”, “place”, and “show”. The rank order in which horses finished the race is known. The horse that won finished ahead of the horse that placed, and the horse that placed finished ahead of the horse that showed. However, we cannot tell from this ordinal scale whether it was a close race or whether the winning horse won by a mile
  • 13. SCALES OF MEASUREMENT Interval Scale An interval scale has ordered numbers with meaningful divisions. Temperature is on the interval scale: a difference of 10 degrees between 90 and 100 means the same as 10 degrees between 150 and 160. Compare that to high school ranking (which is ordinal), where the difference between 1st and 2nd might be .01 and between 10th and 11th .5. If you have meaningful divisions, you have something on the interval scale
  • 14. SCALES OF MEASUREMENT Interval Scale The interval scale of measurement has the properties of identity, magnitude, and equal intervals. A perfect example of an interval scale is the Fahrenheit scale to measure temperature. The scale is made up of equal temperature units, so that the difference between 40 and 50 degrees Fahrenheit is equal to the difference between 50 and 60 degrees Fahrenheit. With an interval scale, you know not only whether different values are bigger or smaller, you also know how much bigger or smaller they are. For example, suppose it is 60 degrees Fahrenheit on Monday and 70 degrees on Tuesday. You know not only that it was hotter on Tuesday, you also know that it was 10 degrees hotter
  • 15. SCALES OF MEASUREMENT Ratio Scale The ratio scale is exactly the same as the interval scale with one major difference: zero is meaningful. For example, a height of zero is meaningful (it means you don’t exist). Compare that to a temperature of zero degrees, which exists but does not mean “no temperature” (although admittedly, in the Celsius scale it’s the freezing point for water)
  • 16. SCALES OF MEASUREMENT Ratio Scale The ratio scale of measurement satisfies all four of the properties of measurement: identity, magnitude, equal intervals, and a minimum value of zero. The weight of an object would be an example of a ratio scale. Each value on the weight scale has a unique meaning, weights can be rank ordered, units along the weight scale are equal to one another, and the scale has a minimum value of zero.
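The interval/ratio distinction can be checked numerically. The sketch below is only an illustration: the helper names and the temperature and weight values are assumptions made up for this example. It shows that the ratio of two Fahrenheit readings changes when the same readings are expressed in Celsius, while the ratio of two weights is unchanged by a change of units.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius (both are interval scales)."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert pounds to kilograms (both are ratio scales)."""
    return lb * 0.45359237


# Interval scale: differences are meaningful, ratios are not.
f_low, f_high = 40.0, 80.0
ratio_f = f_high / f_low
ratio_c = fahrenheit_to_celsius(f_high) / fahrenheit_to_celsius(f_low)

# Ratio scale: zero means "none of the quantity", so ratios survive a unit change.
w_light, w_heavy = 50.0, 100.0
ratio_lb = w_heavy / w_light
ratio_kg = pounds_to_kilograms(w_heavy) / pounds_to_kilograms(w_light)

print("temperature ratio in F vs C:", ratio_f, ratio_c)
print("weight ratio in lb vs kg:  ", ratio_lb, ratio_kg)
```

"Twice as hot" therefore depends on the unit chosen, whereas "twice as heavy" does not.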
  • 17. TYPES OF DATA ANALYSIS 1. Content analysis 2. Narrative analysis 3. Discourse analysis 4. Framework analysis 5. Grounded theory
  • 18. TYPES OF DATA ANALYSIS Content analysis This refers to the process of categorizing verbal or behavioural data to classify, summarize, and tabulate the data.
  • 19. TYPES OF DATA ANALYSIS Narrative analysis. This method involves the reformulation of stories presented by respondents, taking into account the context of each case and different experiences of each respondent. In other words, narrative analysis is the revision of primary qualitative data by the researcher.
  • 20. TYPES OF DATA ANALYSIS Discourse analysis A method of analysis of naturally occurring talk and all types of written text.
  • 21. TYPES OF DATA ANALYSIS Framework analysis This is a more advanced method that consists of several stages such as familiarization, identifying a thematic framework, coding, charting, mapping, and interpretation.
  • 22. TYPES OF DATA ANALYSIS Grounded theory This method of qualitative data analysis starts with an analysis of a single case to formulate a theory. Then, additional cases are examined to see if they contribute to the theory.
  • 23. OTHER TYPES OF DATA ANALYSIS 1. Phenomenological Method 2. Ethnographic Model 3. Grounded Theory Method 4. Case Study Model 5. Historical Model 6. Narrative Model
  • 24. TYPES OF DATA ANALYSIS Phenomenological Method Describing how any one participant experiences a specific event is the goal of the phenomenological method of research. This method utilizes interviews, observation and surveys to gather information from subjects. Phenomenology is highly concerned with how participants feel about things during an event or activity. Businesses use this method to develop processes to help sales representatives effectively close sales using styles that fit their personality.
  • 25. TYPES OF DATA ANALYSIS Ethnographic Model The ethnographic model is one of the most popular and widely recognized methods of qualitative research; it immerses the researcher in a culture that is unfamiliar to them. The goal is to learn and describe the culture's characteristics much the same way anthropologists observe the cultural challenges and motivations that drive a group. This method often immerses the researcher as a subject for extended periods of time. In a business model, ethnography is central to understanding customers. Testing products personally or in beta groups before releasing them to the public is an example of ethnographic research.
  • 26. TYPES OF DATA ANALYSIS Grounded Theory Method The grounded theory method tries to explain why a course of action evolved the way it did. Grounded theory looks at large subject numbers. Theoretical models are developed based on existing data in existing modes of genetic, biological, or psychological science. Businesses use grounded theory when conducting user or satisfaction surveys that target why consumers use company products or services. This data helps companies maintain customer satisfaction and loyalty.
  • 27. TYPES OF DATA ANALYSIS Case Study Model Unlike grounded theory, the case study model provides an in-depth look at one test subject. The subject can be a person or family, business or organization, or a town or city. Data is collected from various sources and compiled using the details to create a bigger conclusion. Businesses often use case studies when marketing to new clients to show how their business solutions solve a problem for the subject
  • 28. TYPES OF DATA ANALYSIS Historical Model The historical method of qualitative research describes past events in order to understand present patterns and anticipate future choices. This model answers questions based on a hypothetical idea and then uses resources to test the idea for any potential deviations. Businesses can use historical data of previous ad campaigns and the targeted demographic and split-test it with new campaigns to determine the most effective campaign.
  • 29. TYPES OF DATA ANALYSIS Narrative Model The narrative model occurs over extended periods of time and compiles information as it happens. Like a story narrative, it takes subjects at a starting point and reviews situations as obstacles or opportunities occur, although the final narrative doesn't always remain in chronological order. Businesses use the narrative method to define buyer personas and use them to identify innovations that appeal to a target market
  • 30. MAIN APPROACHES TO DATA ANALYSIS Deductive Approach Inductive Approach
  • 31. APPROACHES TO DATA ANALYSIS Deductive Approach The deductive approach involves analyzing qualitative data based on a structure that is predetermined by the researcher. In this case, a researcher can use the questions as a guide for analyzing the data. This approach is quick and easy and can be used when a researcher has a fair idea about the likely responses that will be received from the sample population
  • 32. APPROACHES TO DATA ANALYSIS Inductive Approach The inductive approach, on the contrary, is not based on a predetermined structure or set ground rules/framework. This is more time consuming and a thorough approach to qualitative data analysis. Inductive approach is often used when a researcher has very little or no idea of the research phenomenon
  • 33. VARIABLE DESCRIPTIVE ANALYSIS 1. Univariate Analysis – contains a single variable 2. Bivariate Analysis – contains two variables 3. Multivariate Analysis – contains multiple variables
  • 34. VARIABLE DESCRIPTIVE ANALYSIS Univariate Analysis the simplest form of data analysis where the data being analyzed contains only one variable. Because there is only a single variable, it does not deal with causes or relationships. The main purpose of univariate analysis is to describe the data and find patterns that exist within it
  • 35. VARIABLE DESCRIPTIVE ANALYSIS Univariate Analysis Patterns in univariate data can be described by looking at the mean, mode, median, range, variance, maximum, minimum, quartiles, and standard deviation. Univariate data can be displayed using frequency distribution tables, bar charts, histograms, frequency polygons, and pie charts
  • 36. VARIABLE DESCRIPTIVE ANALYSIS Bivariate Analysis used to find out if there is a relationship between two different variables.
  • 37. VARIABLE DESCRIPTIVE ANALYSIS Bivariate Analysis Something as simple as creating a scatterplot by plotting one variable against another on a Cartesian plane (think X and Y axis) can sometimes give a picture of what the data is trying to indicate. If the data seems to fit a line or curve, then there is a relationship or correlation between the two variables
  • 38. VARIABLE DESCRIPTIVE ANALYSIS Multivariate Analysis the analysis of three or more variables to determine relationship
  • 39. VARIABLE DESCRIPTIVE ANALYSIS Multivariate Analysis • Additive Tree • Canonical Correlation Analysis • Cluster Analysis • Correspondence Analysis / Multiple Correspondence Analysis • Factor Analysis • Generalized Procrustes Analysis • MANOVA • Multidimensional Scaling • Multiple Regression Analysis • Partial Least Square Regression • Principal Component Analysis / Regression / PARAFAC • Redundancy Analysis
  • 40. CHARACTERISTICS OF DATA Frequency Distribution - a tabular representation of a survey data set used to organize and summarize the data. Specifically, it is a list of either qualitative or quantitative values that a variable takes in a data set and the associated number of times each value occurs (frequencies). Measures of Central Tendency - summary statistics that represent the center point or typical value of a dataset. In statistics, the three most common measures of central tendency are the mean, median, and mode. Each of these measures calculates the location of the central point using a different method.
  • 41. FREQUENCY DISTRIBUTION There are four important characteristics of frequency distribution: 1. Measures of central tendency and location (mean, median, mode) 2. Measures of dispersion (range, variance, standard deviation) 3. The extent of symmetry/asymmetry (skewness) 4. The flatness or peakedness (kurtosis)
  • 42. FREQUENCY DISTRIBUTION Frequency distribution tells how frequencies are distributed over values. Frequency distributions are mostly used for summarizing categorical variables
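A frequency distribution for a categorical variable can be tabulated in a few lines. This is a minimal sketch: the survey responses are hypothetical, made up for illustration, and `collections.Counter` is just one convenient way to count values.

```python
from collections import Counter

# Hypothetical survey responses (a nominal variable), made up for illustration.
responses = ["yes", "no", "yes", "undecided", "yes", "no", "yes", "no"]

counts = Counter(responses)
total = len(responses)

# One row per distinct value: (value, absolute frequency, relative frequency),
# ordered from most to least frequent.
table = [(value, freq, freq / total) for value, freq in counts.most_common()]

for value, freq, rel in table:
    print(f"{value:<10} {freq:>3} {rel:>6.1%}")
```

The absolute frequencies sum to the number of observations, and the relative frequencies sum to one.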
  • 43. MEASURES OF CENTRAL TENDENCY There are three main measures of central tendency: 1. Mode 2. Median 3. Mean Each of these measures describes a different indication of the typical or central value in the distribution. The mode is the most commonly occurring value in a distribution
  • 44. MEASURES OF CENTRAL TENDENCY Mode - the value or values in a list of numbers that occur most frequently. Unlike the median and mean, the mode is about the frequency of occurrence. There can be more than one mode or no mode at all; it all depends on the data set itself
  • 45. MEASURES OF CENTRAL TENDENCY Median - the middle number when the values are listed in order from least to greatest; with an even number of values, it is the average of the two middle numbers
  • 46. MEASURES OF CENTRAL TENDENCY Mean – refers to the average. To calculate the mean, add together all of the numbers in your data set. Then divide that sum by the number of addends
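The three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical values chosen so that the three measures differ; this is a sketch, not part of the original slides.

```python
import statistics

scores = [3, 7, 7, 2, 9, 7, 2, 5]  # hypothetical data, made up for illustration

mean_value = statistics.mean(scores)      # sum of the values divided by their count
median_value = statistics.median(scores)  # middle value; averages the two middle
                                          # values when the count is even
modes = statistics.multimode(scores)      # list of every most-frequent value

# A data set can have more than one mode.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean_value, median_value, modes, two_modes)
```

When no value repeats, `multimode` returns every value, which corresponds to the "no mode at all" case described on the slide.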
  • 47. VARIANCE IN DATA The variance (σ²) is a measure of how far each value in the data set is from the mean, that is, of how spread out a data set is. It's useful when creating statistical models since low variance can be a sign that you are over-fitting your data. Here is how it is defined: subtract the mean from each value in the data, which gives a measure of the distance of each value from the mean; square each of these distances; then average the squared distances.
  • 48. VARIANCE IN DATA Calculating Variance of a Sample 1. Write down your sample data set 2. Write down the sample variance formula 3. Calculate the mean of the sample 4. Subtract the mean from each data point 5. Square each result 6. Find the sum of the squared values 7. Divide by n - 1, where n is the number of data points.
  • 49. VARIANCE IN DATA Write down your sample data set X: X1 = 17, X2 = 15, X3 = 23, X4 = 7, X5 = 9, X6 = 13
  • 50. VARIANCE IN DATA Write down the sample variance formula: s² = ∑(xᵢ − x̄)² / (n − 1), where x̄ is the sample mean and n is the number of data points. The variance of a data set tells you how spread out the data points are. The closer the variance is to zero, the more closely the data points are clustered together.
  • 51. VARIANCE IN DATA Calculate the mean of the sample The symbol x̄ or "x-bar" refers to the mean of a sample. Calculate this as you would any mean: add all the data points together, then divide by the number of data points.
  • 52. VARIANCE IN DATA Subtract the mean from each data point Your answers should add up to zero. This follows from the definition of the mean: the negative answers (distances from the mean to smaller numbers) exactly cancel out the positive answers (distances from the mean to larger numbers)
  • 53. VARIANCE IN DATA Square each result Because the deviations always sum to zero, the "average deviation" will always be zero as well, so it says nothing about how spread out the data is. To solve this problem, find the square of each deviation. This makes them all non-negative, so the negative and positive values no longer cancel out to zero
  • 54. VARIANCE IN DATA Find the sum of the squared values ∑ tells you to sum the value of the following term for each value of xᵢ. Because (xᵢ − x̄)² is already calculated, all you need to do is add the results.
  • 55. VARIANCE IN DATA Divide by n - 1, where n is the number of data points
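The steps for the sample variance can be traced with the six values from slide 49. This is a sketch of the arithmetic only; the variable names are illustrative, and the final line cross-checks the result against `statistics.variance` from the Python standard library.

```python
import statistics

sample = [17, 15, 23, 7, 9, 13]  # the six data points from slide 49

n = len(sample)
mean = sum(sample) / n                      # calculate the mean of the sample
deviations = [x - mean for x in sample]     # subtract the mean from each data point
squared = [d ** 2 for d in deviations]      # square each result
sum_of_squares = sum(squared)               # find the sum of the squared values
sample_variance = sum_of_squares / (n - 1)  # divide by n - 1

print("mean:", mean)
print("sum of deviations:", sum(deviations))
print("sum of squares:", sum_of_squares)
print("sample variance:", sample_variance)

# Cross-check against the standard library's sample variance.
print("statistics.variance:", statistics.variance(sample))
```

For this data the mean is 14, the deviations sum to zero, the squared deviations sum to 166, and the sample variance is 166 / 5 = 33.2.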
  • 56. VARIANCE VS. STANDARD DEVIATION Variance is a numerical value that describes the variability of observations around their arithmetic mean. Standard deviation is a measure of the dispersion of observations within a data set. Variance is the average of the squared deviations; the standard deviation is the root mean square deviation, that is, the square root of the variance
  • 57. VARIANCE VS. STANDARD DEVIATION A variance of zero indicates that all of the data values are identical. A high variance indicates that the data points are very spread out from the mean, and from one another. Variance is the average of the squared distances from each point to the mean. Standard deviation is a number used to tell how measurements for a group are spread out from the average (mean), or expected value. A low standard deviation means that most of the numbers are close to the average. A high standard deviation means that the numbers are more spread out.
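The relationship between the two measures can be seen on small data sets. The three lists below are hypothetical measurements made up for illustration, and the sketch uses the sample versions of both statistics (dividing by n - 1), which is an assumption since the slide does not say which version is meant.

```python
import math
import statistics

tight = [9, 10, 10, 11]        # hypothetical values close to their mean
wide = [1, 5, 15, 19]          # hypothetical values far from their mean
constant = [10, 10, 10, 10]    # identical values

for label, data in [("tight", tight), ("wide", wide), ("constant", constant)]:
    variance = statistics.variance(data)  # sample variance (divides by n - 1)
    std_dev = statistics.stdev(data)      # sample standard deviation
    # The standard deviation is the square root of the variance,
    # so it is expressed in the same units as the data itself.
    assert math.isclose(std_dev, math.sqrt(variance))
    print(f"{label:<9} variance={variance:.3f} std_dev={std_dev:.3f}")
```

All three lists have the same mean of 10, yet the variance is zero for the identical values, small for the tightly grouped values, and large for the widely spread values.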
  • 58. RELATIONSHIPS BETWEEN VARIABLES The statistical relationship between two variables is referred to as their correlation. A correlation could be positive, meaning both variables move in the same direction, or negative, meaning that when the value of one variable increases, the value of the other variable decreases
  • 59. ASPECTS OF ASSOCIATION BETWEEN VARIABLES Association between two variables means the values of one variable relate in some way to the values of the other. Association is usually measured by correlation for two continuous variables and by cross tabulation and a Chi-square test for two categorical variables.
  • 60. ASPECTS OF ASSOCIATION BETWEEN VARIABLES Chi Square Test A statistical method for assessing the goodness of fit between observed values and those expected theoretically. It is commonly used for testing relationships between categorical variables. The null hypothesis of the Chi-Square test is that no relationship exists between the categorical variables in the population; they are independent.
  • 61. ASPECTS OF ASSOCIATION BETWEEN VARIABLES Chi Square Test The chi-square statistic is χ²c = ∑ (Oᵢ − Eᵢ)² / Eᵢ. The subscript “c” is the degrees of freedom, “O” is your observed value, and “E” is your expected value
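A chi-square statistic for a test of independence can be computed directly from observed and expected counts. The 2x2 table below is hypothetical, made up for illustration. The expected counts are derived from the row and column totals, which is the usual construction for a test of independence but is not spelled out on the slide.

```python
# Hypothetical 2x2 contingency table (rows: group A / group B, columns: yes / no).
observed = [
    [20, 30],
    [30, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for a test of independence: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print("chi-square:", chi_square, "df:", degrees_of_freedom)
```

Here every expected count is 25, so the statistic is 4.0 with one degree of freedom. That exceeds 3.841, the critical value at the 0.05 level for one degree of freedom, so the null hypothesis of independence would be rejected for this made-up table.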
  • 62. MEASURES OF ASSOCIATION BETWEEN VARIABLES The measures of association refer to a wide variety of coefficients (including bivariate correlation and regression coefficients) that measure the strength and direction of the relationship between variables; these measures of strength, or association, can be described in several ways, depending on the analysis.
  • 63. MEASURES OF ASSOCIATION BETWEEN VARIABLES For measures of association, a value of zero signifies that no relationship exists. In a correlation analysis, if the coefficient (r) has a value of one, it signifies a perfect relationship between the variables of interest. In regression analyses, if the standardized beta weight (β) has a value of one, it also signifies a perfect relationship between the variables of interest.
  • 64. STATISTICAL MEASURES OF RELATIONSHIPS 1. Correlational Coefficient 2. Linear Regression 3. Multiple Regression 4. Discriminant Analysis 5. Factor Analysis
  • 65. STATISTICAL MEASURES OF RELATIONSHIPS Correlational Coefficient the relationship between two or more variables or sets of data. It is expressed in the form of a coefficient with +1.00 indicating a perfect positive correlation; -1.00 indicating a perfect inverse correlation; 0.00 indicating a complete lack of a relationship.
  • 66. STATISTICAL MEASURES OF RELATIONSHIPS Correlational Coefficient • Pearson's Product Moment Coefficient (r) is the most often used and most precise coefficient, and is generally used with continuous variables • Spearman Rank Order Coefficient (ρ) is a form of the Pearson's Product Moment Coefficient that can be used with ordinal or ranked data • Phi Correlation Coefficient is a form of the Pearson's Product Moment Coefficient that can be used with dichotomous variables (e.g. pass/fail, male/female)
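The link between the Pearson and Spearman coefficients can be sketched in plain Python. The function names and the data are assumptions made up for this example: Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the data, with tied values given their average rank.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print("Pearson r:", round(r, 4), "Spearman rho:", round(rho, 4))
```

Because the relationship is perfectly monotonic but curved, the rank-based coefficient equals 1 while Pearson's r is slightly below 1.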
  • 67. STATISTICAL MEASURES OF RELATIONSHIPS Linear Regression the use of correlation coefficients to plot a line illustrating the linear relationship of two variables X and Y. It is based on the slope of the line which is represented by the formula : Y = a + bX where • Y = dependent variable • X = independent variable • b = slope of the line • a = constant or Y intercept Regression is used extensively in making predictions based on finding unknown Y values from known X values
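The constants a and b in Y = a + bX can be estimated by ordinary least squares. This is a minimal sketch: the function name `fit_line` and the data are hypothetical, and the closed-form estimates used are the standard ones (b is the sum of cross-deviations divided by the sum of squared X deviations, and a is the mean of Y minus b times the mean of X).

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + b * X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope of the line
    a = mean_y - b * mean_x  # constant, or Y intercept
    return a, b


# Hypothetical observations, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)

# Predict an unknown Y value from a known X value.
predicted = a + b * 6
print("a:", round(a, 3), "b:", round(b, 3), "prediction at X=6:", round(predicted, 3))
```

For this data the fitted line is Y = 2.2 + 0.6X, so the prediction at X = 6 is 5.8.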
  • 68. STATISTICAL MEASURES OF RELATIONSHIPS Multiple Regression the same as regression except that it attempts to predict Y from two or more independent X variables. The formula for multiple regression is an extension of the linear regression formula: Y = a + b1 X1 + b2 X2 + .... Multiple regression is used extensively in making predictions based on finding unknown Y values from known X values
  • 69. STATISTICAL MEASURES OF RELATIONSHIPS Discriminant Analysis analogous to multiple regression, except that the criterion variable consists of categories (two in the simplest case) rather than a continuous range of values
  • 70. STATISTICAL MEASURES OF RELATIONSHIPS Factor Analysis often used when a large number of correlations have been explored in a given study; it is a means of grouping certain variables into clusters or factors that are moderately to highly correlated with each other
  • 71. ANALYZING DIFFERENCES WITHIN THE DATA 1. T-Test 2. Matched Pairs T-Test 3. Analysis of Variance (ANOVA)
  • 72. ANALYZING DIFFERENCES WITHIN THE DATA T-Test A t-test is used to determine if the scores of two groups differ on a single variable. A t-test is designed to test for differences in mean scores. Note: A t-test is appropriate only when comparing two sets of scores. It is useful in analyzing scores of two groups of participants on a particular variable or in analyzing scores of a single group of participants on two variables.
  • 73. ANALYZING DIFFERENCES WITHIN THE DATA Matched Pairs T-Test This type of t-test could be used to determine if the scores of the same participants in a study differ under different conditions. Note: A matched pairs t-test is appropriate only when looking at paired data, that is, when each score in one condition is matched with a score from the same participant in the other condition
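The two kinds of t-test differ only in what is averaged. The sketch below computes both t statistics in plain Python. The function names and scores are hypothetical, and the independent-groups version uses Welch's form of the standard error, which is an assumption since the slides do not say which variant is intended.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Welch's t statistic for two independent groups."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


def paired_t(first, second):
    """Matched-pairs t statistic: a one-sample t-test on the pairwise differences."""
    differences = [s - f for f, s in zip(first, second)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    return mean_diff / (sd_diff / math.sqrt(len(differences)))


# Hypothetical scores, made up for illustration.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]

# Treating the lists as two unrelated groups of participants:
t_independent = independent_t(before, after)

# Treating them as the same five participants measured twice:
t_paired = paired_t(before, after)

print("independent t:", round(t_independent, 4))
print("paired t:", round(t_paired, 4))
```

With the same numbers, the paired statistic is much larger in magnitude because pairing removes the variation between participants. Turning either statistic into a p-value requires the t distribution with the appropriate degrees of freedom, which this sketch does not compute.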
  • 74. ANALYZING DIFFERENCES WITHIN THE DATA Analysis of Variance (ANOVA) The ANOVA (analysis of variance) is a statistical test which makes a single, overall decision as to whether a significant difference is present among three or more sample means (Levin 484). An ANOVA is similar to a t-test. However, the ANOVA can also test multiple groups to see if they differ on one or more variables. The ANOVA can be used to test between-groups and within-groups differences.
  • 75. ANALYZING DIFFERENCES WITHIN THE DATA Analysis of Variance (ANOVA) One-Way ANOVA: This tests a group or groups to determine if there are differences on a single set of scores. Multivariate ANOVA (MANOVA): This tests a group or groups to determine if there are differences on two or more variables
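A one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic in plain Python. The function name and the three groups of scores are hypothetical, made up for illustration.

```python
import statistics


def one_way_anova_f(*groups):
    """Return the F statistic for a one-way ANOVA over the given groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: how far each value is from its own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical scores for three groups.
group_1 = [4, 5, 6]
group_2 = [6, 7, 8]
group_3 = [8, 9, 10]

f_statistic = one_way_anova_f(group_1, group_2, group_3)
print("F:", f_statistic)
```

For these groups the between-groups mean square is 12 and the within-groups mean square is 1, giving F = 12 with (2, 6) degrees of freedom. A large F suggests that at least one group mean differs from the others.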
  • 76. MULTIVARIATE ANALYSIS Multivariate analysis is used to study more complex sets of data than what univariate analysis methods can handle. This type of analysis is almost always performed with software (e.g. SPSS or SAS), as working with even the smallest of data sets can be overwhelming by hand. Multivariate analysis can reduce the likelihood of Type I errors. Sometimes, univariate analysis is preferred as multivariate techniques can result in difficulty interpreting the results of the test. For example, group differences on a linear combination of dependent variables in MANOVA can be unclear. In addition, multivariate analysis is usually unsuitable for small sets of data.
  • 77. MULTIVARIATE ANALYSIS There are more than 20 different ways to perform multivariate analysis, depending on the type of data and the objectives of the research. For single data sets there are several choices: 1. Additive Trees, Multidimensional Scaling, and Cluster Analysis are appropriate for when the rows and columns in a data table represent the same units and the measure is either a similarity or a distance 2. Principal Component Analysis (PCA) decomposes a data table with correlated measures into a new set of uncorrelated measures 3. Correspondence Analysis is similar to PCA. However, it applies to contingency tables
  • 78. MULTIVARIATE ANALYSIS • Additive Tree • Canonical Correlation Analysis • Cluster Analysis • Correspondence Analysis / Multiple Correspondence Analysis • Factor Analysis • Generalized Procrustes Analysis • Independent Component Analysis • MANOVA • Multidimensional Scaling • Multiple Regression Analysis • Partial Least Square Regression • Principal Component Analysis / Regression / PARAFAC • Redundancy Analysis
  • 79. MULTIVARIATE ANALYSIS Additive Tree a general way to represent clusters of data in a graph. It is used when the data table is composed of rows and columns that represent the same units; the measure must be a distance or a similarity.
  • 80. MULTIVARIATE ANALYSIS Additive Tree A “tree” is a finite, connected graph where any two nodes are connected by one path. The additive tree is a similar technique to cluster analysis. Both techniques have the “leaves” of the tree representing units. Where the additive tree differs is that the distance is graphically represented by the distance of those units on the tree
  • 81. MULTIVARIATE ANALYSIS Additive Tree Cluster Analysis creates the clusters but does not create a graph that represents the results. An additional limitation of hierarchical cluster analysis is that objects in the same cluster must be exactly the same distance from each other, and the distances between clusters must be larger than the “within clusters” distance. Additive trees do not have these limitations
  • 82. MULTIVARIATE ANALYSIS Canonical Correlation Analysis one way to find associations between two data sets. Like the Correlation Coefficient, CCA measures the relationship between variables. Where Canonical Correlation Analysis differs is that it is specifically used to find the relationships between two sets of variables
  • 83. MULTIVARIATE ANALYSIS Canonical Correlation Analysis appropriate to use in the same situations as you might use multiple regression analysis, but when you have multiple intercorrelated outcome variables. CCA is not recommended for small data sets.
  • 84. MULTIVARIATE ANALYSIS Canonical Correlation Analysis The purpose of Canonical Correlation Analysis is to explain the variability within and between sets through identification of several sets of canonical variates. Canonical variates are new variables formed by making a linear combination of two or more variables from the data sets. When running CCA, you choose weights that maximize the correlation between these sets of variates.
  • 85. MULTIVARIATE ANALYSIS Cluster Analysis Clustering in statistics refers to how data is gathered (“clustered”) by factors like: • Age • Household size • Income • Education level
  • 86. MULTIVARIATE ANALYSIS Cluster Analysis Clusters can be based on factors like: Distance-based clustering. Items are sorted based on their proximity (or distance). For example, cancer cases might be clustered together if they are in the same geographic location. Conceptual clustering. Items are grouped by factors that items have in common. For example, cancer clusters could be grouped by “people who work in manufacturing”
  • 87. MULTIVARIATE ANALYSIS Cluster Analysis Clustering Types (continued): Hierarchical Clustering. This is a more complex approach to clustering used in data mining. Basically, each item is given its own cluster. A pair of clusters is joined based on similarities, giving one less cluster. This process is repeated until all items are clustered. The dendrogram is a graph that shows hierarchical clusters. Probabilistic Clustering. Data is clustered using algorithms which connect items using distances or densities. This is usually performed by a computer
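The merge-the-closest-pair procedure described for hierarchical clustering can be sketched in plain Python. Several choices here are assumptions made for illustration: the data are hypothetical one-dimensional ages, the distance between clusters is the smallest distance between any two members (single linkage), and merging stops at a requested number of clusters rather than continuing to a single cluster.

```python
def single_linkage_distance(cluster_a, cluster_b):
    """Smallest distance between any member of one cluster and any member of the other."""
    return min(abs(a - b) for a in cluster_a for b in cluster_b)


def agglomerate(points, target_clusters):
    """Start with one cluster per point; merge the closest pair until target_clusters remain."""
    clusters = [[p] for p in points]
    merges = []  # record of (cluster, cluster, distance) for each merge, in order
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union: one less cluster.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges


ages = [21, 23, 24, 40, 42, 65]  # hypothetical ages, made up for illustration
clusters, merges = agglomerate(ages, target_clusters=3)

for left, right, distance in merges:
    print(f"merged {left} and {right} at distance {distance}")
print("final clusters:", clusters)
```

The recorded merge order and distances are the information a dendrogram displays. Ward's method and other linkage rules differ only in how the distance between two clusters is defined.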
  • 88. MULTIVARIATE ANALYSIS Cluster Analysis Clustering Types (continued): Ward’s method: uses minimum variance in each step to create relatively small, even-sized clusters
  • 89. MULTIVARIATE ANALYSIS Cluster Analysis Clustering Types: Exclusive Clustering. Each item can only belong in a single cluster. It cannot belong in another cluster. Fuzzy Clustering. Data points are assigned a probability of belonging to one or more clusters. Overlapping Clustering. Each item can belong to more than one cluster
  • 90. MULTIVARIATE ANALYSIS Correspondence Analysis/Multiple Correspondence Analysis a descriptive/exploratory technique designed to analyze simple two-way and multi-way tables containing some measure of correspondence between the rows and columns
  • 91. MULTIVARIATE ANALYSIS Factor Analysis a way to take a mass of data and shrink it to a smaller data set that is more manageable and more understandable. It is a way to find hidden patterns, show how those patterns overlap, and show what characteristics are seen in multiple patterns. It is also used to create a set of variables, called dimensions, for similar items in the set
  • 92. MULTIVARIATE ANALYSIS Factor Analysis It can be a very useful tool for complex sets of data involving psychological studies, socioeconomic status and other involved concepts. A “factor” is a set of observed variables that have similar response patterns; they are associated with a hidden variable (called a latent variable) that is not directly measured. Factors are listed according to factor loadings, or how much variation in the data they can explain
  • 93. MULTIVARIATE ANALYSIS Factor Analysis Types: 1. Exploratory Factor Analysis is used if the researcher does not have any idea about the structure of the data or the number of dimensions that exist in a set of variables 2. Confirmatory Factor Analysis is used for verification when there is a specific idea about the data structure or the number of dimensions in a set of variables
  • 94. MULTIVARIATE ANALYSIS Generalized Procrustes Analysis a way to compare two sets of configurations, or shapes. Originally developed to match two solutions from Factor Analysis, the technique was extended to Generalized Procrustes Analysis (GPA) so that more than two shapes could be compared. The shapes are aligned to a target shape or to each other. GPA uses geometric transformations (i.e. isotropic rescaling, reflection, rotation, or translation) of matrices to compare the sets of data
  • 95. MULTIVARIATE ANALYSIS Independent Component Analysis used in statistics and signal processing to express a multivariate function by its hidden factors or subcomponents. These component signals are independent non-Gaussian signals, and the intention is that these independent subcomponents accurately represent the composite signal
  • 96. MULTIVARIATE ANALYSIS Multivariate Analysis of Variance (MANOVA) Analysis of variance (ANOVA) tests for differences between means. MANOVA is just an ANOVA with several dependent variables. It is similar to many other tests and experiments in that its purpose is to find out if the response variable (i.e. your dependent variable) is changed by manipulating the independent variable
  • 97. MULTIVARIATE ANALYSIS Multivariate Analysis of Variance (MANOVA) The test helps to answer many research questions, including: Do changes to the independent variables have statistically significant effects on dependent variables? What are the interactions among dependent variables? What are the interactions among independent variables?
  • 98. MULTIVARIATE ANALYSIS Multidimensional Scaling a visual representation of distances or dissimilarities between sets of objects. “Objects” can be colors, faces, map coordinates, political persuasion, or any kind of real or conceptual stimuli. Objects that are more similar (or have shorter distances) are closer together on the graph than objects that are less similar (or have longer distances). As well as interpreting dissimilarities as distances on a graph, MDS can also serve as a dimension reduction technique for high-dimensional data
  • 99. MULTIVARIATE ANALYSIS Multiple Regression Analysis used to see if there is a statistically significant relationship between sets of variables. It’s used to find trends in those sets of data. Multiple regression analysis is almost the same as simple linear regression. The only difference between simple linear regression and multiple regression is in the number of predictors (“x” variables) used in the regression
  • 100. MULTIVARIATE ANALYSIS Partial Least Square Regression If the data shows a linear relationship between the X and Y variables, the line that best fits this linear relationship needs to be found. That line is called a regression line and has the equation ŷ = a + bx. The least squares regression line is the line that makes the vertical distances from the data points to the regression line as small as possible. It is called “least squares” because the best line of fit is the one that minimizes the sum of the squared errors
  • 101. MULTIVARIATE ANALYSIS Parallel Factor Analysis (PARAFAC) is a generalization of Principal Component Analysis to higher-order arrays. It is useful for exploratory data analysis on very particular sets of data, for example if you have three-way data. Where PARAFAC differs from Principal Component Analysis is that PARAFAC produces unique components
  • 102. MULTIVARIATE ANALYSIS Principal Component Analysis is a tool that has two main purposes: 1. To find variability in a data set and 2. To reduce the dimensions of the data set. Reducing dimensions means that redundancy in the data is eliminated. This can make patterns in the data set clearer. Therefore, Principal Component Analysis is a good tool to use when redundancies are suspected in a data set. Redundancy doesn’t mean that the variables are identical; it means that there is a strong correlation between them
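For two variables, the idea of redundancy can be shown without any linear algebra library. The sketch below uses hypothetical height and weight measurements made up for illustration. It builds the 2x2 covariance matrix and uses the closed-form eigenvalues of a symmetric 2x2 matrix to find how much of the total variance each principal component explains.

```python
import math

# Hypothetical, strongly correlated measurements.
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 84]

n = len(height_cm)
mean_h = sum(height_cm) / n
mean_w = sum(weight_kg) / n

# Centre each variable on its mean.
h = [v - mean_h for v in height_cm]
w = [v - mean_w for v in weight_kg]

# Entries of the sample covariance matrix [[var_h, cov_hw], [cov_hw, var_w]].
var_h = sum(a * a for a in h) / (n - 1)
var_w = sum(b * b for b in w) / (n - 1)
cov_hw = sum(a * b for a, b in zip(h, w)) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (var_h + var_w) / 2
radius = math.sqrt(((var_h - var_w) / 2) ** 2 + cov_hw ** 2)
eigenvalues = [half_trace + radius, half_trace - radius]

# Share of the total variance captured by each principal component.
explained = [ev / sum(eigenvalues) for ev in eigenvalues]
print("explained variance ratios:", [round(e, 4) for e in explained])
```

Because the two variables are strongly correlated, the first component carries almost all of the variance, so the two columns could be summarized by a single new variable with little loss. For more than two variables, the eigenvalues would normally be obtained with a numerical library.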
  • 103. MULTIVARIATE ANALYSIS Principal Component Regression is based on Principal Component Analysis (PCA). It is used when the data set exhibits multicollinearity, meaning that although the least squares estimates are unbiased, their variances are large, so the estimates may be far from the actual values. Principal Component Regression adds some bias to the regression model and reduces the standard error. The first step is the same as in Principal Component Analysis: identify the principal components. Regression is then performed on those components
  • 104. MULTIVARIATE ANALYSIS Redundancy Analysis the constrained version of Principal Components Analysis. Constrained basically means reduction of dimensions. This reduction is what leads to more understandable results. Redundancy Analysis is a way to summarize linear relationships in a set of dependent variables that are influenced by a set of independent variables. • Linear Regression is first applied to represent Y as a function of X. • PCA is then applied to a matrix of the results to provide a visual representation.