Statistical Analysis


By Rama Krishna Kompella
Relationships Between Variables
• The relationship between variables can be
  explained in various ways such as:
  –   Presence / absence of a relationship
  –   Directionality of the relationship
  –   Strength of association
  –   Type of relationship
Relationships Between Variables
• Presence / absence of a relationship
  – E.g., to study customer satisfaction at a
    fast-food restaurant, we first need to know
    whether food quality and customer satisfaction
    are related at all
Relationships Between Variables
• Direction of the relationship
  – The direction of a relationship can be either
    positive or negative
  – Food quality perceptions are related positively to
    customer commitment toward a restaurant.
Relationships Between Variables
• Strength of association
– Associations are generally categorized as nonexistent,
  weak, moderate, or strong.
– Quality of food is strongly associated with customer
  satisfaction in a fast-food restaurant
Relationships Between Variables
• Type of association
  – How can the link between Y and X best be
    described?
  – There are different ways in which two variables
    can share a relationship
     • Linear relationship
     • Curvilinear relationship
Chi-Square (χ²) and Frequency Data
• Today the data that we analyze consists of frequencies; that
  is, the number of individuals falling into categories. In other
  words, the variables are measured on a nominal scale.
• The test statistic for frequency data is Pearson Chi-Square.
  The magnitude of Pearson Chi-Square reflects the amount of
  discrepancy between observed frequencies and expected
  frequencies.
Steps in Test of Hypothesis
1.   Determine the appropriate test
2.   Establish the level of significance: α
3.   Formulate the statistical hypothesis
4.   Calculate the test statistic
5.   Determine the degrees of freedom
6.   Compare computed test statistic against a
     tabled/critical value
1. Determine Appropriate Test
• Chi Square is used when both variables are
  measured on a nominal scale.
• It can be applied to interval or ratio data that
  have been categorized into a small number of
  groups.
• It assumes that the observations are randomly
  sampled from the population.
• All observations are independent (an individual
  can appear only once in a table and there are no
  overlapping categories).
• It does not make any assumptions about the
  shape of the distribution nor about the
  homogeneity of variances.
2. Establish Level of Significance
• α is a predetermined value
• The convention
     • α = .05
     • α = .01
     • α = .001
3. Determine The Hypothesis: Whether There is an Association or Not
• Ho : The two variables are independent
• Ha : The two variables are associated
4. Calculating Test Statistics
• Contrasts observed frequencies in each cell of a
  contingency table with expected frequencies.
• The expected frequencies represent the number of
  cases that would be found in each cell if the null
  hypothesis were true (i.e., the nominal variables are
  unrelated).
• The expected frequency of a cell for two unrelated
  variables is the product of its row and column
  frequencies divided by the number of cases.
            $$ F_e = \frac{F_r F_c}{N} $$
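The expected-frequency formula follows from the definition of independence. A short derivation, where p_r and p_c are shorthand introduced here (not on the slides) for the row and column proportions:

```latex
\begin{aligned}
p_r &= \frac{F_r}{N}, \qquad p_c = \frac{F_c}{N} \\
P(\text{row} \cap \text{column}) &= p_r \, p_c
  \quad \text{(if the two variables are unrelated)} \\
F_e &= N \, p_r \, p_c = N \cdot \frac{F_r}{N} \cdot \frac{F_c}{N} = \frac{F_r F_c}{N}
\end{aligned}
```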
4. Calculating Test Statistics

$$ \chi^2 = \sum \frac{(F_o - F_e)^2}{F_e} $$

• F_o: observed frequencies
• F_e: expected frequency
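As a minimal sketch of the summation, the statistic can be computed cell by cell. The function name and the 2x2 counts below are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
def pearson_chi_square(observed, expected):
    """Sum (Fo - Fe)^2 / Fe over every cell of two equally shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for fo, fe in zip(obs_row, exp_row):
            total += (fo - fe) ** 2 / fe
    return total


# Hypothetical 2x2 counts (not from the slides): every cell is off by 5
# from an expected count of 25, so each cell contributes 25 / 25 = 1.
observed = [[20, 30], [30, 20]]
expected = [[25.0, 25.0], [25.0, 25.0]]
print(pearson_chi_square(observed, expected))  # 4.0
```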
5. Determine Degrees of Freedom

      df = (R-1)(C-1)

• R = number of levels in row variable
• C = number of levels in column variable
6. Compare computed test statistic against a tabled/critical value
• The computed value of the Pearson chi-square
  statistic is compared with the critical value to
  determine whether the computed value is
  improbable under the null hypothesis
• The critical tabled values are based on the
  sampling distribution of the Pearson chi-square
  statistic
• If the calculated χ² is greater than the tabled χ²
  value, reject Ho
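Instead of looking up a printed table, the same decision can be made from the upper-tail probability of the statistic. The sketch below uses only the Python standard library and the standard closed-form tail expressions for integer degrees of freedom. The function names are hypothetical, and the slides themselves rely on a tabled value rather than code.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square variable."""
    if statistic < 0 or df < 1:
        raise ValueError("statistic must be >= 0 and df must be >= 1")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * statistic / math.pi) * math.exp(-half)
    series = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        series += term
        term *= statistic / (2 * k + 1)
    return tail + series


def reject_null(statistic, df, alpha=0.05):
    """Same decision as 'statistic exceeds the tabled critical value'."""
    return chi_square_p_value(statistic, df) < alpha


# 5.991 is the tabled critical value for df = 2 at alpha = .05,
# so its tail probability should come out at about .05.
print(round(chi_square_p_value(5.991, 2), 3))
```

Rejecting when the p-value is below α is equivalent to rejecting when the statistic exceeds the critical value, because the tail probability decreases as the statistic grows.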
Example
• Suppose a researcher is interested in the buying
  preferences of environmentally conscious
  consumers.
• A questionnaire was developed and sent to a
  random sample of 90 voters.
• The researcher also collects information about
  the gender of the sample of 90 respondents.
Bivariate Frequency Table or Contingency Table

               Favor   Neutral   Oppose   f row

Male           10      10        30       50

Female         15      15        10       40

f column       25      25        40       n = 90

• Cell entries: observed frequencies
• f row: row frequency
• f column: column frequency
1. Determine Appropriate Test

1. Gender (2 levels), nominal
2. Buying preference (3 levels), nominal
2. Establish Level of Significance

            Alpha of .05
3. Determine The Hypothesis
• Ho : There is no difference between men and
  women in their opinion on pro-environmental
  products.

• Ha : There is an association between gender
  and opinion on pro-environmental products.
4. Calculating Test Statistics

               Favor      Neutral    Oppose     f row

Men            fo = 10    fo = 10    fo = 30    50
               fe = 13.9  fe = 13.9  fe = 22.2
Women          fo = 15    fo = 15    fo = 10    40
               fe = 11.1  fe = 11.1  fe = 17.8
f column       25         25         40         n = 90

• Men, Favor and Neutral: fe = 50*25/90 = 13.9
• Women, Favor and Neutral: fe = 40*25/90 = 11.1
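The expected frequencies in the table can be reproduced from the row and column totals alone. This is a small sketch using the counts from the contingency table. The variable names are illustrative and not part of the slides.

```python
observed = [
    [10, 10, 30],  # Men:   Favor, Neutral, Oppose
    [15, 15, 10],  # Women: Favor, Neutral, Oppose
]

row_totals = [sum(row) for row in observed]        # f row
col_totals = [sum(col) for col in zip(*observed)]  # f column
n = sum(row_totals)                                # number of cases

# Fe = (row frequency * column frequency) / N for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(("Men", "Women"), expected):
    print(label, [round(fe, 2) for fe in row])
# Men [13.89, 13.89, 22.22]
# Women [11.11, 11.11, 17.78]
```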
4. Calculating Test Statistics

$$
\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.2)^2}{22.2}
       + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.8)^2}{17.8}
       = 11.03
$$
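A quick check of this arithmetic, sketched with unrounded expected frequencies, also shows how much each cell contributes to the total. The variable names are illustrative. Without rounding the total comes out at 11.025, which rounds to the 11.03 above.

```python
observed = [[10, 10, 30], [15, 15, 10]]  # rows: Men, Women
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# (Fo - Fe)^2 / Fe for every cell, with Fe = row total * column total / N.
contributions = [
    [(fo - r * c / n) ** 2 / (r * c / n) for fo, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
chi_square = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(x, 3) for x in row])
# [1.089, 1.089, 2.722]
# [1.361, 1.361, 3.403]
print(round(chi_square, 3))  # 11.025
```

The "Oppose" column accounts for more than half of the statistic, which is where men and women differ most in the table.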
5. Determine Degrees of Freedom
      df = (R-1)(C-1) = (2-1)(3-1) = 2
6. Compare computed test statistic against a tabled/critical value
•   α = 0.05
•   df = 2
•   Critical tabled value = 5.991
•   Test statistic, 11.03, exceeds critical value
•   Null hypothesis is rejected
•   Men and women differ significantly in their
    opinions on pro-environmental products
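For df = 2 the chi-square upper tail has a simple closed form. This is a standard property of the distribution and is not stated on the slides. It reproduces the tabled critical value and gives the p-value of the test statistic directly:

```latex
\begin{aligned}
P(\chi^2_{2} \ge x) &= e^{-x/2} \\
\text{critical value at } \alpha = 0.05: \quad x &= -2 \ln(0.05) \approx 5.991 \\
\text{p-value of the test statistic:} \quad e^{-11.03/2} &\approx 0.004
\end{aligned}
```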
SPSS Output Example

                     Chi-Square Tests

                                                  Asymp. Sig.
                                Value       df    (2-sided)
Pearson Chi-Square              11.025 (a)   2      .004
Likelihood Ratio                11.365       2      .003
Linear-by-Linear Association     8.722       1      .003
N of Valid Cases                    90
  a. 0 cells (.0%) have expected count less than 5. The
     minimum expected count is 11.11.
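The "Likelihood Ratio" row is presumably the likelihood-ratio chi-square, usually written G². The slides do not define it, so the formula below is the standard definition rather than something taken from the deck. Under that assumption it reproduces the value in the output for the gender by opinion table.

```python
import math

observed = [[10, 10, 30], [15, 15, 10]]  # rows: Men, Women
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# G^2 = 2 * sum of Fo * ln(Fo / Fe); cells with Fo = 0 are skipped
# because they contribute nothing to the sum.
g_squared = 2.0 * sum(
    fo * math.log(fo / (r * c / n))
    for row, r in zip(observed, row_totals)
    for fo, c in zip(row, col_totals)
    if fo > 0
)
print(round(g_squared, 3))  # 11.365
```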
Additional Information in SPSS Output
• Exceptions that might distort χ² assumptions
  – Associations in some but not all categories
  – Low expected frequency per cell
• Extent of association is not the same as statistical
  significance (demonstrated through an example)
Another Example: Heparin Lock Placement

Complication Incidence * Heparin Lock Placement Time Group Crosstabulation

                                                        Heparin Lock Placement
                                                        Time Group
                                                        1          2          Total
Complication  Had Compilca     Count                     9         11          20
Incidence                      Expected Count           10.0       10.0        20.0
                               % within Heparin Lock    18.0%      22.0%       20.0%
                               Placement Time Group
              Had NO Compilca  Count                    41         39          80
                               Expected Count           40.0       40.0        80.0
                               % within Heparin Lock    82.0%      78.0%       80.0%
                               Placement Time Group
Total                          Count                    50         50         100
                               Expected Count           50.0       50.0       100.0
                               % within Heparin Lock   100.0%     100.0%      100.0%
                               Placement Time Group

Time: 1 = 72 hrs, 2 = 96 hrs

from Polit Text: Table 8-1
Hypotheses in Heparin Lock Placement


• Ho: There is no association between
  complication incidence and heparin lock
  placement time. (The variables are
  independent).
• Ha: There is an association between
  complication incidence and heparin lock
  placement time. (The variables are related).
More of SPSS Output



                                     Chi-Square Tests

                                                   Asymp. Sig.   Exact Sig.   Exact Sig.
                               Value       df      (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              .250 (b)    1        .617
Continuity Correction (a)       .063        1        .803
Likelihood Ratio                .250        1        .617
Fisher's Exact Test                                                 .803         .402
Linear-by-Linear Association    .248        1        .619
N of Valid Cases                 100
  a. Computed only for a 2x2 table
  b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
Pearson Chi-Square
• Pearson Chi-Square = .250, p = .617
  Since p > .05, we fail to reject the null
  hypothesis that the complication rate is
  unrelated to heparin lock placement time.
• Continuity correction is used in situations in
  which the expected frequency for any cell in a
  2 by 2 table is less than 10.
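The slides do not say which continuity correction is meant. The sketch below assumes Yates' correction, the usual choice for 2 by 2 tables, which subtracts 0.5 from each |Fo − Fe| before squaring. The function name is hypothetical. With the heparin lock counts it gives 0.0625, consistent with the .063 reported for the continuity correction.

```python
def chi_square_2x2(table, yates=False):
    """Pearson chi-square for a table, optionally with Yates' 0.5 correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    shift = 0.5 if yates else 0.0
    total = 0.0
    for row, r in zip(table, row_totals):
        for fo, c in zip(row, col_totals):
            fe = r * c / n
            # Shrink each deviation toward zero, never past it.
            total += max(abs(fo - fe) - shift, 0.0) ** 2 / fe
    return total


# Rows: had complication / had no complication; columns: 72 hrs, 96 hrs.
heparin = [[9, 11], [41, 39]]
print(round(chi_square_2x2(heparin), 3))              # 0.25
print(round(chi_square_2x2(heparin, yates=True), 4))  # 0.0625
```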
More SPSS Output



                                    Symmetric Measures

                                                           Asymp.
                                               Value      Std. Error (a)  Approx. T (b)  Approx. Sig.
Nominal by Nominal     Phi                      -.050                                      .617
                       Cramer's V                .050                                      .617
Interval by Interval   Pearson's R              -.050         .100           -.496         .621 (c)
Ordinal by Ordinal     Spearman Correlation     -.050         .100           -.496         .621 (c)
N of Valid Cases                                  100
  a. Not assuming the null hypothesis.
  b. Using the asymptotic standard error assuming the null hypothesis.
  c. Based on normal approximation.
Phi Coefficient
• Pearson Chi-Square provides information about the
  existence of a relationship between 2 nominal
  variables, but not about the magnitude of the
  relationship
• Phi coefficient is the measure of the strength of
  the association

$$ \phi = \sqrt{\frac{\chi^2}{N}} $$
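A minimal sketch of the phi calculation for the heparin lock example follows. The function name is illustrative. The square-root formula yields only the magnitude, while the SPSS output reports a signed value (-.050) for the 2 by 2 table.

```python
import math


def phi_coefficient(chi_square, n):
    """Magnitude of phi for a 2x2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi_square / n)


# Heparin lock example: Pearson chi-square = .250, N = 100.
print(round(phi_coefficient(0.250, 100), 3))  # 0.05
```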
Cramer’s V
• When the table is larger than 2 by 2, a different
  index must be used to measure the strength of the
  relationship between the variables. One such index
  is Cramer’s V.
• If Cramer’s V is large, it means that there is a
  tendency for particular categories of the first
  variable to be associated with particular
  categories of the second variable.

$$ V = \sqrt{\frac{\chi^2}{N (k - 1)}} $$

• N = number of cases
• k = smallest of number of rows or columns
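As a sketch, Cramer's V can be computed for both examples in the deck. The function name and argument names are illustrative. For the 2 by 3 gender by opinion table k − 1 = 1, and for a 2 by 2 table V reduces to the magnitude of phi.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * (k - 1))), k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))


# Gender by opinion example: chi-square = 11.025, N = 90, 2 rows, 3 columns.
print(round(cramers_v(11.025, 90, 2, 3), 2))  # 0.35
# Heparin lock example (2x2): chi-square = .250, N = 100.
print(round(cramers_v(0.250, 100, 2, 2), 3))  # 0.05
```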
Q & As

T10 statisitical analysis

  • 1. Statistical Analysis By Rama Krishna Kompella
  • 2. Relationships Between Variables • The relationship between variables can be explained in various ways such as: – Presence /absence of a relationship – Directionality of the relationship – Strength of association – Type of relationship
  • 3. Relationships Between Variables • Presence / absence of a relationship – E.g., if we are interested to study the customer satisfaction levels of a fast-food restaurant, then we need to know if the quality of food and customer satisfaction have any relationship or not
  • 4. Relationships Between Variables • Direction of the relationship – The direction of a relationship can be either positive or negative – Food quality perceptions are related positively to customer commitment toward a restaurant.
  • 5. Relationships Between Variables • Strength of association – They are generally categorized as nonexistent, weak, moderate, or strong. – Quality of food is strongly associated with customer satisfaction in a fast-food restaurant
  • 6. Relationships Between Variables • Type of association – How can the link between Y and X best be described? – There are different ways in which two variables can share a relationship • Linear relationship • Curvilinear relationship
  • 7. Chi-Square (χ2) and Frequency Data • Today the data that we analyze consists of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale. • The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects the amount of discrepancy between observed frequencies and expected frequencies.
  • 8. Steps in Test of Hypothesis 1. Determine the appropriate test 2. Establish the level of significance: α 3. Formulate the statistical hypothesis 4. Calculate the test statistic 5. Determine the degrees of freedom 6. Compare computed test statistic against a tabled/critical value
  • 9. 1. Determine Appropriate Test • Chi Square is used when both variables are measured on a nominal scale. • It can be applied to interval or ratio data that have been categorized into a small number of groups. • It assumes that the observations are randomly sampled from the population. • All observations are independent (an individual can appear only once in a table and there are no overlapping categories). • It does not make any assumptions about the shape of the distribution nor about the homogeneity of variances.
  • 10. 2. Establish Level of Significance • α is a predetermined value • The convention • α = .05 • α = .01 • α = .001
  • 11. 3. Determine The Hypothesis: Whether There is an Association or Not • Ho : The two variables are independent • Ha : The two variables are associated
  • 12. 4. Calculating Test Statistics • Contrasts observed frequencies in each cell of a contingency table with expected frequencies. • The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true (i.e. the nominal variables are unrelated). • The expected frequency of a cell for two unrelated variables is the product of its row and column frequencies divided by the number of cases: $F_e = F_r F_c / N$
  • 13. 4. Calculating Test Statistics $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$
  • 14. 4. Calculating Test Statistics $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$, where $F_o$ is the observed frequency and $F_e$ is the expected frequency of each cell
  • 15. 5. Determine Degrees of Freedom $df = (R-1)(C-1)$, where R is the number of levels in the row variable and C is the number of levels in the column variable
  • 16. 6. Compare computed test statistic against a tabled/critical value • The computed value of the Pearson chi-square statistic is compared with the critical value to determine if the computed value is improbable • The critical tabled values are based on sampling distributions of the Pearson chi-square statistic • If the calculated χ² is greater than the tabled χ² value, reject Ho
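
The calculation steps above (expected frequencies, test statistic, degrees of freedom) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the slides: the function names `expected_frequencies` and `pearson_chi_square` are made up for this sketch, and the contingency table is assumed to be a list of rows of observed counts with no empty row or column.

```python
def expected_frequencies(observed):
    """Expected count per cell under independence: Fe = Fr * Fc / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


def pearson_chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_frequencies(observed)
    # Sum (Fo - Fe)^2 / Fe over every cell of the table.
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    # df = (R - 1)(C - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df
```

The returned statistic is then compared with the tabled critical value for the chosen α and the returned degrees of freedom.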
  • 17. Example • Suppose a researcher is interested in buying preferences of environmentally conscious consumers. • A questionnaire was developed and sent to a random sample of 90 voters. • The researcher also collects information about the gender of the sample of 90 respondents.
  • 18. Bivariate Frequency Table or Contingency Table

                 Favor   Neutral   Oppose   f row
      Male         10      10        30       50
      Female       15      15        10       40
      f column     25      25        40     n = 90

  • 19. Bivariate Frequency Table or Contingency Table (same table as slide 18) • The six cell counts are the observed frequencies
  • 20. Bivariate Frequency Table or Contingency Table (same table as slide 18) • The f row totals (50 and 40) are the row frequencies
  • 21. Bivariate Frequency Table or Contingency Table (same table as slide 18) • The f column totals (25, 25 and 40) are the column frequencies
  • 22. 1. Determine Appropriate Test 1. Gender (2 levels), nominal 2. Buying preference (3 levels), nominal
  • 23. 2. Establish Level of Significance Alpha of .05
  • 24. 3. Determine The Hypothesis • Ho : There is no difference between men and women in their opinion on pro-environmental products. • Ha : There is an association between gender and opinion on pro-environmental products.
  • 25. 4. Calculating Test Statistics

                  Favor        Neutral      Oppose       f row
      Men         fo = 10      fo = 10      fo = 30      50
                  fe = 13.9    fe = 13.9    fe = 22.2
      Women       fo = 15      fo = 15      fo = 10      40
                  fe = 11.1    fe = 11.1    fe = 17.8
      f column    25           25           40           n = 90

  • 26. 4. Calculating Test Statistics (same table as slide 25) • fe for Men, Favor = 50*25/90 = 13.9
  • 27. 4. Calculating Test Statistics (same table as slide 25) • fe for Women, Favor = 40*25/90 = 11.1
  • 28. 4. Calculating Test Statistics $\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.2)^2}{22.2} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.8)^2}{17.8} = 11.03$
  • 29. 5. Determine Degrees of Freedom df = (R-1)(C-1) = (2-1)(3-1) = 2
  • 30. 6. Compare computed test statistic against a tabled/critical value • α = 0.05 • df = 2 • Critical tabled value = 5.991 • Test statistic, 11.03, exceeds critical value • Null hypothesis is rejected • Men and women differ significantly in their opinions on pro-environmental products
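
The worked example can be reproduced with a short script. This is a sketch under stated assumptions rather than the author's procedure: the variable names are illustrative, the critical value 5.991 is taken from the slide, and the p-value line relies on the fact that for df = 2 (and only for df = 2) the upper-tail probability of the chi-square distribution has the closed form exp(-x/2).

```python
import math

# Observed frequencies from the contingency table (columns: favor, neutral, oppose).
observed = [[10, 10, 30],   # men
            [15, 15, 10]]   # women

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n   # expected frequency Fe = Fr * Fc / N
        chi2 += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.991           # tabled value for alpha = .05, df = 2 (from the slide)
reject_h0 = chi2 > critical_value

# Closed form valid for df = 2 only: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}, p = {p_value:.3f}")
```

With unrounded expected frequencies the statistic comes out at about 11.03, in line with the hand calculation above.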
  • 31. SPSS Output Example

      Chi-Square Tests
                                       Value        df   Asymp. Sig. (2-sided)
      Pearson Chi-Square               11.025 (a)   2    .004
      Likelihood Ratio                 11.365       2    .003
      Linear-by-Linear Association     8.722        1    .003
      N of Valid Cases                 90

      a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 11.11.
  • 32. Additional Information in SPSS Output • Exceptions that might distort χ² assumptions – Associations in some but not all categories – Low expected frequency per cell • Extent of association is not the same as statistical significance (demonstrated through an example)
  • 33. Another Example (from Polit Text: Table 8-1)

      Complication Incidence * Heparin Lock Placement Time Group Crosstabulation
      Heparin Lock Placement Time Group: 1 = 72 hrs, 2 = 96 hrs

                                                                          1        2        Total
      Had Compilca      Count                                             9        11       20
                        Expected Count                                    10.0     10.0     20.0
                        % within Heparin Lock Placement Time Group        18.0%    22.0%    20.0%
      Had NO Compilca   Count                                             41       39       80
                        Expected Count                                    40.0     40.0     80.0
                        % within Heparin Lock Placement Time Group        82.0%    78.0%    80.0%
      Total             Count                                             50       50       100
                        Expected Count                                    50.0     50.0     100.0
                        % within Heparin Lock Placement Time Group        100.0%   100.0%   100.0%
  • 34. Hypotheses in Heparin Lock Placement • Ho: There is no association between complication incidence and heparin lock placement time group. (The variables are independent.) • Ha: There is an association between complication incidence and heparin lock placement time group. (The variables are related.)
  • 35. More of SPSS Output

      Chi-Square Tests
                                       Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                       (2-sided)     (2-sided)    (1-sided)
      Pearson Chi-Square               .250 (b)   1    .617
      Continuity Correction (a)        .063       1    .803
      Likelihood Ratio                 .250       1    .617
      Fisher's Exact Test                                            .803         .402
      Linear-by-Linear Association     .248       1    .619
      N of Valid Cases                 100

      a. Computed only for a 2x2 table
      b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
  • 36. Pearson Chi-Square • Pearson Chi-Square = .250, p = .617. Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time. • Continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10.
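
The Pearson and continuity-corrected values in the SPSS output can be checked by hand. The sketch below is illustrative only: it assumes the continuity correction is the usual Yates form, which subtracts 0.5 from each |Fo - Fe| before squaring, and it uses the df = 1 identity that the chi-square upper-tail probability equals erfc(sqrt(x/2)). The variable names are not from the slides.

```python
import math

# Observed counts from the crosstabulation (columns: 72 hrs group, 96 hrs group).
observed = [[9, 11],     # had complications
            [41, 39]]    # had no complications

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
yates = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        pearson += (fo - fe) ** 2 / fe
        # Yates continuity correction (assumed form): shrink |Fo - Fe| by 0.5.
        yates += (abs(fo - fe) - 0.5) ** 2 / fe


def p_value_df1(x):
    """Upper-tail chi-square probability, valid for df = 1 only."""
    return math.erfc(math.sqrt(x / 2))


print(f"Pearson chi2 = {pearson:.3f}, p = {p_value_df1(pearson):.3f}")
print(f"Corrected chi2 = {yates:.3f}, p = {p_value_df1(yates):.3f}")
```

Both p-values are well above .05, so the conclusion (fail to reject Ho) is the same with or without the correction.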
  • 37. More SPSS Output

      Symmetric Measures
                                                      Value    Asymp. Std. Error (a)   Approx. T (b)   Approx. Sig.
      Nominal by Nominal     Phi                      -.050                                            .617
                             Cramer's V               .050                                             .617
      Interval by Interval   Pearson's R              -.050    .100                    -.496           .621 (c)
      Ordinal by Ordinal     Spearman Correlation     -.050    .100                    -.496           .621 (c)
      N of Valid Cases                                100

      a. Not assuming the null hypothesis.
      b. Using the asymptotic standard error assuming the null hypothesis.
      c. Based on normal approximation.
  • 38. Phi Coefficient • Pearson Chi-Square provides information about the existence of a relationship between 2 nominal variables, but not about the magnitude of the relationship • Phi coefficient is the measure of the strength of the association: $\phi = \sqrt{\chi^2 / N}$
  • 39. Cramer’s V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V. • If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable. $V = \sqrt{\frac{\chi^2}{N(k-1)}}$
  • 40. Cramer’s V (same content as slide 39) • In $V = \sqrt{\frac{\chi^2}{N(k-1)}}$, N is the number of cases and k is the smallest of the number of rows or columns
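
Both strength-of-association measures follow directly from the chi-square statistic. The helper names below (`phi_coefficient`, `cramers_v`) are hypothetical and chosen for this sketch; the inputs are the χ² and N values reported in the two examples of this deck.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2 by 2 table: sqrt(chi2 / N). Returns the magnitude only."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi2 / (N * (k - 1))), k = smaller of rows and columns."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Heparin lock example: 2 by 2 table, chi2 = .250, N = 100.
phi = phi_coefficient(0.250, 100)

# Gender by buying preference example: 2 by 3 table, chi2 = 11.025, N = 90.
v = cramers_v(11.025, 90, n_rows=2, n_cols=3)

print(f"phi = {phi:.3f}, Cramer's V = {v:.3f}")
```

The square root gives only the magnitude, which is why it matches the absolute value of the signed phi (-.050) in the SPSS output. For a 2 by 2 table k - 1 = 1, so Cramer’s V reduces to phi.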

Editor's notes

  1. Mean difference between pairs of values