Statistical Analysis


By Rama Krishna Kompella
Relationships Between Variables
• The relationship between variables can be
  explained in various ways such as:
  –   Presence /absence of a relationship
  –   Directionality of the relationship
  –   Strength of association
  –   Type of relationship
Relationships Between Variables
• Presence / absence of a relationship
  – E.g., to study customer satisfaction levels at a
    fast-food restaurant, we first need to know
    whether the quality of food and customer
    satisfaction are related at all
Relationships Between Variables
• Direction of the relationship
  – The direction of a relationship can be either
    positive or negative
  – Food quality perceptions are related positively to
    customer commitment toward a restaurant.
Relationships Between Variables
• Strength of association
  – The strength of a relationship is generally
    categorized as nonexistent, weak, moderate, or strong.
  – E.g., quality of food is strongly associated with
    customer satisfaction in a fast-food restaurant
Relationships Between Variables
• Type of association
  – How can the link between Y and X best be
    described?
  – There are different ways in which two variables
    can share a relationship
     • Linear relationship
     • Curvilinear relationship
Chi-Square (χ²) and Frequency Data
• Today the data that we analyze consists of frequencies; that
  is, the number of individuals falling into categories. In other
  words, the variables are measured on a nominal scale.
• The test statistic for frequency data is Pearson Chi-Square.
  The magnitude of Pearson Chi-Square reflects the amount of
  discrepancy between observed frequencies and expected
  frequencies.
Steps in Test of Hypothesis
1.   Determine the appropriate test
2.   Establish the level of significance: α
3.   Formulate the statistical hypothesis
4.   Calculate the test statistic
5.   Determine the degrees of freedom
6.   Compare computed test statistic against a
     tabled/critical value
1. Determine Appropriate Test
• Chi Square is used when both variables are
  measured on a nominal scale.
• It can be applied to interval or ratio data that
  have been categorized into a small number of
  groups.
• It assumes that the observations are randomly
  sampled from the population.
• All observations are independent (an individual
  can appear only once in a table and there are no
  overlapping categories).
• It does not make any assumptions about the
  shape of the distribution nor about the
  homogeneity of variances.
2. Establish Level of Significance
• α is a predetermined value
• The convention
     • α = .05
     • α = .01
     • α = .001
3. Determine The Hypothesis: Whether There is an Association or Not
• Ho : The two variables are independent
• Ha : The two variables are associated
4. Calculating Test Statistics
• Contrasts observed frequencies in each cell of a
  contingency table with expected frequencies.
• The expected frequencies represent the number of
  cases that would be found in each cell if the null
  hypothesis were true (i.e., the nominal variables are
  unrelated).
• The expected frequency of a cell under independence is
  the product of its row frequency and column frequency
  divided by the total number of cases.

$$F_e = \frac{F_r F_c}{N}$$
4. Calculating Test Statistics

$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$

  – Fo: observed frequencies
  – Fe: expected frequency
5. Determine Degrees of Freedom

      df = (R-1)(C-1)

  – R: number of levels in row variable
  – C: number of levels in column variable
6. Compare computed test statistic against a tabled/critical value
• The computed value of the Pearson chi-square
  statistic is compared with the critical value to
  determine if the computed value is improbable
  under the null hypothesis
• The critical tabled values are based on
  sampling distributions of the Pearson chi-square
  statistic
• If the calculated χ² is greater than the tabled
  (critical) χ² value, reject Ho
Example
• Suppose a researcher is interested in buying
  preferences of environmentally conscious
  consumers.
• A questionnaire was developed and sent to a
  random sample of 90 voters.
• The researcher also collects information about
  the gender of the sample of 90 respondents.
Bivariate Frequency Table or Contingency Table

               Favor   Neutral   Oppose   f row

Male           10      10        30       50

Female         15      15        10       40

f column       25      25        40       n = 90

• Observed frequencies: the counts in the cells
• Row frequency: the totals in the f row column
• Column frequency: the totals in the f column row
1. Determine Appropriate Test

1. Gender (2 levels), nominal
2. Buying Preference (3 levels), nominal
2. Establish Level of Significance

            Alpha of .05
3. Determine The Hypothesis
• Ho : There is no difference between men and
  women in their opinion on pro-environmental
  products.

• Ha : There is an association between gender
  and opinion on pro-environmental products.
4. Calculating Test Statistics

               Favor    Neutral    Oppose     f row

Men            fo =10   fo =10     fo =30     50
               fe =13.9 fe =13.9   fe=22.2
Women          fo =15   fo =15     fo =10     40
               fe =11.1 fe =11.1   fe =17.8
f column       25       25         40         n = 90
4. Calculating Test Statistics
• Each expected frequency is the row frequency times
  the column frequency divided by n:
  – Men, Favor (and Neutral): fe = 50*25/90 = 13.9
  – Women, Favor (and Neutral): fe = 40*25/90 = 11.1
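As a quick check on the table above, the rule fe = (row frequency × column frequency) / n can be applied to every cell at once. The sketch below is illustrative only; the function name `expected_frequencies` is an assumption of this example and does not come from the slides.

```python
# Observed counts from the contingency table
# (rows: Men, Women; columns: Favor, Neutral, Oppose).
observed = [
    [10, 10, 30],
    [15, 15, 10],
]


def expected_frequencies(table):
    """Return Fe = Fr * Fc / N for every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(["Men", "Women"], expected):
    # Rounded to one decimal place, as on the slide.
    print(label, [round(fe, 1) for fe in row])
```

Rounded to one decimal, the values reproduce the fe entries in the table (13.9, 13.9, 22.2 for men and 11.1, 11.1, 17.8 for women).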
4. Calculating Test Statistics

$$\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.2)^2}{22.2} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.8)^2}{17.8} = 11.03$$
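The same sum can be reproduced without rounding the expected frequencies first. This is a minimal sketch in plain Python; the helper name `pearson_chi_square` is hypothetical and chosen only for this illustration.

```python
# Observed counts (rows: Men, Women; columns: Favor, Neutral, Oppose).
observed = [[10, 10, 30], [15, 15, 10]]


def pearson_chi_square(table):
    """Sum (Fo - Fe)^2 / Fe over all cells of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (fo - fe) ** 2 / fe
    return chi2


chi2 = pearson_chi_square(observed)
print(round(chi2, 3))  # 11.025, i.e. 11.03 to two decimals
```

Working with unrounded expected frequencies gives 11.025, which rounds to the 11.03 shown on the slide.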
5. Determine Degrees of Freedom
      df = (R-1)(C-1) = (2-1)(3-1) = 2
6. Compare computed test statistic against a tabled/critical value
•   α = 0.05
•   df = 2
•   Critical tabled value = 5.991
•   Test statistic, 11.03, exceeds critical value
•   Null hypothesis is rejected
•   Men and women differ significantly in their
    opinions on pro-environmental products
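The comparison above can also be checked numerically. The sketch below relies on a special case: with df = 2 the chi-square upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). That shortcut is an assumption that holds only for df = 2; for other degrees of freedom a chi-square table or a statistics package would be needed.

```python
import math

alpha = 0.05
chi2 = 11.03                # test statistic from the worked example
df = (2 - 1) * (3 - 1)      # (R - 1)(C - 1)

if df != 2:
    raise ValueError("the closed form below is valid only for df = 2")

critical = -2.0 * math.log(alpha)   # upper-tail critical value for df = 2
p_value = math.exp(-chi2 / 2.0)     # P(chi-square > chi2) for df = 2
reject_h0 = chi2 > critical

print(round(critical, 3), round(p_value, 3), reject_h0)
```

The critical value comes out at 5.991, matching the tabled value, and the test statistic exceeds it, so Ho is rejected.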
SPSS Output Example

Chi-Square Tests

                               Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square             11.025 (a)   2     .004
Likelihood Ratio               11.365       2     .003
Linear-by-Linear Association   8.722        1     .003
N of Valid Cases               90
  a. 0 cells (.0%) have expected count less than 5. The
     minimum expected count is 11.11.
Additional Information in SPSS Output
• Exceptions that might distort χ2 Assumptions
  – Associations in some but not all categories
  – Low expected frequency per cell
• Extent of association is not the same as statistical
  significance (demonstrated through an example)
Another Example: Heparin Lock Placement

Complication Incidence * Heparin Lock Placement Time Group Crosstabulation
(Time group: 1 = 72 hrs, 2 = 96 hrs; percentages are % within Heparin Lock Placement Time Group)

                                             Placement Time Group
Complication Incidence                       1          2          Total
Had complication      Count                  9          11         20
                      Expected Count         10.0       10.0       20.0
                      % within Time Group    18.0%      22.0%      20.0%
Had no complication   Count                  41         39         80
                      Expected Count         40.0       40.0       80.0
                      % within Time Group    82.0%      78.0%      80.0%
Total                 Count                  50         50         100
                      Expected Count         50.0       50.0       100.0
                      % within Time Group    100.0%     100.0%     100.0%

from Polit Text: Table 8-1
Hypotheses in Heparin Lock Placement


• Ho: There is no association between
  complication incidence and heparin lock
  placement time. (The variables are
  independent).
• Ha: There is an association between
  complication incidence and heparin lock
  placement time. (The variables are related).
More of SPSS Output

Chi-Square Tests

                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             .250 (b)   1    .617
Continuity Correction (a)      .063       1    .803
Likelihood Ratio               .250       1    .617
Fisher's Exact Test                                          .803         .402
Linear-by-Linear Association   .248       1    .619
N of Valid Cases               100
  a. Computed only for a 2x2 table
  b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
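The Pearson and continuity-corrected rows can be reproduced by hand from the crosstabulation. This sketch assumes that the SPSS "Continuity Correction" row is Yates' correction (subtracting 0.5 from each |Fo - Fe| before squaring), which is consistent with the reported .063. The p-values use the fact that a chi-square variable with df = 1 is the square of a standard normal. The function names are hypothetical.

```python
import math

# Observed counts from the crosstabulation
# (rows: had complication, had no complication; columns: 72 hrs, 96 hrs).
observed = [[9, 11], [41, 39]]


def chi_square_2x2(table, continuity=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            diff = abs(fo - fe)
            if continuity:
                diff = max(diff - 0.5, 0.0)  # Yates' continuity correction
            chi2 += diff ** 2 / fe
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with df = 1."""
    return math.erfc(math.sqrt(chi2 / 2.0))


pearson = chi_square_2x2(observed)
corrected = chi_square_2x2(observed, continuity=True)
print(round(pearson, 3), round(p_value_df1(pearson), 3))
print(round(corrected, 4), round(p_value_df1(corrected), 3))
```

The results agree with the output above: a Pearson chi-square of .250 with p = .617, and a corrected value of .0625 (shown as .063) with p = .803.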
Pearson Chi-Square
• Pearson Chi-Square = .250, p = .617
  Since p > .05, we fail to reject the null hypothesis
  that the complication rate is unrelated to heparin
  lock placement time.
• Continuity correction is used in situations in which
  the expected frequency for any cell in a 2 by 2
  table is less than 10.
More SPSS Output

Symmetric Measures

                                              Value   Asymp. Std.   Approx.   Approx.
                                                      Error (a)     T (b)     Sig.
Nominal by Nominal     Phi                    -.050                           .617
                       Cramer's V              .050                           .617
Interval by Interval   Pearson's R            -.050   .100          -.496     .621 (c)
Ordinal by Ordinal     Spearman Correlation   -.050   .100          -.496     .621 (c)
N of Valid Cases                               100
  a. Not assuming the null hypothesis.
  b. Using the asymptotic standard error assuming the null hypothesis.
  c. Based on normal approximation.
Phi Coefficient
• Pearson Chi-Square provides information about the
  existence of a relationship between 2 nominal
  variables, but not about the magnitude of the
  relationship
• Phi coefficient is the measure of the strength
  of the association

$$\phi = \sqrt{\frac{\chi^2}{N}}$$
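For the heparin lock example, phi can be checked in two ways. The square-root formula gives only the magnitude. The signed value reported by SPSS can be recovered for a 2x2 table from the cell counts as (ad - bc) / sqrt(r1 r2 c1 c2), a standard identity that is not shown on the slide and is added here as an assumption of this sketch.

```python
import math

chi2 = 0.250   # Pearson chi-square from the SPSS output
n = 100        # number of valid cases

# Magnitude of phi from the chi-square statistic.
phi_magnitude = math.sqrt(chi2 / n)

# Signed phi from the 2x2 cell counts (a, b / c, d).
a, b = 9, 11    # had complication: 72 hrs, 96 hrs
c, d = 41, 39   # had no complication: 72 hrs, 96 hrs
phi_signed = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(round(phi_magnitude, 3), round(phi_signed, 3))
```

Both routes give a magnitude of .050, and the signed version reproduces the -.050 in the Symmetric Measures table, a very weak association.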
Cramer’s V
• When the table is larger than 2 by 2, a different
  index must be used to measure the strength of the
  relationship between the variables. One such index
  is Cramer’s V.
• If Cramer’s V is large, it means that there is a
  tendency for particular categories of the first
  variable to be associated with particular categories
  of the second variable.

$$V = \sqrt{\frac{\chi^2}{N(k - 1)}}$$

  – N: number of cases
  – k: smallest of number of rows or columns
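Applying the formula to the two worked examples in this deck gives a feel for the scale. This is a small sketch; `cramers_v` is a hypothetical helper name, and the chi-square values are the ones reported in the SPSS output.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (N * (k - 1))), k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Gender by buying preference: 2 x 3 table, chi-square 11.025, N = 90.
v_gender = cramers_v(11.025, 90, 2, 3)

# Complication by heparin lock placement time: 2 x 2 table, chi-square .250, N = 100.
v_heparin = cramers_v(0.250, 100, 2, 2)

print(round(v_gender, 2), round(v_heparin, 2))
```

The gender example gives V of about .35, while the heparin example gives .05, matching the Cramer's V row in the Symmetric Measures table. For a 2 by 2 table, k - 1 = 1, so V equals the magnitude of phi.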
Q & As

Take control of your SAP testing with UiPath Test Suite
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

T10 statistical analysis

  • 1. Statistical Analysis By Rama Krishna Kompella
  • 2. Relationships Between Variables • The relationship between variables can be explained in various ways, such as: – Presence / absence of a relationship – Directionality of the relationship – Strength of association – Type of relationship
  • 3. Relationships Between Variables • Presence / absence of a relationship – E.g., if we want to study customer satisfaction levels at a fast-food restaurant, we need to know whether food quality and customer satisfaction are related at all
  • 4. Relationships Between Variables • Direction of the relationship – The direction of a relationship can be either positive or negative – Food quality perceptions are related positively to customer commitment toward a restaurant.
  • 5. Relationships Between Variables • Strength of association – Associations are generally categorized as nonexistent, weak, moderate, or strong. – Quality of food is strongly associated with customer satisfaction in a fast-food restaurant
  • 6. Relationships Between Variables • Type of association – How can the link between Y and X best be described? – There are different ways in which two variables can share a relationship • Linear relationship • Curvilinear relationship
  • 7. Chi-Square (χ²) and Frequency Data • Today the data that we analyze consist of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale. • The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects the amount of discrepancy between observed frequencies and expected frequencies.
  • 8. Steps in Test of Hypothesis 1. Determine the appropriate test 2. Establish the level of significance: α 3. Formulate the statistical hypothesis 4. Calculate the test statistic 5. Determine the degrees of freedom 6. Compare computed test statistic against a tabled/critical value
  • 9. 1. Determine Appropriate Test • Chi Square is used when both variables are measured on a nominal scale. • It can be applied to interval or ratio data that have been categorized into a small number of groups. • It assumes that the observations are randomly sampled from the population. • All observations are independent (an individual can appear only once in a table and there are no overlapping categories). • It does not make any assumptions about the shape of the distribution nor about the homogeneity of variances.
  • 10. 2. Establish Level of Significance • α is a predetermined value • The convention • α = .05 • α = .01 • α = .001
  • 11. 3. Determine The Hypothesis: Whether There is an Association or Not • Ho : The two variables are independent • Ha : The two variables are associated
  • 12. 4. Calculating Test Statistics • Contrasts observed frequencies in each cell of a contingency table with expected frequencies. • The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true (i.e., the nominal variables are unrelated). • The expected frequency of two unrelated events is the product of the row and column frequencies divided by the number of cases: $F_e = F_r F_c / N$
  • 13. 4. Calculating Test Statistics $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$
  • 14. 4. Calculating Test Statistics $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$, where $F_o$ is the observed frequency and $F_e$ is the expected frequency of a cell
  • 15. 5. Determine Degrees of Freedom $df = (R-1)(C-1)$, where $R$ is the number of levels in the row variable and $C$ is the number of levels in the column variable
  • 16. 6. Compare computed test statistic against a tabled/critical value • The computed value of the Pearson chi-square statistic is compared with the critical value to determine if the computed value is improbable • The critical tabled values are based on sampling distributions of the Pearson chi-square statistic • If the calculated χ² is greater than the tabled χ² value, reject Ho
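Steps 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `chi_square_independence` and the 2×2 counts are invented for the example, and 3.841 is the usual tabled critical value for df = 1 at α = .05.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Step 4a: expected frequency per cell, Fe = Fr * Fc / N
    expected = [[fr * fc / n for fc in col_totals] for fr in row_totals]

    # Step 4b: chi2 = sum over all cells of (Fo - Fe)^2 / Fe
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

    # Step 5: df = (R - 1)(C - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Hypothetical 2x2 counts, invented purely for illustration.
observed = [[20, 30],
            [30, 20]]
chi2, df, expected = chi_square_independence(observed)

# Step 6: compare with the tabled critical value (df = 1, alpha = .05).
CRITICAL_DF1_ALPHA_05 = 3.841
reject_h0 = chi2 > CRITICAL_DF1_ALPHA_05
print(chi2, df, reject_h0)
```

Passing the critical value in by hand mirrors the table lookup described in step 6; a statistics library could supply it instead.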
  • 17. Example • Suppose a researcher is interested in buying preferences of environmentally conscious consumers. • A questionnaire was developed and sent to a random sample of 90 voters. • The researcher also collects information about the gender of the sample of 90 respondents.
  • 18. Bivariate Frequency Table or Contingency Table
                 Favor   Neutral   Oppose   f row
      Male       10      10        30       50
      Female     15      15        10       40
      f column   25      25        40       n = 90
  • 19. Bivariate Frequency Table or Contingency Table (same table; the six cell counts are the observed frequencies)
  • 20. Bivariate Frequency Table or Contingency Table (same table; the "f row" column holds the row frequencies)
  • 21. Bivariate Frequency Table or Contingency Table (same table; the "f column" row holds the column frequencies)
  • 22. 1. Determine Appropriate Test 1. Gender (2 levels), nominal 2. Buying preference (3 levels), nominal
  • 23. 2. Establish Level of Significance Alpha of .05
  • 24. 3. Determine The Hypothesis • Ho : There is no difference between men and women in their opinion on pro-environmental products. • Ha : There is an association between gender and opinion on pro-environmental products.
  • 25. 4. Calculating Test Statistics
                 Favor                Neutral              Oppose               f row
      Men        fo = 10, fe = 13.9   fo = 10, fe = 13.9   fo = 30, fe = 22.2   50
      Women      fo = 15, fe = 11.1   fo = 15, fe = 11.1   fo = 10, fe = 17.8   40
      f column   25                   25                   40                   n = 90
  • 26. 4. Calculating Test Statistics (same table; for Men/Favor, fe = 50*25/90 = 13.9)
  • 27. 4. Calculating Test Statistics (same table; for Women/Favor, fe = 40*25/90 = 11.1)
  • 28. 4. Calculating Test Statistics $\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.2)^2}{22.2} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.8)^2}{17.8} = 11.03$
  • 29. 5. Determine Degrees of Freedom df = (R-1)(C-1) = (2-1)(3-1) = 2
  • 30. 6. Compare computed test statistic against a tabled/critical value • α = 0.05 • df = 2 • Critical tabled value = 5.991 • Test statistic, 11.03, exceeds critical value • Null hypothesis is rejected • Men and women differ significantly in their opinions on pro-environmental products
  • 31. SPSS Output Example: Chi-Square Tests
                                     Value     df   Asymp. Sig. (2-sided)
      Pearson Chi-Square             11.025a   2    .004
      Likelihood Ratio               11.365    2    .003
      Linear-by-Linear Association   8.722     1    .003
      N of Valid Cases               90
      a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 11.11.
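The worked example can be checked with a short Python sketch. It is an illustration rather than the author's procedure: the variable names are assumptions, and the p-value and critical value use the closed form exp(−χ²/2), which holds only for df = 2, so no statistics library is needed.

```python
import math

# Observed counts from the example: rows = Male, Female; columns = Favor, Neutral, Oppose.
observed = [[10, 10, 30],
            [15, 15, 10]]

row_totals = [sum(row) for row in observed]          # 50 and 40
col_totals = [sum(col) for col in zip(*observed)]    # 25, 25 and 40
n = sum(row_totals)                                  # 90

chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n       # Fe = Fr * Fc / N
        chi2 += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 only, the chi-square upper-tail probability is exp(-x / 2),
# so the critical value at level alpha is -2 * ln(alpha).
alpha = 0.05
critical = -2 * math.log(alpha)
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, df = {df}")
print(f"critical value = {critical:.3f}, p = {p_value:.3f}")
print("reject H0" if chi2 > critical else "fail to reject H0")
```

Under these assumptions the sketch agrees with the Pearson Chi-Square row of the SPSS output (11.025, df = 2, significance .004) and with the tabled critical value of 5.991.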
  • 32. Additional Information in SPSS Output • Exceptions that might distort χ² assumptions – Associations in some but not all categories – Low expected frequency per cell • Extent of association is not the same as statistical significance (demonstrated through an example)
  • 33. Another Example: Complication Incidence * Heparin Lock Placement Time Group Crosstabulation (Time: 1 = 72 hrs, 2 = 96 hrs)
                                                                          Heparin Lock Placement Time Group
                                                                          1        2        Total
      Complication Incidence
        Had Compilca      Count                                           9        11       20
                          Expected Count                                  10.0     10.0     20.0
                          % within Heparin Lock Placement Time Group      18.0%    22.0%    20.0%
        Had NO Compilca   Count                                           41       39       80
                          Expected Count                                  40.0     40.0     80.0
                          % within Heparin Lock Placement Time Group      82.0%    78.0%    80.0%
      Total               Count                                           50       50       100
                          Expected Count                                  50.0     50.0     100.0
                          % within Heparin Lock Placement Time Group      100.0%   100.0%   100.0%
      from Polit Text: Table 8-1
  • 34. Hypotheses for the Heparin Lock Example • Ho: There is no association between complication incidence and heparin lock placement time. (The variables are independent). • Ha: There is an association between complication incidence and heparin lock placement time. (The variables are related).
  • 35. More of SPSS Output: Chi-Square Tests
                                     Value   df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
      Pearson Chi-Square             .250b   1    .617
      Continuity Correction(a)       .063    1    .803
      Likelihood Ratio               .250    1    .617
      Fisher's Exact Test                                                 .803                   .402
      Linear-by-Linear Association   .248    1    .619
      N of Valid Cases               100
      a. Computed only for a 2x2 table
      b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
  • 36. Pearson Chi-Square • Pearson Chi-Square = .250, p = .617. Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time. • Continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10. (The slide repeats the Chi-Square Tests table from slide 35.)
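The 2×2 results can be reproduced with a small Python sketch. It is a hedged illustration, not SPSS's own routine: the helper name `p_value_df1` is an assumption, the continuity correction is taken to be the usual Yates form (subtract 0.5 from each |Fo − Fe| before squaring), and the p-value uses the identity P(χ² > x) = erfc(√(x/2)), which is valid only for df = 1.

```python
import math

# Observed counts from the crosstabulation: rows = had complication, no complication;
# columns = placement time group 1 (72 hrs), group 2 (96 hrs).
observed = [[9, 11],
            [41, 39]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
yates = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        pearson += (fo - fe) ** 2 / fe
        # Continuity correction (assumed Yates form): shrink |Fo - Fe| by 0.5.
        yates += (abs(fo - fe) - 0.5) ** 2 / fe


def p_value_df1(x):
    """Upper-tail probability of a chi-square variable with df = 1."""
    return math.erfc(math.sqrt(x / 2))


print(f"Pearson chi2 = {pearson:.4f}, p = {p_value_df1(pearson):.3f}")
print(f"Continuity-corrected chi2 = {yates:.4f}, p = {p_value_df1(yates):.3f}")
```

Under these assumptions the Pearson statistic is .250 with p = .617, and the corrected statistic is .0625 (reported by SPSS as .063) with p = .803, both in line with the output on slide 35.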
  • 37. More SPSS Output: Symmetric Measures
                                                    Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
      Nominal by Nominal     Phi                    -.050                                         .617
                             Cramer's V             .050                                          .617
      Interval by Interval   Pearson's R            -.050   .100                   -.496          .621c
      Ordinal by Ordinal     Spearman Correlation   -.050   .100                   -.496          .621c
      N of Valid Cases                              100
      a. Not assuming the null hypothesis.
      b. Using the asymptotic standard error assuming the null hypothesis.
      c. Based on normal approximation.
  • 38. Phi Coefficient • Pearson Chi-Square provides information about the existence of a relationship between 2 nominal variables, but not about the magnitude of the relationship • Phi coefficient is the measure of the strength of the association: $\phi = \sqrt{\chi^2 / N}$ (The slide repeats the Symmetric Measures table from slide 37.)
  • 39. Cramer’s V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V. • If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable. $V = \sqrt{\frac{\chi^2}{N(k - 1)}}$ (The slide repeats the Symmetric Measures table from slide 37.)
  • 40. Cramer’s V (same slide with the formula annotated) $V = \sqrt{\frac{\chi^2}{N(k - 1)}}$, where $N$ is the number of cases and $k$ is the smallest of the number of rows or columns
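Both strength-of-association measures follow directly from χ² and N. The sketch below is illustrative only: the function names are assumptions, and the inputs are the χ² values and sample sizes from the two SPSS outputs in the slides.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2x2 table: sqrt(chi2 / N). This form gives the magnitude only."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi2 / (N * (k - 1))), with k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Heparin lock example: 2x2 table, chi2 = .250, N = 100.
phi_heparin = phi_coefficient(0.250, 100)
v_heparin = cramers_v(0.250, 100, 2, 2)

# Gender by buying preference example: 2x3 table, chi2 = 11.025, N = 90.
v_gender = cramers_v(11.025, 90, 2, 3)

print(phi_heparin, v_heparin, v_gender)
```

For the 2×2 table both measures come out at .050, the magnitude shown in the Symmetric Measures output; the negative sign SPSS prints for Phi cannot be recovered from χ² alone. For the gender example V is about .35, a noticeably stronger association.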

Editor's Notes

  1. Mean difference between pairs of values