SlideShare a Scribd company logo
1 of 11
Download to read offline
Regression
APAM E4990
Modeling Social Data
Jake Hofman
Columbia University
February 24, 2017
Jake Hofman (Columbia University) Regression February 24, 2017 1 / 6
Deļ¬nition
?
Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
Deļ¬nition
Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
Deļ¬nition
ā€œThe primary goal in a regression analysis is to
understand, as far as possible with the available data,
how the conditional distribution of the response varies
across subpopulations determined by the possible
values of the predictor or predictors.ā€
-ā€œApplied Regression Including Computing and Graphicsā€
Cook & Weisberg (1999)
Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
Goals
Describe
Provide a compact summary of outcomes under diļ¬€erent conditions
Predict
Make forecasts for future outcomes or unobserved conditions
Explain
Account for associations between predictors and outcomes
Jake Hofman (Columbia University) Regression February 24, 2017 3 / 6
Goals
Describe
Provide a compact summary of outcomes under diļ¬€erent conditions
Never ā€œfalseā€, but may be wasteful or misleading
Predict
Make forecasts for future outcomes or unobserved conditions
Varying degrees of success, often room for improvement
Explain
Account for associations between predictors and outcomes
Diļ¬ƒcult to establish causality in observational studies
See ā€œRegression Analysis: A Constructive Critiqueā€, Berk (2004)
Jake Hofman (Columbia University) Regression February 24, 2017 3 / 6
Goals
Models should be ļ¬‚exible enough to describe observed phenomena
but simple enough to generalize to future observations
Jake Hofman (Columbia University) Regression February 24, 2017 4 / 6
Examples1
1.2 Setting the Regression Context 3
Should one be especially interested in a comparison of the means, one could
roceed descriptively with a conventional least squares regression analysis as
special case. That is, for each observation i, one could let
Ė†yi = Ī²0 + Ī²1xi, (1.1)
here the response variable y is each applicantā€™s SAT score, x is an indicator
Fig. 1.2. Distribution of SAT scores for Asian applicants.
SAT Scores for Asian Applicants
SAT Score
Frequency
600 800 1000 1200 1400 1600
050100150
of some response y varies across subpopulations determined by the po
values of the predictor or predictorsā€ (Cook and Weisberg, 1999: 27).
is, interest centers on the distribution of the response variable Y conditi
on one or more predictors X.
This deļ¬nition includes a wide variety of elementary procedures e
implemented in R. (See, for example, Maindonald and Braun, 2007: Ch
2.) For example, consider Figures 1.1 and 1.2. The ļ¬rst shows the distrib
of SAT scores for recent applicants to a major university, who self-ide
as ā€œHispanic.ā€ The second shows the distribution of SAT scores for r
applicants to that same university, who self-identify as ā€œAsian.ā€
Fig. 1.1. Distribution of SAT scores for Hispanic applicants.
It is clear that the two distributions diļ¬€er substantially. The Asian
tribution is shifted to the right, leading to a distribution with a higher
(1227 compared to 1072), a smaller standard deviation (170 compared to
and greater skewing. A comparative description of the two histograms
constitutes a proper regression analysis. Using various summary stati
some key features of the two displays are compared and contrasted (
1
ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008)
Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
Examples1
aph more legible.
2e+04 4e+04 6e+04 8e+04 1e+05
8001000120014001600
SAT Score by Household Income
Income Bounded at $100,000
SATScore
Fig. 1.4. SAT scores by family income.1
ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008)
Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
Examples1
6 1 Regression Framework
1234
400 600 800 1000 1200 1400 1600
400 600 800 1000 1200 1400 1600 400 600 800 1000 1200 1400 1600
1234
FreshmanGPA
0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
High School GPA
Fig. 1.5. Freshman GPA on SAT holding high school GPA constant.1
ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008)
Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
Framework
ā€¢ Specify the outcome and predictors, along with the form of
the model relating them
ā€¢ Deļ¬ne a loss function that quantiļ¬es how close a modelā€™s
predictions are to observed outcomes
ā€¢ Develop an algorithm to ļ¬t the model to the observations by
minimizing this loss
ā€¢ Assess model performance and interpret results.
Jake Hofman (Columbia University) Regression February 24, 2017 6 / 6

More Related Content

What's hot

Hierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationHierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationColleen Farrelly
Ā 
Logistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsLogistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsColleen Farrelly
Ā 
1.2 Data Classification
1.2 Data Classification1.2 Data Classification
1.2 Data Classificationleblance
Ā 
Strong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional DataStrong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional Datasahirbhatnagar
Ā 
1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempolaRising Media, Inc.
Ā 
Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Peter Kenny
Ā 
A1 m deon
A1 m deonA1 m deon
A1 m deonedjane_gf
Ā 
Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Debra Wallace
Ā 
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.PptDetecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.Pptbarthriley
Ā 
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Laurens De Vocht
Ā 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysisWansuklangk
Ā 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.DrBarada Mohanty
Ā 
Multicollinearity
MulticollinearityMulticollinearity
MulticollinearityBernard Asia
Ā 
Fitting and understanding Multilevel Models-Andrew Gelman
 Fitting and understanding Multilevel Models-Andrew Gelman  Fitting and understanding Multilevel Models-Andrew Gelman
Fitting and understanding Multilevel Models-Andrew Gelman Deepak Kumar
Ā 
Statistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataStatistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataTianfan Song
Ā 
Introduction to random forest and gradient boosting methods a lecture
Introduction to random forest and gradient boosting methods   a lectureIntroduction to random forest and gradient boosting methods   a lecture
Introduction to random forest and gradient boosting methods a lectureShreyas S K
Ā 
Detecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice ModelDetecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice Modelkosby2000
Ā 

What's hot (20)

Mixed models
Mixed modelsMixed models
Mixed models
Ā 
Hierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationHierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validation
Ā 
Logistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsLogistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerations
Ā 
1.2 Data Classification
1.2 Data Classification1.2 Data Classification
1.2 Data Classification
Ā 
Strong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional DataStrong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional Data
Ā 
1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola
Ā 
Combined queries
Combined queriesCombined queries
Combined queries
Ā 
Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design
Ā 
A1 m deon
A1 m deonA1 m deon
A1 m deon
Ā 
Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1
Ā 
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.PptDetecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Ā 
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Ā 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
Ā 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.
Ā 
Multicollinearity
MulticollinearityMulticollinearity
Multicollinearity
Ā 
Fitting and understanding Multilevel Models-Andrew Gelman
 Fitting and understanding Multilevel Models-Andrew Gelman  Fitting and understanding Multilevel Models-Andrew Gelman
Fitting and understanding Multilevel Models-Andrew Gelman
Ā 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
Ā 
Statistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataStatistical Methods to Handle Missing Data
Statistical Methods to Handle Missing Data
Ā 
Introduction to random forest and gradient boosting methods a lecture
Introduction to random forest and gradient boosting methods   a lectureIntroduction to random forest and gradient boosting methods   a lecture
Introduction to random forest and gradient boosting methods a lecture
Ā 
Detecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice ModelDetecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice Model
Ā 

Viewers also liked

Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in Rjakehofman
Ā 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overviewjakehofman
Ā 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
Ā 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scalejakehofman
Ā 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experimentsjakehofman
Ā 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIjakehofman
Ā 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regressionjakehofman
Ā 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part Ijakehofman
Ā 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classificationjakehofman
Ā 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wranglingjakehofman
Ā 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIjakehofman
Ā 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part Ijakehofman
Ā 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part Ijakehofman
Ā 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIjakehofman
Ā 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Countingjakehofman
Ā 
Chapter 10 be wto_agriculture
Chapter 10 be wto_agricultureChapter 10 be wto_agriculture
Chapter 10 be wto_agricultureHo Cao Viet
Ā 
Capoeira power point- IKT
Capoeira power point- IKTCapoeira power point- IKT
Capoeira power point- IKTlanag
Ā 

Viewers also liked (20)

Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in R
Ā 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overview
Ā 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
Ā 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scale
Ā 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experiments
Ā 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part II
Ā 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regression
Ā 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part I
Ā 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classification
Ā 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wrangling
Ā 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part II
Ā 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part I
Ā 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part I
Ā 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part II
Ā 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Counting
Ā 
Chapter 10 be wto_agriculture
Chapter 10 be wto_agricultureChapter 10 be wto_agriculture
Chapter 10 be wto_agriculture
Ā 
Capoeira power point- IKT
Capoeira power point- IKTCapoeira power point- IKT
Capoeira power point- IKT
Ā 
Death2
Death2Death2
Death2
Ā 
Bx Lite
Bx LiteBx Lite
Bx Lite
Ā 
La ecologia social de hoy
La ecologia social de hoyLa ecologia social de hoy
La ecologia social de hoy
Ā 

Similar to Modeling Social Data, Lecture 6: Regression, Part 1

Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
Ā 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
Ā 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantPRAJAKTASAWANT33
Ā 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
Ā 
Correlation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotCorrelation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotDenzelMontuya1
Ā 
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxAssignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxdeanmtaylor1545
Ā 
Evaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayEvaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayCrystal Alvarez
Ā 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...inventionjournals
Ā 
1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docxjackiewalcutt
Ā 
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxChapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxtiffanyd4
Ā 
Add slides
Add slidesAdd slides
Add slidesRupa D
Ā 
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxRunning head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxhealdkathaleen
Ā 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxdaniahendric
Ā 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxSALU18
Ā 
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Nicha Tatsaneeyapan
Ā 
Correlation.pptx
Correlation.pptxCorrelation.pptx
Correlation.pptxShivakumar B N
Ā 
Ordinal logistic regression
Ordinal logistic regression Ordinal logistic regression
Ordinal logistic regression Dr Athar Khan
Ā 
Applied statistics lecture_6
Applied statistics lecture_6Applied statistics lecture_6
Applied statistics lecture_6Daria Bogdanova
Ā 

Similar to Modeling Social Data, Lecture 6: Regression, Part 1 (20)

Ekonometrika
EkonometrikaEkonometrika
Ekonometrika
Ā 
Module5.slp
Module5.slpModule5.slp
Module5.slp
Ā 
Module5.slp
Module5.slpModule5.slp
Module5.slp
Ā 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta Sawant
Ā 
Module5.slp
Module5.slpModule5.slp
Module5.slp
Ā 
Correlation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotCorrelation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter Plot
Ā 
exercises.pdf
exercises.pdfexercises.pdf
exercises.pdf
Ā 
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxAssignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Ā 
Evaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayEvaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis Essay
Ā 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Ā 
1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx
Ā 
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxChapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Ā 
Add slides
Add slidesAdd slides
Add slides
Ā 
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxRunning head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Ā 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docx
Ā 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docx
Ā 
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Ā 
Correlation.pptx
Correlation.pptxCorrelation.pptx
Correlation.pptx
Ā 
Ordinal logistic regression
Ordinal logistic regression Ordinal logistic regression
Ordinal logistic regression
Ā 
Applied statistics lecture_6
Applied statistics lecture_6Applied statistics lecture_6
Applied statistics lecture_6
Ā 

More from jakehofman

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2jakehofman
Ā 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1jakehofman
Ā 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networksjakehofman
Ā 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classificationjakehofman
Ā 
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation SystemsModeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation Systemsjakehofman
Ā 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayesjakehofman
Ā 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scalejakehofman
Ā 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
Ā 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studiesjakehofman
Ā 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Sciencejakehofman
Ā 
Technical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal WabbitTechnical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal Wabbitjakehofman
Ā 
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10jakehofman
Ā 
Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09jakehofman
Ā 
Using Data to Understand the Brain
Using Data to Understand the BrainUsing Data to Understand the Brain
Using Data to Understand the Brainjakehofman
Ā 

More from jakehofman (14)

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Ā 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Ā 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networks
Ā 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classification
Ā 
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation SystemsModeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation Systems
Ā 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Ā 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scale
Ā 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
Ā 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studies
Ā 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Science
Ā 
Technical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal WabbitTechnical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal Wabbit
Ā 
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Ā 
Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09
Ā 
Using Data to Understand the Brain
Using Data to Understand the BrainUsing Data to Understand the Brain
Using Data to Understand the Brain
Ā 

Recently uploaded

4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
Ā 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
Ā 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
Ā 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
Ā 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
Ā 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
Ā 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
Ā 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
Ā 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
Ā 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
Ā 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
Ā 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
Ā 
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...Nguyen Thanh Tu Collection
Ā 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
Ā 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
Ā 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
Ā 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...SeƔn Kennedy
Ā 

Recently uploaded (20)

4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
Ā 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
Ā 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
Ā 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
Ā 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
Ā 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
Ā 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Ā 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
Ā 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
Ā 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Ā 
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at šŸ”9953056974šŸ”
Ā 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
Ā 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
Ā 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
Ā 
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...
Hį»ŒC Tį»T TIįŗ¾NG ANH 11 THEO CHĘÆĘ NG TRƌNH GLOBAL SUCCESS ĐƁP ƁN CHI TIįŗ¾T - Cįŗ¢ NĂ...
Ā 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
Ā 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Ā 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Ā 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Ā 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
Ā 

Modeling Social Data, Lecture 6: Regression, Part 1

  • 1. Regression APAM E4990 Modeling Social Data Jake Hofman Columbia University February 24, 2017 Jake Hofman (Columbia University) Regression February 24, 2017 1 / 6
  • 2. Deļ¬nition ? Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
  • 3. Deļ¬nition Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
  • 4. Deļ¬nition ā€œThe primary goal in a regression analysis is to understand, as far as possible with the available data, how the conditional distribution of the response varies across subpopulations determined by the possible values of the predictor or predictors.ā€ -ā€œApplied Regression Including Computing and Graphicsā€ Cook & Weisberg (1999) Jake Hofman (Columbia University) Regression February 24, 2017 2 / 6
  • 5. Goals Describe Provide a compact summary of outcomes under diļ¬€erent conditions Predict Make forecasts for future outcomes or unobserved conditions Explain Account for associations between predictors and outcomes Jake Hofman (Columbia University) Regression February 24, 2017 3 / 6
  • 6. Goals Describe Provide a compact summary of outcomes under diļ¬€erent conditions Never ā€œfalseā€, but may be wasteful or misleading Predict Make forecasts for future outcomes or unobserved conditions Varying degrees of success, often room for improvement Explain Account for associations between predictors and outcomes Diļ¬ƒcult to establish causality in observational studies See ā€œRegression Analysis: A Constructive Critiqueā€, Berk (2004) Jake Hofman (Columbia University) Regression February 24, 2017 3 / 6
  • 7. Goals Models should be ļ¬‚exible enough to describe observed phenomena but simple enough to generalize to future observations Jake Hofman (Columbia University) Regression February 24, 2017 4 / 6
  • 8. Examples1 1.2 Setting the Regression Context 3 Should one be especially interested in a comparison of the means, one could roceed descriptively with a conventional least squares regression analysis as special case. That is, for each observation i, one could let Ė†yi = Ī²0 + Ī²1xi, (1.1) here the response variable y is each applicantā€™s SAT score, x is an indicator Fig. 1.2. Distribution of SAT scores for Asian applicants. SAT Scores for Asian Applicants SAT Score Frequency 600 800 1000 1200 1400 1600 050100150 of some response y varies across subpopulations determined by the po values of the predictor or predictorsā€ (Cook and Weisberg, 1999: 27). is, interest centers on the distribution of the response variable Y conditi on one or more predictors X. This deļ¬nition includes a wide variety of elementary procedures e implemented in R. (See, for example, Maindonald and Braun, 2007: Ch 2.) For example, consider Figures 1.1 and 1.2. The ļ¬rst shows the distrib of SAT scores for recent applicants to a major university, who self-ide as ā€œHispanic.ā€ The second shows the distribution of SAT scores for r applicants to that same university, who self-identify as ā€œAsian.ā€ Fig. 1.1. Distribution of SAT scores for Hispanic applicants. It is clear that the two distributions diļ¬€er substantially. The Asian tribution is shifted to the right, leading to a distribution with a higher (1227 compared to 1072), a smaller standard deviation (170 compared to and greater skewing. A comparative description of the two histograms constitutes a proper regression analysis. Using various summary stati some key features of the two displays are compared and contrasted ( 1 ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008) Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
  • 9. Examples1 aph more legible. 2e+04 4e+04 6e+04 8e+04 1e+05 8001000120014001600 SAT Score by Household Income Income Bounded at $100,000 SATScore Fig. 1.4. SAT scores by family income.1 ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008) Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
  • 10. Examples1 6 1 Regression Framework 1234 400 600 800 1000 1200 1400 1600 400 600 800 1000 1200 1400 1600 400 600 800 1000 1200 1400 1600 1234 FreshmanGPA 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 High School GPA Fig. 1.5. Freshman GPA on SAT holding high school GPA constant.1 ā€œStatistical Learning from a Regression Perspectiveā€, Berk (2008) Jake Hofman (Columbia University) Regression February 24, 2017 5 / 6
  • 11. Framework ā€¢ Specify the outcome and predictors, along with the form of the model relating them ā€¢ Deļ¬ne a loss function that quantiļ¬es how close a modelā€™s predictions are to observed outcomes ā€¢ Develop an algorithm to ļ¬t the model to the observations by minimizing this loss ā€¢ Assess model performance and interpret results. Jake Hofman (Columbia University) Regression February 24, 2017 6 / 6