
# Interpreting Multiple Regression via an Ellipse Inscribed in a Square Extensible to any Finite Dimensionality

2019-09-14, Toshiyuki Shimono (下野寿之). Meiji University MIMS joint research meeting "Data-driven Mathematical Sciences: Econophysics and Related Topics".


### Transcript

1. Interpreting Multiple Regression via an Ellipse Inscribed in a Square Extensible to any Finite Dimensionality. 2019-09-14, Toshiyuki Shimono (下野寿之), Meiji University MIMS joint research meeting "Data-driven Mathematical Sciences: Econophysics and Related Topics".
2. Main Claims. Multiple regression can be interpreted via a Euclidean ellipse/ellipsoid/hyper-ellipsoid: (1) the multiple corr. coeff. is a ratio of lengths of line segments; (2) each regression coeff. is read off a linear scalar field; (3) each partial corr. coeff. is read off a measure inside the ellipse. These results make multiple regression easy to understand, both in (1) its numerical results and (2) how it is calculated, and may help in addressing multicollinearity and other issues.
3. Variables and Geometric Shapes.
   - ai : the regression coefficients; b : the intercept term of the regression; c : a constant scalar.
   - d : the number of explanatory variables X.
   - E : the Ellipse inscribed in S; S : the unit square/(hyper)cube; Ti : the tangent points of E and S.
   - Fi : a linear scalar field (to read ai); gi : an affine function (to read the partial corr.).
   - i, j : indices over X1, X2,.., Xd; M : the correlation matrix over (X1, X2,.., Xd); O : the origin point (0,0,..,0).
   - P : the Point inside E; q : a quadratic form; z : an arbitrary point inside E or S.
   - ri, rij : corr. coeff. between Xi and Y, and between Xi and Xj, respectively.
   - Ui : the distance of P from the i-th axis; Vi : the veer of the ruler parallel to the i-th axis; Wi : the radius of E along the i-th axis.
   - X = (X1,.., Xd) : the explanatory variables; Y : the response variable.
4. Variables and Geometric Shapes. P : the Point inside E; ri : corr. coeff. between Xi and Y; rij : corr. coeff. between Xi and Xj; X = (X1,.., Xd) : the explanatory variables; Y : the response variable; ai : the regression coefficients; d : the number of explanatory variables X; E : the Ellipse inscribed in S; M : the correlation matrix over (X1, X2,.., Xd).
5. I. Background (4 slides): About Multiple Regression.
6. Linear-combination modeling is widely used: Y = a1 X1 + … + ad Xd + b + error. Examples: multiple regression; many basic statistical/mathematical/physical models; pieces of machine learning, e.g. deep learning.
7. Multiple regression fits Ŷ = a1 X1 + a2 X2 + .. + ad Xd + b and yields: the regression coefficients ai; the multiple corr. coeff. ∈ [0,1]; the partial corr. coeff. ∈ [−1,1].
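As a concrete illustration of fitting such a linear combination, here is a minimal least-squares sketch in Python/numpy. The data and the coefficient values (1.5, −0.7, 2.0) are invented for the example, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data for Y = a1*X1 + a2*X2 + b + error.
n = 500
X = rng.normal(size=(n, 2))
Y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + 2.0 + rng.normal(scale=0.1, size=n)

# Ordinary least squares via the design matrix [X | 1].
A = np.column_stack([X, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a1, a2, b = coef
print(a1, a2, b)
```

The recovered coefficients should be close to the generating values, up to noise.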
8. Output of Multiple Regression. (The formulas appear as an image on this slide; they are taken from Kei Takeuchi and Haruo Yanai, Tahenryou Kaiseki no Kiso, Toyo Keizai Inc., 1972.)
9. The results of multiple regression are, however, difficult to interpret. (1) Multiple correlation coefficient: why does it sometimes take an unexpectedly large value? (2) Regression coefficient for Xi: why does its sign sometimes differ from intuition, and why is it sometimes very large? (3) Partial correlation coefficient for Xi: why does its sign sometimes differ from that of the corr. coeff. between Xi and Y? Other issues: multicollinearity, especially in time-series analysis; instability across samples from the same population; incomputability when handling missing values leaves the correlation matrix with one or more negative eigenvalues.
10. How should we understand the relationships among these quantities? In fact, the multiple correlation coefficient and related quantities can be obtained by geometric construction, without going through difficult formulas. A geometric representation of multiple regression makes it easier to grasp (I would like the method described below to spread widely!). It gives a bird's-eye view of various phenomena related to multiple regression, may allow existing multivariate theory to be rebuilt in a more accessible form, and may even lead to new theory.
11. II. Three New Theorems (7 slides): How to Interpret the Results of Multiple Regression Geometrically, via an ellipse/ellipsoid in Euclidean space.
12. Draw S, Ti, E and P from the correlations (illustrated for d = 2 and d = 3). (1) Correlations: ri is the corr. coeff. between Xi and Y; rij is the corr. coeff. between Xi and Xj (note rij = 1 if i = j). (2) Construction of the ellipse and the point: S is the unit square/(hyper)cube with corners (±1, ±1); Ti = (ri1, ri2,.., rid) for i = 1,..,d; E inscribes S at ±Ti (i = 1,..,d); P = (r1, r2,.., rd). (3) The matrix M (corr. matrix on X): with M := (T1|T2|..|Td) = (rij), E = { z | zᵀM⁻¹z = 1 } holds.
13. Preparing S, E and P (shown for d = 2; extensible to dim = 3, 4, 5,..). S : the square bounded by x1 = ±1, x2 = ±1. E : the ellipse inscribing S at (x1, x2) = ±(ri1, ri2) for i = 1, 2, where rij is the corr. coeff. between Xi and Xj. P : the point (x1, x2) = (r1, r2), where ri is the corr. coeff. between Xi and Y. Note that E can be given by { x | xᵀR⁻¹x = 1 }, where R is the corr. matrix of X1, X2,.., Xd.
14. Square S, Ellipse E, Point P. (1) Define d (the number of explanatory variables) and set up a d-dimensional Euclidean space with axes x1,.., xd. (2) Draw S : the region bounded by x1 = ±1, x2 = ±1,.., xd = ±1. (3) Draw the ellipse E inscribed in S and centered at the origin O = (0,..,0), touching S at the points T1, T2,.., Td obtained by splitting the d×d correlation matrix over X1,.., Xd into columns (T1|T2|..|Td). (4) Mark the point P inside E whose i-th coordinate is the correlation coefficient between Xi and Y. Note: everything above is determined only by the corr. coeff. of X1,.., Xd and Y.
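The construction above can be checked numerically. A minimal numpy sketch with illustrative d = 2 correlation values: each tangent point Ti satisfies q(Ti) = 1, because M⁻¹Ti = ei and Tiᵀei = rii = 1.

```python
import numpy as np

# Illustrative d = 2 correlations (rho between X1 and X2; r1, r2 with Y).
rho = 0.423
M = np.array([[1.0, rho],
              [rho, 1.0]])
T = [M[:, i] for i in range(2)]   # tangent points T_i = the i-th column of M
P = np.array([-0.419, 0.471])     # the point P = (r1, r2)

Minv = np.linalg.inv(M)
def q(z):                          # quadratic form defining E = {z | q(z) = 1}
    return z @ Minv @ z

# Each tangent point lies on E (q(Ti) = 1) and touches the boundary of the
# square S (its i-th coordinate is exactly 1), while P lies strictly inside E.
print(q(T[0]), q(T[1]), q(P))
```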
15. Theorem 1: Multiple corr. coeff. = |OP|/|OP′|, where P′ is the point of E on ray OP (OP ∥ OP′, P′ ∈ E). Let q(z) := zᵀM⁻¹z (a quadratic form!); then q(cz) = c²q(z) for a scalar c, i.e. q(cz)^(1/2) = |c|·q(z)^(1/2). Recall E = { z | q(z) = 1 }: since q(P′) = 1, we get q(P)^(1/2) = |OP|/|OP′|, so the multiple corr. coeff. equals (PᵀM⁻¹P)^(1/2). Special cases: if r12 = 0 then R = (r1² + r2²)^(1/2); R = r1 iff r1·r12 = r2. A striking example: r1 = 0.4, r2 = 0, but R = 0.8. The multiple corr. coeff. can thus be considered geometrically!
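A numerical check of Theorem 1 that reproduces the slide's striking example (r1 = 0.4, r2 = 0, yet R = 0.8). The slide does not state r12; the value below is back-solved from R² = r1²/(1 − r12²) and is therefore an assumption made for this illustration:

```python
import numpy as np

# r12 back-solved so that r1 = 0.4, r2 = 0 yields R = 0.8 (an assumption;
# the slide does not state r12).
r12 = np.sqrt(0.75)
M = np.array([[1.0, r12], [r12, 1.0]])
P = np.array([0.4, 0.0])
Minv = np.linalg.inv(M)

R = np.sqrt(P @ Minv @ P)          # Theorem 1: multiple corr. coeff. = q(P)^(1/2)

# Geometric reading: P' is the point of E on ray OP, so |OP|/|OP'| = q(P)^(1/2).
P_prime = P / R                    # q(P_prime) = q(P)/R^2 = 1, so P' lies on E
ratio = np.linalg.norm(P) / np.linalg.norm(P_prime)
print(R, ratio)
```

A small correlation with each predictor can thus still yield a large multiple correlation when the predictors are strongly correlated with each other.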
16. Supplementary (figure only).
17. The multiple correlation coefficient can be obtained by drawing an ellipse. According to the correlation coefficient ρ between the explanatory variables (total runs scored and total runs allowed), draw the ellipse inscribed in the square bounded by x = ±1, y = ±1, touching it at the four points (±ρ, ±1), (±1, ±ρ). Then plot the point (ρ1, ρ2) of correlation coefficients between each explanatory variable and the response variable (annual ranking). In the figure, the similarity ratio of the two ellipses equals the multiple correlation coefficient (either draw an auxiliary line from the origin as in the figure, or draw a concentric, same-direction, similar ellipse through the plotted point). The coefficient of determination is the area ratio of the ellipses. Extension to higher dimensions is easy, and with a further device the partial correlation coefficient can also be obtained.
18. Theorem 2: Regression coefficient ai. For i = 1,..,d, consider the linear map Fi such that Fi = 0 at every T1,.., Td except Ti, while Fi = 1 at Ti and Fi = −1 at −Ti. If X1,.., Xd and Y are standardized, ai = Fi(P); if not, ai = Fi(P)·sd[Y]/sd[Xi] (sd: standard deviation). When all of X1,.., Xd and Y are standardized, M = (T1|T2|..|Td) can be regarded as a new "coordinate system" with axes T1,.., Td, and M⁻¹P gives the coordinates of P in it. (In the figure, color encodes the linear scalar field F1: blue → −1, yellow → 0, red → +1; the color at P gives a1.)
19. Theorem 2 (fi : ℝᵈ → ℝ). Regression coeff.: ai = fi(P)·sd(Y)/sd(Xi), where the linear map fi satisfies fi(Tj) = 1 if i = j and fi(Tj) = 0 if i ≠ j. Recall: Tj is the j-th column of the correlation matrix over X.
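Theorem 2 can also be checked numerically: for standardized variables, the coefficient vector is M⁻¹P, which agrees with a direct least-squares fit. The correlation structure below is invented for the check:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Invented correlated data for (X1, X2, Y), standardized column-wise.
C = np.array([[1.0, 0.4, 0.5],
              [0.4, 1.0, 0.3],
              [0.5, 0.3, 1.0]])
Z = rng.normal(size=(n, 3)) @ np.linalg.cholesky(C).T
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
X, Y = Z[:, :2], Z[:, 2]

# Theorem 2 (standardized case): the coefficients are M^{-1} P, i.e. P expressed
# in the coordinate system whose axes are the tangent points T1,.., Td.
M = np.corrcoef(X, rowvar=False)
P = np.array([np.corrcoef(X[:, i], Y)[0, 1] for i in range(2)])
a_geom = np.linalg.solve(M, P)

# Direct least squares on the same standardized data agrees.
a_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(a_geom, a_ls)
```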
20. Supplementary (figure only).
21. Theorem 3: Partial correlation. Let the rod Qi⁻Qi⁺ be the longest segment inside the ellipse E that passes through P, parallel to the i-th axis with the same direction. Let the affine function gi satisfy gi(−1) = Qi⁻ and gi(+1) = Qi⁺. Then the partial corr. coeff. between Xi and Y equals gi⁻¹(P). Looking at P, the partial correlation coefficient is read off the red measure inscribed in the ellipse; in the notation of slide 3, the reading is (ri − Vi)/(Wi·√(1 − Ui²)).
22. Theorem 3 (gi : ℝ → ℝᵈ). Partial correlation: gi⁻¹(P). Let the rod Pi⁻Pi⁺ be the longest segment inside the ellipse E that passes through P, parallel to the xi-axis with the same direction, and let the affine function gi satisfy gi(±1) = Pi±. The partial correlation coefficient can be read off the red measure at P.
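For d = 2 this chord reading can be verified against the standard partial-correlation formula (r1 − r2·r12)/√((1 − r2²)(1 − r12²)). The correlation values below are illustrative:

```python
import numpy as np

# Illustrative d = 2 values: read the partial corr. of X1 and Y (given X2)
# off the chord of E through P parallel to the x1-axis.
r1, r2, r12 = -0.419, 0.471, 0.423
M = np.array([[1.0, r12], [r12, 1.0]])
Minv = np.linalg.inv(M)

# Endpoints P1-, P1+ of the longest segment inside E = {z | z M^{-1} z = 1}
# through P = (r1, r2) parallel to the x1-axis: solve q(x, r2) = 1 for x.
a, b, c = Minv[0, 0], 2.0 * Minv[0, 1] * r2, Minv[1, 1] * r2**2 - 1.0
disc = np.sqrt(b**2 - 4.0 * a * c)
x_minus, x_plus = (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)

# g1 maps [-1, +1] affinely onto [P1-, P1+]; the reading at P is g1^{-1}(r1).
reading = (r1 - (x_minus + x_plus) / 2.0) / ((x_plus - x_minus) / 2.0)

# Standard partial correlation formula for comparison.
partial = (r1 - r2 * r12) / np.sqrt((1.0 - r2**2) * (1.0 - r12**2))
print(reading, partial)
```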
23. Supplementary (figure only).
24. The Three Geometric Theorems. (1) Multiple corr. coeff.: it is |OP|/|OP′|, where P′ is the intersection of ray OP with E. (2) Regression coeff.: ai = fi(P)·sd(Y)/sd(Xi) (sd: standard deviation), where the linear functions fi : ℝᵈ → ℝ satisfy fi(Tj) = δij (δij: Kronecker delta) for i, j ∈ {1,2,..,d}. (3) Partial corr. coeff.: let the segment Pi⁻Pi⁺ be the longest one inside E through P and parallel to the xi-axis with the same direction; fixing the variables X1,.., Xd except Xi, the partial corr. coeff. between Xi and Y is gi⁻¹(P), where the affine function gi : ℝ → ℝᵈ satisfies gi(±1) = Pi±.
25. III. More Intuitive, More Geometric Proofs (2 slides).
26. More intuitive proofs. Without loss of generality, one can reduce the variables' joint distribution to a high-dimensional Gaussian distribution. Define e := { ζ | ζᵀm⁻¹ζ = 1 }, where m := [[M, P], [Pᵀ, 1]] is the correlation matrix M bordered by P. E is a d-dimensional ellipsoid; "e" is a (d+1)-dimensional ellipsoid.
27. In the (d+1)-dimensional picture: m is a (d+1)×(d+1) matrix; write m =: (t1|t2|..|td|td+1); let ei := (0,..,0,1,0,..,0), with only the i-th element equal to 1, and o := (0,..,0) in the (d+1)-dimensional space. Consider: (a) multiple correlation coefficient: the section of "e" by the 2-dimensional plane containing o, ed+1, td+1; (b) standardized regression coefficient: the inclination of the hyperplane containing o, t1, t2,.., td; (c) partial correlation coefficient: the section of "e" by the 2-dimensional plane containing o, ei, ed+1. Then the same conclusions as the three theorems are obtained!
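One concrete consequence of the bordered matrix m is easy to verify: by the standard Schur-complement determinant identity, det(m) = det(M)·(1 − PᵀM⁻¹P), so R² = 1 − det(m)/det(M). This check (with illustrative correlation values) is a sketch, not a proof of the section arguments above:

```python
import numpy as np

# Illustrative d = 2 values for M (corr. matrix of X) and P (corr. with Y).
r1, r2, r12 = -0.419, 0.471, 0.423
M = np.array([[1.0, r12], [r12, 1.0]])
P = np.array([r1, r2])

# Bordered matrix m = [[M, P], [P^T, 1]] from slide 26.
m = np.block([[M, P[:, None]],
              [P[None, :], np.ones((1, 1))]])

# Schur-complement identity: det(m) = det(M) * (1 - P^T M^{-1} P).
R2_schur = 1.0 - np.linalg.det(m) / np.linalg.det(M)
R2_direct = P @ np.linalg.solve(M, P)     # Theorem 1's R^2
print(R2_schur, R2_direct)
```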
28. IV. Conclusive Summary (1 slide).
29. Usefulness of the theorems: the results of multiple regression can be visualized in an easily understandable way when d = 2 or 3; the theorems may open the way to new theory about linear-combination modeling, addressing the interpretation of numerical results, instability, multicollinearity, etc. The extension to canonical analysis is expected as a next step.
30. Extra Slides.
31. r[X, Y] = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / √( Σᵢ₌₁ⁿ (Xᵢ − X̄)² · Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² )
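A direct transcription of this Pearson correlation formula, checked against numpy's built-in `corrcoef` on invented data:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=100)
Y = 0.6 * X + rng.normal(size=100)   # invented data for the check

# Pearson correlation exactly as in the displayed formula.
r = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sqrt(
    np.sum((X - X.mean()) ** 2) * np.sum((Y - Y.mean()) ** 2))

r_np = np.corrcoef(X, Y)[0, 1]       # library implementation for comparison
print(r, r_np)
```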
32. Relationship between annual total runs scored and annual ranking: the correlation coefficient is −0.419... The more runs a team scores in a year, the higher it ranks and the closer it comes to the championship.
33. Relationship between annual total runs allowed and annual ranking: the correlation coefficient is +0.471... The fewer runs a team allows in a year, the higher it ranks and the closer it comes to the championship.
34. Relationship between total runs scored (x) and total runs allowed (y): the correlation coefficient is +0.423... (runs scored and runs allowed are positively correlated).
35. Multiple regression of ranking on total runs scored and total runs allowed: the multiple correlation coefficient is 0.828... Using the two explanatory variables improved the prediction accuracy for the response variable (ranking). How should we understand the relationships among these quantities?
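Theorem 1 recovers this multiple correlation from the three pairwise correlations on slides 32–34 alone (the rounded slide values give ≈0.829 versus the 0.828 reported from unrounded data):

```python
import numpy as np

# Slides 32-34: r1 = -0.419 (runs scored vs. rank), r2 = +0.471 (runs allowed
# vs. rank), r12 = +0.423 (runs scored vs. runs allowed).
r1, r2, r12 = -0.419, 0.471, 0.423
M = np.array([[1.0, r12], [r12, 1.0]])
P = np.array([r1, r2])

# Theorem 1: multiple corr. coeff. R = (P^T M^{-1} P)^(1/2).
R = np.sqrt(P @ np.linalg.solve(M, P))
print(R)
```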