Methods of Multivariate Analysis
Contents
Cover
Half Title page
Title page
Copyright page
Preface
Acknowledgments
Chapter 1: Introduction
1.1 Why Multivariate Analysis?
1.2 Prerequisites
1.3 Objectives
1.4 Basic Types of Data and Analysis
Chapter 2: Matrix Algebra
2.1 Introduction
2.2 Notation and Basic Definitions
2.3 Operations
2.4 Partitioned Matrices
2.5 Rank
2.6 Inverse
2.7 Positive Definite Matrices
2.8 Determinants
2.9 Trace
2.10 Orthogonal Vectors and Matrices
2.11 Eigenvalues and Eigenvectors
2.12 Kronecker and Vec Notation
Problems
Chapter 3: Characterizing and Displaying Multivariate Data
3.1 Mean and Variance of a Univariate Random Variable
3.2 Covariance and Correlation of Bivariate Random Variables
3.3 Scatterplots of Bivariate Samples
3.4 Graphical Displays for Multivariate Samples
3.5 Dynamic Graphics
3.6 Mean Vectors
3.7 Covariance Matrices
3.8 Correlation Matrices
3.9 Mean Vectors and Covariance Matrices for Subsets of Variables
3.10 Linear Combinations of Variables
3.11 Measures of Overall Variability
3.12 Estimation of Missing Values
3.13 Distance Between Vectors
Problems
Chapter 4: The Multivariate Normal Distribution
4.1 Multivariate Normal Density Function
4.2 Properties of Multivariate Normal Random Variables
4.3 Estimation in the Multivariate Normal
4.4 Assessing Multivariate Normality
4.5 Transformations to Normality
4.6 Outliers
Problems
Chapter 5: Tests on One or Two Mean Vectors
5.1 Multivariate Versus Univariate Tests
5.2 Tests on μ With Σ Known
5.3 Tests on μ When Σ Is Unknown
5.4 Comparing Two Mean Vectors
5.5 Tests on Individual Variables Conditional on Rejection of H0 by the T2-Test
5.6 Computation of T2
5.7 Paired Observations Test
5.8 Test for Additional Information
5.9 Profile Analysis
Problems
Chapter 6: Multivariate Analysis of Variance
6.1 One-Way Models
6.2 Comparison of the Four MANOVA Test Statistics
6.3 Contrasts
6.4 Tests on Individual Variables Following Rejection of H0 by the Overall MANOVA Test
6.5 Two-Way Classification
6.6 Other Models
6.7 Checking on the Assumptions
6.8 Profile Analysis
6.9 Repeated Measures Designs
6.10 Growth Curves
6.11 Tests on a Subvector
Problems
Chapter 7: Tests on Covariance Matrices
7.1 Introduction
7.2 Testing a Specified Pattern for Σ
7.3 Tests Comparing Covariance Matrices
7.4 Tests of Independence
Problems
Chapter 8: Discriminant Analysis: Description of Group Separation
8.1 Introduction
8.2 The Discriminant Function for Two Groups
8.3 Relationship Between Two-Group Discriminant Analysis and Multiple Regression
8.4 Discriminant Analysis for Several Groups
8.5 Standardized Discriminant Functions
8.6 Tests of Significance
8.7 Interpretation of Discriminant Functions
8.8 Scatterplots
8.9 Stepwise Selection of Variables
Problems
Chapter 9: Classification Analysis: Allocation of Observations to Groups
9.1 Introduction
9.2 Classification Into Two Groups
9.3 Classification Into Several Groups
9.4 Estimating Misclassification Rates
9.5 Improved Estimates of Error Rates
9.6 Subset Selection
9.7 Nonparametric Procedures
Problems
Chapter 10: Multivariate Regression
10.1 Introduction
10.2 Multiple Regression: Fixed x’s
10.3 Multiple Regression: Random x’s
10.4 Multivariate Multiple Regression: Estimation
10.5 Multivariate Multiple Regression: Hypothesis Tests
10.6 Multivariate Multiple Regression: Prediction
10.7 Measures of Association Between the y’s and the x’s
10.8 Subset Selection
10.9 Multivariate Regression: Random x’s
Problems
Chapter 11: Canonical Correlation
11.1 Introduction
11.2 Canonical Correlations and Canonical Variates
11.3 Properties of Canonical Correlations
11.4 Tests of Significance
11.5 Interpretation
11.6 Relationships of Canonical Correlation Analysis to Other Multivariate Techniques
Problems
Chapter 12: Principal Component Analysis
12.1 Introduction
12.2 Geometric and Algebraic Bases of Principal Components
12.3 Principal Components and Perpendicular Regression
12.4 Plotting of Principal Components
12.5 Principal Components From the Correlation Matrix
12.6 Deciding How Many Components to Retain
12.7 Information in the Last Few Principal Components
12.8 Interpretation of Principal Components
12.9 Selection of Variables
Problems
Chapter 13: Exploratory Factor Analysis
13.1 Introduction
13.2 Orthogonal Factor Model
13.3 Estimation of Loadings and Communalities
13.4 Choosing the Number of Factors, M
13.5 Rotation
13.6 Factor Scores
13.7 Validity of the Factor Analysis Model
13.8 Relationship of Factor Analysis to Principal Component Analysis
Problems
Chapter 14: Confirmatory Factor Analysis
14.1 Introduction
14.2 Model Specification and Identification
14.3 Parameter Estimation and Model Assessment
14.4 Inference for Model Parameters
14.5 Factor Scores
Problems
Chapter 15: Cluster Analysis
15.1 Introduction
15.2 Measures of Similarity or Dissimilarity
15.3 Hierarchical Clustering
15.4 Nonhierarchical Methods
15.5 Choosing the Number of Clusters
15.6 Cluster Validity
15.7 Clustering Variables
Problems
Chapter 16: Graphical Procedures
16.1 Multidimensional Scaling
16.2 Correspondence Analysis
16.3 Biplots
Problems
Appendix A: Tables
Appendix B: Answers and Hints to Problems
Appendix C: Data Sets and SAS Files
References
Index
METHODS OF MULTIVARIATE ANALYSIS
WILEY SERIES IN PROBABILITY AND STATISTICS
ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M.
Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford
Weisberg Editors Emeriti: Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many
topics of current research interest in both pure and applied statistics and probability theory. Written by
leading statisticians and institutions, the titles span both state-of-the-art developments in the field and
classical methods.
Reflecting the wide range of current research in statistics, the series encompasses applied,
methodological and theoretical statistics, ranging from applications and new techniques made possible by
advances in computerized practice to rigorous treatment of theoretical approaches.
This series provides essential and invaluable reading for all statisticians, whether in academia,
industry, government, or research.
† ABRAHAM and LEDOLTER · Statistical Methods for Forecasting
AGRESTI · Analysis of Ordinal Categorical Data, Second Edition
AGRESTI · An Introduction to Categorical Data Analysis, Second Edition
AGRESTI · Categorical Data Analysis, Second Edition
ALTMAN, GILL, and McDONALD · Numerical Issues in Statistical Computing for the Social Scientist
AMARATUNGA and CABRERA · Exploration and Analysis of DNA Microarray and Protein Array Data
ANDĚL · Mathematics of Chance
ANDERSON · An Introduction to Multivariate Statistical Analysis, Third Edition
* ANDERSON · The Statistical Analysis of Time Series
ANDERSON, AUQUIER, HAUCK, OAKES, VANDAELE, and WEISBERG · Statistical Methods for
Comparative Studies
ANDERSON and LOYNES · The Teaching of Practical Statistics
ARMITAGE and DAVID (editors) · Advances in Biometry
ARNOLD, BALAKRISHNAN, and NAGARAJA · Records
* ARTHANARI and DODGE · Mathematical Programming in Statistics
* BAILEY · The Elements of Stochastic Processes with Applications to the Natural Sciences
BAJORSKI · Statistics for Imaging, Optics, and Photonics
BALAKRISHNAN and KOUTRAS · Runs and Scans with Applications
BALAKRISHNAN and NG · Precedence-Type Tests and Applications
BARNETT · Comparative Statistical Inference, Third Edition
BARNETT · Environmental Statistics
BARNETT and LEWIS · Outliers in Statistical Data, Third Edition
BARTHOLOMEW, KNOTT, and MOUSTAKI · Latent Variable Models and Factor Analysis: A Unified
Approach, Third Edition
BARTOSZYNSKI and NIEWIADOMSKA-BUGAJ · Probability and Statistical Inference, Second Edition
BASILEVSKY · Statistical Factor Analysis and Related Methods: Theory and Applications
BATES and WATTS · Nonlinear Regression Analysis and Its Applications
BECHHOFER, SANTNER, and GOLDSMAN · Design and Analysis of Experiments for Statistical
Selection, Screening, and Multiple Comparisons
BEIRLANT, GOEGEBEUR, SEGERS, TEUGELS, and DE WAAL · Statistics of Extremes: Theory and
Applications
BELSLEY · Conditioning Diagnostics: Collinearity and Weak Data in Regression
† BELSLEY, KUH, and WELSCH · Regression Diagnostics: Identifying Influential Data and Sources of
Collinearity
BENDAT and PIERSOL · Random Data: Analysis and Measurement Procedures, Fourth Edition
BERNARDO and SMITH · Bayesian Theory
BHAT and MILLER · Elements of Applied Stochastic Processes, Third Edition
BHATTACHARYA and WAYMIRE · Stochastic Processes with Applications
BIEMER, GROVES, LYBERG, MATHIOWETZ, and SUDMAN · Measurement Errors in Surveys
BILLINGSLEY · Convergence of Probability Measures, Second Edition
BILLINGSLEY · Probability and Measure, Anniversary Edition
BIRKES and DODGE · Alternative Methods of Regression
BISGAARD and KULAHCI · Time Series Analysis and Forecasting by Example
BISWAS, DATTA, FINE, and SEGAL · Statistical Advances in the Biomedical Sciences: Clinical Trials,
Epidemiology, Survival Analysis, and Bioinformatics
BLISCHKE and MURTHY (editors) · Case Studies in Reliability and Maintenance
BLISCHKE and MURTHY · Reliability: Modeling, Prediction, and Optimization
BLOOMFIELD · Fourier Analysis of Time Series: An Introduction, Second Edition
BOLLEN · Structural Equations with Latent Variables
BOLLEN and CURRAN · Latent Curve Models: A Structural Equation Perspective
BOROVKOV · Ergodicity and Stability of Stochastic Processes
BOSQ and BLANKE · Inference and Prediction in Large Dimensions
BOULEAU · Numerical Methods for Stochastic Processes
* BOX and TIAO · Bayesian Inference in Statistical Analysis
BOX · Improving Almost Anything, Revised Edition
* BOX and DRAPER · Evolutionary Operation: A Statistical Method for Process Improvement
BOX and DRAPER · Response Surfaces, Mixtures, and Ridge Analyses, Second Edition
BOX, HUNTER, and HUNTER · Statistics for Experimenters: Design, Innovation, and Discovery, Second
Edition
BOX, JENKINS, and REINSEL · Time Series Analysis: Forecasting and Control, Fourth Edition
BOX, LUCEÑO, and PANIAGUA-QUIÑONES · Statistical Control by Monitoring and Adjustment, Second
Edition
* BROWN and HOLLANDER · Statistics: A Biomedical Introduction
CAIROLI and DALANG · Sequential Stochastic Optimization
CASTILLO, HADI, BALAKRISHNAN, and SARABIA · Extreme Value and Related Models with
Applications in Engineering and Science
CHAN · Time Series: Applications to Finance with R and S-Plus®, Second Edition
CHARALAMBIDES · Combinatorial Methods in Discrete Distributions
CHATTERJEE and HADI · Regression Analysis by Example, Fourth Edition
CHATTERJEE and HADI · Sensitivity Analysis in Linear Regression
CHERNICK · Bootstrap Methods: A Guide for Practitioners and Researchers, Second Edition
CHERNICK and FRIIS · Introductory Biostatistics for the Health Sciences
CHILÈS and DELFINER · Geostatistics: Modeling Spatial Uncertainty, Second Edition
CHOW and LIU · Design and Analysis of Clinical Trials: Concepts and Methodologies, Second Edition
CLARKE · Linear Models: The Theory and Application of Analysis of Variance
CLARKE and DISNEY · Probability and Random Processes: A First Course with Applications, Second Edition
* COCHRAN and COX · Experimental Designs, Second Edition
COLLINS and LANZA · Latent Class and Latent Transition Analysis: With Applications in the Social,
Behavioral, and Health Sciences
CONGDON · Applied Bayesian Modelling
CONGDON · Bayesian Models for Categorical Data
CONGDON · Bayesian Statistical Modelling, Second Edition
CONOVER · Practical Nonparametric Statistics, Third Edition
COOK · Regression Graphics
COOK and WEISBERG · An Introduction to Regression Graphics
COOK and WEISBERG · Applied Regression Including Computing and Graphics
CORNELL · A Primer on Experiments with Mixtures
CORNELL · Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data, Third Edition
COX · A Handbook of Introductory Statistical Methods
CRESSIE · Statistics for Spatial Data, Revised Edition
CRESSIE and WIKLE · Statistics for Spatio-Temporal Data
CSÖRGÖ and HORVÁTH · Limit Theorems in Change Point Analysis
DAGPUNAR · Simulation and Monte Carlo: With Applications in Finance and MCMC
DANIEL · Applications of Statistics to Industrial Experimentation
DANIEL · Biostatistics: A Foundation for Analysis in the Health Sciences, Eighth Edition
* DANIEL · Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition
DASU and JOHNSON · Exploratory Data Mining and Data Cleaning
DAVID and NAGARAJA · Order Statistics, Third Edition
* DEGROOT, FIENBERG, and KADANE · Statistics and the Law
DEL CASTILLO · Statistical Process Adjustment for Quality Control
DEMARIS · Regression with Social Data: Modeling Continuous and Limited Response Variables
DEMIDENKO · Mixed Models: Theory and Applications
DENISON, HOLMES, MALLICK and SMITH · Bayesian Methods for Nonlinear Classification and
Regression
DETTE and STUDDEN · The Theory of Canonical Moments with Applications in
Statistics, Probability, and Analysis
DEY and MUKERJEE · Fractional Factorial Plans
DILLON and GOLDSTEIN · Multivariate Analysis: Methods and Applications
* DODGE and ROMIG · Sampling Inspection Tables, Second Edition
* DOOB · Stochastic Processes
DOWDY, WEARDEN, and CHILKO · Statistics for Research, Third Edition
DRAPER and SMITH · Applied Regression Analysis, Third Edition
DRYDEN and MARDIA · Statistical Shape Analysis
DUDEWICZ and MISHRA · Modern Mathematical Statistics
DUNN and CLARK · Basic Statistics: A Primer for the Biomedical Sciences, Fourth Edition
DUPUIS and ELLIS · A Weak Convergence Approach to the Theory of Large Deviations
EDLER and KITSOS · Recent Advances in Quantitative Methods in Cancer and Human Health Risk
Assessment
* ELANDT-JOHNSON and JOHNSON · Survival Models and Data Analysis
ENDERS · Applied Econometric Time Series, Third Edition
† ETHIER and KURTZ · Markov Processes: Characterization and Convergence
EVANS, HASTINGS, and PEACOCK · Statistical Distributions, Third Edition
EVERITT, LANDAU, LEESE, and STAHL · Cluster Analysis, Fifth Edition
FEDERER and KING · Variations on Split Plot and Split Block Experiment Designs
FELLER · An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, Revised;
Volume II, Second Edition
FITZMAURICE, LAIRD, and WARE · Applied Longitudinal Analysis, Second Edition
* FLEISS · The Design and Analysis of Clinical Experiments
FLEISS · Statistical Methods for Rates and Proportions, Third Edition
† FLEMING and HARRINGTON · Counting Processes and Survival Analysis
FUJIKOSHI, ULYANOV, and SHIMIZU · Multivariate Statistics: High-Dimensional and Large-Sample
Approximations
FULLER · Introduction to Statistical Time Series, Second Edition
† FULLER · Measurement Error Models
GALLANT · Nonlinear Statistical Models
GEISSER · Modes of Parametric Statistical Inference
GELMAN and MENG · Applied Bayesian Modeling and Causal Inference from Incomplete-Data
Perspectives
GEWEKE · Contemporary Bayesian Econometrics and Statistics
GHOSH, MUKHOPADHYAY, and SEN · Sequential Estimation
GIESBRECHT and GUMPERTZ · Planning, Construction, and Statistical Analysis of Comparative
Experiments
GIFI · Nonlinear Multivariate Analysis
GIVENS and HOETING · Computational Statistics
GLASSERMAN and YAO · Monotone Structure in Discrete-Event Systems
GNANADESIKAN · Methods for Statistical Data Analysis of Multivariate Observations, Second Edition
GOLDSTEIN · Multilevel Statistical Models, Fourth Edition
GOLDSTEIN and LEWIS · Assessment: Problems, Development, and Statistical Issues
GOLDSTEIN and WOOFF · Bayes Linear Statistics
GREENWOOD and NIKULIN · A Guide to Chi-Squared Testing
GROSS, SHORTLE, THOMPSON, and HARRIS · Fundamentals of Queueing Theory, Fourth Edition
GROSS, SHORTLE, THOMPSON, and HARRIS · Solutions Manual to Accompany Fundamentals of
Queueing Theory, Fourth Edition
* HAHN and SHAPIRO · Statistical Models in Engineering
HAHN and MEEKER · Statistical Intervals: A Guide for Practitioners
HALD · A History of Probability and Statistics and their Applications Before 1750
† HAMPEL · Robust Statistics: The Approach Based on Influence Functions
HARTUNG, KNAPP, and SINHA · Statistical Meta-Analysis with Applications
HEIBERGER · Computation for the Analysis of Designed Experiments
HEDAYAT and SINHA · Design and Inference in Finite Population Sampling
HEDEKER and GIBBONS · Longitudinal Data Analysis
HELLER · MACSYMA for Statisticians
HERITIER, CANTONI, COPT, and VICTORIA-FESER · Robust Methods in Biostatistics
HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 1: Introduction to
Experimental Design, Second Edition
HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 2: Advanced
Experimental Design
HINKELMANN (editor) · Design and Analysis of Experiments, Volume 3: Special
Designs and Applications
HOAGLIN, MOSTELLER, and TUKEY · Fundamentals of Exploratory Analysis of Variance
* HOAGLIN, MOSTELLER, and TUKEY · Exploring Data Tables, Trends and Shapes
* HOAGLIN, MOSTELLER, and TUKEY · Understanding Robust and Exploratory Data Analysis
HOCHBERG and TAMHANE · Multiple Comparison Procedures
HOCKING · Methods and Applications of Linear Models: Regression and the Analysis of Variance, Second
Edition
HOEL · Introduction to Mathematical Statistics, Fifth Edition
HOGG and KLUGMAN · Loss Distributions
HOLLANDER and WOLFE · Nonparametric Statistical Methods, Second Edition
HOSMER and LEMESHOW · Applied Logistic Regression, Second Edition
HOSMER, LEMESHOW, and MAY · Applied Survival Analysis: Regression Modeling of Time-to-Event
Data, Second Edition
HUBER · Data Analysis: What Can Be Learned From the Past 50 Years
HUBER · Robust Statistics
† HUBER and RONCHETTI · Robust Statistics, Second Edition
HUBERTY · Applied Discriminant Analysis, Second Edition
HUBERTY and OLEJNIK · Applied MANOVA and Discriminant Analysis, Second Edition
HUITEMA · The Analysis of Covariance and Alternatives: Statistical Methods for Experiments, Quasi-
Experiments, and Single-Case Studies, Second Edition
HUNT and KENNEDY · Financial Derivatives in Theory and Practice, Revised Edition
HURD and MIAMEE · Periodically Correlated Random Sequences: Spectral Theory
and Practice
HUSKOVA, BERAN, and DUPAC · Collected Works of Jaroslav Hajek—with Commentary
HUZURBAZAR · Flowgraph Models for Multistate Time-to-Event Data
JACKMAN · Bayesian Analysis for the Social Sciences
† JACKSON · A User’s Guide to Principal Components
JOHN · Statistical Methods in Engineering and Quality Assurance
JOHNSON · Multivariate Statistical Simulation
JOHNSON and BALAKRISHNAN · Advances in the Theory and Practice of Statistics: A Volume in Honor
of Samuel Kotz
JOHNSON, KEMP, and KOTZ · Univariate Discrete Distributions, Third Edition
JOHNSON and KOTZ (editors) · Leading Personalities in Statistical Sciences: From the Seventeenth
Century to the Present
JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 1, Second
Edition
JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 2, Second
Edition
JOHNSON, KOTZ, and BALAKRISHNAN · Discrete Multivariate Distributions
JUDGE, GRIFFITHS, HILL, LÜTKEPOHL, and LEE · The Theory and Practice of Econometrics, Second
Edition
JUREK and MASON · Operator-Limit Distributions in Probability Theory
KADANE · Bayesian Methods and Ethics in a Clinical Trial Design
KADANE AND SCHUM · A Probabilistic Analysis of the Sacco and Vanzetti Evidence
KALBFLEISCH and PRENTICE · The Statistical Analysis of Failure Time Data, Second Edition
KARIYA and KURATA · Generalized Least Squares
KASS and VOS · Geometrical Foundations of Asymptotic Inference
† KAUFMAN and ROUSSEEUW · Finding Groups in Data: An Introduction to Cluster Analysis
KEDEM and FOKIANOS · Regression Models for Time Series Analysis
KENDALL, BARDEN, CARNE, and LE · Shape and Shape Theory
KHURI · Advanced Calculus with Applications in Statistics, Second Edition
KHURI, MATHEW, and SINHA · Statistical Tests for Mixed Linear Models
* KISH · Statistical Design for Research
KLEIBER and KOTZ · Statistical Size Distributions in Economics and Actuarial Sciences
KLEMELÄ · Smoothing of Multivariate Data: Density Estimation and Visualization
KLUGMAN, PANJER, and WILLMOT · Loss Models: From Data to Decisions, Third Edition
KLUGMAN, PANJER, and WILLMOT · Solutions Manual to Accompany Loss Models: From Data to
Decisions, Third Edition
KOSKI and NOBLE · Bayesian Networks: An Introduction
KOTZ, BALAKRISHNAN, and JOHNSON · Continuous Multivariate Distributions, Volume 1, Second
Edition
KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Volumes 1 to 9 with Index
KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Supplement Volume
KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 1
KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 2
KOWALSKI and TU · Modern Applied U-Statistics
KRISHNAMOORTHY and MATHEW · Statistical Tolerance Regions: Theory, Applications, and
Computation
KROESE, TAIMRE, and BOTEV · Handbook of Monte Carlo Methods
KROONENBERG · Applied Multiway Data Analysis
KULINSKAYA, MORGENTHALER, and STAUDTE · Meta Analysis: A Guide to Calibrating and
Combining Statistical Evidence
KULKARNI and HARMAN · An Elementary Introduction to Statistical Learning Theory
KUROWICKA and COOKE · Uncertainty Analysis with High Dimensional Dependence
Modelling
KVAM and VIDAKOVIC · Nonparametric Statistics with Applications to Science and Engineering
LACHIN · Biostatistical Methods: The Assessment of Relative Risks, Second Edition
LAD · Operational Subjective Statistical Methods: A Mathematical, Philosophical, and Historical
Introduction
LAMPERTI · Probability: A Survey of the Mathematical Theory, Second Edition
LAWLESS · Statistical Models and Methods for Lifetime Data, Second Edition
LAWSON · Statistical Methods in Spatial Epidemiology, Second Edition
LE · Applied Categorical Data Analysis, Second Edition
LE · Applied Survival Analysis
LEE · Structural Equation Modeling: A Bayesian Approach
LEE and WANG · Statistical Methods for Survival Data Analysis, Third Edition
LEPAGE and BILLARD · Exploring the Limits of Bootstrap
LESSLER and KALSBEEK · Nonsampling Errors in Surveys
LEYLAND and GOLDSTEIN (editors) · Multilevel Modelling of Health Statistics
LIAO · Statistical Group Comparison
LIN · Introductory Stochastic Analysis for Finance and Insurance
LITTLE and RUBIN · Statistical Analysis with Missing Data, Second Edition
LLOYD · The Statistical Analysis of Categorical Data
LOWEN and TEICH · Fractal-Based Point Processes
MAGNUS and NEUDECKER · Matrix Differential Calculus with Applications in Statistics and
Econometrics, Revised Edition
MALLER and ZHOU · Survival Analysis with Long Term Survivors
MARCHETTE · Random Graphs for Statistical Pattern Recognition
MARDIA and JUPP · Directional Statistics
MARKOVICH · Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice
MARONNA, MARTIN and YOHAI · Robust Statistics: Theory and Methods
MASON, GUNST, and HESS · Statistical Design and Analysis of Experiments with Applications to
Engineering and Science, Second Edition
McCULLOCH, SEARLE, and NEUHAUS · Generalized, Linear, and Mixed Models, Second Edition
McFADDEN · Management of Data in Clinical Trials, Second Edition
* McLACHLAN · Discriminant Analysis and Statistical Pattern Recognition
McLACHLAN, DO, and AMBROISE · Analyzing Microarray Gene Expression Data
McLACHLAN and KRISHNAN · The EM Algorithm and Extensions, Second Edition
McLACHLAN and PEEL · Finite Mixture Models
McNEIL · Epidemiological Research Methods
MEEKER and ESCOBAR · Statistical Methods for Reliability Data
MEERSCHAERT and SCHEFFLER · Limit Distributions for Sums of Independent Random Vectors:
Heavy Tails in Theory and Practice
MENGERSEN, ROBERT, and TITTERINGTON · Mixtures: Estimation and Applications
MICKEY, DUNN, and CLARK · Applied Statistics: Analysis of Variance and
Regression, Third Edition
* MILLER · Survival Analysis, Second Edition
MONTGOMERY, JENNINGS, and KULAHCI · Introduction to Time Series Analysis and Forecasting
MONTGOMERY, PECK, and VINING · Introduction to Linear Regression Analysis, Fifth Edition
MORGENTHALER and TUKEY · Configural Polysampling: A Route to Practical Robustness
MUIRHEAD · Aspects of Multivariate Statistical Theory
MULLER and STOYAN · Comparison Methods for Stochastic Models and Risks
MURTHY, XIE, and JIANG · Weibull Models
MYERS, MONTGOMERY, and ANDERSON-COOK · Response Surface Methodology: Process and
Product Optimization Using Designed Experiments, Third Edition
MYERS, MONTGOMERY, VINING, and ROBINSON · Generalized Linear Models: With Applications in
Engineering and the Sciences, Second Edition
NATVIG · Multistate Systems Reliability Theory With Applications
† NELSON · Accelerated Testing, Statistical Models, Test Plans, and Data Analyses
† NELSON · Applied Life Data Analysis
NEWMAN · Biostatistical Methods in Epidemiology
NG, TIAN, and TANG · Dirichlet and Related Distributions: Theory, Methods and Applications
OKABE, BOOTS, SUGIHARA, and CHIU · Spatial Tessellations: Concepts and
Applications of Voronoi Diagrams, Second Edition
OLIVER and SMITH · Influence Diagrams, Belief Nets and Decision Analysis
PALTA · Quantitative Methods in Population Health: Extensions of Ordinary Regressions
PANJER · Operational Risk: Modeling and Analytics
PANKRATZ · Forecasting with Dynamic Regression Models
PANKRATZ · Forecasting with Univariate Box-Jenkins Models: Concepts and Cases
PARDOUX · Markov Processes and Applications: Algorithms, Networks, Genome and Finance
PARMIGIANI and INOUE · Decision Theory: Principles and Approaches
* PARZEN · Modern Probability Theory and Its Applications
PEÑA, TIAO, and TSAY · A Course in Time Series Analysis
PESARIN and SALMASO · Permutation Tests for Complex Data: Applications and Software
PIANTADOSI · Clinical Trials: A Methodologic Perspective, Second Edition
POURAHMADI · Foundations of Time Series Analysis and Prediction Theory
POWELL · Approximate Dynamic Programming: Solving the Curses of Dimensionality, Second Edition
POWELL and RYZHOV · Optimal Learning
PRESS · Subjective and Objective Bayesian Statistics, Second Edition
PRESS and TANUR · The Subjectivity of Scientists and the Bayesian Approach
PURI, VILAPLANA, and WERTZ · New Perspectives in Theoretical and Applied Statistics
† PUTERMAN · Markov Decision Processes: Discrete Stochastic Dynamic Programming
QIU · Image Processing and Jump Regression Analysis
* RAO · Linear Statistical Inference and Its Applications, Second Edition
RAO · Statistical Inference for Fractional Diffusion Processes
RAUSAND and HØYLAND · System Reliability Theory: Models, Statistical Methods, and
Applications, Second Edition
RAYNER, THAS, and BEST · Smooth Tests of Goodness of Fit: Using R, Second Edition
RENCHER and SCHAALJE · Linear Models in Statistics, Second Edition
RENCHER and CHRISTENSEN · Methods of Multivariate Analysis, Third Edition
RENCHER · Multivariate Statistical Inference with Applications
RIGDON and BASU · Statistical Methods for the Reliability of Repairable Systems
* RIPLEY · Spatial Statistics
* RIPLEY · Stochastic Simulation
ROHATGI and SALEH · An Introduction to Probability and Statistics, Second Edition
ROLSKI, SCHMIDLI, SCHMIDT, and TEUGELS · Stochastic Processes for Insurance
and Finance
ROSENBERGER and LACHIN · Randomization in Clinical Trials: Theory and Practice
ROSSI, ALLENBY, and McCULLOCH · Bayesian Statistics and Marketing
† ROUSSEEUW and LEROY · Robust Regression and Outlier Detection
ROYSTON and SAUERBREI · Multivariate Model Building: A Pragmatic Approach to Regression Analysis
Based on Fractional Polynomials for Modeling Continuous Variables
* RUBIN · Multiple Imputation for Nonresponse in Surveys
RUBINSTEIN and KROESE · Simulation and the Monte Carlo Method, Second Edition
RUBINSTEIN and MELAMED · Modern Simulation and Modeling
RYAN · Modern Engineering Statistics
RYAN · Modern Experimental Design
RYAN · Modern Regression Methods, Second Edition
RYAN · Statistical Methods for Quality Improvement, Third Edition
SALEH · Theory of Preliminary Test and Stein-Type Estimation with Applications
SALTELLI, CHAN, and SCOTT (editors) · Sensitivity Analysis
SCHERER · Batch Effects and Noise in Microarray Experiments: Sources and Solutions
* SCHEFFE · The Analysis of Variance
SCHIMEK · Smoothing and Regression: Approaches, Computation, and Application
SCHOTT · Matrix Analysis for Statistics, Second Edition
SCHOUTENS · Lévy Processes in Finance: Pricing Financial Derivatives
SCOTT · Multivariate Density Estimation: Theory, Practice, and Visualization
* SEARLE · Linear Models
† SEARLE · Linear Models for Unbalanced Data
† SEARLE · Matrix Algebra Useful for Statistics
† SEARLE, CASELLA, and McCULLOCH · Variance Components
SEARLE and WILLETT · Matrix Algebra for Applied Economics
SEBER · A Matrix Handbook For Statisticians
† SEBER · Multivariate Observations
SEBER and LEE · Linear Regression Analysis, Second Edition
† SEBER and WILD · Nonlinear Regression
SENNOTT · Stochastic Dynamic Programming and the Control of Queueing Systems
* SERFLING · Approximation Theorems of Mathematical Statistics
SHAFER and VOVK · Probability and Finance: It’s Only a Game!
SHERMAN · Spatial Statistics and Spatio-Temporal Data: Covariance Functions and Directional
Properties
SILVAPULLE and SEN · Constrained Statistical Inference: Inequality, Order, and Shape Restrictions
SINGPURWALLA · Reliability and Risk: A Bayesian Perspective
SMALL and McLEISH · Hilbert Space Methods in Probability and Statistical Inference
SRIVASTAVA · Methods of Multivariate Statistics
STAPLETON · Linear Statistical Models, Second Edition
STAPLETON · Models for Probability and Statistical Inference: Theory and Applications
STAUDTE and SHEATHER · Robust Estimation and Testing
STOYAN · Counterexamples in Probability, Second Edition
STOYAN, KENDALL, and MECKE · Stochastic Geometry and Its Applications, Second Edition
STOYAN and STOYAN · Fractals, Random Shapes and Point Fields: Methods of Geometrical Statistics
STREET and BURGESS · The Construction of Optimal Stated Choice Experiments:
Theory and Methods
STYAN · The Collected Papers of T. W. Anderson: 1943–1985
SUTTON, ABRAMS, JONES, SHELDON, and SONG · Methods for Meta-Analysis in
Medical Research
TAKEZAWA · Introduction to Nonparametric Regression
TAMHANE · Statistical Analysis of Designed Experiments: Theory and Applications
TANAKA · Time Series Analysis: Nonstationary and Noninvertible Distribution Theory
THOMPSON · Empirical Model Building: Data, Models, and Reality, Second Edition
THOMPSON · Sampling, Third Edition
THOMPSON · Simulation: A Modeler’s Approach
THOMPSON and SEBER · Adaptive Sampling
THOMPSON, WILLIAMS, and FINDLAY · Models for Investors in Real World Markets
TIERNEY · LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic
Graphics
TSAY · Analysis of Financial Time Series, Third Edition
UPTON and FINGLETON · Spatial Data Analysis by Example, Volume II: Categorical and Directional
Data
† VAN BELLE · Statistical Rules of Thumb, Second Edition
VAN BELLE, FISHER, HEAGERTY, and LUMLEY · Biostatistics: A Methodology for the Health
Sciences, Second Edition
VESTRUP · The Theory of Measures and Integration
VIDAKOVIC · Statistical Modeling by Wavelets
VIERTL · Statistical Methods for Fuzzy Data
VINOD and REAGLE · Preparing for the Worst: Incorporating Downside Risk in Stock Market
Investments
WALLER and GOTWAY · Applied Spatial Statistics for Public Health Data
WEISBERG · Applied Linear Regression, Third Edition
WEISBERG · Bias and Causation: Models and Judgment for Valid Comparisons
WELSH · Aspects of Statistical Inference
WESTFALL and YOUNG · Resampling-Based Multiple Testing: Examples and Methods for p-Value
Adjustment
* WHITTAKER · Graphical Models in Applied Multivariate Statistics
WINKER · Optimization Heuristics in Economics: Applications of Threshold Accepting
WOODWORTH · Biostatistics: A Bayesian Introduction
WOOLSON and CLARKE · Statistical Methods for the Analysis of Biomedical Data, Second Edition
WU and HAMADA · Experiments: Planning, Analysis, and Parameter Design Optimization, Second
Edition
WU and ZHANG · Nonparametric Regression Methods for Longitudinal Data Analysis
YIN · Clinical Trial Design: Bayesian and Frequentist Adaptive Methods
YOUNG, VALERO-MORA, and FRIENDLY · Visual Statistics: Seeing Data with Dynamic Interactive
Graphics
ZACKS · Stage-Wise Adaptive Designs
* ZELLNER · An Introduction to Bayesian Inference in Econometrics
ZELTERMAN · Discrete Distributions—Applications in the Health Sciences
ZHOU, OBUCHOWSKI, and McCLISH · Statistical Methods in Diagnostic
Medicine, Second Edition
* Now available in a lower priced paperback edition in the Wiley Classics Library.
† Now available in a lower priced paperback edition in the Wiley–Interscience Paperback Series.
Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be
addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030,
(201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. Neither the publisher nor
author shall be liable for any loss of profit or any other commercial damages, including but not limited to
special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at (317)
572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic formats. For more information about Wiley products, visit our web site
at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Rencher, Alvin C., 1934–
Methods of multivariate analysis / Alvin C. Rencher, William F. Christensen, Department of Statistics,
Brigham Young University, Provo, UT. — Third Edition.
pages cm. — (Wiley series in probability and statistics)
Includes index.
ISBN 978-0-470-17896-6 (hardback)
1. Multivariate analysis. I. Christensen, William F., 1970– II. Title.
QA278.R45 2012
519.5’35—dc23
2012009793
Preface
We have long been fascinated by the interplay of variables in multivariate data and by the challenge of
unraveling the effect of each variable. Our continuing objective in the third edition has been to present the
power and utility of multivariate analysis in a highly readable format.
Practitioners and researchers in all applied disciplines often measure several variables on each subject
or experimental unit. In some cases, it may be productive to isolate each variable in a system and study it
separately. Typically, however, the variables are not only correlated with each other, but each variable is
influenced by the other variables as it affects a test statistic or descriptive statistics. Thus, in many
instances, the variables are intertwined in such a way that when analyzed individually they yield little
information about the system. Using multivariate analysis, the variables can be examined simultaneously
in order to access the key features of the process that produced them. The multivariate approach enables
us to (1) explore the joint performance of the variables and (2) determine the effect of each variable in the
presence of the others.
Multivariate analysis provides both descriptive and inferential procedures—we can search for patterns
in the data or test hypotheses about patterns of a priori interest. With multivariate descriptive techniques,
we can peer beneath the tangled web of variables on the surface and extract the essence of the system.
Multivariate inferential procedures include hypothesis tests that (1) process any number of variables
without inflating the Type I error rate and (2) allow for whatever intercorrelations the variables possess. A
wide variety of multivariate descriptive and inferential procedures is readily accessible in statistical
software packages.
Our selection of topics for this volume reflects years of consulting with researchers in many fields of
inquiry. A brief overview of multivariate analysis is given in Chapter 1. Chapter 2 reviews the
fundamentals of matrix algebra. Chapters 3 and 4 give an introduction to sampling from multivariate
populations. Chapters 5, 6, 7, 10, and 11 extend univariate procedures with one dependent variable
(including t-tests, analysis of variance, tests on variances, multiple regression, and multiple correlation) to
analogous multivariate techniques involving several dependent variables. A review of each univariate
procedure is presented before covering the multivariate counterpart. These reviews may provide key
insights that the student has missed in previous courses.
Chapters 8, 9, 12, 13, 14, 15, and 16 describe multivariate techniques that are not extensions of
univariate procedures. In Chapters 8 and 9, we find functions of the variables that discriminate among
groups in the data. In Chapters 12, 13, and 14 (new in the third edition), we find functions of the variables
that reveal the basic dimensionality and characteristic patterns of the data, and we discuss procedures for
finding the underlying latent variables of a system. In Chapters 15 and 16, we give methods for searching
for groups in the data, and we provide plotting techniques that show relationships in a reduced
dimensionality for various kinds of data.
In Appendix A, tables are provided for many multivariate distributions and tests. These enable the
reader to conduct an exact test in many cases for which software packages provide only approximate tests.
Appendix B gives answers and hints for most of the problems in the book.
Appendix C describes an ftp site that contains (1) all data sets and (2) SAS command files for all
examples in the text. These command files can be adapted for use in working problems or in analyzing
data sets encountered in applications.
To illustrate multivariate applications, we have provided many examples and exercises based on 60 real
data sets from a wide variety of disciplines. A practitioner or consultant in multivariate analysis gains
insights and acumen from long experience in working with data. It is not expected that a student can
achieve this kind of seasoning in a one-semester class. However, the examples provide a good start, and
further development is gained by working problems with the data sets. For example, in Chapters 12–14,
the exercises cover several typical patterns in the covariance or correlation matrix. The student’s intuition
is expanded by associating these covariance patterns with the resulting configuration of the principal
components or factors.
Although this is a methods book, we have included a few derivations. For some readers, an occasional
proof provides insights obtainable in no other way. We hope that instructors who do not wish to use
proofs will not be deterred by their presence. The proofs can easily be disregarded when reading the book.
Our objective has been to make the book accessible to readers who have taken as few as two statistical
methods courses. The students in our classes in multivariate analysis include majors in statistics and
majors from other departments. With the applied researcher in mind, we have provided careful intuitive
explanations of the concepts and have included many insights typically available only in journal articles or
in the minds of practitioners.
Our overriding goal in preparation of this book has been clarity of exposition. We hope that students
and instructors alike will find this multivariate text more comfortable than most. In the final stages of
development of each edition, we asked our students for written reports on their initial reaction as they
read each day’s assignment. They made many comments that led to improvements in the manuscript. We
will be very grateful if readers will take the time to notify us of errors or of other suggestions they might
have for improvements.
We have tried to use standard mathematical and statistical notation as far as possible and to maintain
consistency of notation throughout the book. We have refrained from the use of abbreviations and
mnemonic devices. These save space when one is reading a book page by page, but they are annoying to
those using a book as a reference.
Equations are numbered sequentially throughout a chapter; for example, (3.75) indicates the 75th
numbered equation in Chapter 3. Tables and figures are also numbered sequentially throughout a chapter
in the form “Table 3.9” or “Figure 3.1.” Examples are not numbered sequentially; each example is
identified by the same number as the section in which it appears and is placed at the end of the section.
When citing references in the text, we have used the standard format involving the year of publication.
For a journal article, the year alone suffices, for example, Fisher (1936). But for books, we have included a
page number, as in Seber (1984, p. 216).
This is the first volume of a two-volume set on multivariate analysis. The second volume is
entitled Multivariate Statistical Inference and Applications by Alvin Rencher (Wiley, 1998). The two
volumes are not necessarily sequential; they can be read independently. Al adopted the two-volume
format in order to (1) provide broader coverage than would be possible in a single volume and (2) offer the
reader a choice of approach.
The second volume includes proofs of many techniques covered in the first 13 chapters of the present
volume and also introduces additional topics. The present volume includes many examples and problems
using actual data sets, and there are fewer algebraic problems. The second volume emphasizes derivations
of the results and contains fewer examples and problems with real data. The present volume has fewer
references to the literature than the other volume, which includes a careful review of the latest
developments and a more comprehensive bibliography. In this third edition, we have occasionally
referred the reader to “Rencher (1998)” to note that added coverage of a certain subject is available in the
second volume.
We are indebted to many individuals in the preparation of the first two editions of this book. Al’s initial
exposure to multivariate analysis came in courses taught by Rolf Bargmann at the University of Georgia
and D.R. Jensen at Virginia Tech. Additional impetus to probe the subtleties of this field came from
research conducted with Bruce Brown at BYU. William’s interest and training in multivariate statistics
were primarily influenced by Alvin Rencher and Yasuo Amemiya. We thank these mentors and colleagues
and also thank Bruce Brown, Deane Branstetter, Del Scott, Robert Smidt, and Ingram Olkin for reading
various versions of the manuscript and making valuable suggestions.
We are grateful to the following colleagues and students at BYU who helped with computations and
typing: Mitchell Tolland, Tawnia Newton, Marianne Matis Mohr, Gregg Littlefield, Suzanne Kimball,
Wendy Nielsen, Tiffany Nordgren, David Whiting, Karla Wasden, Rachel Jones, Lonette Stoddard,
Candace B. McNaughton, J. D. Williams, and Jonathan Christensen. We are grateful to the many readers
who have pointed out errors or made suggestions for improvements. The book is better for their caring
and their efforts.
THIRD EDITION
For the third edition, we have added a new chapter covering confirmatory factor analysis (Chapter 14). We
have added new sections discussing Kronecker products and vec notation (Section 2.12), dynamic
graphics (Section 3.5), transformations to normality (Section 4.5), classification trees (Section 9.7.4),
seemingly unrelated regressions (Section 10.4.6), and prediction for multivariate multiple regression
(Section 10.6). Additionally, we have updated and revised the graphics throughout the book and have
substantially expanded the section discussing estimation for multivariate multiple regression (Section
10.4).
Many other additions and changes have been made in an effort to broaden the book’s scope and
improve its exposition. Additional problems have been added to accompany the new material. The new ftp
site for the third edition can be found at: ftp://ftp.wiley.com/public/sci_tech_med/multivariate_analysis_3e.
We thank Jonathan Christensen for the countless ways he contributed to the revision, from updated
graphics to technical and formatting assistance. We are grateful to Byron Gajewski, Scott Grimshaw, and
Jeremy Yorgason, who have provided useful feedback on sections of the book that are new to this edition.
We are also grateful to Scott Simpson and Storm Atwood for invaluable editing assistance and to the
students of William’s multivariate statistics course at Brigham Young University for pointing out errors
and making suggestions for improvements in this edition. We are deeply appreciative of the support
provided by the BYU Department of Statistics for the writing of the book.
Al dedicates this volume to his wife, LaRue, who has supplied much needed support and
encouragement. William dedicates this volume to his wife Mary Elizabeth, who has provided both
encouragement and careful feedback throughout the writing process.
ALVIN C. RENCHER & WILLIAM F. CHRISTENSEN
Department of Statistics
Brigham Young University
Provo, Utah
Acknowledgments
We wish to thank the authors, editors, and owners of copyrights for permission to reproduce the following
materials:
• Figure 3.8 and Table 3.2, Kleiner and Hartigan (1981), Reprinted by permission of Journal of the American Statistical Association
• Table 3.3, Colvin et al. (1997), Reprinted by permission of T. S. Colvin and D. L. Karlen
• Table 3.4, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
• Table 3.5, Reaven and Miller (1979), Reprinted by permission of Diabetologia
• Table 3.6, Timm (1975), Reprinted by permission of Elsevier North-Holland Publishing Company
• Table 3.7, Elston and Grizzle (1962), Reprinted by permission of Biometrics
• Table 3.8, Frets (1921), Reprinted by permission of Genetica
• Table 3.9, O’Sullivan and Mahan (1966), Reprinted by permission of American Journal of Clinical Nutrition
• Table 4.2, Royston (1983), Reprinted by permission of Applied Statistics
• Table 5.1, Beall (1945), Reprinted by permission of Psychometrika
• Table 5.2, Hummel and Sligo (1971), Reprinted by permission of Psychological Bulletin
• Table 5.3, Kramer and Jensen (1969b), Reprinted by permission of Journal of Quality Technology
• Table 5.5, Lubischew (1962), Reprinted by permission of Biometrics
• Table 5.6, Travers (1939), Reprinted by permission of Psychometrika
• Table 5.7, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
• Table 5.8, Tintner (1946), Reprinted by permission of Journal of the American Statistical Association
• Table 5.9, Kramer (1972), Reprinted by permission of the author
• Table 5.10, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
• Table 6.2, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
• Table 6.3, Rencher and Scott (1990), Reprinted by permission of Communications in Statistics: Simulation and Computation
• Table 6.6, Posten (1962), Reprinted by permission of the author
• Table 6.8, Crowder and Hand (1990, pp. 21–29), Reprinted by permission of Routledge Chapman and Hall
• Table 6.12, Cochran and Cox (1957), Timm (1980), Reprinted by permission of John Wiley and Sons and Elsevier North-Holland Publishing Company
• Table 6.14, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
• Table 6.16, Potthoff and Roy (1964), Reprinted by permission of Biometrika Trustees
• Table 6.17, Baten, Tack, and Baeder (1958), Reprinted by permission of Quality Progress
• Table 6.18, Keuls et al. (1984), Reprinted by permission of Scientia Horticulturae
• Table 6.19, Burdick (1979), Reprinted by permission of the author
• Table 6.20, Box (1950), Reprinted by permission of Biometrics
• Table 6.21, Rao (1948), Reprinted by permission of Biometrika Trustees
• Table 6.22, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
• Table 6.23, Williams and Izenman (1981), Reprinted by permission of Colorado State University
• Table 6.24, Beauchamp and Hoel (1973), Reprinted by permission of Journal of Statistical Computation and Simulation
• Table 6.25, Box (1950), Reprinted by permission of Biometrics
• Table 6.26, Grizzle and Allen (1969), Reprinted by permission of Biometrics
• Table 6.27, Crepeau et al. (1985), Reprinted by permission of Biometrics
• Table 6.28, Zerbe (1979a), Reprinted by permission of Journal of the American Statistical Association
• Table 6.29, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
• Table 7.1, Siotani et al. (1963), Reprinted by permission of Institute of Statistical Mathematics
• Table 7.2, Reprinted by permission of R. J. Freund
• Table 8.1, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
• Table 8.3, Reprinted by permission of G. R. Bryce and R. M. Barker
• Table 10.1, Box and Youle (1955), Reprinted by permission of Biometrics
• Tables 12.2, 12.3, and 12.4, Jeffers (1967), Reprinted by permission of Applied Statistics
• Table 13.1, Brown et al. (1984), Reprinted by permission of Journal of Pascal, Ada, and Modula
• Correlation matrix in Example 13.6, Brown, Strong, and Rencher (1973), Reprinted by permission of The Journal of the Acoustical Society of America
• Table 15.1, Hartigan (1975a), Reprinted by permission of John Wiley and Sons
• Table 15.3, Dawkins (1989), Reprinted by permission of The American Statistician
• Table 15.7, Hand et al. (1994), Reprinted by permission of D. J. Hand
• Table 15.12, Sokal and Rohlf (1981), Reprinted by permission of W. H. Freeman and Co.
• Table 15.13, Hand et al. (1994), Reprinted by permission of D. J. Hand
• Table 16.1, Kruskal and Wish (1978), Reprinted by permission of Sage Publications
• Tables 16.2 and 16.5, Hand et al. (1994), Reprinted by permission of D. J. Hand
• Table 16.13, Edwards and Kreiner (1983), Reprinted by permission of Biometrika
• Table 16.15, Hand et al. (1994), Reprinted by permission of D. J. Hand
• Table 16.16, Everitt (1987), Reprinted by permission of the author
• Table 16.17, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
• Table 16.18, Clausen (1998), Reprinted by permission of Sage Publications
• Table 16.19, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
• Table A.1, Mulholland (1977), Reprinted by permission of Biometrika Trustees
• Table A.2, D’Agostino and Pearson (1973), Reprinted by permission of Biometrika Trustees
• Table A.3, D’Agostino and Tietjen (1971), Reprinted by permission of Biometrika Trustees
• Table A.4, D’Agostino (1972), Reprinted by permission of Biometrika Trustees
• Table A.5, Mardia (1970, 1974), Reprinted by permission of Biometrika Trustees
• Table A.6, Barnett and Lewis (1978), Reprinted by permission of John Wiley and Sons
• Table A.7, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
• Table A.8, Bailey (1977), Reprinted by permission of Journal of the American Statistical Association
• Table A.9, Wall (1967), Reprinted by permission of the author, Albuquerque, NM
• Table A.10, Pearson and Hartley (1972) and Pillai (1964, 1965), Reprinted by permission of Biometrika Trustees
• Table A.11, Schuurmann et al. (1975), Reprinted by permission of Journal of Statistical Computation and Simulation
• Table A.12, Davis (1970a, b, 1980), Reprinted by permission of Biometrika Trustees
• Table A.13, Kleinbaum, Kupper, and Muller (1988), Reprinted by permission of PWS-KENT Publishing Company
• Table A.14, Lee et al. (1977), Reprinted by permission of Elsevier North-Holland Publishing Company
• Table A.15, Mathai and Katiyar (1979), Reprinted by permission of Biometrika Trustees
CHAPTER 1
INTRODUCTION
1.1 WHY MULTIVARIATE
ANALYSIS?
Multivariate analysis consists of a collection of methods that can be used when several measurements are
made on each individual or object in one or more samples. We will refer to the measurements
as variables and to the individuals or objects as units (research units, sampling units, or experimental
units) or observations. In practice, multivariate data sets are common, although they are not always
analyzed as such. But the exclusive use of univariate procedures with such data is no longer excusable,
given the availability of multivariate techniques and inexpensive computing power to carry them out.
Historically, the bulk of applications of multivariate techniques have been in the behavioral and
biological sciences. However, interest in multivariate methods has now spread to numerous other fields of
investigation. For example, we have collaborated on multivariate problems with researchers in education,
chemistry, environmental science, physics, geology, medicine, engineering, law, business, literature,
religion, public broadcasting, nursing, mining, linguistics, biology, psychology, and many other
fields. Table 1.1 shows some examples of multivariate observations.
Table 1.1 Examples of Multivariate Data

Units                          Variables
1. Students                    Several exam scores in a single course
2. Students                    Grades in mathematics, history, music, art, physics
3. People                      Height, weight, percentage of body fat, resting heart rate
4. Skulls                      Length, width, cranial capacity
5. Companies                   Expenditures for advertising, labor, raw materials
6. Manufactured items          Various measurements to check on compliance with specifications
7. Applicants for bank loans   Income, education level, length of residence, savings account, current debt load
8. Segments of literature      Sentence length, frequency of usage of certain words and style characteristics
9. Human hairs                 Composition of various elements
10. Birds                      Lengths of various bones
The reader will notice that in some cases all the variables are measured in the same scale (see 1 and 2
in Table 1.1). In other cases, measurements are in different scales (see 3 in Table 1.1). In a few techniques
such as profile analysis (Sections 5.9 and 6.8), the variables must be commensurate, that is, similar in
scale of measurement; however, most multivariate methods do not require this.
Ordinarily the variables are measured simultaneously on each sampling unit. Typically, these variables
are correlated. If this were not so, there would be little use for many of the techniques of multivariate
analysis. We need to untangle the overlapping information provided by correlated variables and peer
beneath the surface to see the underlying structure. Thus the goal of many multivariate approaches
is simplification. We seek to express “what is going on” in terms of a reduced set of dimensions. Such
multivariate techniques are exploratory; they essentially generate hypotheses rather than test them.
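To make the goal of simplification concrete, the following minimal sketch (ours, in Python; the book's own examples use SAS command files, and the covariance matrix here is hypothetical) simulates three intercorrelated variables and checks how much of the total variance a single dimension, the first principal component of Chapter 12, captures:

```python
# Minimal sketch: correlated variables often compress into few dimensions.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
# Hypothetical covariance matrix with strong intercorrelations
Sigma = np.array([[1.0, 0.8, 0.7],
                  [0.8, 1.0, 0.6],
                  [0.7, 0.6, 1.0]])
Y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)  # n x p data matrix

S = np.cov(Y, rowvar=False)          # sample covariance matrix (Section 3.7)
eigvals = np.linalg.eigvalsh(S)      # eigenvalues, ascending (Section 2.11)
share = eigvals[-1] / eigvals.sum()  # variance carried by one dimension
print(f"First principal component explains {share:.0%} of total variance")
```

With intercorrelations this strong, a single linear combination typically carries well over half of the total variability, which is precisely the kind of reduction the exploratory techniques of later chapters seek.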
On the other hand, if our goal is a formal hypothesis test, we need a technique that will (1) allow several
variables to be tested and still preserve the significance level and (2) do this for any intercorrelation
structure of the variables. Many such tests are available.
As the two preceding paragraphs imply, multivariate analysis is concerned generally with two
areas, descriptive and inferential statistics. In the descriptive realm, we often obtain optimal linear
combinations of variables. The optimality criterion varies from one technique to another, depending on
the goal in each case. Although linear combinations may seem too simple to reveal the underlying
structure, we use them for two obvious reasons: (1) mathematical tractability (linear approximations are
used throughout all science for the same reason) and (2) they often perform well in practice. These linear
functions may also be useful as a follow-up to inferential procedures. When we have a statistically
significant test result that compares several groups, for example, we can find the linear combination (or
combinations) of variables that led to rejection. Then the contribution of each variable to these linear
combinations is of interest.
In the inferential area, many multivariate techniques are extensions of univariate procedures. In such
cases we review the univariate procedure before presenting the analogous multivariate approach.
Multivariate inference is especially useful in curbing the researcher’s natural tendency to read too
much into the data. Total control is provided for experimentwise error rate; that is, no matter how many
variables are tested simultaneously, the value of α (the significance level) remains at the level set by the
researcher.
Some authors warn against applying the common multivariate techniques to data for which the
measurement scale is not interval or ratio. It has been found, however, that many multivariate techniques
give reliable results when applied to ordinal data.
For many years the applications lagged behind the theory because the computations were beyond the
power of the available desk-top calculators. However, with modern computers, virtually any analysis one
desires, no matter how many variables or observations are involved, can be quickly and easily carried out.
Perhaps it is not premature to say that multivariate analysis has come of age.
1.2 PREREQUISITES
The mathematical prerequisite for reading this book is matrix algebra. Calculus is not used [with a brief
exception in equation (4.29)]. But the basic tools of matrix algebra are essential, and the presentation in
Chapter 2 is intended to be sufficiently complete so that the reader with no previous experience can
master matrix manipulation up to the level required in this book.
The statistical prerequisites are basic familiarity with the normal distribution, t-tests, confidence
intervals, multiple regression, and analysis of variance. These techniques are reviewed as each is extended
to the analogous multivariate procedure.
This is a multivariate methods text. Most of the results are given without proof. In a few cases proofs
are provided, but the major emphasis is on heuristic explanations. Our goal is an intuitive grasp of
multivariate analysis, in the same mode as other statistical methods courses. Some problems are algebraic
in nature, but the majority involve data sets to be analyzed.
1.3 OBJECTIVES
We have formulated three objectives that we hope this book will achieve for the reader. These objectives
are based on long experience teaching a course in multivariate methods, consulting on multivariate
problems with researchers in many fields, and guiding statistics graduate students as they consulted with
similar clients.
The first objective is to gain a thorough understanding of the details of various multivariate techniques,
their purposes, their assumptions, their limitations, and so on. Many of these techniques are related, yet
they differ in some essential ways. These similarities and differences are emphasized.
The second objective is to be able to select one or more appropriate techniques for a given multivariate
data set. Recognizing the essential nature of a multivariate data set is the first step in a meaningful
analysis. Basic types of multivariate data are introduced in Section 1.4.
The third objective is to be able to interpret the results of a computer analysis of a multivariate data set.
Reading the manual for a particular program package is not enough to make an intelligent appraisal of the
output. Achievement of the first objective and practice on data sets in the text should help achieve the
third objective.
1.4 BASIC TYPES OF DATA AND ANALYSIS
We will list four basic types of (continuous) multivariate data and then briefly describe some possible
analyses. Some writers would consider this an oversimplification and might prefer elaborate tree
diagrams of data structure. However, many data sets can fit into one of these categories, and the
simplicity of this structure makes it easier to remember. The four basic data types are as follows:
1. A single sample with several variables measured on each sampling unit
(subject or object).
2. A single sample with two sets of variables measured on each unit.
3. Two samples with several variables measured on each unit.
4. Three or more samples with several variables measured on each unit.
Each data type has extensions, and various combinations of the four are possible. A few examples of
analyses for each case will now be given:
1. A single sample with several variables measured on each sampling unit:
a. Test the hypothesis that the means of the variables have specified values.
b. Test the hypothesis that the variables are uncorrelated and have a common
variance.
c. Find a small set of linear combinations of the original variables that
summarizes most of the variation in the data (principal components).
d. Express the original variables as linear functions of a smaller set of
underlying variables that account for the original variables and their inter-
correlations (factor analysis).
2. A single sample with two sets of variables measured on each unit:
a. Determine the number, the size, and the nature of relationships between
the two sets of variables (canonical correlation). For example, we may wish to
relate a set of interest variables to a set of achievement variables. How much
overall correlation is there between these two sets?
b. Find a model to predict one set of variables from the other set (multivariate
multiple regression).
3. Two samples with several variables measured on each unit:
a. Compare the means of the variables across the two samples (Hotelling’s T2-
test).
b. Find a linear combination of the variables that best separates the two
samples (discriminant analysis).
c. Find a function of the variables that will accurately allocate the units into
the two groups (classification analysis).
4. Three or more samples with several variables measured on each unit:
a. Compare the means of the variables across the groups (multivariate
analysis of variance).
b. Extension of 3b to more than two groups.
c. Extension of 3c to more than two groups.
CHAPTER 2
MATRIX ALGEBRA
2.1 INTRODUCTION
This chapter introduces the basic elements of matrix algebra used in the remainder of this book. It is
essentially a review of the requisite matrix tools and is not intended to be a complete development.
However, it is sufficiently self-contained so that those with no previous exposure to the subject should
need no other reference. Anyone unfamiliar with matrix algebra should plan to work most of the problems
entailing numerical illustrations. It would also be helpful to explore some of the problems involving
general matrix manipulation.
With the exception of a few derivations that seemed instructive, most of the results are given without
proof. Some additional proofs are requested in the problems. For the remaining proofs, see any general
text on matrix theory or one of the specialized matrix texts oriented to statistics, such as Graybill (1969),
Searle (1982), or Harville (1997).
2.2 NOTATION AND BASIC DEFINITIONS
2.2.1 Matrices, Vectors, and Scalars
A matrix is a rectangular or square array of numbers or variables arranged in rows and columns. We use
uppercase boldface letters to represent matrices. All entries in matrices will be real numbers or variables
representing real numbers. The elements of a matrix are displayed in brackets. For example, the ACT
score and GPA for three students can be conveniently listed in the following matrix:
(2.1)
The elements of A can also be variables, representing possible values of ACT and GPA for three students:
(2.2)
In this double-subscript notation for the elements of a matrix, the first subscript indicates the row; the
second identifies the column. The matrix A in (2.2) could also be expressed as
(2.3)
where aij is a general element.
With three rows and two columns, the matrix A in (2.1) or (2.2) is said to be 3 × 2. In general, if a
matrix A has n rows and p columns, it is said to be n × p. Alternatively, we say the size of A is n × p.
A vector is a matrix with a single column or row. The following could be the test scores of a student in a
course in multivariate analysis:
(2.4)
Variable elements in a vector can be identified by a single subscript:
(2.5)
We use lowercase boldface letters for column vectors. Row vectors are expressed as x′ = (x1, x2, …, xn),
where x′ indicates the transpose of x. The transpose operation is defined in Section 2.2.3.
Geometrically, a vector with p elements identifies a point in a p-dimensional space. The elements in the
vector are the coordinates of the point. In (2.35) in Section 2.3.3, we define the distance from the origin to
the point. In Section 3.13, we define the distance between two vectors. In some cases, we will be interested
in a directed line segment or arrow from the origin to the point.
A single real number is called a scalar, to distinguish it from a vector or matrix. Thus 2, −4, and 125 are
scalars. A variable representing a scalar will usually be denoted by a lowercase nonbolded letter, such
as a = 5. A product involving vectors and matrices may reduce to a matrix of size 1 × 1, which then
becomes a scalar.
2.2.2 Equality of Vectors and Matrices
Two matrices are equal if they are the same size and the elements in corresponding positions are equal.
Thus if A = (aij) and B = (bij), then A = B if aij = bij for all i and j. For example, let
Then A = C. But even though A and B have the same elements, A ≠ B because the two matrices are not
the same size. Likewise, A ≠ D because a23 ≠ d23. Thus two matrices of the same size are unequal if they
differ in a single position.
2.2.3 Transpose and Symmetric Matrices
The transpose of a matrix A, denoted by A′, is obtained from A by interchanging rows and columns. Thus
the columns of A′ are the rows of A, and the rows of A′ are the columns of A. The following examples
illustrate the transpose of a matrix or vector:
The transpose operation does not change a scalar, since it has only one row and one column.
If the transpose operator is applied twice to any matrix, the result is the original matrix:
(A′)′ = A.  (2.6)
If the transpose of a matrix is the same as the original matrix, the matrix is said to be symmetric; that
is, A is symmetric if A = A′. For example,
Clearly, all symmetric matrices are square.
2.2.4 Special Matrices
The diagonal of a p × p square matrix A consists of the elements a11, a22, …, app. For example, in the
matrix
the elements 5, 9, and 1 lie on the diagonal. If a matrix contains zeros in all off-diagonal positions, it is
said to be a diagonal matrix. An example of a diagonal matrix is
This matrix could also be denoted as
(2.7)
A diagonal matrix can be formed from any square matrix by replacing off-diagonal elements by 0’s.
This is denoted by diag(A). Thus for the above matrix A, we have
(2.8)
A diagonal matrix with a 1 in each diagonal position is called an identity matrix and is denoted by I. For
example, a 3 × 3 identity matrix is given by
(2.9)
An upper triangular matrix is a square matrix with zeros below the diagonal, for example,
(2.10)
A lower triangular matrix is defined similarly.
A vector of 1’s will be denoted by j:
(2.11)
A square matrix of 1’s is denoted by J. For example, a 3 × 3 matrix J is given by
(2.12)
Finally, we denote a vector of zeros by 0 and a matrix of zeros by O. For example,
(2.13)
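The special matrices of this section are easy to generate in software. The following sketch (ours, not the text's, assuming the NumPy library) constructs each one:

    import numpy as np

    A = np.array([[5., 2., 3.],
                  [2., 9., 7.],
                  [3., 7., 1.]])   # an arbitrary square matrix

    D = np.diag(np.diag(A))        # diag(A) as in (2.8): off-diagonal elements replaced by 0's
    I = np.eye(3)                  # 3 x 3 identity matrix, (2.9)
    U = np.triu(A)                 # upper triangular matrix: zeros below the diagonal, (2.10)
    j = np.ones((3, 1))            # vector of 1's, (2.11)
    J = np.ones((3, 3))            # square matrix of 1's, (2.12)
    O = np.zeros((3, 3))           # matrix of zeros, (2.13)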
2.3 OPERATIONS
2.3.1 Summation and Product Notation
For completeness, we review the standard mathematical notation for sums and products. The sum of a
sequence of numbers a1, a2, …, an is indicated by Σi ai = a1 + a2 + … + an.
If the n numbers are all the same, then Σi a = a + a + … + a = na. The sum of all the numbers in an
array with double subscripts, such as
is indicated by Σi Σj aij.
This is sometimes abbreviated to Σij aij.
The product of a sequence of numbers a1, a2, …, an is indicated by Πi ai = (a1)(a2) … (an).
If the n numbers are all equal, the product becomes Πi a = (a)(a) … (a) = aⁿ.
2.3.2 Addition of Matrices and Vectors
If two matrices (or two vectors) are the same size, their sum is found by adding corresponding elements,
that is, if A is n × p and B is n × p, then C = A + B is also n × p and is found as (cij) = (aij + bij). For
example,
Similarly, the difference between two matrices or two vectors of the same size is found by subtracting
corresponding elements. Thus C = A − B is found as (cij) = (aij − bij). For example,
If two matrices are identical, their difference is a zero matrix; that is, A = B implies A − B = O. For
example,
Matrix addition is commutative:
A + B = B + A.  (2.14)
The transpose of the sum (difference) of two matrices is the sum (difference) of the transposes:
(2.15)
(2.16)
(2.17)
(2.18)
2.3.3 Multiplication of Matrices and Vectors
In order for the product AB to be defined, the number of columns in A must be the same as the number of
rows in B, in which case A and B are said to be conformable. Then the (ij)th element of C = AB is
cij = Σk aikbkj.  (2.19)
Thus cij is the sum of products of the ith row of A and the jth column of B. We therefore multiply each row
of A by each column of B, and the size of AB consists of the number of rows of A and the number of
columns of B. Thus, if A is n × m and B is m × p, then C = AB is n × p. For example, if
then
Note that A is 4 × 3, B is 3 × 2, and AB is 4 × 2. In this case, AB is of a different size than either A or B.
If A and B are both n × n, then AB is also n × n. Clearly, A2 is defined only if A is square.
In some cases AB is defined, but BA is not defined. In the above example, BA cannot be found
because B is 3 × 2 and A is 4 × 3, and a row of B cannot be multiplied by a column of A.
Sometimes AB and BA are both defined but are different in size. For example, if A is 2 × 4 and B is 4 × 2,
then AB is 2 × 2 and BA is 4 × 4. If A and B are square and the same size, then AB and BA are both
defined. However,
AB ≠ BA,  (2.20)
except for a few special cases. For example, let
Then
Thus we must be careful to specify the order of multiplication. If we wish to multiply both sides of a
matrix equation by a matrix, we must multiply “on the left” or “on the right” and be consistent on both
sides of the equation.
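To make (2.20) concrete, here is a small numerical check with two hypothetical 2 × 2 matrices (a NumPy sketch of ours, not an example from the text):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    B = np.array([[0., 1.],
                  [1., 0.]])

    print(A @ B)   # [[2. 1.]
                   #  [4. 3.]]
    print(B @ A)   # [[3. 4.]
                   #  [1. 2.]]  -- AB and BA differ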
Multiplication is distributive over addition or subtraction:
A(B + C) = AB + AC,  (2.21)
A(B − C) = AB − AC,  (2.22)
(A + B)C = AC + BC,  (2.23)
(A − B)C = AC − BC.  (2.24)
Note that, in general, because of (2.20),
(2.25)
Using the distributive law, we can expand products such as (A − B)(C − D) to obtain
(A − B)(C − D) = AC − AD − BC + BD.  (2.26)
The transpose of a product is the product of the transposes in reverse order:
(AB)′ = B′A′.  (2.27)
Note that (2.27) holds as long as A and B are conformable. They need not be square.
Multiplication involving vectors follows the same rules as for matrices. Suppose A is n × p, a is p ×
1, b is p × 1, and c is n × 1. Then some possible products are Ab, c′A, a′b, b′a, and ab′. For example, let
Then
Note that Ab is a column vector, c′A is a row vector, c′Ab is a scalar, and a′b = b′a. The triple
product c′Ab was obtained as c′(Ab). The same result would be obtained if we multiplied in the
order (c′A)b:
This is true in general for a triple product:
A(BC) = (AB)C.  (2.28)
Thus multiplication of three matrices can be defined in terms of the product of two matrices, since
(fortunately) it does not matter which two are multiplied first. Note that A and B must be conformable for
multiplication, and B and C must be conformable. For example, if A is n × p, B is p × q, and C is q × m,
then both multiplications are possible and the product ABC is n × m.
We can sometimes factor a sum of triple products on both the right and left sides. For example,
(2.29)
As another illustration, let X be n × p and A be n × n. Then
(2.30)
If a and b are both n × 1, then
a′b = a1b1 + a2b2 + … + anbn  (2.31)
is a sum of products and is a scalar. On the other hand, ab′ is defined for any size a and b and is a matrix,
either rectangular or square:
(2.32)
Similarly,
a′a = a1² + a2² + … + an²,  (2.33)
aa′ = (aiaj), an n × n matrix.  (2.34)
Thus a′a is a sum of squares and aa′ is a square (symmetric) matrix. The products a′a and aa′ are
sometimes referred to as the dot product and matrix product, respectively. The square root of the sum of
squares of the elements of a is the distance from the origin to the point a and is also referred to as
the length of a:
√(a′a) = √(a1² + a2² + … + an²).  (2.35)
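A short sketch of ours (assuming NumPy, with hypothetical vectors) evaluating the products a′b and ab′ and the length in (2.35):

    import numpy as np

    a = np.array([1., 2., 3.])
    b = np.array([4., 5., 6.])

    dot    = a @ b              # a'b = 32.0, a scalar sum of products, (2.31)
    outer  = np.outer(a, b)     # ab', a 3 x 3 matrix, (2.32)
    length = np.sqrt(a @ a)     # distance from the origin, (2.35)
    print(dot, length)          # 32.0 3.7416...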
As special cases of (2.33) and (2.34), note that if j is n × 1, then
j′j = n,  jj′ = J,  (2.36)
where j and J were defined in (2.11) and (2.12). If a is n × 1 and A is n × p, then
a′j = j′a = Σi ai,  (2.37)
j′A = (Σi ai1, Σi ai2, …, Σi aip),  Aj = (Σj a1j, Σj a2j, …, Σj anj)′.  (2.38)
Thus a′j is the sum of the elements in a, j′A contains the column sums of A, and Aj contains the row sums
of A. In a′j, the vector j is n × 1; in j′A, the vector j is n × 1; and in Aj, the vector j is p × 1.
Since a′b is a scalar, it is equal to its transpose:
a′b = (a′b)′ = b′a.  (2.39)
This allows us to write (a′b)2 in the form
(a′b)² = a′bb′a = a′(bb′)a.  (2.40)
From (2.18), (2.26), and (2.39) we obtain
(a − b)′(a − b) = a′a − 2a′b + b′b.  (2.41)
Note that in analogous expressions with matrices, however, the two middle terms cannot be combined:
(A − B)′(A − B) = A′A − A′B − B′A + B′B.
If a and x1, x2, …, xn are all p × 1 and A is p × p, we obtain the following factoring results as extensions
of (2.21) and (2.29):
(2.42)
(2.43)
(2.44)
(2.45)
We can express matrix multiplication in terms of row vectors and column vectors. If ai′ is the ith row
of A and bj is the jth column of B, then the (ij)th element of AB is ai′bj. For example, if A has three rows
and B has two columns,
then the product AB can be written as
(2.46)
This can be expressed in terms of the rows of A:
(2.47)
Note that the first column of AB in (2.46) is
and likewise the second column is Ab2. Thus AB can be written in the form
This result holds in general:
(2.48)
To further illustrate matrix multiplication in terms of rows and columns,
let A be an n × p matrix, x be a p × 1 vector, and S be a p × p matrix. Then
(2.49)
(2.50)
Any matrix can be multiplied by its transpose. If A is n × p, then AA′ is n × n.
Similarly, A′A is p × p.
From (2.6) and (2.27), it is clear that both AA′ and A′A are symmetric.
In the above illustration for AB in terms of row and column vectors, the rows of A were denoted by ai′
and the columns of B by bj. If both rows and columns of a matrix A are under discussion, as in AA′
and A′A, we will use the notation ai′ for rows and a(j) for columns. To illustrate, if A is 3 × 4, we have
where, for example,
With this notation for rows and columns of A, we can express the elements of A′A or of AA′ as
products of the rows of A or of the columns of A. Thus if we write A in terms of its rows as
then we have
(2.51)
(2.52)
Similarly, if we express A in terms of columns as
then
(2.53)
(2.54)
Let A = (aij) be an n × n matrix and D be a diagonal matrix, D = diag(d1, d2, …, dn). Then, in the
product DA, the ith row of A is multiplied by di, and in AD, the jth column of A is multiplied by dj. For
example, if n = 3, we have
(2.55)
(2.56)
(2.57)
In the special case where the diagonal matrix is the identity, we have
IA = AI = A.  (2.58)
If A is rectangular, (2.58) still holds, but the two identities are of different sizes.
The product of a scalar and a matrix is obtained by multiplying each element of the matrix by the
scalar:
cA = (caij).  (2.59)
For example,
(2.60)
(2.61)
Since caij = aijc, the product of a scalar and a matrix is commutative:
cA = Ac.  (2.62)
Multiplication of vectors or matrices by scalars permits the use of linear combinations, such as
If A is a symmetric matrix and x and y are vectors, the product
x′Ax  (2.63)
is called a quadratic form, while
x′Ay  (2.64)
is called a bilinear form. Either of these is, of course, a scalar and can be treated as such. Expressions such
as √(x′Ax) are permissible (assuming A is positive definite; see Section 2.7).
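As an illustration of quadratic and bilinear forms, the following sketch of ours (assuming NumPy and a hypothetical symmetric A) evaluates (2.63) and (2.64):

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 3.]])    # symmetric (and positive definite)
    x = np.array([1., 2.])
    y = np.array([3., -1.])

    quadratic = x @ A @ x       # x'Ax = 18.0, a scalar, (2.63)
    bilinear  = x @ A @ y       # x'Ay = 5.0, also a scalar, (2.64)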
2.4 PARTITIONED MATRICES
It is sometimes convenient to partition a matrix into submatrices. For example, a partitioning of a
matrix A into four submatrices could be indicated symbolically as follows:
For example, a 4 × 5 matrix A could be partitioned as
where
If two matrices A and B are conformable and A and B are partitioned so that the submatrices are
appropriately conformable, then the product AB can be found by following the usual row-by-column
pattern of multiplication on the submatrices as if they were single elements; for example,
(2.65)
It can be seen that this formulation is equivalent to the usual row-by-column definition of matrix
multiplication. For example, the (1, 1) element of AB is the product of the first row of A and the first
column of B. In the (1, 1) element of A11B11 we have the sum of products of part of the first row of A and
part of the first column of B. In the (1, 1) element of A12B21 we have the sum of products of the rest of the
first row of A and the remainder of the first column of B.
Multiplication of a matrix and a vector can also be carried out in partitioned form. For example,
(2.66)
where the partitioning of the columns of A corresponds to the partitioning of the elements of b. Note that
the partitioning of A into two sets of columns is indicated by a comma, A = (A1, A2).
The partitioned multiplication in (2.66) can be extended to individual columns of A and individual
elements of b:
(2.67)
Thus Ab is expressible as a linear combination of the columns of A, the coefficients being elements of b.
For example, let
Then
Using a linear combination of columns of A as in (2.67), we obtain
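The linear-combination view of (2.67) is easy to verify numerically; a sketch of ours, assuming NumPy, with hypothetical A and b:

    import numpy as np

    A = np.array([[1., 4.],
                  [2., 5.],
                  [3., 6.]])
    b = np.array([2., -1.])

    direct = A @ b                          # usual row-by-column multiplication
    combo  = 2. * A[:, 0] - 1. * A[:, 1]    # b1*(first column) + b2*(second column)
    print(np.allclose(direct, combo))       # True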
We note that if A is partitioned as in (2.66), A = (A1, A2), the transpose is not equal to (A′1, A′2), but
rather
(2.68)
2.5 RANK
Before defining the rank of a matrix, we first introduce the notion of linear independence and
dependence. A set of vectors a1, a2, …, an is said to be linearly dependent if constants c1, c2, …, cn (not all
zero) can be found such that
c1a1 + c2a2 + … + cnan = 0.  (2.69)
If no constants c1, c2, …, cn can be found satisfying (2.69), the set of vectors is said to be linearly
independent.
If (2.69) holds, then at least one of the vectors ai can be expressed as a linear combination of the other
vectors in the set. Thus linear dependence of a set of vectors implies redundancy in the set. Among
linearly independent vectors there is no redundancy of this type.
The rank of any square or rectangular matrix A is defined as
rank(A) = number of linearly independent rows of A
= number of linearly independent columns of A.
It can be shown that the number of linearly independent rows of a matrix is always equal to the number of
linearly independent columns.
If A is n × p, the maximum possible rank of A is the smaller of n and p, in which case A is said to be
of full rank (sometimes said full row rank or full column rank). For example,
has rank 2 because the two rows are linearly independent (neither row is a multiple of the other).
However, even though A is full rank, the columns are linearly dependent because rank 2 implies there are
only two linearly independent columns. Thus, by (2.69), there exist constants c1, c2, and c3 such that
(2.70)
By (2.67), we can write (2.70) in the form
or
(2.71)
A solution vector to (2.70) or (2.71) is given by any multiple of c = (14, −11, −12)′. Hence we have the
interesting result that a product of a matrix A and a vector c is equal to 0, even though A ≠ O and c ≠ 0.
This is a direct consequence of the linear dependence of the column vectors of A.
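These ideas can be checked numerically. The following sketch of ours (assuming NumPy, with a hypothetical matrix, not the one in the text) exhibits a full-rank 2 × 3 matrix whose columns are nevertheless linearly dependent, together with a vector c for which Ac = 0:

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])           # 2 x 3 of rank 2 (full rank)
    print(np.linalg.matrix_rank(A))        # 2

    c = np.array([1., -2., 1.])            # one solution of Ac = 0
    print(A @ c)                           # [0. 0.], although A != O and c != 0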
Another consequence of the linear dependence of rows or columns of a matrix is the possibility of
expressions such as AB = CB, where A ≠ C. For example, let
Then
All three of the matrices A, B, and C are full rank; but being rectangular, they have a rank deficiency in
either rows or columns, which permits us to construct AB = CB with A ≠ C. Thus in a matrix equation, we
cannot, in general, cancel matrices from both sides of the equation.
There are two exceptions to this rule. One involves a nonsingular matrix to be defined in Section 2.6.
The other special case occurs when the expression holds for all possible values of the matrix common to
both sides of the equation. For example,
if Ax = Bx for all x, then A = B.  (2.72)
To see this, let x = (1, 0, …, 0)′. Then the first column of A equals the first column of B. Now let x = (0, 1,
0, …, 0)′, and the second column of A equals the second column of B. Continuing in this fashion, we
obtain A = B.
Suppose a rectangular matrix A is n × p of rank p, where p < n. We typically shorten this statement to
“A is n × p of rank p < n.”
2.6 INVERSE
If a matrix A is square and of full rank, then A is said to be nonsingular, and A has a unique inverse,
denoted by A−1, with the property that
AA−1 = A−1A = I.  (2.73)
For example, let
Then
If A is square and of less than full rank, then an inverse does not exist, and A is said to be singular.
Note that rectangular matrices do not have inverses as in (2.73), even if they are full rank.
If A and B are the same size and nonsingular, then the inverse of their product is the product of their
inverses in reverse order,
(AB)−1 = B−1A−1.  (2.74)
Note that (2.74) holds only for nonsingular matrices. Thus, for example, if A is n × p of rank p < n,
then A′A has an inverse, but (A′A)−1 is not equal to A−1(A′)−1 because A is rectangular and does not have
an inverse.
If a matrix is nonsingular, it can be canceled from both sides of an equation, provided it appears on the
left (right) on both sides. For example, if B is nonsingular, then AB = CB implies A = C,
since we can multiply on the right by B−1 to obtain ABB−1 = CBB−1, that is, A = C.
Otherwise, if A, B, and C are rectangular or square and singular, it is easy to constructAB = CB,
with A ≠ C, as illustrated near the end of Section 2.5.
The inverse of the transpose of a nonsingular matrix is given by the transpose of the inverse:
(A′)−1 = (A−1)′.  (2.75)
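The inverse properties (2.73)-(2.75) can be verified directly; a sketch of ours, assuming NumPy and hypothetical nonsingular matrices:

    import numpy as np

    A = np.array([[3., 1.],
                  [2., 1.]])
    B = np.array([[1., 2.],
                  [0., 1.]])

    Ainv = np.linalg.inv(A)
    print(np.allclose(A @ Ainv, np.eye(2)))            # AA^(-1) = I, (2.73)
    print(np.allclose(np.linalg.inv(A @ B),
                      np.linalg.inv(B) @ Ainv))        # (AB)^(-1) = B^(-1)A^(-1), (2.74)
    print(np.allclose(np.linalg.inv(A.T), Ainv.T))     # (A')^(-1) = (A^(-1))', (2.75)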
If the symmetric nonsingular matrix A is partitioned in the form
then the inverse is given by
(2.76)
where B = A22 − A21A11−1A12. A nonsingular matrix of the form B + cc′, where B is nonsingular, has as
its inverse
(B + cc′)−1 = B−1 − (B−1cc′B−1)/(1 + c′B−1c).  (2.77)
2.7 POSITIVE DEFINITE MATRICES
The symmetric matrix A is said to be positive definite if x′Ax > 0 for all possible vectors x (except x = 0).
Similarly, A is positive semidefinite if x′Ax ≥ 0 for all x ≠ 0. [A quadratic form x′Ax was defined
in (2.63).]
The diagonal elements aii of a positive definite matrix are positive. To see this, let x′ = (0, …, 0, 1, 0, …,
0) with a 1 in the ith position. Then x′Ax = aii > 0. Similarly, for a positive semidefinite matrix A, aii ≥ 0
for all i.
One way to obtain a positive definite matrix is as follows:
If B is n × p of rank p ≤ n, then A = B′B is positive definite.  (2.78)
This is easily shown: x′Ax = x′B′Bx = (Bx)′(Bx) = z′z,
where z = Bx. Thus x′Ax = Σi zi², which is positive (Bx cannot be 0 unless x = 0, because B is full
rank). If B is less than full rank, then by a similar argument, B′B is positive semidefinite.
Note that A = B′B is analogous to a = b2 in real numbers, where the square of any number (including
negative numbers) is positive.
In another analogy to positive real numbers, a positive definite matrix can be factored into a “square
root” in two ways. We give one method below in (2.79) and the other in Section 2.11.8.
A positive definite matrix A can be factored into
A = T′T,  (2.79)
where T is a nonsingular upper triangular matrix. One way to obtain T is the Cholesky decomposition,
which can be carried out in the following steps.
Let A = (aij) and T = (tij) be n × n. Then the elements of T are found as follows:
t11 = √a11,  t1j = a1j/t11 for j > 1,
tii = √(aii − Σk<i tki²) for 1 < i ≤ n,
tij = (aij − Σk<i tkitkj)/tii for i < j,
tij = 0 for i > j.
For example, let
Then by the Cholesky method, we obtain
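In practice the factorization is computed numerically. A sketch of ours, assuming NumPy, whose cholesky routine returns a lower triangular L with A = LL′, so that the upper triangular T of (2.79) is L′ (the matrix here is a hypothetical positive definite example, not the one in the text):

    import numpy as np

    A = np.array([[4., 2., 2.],
                  [2., 10., 7.],
                  [2., 7., 21.]])     # a hypothetical positive definite matrix

    L = np.linalg.cholesky(A)         # lower triangular, A = LL'
    T = L.T                           # upper triangular, so A = T'T as in (2.79)
    print(np.allclose(T.T @ T, A))    # True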
2.8 DETERMINANTS
The determinant of an n × n matrix A is defined as the sum of all n! possible products of n elements such
that
1. Each product contains one element from every row and every column, and
2. The factors in each product are written so that the column subscripts
appear in order of magnitude and each product is then preceded by a plus or
minus sign according to whether the number of inversions in the row
subscripts is even or odd.
An inversion occurs whenever a larger number precedes a smaller one. The symbol n! is defined as
n! = n(n − 1)(n − 2) … (2)(1).  (2.80)
The determinant of A is a scalar denoted by |A| or by det(A). The preceding definition is not useful in
evaluating determinants, except in the case of 2 × 2 or 3 × 3 matrices. For larger matrices, other methods
are available for manual computation, but determinants are typically evaluated by computer. For a 2 × 2
matrix, the determinant is found by
|A| = a11a22 − a21a12.  (2.81)
For a 3 × 3 matrix, the determinant is given by
|A| = a11a22a33 + a12a23a31 + a13a21a32 − a31a22a13 − a32a23a11 − a33a21a12.  (2.82)
This can be found by the familiar diagonal scheme: the three positive terms are the products of elements
on the diagonals running from upper left to lower right, and the three negative terms are the products on
the diagonals running from upper right to lower left.
The determinant of a diagonal matrix is the product of the diagonal elements; that is, if D = diag(d1, d2,
…, dn), then
|D| = d1d2 … dn = Πi di.  (2.83)
As a special case of (2.83), suppose all diagonal elements are equal, say, D = diag(c, c, …, c) = cI. Then
|D| = |cI| = cⁿ.  (2.84)
The extension of (2.84) to any square matrix A is
|cA| = cⁿ|A|.  (2.85)
Because the determinant is a scalar, we can carry out operations such as |A|1/2 and 1/|A|,
provided that |A| > 0 for |A|1/2 and that |A| ≠ 0 for 1/|A|.
If the square matrix A is singular, its determinant is 0:
|A| = 0.  (2.86)
If A is near singular, then there exists a linear combination of the columns that is close to 0, and |A| is
also close to 0. If A is nonsingular, its determinant is nonzero:
|A| ≠ 0.  (2.87)
If A is positive definite, its determinant is positive:
|A| > 0.  (2.88)
If A and B are square and the same size, the determinant of the product is the product of the
determinants:
|AB| = |A||B|.  (2.89)
For example, let
Then
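A numerical check of (2.81) and (2.89), in a sketch of ours assuming NumPy and hypothetical matrices:

    import numpy as np

    A = np.array([[3., 1.],
                  [2., 4.]])
    B = np.array([[1., 2.],
                  [3., 5.]])

    print(np.linalg.det(A))      # 10.0 = (3)(4) - (2)(1), as in (2.81)
    print(np.isclose(np.linalg.det(A @ B),
                     np.linalg.det(A) * np.linalg.det(B)))   # |AB| = |A||B|, (2.89)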
The determinant of the transpose of a matrix is the same as the determinant of the matrix, and the
determinant of the inverse of a matrix is the reciprocal of the determinant:
|A′| = |A|,  (2.90)
|A−1| = 1/|A|.  (2.91)
If a partitioned matrix is block diagonal, with square blocks A11 and A22 on the diagonal (not
necessarily the same size) and zero matrices O elsewhere, then
|A| = |A11||A22|.  (2.92)
For a general partitioned matrix,
where A11 and A22 are square and nonsingular (not necessarily the same size), the determinant is given by
either of the following two expressions:
|A| = |A11||A22 − A21A11−1A12|  (2.93)
= |A22||A11 − A12A22−1A21|.  (2.94)
Note the analogy of (2.93) and (2.94) to the case of the determinant of a 2 × 2 matrix as given by (2.81):
|A| = a11a22 − a12a21 = a11(a22 − a21a12/a11) = a22(a11 − a12a21/a22).
If B is nonsingular and c is a vector, then
|B + cc′| = |B|(1 + c′B−1c).  (2.95)
2.9 TRACE
A simple function of an n × n matrix A is the trace, denoted by tr(A) and defined as the sum of the
diagonal elements of A; that is, tr(A) = Σi aii. The trace is, of course, a scalar. For example, suppose
Then
The trace of the sum of two square matrices is the sum of the traces of the two matrices:
tr(A + B) = tr(A) + tr(B).  (2.96)
An important result for the product of two matrices is
tr(AB) = tr(BA).  (2.97)
This result holds for any matrices A and B where AB and BA are both defined. It is not necessary
that A and B be square or that AB equal BA. For example, let
Then
From (2.52) and (2.54), we obtain
tr(A′A) = tr(AA′) = Σi Σj aij²,  (2.98)
where the aij's are elements of the n × p matrix A.
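The trace identities are easy to confirm; a sketch of ours, assuming NumPy, with hypothetical rectangular matrices:

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])    # 2 x 3
    B = np.array([[1., 0.],
                  [2., 1.],
                  [0., 3.]])        # 3 x 2

    print(np.trace(A @ B))          # 28.0, trace of a 2 x 2 product
    print(np.trace(B @ A))          # 28.0, trace of a 3 x 3 product, (2.97)
    print(np.trace(A.T @ A))        # 91.0 = sum of squared elements of A, (2.98)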
2.10 ORTHOGONAL VECTORS AND MATRICES
Two vectors a and b of the same size are said to be orthogonal if
a′b = 0.  (2.99)
Geometrically, orthogonal vectors are perpendicular [see (3.14) and the comments following (3.14)].
If a′a = 1, the vector a is said to be normalized. The vector a can always be normalized by dividing by its
length, √(a′a). Thus
c = a/√(a′a)  (2.100)
is normalized so that c′c = 1.
A matrix C = (c1, c2, …, cp) whose columns are normalized and mutually orthogonal is called
an orthogonal matrix. Since the elements of C′C are products of columns of C [see (2.54)], which have the
properties ci′ci = 1 for all i and ci′cj = 0 for all i ≠ j, we have
C′C = I.  (2.101)
If C satisfies (2.101), it necessarily follows that
CC′ = I,  (2.102)
from which we see that the rows of C are also normalized and mutually orthogonal. It is clear
from (2.101) and (2.102) that C−1 = C′ for an orthogonal matrix C.
We illustrate the creation of an orthogonal matrix by starting with
whose columns are mutually orthogonal. To normalize the three columns, we divide each by its
length to obtain
Note that the rows also became normalized and mutually orthogonal so that C satisfies
both (2.101) and (2.102).
Multiplication by an orthogonal matrix has the effect of rotating axes; that is, if a point x is transformed
to z = Cx, where C is orthogonal, then
z′z = (Cx)′(Cx) = x′C′Cx = x′x,  (2.103)
and the distance from the origin to z is the same as the distance to x.
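A sketch of ours (assuming NumPy, with a hypothetical starting matrix, not the one in the text) that normalizes mutually orthogonal columns and checks (2.101) and (2.103):

    import numpy as np

    B = np.array([[1.,  1.,  1.],
                  [1., -1.,  1.],
                  [1.,  0., -2.]])          # columns are mutually orthogonal

    C = B / np.sqrt((B**2).sum(axis=0))     # divide each column by its length
    print(np.allclose(C.T @ C, np.eye(3)))  # C'C = I, (2.101)

    x = np.array([3., -1., 2.])
    z = C @ x                               # rotation of axes
    print(np.isclose(z @ z, x @ x))         # distance preserved, (2.103)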
2.11 EIGENVALUES AND EIGENVECTORS
2.11.1 Definition
For every square matrix A, a scalar λ and a nonzero vector x can be found such that
Ax = λx.  (2.104)
In (2.104), λ is called an eigenvalue of A and x is an eigenvector. To find λ and x, we write (2.104) as
(A − λI)x = 0.  (2.105)
If |A − λI| ≠ 0, then (A − λI) has an inverse and x = 0 is the only solution. Hence, in order to obtain
nontrivial solutions, we set |A − λI| = 0 to find values of λ that can be substituted into (2.105) to find
corresponding values of x. Alternatively, (2.69) and (2.71) require that the columns of A − λI be linearly
dependent. Thus in (A − λI)x = 0, the matrix A − λI must be singular in order to find a solution
vector x that is not 0.
The equation |A − λI| = 0 is called the characteristic equation. If A is n × n, the characteristic equation
will have n roots; that is, A will have n eigenvalues λ1, λ2, …, λn. The λ’s will not necessarily all be distinct
or all nonzero. However, if A arises from computations on real (continuous) data and is nonsingular, the
λ’s will all be distinct (with probability 1). After finding λ1, λ2, …, λn, the accompanying eigenvectors x1, x2,
…, xn can be found using (2.105).
If we multiply both sides of (2.105) by a scalar k and note by (2.62) that k and A − λI commute, we
obtain
(A − λI)(kx) = 0.  (2.106)
Thus if x is an eigenvector of A, kx is also an eigenvector, and eigenvectors are unique only up to
multiplication by a scalar. Hence we can adjust the length of x, but the direction from the origin is unique;
that is, the relative values of (ratios of) the components of x = (x1, x2, …, xn)′ are unique. Typically, the
eigenvector x is scaled so that x′x = 1.
To illustrate, we will find the eigenvalues and eigenvectors for the matrix
The characteristic equation is
from which λ1 = 3 and λ2 = 2. To find the eigenvector corresponding to λ1 = 3, we use (2.105),
As expected, either equation is redundant in the presence of the other, and there remains a single
equation with two unknowns, x1 = x2. The solution vector can be written with an arbitrary constant,
If c is set equal to 1/√2 to normalize the eigenvector, we obtain
Similarly, corresponding to λ2 = 2, we have
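Eigenvalues and eigenvectors are routinely found numerically. A sketch of ours, assuming NumPy and a hypothetical 2 × 2 matrix that happens to have eigenvalues 3 and 2:

    import numpy as np

    A = np.array([[1., 2.],
                  [-1., 4.]])

    vals, vecs = np.linalg.eig(A)
    print(vals)                    # [2. 3.] (the order is not guaranteed)
    # each column of vecs is an eigenvector scaled so that x'x = 1;
    # verify Ax = lambda*x, as in (2.104):
    print(np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0]))   # True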
2.11.2 I + A and I − A
If λ is an eigenvalue of A and x is the corresponding eigenvector, then 1 + λ is an eigenvalue of I + A and 1
− λ is an eigenvalue of I − A. In either case, x is the corresponding eigenvector.
We demonstrate this for I + A:
(I + A)x = Ix + Ax = x + λx = (1 + λ)x.
2.11.3 tr(A) and |A|
For any square matrix A with eigenvalues λ1, λ2, …, λn, we have
tr(A) = Σi λi,  (2.107)
|A| = λ1λ2 … λn = Πi λi.  (2.108)
Note that by the definition in Section 2.9, tr(A) is also equal to Σi aii, but aii ≠ λi.
We illustrate (2.107) and (2.108) using the matrix
from the illustration in Section 2.11.1, for which λ1 = 3 and λ2 = 2. Using (2.107), we obtain
and from (2.108), we have
By definition, we obtain
2.11.4 Positive Definite and Semidefinite Matrices
The eigenvalues and eigenvectors of positive definite and positive semidefinite matrices have the
following properties:
1. The eigenvalues of a positive definite matrix are all positive.
2. The eigenvalues of a positive semidefinite matrix are positive or zero, with
the number of positive eigenvalues equal to the rank of the matrix.
It is customary to list the eigenvalues of a positive definite matrix in descending order: λ1 > λ2 > … > λp.
The eigenvectors x1, x2, …, xn are listed in the same order; x1 corresponds to λ1, x2 corresponds to λ2, and
so on.
The following result, known as the Perron–Frobenius theorem, is of interest in Chapter 12: If all
elements of the positive definite matrix A are positive, then all elements of the first eigenvector are
positive. (The first eigenvector is the one associated with the first eigenvalue, λ1).
2.11.5 The Product AB
If A and B are square and the same size, the eigenvalues of AB are the same as those of BA, although the
eigenvectors are usually different. This result also holds if AB and BA are both square but of different
sizes, as when A is n × p and B is p × n. (In this case, the nonzero eigenvalues of AB and BA will be the
same.)
2.11.6 Symmetric Matrix
The eigenvectors of an n × n symmetric matrix A are mutually orthogonal. It follows that if
the n eigenvectors of A are normalized and inserted as columns of a matrix C = (x1, x2, …, xn), then C is
orthogonal.
2.11.7 Spectral Decomposition
It was noted in Section 2.11.6 that if the matrix C = (x1, x2, …, xn) contains the normalized eigenvectors of
an n × n symmetric matrix A, then C is orthogonal. Therefore, by (2.102), I = CC′, which we can multiply
by A to obtain
A = AI = ACC′.
We now substitute C = (x1, x2, …, xn):
A = CDC′ = Σi λi xixi′,  (2.109)
where
D = diag(λ1, λ2, …, λn).  (2.110)
The expression A = CDC′ in (2.109) for a symmetric matrix A in terms of its eigenvalues and eigenvectors
is known as the spectral decomposition of A.
Since C is orthogonal and C′C = CC′ = I, we can multiply (2.109) on the left by C′ and on the right
by C to obtain
C′AC = D.  (2.111)
Thus a symmetric matrix A can be diagonalized by an orthogonal matrix containing normalized
eigenvectors of A, and by (2.110) the resulting diagonal matrix contains eigenvalues of A.
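A numerical illustration of the spectral decomposition, in a sketch of ours assuming NumPy (eigh is NumPy's eigenroutine for symmetric matrices, and it returns an orthogonal C; the matrix is hypothetical):

    import numpy as np

    A = np.array([[4., 1.],
                  [1., 3.]])                # a hypothetical symmetric matrix

    vals, C = np.linalg.eigh(A)
    D = np.diag(vals)

    print(np.allclose(C @ D @ C.T, A))      # A = CDC', (2.109)
    print(np.allclose(C.T @ A @ C, D))      # C'AC = D, (2.111)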
2.11.8 Square Root Matrix
If A is positive definite, the spectral decomposition of A in (2.109) can be modified by taking the square
roots of the eigenvalues to produce a square root matrix,
A1/2 = CD1/2C′,  (2.112)
where
D1/2 = diag(√λ1, √λ2, …, √λn).  (2.113)
The square root matrix A1/2 is symmetric and serves as the square root of A:
A1/2A1/2 = A.  (2.114)
2.11.9 Square and Inverse Matrices
Other functions of A have spectral decompositions analogous to (2.112). Two of these are the square and
inverse of A. If the square matrix A has eigenvalues λ1, λ2, …, λn and accompanying eigenvectors x1, x2,
…, xn, then A2 has eigenvalues λ1², λ2², …, λn² and eigenvectors x1, x2, …, xn. If A is nonsingular,
then A−1 has eigenvalues 1/λ1, 1/λ2, …, 1/λn and eigenvectors x1, x2, …, xn. If A is also symmetric, then
A2 = CD2C′,  (2.115)
A−1 = CD−1C′,  (2.116)
where C = (x1, x2, …, xn) has as columns the normalized eigenvectors of A (and of A2 and A−1),
D2 = diag(λ1², λ2², …, λn²), and D−1 = diag(1/λ1, 1/λ2, …, 1/λn).
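Both the square root matrix and the inverse can be built from the same decomposition; a sketch of ours, assuming NumPy and a hypothetical positive definite A:

    import numpy as np

    A = np.array([[4., 1.],
                  [1., 3.]])

    vals, C = np.linalg.eigh(A)
    A_half = C @ np.diag(np.sqrt(vals)) @ C.T    # A^(1/2) = CD^(1/2)C', (2.112)
    A_inv  = C @ np.diag(1. / vals) @ C.T        # A^(-1) = CD^(-1)C', (2.116)

    print(np.allclose(A_half @ A_half, A))       # (2.114)
    print(np.allclose(A_inv, np.linalg.inv(A)))  # True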
2.11.10 Singular Value Decomposition
In Section 2.11.7 we expressed a symmetric matrix in terms of its eigenvalues and eigenvectors in the
spectral decomposition. In a similar manner, we can express any (real) matrix A in terms of eigenvalues
and eigenvectors of A′A and AA′. Let A be an n× p matrix of rank k. Then the singular value
decomposition of A can be expressed as
A = UDV′,  (2.117)
where U is n × k, D is k × k, and V is p × k. The diagonal elements of the nonsingular diagonal
matrix D = diag(λ1, λ2, …, λk) are the positive square roots of λ1², λ2², …, λk², which are the nonzero
eigenvalues of A′A or of AA′. The values λ1, λ2, …, λk are called the singular values of A. The k columns
of U are the normalized eigenvectors of AA′ corresponding to the eigenvalues λ1², λ2², …, λk².
The k columns of V are the normalized eigenvectors of A′A corresponding to the same
eigenvalues λ1², λ2², …, λk². Since the columns of U and V are (normalized) eigenvectors of symmetric
matrices, they are mutually orthogonal (see Section 2.11.6), and we have U′U = V′V = I.
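A sketch of ours, assuming NumPy, that computes the singular value decomposition of a hypothetical 3 × 2 matrix and checks the connection to the eigenvalues of A′A:

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.],
                  [5., 6.]])                          # n = 3, p = 2, rank k = 2

    U, d, Vt = np.linalg.svd(A, full_matrices=False)  # A = UDV' with D = diag(d), (2.117)
    print(np.allclose(U @ np.diag(d) @ Vt, A))        # True
    print(np.allclose(d**2,
                      np.linalg.eigvalsh(A.T @ A)[::-1]))   # squared singular values are
                                                            # the eigenvalues of A'A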
2.12 KRONECKER AND VEC NOTATION
When manipulating matrices with block structure, it is often convenient to use Kronecker and vec
notation. The Kronecker product of an m × n matrix A with a p × q matrix B is an mp × nq matrix that is
denoted A ⊗ B and is defined as
A ⊗ B = (aijB),  (2.118)
the block matrix whose (i, j)th block is the p × q matrix aijB.
For example, because of the block structure in the matrix
we can write
where
If A is an m × n matrix with columns a1, …, an, then we can refer to the elements of A in vector (or
“vec”) form using
vec A = (a1′, a2′, …, an′)′,  (2.119)
so that vec A is an mn × 1 vector.
If A is an m × m symmetric matrix, then the m2 elements in vec A will include m(m − 1)/2 pairs of
identical elements (since aij = aji). In such settings it is often useful to denote the vector half (or “vech”) of
a symmetric matrix in order to include only the unique elements of the matrix. If we separate elements
from different columns using semicolons, we can define the “vech” operator with
(2.120)
so that vech A is an m(m + 1)/2 × 1 vector. Note that vech A can be obtained by finding vec A and then
eliminating the m(m − 1)/2 elements above the diagonal of A.
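A sketch of ours, assuming NumPy, of the Kronecker, vec, and vech operations (np.kron implements (2.118); vec and vech are built here from reshaping and triangular indexing):

    import numpy as np

    A = np.array([[1., 2.],
                  [3., 4.]])
    B = np.array([[0., 1.],
                  [1., 0.]])

    K = np.kron(A, B)                    # A ⊗ B, a 4 x 4 block matrix, (2.118)

    vecA = A.reshape(-1, order='F')      # vec A stacks the columns: [1. 3. 2. 4.]

    S = np.array([[1., 2.],
                  [2., 5.]])             # symmetric, so s12 = s21
    vechS = S[np.triu_indices_from(S)]   # [1. 2. 5.]; by symmetry this lists the
                                         # unique elements of S column by column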
  • 5. 11.6 Relationships of Canonical Correlation Analysis to Other Multivariate Techniques Problems Chapter 12: Principal Component Analysis 12.1 Introduction 12.2 Geometric and Algebraic Bases of Principal Components 12.3 Principal Components and Perpendicular Regression 12.4 Plotting of Principal Components 12.5 Principal Components From the Correlation Matrix 12.6 Deciding How Many Components to Retain 12.7 Information in the Last Few Principal Components 12.8 Interpretation of Principal Components 12.9 Selection of Variables Problems Chapter 13: Exploratory Factor Analysis 13.1 Introduction 13.2 Orthogonal Factor Model 13.3 Estimation of Loadings and Communalities 13.4 Choosing the Number of Factors, M 13.5 Rotation 13.6 Factor Scores 13.7 Validity of the Factor Analysis Model 13.8 Relationship of Factor Analysis to Principal Component Analysis Problems Chapter 14: Confirmatory Factor Analysis 14.1 Introduction 14.2 Model Specification and Identification 14.3 Parameter Estimation and Model Assessment 14.4 Inference for Model Parameters 14.5 Factor Scores Problems Chapter 15: Cluster Analysis 15.1 Introduction 15.2 Measures of Similarity or Dissimilarity 15.3 Hierarchical Clustering 15.4 Nonhierarchical Methods 15.5 Choosing the Number of Clusters 15.6 Cluster Validity 15.7 Clustering Variables Problems Chapter 16: Graphical Procedures
  • 6. 16.1 Multidimensional Scaling 16.2 Correspondence Analysis 16.3 Biplots Problems Appendix A: Tables Appendix B: Answers and Hints to Problems Appendix C: Data Sets and Sas Files References Index METHODS OF MULTIVARIATE ANALYSIS WILEY SERIES IN PROBABILITY AND STATISTICS ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg Editors Emeriti: Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods. Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research. † ABRAHAM and LEDOLTER · Statistical Methods for Forecasting AGRESTI · Analysis of Ordinal Categorical Data, Second Edition AGRESTI · An Introduction to Categorical Data Analysis, Second Edition AGRESTI · Categorical Data Analysis, Second Edition ALTMAN, GILL, and McDONALD · Numerical Issues in Statistical Computing for the Social Scientist AMARATUNGA and CABRERA · Exploration and Analysis of DNA Microarray and Protein Array Data AND L · Mathematics of Chance ANDERSON · An Introduction to Multivariate Statistical Analysis, Third Edition * ANDERSON · The Statistical Analysis of Time Series ANDERSON, AUQUIER, HAUCK, OAKES, VANDAELE, and WEISBERG · Statistical Methods for Comparative Studies ANDERSON and LOYNES · The Teaching of Practical Statistics ARMITAGE and DAVID (editors) · Advances in Biometry ARNOLD, BALAKRISHNAN, and NAGARAJA · Records * ARTHANARI and DODGE · Mathematical Programming in Statistics * BAILEY · The Elements of Stochastic Processes with Applications to the Natural Sciences BAJORSKI · Statistics for Imaging, Optics, and Photonics BALAKRISHNAN and KOUTRAS · Runs and Scans with Applications BALAKRISHNAN and NG · Precedence-Type Tests and Applications BARNETT · Comparative Statistical Inference, Third Edition BARNETT · Environmental Statistics BARNETT and LEWIS · Outliers in Statistical Data, Third Edition BARTHOLOMEW, KNOTT, and MOUSTAKI · Latent Variable Models and Factor Analysis: A Unified Approach, Third Edition
  • 7. BARTOSZYNSKI and NIEWIADOMSKA-BUGAJ · Probability and Statistical Inference, Second Edition BASILEVSKY · Statistical Factor Analysis and Related Methods: Theory and Applications BATES and WATTS · Nonlinear Regression Analysis and Its Applications BECHHOFER, SANTNER, and GOLDSMAN · Design and Analysis of Experiments for Statistical Selection, Screening, and Multiple Comparisons BEIRLANT, GOEGEBEUR, SEGERS, TEUGELS, and DE WAAL · Statistics of Extremes: Theory and Applications BELSLEY · Conditioning Diagnostics: Collinearity and Weak Data in Regression † BELSLEY, KUH, and WELSCH · Regression Diagnostics: Identifying Influential Data and Sources of Collinearity BENDAT and PIERSOL · Random Data: Analysis and Measurement Procedures, Fourth Edition BERNARDO and SMITH · Bayesian Theory BHAT and MILLER · Elements of Applied Stochastic Processes, Third Edition BHATTACHARYA and WAYMIRE · Stochastic Processes with Applications BIEMER, GROVES, LYBERG, MATHIOWETZ, and SUDMAN · Measurement Errors in Surveys BILLINGSLEY · Convergence of Probability Measures, Second Edition BILLINGSLEY · Probability and Measure, Anniversary Edition BIRKES and DODGE · Alternative Methods of Regression BISGAARD and KULAHCI · Time Series Analysis and Forecasting by Example BISWAS, DATTA, FINE, and SEGAL · Statistical Advances in the Biomedical Sciences: Clinical Trials, Epidemiology, Survival Analysis, and Bioinformatics BLISCHKE and MURTHY (editors) · Case Studies in Reliability and Maintenance BLISCHKE and MURTHY · Reliability: Modeling, Prediction, and Optimization BLOOMFIELD · Fourier Analysis of Time Series: An Introduction, Second Edition BOLLEN · Structural Equations with Latent Variables BOLLEN and CURRAN · Latent Curve Models: A Structural Equation Perspective BOROVKOV · Ergodicity and Stability of Stochastic Processes BOSQ and BLANKE · Inference and Prediction in Large Dimensions BOULEAU · Numerical Methods for Stochastic Processes * BOX and TIAO · Bayesian Inference in Statistical Analysis BOX · Improving Almost Anything, Revised Edition * BOX and DRAPER · Evolutionary Operation: A Statistical Method for Process Improvement BOX and DRAPER · Response Surfaces, Mixtures, and Ridge Analyses, Second Edition BOX, HUNTER, and HUNTER · Statistics for Experimenters: Design, Innovation, and Discovery, Second Edition BOX, JENKINS, and REINSEL · Time Series Analysis: Forecasting and Control, Fourth Edition BOX, LUCEÑO, and PANIAGUA-QUIÑONES · Statistical Control by Monitoring and Adjustment, Second Edition * BROWN and HOLLANDER · Statistics: A Biomedical Introduction CAIROLI and DALANG · Sequential Stochastic Optimization CASTILLO, HADI, BALAKRISHNAN, and SARABIA · Extreme Value and Related Models with Applications in Engineering and Science CHAN · Time Series: Applications to Finance with R and S-Plus®, Second Edition CHARALAMBIDES · Combinatorial Methods in Discrete Distributions CHATTERJEE and HADI · Regression Analysis by Example, Fourth Edition CHATTERJEE and HADI · Sensitivity Analysis in Linear Regression CHERNICK · Bootstrap Methods: A Guide for Practitioners and Researchers, Second Edition CHERNICK and FRIIS · Introductory Biostatistics for the Health Sciences CHILÈS and DELFINER · Geostatistics: Modeling Spatial Uncertainty, Second Edition CHOW and LIU · Design and Analysis of Clinical Trials: Concepts and Methodologies, Second Edition CLARKE · Linear Models: The Theory and Application of Analysis of Variance CLARKE and DISNEY · Probability and Random Processes: A First Course with Applications, Second Edition *
COCHRAN and COX · Experimental Designs, Second Edition COLLINS and LANZA · Latent Class and Latent Transition Analysis: With Applications in the Social, Behavioral, and Health Sciences CONGDON · Applied Bayesian Modelling CONGDON · Bayesian Models for Categorical Data
  • 8. CONGDON · Bayesian Statistical Modelling, Second Edition CONOVER · Practical Nonparametric Statistics, Third Edition COOK · Regression Graphics COOK and WEISBERG · An Introduction to Regression Graphics COOK and WEISBERG · Applied Regression Including Computing and Graphics CORNELL · A Primer on Experiments with Mixtures CORNELL · Experiments with Mixtures, Designs, Models, and the Analysis of Mixture Data, Third Edition COX · A Handbook of Introductory Statistical Methods CRESSIE · Statistics for Spatial Data, Revised Edition CRESSIE and WIKLE · Statistics for Spatio-Temporal Data CSÖRGÖ and HORVÁTH · Limit Theorems in Change Point Analysis DAGPUNAR · Simulation and Monte Carlo: With Applications in Finance and MCMC DANIEL · Applications of Statistics to Industrial Experimentation DANIEL · Biostatistics: A Foundation for Analysis in the Health Sciences, Eighth Edition * DANIEL · Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition DASU and JOHNSON · Exploratory Data Mining and Data Cleaning DAVID and NAGARAJA · Order Statistics, Third Edition * DEGROOT, FIENBERG, and KADANE · Statistics and the Law DEL CASTILLO · Statistical Process Adjustment for Quality Control DEMARIS · Regression with Social Data: Modeling Continuous and Limited Response Variables DEMIDENKO · Mixed Models: Theory and Applications DENISON, HOLMES, MALLICK and SMITH · Bayesian Methods for Nonlinear Classification and Regression DETTE and STUDDEN · The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis DEY and MUKERJEE · Fractional Factorial Plans DILLON and GOLDSTEIN · Multivariate Analysis: Methods and Applications * DODGE and ROMIG · Sampling Inspection Tables, Second Edition * DOOB · Stochastic Processes DOWDY, WEARDEN, and CHILKO · Statistics for Research, Third Edition DRAPER and SMITH · Applied Regression Analysis, Third Edition DRYDEN and MARDIA · Statistical Shape Analysis DUDEWICZ and MISHRA · Modern Mathematical Statistics DUNN and CLARK · Basic Statistics: A Primer for the Biomedical Sciences, Fourth Edition DUPUIS and ELLIS · A Weak Convergence Approach to the Theory of Large Deviations EDLER and KITSOS · Recent Advances in Quantitative Methods in Cancer and Human Health Risk Assessment * ELANDT-JOHNSON and JOHNSON · Survival Models and Data Analysis ENDERS · Applied Econometric Time Series, Third Edition † ETHIER and KURTZ · Markov Processes: Characterization and Convergence EVANS, HASTINGS, and PEACOCK · Statistical Distributions, Third Edition EVERITT, LANDAU, LEESE, and STAHL · Cluster Analysis, Fifth Edition FEDERER and KING · Variations on Split Plot and Split Block Experiment Designs FELLER · An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, Revised; Volume II, Second Edition FITZMAURICE, LAIRD, and WARE · Applied Longitudinal Analysis, Second Edition * FLEISS · The Design and Analysis of Clinical Experiments FLEISS · Statistical Methods for Rates and Proportions, Third Edition † FLEMING and HARRINGTON · Counting Processes and Survival Analysis FUJIKOSHI, ULYANOV, and SHIMIZU · Multivariate Statistics: High-Dimensional and Large-Sample Approximations FULLER · Introduction to Statistical Time Series, Second Edition † FULLER · Measurement Error Models GALLANT · Nonlinear Statistical Models GEISSER · Modes of Parametric Statistical Inference
  • 9. GELMAN and MENG · Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives GEWEKE · Contemporary Bayesian Econometrics and Statistics GHOSH, MUKHOPADHYAY, and SEN · Sequential Estimation GIESBRECHT and GUMPERTZ · Planning, Construction, and Statistical Analysis of Comparative Experiments GIFI · Nonlinear Multivariate Analysis GIVENS and HOETING · Computational Statistics GLASSERMAN and YAO · Monotone Structure in Discrete-Event Systems GNANADESIKAN · Methods for Statistical Data Analysis of Multivariate Observations, Second Edition GOLDSTEIN · Multilevel Statistical Models, Fourth Edition GOLDSTEIN and LEWIS · Assessment: Problems, Development, and Statistical Issues GOLDSTEIN and WOOFF · Bayes Linear Statistics GREENWOOD and NIKULIN · A Guide to Chi-Squared Testing GROSS, SHORTLE, THOMPSON, and HARRIS · Fundamentals of Queueing Theory,Fourth Edition GROSS, SHORTLE, THOMPSON, and HARRIS · Solutions Manual to Accompany Fundamentals of Queueing Theory, Fourth Edition * HAHN and SHAPIRO · Statistical Models in Engineering HAHN and MEEKER · Statistical Intervals: A Guide for Practitioners HALD · A History of Probability and Statistics and their Applications Before 1750 † HAMPEL · Robust Statistics: The Approach Based on Influence Functions HARTUNG, KNAPP, and SINHA · Statistical Meta-Analysis with Applications HEIBERGER · Computation for the Analysis of Designed Experiments HEDAYAT and SINHA · Design and Inference in Finite Population Sampling HEDEKER and GIBBONS · Longitudinal Data Analysis HELLER · MACSYMA for Statisticians HERITIER, CANTONI, COPT, and VICTORIA-FESER · Robust Methods in Biostatistics HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 1: Introduction to Experimental Design, Second Edition HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 2: Advanced Experimental Design HINKELMANN (editor) · Design and Analysis of Experiments, Volume 3: Special Designs and Applications HOAGLIN, MOSTELLER, and TUKEY · Fundamentals of Exploratory Analysis of Variance * HOAGLIN, MOSTELLER, and TUKEY · Exploring Data Tables, Trends and Shapes * HOAGLIN, MOSTELLER, and TUKEY · Understanding Robust and Exploratory Data Analysis HOCHBERG and TAMHANE · Multiple Comparison Procedures HOCKING · Methods and Applications of Linear Models: Regression and the Analysis of Variance, Second Edition HOEL · Introduction to Mathematical Statistics, Fifth Edition HOGG and KLUGMAN · Loss Distributions HOLLANDER and WOLFE · Nonparametric Statistical Methods, Second Edition HOSMER and LEMESHOW · Applied Logistic Regression, Second Edition HOSMER, LEMESHOW, and MAY · Applied Survival Analysis: Regression Modeling of Time-to-Event Data, Second Edition HUBER · Data Analysis: What Can Be Learned From the Past 50 Years HUBER · Robust Statistics † HUBER and RONCHETTI · Robust Statistics, Second Edition HUBERTY · Applied Discriminant Analysis, Second Edition HUBERTY and OLEJNIK · Applied MANOVA and Discriminant Analysis, Second Edition HUITEMA · The Analysis of Covariance and Alternatives: Statistical Methods for Experiments, Quasi- Experiments, and Single-Case Studies, Second Edition HUNT and KENNEDY · Financial Derivatives in Theory and Practice, Revised Edition HURD and MIAMEE · Periodically Correlated Random Sequences: Spectral Theory and Practice HUSKOVA, BERAN, and DUPAC · Collected Works of Jaroslav Hajek—with Commentary HUZURBAZAR · Flowgraph Models for Multistate Time-to-Event Data
  • 10. JACKMAN · Bayesian Analysis for the Social Sciences † JACKSON · A User’s Guide to Principle Components JOHN · Statistical Methods in Engineering and Quality Assurance JOHNSON · Multivariate Statistical Simulation JOHNSON and BALAKRISHNAN · Advances in the Theory and Practice of Statistics: A Volume in Honor of Samuel Kotz JOHNSON, KEMP, and KOTZ · Univariate Discrete Distributions, Third Edition JOHNSON and KOTZ (editors) · Leading Personalities in Statistical Sciences: From the Seventeenth Century to the Present JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 1, Second Edition JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 2, Second Edition JOHNSON, KOTZ, and BALAKRISHNAN · Discrete Multivariate Distributions JUDGE, GRIFFITHS, HILL, LÜTKEPOHL, and LEE · The Theory and Practice of Econometrics, Second Edition JUREK and MASON · Operator-Limit Distributions in Probability Theory KADANE · Bayesian Methods and Ethics in a Clinical Trial Design KADANE AND SCHUM · A Probabilistic Analysis of the Sacco and Vanzetti Evidence KALBFLEISCH and PRENTICE · The Statistical Analysis of Failure Time Data, Second Edition KARIYA and KURATA · Generalized Least Squares KASS and VOS · Geometrical Foundations of Asymptotic Inference † KAUFMAN and ROUSSEEUW · Finding Groups in Data: An Introduction to Cluster Analysis KEDEM and FOKIANOS · Regression Models for Time Series Analysis KENDALL, BARDEN, CARNE, and LE · Shape and Shape Theory KHURI · Advanced Calculus with Applications in Statistics, Second Edition KHURI, MATHEW, and SINHA · Statistical Tests for Mixed Linear Models * KISH · Statistical Design for Research KLEIBER and KOTZ · Statistical Size Distributions in Economics and Actuarial Sciences KLEMELÄ · Smoothing of Multivariate Data: Density Estimation and Visualization KLUGMAN, PANJER, and WILLMOT · Loss Models: From Data to Decisions, Third Edition KLUGMAN, PANJER, and WILLMOT · Solutions Manual to Accompany Loss Models: From Data to Decisions, Third Edition KOSKI and NOBLE · Bayesian Networks: An Introduction KOTZ, BALAKRISHNAN, and JOHNSON · Continuous Multivariate Distributions, Volume 1, Second Edition KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Volumes 1 to 9 with Index KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Supplement Volume KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 1 KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 2 KOWALSKI and TU · Modern Applied U-Statistics KRISHNAMOORTHY and MATHEW · Statistical Tolerance Regions: Theory, Applications, and Computation KROESE, TAIMRE, and BOTEV · Handbook of Monte Carlo Methods KROONENBERG · Applied Multiway Data Analysis KULINSKAYA, MORGENTHALER, and STAUDTE · Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence KULKARNI and HARMAN · An Elementary Introduction to Statistical Learning Theory KUROWICKA and COOKE · Uncertainty Analysis with High Dimensional Dependence Modelling KVAM and VIDAKOVIC · Nonparametric Statistics with Applications to Science and Engineering LACHIN · Biostatistical Methods: The Assessment of Relative Risks, Second Edition LAD · Operational Subjective Statistical Methods: A Mathematical, Philosophical, and Historical Introduction LAMPERTI · Probability: A Survey of the Mathematical Theory, Second Edition LAWLESS · Statistical Models and Methods for Lifetime Data, Second Edition LAWSON · Statistical Methods in Spatial 
Epidemiology, Second Edition
  • 11. LE · Applied Categorical Data Analysis, Second Edition LE · Applied Survival Analysis LEE · Structural Equation Modeling: A Bayesian Approach LEE and WANG · Statistical Methods for Survival Data Analysis, Third Edition LEPAGE and BILLARD · Exploring the Limits of Bootstrap LESSLER and KALSBEEK · Nonsampling Errors in Surveys LEYLAND and GOLDSTEIN (editors) · Multilevel Modelling of Health Statistics LIAO · Statistical Group Comparison LIN · Introductory Stochastic Analysis for Finance and Insurance LITTLE and RUBIN · Statistical Analysis with Missing Data, Second Edition LLOYD · The Statistical Analysis of Categorical Data LOWEN and TEICH · Fractal-Based Point Processes MAGNUS and NEUDECKER · Matrix Differential Calculus with Applications in Statistics and Econometrics, Revised Edition MALLER and ZHOU · Survival Analysis with Long Term Survivors MARCHETTE · Random Graphs for Statistical Pattern Recognition MARDIA and JUPP · Directional Statistics MARKOVICH · Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice MARONNA, MARTIN and YOHAI · Robust Statistics: Theory and Methods MASON, GUNST, and HESS · Statistical Design and Analysis of Experiments with Applications to Engineering and Science, Second Edition McCULLOCH, SEARLE, and NEUHAUS · Generalized, Linear, and Mixed Models,Second Edition McFADDEN · Management of Data in Clinical Trials, Second Edition * McLACHLAN · Discriminant Analysis and Statistical Pattern Recognition McLACHLAN, DO, and AMBROISE · Analyzing Microarray Gene Expression Data McLACHLAN and KRISHNAN · The EM Algorithm and Extensions, Second Edition McLACHLAN and PEEL · Finite Mixture Models McNEIL · Epidemiological Research Methods MEEKER and ESCOBAR · Statistical Methods for Reliability Data MEERSCHAERT and SCHEFFLER · Limit Distributions for Sums of Independent Random Vectors: Heavy Tails in Theory and Practice MENGERSEN, ROBERT, and TITTERINGTON · Mixtures: Estimation and Applications MICKEY, DUNN, and CLARK · Applied Statistics: Analysis of Variance and Regression, Third Edition * MILLER · Survival Analysis, Second Edition MONTGOMERY, JENNINGS, and KULAHCI · Introduction to Time Series Analysis and Forecasting MONTGOMERY, PECK, and VINING · Introduction to Linear Regression Analysis,Fifth Edition MORGENTHALER and TUKEY · Configural Polysampling: A Route to Practical Robustness MUIRHEAD · Aspects of Multivariate Statistical Theory MULLER and STOYAN · Comparison Methods for Stochastic Models and Risks MURTHY, XIE, and JIANG · Weibull Models MYERS, MONTGOMERY, and ANDERSON-COOK · Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Third Edition MYERS, MONTGOMERY, VINING, and ROBINSON · Generalized Linear Models. 
With Applications in Engineering and the Sciences, Second Edition NATVIG · Multistate Systems Reliability Theory With Applications † NELSON · Accelerated Testing, Statistical Models, Test Plans, and Data Analyses † NELSON · Applied Life Data Analysis NEWMAN · Biostatistical Methods in Epidemiology NG, TIAN, and TANG · Dirichlet and Related Distributions: Theory, Methods and Applications OKABE, BOOTS, SUGIHARA, and CHIU · Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Second Edition OLIVER and SMITH · Influence Diagrams, Belief Nets and Decision Analysis PALTA · Quantitative Methods in Population Health: Extensions of Ordinary Regressions PANJER · Operational Risk: Modeling and Analytics PANKRATZ · Forecasting with Dynamic Regression Models PANKRATZ · Forecasting with Univariate Box-Jenkins Models: Concepts and Cases
  • 12. PARDOUX · Markov Processes and Applications: Algorithms, Networks, Genome and Finance PARMIGIANI and INOUE · Decision Theory: Principles and Approaches * PARZEN · Modern Probability Theory and Its Applications PEÑA, TIAO, and TSAY · A Course in Time Series Analysis PESARIN and SALMASO · Permutation Tests for Complex Data: Applications and Software PIANTADOSI · Clinical Trials: A Methodologic Perspective, Second Edition POURAHMADI · Foundations of Time Series Analysis and Prediction Theory POWELL · Approximate Dynamic Programming: Solving the Curses of Dimensionality,Second Edition POWELL and RYZHOV · Optimal Learning PRESS · Subjective and Objective Bayesian Statistics, Second Edition PRESS and TANUR · The Subjectivity of Scientists and the Bayesian Approach PURI, VILAPLANA, and WERTZ · New Perspectives in Theoretical and Applied Statistics † PUTERMAN · Markov Decision Processes: Discrete Stochastic Dynamic Programming QIU · Image Processing and Jump Regression Analysis * RAO · Linear Statistical Inference and Its Applications, Second Edition RAO · Statistical Inference for Fractional Diffusion Processes RAUSAND and HØYLAND · System Reliability Theory: Models, Statistical Methods, and Applications, Second Edition RAYNER, THAS, and BEST · Smooth Tests of Goodnes of Fit: Using R, Second Edition RENCHER and SCHAALJE · Linear Models in Statistics, Second Edition RENCHER and CHRISTENSEN · Methods of Multivariate Analysis, Third Edition RENCHER · Multivariate Statistical Inference with Applications RIGDON and BASU · Statistical Methods for the Reliability of Repairable Systems * RIPLEY · Spatial Statistics * RIPLEY · Stochastic Simulation ROHATGI and SALEH · An Introduction to Probability and Statistics, Second Edition ROLSKI, SCHMIDLI, SCHMIDT, and TEUGELS · Stochastic Processes for Insurance and Finance ROSENBERGER and LACHIN · Randomization in Clinical Trials: Theory and Practice ROSSI, ALLENBY, and McCULLOCH · Bayesian Statistics and Marketing † ROUSSEEUW and LEROY · Robust Regression and Outlier Detection ROYSTON and SAUERBREI · Multivariate Model Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modeling Continuous Variables * RUBIN · Multiple Imputation for Nonresponse in Surveys RUBINSTEIN and KROESE · Simulation and the Monte Carlo Method, Second Edition RUBINSTEIN and MELAMED · Modern Simulation and Modeling RYAN · Modern Engineering Statistics RYAN · Modern Experimental Design RYAN · Modern Regression Methods, Second Edition RYAN · Statistical Methods for Quality Improvement, Third Edition SALEH · Theory of Preliminary Test and Stein-Type Estimation with Applications SALTELLI, CHAN, and SCOTT (editors) · Sensitivity Analysis SCHERER · Batch Effects and Noise in Microarray Experiments: Sources and Solutions * SCHEFFE · The Analysis of Variance SCHIMEK · Smoothing and Regression: Approaches, Computation, and Application SCHOTT · Matrix Analysis for Statistics, Second Edition SCHOUTENS · Levy Processes in Finance: Pricing Financial Derivatives SCOTT · Multivariate Density Estimation: Theory, Practice, and Visualization * SEARLE · Linear Models † SEARLE · Linear Models for Unbalanced Data † SEARLE · Matrix Algebra Useful for Statistics † SEARLE, CASELLA, and McCULLOCH · Variance Components SEARLE and WILLETT · Matrix Algebra for Applied Economics SEBER · A Matrix Handbook For Statisticians † SEBER · Multivariate Observations SEBER and LEE · Linear Regression Analysis, Second Edition † SEBER and WILD · Nonlinear Regression
  • 13. SENNOTT · Stochastic Dynamic Programming and the Control of Queueing Systems * SERFLING · Approximation Theorems of Mathematical Statistics SHAFER and VOVK · Probability and Finance: It’s Only a Game! SHERMAN · Spatial Statistics and Spatio-Temporal Data: Covariance Functions and Directional Properties SILVAPULLE and SEN · Constrained Statistical Inference: Inequality, Order, and Shape Restrictions SINGPURWALLA · Reliability and Risk: A Bayesian Perspective SMALL and McLEISH · Hilbert Space Methods in Probability and Statistical Inference SRIVASTAVA · Methods of Multivariate Statistics STAPLETON · Linear Statistical Models, Second Edition STAPLETON · Models for Probability and Statistical Inference: Theory and Applications STAUDTE and SHEATHER · Robust Estimation and Testing STOYAN · Counterexamples in Probability, Second Edition STOYAN, KENDALL, and MECKE · Stochastic Geometry and Its Applications, Second Edition STOYAN and STOYAN · Fractals, Random Shapes and Point Fields: Methods of Geometrical Statistics STREET and BURGESS · The Construction of Optimal Stated Choice Experiments: Theory and Methods STYAN · The Collected Papers of T. W. Anderson: 1943–1985 SUTTON, ABRAMS, JONES, SHELDON, and SONG · Methods for Meta-Analysis in Medical Research TAKEZAWA · Introduction to Nonparametric Regression TAMHANE · Statistical Analysis of Designed Experiments: Theory and Applications TANAKA · Time Series Analysis: Nonstationary and Noninvertible Distribution Theory THOMPSON · Empirical Model Building: Data, Models, and Reality, Second Edition THOMPSON · Sampling, Third Edition THOMPSON · Simulation: A Modeler’s Approach THOMPSON and SEBER · Adaptive Sampling THOMPSON, WILLIAMS, and FINDLAY · Models for Investors in Real World Markets TIERNEY · LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics TSAY · Analysis of Financial Time Series, Third Edition UPTON and FINGLETON · Spatial Data Analysis by Example, Volume II: Categorical and Directional Data † VAN BELLE · Statistical Rules of Thumb, Second Edition VAN BELLE, FISHER, HEAGERTY, and LUMLEY · Biostatistics: A Methodology for the Health Sciences, Second Edition VESTRUP · The Theory of Measures and Integration VIDAKOVIC · Statistical Modeling by Wavelets VIERTL · Statistical Methods for Fuzzy Data VINOD and REAGLE · Preparing for the Worst: Incorporating Downside Risk in Stock Market Investments WALLER and GOTWAY · Applied Spatial Statistics for Public Health Data WEISBERG · Applied Linear Regression, Third Edition WEISBERG · Bias and Causation: Models and Judgment for Valid Comparisons WELSH · Aspects of Statistical Inference WESTFALL and YOUNG · Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment * WHITTAKER · Graphical Models in Applied Multivariate Statistics WINKER · Optimization Heuristics in Economics: Applications of Threshold Accepting WOODWORTH · Biostatistics: A Bayesian Introduction WOOLSON and CLARKE · Statistical Methods for the Analysis of Biomedical Data,Second Edition WU and HAMADA · Experiments: Planning, Analysis, and Parameter Design Optimization, Second Edition WU and ZHANG · Nonparametric Regression Methods for Longitudinal Data Analysis YIN · Clinical Trial Design: Bayesian and Frequentist Adaptive Methods YOUNG, VALERO-MORA, and FRIENDLY · Visual Statistics: Seeing Data with Dynamic Interactive Graphics
  • 14. ZACKS · Stage-Wise Adaptive Designs * ZELLNER · An Introduction to Bayesian Inference in Econometrics ZELTERMAN · Discrete Distributions—Applications in the Health Sciences ZHOU, OBUCHOWSKI, and McCLISH · Statistical Methods in Diagnostic Medicine, Second Edition * Now available in a lower priced paperback edition in the Wiley Classics Library. † Now available in a lower priced paperback edition in the Wiley–Interscience Paperback Series. Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved Published by John Wiley & Sons, Inc., Hoboken, New Jersey Published simultaneously in Canada No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
  • 15. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Rencher, Alvin C., 1934–
Methods of multivariate analysis / Alvin C. Rencher, William F. Christensen, Department of Statistics, Brigham Young University, Provo, UT. — Third Edition.
pages cm. — (Wiley series in probability and statistics)
Includes index.
ISBN 978-0-470-17896-6 (hardback)
1. Multivariate analysis. I. Christensen, William F., 1970– II. Title.
QA278.R45 2012
519.5’35—dc23
2012009793

Preface

We have long been fascinated by the interplay of variables in multivariate data and by the challenge of unraveling the effect of each variable. Our continuing objective in the third edition has been to present the power and utility of multivariate analysis in a highly readable format.

Practitioners and researchers in all applied disciplines often measure several variables on each subject or experimental unit. In some cases, it may be productive to isolate each variable in a system and study it separately. Typically, however, the variables are not only correlated with each other, but each variable is influenced by the other variables as it affects a test statistic or descriptive statistics. Thus, in many instances, the variables are intertwined in such a way that when analyzed individually they yield little information about the system. Using multivariate analysis, the variables can be examined simultaneously in order to access the key features of the process that produced them. The multivariate approach enables us to (1) explore the joint performance of the variables and (2) determine the effect of each variable in the presence of the others.

Multivariate analysis provides both descriptive and inferential procedures—we can search for patterns in the data or test hypotheses about patterns of a priori interest. With multivariate descriptive techniques, we can peer beneath the tangled web of variables on the surface and extract the essence of the system. Multivariate inferential procedures include hypothesis tests that (1) process any number of variables without inflating the Type I error rate and (2) allow for whatever intercorrelations the variables possess. A wide variety of multivariate descriptive and inferential procedures is readily accessible in statistical software packages.

Our selection of topics for this volume reflects years of consulting with researchers in many fields of inquiry. A brief overview of multivariate analysis is given in Chapter 1. Chapter 2 reviews the fundamentals of matrix algebra. Chapters 3 and 4 give an introduction to sampling from multivariate populations. Chapters 5, 6, 7, 10, and 11 extend univariate procedures with one dependent variable (including t-tests, analysis of variance, tests on variances, multiple regression, and multiple correlation) to analogous multivariate techniques involving several dependent variables. A review of each univariate procedure is presented before covering the multivariate counterpart. These reviews may provide key insights that the student has missed in previous courses.

Chapters 8, 9, 12, 13, 14, 15, and 16 describe multivariate techniques that are not extensions of univariate procedures. In Chapters 8 and 9, we find functions of the variables that discriminate among groups in the data. 
In Chapters 12, 13, and 14 (new in the third edition), we find functions of the variables that reveal the basic dimensionality and characteristic patterns of the data, and we discuss procedures for finding the underlying latent variables of a system. In Chapters 15 and 16, we give methods for searching for groups in the data, and we provide plotting techniques that show relationships in a reduced dimensionality for various kinds of data. In Appendix A, tables are provided for many multivariate distributions and tests. These enable the reader to conduct an exact test in many cases for which software packages provide only approximate tests. Appendix B gives answers and hints for most of the problems in the book. Appendix C describes an ftp site that contains (1) all data sets and (2) SAS command files for all examples in the text. These command files can be adapted for use in working problems or in analyzing data sets encountered in applications.
  • 16. To illustrate multivariate applications, we have provided many examples and exercises based on 60 real data sets from a wide variety of disciplines. A practitioner or consultant in multivariate analysis gains insights and acumen from long experience in working with data. It is not expected that a student can achieve this kind of seasoning in a one-semester class. However, the examples provide a good start, and further development is gained by working problems with the data sets. For example, in Chapters 12–14, the exercises cover several typical patterns in the covariance or correlation matrix. The student’s intuition is expanded by associating these covariance patterns with the resulting configuration of the principal components or factors.

Although this is a methods book, we have included a few derivations. For some readers, an occasional proof provides insights obtainable in no other way. We hope that instructors who do not wish to use proofs will not be deterred by their presence. The proofs can easily be disregarded when reading the book.

Our objective has been to make the book accessible to readers who have taken as few as two statistical methods courses. The students in our classes in multivariate analysis include majors in statistics and majors from other departments. With the applied researcher in mind, we have provided careful intuitive explanations of the concepts and have included many insights typically available only in journal articles or in the minds of practitioners.

Our overriding goal in preparation of this book has been clarity of exposition. We hope that students and instructors alike will find this multivariate text more comfortable than most. In the final stages of development of each edition, we asked our students for written reports on their initial reaction as they read each day’s assignment. They made many comments that led to improvements in the manuscript. We will be very grateful if readers will take the time to notify us of errors or of other suggestions they might have for improvements.

We have tried to use standard mathematical and statistical notation as far as possible and to maintain consistency of notation throughout the book. We have refrained from the use of abbreviations and mnemonic devices. These save space when one is reading a book page by page, but they are annoying to those using a book as a reference. Equations are numbered sequentially throughout a chapter; for example, (3.75) indicates the 75th numbered equation in Chapter 3. Tables and figures are also numbered sequentially throughout a chapter in the form “Table 3.9” or “Figure 3.1.” Examples are not numbered sequentially; each example is identified by the same number as the section in which it appears and is placed at the end of the section.

When citing references in the text, we have used the standard format involving the year of publication. For a journal article, the year alone suffices, for example, Fisher (1936). But for books, we have included a page number, as in Seber (1984, p. 216).

This is the first volume of a two-volume set on multivariate analysis. The second volume is entitled Multivariate Statistical Inference and Applications by Alvin Rencher (Wiley, 1998). The two volumes are not necessarily sequential; they can be read independently. Al adopted the two-volume format in order to (1) provide broader coverage than would be possible in a single volume and (2) offer the reader a choice of approach. 
The second volume includes proofs of many techniques covered in the first 13 chapters of the present volume and also introduces additional topics. The present volume includes many examples and problems using actual data sets, and there are fewer algebraic problems. The second volume emphasizes derivations of the results and contains fewer examples and problems with real data. The present volume has fewer references to the literature than the other volume, which includes a careful review of the latest developments and a more comprehensive bibliography. In this third edition, we have occasionally referred the reader to “Rencher (1998)” to note that added coverage of a certain subject is available in the second volume. We are indebted to many individuals in the preparation of the first two editions of this book. Al’s initial exposure to multivariate analysis came in courses taught by Rolf Bargmann at the University of Georgia and D.R. Jensen at Virginia Tech. Additional impetus to probe the subtleties of this field came from research conducted with Bruce Brown at BYU. William’s interest and training in multivariate statistics were primarily influenced by Alvin Rencher and Yasuo Amemiya. We thank these mentors and colleagues and also thank Bruce Brown, Deane Branstetter, Del Scott, Robert Smidt, and Ingram Olkin for reading various versions of the manuscript and making valuable suggestions. We are grateful to the following colleagues and students at BYU who helped with computations and typing: Mitchell Tolland, Tawnia Newton, Marianne Matis Mohr, Gregg Littlefield, Suzanne Kimball, Wendy Nielsen, Tiffany Nordgren, David Whiting, Karla Wasden, Rachel Jones, Lonette Stoddard, Candace B. McNaughton, J. D. Williams, and Jonathan Christensen. We are grateful to the many readers
  • 17. who have pointed out errors or made suggestions for improvements. The book is better for their caring and their efforts.

THIRD EDITION

For the third edition, we have added a new chapter covering confirmatory factor analysis (Chapter 14). We have added new sections discussing Kronecker products and vec notation (Section 2.12), dynamic graphics (Section 3.5), transformations to normality (Section 4.5), classification trees (Section 9.7.4), seemingly unrelated regressions (Section 10.4.6), and prediction for multivariate multiple regression (Section 10.6). Additionally, we have updated and revised the graphics throughout the book and have substantially expanded the section discussing estimation for multivariate multiple regression (Section 10.4). Many other additions and changes have been made in an effort to broaden the book’s scope and improve its exposition. Additional problems have been added to accompany the new material. The new ftp site for the third edition can be found at: ftp://ftp.wiley.com/public/sci_tech_med/multivariate_analysis_3e.

We thank Jonathan Christensen for the countless ways he contributed to the revision, from updated graphics to technical and formatting assistance. We are grateful to Byron Gajewski, Scott Grimshaw, and Jeremy Yorgason, who have provided useful feedback on sections of the book that are new to this edition. We are also grateful to Scott Simpson and Storm Atwood for invaluable editing assistance and to the students of William’s multivariate statistics course at Brigham Young University for pointing out errors and making suggestions for improvements in this edition. We are deeply appreciative of the support provided by the BYU Department of Statistics for the writing of the book.

Al dedicates this volume to his wife, LaRue, who has supplied much needed support and encouragement. William dedicates this volume to his wife Mary Elizabeth, who has provided both encouragement and careful feedback throughout the writing process.

ALVIN C. RENCHER & WILLIAM F. CHRISTENSEN
Department of Statistics
Brigham Young University
Provo, Utah

Acknowledgments

We wish to thank the authors, editors, and owners of copyrights for permission to reproduce the following materials:

Figure 3.8 and Table 3.2, Kleiner and Hartigan (1981), Reprinted by permission of Journal of the American Statistical Association
Table 3.3, Colvin et al. (1997), Reprinted by permission of T. S. Colvin and D. L. Karlen
Table 3.4, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
Table 3.5, Reaven and Miller (1979), Reprinted by permission of Diabetologia
Table 3.6, Timm (1975), Reprinted by permission of Elsevier North-Holland Publishing Company
Table 3.7, Elston and Grizzle (1962), Reprinted by permission of Biometrics
Table 3.8, Frets (1921), Reprinted by permission of Genetica
Table 3.9, O’Sullivan and Mahan (1966), Reprinted by permission of American Journal of Clinical Nutrition
Table 4.2, Royston (1983), Reprinted by permission of Applied Statistics
  • 18. Table 5.1, Beall (1945), Reprinted by permission of Psychometrika
Table 5.2, Hummel and Sligo (1971), Reprinted by permission of Psychological Bulletin
Table 5.3, Kramer and Jensen (1969b), Reprinted by permission of Journal of Quality Technology
Table 5.5, Lubischew (1962), Reprinted by permission of Biometrics
Table 5.6, Travers (1939), Reprinted by permission of Psychometrika
Table 5.7, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
Table 5.8, Tintner (1946), Reprinted by permission of Journal of the American Statistical Association
Table 5.9, Kramer (1972), Reprinted by permission of the author
Table 5.10, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
Table 6.2, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
Table 6.3, Rencher and Scott (1990), Reprinted by permission of Communications in Statistics: Simulation and Computation
Table 6.6, Posten (1962), Reprinted by permission of the author
Table 6.8, Crowder and Hand (1990, pp. 21–29), Reprinted by permission of Routledge Chapman and Hall
Table 6.12, Cochran and Cox (1957), Timm (1980), Reprinted by permission of John Wiley and Sons and Elsevier North-Holland Publishing Company
Table 6.14, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
Table 6.16, Potthoff and Roy (1964), Reprinted by permission of Biometrika Trustees
Table 6.17, Baten, Tack, and Baeder (1958), Reprinted by permission of Quality Progress
Table 6.18, Keuls et al. (1984), Reprinted by permission of Scientia Horticulturae
Table 6.19, Burdick (1979), Reprinted by permission of the author
Table 6.20, Box (1950), Reprinted by permission of Biometrics
Table 6.21, Rao (1948), Reprinted by permission of Biometrika Trustees
Table 6.22, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
Table 6.23, Williams and Izenman (1981), Reprinted by permission of Colorado State University
Table 6.24, Beauchamp and Hoel (1973), Reprinted by permission of Journal of Statistical Computation and Simulation
  • 19. Table 6.25, Box (1950), Reprinted by permission of Biometrics
Table 6.26, Grizzle and Allen (1969), Reprinted by permission of Biometrics
Table 6.27, Crepeau et al. (1985), Reprinted by permission of Biometrics
Table 6.28, Zerbe (1979a), Reprinted by permission of Journal of the American Statistical Association
Table 6.29, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
Table 7.1, Siotani et al. (1963), Reprinted by permission of Institute of Statistical Mathematics
Table 7.2, Reprinted by permission of R. J. Freund.
Table 8.1, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
Table 8.3, Reprinted by permission of G. R. Bryce and R. M. Barker
Table 10.1, Box and Youle (1955), Reprinted by permission of Biometrics
Tables 12.2, 12.3, and 12.4, Jeffers (1967), Reprinted by permission of Applied Statistics
Table 13.1, Brown et al. (1984), Reprinted by permission of Journal of Pascal, Ada, and Modula
Correlation matrix in Example 13.6, Brown, Strong, and Rencher (1973), Reprinted by permission of The Journal of the Acoustical Society of America
Table 15.1, Hartigan (1975a), Reprinted by permission of John Wiley and Sons
Table 15.3, Dawkins (1989), Reprinted by permission of The American Statistician
Table 15.7, Hand et al. (1994), Reprinted by permission of D. J. Hand
Table 15.12, Sokal and Rohlf (1981), Reprinted by permission of W. H. Freeman and Co.
Table 15.13, Hand et al. (1994), Reprinted by permission of D. J. Hand
Table 16.1, Kruskal and Wish (1978), Reprinted by permission of Sage Publications
Tables 16.2 and 16.5, Hand et al. (1994), Reprinted by permission of D. J. Hand
Table 16.13, Edwards and Kreiner (1983), Reprinted by permission of Biometrika
Table 16.15, Hand et al. (1994), Reprinted by permission of D. J. Hand
Table 16.16, Everitt (1987), Reprinted by permission of the author
  • 20. Table 16.17, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
Table 16.18, Clausen (1998), Reprinted by permission of Sage Publications
Table 16.19, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
Table A.1, Mulholland (1977), Reprinted by permission of Biometrika Trustees
Table A.2, D’Agostino and Pearson (1973), Reprinted by permission of Biometrika Trustees
Table A.3, D’Agostino and Tietjen (1971), Reprinted by permission of Biometrika Trustees
Table A.4, D’Agostino (1972), Reprinted by permission of Biometrika Trustees
Table A.5, Mardia (1970, 1974), Reprinted by permission of Biometrika Trustees
Table A.6, Barnett and Lewis (1978), Reprinted by permission of John Wiley and Sons
Table A.7, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
Table A.8, Bailey (1977), Reprinted by permission of Journal of the American Statistical Association
Table A.9, Wall (1967), Reprinted by permission of the author, Albuquerque, NM
Table A.10, Pearson and Hartley (1972) and Pillai (1964, 1965), Reprinted by permission of Biometrika Trustees
Table A.11, Schuurmann et al. (1975), Reprinted by permission of Journal of Statistical Computation and Simulation
Table A.12, Davis (1970a, b, 1980), Reprinted by permission of Biometrika Trustees
Table A.13, Kleinbaum, Kupper, and Muller (1988), Reprinted by permission of PWS-KENT Publishing Company
Table A.14, Lee et al. (1977), Reprinted by permission of Elsevier North-Holland Publishing Company
Table A.15, Mathai and Katiyar (1979), Reprinted by permission of Biometrika Trustees
  • 21. CHAPTER 1 INTRODUCTION

1.1 WHY MULTIVARIATE ANALYSIS?

Multivariate analysis consists of a collection of methods that can be used when several measurements are made on each individual or object in one or more samples. We will refer to the measurements as variables and to the individuals or objects as units (research units, sampling units, or experimental units) or observations. In practice, multivariate data sets are common, although they are not always analyzed as such. But the exclusive use of univariate procedures with such data is no longer excusable, given the availability of multivariate techniques and inexpensive computing power to carry them out.

Historically, the bulk of applications of multivariate techniques have been in the behavioral and biological sciences. However, interest in multivariate methods has now spread to numerous other fields of investigation. For example, we have collaborated on multivariate problems with researchers in education, chemistry, environmental science, physics, geology, medicine, engineering, law, business, literature, religion, public broadcasting, nursing, mining, linguistics, biology, psychology, and many other fields. Table 1.1 shows some examples of multivariate observations.

Table 1.1 Examples of Multivariate Data (Units: Variables)
1. Students: Several exam scores in a single course
2. Students: Grades in mathematics, history, music, art, physics
3. People: Height, weight, percentage of body fat, resting heart rate
4. Skulls: Length, width, cranial capacity
5. Companies: Expenditures for advertising, labor, raw materials
6. Manufactured items: Various measurements to check on compliance with specifications
7. Applicants for bank loans: Income, education level, length of residence, savings account, current debt load
8. Segments of literature: Sentence length, frequency of usage of certain words and style characteristics
9. Human hairs: Composition of various elements
10. Birds: Lengths of various bones

The reader will notice that in some cases all the variables are measured in the same scale (see 1 and 2 in Table 1.1). In other cases, measurements are in different scales (see 3 in Table 1.1). In a few techniques such as profile analysis (Sections 5.9 and 6.8), the variables must be commensurate, that is, similar in scale of measurement; however, most multivariate methods do not require this.

Ordinarily the variables are measured simultaneously on each sampling unit. Typically, these variables are correlated. If this were not so, there would be little use for many of the techniques of multivariate analysis. We need to untangle the overlapping information provided by correlated variables and peer beneath the surface to see the underlying structure. Thus the goal of many multivariate approaches
  • 22. is simplification. We seek to express “what is going on” in terms of a reduced set of dimensions. Such multivariate techniques are exploratory; they essentially generate hypotheses rather than test them. On the other hand, if our goal is a formal hypothesis test, we need a technique that will (1) allow several variables to be tested and still preserve the significance level and (2) do this for any intercorrelation structure of the variables. Many such tests are available.

As the two preceding paragraphs imply, multivariate analysis is concerned generally with two areas, descriptive and inferential statistics. In the descriptive realm, we often obtain optimal linear combinations of variables. The optimality criterion varies from one technique to another, depending on the goal in each case. Although linear combinations may seem too simple to reveal the underlying structure, we use them for two obvious reasons: (1) mathematical tractability (linear approximations are used throughout all science for the same reason) and (2) they often perform well in practice. These linear functions may also be useful as a follow-up to inferential procedures. When we have a statistically significant test result that compares several groups, for example, we can find the linear combination (or combinations) of variables that led to rejection. Then the contribution of each variable to these linear combinations is of interest.

In the inferential area, many multivariate techniques are extensions of univariate procedures. In such cases we review the univariate procedure before presenting the analogous multivariate approach. Multivariate inference is especially useful in curbing the researcher’s natural tendency to read too much into the data. Total control is provided for experimentwise error rate; that is, no matter how many variables are tested simultaneously, the value of α (the significance level) remains at the level set by the researcher.

Some authors warn against applying the common multivariate techniques to data for which the measurement scale is not interval or ratio. It has been found, however, that many multivariate techniques give reliable results when applied to ordinal data.

For many years the applications lagged behind the theory because the computations were beyond the power of the available desk-top calculators. However, with modern computers, virtually any analysis one desires, no matter how many variables or observations are involved, can be quickly and easily carried out. Perhaps it is not premature to say that multivariate analysis has come of age.

1.2 PREREQUISITES

The mathematical prerequisite for reading this book is matrix algebra. Calculus is not used [with a brief exception in equation (4.29)]. But the basic tools of matrix algebra are essential, and the presentation in Chapter 2 is intended to be sufficiently complete so that the reader with no previous experience can master matrix manipulation up to the level required in this book.

The statistical prerequisites are basic familiarity with the normal distribution, t-tests, confidence intervals, multiple regression, and analysis of variance. These techniques are reviewed as each is extended to the analogous multivariate procedure.

This is a multivariate methods text. Most of the results are given without proof. In a few cases proofs are provided, but the major emphasis is on heuristic explanations. Our goal is an intuitive grasp of multivariate analysis, in the same mode as other statistical methods courses.
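The matrix algebra just mentioned is all that is needed to compute the multivariate test statistics introduced later in the book. As a concrete illustration, the following Python/NumPy sketch (a sketch of ours, not from the book; the book's own example files are SAS programs, as described in Appendix C) computes the one-sample Hotelling T2 statistic of Chapter 5 on simulated data, using a mean vector, a sample covariance matrix, and a linear solve in place of an explicit inverse.

```python
import numpy as np
from scipy import stats

def hotelling_t2_one_sample(Y, mu0):
    """One-sample Hotelling T2 test of H0: mu = mu0 (see Chapter 5).

    Y is an n x p data matrix (rows are units, columns are variables).
    Under H0 and multivariate normality, T2 * (n - p) / (p * (n - 1))
    follows an F distribution with p and n - p degrees of freedom.
    """
    n, p = Y.shape
    ybar = Y.mean(axis=0)                 # sample mean vector
    S = np.cov(Y, rowvar=False)           # sample covariance matrix
    d = ybar - np.asarray(mu0, dtype=float)
    T2 = n * d @ np.linalg.solve(S, d)    # n (ybar - mu0)' S^{-1} (ybar - mu0)
    F = (n - p) / (p * (n - 1)) * T2      # exact F transformation
    return T2, stats.f.sf(F, p, n - p)

# Simulated sample: 20 units, 3 variables, generated at the hypothesized mean.
rng = np.random.default_rng(0)
Y = rng.normal(loc=[50.0, 10.0, 3.0], scale=[4.0, 2.0, 1.0], size=(20, 3))
T2, pval = hotelling_t2_one_sample(Y, [50.0, 10.0, 3.0])
print(f"T2 = {T2:.2f}, p = {pval:.3f}")
```

Because the single T2 statistic tests all p means jointly, the significance level remains at the chosen α no matter how many variables are involved, which is the experimentwise error-rate point made in Section 1.1.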
Some problems are algebraic in nature, but the majority involve data sets to be analyzed.

1.3 OBJECTIVES

We have formulated three objectives that we hope this book will achieve for the reader. These objectives are based on long experience teaching a course in multivariate methods, consulting on multivariate problems with researchers in many fields, and guiding statistics graduate students as they consulted with similar clients.

The first objective is to gain a thorough understanding of the details of various multivariate techniques, their purposes, their assumptions, their limitations, and so on. Many of these techniques are related, yet they differ in some essential ways. These similarities and differences are emphasized.
• 23. The second objective is to be able to select one or more appropriate techniques for a given multivariate data set. Recognizing the essential nature of a multivariate data set is the first step in a meaningful analysis. Basic types of multivariate data are introduced in Section 1.4.

The third objective is to be able to interpret the results of a computer analysis of a multivariate data set. Reading the manual for a particular program package is not enough to make an intelligent appraisal of the output. Achievement of the first objective and practice on data sets in the text should help achieve the third objective.

1.4 BASIC TYPES OF DATA AND ANALYSIS

We will list four basic types of (continuous) multivariate data and then briefly describe some possible analyses. Some writers would consider this an oversimplification and might prefer elaborate tree diagrams of data structure. However, many data sets can fit into one of these categories, and the simplicity of this structure makes it easier to remember. The four basic data types are as follows:
1. A single sample with several variables measured on each sampling unit (subject or object).
2. A single sample with two sets of variables measured on each unit.
3. Two samples with several variables measured on each unit.
4. Three or more samples with several variables measured on each unit.

Each data type has extensions, and various combinations of the four are possible. A few examples of analyses for each case will now be given:
1. A single sample with several variables measured on each sampling unit:
a. Test the hypothesis that the means of the variables have specified values.
b. Test the hypothesis that the variables are uncorrelated and have a common variance.
c. Find a small set of linear combinations of the original variables that summarizes most of the variation in the data (principal components).
d. Express the original variables as linear functions of a smaller set of underlying variables that account for the original variables and their intercorrelations (factor analysis).
2. A single sample with two sets of variables measured on each unit:
a. Determine the number, the size, and the nature of relationships between the two sets of variables (canonical correlation). For example, we may wish to relate a set of interest variables to a set of achievement variables. How much overall correlation is there between these two sets?
b. Find a model to predict one set of variables from the other set (multivariate multiple regression).
3. Two samples with several variables measured on each unit:
a. Compare the means of the variables across the two samples (Hotelling’s T2-test).
b. Find a linear combination of the variables that best separates the two samples (discriminant analysis).
c. Find a function of the variables that will accurately allocate the units into the two groups (classification analysis).
4. Three or more samples with several variables measured on each unit:
• 24. a. Compare the means of the variables across the groups (multivariate analysis of variance).
b. Extension of 3b to more than two groups.
c. Extension of 3c to more than two groups.

CHAPTER 2 MATRIX ALGEBRA

2.1 INTRODUCTION

This chapter introduces the basic elements of matrix algebra used in the remainder of this book. It is essentially a review of the requisite matrix tools and is not intended to be a complete development. However, it is sufficiently self-contained so that those with no previous exposure to the subject should need no other reference. Anyone unfamiliar with matrix algebra should plan to work most of the problems entailing numerical illustrations. It would also be helpful to explore some of the problems involving general matrix manipulation.

With the exception of a few derivations that seemed instructive, most of the results are given without proof. Some additional proofs are requested in the problems. For the remaining proofs, see any general text on matrix theory or one of the specialized matrix texts oriented to statistics, such as Graybill (1969), Searle (1982), or Harville (1997).

2.2 NOTATION AND BASIC DEFINITIONS

2.2.1 Matrices, Vectors, and Scalars

A matrix is a rectangular or square array of numbers or variables arranged in rows and columns. We use uppercase boldface letters to represent matrices. All entries in matrices will be real numbers or variables representing real numbers. The elements of a matrix are displayed in brackets. For example, the ACT score and GPA for three students can be conveniently listed in the following matrix: (2.1) The elements of A can also be variables, representing possible values of ACT and GPA for three students: (2.2) In this double-subscript notation for the elements of a matrix, the first subscript indicates the row; the second identifies the column. The matrix A in (2.2) could also be expressed as A = (aij), (2.3) where aij is a general element. With three rows and two columns, the matrix A in (2.1) or (2.2) is said to be 3 × 2. In general, if a matrix A has n rows and p columns, it is said to be n × p. Alternatively, we say the size of A is n × p.
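Readers who want to follow along on a computer can mirror this notation in NumPy. The following is a minimal sketch; the values are invented for illustration and are not the text's ACT/GPA example:

import numpy as np

# A 3 x 2 matrix: each row is a sampling unit, each column a variable
A = np.array([[23, 3.5],
              [29, 3.8],
              [18, 2.8]])
n, p = A.shape
print(n, p)                      # 3 2: A is n x p with n = 3 rows, p = 2 columns

x = np.array([98, 86, 93, 97])   # a vector of (made-up) test scores
print(x.shape)                   # (4,)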
• 25. A vector is a matrix with a single column or row. The following could be the test scores of a student in a course in multivariate analysis: (2.4) Variable elements in a vector can be identified by a single subscript: (2.5) We use lowercase boldface letters for column vectors. Row vectors are expressed as x′ = (x1, x2, …, xn), where x′ indicates the transpose of x. The transpose operation is defined in Section 2.2.3.

Geometrically, a vector with p elements identifies a point in a p-dimensional space. The elements in the vector are the coordinates of the point. In (2.35) in Section 2.3.3, we define the distance from the origin to the point. In Section 3.13, we define the distance between two vectors. In some cases, we will be interested in a directed line segment or arrow from the origin to the point.

A single real number is called a scalar, to distinguish it from a vector or matrix. Thus 2, −4, and 125 are scalars. A variable representing a scalar will usually be denoted by a lowercase nonbolded letter, such as a = 5. A product involving vectors and matrices may reduce to a matrix of size 1 × 1, which then becomes a scalar.

2.2.2 Equality of Vectors and Matrices

Two matrices are equal if they are the same size and the elements in corresponding positions are equal. Thus if A = (aij) and B = (bij), then A = B if aij = bij for all i and j. For example, let Then A = C. But even though A and B have the same elements, A ≠ B because the two matrices are not the same size. Likewise, A ≠ D because a23 ≠ d23. Thus two matrices of the same size are unequal if they differ in a single position.

2.2.3 Transpose and Symmetric Matrices

The transpose of a matrix A, denoted by A′, is obtained from A by interchanging rows and columns. Thus the columns of A′ are the rows of A, and the rows of A′ are the columns of A. The following examples illustrate the transpose of a matrix or vector:
• 26. The transpose operation does not change a scalar, since it has only one row and one column. If the transpose operator is applied twice to any matrix, the result is the original matrix: (A′)′ = A. (2.6) If the transpose of a matrix is the same as the original matrix, the matrix is said to be symmetric; that is, A is symmetric if A = A′. For example, Clearly, all symmetric matrices are square.

2.2.4 Special Matrices

The diagonal of a p × p square matrix A consists of the elements a11, a22, …, app. For example, in the matrix the elements 5, 9, and 1 lie on the diagonal. If a matrix contains zeros in all off-diagonal positions, it is said to be a diagonal matrix. An example of a diagonal matrix is This matrix could also be denoted as (2.7) A diagonal matrix can be formed from any square matrix by replacing off-diagonal elements by 0’s. This is denoted by diag(A). Thus for the above matrix A, we have (2.8) A diagonal matrix with a 1 in each diagonal position is called an identity matrix and is denoted by I. For example, a 3 × 3 identity matrix is given by (2.9) An upper triangular matrix is a square matrix with zeros below the diagonal, for example,
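These definitions translate directly to NumPy. A minimal sketch with made-up entries:

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(np.array_equal((A.T).T, A))   # (A')' = A, as in (2.6) -> True

B = np.array([[5, 2],
              [2, 9]])
print(np.array_equal(B, B.T))       # B is symmetric: B = B' -> True

D = np.diag(np.diag(B))             # diag(B): keep the diagonal, zero the rest
I = np.eye(3)                       # 3 x 3 identity matrix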
• 27. (2.10) A lower triangular matrix is defined similarly. A vector of 1’s will be denoted by j: (2.11) A square matrix of 1’s is denoted by J. For example, a 3 × 3 matrix J is given by (2.12) Finally, we denote a vector of zeros by 0 and a matrix of zeros by O. For example, (2.13)

2.3 OPERATIONS

2.3.1 Summation and Product Notation

For completeness, we review the standard mathematical notation for sums and products. The sum of a sequence of numbers a1, a2, …, an is indicated by ∑(i=1 to n) ai = a1 + a2 + … + an. If the n numbers are all the same, then ∑(i=1 to n) a = a + a + … + a = na. The sum of all the numbers in an array with double subscripts, such as a11, a12, …, a1p, a21, …, anp, is indicated by ∑(i=1 to n) ∑(j=1 to p) aij. This is sometimes abbreviated to ∑i ∑j aij. The product of a sequence of numbers a1, a2, …, an is indicated by ∏(i=1 to n) ai = (a1)(a2) … (an). If the n numbers are all equal, the product becomes ∏(i=1 to n) a = (a)(a) … (a) = aⁿ.
• 28. 2.3.2 Addition of Matrices and Vectors

If two matrices (or two vectors) are the same size, their sum is found by adding corresponding elements; that is, if A is n × p and B is n × p, then C = A + B is also n × p and is found as (cij) = (aij + bij). For example, Similarly, the difference between two matrices or two vectors of the same size is found by subtracting corresponding elements. Thus C = A − B is found as (cij) = (aij − bij). For example, If two matrices are identical, their difference is a zero matrix; that is, A = B implies A − B = O. For example, Matrix addition is commutative: A + B = B + A. (2.14) The transpose of the sum (difference) of two matrices is the sum (difference) of the transposes: (2.15) (2.16) (2.17) (2.18)

2.3.3 Multiplication of Matrices and Vectors

In order for the product AB to be defined, the number of columns in A must be the same as the number of rows in B, in which case A and B are said to be conformable. Then the (ij)th element of C = AB is cij = ∑k aik bkj. (2.19) Thus cij is the sum of products of the ith row of A and the jth column of B. We therefore multiply each row of A by each column of B, and the size of AB consists of the number of rows of A and the number of columns of B. Thus, if A is n × m and B is m × p, then C = AB is n × p. For example, if then
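A quick numerical check of these rules, using NumPy with invented matrices (the text's examples are not reproduced here):

import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])                        # 3 x 2
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])                      # 3 x 2, same size, so A + B is defined
print(np.array_equal((A + B).T, A.T + B.T))   # transpose of a sum -> True

C = np.array([[1, 0, 2],
              [0, 1, 3]])                     # 2 x 3: A (3 x 2) and C (2 x 3) are conformable
AC = A @ C                                    # the 3 x 3 product
i, j = 0, 2
cij = sum(A[i, k] * C[k, j] for k in range(A.shape[1]))  # sum of products, as in (2.19)
print(cij == AC[i, j])                        # True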
• 29. Note that A is 4 × 3, B is 3 × 2, and AB is 4 × 2. In this case, AB is of a different size than either A or B. If A and B are both n × n, then AB is also n × n. Clearly, A2 is defined only if A is square. In some cases AB is defined, but BA is not defined. In the above example, BA cannot be found because B is 3 × 2 and A is 4 × 3 and a row of B cannot be multiplied by a column of A. Sometimes AB and BA are both defined but are different in size. For example, if A is 2 × 4 and B is 4 × 2, then AB is 2 × 2 and BA is 4 × 4. If A and B are square and the same size, then AB and BA are both defined. However, AB ≠ BA, (2.20) except for a few special cases. For example, let Then Thus we must be careful to specify the order of multiplication. If we wish to multiply both sides of a matrix equation by a matrix, we must multiply “on the left” or “on the right” and be consistent on both sides of the equation. Multiplication is distributive over addition or subtraction: A(B + C) = AB + AC, (2.21) A(B − C) = AB − AC, (2.22) (A + B)C = AC + BC, (2.23) (A − B)C = AC − BC. (2.24) Note that, in general, because of (2.20), A(B + C) ≠ BA + CA. (2.25) Using the distributive law, we can expand products such as (A − B)(C − D) to obtain (A − B)(C − D) = AC − AD − BC + BD. (2.26) The transpose of a product is the product of the transposes in reverse order: (AB)′ = B′A′. (2.27) Note that (2.27) holds as long as A and B are conformable. They need not be square. Multiplication involving vectors follows the same rules as for matrices. Suppose A is n × p, a is p × 1, b is p × 1, and c is n × 1. Then some possible products are Ab, c′A, a′b, b′a, and ab′. For example, let Then
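The non-commutativity of matrix multiplication and the reversal rule for transposes are easy to verify numerically. A small sketch with matrices chosen only for illustration:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
print(np.array_equal(A @ B, B @ A))           # False: AB != BA in general (2.20)
print(np.array_equal((A @ B).T, B.T @ A.T))   # True: (AB)' = B'A' (2.27)

c = np.array([1, 1])
b = np.array([2, -1])
print((c @ A) @ b == c @ (A @ b))             # True: grouping of a triple product doesn't matter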
• 30. Note that Ab is a column vector, c′A is a row vector, c′Ab is a scalar, and a′b = b′a. The triple product c′Ab was obtained as c′(Ab). The same result would be obtained if we multiplied in the order (c′A)b: This is true in general for a triple product: ABC = A(BC) = (AB)C. (2.28) Thus multiplication of three matrices can be defined in terms of the product of two matrices, since (fortunately) it does not matter which two are multiplied first. Note that A and B must be conformable for multiplication, and B and C must be conformable. For example, if A is n × p, B is p × q, and C is q × m, then both multiplications are possible and the product ABC is n × m. We can sometimes factor a sum of triple products on both the right and left sides. For example, ABC + ADC = A(B + D)C. (2.29) As another illustration, let X be n × p and A be n × n. Then (2.30) If a and b are both n × 1, then a′b = a1b1 + a2b2 + … + anbn (2.31) is a sum of products and is a scalar. On the other hand, ab′ is defined for any size a and b and is a matrix, either rectangular or square:
• 31. (2.32) Similarly, a′a = a1² + a2² + … + an², (2.33) and aa′ is the n × n matrix with (ij)th element aiaj. (2.34) Thus a′a is a sum of squares and aa′ is a square (symmetric) matrix. The products a′a and aa′ are sometimes referred to as the dot product and matrix product, respectively. The square root of the sum of squares of the elements of a is the distance from the origin to the point a and is also referred to as the length of a: √(a′a) = √(a1² + a2² + … + an²). (2.35) As special cases of (2.33) and (2.34), note that if j is n × 1, then j′j = n and jj′ = J, (2.36) where j and J were defined in (2.11) and (2.12). If a is n × 1 and A is n × p, then a′j = j′a = ∑i ai, (2.37) (2.38) Thus a′j is the sum of the elements in a, j′A contains the column sums of A, and Aj contains the row sums of A. In a′j, the vector j is n × 1; in j′A, the vector j is n × 1; and in Aj, the vector j is p × 1. Since a′b is a scalar, it is equal to its transpose: a′b = (a′b)′ = b′a. (2.39) This allows us to write (a′b)2 in the form (a′b)2 = a′b b′a = a′(bb′)a. (2.40) From (2.18), (2.26), and (2.39) we obtain (a − b)′(a − b) = a′a − 2a′b + b′b. (2.41) Note that in analogous expressions with matrices, however, the two middle terms cannot be combined: (A − B)′(A − B) = A′A − A′B − B′A + B′B. If a and x1, x2, …, xn are all p × 1 and A is p × p, we obtain the following factoring results as extensions of (2.21) and (2.29):
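A short NumPy sketch of these vector products, with illustrative values of our own:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(a @ b)                  # a'b: a sum of products, a scalar (2.31)
print(np.outer(a, b))         # ab': a 3 x 3 matrix (2.32)
print(a @ a, np.sqrt(a @ a))  # a'a = sum of squares; its square root is the length (2.35)

j = np.ones(3)
A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])      # 3 x 2
print(j @ A)                  # j'A: the column sums of A
print(A @ np.ones(2))         # Aj: the row sums of A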
• 32. (2.42) (2.43) (2.44) (2.45) We can express matrix multiplication in terms of row vectors and column vectors. If ai′ is the ith row of A and bj is the jth column of B, then the (ij)th element of AB is ai′bj. For example, if A has three rows and B has two columns, then the product AB can be written as (2.46) This can be expressed in terms of the rows of A: (2.47) Note that the first column of AB in (2.46) is Ab1, and likewise the second column is Ab2. Thus AB can be written in the form AB = A(b1, b2) = (Ab1, Ab2). This result holds in general: AB = A(b1, b2, …, bp) = (Ab1, Ab2, …, Abp). (2.48) To further illustrate matrix multiplication in terms of rows and columns, let A be an n × p matrix, x be a p × 1 vector, and S be a p × p matrix. Then (2.49) (2.50) Any matrix can be multiplied by its transpose. If A is n × p, then AA′ is n × n. Similarly, A′A is p × p. From (2.6) and (2.27), it is clear that both AA′ and A′A are symmetric. In the above illustration for AB in terms of row and column vectors, the rows of A were denoted by ai′ and the columns of B by bj. If both rows and columns of a matrix A are under discussion, as in AA′ and A′A, we will use the notation ai′ for rows and a(j) for columns. To illustrate, if A is 3 × 4, we have
• 33. where, for example, ai′ is the ith row and a(j) is the jth column. With this notation for rows and columns of A, we can express the elements of A′A or of AA′ as products of the rows of A or of the columns of A. Thus if we write A in terms of its rows as A = (a1′; a2′; …; an′), then we have A′A = a1a1′ + a2a2′ + … + anan′ = ∑i ai ai′, (2.51) and AA′ is the matrix whose (ij)th element is ai′aj. (2.52) Similarly, if we express A in terms of columns as A = (a(1), a(2), …, a(p)), then AA′ = a(1)a(1)′ + a(2)a(2)′ + … + a(p)a(p)′ = ∑j a(j)a(j)′, (2.53)
• 34. and A′A is the matrix whose (ij)th element is a(i)′a(j). (2.54)

Let A = (aij) be an n × n matrix and D be a diagonal matrix, D = diag(d1, d2, …, dn). Then, in the product DA, the ith row of A is multiplied by di, and in AD, the jth column of A is multiplied by dj. For example, if n = 3, we have (2.55) (2.56) (2.57) In the special case where the diagonal matrix is the identity, we have IA = AI = A. (2.58) If A is rectangular, (2.58) still holds, but the two identities are of different sizes. The product of a scalar and a matrix is obtained by multiplying each element of the matrix by the scalar: cA = (caij). (2.59) For example, (2.60)
• 35. (2.61) Since caij = aijc, the product of a scalar and a matrix is commutative: cA = Ac. (2.62) Multiplication of vectors or matrices by scalars permits the use of linear combinations, such as a1x1 + a2x2 + … + akxk, where the ai are scalars and the xi are vectors. If A is a symmetric matrix and x and y are vectors, the product x′Ax (2.63) is called a quadratic form, while x′Ay (2.64) is called a bilinear form. Either of these is, of course, a scalar and can be treated as such. Expressions such as √(x′Ax) are permissible (assuming A is positive definite; see Section 2.7).

2.4 PARTITIONED MATRICES

It is sometimes convenient to partition a matrix into submatrices. For example, a partitioning of a matrix A into four submatrices could be indicated symbolically as follows: A = (A11 A12; A21 A22). For example, a 4 × 5 matrix A could be partitioned as where If two matrices A and B are conformable and A and B are partitioned so that the submatrices are appropriately conformable, then the product AB can be found by following the usual row-by-column pattern of multiplication on the submatrices as if they were single elements; for example, AB = (A11B11 + A12B21 A11B12 + A12B22; A21B11 + A22B21 A21B12 + A22B22). (2.65)
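Quadratic and bilinear forms are one-line computations in NumPy. A minimal sketch with an invented symmetric matrix:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])    # symmetric
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(x @ A @ x)              # quadratic form x'Ax (2.63): a scalar
print(x @ A @ y)              # bilinear form x'Ay (2.64): also a scalar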
• 36. It can be seen that this formulation is equivalent to the usual row-by-column definition of matrix multiplication. For example, the (1, 1) element of AB is the product of the first row of A and the first column of B. In the (1, 1) element of A11B11 we have the sum of products of part of the first row of A and part of the first column of B. In the (1, 1) element of A12B21 we have the sum of products of the rest of the first row of A and the remainder of the first column of B. Multiplication of a matrix and a vector can also be carried out in partitioned form. For example, Ab = (A1, A2)(b1; b2) = A1b1 + A2b2, (2.66) where the partitioning of the columns of A corresponds to the partitioning of the elements of b. Note that the partitioning of A into two sets of columns is indicated by a comma, A = (A1, A2). The partitioned multiplication in (2.66) can be extended to individual columns of A and individual elements of b: Ab = b1a1 + b2a2 + … + bpap. (2.67) Thus Ab is expressible as a linear combination of the columns of A, the coefficients being elements of b. For example, let Then Using a linear combination of columns of A as in (2.67), we obtain We note that if A is partitioned as in (2.66), A = (A1, A2), the transpose is not equal to (A′1, A′2), but rather (A1, A2)′ = (A1′; A2′). (2.68)

2.5 RANK

Before defining the rank of a matrix, we first introduce the notion of linear independence and dependence. A set of vectors a1, a2, …, an is said to be linearly dependent if constants c1, c2, …, cn (not all zero) can be found such that c1a1 + c2a2 + … + cnan = 0. (2.69) If no constants c1, c2, …, cn can be found satisfying (2.69), the set of vectors is said to be linearly independent.
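A quick numerical check of the partitioned product (2.66) and the linear-combination view (2.67), again with invented values:

import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.],
              [10., 11., 12.]])                 # 4 x 3
b = np.array([2., -1., 3.])

A1, A2 = A[:, :2], A[:, 2:]                     # partition the columns of A
b1, b2 = b[:2], b[2:]                           # partition b to match
print(np.allclose(A @ b, A1 @ b1 + A2 @ b2))    # (2.66) -> True

# Ab as a linear combination of the columns of A, as in (2.67)
lin_comb = sum(b[j] * A[:, j] for j in range(A.shape[1]))
print(np.allclose(A @ b, lin_comb))             # True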
• 37. If (2.69) holds, then at least one of the vectors ai can be expressed as a linear combination of the other vectors in the set. Thus linear dependence of a set of vectors implies redundancy in the set. Among linearly independent vectors there is no redundancy of this type. The rank of any square or rectangular matrix A is defined as rank(A) = number of linearly independent rows of A = number of linearly independent columns of A. It can be shown that the number of linearly independent rows of a matrix is always equal to the number of linearly independent columns. If A is n × p, the maximum possible rank of A is the smaller of n and p, in which case A is said to be of full rank (sometimes said full row rank or full column rank). For example, has rank 2 because the two rows are linearly independent (neither row is a multiple of the other). However, even though A is full rank, the columns are linearly dependent because rank 2 implies there are only two linearly independent columns. Thus, by (2.69), there exist constants c1, c2, and c3 such that c1a1 + c2a2 + c3a3 = 0. (2.70) By (2.67), we can write (2.70) in the form Ac = 0. (2.71) A solution vector to (2.70) or (2.71) is given by any multiple of c = (14, −11, −12)′. Hence we have the interesting result that a product of a matrix A and a vector c is equal to 0, even though A ≠ O and c ≠ 0. This is a direct consequence of the linear dependence of the column vectors of A. Another consequence of the linear dependence of rows or columns of a matrix is the possibility of expressions such as AB = CB, where A ≠ C. For example, let Then All three of the matrices A, B, and C are full rank; but being rectangular, they have a rank deficiency in either rows or columns, which permits us to construct AB = CB with A ≠ C. Thus in a matrix equation, we cannot, in general, cancel matrices from both sides of the equation. There are two exceptions to this rule. One involves a nonsingular matrix to be defined in Section 2.6. The other special case occurs when the expression holds for all possible values of the matrix common to both sides of the equation. For example, if Ax = Bx for all possible x, then A = B. (2.72) To see this, let x = (1, 0, …, 0)′. Then the first column of A equals the first column of B. Now let x = (0, 1, 0, …, 0)′, and the second column of A equals the second column of B. Continuing in this fashion, we obtain A = B. Suppose a rectangular matrix A is n × p of rank p, where p < n. We typically shorten this statement to “A is n × p of rank p < n.”
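These rank facts are easy to confirm numerically. The matrix below is hypothetical (the text's example matrix is not reproduced here), chosen so that the text's solution vector c = (14, −11, −12)′ satisfies Ac = 0:

import numpy as np

A = np.array([[1., -2., 3.],
              [5., 2., 4.]])             # a 2 x 3 matrix with independent rows
print(np.linalg.matrix_rank(A))          # 2: full (row) rank, yet the 3 columns are dependent

c = np.array([14., -11., -12.])
print(A @ c)                             # [0. 0.]: Ac = 0 even though A != O and c != 0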
• 38. 2.6 INVERSE

If a matrix A is square and of full rank, then A is said to be nonsingular, and A has a unique inverse, denoted by A−1, with the property that AA−1 = A−1A = I. (2.73) For example, let Then If A is square and of less than full rank, then an inverse does not exist, and A is said to be singular. Note that rectangular matrices do not have inverses as in (2.73), even if they are full rank. If A and B are the same size and nonsingular, then the inverse of their product is the product of their inverses in reverse order, (AB)−1 = B−1A−1. (2.74) Note that (2.74) holds only for nonsingular matrices. Thus, for example, if A is n × p of rank p < n, then A′A has an inverse, but (A′A)−1 is not equal to A−1(A′)−1 because A is rectangular and does not have an inverse. If a matrix is nonsingular, it can be canceled from both sides of an equation, provided it appears on the left (right) on both sides. For example, if B is nonsingular, then AB = CB implies A = C, since we can multiply on the right by B−1 to obtain ABB−1 = CBB−1, or A = C. Otherwise, if A, B, and C are rectangular or square and singular, it is easy to construct AB = CB, with A ≠ C, as illustrated near the end of Section 2.5. The inverse of the transpose of a nonsingular matrix is given by the transpose of the inverse: (A′)−1 = (A−1)′. (2.75) If the symmetric nonsingular matrix A is partitioned in the form then the inverse is given by (2.76). A nonsingular matrix of the form B + cc′, where B is nonsingular, has as its inverse (B + cc′)−1 = B−1 − (1/(1 + c′B−1c)) B−1cc′B−1. (2.77)
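A numerical check of the inverse rules, including the rank-one update formula (2.77); the matrices are invented for illustration:

import numpy as np

A = np.array([[3., 1.],
              [2., 2.]])
B = np.array([[1., 1.],
              [0., 2.]])
Ainv = np.linalg.inv(A)
print(np.allclose(A @ Ainv, np.eye(2)))                   # AA^-1 = I (2.73)
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ Ainv))               # (AB)^-1 = B^-1 A^-1 (2.74)

c = np.array([1., 2.])
Binv = np.linalg.inv(B)
lhs = np.linalg.inv(B + np.outer(c, c))
rhs = Binv - (Binv @ np.outer(c, c) @ Binv) / (1 + c @ Binv @ c)
print(np.allclose(lhs, rhs))                              # (2.77) -> True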
• 39. 2.7 POSITIVE DEFINITE MATRICES

The symmetric matrix A is said to be positive definite if x′Ax > 0 for all possible vectors x (except x = 0). Similarly, A is positive semidefinite if x′Ax ≥ 0 for all x ≠ 0. [A quadratic form x′Ax was defined in (2.63).] The diagonal elements aii of a positive definite matrix are positive. To see this, let x′ = (0, …, 0, 1, 0, …, 0) with a 1 in the ith position. Then x′Ax = aii > 0. Similarly, for a positive semidefinite matrix A, aii ≥ 0 for all i. One way to obtain a positive definite matrix is as follows: if B is n × p of full rank p < n, then A = B′B is positive definite. (2.78) This is easily shown: x′Ax = x′B′Bx = (Bx)′(Bx) = z′z, where z = Bx. Thus x′Ax = ∑i zi², which is positive (Bx cannot be 0 unless x = 0, because B is full rank). If B is less than full rank, then by a similar argument, B′B is positive semidefinite. Note that A = B′B is analogous to a = b2 in real numbers, where the square of any number (including negative numbers) is positive. In another analogy to positive real numbers, a positive definite matrix can be factored into a “square root” in two ways. We give one method below in (2.79) and the other in Section 2.11.8. A positive definite matrix A can be factored into A = T′T, (2.79) where T is a nonsingular upper triangular matrix. One way to obtain T is the Cholesky decomposition, which can be carried out in the following steps. Let A = (aij) and T = (tij) be n × n. Then the elements of T are found as follows: t11 = √a11; t1j = a1j/t11 for j > 1; tii = √(aii − ∑(k=1 to i−1) tki²) for 1 < i ≤ n; tij = (aij − ∑(k=1 to i−1) tki tkj)/tii for 1 < i < j ≤ n; and tij = 0 for i > j. For example, let Then by the Cholesky method, we obtain
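In practice the factorization is done by library routines. A minimal NumPy sketch with an invented positive definite matrix; note that np.linalg.cholesky returns a lower triangular L with A = LL′, so the text's upper triangular T in A = T′T (2.79) is simply T = L′:

import numpy as np

A = np.array([[4., 2., 2.],
              [2., 10., 7.],
              [2., 7., 21.]])              # symmetric positive definite (illustrative)

L = np.linalg.cholesky(A)                  # lower triangular, A = L L'
T = L.T                                    # the text's upper triangular factor
print(np.allclose(T.T @ T, A))             # A = T'T (2.79) -> True
print(np.all(np.linalg.eigvalsh(A) > 0))   # positive definite: all eigenvalues > 0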
• 40. 2.8 DETERMINANTS

The determinant of an n × n matrix A is defined as the sum of all n! possible products of n elements such that 1. Each product contains one element from every row and every column, and 2. The factors in each product are written so that the column subscripts appear in order of magnitude and each product is then preceded by a plus or minus sign according to whether the number of inversions in the row subscripts is even or odd. An inversion occurs whenever a larger number precedes a smaller one. The symbol n! is defined as n! = n(n − 1)(n − 2) … (2)(1). (2.80) The determinant of A is a scalar denoted by |A| or by det(A). The preceding definition is not useful in evaluating determinants, except in the case of 2 × 2 or 3 × 3 matrices. For larger matrices, other methods are available for manual computation, but determinants are typically evaluated by computer. For a 2 × 2 matrix, the determinant is found by |A| = a11a22 − a21a12. (2.81) For a 3 × 3 matrix, the determinant is given by |A| = a11a22a33 + a12a23a31 + a13a21a32 − a31a22a13 − a32a23a11 − a33a21a12. (2.82) This can be found by the following scheme: the three positive terms are the products along the diagonals running downward to the right, and the three negative terms are the products along the diagonals running downward to the left. The determinant of a diagonal matrix is the product of the diagonal elements; that is, if D = diag(d1, d2, …, dn), then |D| = d1d2 … dn = ∏i di. (2.83) As a special case of (2.83), suppose all diagonal elements are equal, say, D = diag(c, c, …, c) = cI. Then |D| = |cI| = cⁿ. (2.84) The extension of (2.84) to any square matrix A is |cA| = cⁿ|A|. (2.85) Because the determinant is a scalar, we can carry out operations such as |A|1/2 and 1/|A|, provided that |A| > 0 for |A|1/2 and that |A| ≠ 0 for 1/|A|. If the square matrix A is singular, its determinant is 0: |A| = 0. (2.86)
• 41. If A is near singular, then there exists a linear combination of the columns that is close to 0, and |A| is also close to 0. If A is nonsingular, its determinant is nonzero: |A| ≠ 0. (2.87) If A is positive definite, its determinant is positive: |A| > 0. (2.88) If A and B are square and the same size, the determinant of the product is the product of the determinants: |AB| = |A||B|. (2.89) For example, let Then The determinant of the transpose of a matrix is the same as the determinant of the matrix, and the determinant of the inverse of a matrix is the reciprocal of the determinant: |A′| = |A|, (2.90) |A−1| = 1/|A|. (2.91) If a partitioned matrix has the form where A11 and A22 are square, but not necessarily the same size, then |A| = |A11||A22|. (2.92) For a general partitioned matrix, where A11 and A22 are square and nonsingular (not necessarily the same size), the determinant is given by either of the following two expressions: |A| = |A11||A22 − A21A11−1A12| (2.93) or |A| = |A22||A11 − A12A22−1A21|. (2.94) Note the analogy of (2.93) and (2.94) to the case of the determinant of a 2 × 2 matrix as given by (2.81): |A| = a11(a22 − a21a12/a11) = a22(a11 − a12a21/a22). If B is nonsingular and c is a vector, then |B + cc′| = |B|(1 + c′B−1c). (2.95)

2.9 TRACE
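The product, transpose, and inverse rules for determinants can be verified directly; the values below are our own:

import numpy as np

A = np.array([[1., 3.],
              [2., 4.]])
B = np.array([[5., 0.],
              [1., 2.]])
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))    # |AB| = |A||B| (2.89)
print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))   # |A'| = |A| (2.90)
print(np.isclose(np.linalg.det(np.linalg.inv(A)),
                 1 / np.linalg.det(A)))                   # |A^-1| = 1/|A| (2.91)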
• 42. A simple function of an n × n matrix A is the trace, denoted by tr(A) and defined as the sum of the diagonal elements of A; that is, tr(A) = ∑(i=1 to n) aii. The trace is, of course, a scalar. For example, suppose Then The trace of the sum of two square matrices is the sum of the traces of the two matrices: tr(A + B) = tr(A) + tr(B). (2.96) An important result for the product of two matrices is tr(AB) = tr(BA). (2.97) This result holds for any matrices A and B where AB and BA are both defined. It is not necessary that A and B be square or that AB equal BA. For example, let Then From (2.52) and (2.54), we obtain tr(A′A) = tr(AA′) = ∑i ∑j aij², (2.98) where the aij's are elements of the n × p matrix A.

2.10 ORTHOGONAL VECTORS AND MATRICES

Two vectors a and b of the same size are said to be orthogonal if a′b = 0. (2.99) Geometrically, orthogonal vectors are perpendicular [see (3.14) and the comments following (3.14)]. If a′a = 1, the vector a is said to be normalized. The vector a can always be normalized by dividing by its length, √(a′a). Thus c = a/√(a′a) (2.100) is normalized so that c′c = 1. A matrix C = (c1, c2, …, cp) whose columns are normalized and mutually orthogonal is called an orthogonal matrix. Since the elements of C′C are products of columns of C [see (2.54)], which have the properties ci′ci = 1 for all i and ci′cj = 0 for all i ≠ j, we have C′C = I. (2.101) If C satisfies (2.101), it necessarily follows that CC′ = I, (2.102) from which we see that the rows of C are also normalized and mutually orthogonal. It is clear from (2.101) and (2.102) that C−1 = C′ for an orthogonal matrix C. We illustrate the creation of an orthogonal matrix by starting with
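A quick check of the trace identities and of vector normalization, with invented matrices:

import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])        # 2 x 3
B = np.array([[1., 0.],
              [2., 1.],
              [0., 3.]])            # 3 x 2
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # tr(AB) = tr(BA) (2.97)
print(np.isclose(np.trace(A @ A.T), (A ** 2).sum()))  # tr(AA') = sum of squared elements (2.98)

a = np.array([3., 4.])
c = a / np.sqrt(a @ a)              # normalize a by its length, as in (2.100)
print(c @ c)                        # 1.0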
• 43. whose columns are mutually orthogonal. To normalize the three columns, we divide by the respective lengths to obtain Note that the rows also became normalized and mutually orthogonal so that C satisfies both (2.101) and (2.102). Multiplication by an orthogonal matrix has the effect of rotating axes; that is, if a point x is transformed to z = Cx, where C is orthogonal, then z′z = (Cx)′(Cx) = x′C′Cx = x′x, (2.103) and the distance from the origin to z is the same as the distance to x.

2.11 EIGENVALUES AND EIGENVECTORS

2.11.1 Definition

For every square matrix A, a scalar λ and a nonzero vector x can be found such that Ax = λx. (2.104) In (2.104), λ is called an eigenvalue of A and x is an eigenvector. To find λ and x, we write (2.104) as (A − λI)x = 0. (2.105) If |A − λI| ≠ 0, then (A − λI) has an inverse and x = 0 is the only solution. Hence, in order to obtain nontrivial solutions, we set |A − λI| = 0 to find values of λ that can be substituted into (2.105) to find corresponding values of x. Alternatively, (2.69) and (2.71) require that the columns of A − λI be linearly dependent. Thus in (A − λI)x = 0, the matrix A − λI must be singular in order to find a solution vector x that is not 0. The equation |A − λI| = 0 is called the characteristic equation. If A is n × n, the characteristic equation will have n roots; that is, A will have n eigenvalues λ1, λ2, …, λn. The λ’s will not necessarily all be distinct or all nonzero. However, if A arises from computations on real (continuous) data and is nonsingular, the λ’s will all be distinct (with probability 1). After finding λ1, λ2, …, λn, the accompanying eigenvectors x1, x2, …, xn can be found using (2.105). If we multiply both sides of (2.105) by a scalar k and note by (2.62) that k and A − λI commute, we obtain k(A − λI)x = (A − λI)(kx) = 0. (2.106) Thus if x is an eigenvector of A, kx is also an eigenvector, and eigenvectors are unique only up to multiplication by a scalar. Hence we can adjust the length of x, but the direction from the origin is unique; that is, the relative values of (ratios of) the components of x = (x1, x2, …, xn)′ are unique. Typically, the eigenvector x is scaled so that x′x = 1. To illustrate, we will find the eigenvalues and eigenvectors for the matrix The characteristic equation is
• 44. from which λ1 = 3 and λ2 = 2. To find the eigenvector corresponding to λ1 = 3, we use (2.105), As expected, either equation is redundant in the presence of the other, and there remains a single equation with two unknowns, x1 = x2. The solution vector can be written with an arbitrary constant, If c is set equal to 1/√2 to normalize the eigenvector, we obtain Similarly, corresponding to λ2 = 2, we have

2.11.2 I + A and I − A

If λ is an eigenvalue of A and x is the corresponding eigenvector, then 1 + λ is an eigenvalue of I + A and 1 − λ is an eigenvalue of I − A. In either case, x is the corresponding eigenvector. We demonstrate this for I + A: (I + A)x = Ix + Ax = x + λx = (1 + λ)x.

2.11.3 tr(A) and |A|

For any square matrix A with eigenvalues λ1, λ2, …, λn, we have tr(A) = ∑(i=1 to n) λi, (2.107) |A| = ∏(i=1 to n) λi. (2.108) Note that by the definition in Section 2.9, tr(A) is also equal to ∑(i=1 to n) aii, but aii ≠ λi. We illustrate (2.107) and (2.108) using the matrix from the illustration in Section 2.11.1, for which λ1 = 3 and λ2 = 2. Using (2.107), we obtain tr(A) = λ1 + λ2 = 3 + 2 = 5, and from (2.108), we have |A| = λ1λ2 = (3)(2) = 6. Computing tr(A) and |A| directly from the elements of A gives, by definition, the same values.
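These relationships are easy to confirm numerically. The matrix below is hypothetical (chosen to have eigenvalues 3 and 2, like the illustration, but not necessarily the text's matrix):

import numpy as np

A = np.array([[1., 2.],
              [-1., 4.]])
evals, evecs = np.linalg.eig(A)
print(np.sort(evals))                             # [2. 3.]
print(np.isclose(np.trace(A), evals.sum()))       # tr(A) = sum of eigenvalues (2.107)
print(np.isclose(np.linalg.det(A), evals.prod())) # |A| = product of eigenvalues (2.108)

x, lam = evecs[:, 0], evals[0]
print(np.allclose(A @ x, lam * x))                # Ax = lambda x (2.104); x is normalized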
• 45. 2.11.4 Positive Definite and Semidefinite Matrices

The eigenvalues and eigenvectors of positive definite and positive semidefinite matrices have the following properties: 1. The eigenvalues of a positive definite matrix are all positive. 2. The eigenvalues of a positive semidefinite matrix are positive or zero, with the number of positive eigenvalues equal to the rank of the matrix. It is customary to list the eigenvalues of a positive definite matrix in descending order: λ1 > λ2 > … > λp. The eigenvectors x1, x2, …, xn are listed in the same order; x1 corresponds to λ1, x2 corresponds to λ2, and so on. The following result, known as the Perron–Frobenius theorem, is of interest in Chapter 12: If all elements of the positive definite matrix A are positive, then all elements of the first eigenvector are positive. (The first eigenvector is the one associated with the first eigenvalue, λ1.)

2.11.5 The Product AB

If A and B are square and the same size, the eigenvalues of AB are the same as those of BA, although the eigenvectors are usually different. This result also holds if AB and BA are both square but of different sizes, as when A is n × p and B is p × n. (In this case, the nonzero eigenvalues of AB and BA will be the same.)

2.11.6 Symmetric Matrix

The eigenvectors of an n × n symmetric matrix A are mutually orthogonal. It follows that if the n eigenvectors of A are normalized and inserted as columns of a matrix C = (x1, x2, …, xn), then C is orthogonal.

2.11.7 Spectral Decomposition

It was noted in Section 2.11.6 that if the matrix C = (x1, x2, …, xn) contains the normalized eigenvectors of an n × n symmetric matrix A, then C is orthogonal. Therefore, by (2.102), I = CC′, which we can multiply by A to obtain A = AI = ACC′ = (AC)C′. We now substitute C = (x1, x2, …, xn): AC = (Ax1, Ax2, …, Axn) = (λ1x1, λ2x2, …, λnxn) = CD, so that A = CDC′, (2.109) where D = diag(λ1, λ2, …, λn). (2.110) The expression A = CDC′ in (2.109) for a symmetric matrix A in terms of its eigenvalues and eigenvectors is known as the spectral decomposition of A.
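A sketch of the spectral decomposition using NumPy's routine for symmetric matrices (the matrix is invented):

import numpy as np

A = np.array([[4., 1., 1.],
              [1., 3., 0.],
              [1., 0., 2.]])          # symmetric
evals, C = np.linalg.eigh(A)          # eigh: for symmetric matrices; columns of C are
D = np.diag(evals)                    # normalized eigenvectors, so C is orthogonal
print(np.allclose(C @ D @ C.T, A))    # A = CDC' (2.109) -> True
print(np.allclose(C.T @ C, np.eye(3)))  # eigenvectors mutually orthogonal (Section 2.11.6)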
• 46. Since C is orthogonal and C′C = CC′ = I, we can multiply (2.109) on the left by C′ and on the right by C to obtain C′AC = D. (2.111) Thus a symmetric matrix A can be diagonalized by an orthogonal matrix containing normalized eigenvectors of A, and by (2.110) the resulting diagonal matrix contains eigenvalues of A.

2.11.8 Square Root Matrix

If A is positive definite, the spectral decomposition of A in (2.109) can be modified by taking the square roots of the eigenvalues to produce a square root matrix, A1/2 = CD1/2C′, (2.112) where D1/2 = diag(√λ1, √λ2, …, √λn). (2.113) The square root matrix A1/2 is symmetric and serves as the square root of A: A1/2A1/2 = A. (2.114)

2.11.9 Square and Inverse Matrices

Other functions of A have spectral decompositions analogous to (2.112). Two of these are the square and inverse of A. If the square matrix A has eigenvalues λ1, λ2, …, λn and accompanying eigenvectors x1, x2, …, xn, then A2 has eigenvalues λ1², λ2², …, λn² and eigenvectors x1, x2, …, xn. If A is nonsingular, then A−1 has eigenvalues 1/λ1, 1/λ2, …, 1/λn and eigenvectors x1, x2, …, xn. If A is also symmetric, then A2 = CD2C′, (2.115) A−1 = CD−1C′, (2.116) where C = (x1, x2, …, xn) has as columns the normalized eigenvectors of A (and of A2 and A−1), D2 = diag(λ1², λ2², …, λn²), and D−1 = diag(1/λ1, 1/λ2, …, 1/λn).

2.11.10 Singular Value Decomposition

In Section 2.11.7 we expressed a symmetric matrix in terms of its eigenvalues and eigenvectors in the spectral decomposition. In a similar manner, we can express any (real) matrix A in terms of eigenvalues and eigenvectors of A′A and AA′. Let A be an n × p matrix of rank k. Then the singular value decomposition of A can be expressed as A = UDV′, (2.117) where U is n × k, D is k × k, and V is p × k. The diagonal elements of the nonsingular diagonal matrix D = diag(λ1, λ2, …, λk) are the positive square roots of λ1², λ2², …, λk², which are the nonzero eigenvalues of A′A or of AA′. The values λ1, λ2, …, λk are called the singular values of A. The k columns of U are the normalized eigenvectors of AA′ corresponding to the eigenvalues λ1², λ2², …, λk². The k columns of V are the normalized eigenvectors of A′A corresponding to the eigenvalues λ1², λ2², …, λk². Since the columns of U and V are (normalized) eigenvectors of symmetric matrices, they are mutually orthogonal (see Section 2.11.6), and we have U′U = V′V = I.
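Both factorizations are one-liners in NumPy; the matrices below are our own illustrations:

import numpy as np

A = np.array([[4., 1.],
              [1., 3.]])                  # positive definite
evals, C = np.linalg.eigh(A)
Ahalf = C @ np.diag(np.sqrt(evals)) @ C.T # square root matrix (2.112)
print(np.allclose(Ahalf @ Ahalf, A))      # A^{1/2} A^{1/2} = A (2.114)

B = np.array([[1., 2., 0.],
              [2., 4., 1.]])              # 2 x 3, rank 2
U, d, Vt = np.linalg.svd(B, full_matrices=False)
print(np.allclose(U @ np.diag(d) @ Vt, B))  # B = UDV' (2.117)
# squared singular values = nonzero eigenvalues of BB' (descending order)
print(np.allclose(d ** 2, np.linalg.eigvalsh(B @ B.T)[::-1]))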
• 47. 2.12 KRONECKER AND VEC NOTATION

When manipulating matrices with block structure, it is often convenient to use Kronecker and vec notation. The Kronecker product of an m × n matrix A with a p × q matrix B is an mp × nq matrix that is denoted A ⊗ B and is defined as A ⊗ B = (aijB); (2.118) that is, the (i, j)th block of A ⊗ B is the p × q matrix aijB. For example, because of the block structure in the matrix we can write where If A is an m × n matrix with columns a1, …, an, then we can refer to the elements of A in vector (or “vec”) form using vec A = (a1′, a2′, …, an′)′, (2.119) so that vec A is an mn × 1 vector. If A is an m × m symmetric matrix, then the m² elements in vec A will include m(m − 1)/2 pairs of identical elements (since aij = aji). In such settings it is often useful to denote the vector half (or “vech”) of a symmetric matrix in order to include only the unique elements of the matrix. If we separate elements from different columns using semicolons, we can define the “vech” operator with vech A = (a11, a21, …, am1; a22, a32, …, am2; …; amm)′, (2.120) so that vech A is an m(m + 1)/2 × 1 vector. Note that vech A can be obtained by finding vec A and then eliminating the m(m − 1)/2 elements above the diagonal of A.
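NumPy provides the Kronecker product directly; vec and vech are short one-offs. A minimal sketch (matrices invented; the vech helper below is our own construction, not a library routine):

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
K = np.kron(A, B)                    # Kronecker product (2.118): block (i, j) is a_ij * B
print(K.shape)                       # (4, 4) = (mp, nq)

vecA = A.reshape(-1, order="F")      # vec A (2.119): stack the columns of A
print(vecA)                          # [1 3 2 4]

S = np.array([[1, 2],
              [2, 5]])               # symmetric
m = S.shape[0]
# vech S: elements on or below the diagonal, taken column by column
vechS = np.concatenate([S[j:, j] for j in range(m)])
print(vechS)                         # [1 2 5]: m(m + 1)/2 = 3 unique elements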