2. Contents
Cover
Half Title page
Title page
Copyright page
Preface
Acknowledgments
Chapter 1: Introduction
1.1 Why Multivariate Analysis?
1.2 Prerequisites
1.3 Objectives
1.4 Basic Types of Data and Analysis
Chapter 2: Matrix Algebra
2.1 Introduction
2.2 Notation and Basic Definitions
2.3 Operations
2.4 Partitioned Matrices
2.5 Rank
2.6 Inverse
2.7 Positive Definite Matrices
2.8 Determinants
2.9 Trace
2.10 Orthogonal Vectors and Matrices
2.11 Eigenvalues and Eigenvectors
2.12 Kronecker and Vec Notation
Problems
Chapter 3: Characterizing and Displaying Multivariate
Data
3.1 Mean and Variance of a Univariate Random Variable
3.2 Covariance and Correlation of Bivariate Random Variables
3.3 Scatterplots of Bivariate Samples
3.4 Graphical Displays for Multivariate Samples
3.5 Dynamic Graphics
3.6 Mean Vectors
3.7 Covariance Matrices
3.8 Correlation Matrices
3.9 Mean Vectors and Covariance Matrices for Subsets of Variables
3.10 Linear Combinations of Variables
3.11 Measures of Overall Variability
3. 3.12 Estimation of Missing Values
3.13 Distance Between Vectors
Problems
Chapter 4: The Multivariate Normal Distribution
4.1 Multivariate Normal Density Function
4.2 Properties of Multivariate Normal Random Variables
4.3 Estimation in the Multivariate Normal
4.4 Assessing Multivariate Normality
4.5 Transformations to Normality
4.6 Outliers
Problems
Chapter 5: Tests on One or Two Mean Vectors
5.1 Multivariate Versus Univariate Tests
5.2 Tests on μ With Σ Known
5.3 Tests on μ When Σ Is Unknown
5.4 Comparing Two Mean Vectors
5.5 Tests on Individual Variables Conditional on Rejection of H0 By the T2-
Test
5.6 Computation of T2
5.7 Paired Observations Test
5.8 Test for Additional Information
5.9 Profile Analysis
Problems
Chapter 6: Multivariate Analysis of Variance
6.1 One-Way Models
6.2 Comparison of the Four Manova Test Statistics
6.3 Contrasts
6.4 Tests on Individual Variables Following Rejection ofH0 By the Overall
Manova Test
6.5 Two-Way Classification
6.6 Other Models
6.7 Checking on the Assumptions
6.8 Profile Analysis
6.9 Repeated Measures Designs
6.10 Growth Curves
6.11 Tests on a Subvector
Problems
Chapter 7: Tests on Covariance Matrices
7.1 Introduction
7.2 Testing a Specified Pattern for Σ
7.3 Tests Comparing Covariance Matrices
4. 7.4 Tests of Independence
Problems
Chapter 8: Discriminant Analysis: Description of Group
Separation
8.1 Introduction
8.2 The Discriminant Function for Two Groups
8.3 Relationship Between Two-Group Discriminant Analysis and Multiple
Regression
8.4 Discriminant Analysis for Several Groups
8.5 Standardized Discriminant Functions
8.6 Tests of Significance
8.7 Interpretation of Discriminant Functions
8.8 Scatterplots
8.9 Stepwise Selection of Variables
Problems
Chapter 9: Classification Analysis: Allocation of
Observations to Groups
9.1 Introduction
9.2 Classification Into Two Groups
9.3 Classification Into Several Groups
9.4 Estimating Misclassification Rates
9.5 Improved Estimates of Error Rates
9.6 Subset Selection
9.7 Nonparametric Procedures
Problems
Chapter 10: Multivariate Regression
10.1 Introduction
10.2 Multiple Regression: Fixed x’s
10.4 Multivariate Multiple Regression: Estimation
10.5 Multivariate Multiple Regression: Hypothesis Tests
10.6 Multivariate Multiple Regression: Prediction
10.7 Measures of Association Between the y’s and the x’s
10.8 Subset Selection
10.9 Multivariate Regression: Random x’s
Problems
Chapter 11: Canonical Correlation
11.1 Introduction
11.2 Canonical Correlations and Canonical Variates
11.3 Properties of Canonical Correlations
11.4 Tests of Significance
11.5 Interpretation
5. 11.6 Relationships of Canonical Correlation Analysis to Other Multivariate
Techniques
Problems
Chapter 12: Principal Component Analysis
12.1 Introduction
12.2 Geometric and Algebraic Bases of Principal Components
12.3 Principal Components and Perpendicular Regression
12.4 Plotting of Principal Components
12.5 Principal Components From the Correlation Matrix
12.6 Deciding How Many Components to Retain
12.7 Information in the Last Few Principal Components
12.8 Interpretation of Principal Components
12.9 Selection of Variables
Problems
Chapter 13: Exploratory Factor Analysis
13.1 Introduction
13.2 Orthogonal Factor Model
13.3 Estimation of Loadings and Communalities
13.4 Choosing the Number of Factors, M
13.5 Rotation
13.6 Factor Scores
13.7 Validity of the Factor Analysis Model
13.8 Relationship of Factor Analysis to Principal Component Analysis
Problems
Chapter 14: Confirmatory Factor Analysis
14.1 Introduction
14.2 Model Specification and Identification
14.3 Parameter Estimation and Model Assessment
14.4 Inference for Model Parameters
14.5 Factor Scores
Problems
Chapter 15: Cluster Analysis
15.1 Introduction
15.2 Measures of Similarity or Dissimilarity
15.3 Hierarchical Clustering
15.4 Nonhierarchical Methods
15.5 Choosing the Number of Clusters
15.6 Cluster Validity
15.7 Clustering Variables
Problems
Chapter 16: Graphical Procedures
6. 16.1 Multidimensional Scaling
16.2 Correspondence Analysis
16.3 Biplots
Problems
Appendix A: Tables
Appendix B: Answers and Hints to Problems
Appendix C: Data Sets and Sas Files
References
Index
METHODS OF MULTIVARIATE ANALYSIS
WILEY SERIES IN PROBABILITY AND STATISTICS
ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M.
Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford
Weisberg Editors Emeriti: Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many
topics of current research interest in both pure and applied statistics and probability theory. Written by
leading statisticians and institutions, the titles span both state-of-the-art developments in the field and
classical methods.
Reflecting the wide range of current research in statistics, the series encompasses applied,
methodological and theoretical statistics, ranging from applications and new techniques made possible by
advances in computerized practice to rigorous treatment of theoretical approaches.
This series provides essential and invaluable reading for all statisticians, whether in academia,
industry, government, or research.
† ABRAHAM and LEDOLTER · Statistical Methods for Forecasting
AGRESTI · Analysis of Ordinal Categorical Data, Second Edition
AGRESTI · An Introduction to Categorical Data Analysis, Second Edition
AGRESTI · Categorical Data Analysis, Second Edition
ALTMAN, GILL, and McDONALD · Numerical Issues in Statistical Computing for the Social Scientist
AMARATUNGA and CABRERA · Exploration and Analysis of DNA Microarray and Protein Array Data
AND L · Mathematics of Chance
ANDERSON · An Introduction to Multivariate Statistical Analysis, Third Edition
* ANDERSON · The Statistical Analysis of Time Series
ANDERSON, AUQUIER, HAUCK, OAKES, VANDAELE, and WEISBERG · Statistical Methods for
Comparative Studies
ANDERSON and LOYNES · The Teaching of Practical Statistics
ARMITAGE and DAVID (editors) · Advances in Biometry
ARNOLD, BALAKRISHNAN, and NAGARAJA · Records
* ARTHANARI and DODGE · Mathematical Programming in Statistics
* BAILEY · The Elements of Stochastic Processes with Applications to the Natural Sciences
BAJORSKI · Statistics for Imaging, Optics, and Photonics
BALAKRISHNAN and KOUTRAS · Runs and Scans with Applications
BALAKRISHNAN and NG · Precedence-Type Tests and Applications
BARNETT · Comparative Statistical Inference, Third Edition
BARNETT · Environmental Statistics
BARNETT and LEWIS · Outliers in Statistical Data, Third Edition
BARTHOLOMEW, KNOTT, and MOUSTAKI · Latent Variable Models and Factor Analysis: A Unified
Approach, Third Edition
7. BARTOSZYNSKI and NIEWIADOMSKA-BUGAJ · Probability and Statistical Inference, Second Edition
BASILEVSKY · Statistical Factor Analysis and Related Methods: Theory and Applications
BATES and WATTS · Nonlinear Regression Analysis and Its Applications
BECHHOFER, SANTNER, and GOLDSMAN · Design and Analysis of Experiments for Statistical
Selection, Screening, and Multiple Comparisons
BEIRLANT, GOEGEBEUR, SEGERS, TEUGELS, and DE WAAL · Statistics of Extremes: Theory and
Applications
BELSLEY · Conditioning Diagnostics: Collinearity and Weak Data in Regression
† BELSLEY, KUH, and WELSCH · Regression Diagnostics: Identifying Influential Data and Sources of
Collinearity
BENDAT and PIERSOL · Random Data: Analysis and Measurement Procedures,Fourth Edition
BERNARDO and SMITH · Bayesian Theory
BHAT and MILLER · Elements of Applied Stochastic Processes, Third Edition
BHATTACHARYA and WAYMIRE · Stochastic Processes with Applications
BIEMER, GROVES, LYBERG, MATHIOWETZ, and SUDMAN · Measurement Errors in Surveys
BILLINGSLEY · Convergence of Probability Measures, Second Edition
BILLINGSLEY · Probability and Measure, Anniversary Edition
BIRKES and DODGE · Alternative Methods of Regression
BISGAARD and KULAHCI · Time Series Analysis and Forecasting by Example
BISWAS, DATTA, FINE, and SEGAL · Statistical Advances in the Biomedical Sciences: Clinical Trials,
Epidemiology, Survival Analysis, and Bioinformatics
BLISCHKE and MURTHY (editors) · Case Studies in Reliability and Maintenance
BLISCHKE and MURTHY · Reliability: Modeling, Prediction, and Optimization
BLOOMFIELD · Fourier Analysis of Time Series: An Introduction, Second Edition
BOLLEN · Structural Equations with Latent Variables
BOLLEN and CURRAN · Latent Curve Models: A Structural Equation Perspective
BOROVKOV · Ergodicity and Stability of Stochastic Processes
BOSQ and BLANKE · Inference and Prediction in Large Dimensions
BOULEAU · Numerical Methods for Stochastic Processes
* BOX and TIAO · Bayesian Inference in Statistical Analysis
BOX · Improving Almost Anything, Revised Edition
* BOX and DRAPER · Evolutionary Operation: A Statistical Method for Process Improvement
BOX and DRAPER · Response Surfaces, Mixtures, and Ridge Analyses, Second Edition
BOX, HUNTER, and HUNTER · Statistics for Experimenters: Design, Innovation, and Discovery, Second
Editon
BOX, JENKINS, and REINSEL · Time Series Analysis: Forcasting and Control, Fourth Edition
BOX, LUCEÑO, and PANIAGUA-QUIÑONES · Statistical Control by Monitoring and Adjustment, Second
Edition
* BROWN and HOLLANDER · Statistics: A Biomedical Introduction
CAIROLI and DALANG · Sequential Stochastic Optimization
CASTILLO, HADI, BALAKRISHNAN, and SARABIA · Extreme Value and Related Models with
Applications in Engineering and Science
CHAN · Time Series: Applications to Finance with R and S-Plus®, Second Edition
CHARALAMBIDES · Combinatorial Methods in Discrete Distributions
CHATTERJEE and HADI · Regression Analysis by Example, Fourth Edition
CHATTERJEE and HADI · Sensitivity Analysis in Linear Regression
CHERNICK · Bootstrap Methods: A Guide for Practitioners and Researchers, Second Edition
CHERNICK and FRIIS · Introductory Biostatistics for the Health Sciences
CHILÈS and DELFINER · Geostatistics: Modeling Spatial Uncertainty, Second Edition
CHOW and LIU · Design and Analysis of Clinical Trials: Concepts and Methodologies,Second Edition
CLARKE · Linear Models: The Theory and Application of Analysis of Variance CLARKE and DISNEY ·
Probability and Random Processes: A First Course with Applications, Second Edition
* COCHRAN and COX · Experimental Designs, Second Edition
COLLINS and LANZA · Latent Class and Latent Transition Analysis: With Applications in the Social,
Behavioral, and Health Sciences
CONGDON · Applied Bayesian Modelling
CONGDON · Bayesian Models for Categorical Data
8. CONGDON · Bayesian Statistical Modelling, Second Edition
CONOVER · Practical Nonparametric Statistics, Third Edition
COOK · Regression Graphics
COOK and WEISBERG · An Introduction to Regression Graphics
COOK and WEISBERG · Applied Regression Including Computing and Graphics CORNELL · A Primer on
Experiments with Mixtures
CORNELL · Experiments with Mixtures, Designs, Models, and the Analysis of Mixture Data, Third
Edition
COX · A Handbook of Introductory Statistical Methods
CRESSIE · Statistics for Spatial Data, Revised Edition
CRESSIE and WIKLE · Statistics for Spatio-Temporal Data
CSÖRGÖ and HORVÁTH · Limit Theorems in Change Point Analysis
DAGPUNAR · Simulation and Monte Carlo: With Applications in Finance and MCMC
DANIEL · Applications of Statistics to Industrial Experimentation
DANIEL · Biostatistics: A Foundation for Analysis in the Health Sciences, Eighth Edition
* DANIEL · Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition
DASU and JOHNSON · Exploratory Data Mining and Data Cleaning DAVID and NAGARAJA · Order
Statistics, Third Edition
* DEGROOT, FIENBERG, and KADANE · Statistics and the Law
DEL CASTILLO · Statistical Process Adjustment for Quality Control
DEMARIS · Regression with Social Data: Modeling Continuous and Limited Response Variables
DEMIDENKO · Mixed Models: Theory and Applications
DENISON, HOLMES, MALLICK and SMITH · Bayesian Methods for Nonlinear Classification and
Regression
DETTE and STUDDEN · The Theory of Canonical Moments with Applications in
Statistics, Probability, and Analysis
DEY and MUKERJEE · Fractional Factorial Plans
DILLON and GOLDSTEIN · Multivariate Analysis: Methods and Applications
* DODGE and ROMIG · Sampling Inspection Tables, Second Edition
* DOOB · Stochastic Processes
DOWDY, WEARDEN, and CHILKO · Statistics for Research, Third Edition
DRAPER and SMITH · Applied Regression Analysis, Third Edition
DRYDEN and MARDIA · Statistical Shape Analysis
DUDEWICZ and MISHRA · Modern Mathematical Statistics
DUNN and CLARK · Basic Statistics: A Primer for the Biomedical Sciences, Fourth Edition
DUPUIS and ELLIS · A Weak Convergence Approach to the Theory of Large Deviations
EDLER and KITSOS · Recent Advances in Quantitative Methods in Cancer and Human Health Risk
Assessment
* ELANDT-JOHNSON and JOHNSON · Survival Models and Data Analysis
ENDERS · Applied Econometric Time Series, Third Edition
† ETHIER and KURTZ · Markov Processes: Characterization and Convergence
EVANS, HASTINGS, and PEACOCK · Statistical Distributions, Third Edition
EVERITT, LANDAU, LEESE, and STAHL · Cluster Analysis, Fifth Edition
FEDERER and KING · Variations on Split Plot and Split Block Experiment Designs
FELLER · An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, Revised;
Volume II, Second Edition
FITZMAURICE, LAIRD, and WARE · Applied Longitudinal Analysis, Second Edition
* FLEISS · The Design and Analysis of Clinical Experiments
FLEISS · Statistical Methods for Rates and Proportions, Third Edition
† FLEMING and HARRINGTON · Counting Processes and Survival Analysis
FUJIKOSHI, ULYANOV, and SHIMIZU · Multivariate Statistics: High-Dimensional and Large-Sample
Approximations
FULLER · Introduction to Statistical Time Series, Second Edition
† FULLER · Measurement Error Models
GALLANT · Nonlinear Statistical Models
GEISSER · Modes of Parametric Statistical Inference
9. GELMAN and MENG · Applied Bayesian Modeling and Causal Inference from Incomplete-Data
Perspectives
GEWEKE · Contemporary Bayesian Econometrics and Statistics
GHOSH, MUKHOPADHYAY, and SEN · Sequential Estimation
GIESBRECHT and GUMPERTZ · Planning, Construction, and Statistical Analysis of Comparative
Experiments
GIFI · Nonlinear Multivariate Analysis
GIVENS and HOETING · Computational Statistics
GLASSERMAN and YAO · Monotone Structure in Discrete-Event Systems
GNANADESIKAN · Methods for Statistical Data Analysis of Multivariate Observations, Second Edition
GOLDSTEIN · Multilevel Statistical Models, Fourth Edition
GOLDSTEIN and LEWIS · Assessment: Problems, Development, and Statistical Issues
GOLDSTEIN and WOOFF · Bayes Linear Statistics
GREENWOOD and NIKULIN · A Guide to Chi-Squared Testing
GROSS, SHORTLE, THOMPSON, and HARRIS · Fundamentals of Queueing Theory,Fourth Edition
GROSS, SHORTLE, THOMPSON, and HARRIS · Solutions Manual to Accompany Fundamentals of
Queueing Theory, Fourth Edition
* HAHN and SHAPIRO · Statistical Models in Engineering
HAHN and MEEKER · Statistical Intervals: A Guide for Practitioners
HALD · A History of Probability and Statistics and their Applications Before 1750
† HAMPEL · Robust Statistics: The Approach Based on Influence Functions
HARTUNG, KNAPP, and SINHA · Statistical Meta-Analysis with Applications
HEIBERGER · Computation for the Analysis of Designed Experiments
HEDAYAT and SINHA · Design and Inference in Finite Population Sampling
HEDEKER and GIBBONS · Longitudinal Data Analysis
HELLER · MACSYMA for Statisticians
HERITIER, CANTONI, COPT, and VICTORIA-FESER · Robust Methods in Biostatistics
HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 1: Introduction to
Experimental Design, Second Edition
HINKELMANN and KEMPTHORNE · Design and Analysis of Experiments, Volume 2: Advanced
Experimental Design
HINKELMANN (editor) · Design and Analysis of Experiments, Volume 3: Special
Designs and Applications
HOAGLIN, MOSTELLER, and TUKEY · Fundamentals of Exploratory Analysis of Variance
* HOAGLIN, MOSTELLER, and TUKEY · Exploring Data Tables, Trends and Shapes
* HOAGLIN, MOSTELLER, and TUKEY · Understanding Robust and Exploratory Data Analysis
HOCHBERG and TAMHANE · Multiple Comparison Procedures
HOCKING · Methods and Applications of Linear Models: Regression and the Analysis of Variance, Second
Edition
HOEL · Introduction to Mathematical Statistics, Fifth Edition
HOGG and KLUGMAN · Loss Distributions
HOLLANDER and WOLFE · Nonparametric Statistical Methods, Second Edition
HOSMER and LEMESHOW · Applied Logistic Regression, Second Edition
HOSMER, LEMESHOW, and MAY · Applied Survival Analysis: Regression Modeling of Time-to-Event
Data, Second Edition
HUBER · Data Analysis: What Can Be Learned From the Past 50 Years
HUBER · Robust Statistics
† HUBER and RONCHETTI · Robust Statistics, Second Edition
HUBERTY · Applied Discriminant Analysis, Second Edition
HUBERTY and OLEJNIK · Applied MANOVA and Discriminant Analysis, Second Edition
HUITEMA · The Analysis of Covariance and Alternatives: Statistical Methods for Experiments, Quasi-
Experiments, and Single-Case Studies, Second Edition
HUNT and KENNEDY · Financial Derivatives in Theory and Practice, Revised Edition
HURD and MIAMEE · Periodically Correlated Random Sequences: Spectral Theory
and Practice
HUSKOVA, BERAN, and DUPAC · Collected Works of Jaroslav Hajek—with Commentary
HUZURBAZAR · Flowgraph Models for Multistate Time-to-Event Data
10. JACKMAN · Bayesian Analysis for the Social Sciences
† JACKSON · A User’s Guide to Principle Components
JOHN · Statistical Methods in Engineering and Quality Assurance
JOHNSON · Multivariate Statistical Simulation
JOHNSON and BALAKRISHNAN · Advances in the Theory and Practice of Statistics: A Volume in Honor
of Samuel Kotz
JOHNSON, KEMP, and KOTZ · Univariate Discrete Distributions, Third Edition
JOHNSON and KOTZ (editors) · Leading Personalities in Statistical Sciences: From the Seventeenth
Century to the Present
JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 1, Second
Edition
JOHNSON, KOTZ, and BALAKRISHNAN · Continuous Univariate Distributions, Volume 2, Second
Edition
JOHNSON, KOTZ, and BALAKRISHNAN · Discrete Multivariate Distributions
JUDGE, GRIFFITHS, HILL, LÜTKEPOHL, and LEE · The Theory and Practice of Econometrics, Second
Edition
JUREK and MASON · Operator-Limit Distributions in Probability Theory
KADANE · Bayesian Methods and Ethics in a Clinical Trial Design
KADANE AND SCHUM · A Probabilistic Analysis of the Sacco and Vanzetti Evidence
KALBFLEISCH and PRENTICE · The Statistical Analysis of Failure Time Data, Second Edition
KARIYA and KURATA · Generalized Least Squares
KASS and VOS · Geometrical Foundations of Asymptotic Inference
† KAUFMAN and ROUSSEEUW · Finding Groups in Data: An Introduction to Cluster Analysis
KEDEM and FOKIANOS · Regression Models for Time Series Analysis
KENDALL, BARDEN, CARNE, and LE · Shape and Shape Theory
KHURI · Advanced Calculus with Applications in Statistics, Second Edition
KHURI, MATHEW, and SINHA · Statistical Tests for Mixed Linear Models
* KISH · Statistical Design for Research
KLEIBER and KOTZ · Statistical Size Distributions in Economics and Actuarial Sciences
KLEMELÄ · Smoothing of Multivariate Data: Density Estimation and Visualization
KLUGMAN, PANJER, and WILLMOT · Loss Models: From Data to Decisions, Third Edition
KLUGMAN, PANJER, and WILLMOT · Solutions Manual to Accompany Loss Models: From Data to
Decisions, Third Edition
KOSKI and NOBLE · Bayesian Networks: An Introduction
KOTZ, BALAKRISHNAN, and JOHNSON · Continuous Multivariate Distributions, Volume 1, Second
Edition
KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Volumes 1 to 9 with Index
KOTZ and JOHNSON (editors) · Encyclopedia of Statistical Sciences: Supplement Volume
KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 1
KOTZ, READ, and BANKS (editors) · Encyclopedia of Statistical Sciences: Update Volume 2
KOWALSKI and TU · Modern Applied U-Statistics
KRISHNAMOORTHY and MATHEW · Statistical Tolerance Regions: Theory, Applications, and
Computation
KROESE, TAIMRE, and BOTEV · Handbook of Monte Carlo Methods
KROONENBERG · Applied Multiway Data Analysis
KULINSKAYA, MORGENTHALER, and STAUDTE · Meta Analysis: A Guide to Calibrating and
Combining Statistical Evidence
KULKARNI and HARMAN · An Elementary Introduction to Statistical Learning Theory
KUROWICKA and COOKE · Uncertainty Analysis with High Dimensional Dependence
Modelling
KVAM and VIDAKOVIC · Nonparametric Statistics with Applications to Science and Engineering
LACHIN · Biostatistical Methods: The Assessment of Relative Risks, Second Edition
LAD · Operational Subjective Statistical Methods: A Mathematical, Philosophical, and Historical
Introduction
LAMPERTI · Probability: A Survey of the Mathematical Theory, Second Edition
LAWLESS · Statistical Models and Methods for Lifetime Data, Second Edition
LAWSON · Statistical Methods in Spatial Epidemiology, Second Edition
11. LE · Applied Categorical Data Analysis, Second Edition
LE · Applied Survival Analysis
LEE · Structural Equation Modeling: A Bayesian Approach
LEE and WANG · Statistical Methods for Survival Data Analysis, Third Edition
LEPAGE and BILLARD · Exploring the Limits of Bootstrap
LESSLER and KALSBEEK · Nonsampling Errors in Surveys
LEYLAND and GOLDSTEIN (editors) · Multilevel Modelling of Health Statistics
LIAO · Statistical Group Comparison
LIN · Introductory Stochastic Analysis for Finance and Insurance
LITTLE and RUBIN · Statistical Analysis with Missing Data, Second Edition
LLOYD · The Statistical Analysis of Categorical Data
LOWEN and TEICH · Fractal-Based Point Processes
MAGNUS and NEUDECKER · Matrix Differential Calculus with Applications in Statistics and
Econometrics, Revised Edition
MALLER and ZHOU · Survival Analysis with Long Term Survivors
MARCHETTE · Random Graphs for Statistical Pattern Recognition
MARDIA and JUPP · Directional Statistics
MARKOVICH · Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice
MARONNA, MARTIN and YOHAI · Robust Statistics: Theory and Methods
MASON, GUNST, and HESS · Statistical Design and Analysis of Experiments with Applications to
Engineering and Science, Second Edition
McCULLOCH, SEARLE, and NEUHAUS · Generalized, Linear, and Mixed Models,Second Edition
McFADDEN · Management of Data in Clinical Trials, Second Edition
* McLACHLAN · Discriminant Analysis and Statistical Pattern Recognition
McLACHLAN, DO, and AMBROISE · Analyzing Microarray Gene Expression Data
McLACHLAN and KRISHNAN · The EM Algorithm and Extensions, Second Edition
McLACHLAN and PEEL · Finite Mixture Models
McNEIL · Epidemiological Research Methods
MEEKER and ESCOBAR · Statistical Methods for Reliability Data
MEERSCHAERT and SCHEFFLER · Limit Distributions for Sums of Independent Random Vectors:
Heavy Tails in Theory and Practice
MENGERSEN, ROBERT, and TITTERINGTON · Mixtures: Estimation and Applications
MICKEY, DUNN, and CLARK · Applied Statistics: Analysis of Variance and
Regression, Third Edition
* MILLER · Survival Analysis, Second Edition
MONTGOMERY, JENNINGS, and KULAHCI · Introduction to Time Series Analysis and Forecasting
MONTGOMERY, PECK, and VINING · Introduction to Linear Regression Analysis,Fifth Edition
MORGENTHALER and TUKEY · Configural Polysampling: A Route to Practical Robustness
MUIRHEAD · Aspects of Multivariate Statistical Theory
MULLER and STOYAN · Comparison Methods for Stochastic Models and Risks
MURTHY, XIE, and JIANG · Weibull Models
MYERS, MONTGOMERY, and ANDERSON-COOK · Response Surface Methodology: Process and
Product Optimization Using Designed Experiments, Third Edition
MYERS, MONTGOMERY, VINING, and ROBINSON · Generalized Linear Models. With Applications in
Engineering and the Sciences, Second Edition
NATVIG · Multistate Systems Reliability Theory With Applications
† NELSON · Accelerated Testing, Statistical Models, Test Plans, and Data Analyses
† NELSON · Applied Life Data Analysis
NEWMAN · Biostatistical Methods in Epidemiology
NG, TAIN, and TANG · Dirichlet Theory: Theory, Methods and Applications
OKABE, BOOTS, SUGIHARA, and CHIU · Spatial Tesselations: Concepts and
Applications of Voronoi Diagrams, Second Edition
OLIVER and SMITH · Influence Diagrams, Belief Nets and Decision Analysis
PALTA · Quantitative Methods in Population Health: Extensions of Ordinary Regressions
PANJER · Operational Risk: Modeling and Analytics
PANKRATZ · Forecasting with Dynamic Regression Models
PANKRATZ · Forecasting with Univariate Box-Jenkins Models: Concepts and Cases
12. PARDOUX · Markov Processes and Applications: Algorithms, Networks, Genome and Finance
PARMIGIANI and INOUE · Decision Theory: Principles and Approaches
* PARZEN · Modern Probability Theory and Its Applications
PEÑA, TIAO, and TSAY · A Course in Time Series Analysis
PESARIN and SALMASO · Permutation Tests for Complex Data: Applications and Software
PIANTADOSI · Clinical Trials: A Methodologic Perspective, Second Edition
POURAHMADI · Foundations of Time Series Analysis and Prediction Theory
POWELL · Approximate Dynamic Programming: Solving the Curses of Dimensionality,Second Edition
POWELL and RYZHOV · Optimal Learning
PRESS · Subjective and Objective Bayesian Statistics, Second Edition
PRESS and TANUR · The Subjectivity of Scientists and the Bayesian Approach
PURI, VILAPLANA, and WERTZ · New Perspectives in Theoretical and Applied Statistics
† PUTERMAN · Markov Decision Processes: Discrete Stochastic Dynamic Programming
QIU · Image Processing and Jump Regression Analysis
* RAO · Linear Statistical Inference and Its Applications, Second Edition
RAO · Statistical Inference for Fractional Diffusion Processes
RAUSAND and HØYLAND · System Reliability Theory: Models, Statistical Methods, and
Applications, Second Edition
RAYNER, THAS, and BEST · Smooth Tests of Goodnes of Fit: Using R, Second Edition
RENCHER and SCHAALJE · Linear Models in Statistics, Second Edition
RENCHER and CHRISTENSEN · Methods of Multivariate Analysis, Third Edition
RENCHER · Multivariate Statistical Inference with Applications
RIGDON and BASU · Statistical Methods for the Reliability of Repairable Systems
* RIPLEY · Spatial Statistics
* RIPLEY · Stochastic Simulation
ROHATGI and SALEH · An Introduction to Probability and Statistics, Second Edition
ROLSKI, SCHMIDLI, SCHMIDT, and TEUGELS · Stochastic Processes for Insurance
and Finance
ROSENBERGER and LACHIN · Randomization in Clinical Trials: Theory and Practice
ROSSI, ALLENBY, and McCULLOCH · Bayesian Statistics and Marketing
† ROUSSEEUW and LEROY · Robust Regression and Outlier Detection
ROYSTON and SAUERBREI · Multivariate Model Building: A Pragmatic Approach to Regression Analysis
Based on Fractional Polynomials for Modeling Continuous Variables
* RUBIN · Multiple Imputation for Nonresponse in Surveys
RUBINSTEIN and KROESE · Simulation and the Monte Carlo Method, Second Edition
RUBINSTEIN and MELAMED · Modern Simulation and Modeling
RYAN · Modern Engineering Statistics
RYAN · Modern Experimental Design
RYAN · Modern Regression Methods, Second Edition
RYAN · Statistical Methods for Quality Improvement, Third Edition
SALEH · Theory of Preliminary Test and Stein-Type Estimation with Applications
SALTELLI, CHAN, and SCOTT (editors) · Sensitivity Analysis
SCHERER · Batch Effects and Noise in Microarray Experiments: Sources and Solutions
* SCHEFFE · The Analysis of Variance
SCHIMEK · Smoothing and Regression: Approaches, Computation, and Application SCHOTT · Matrix
Analysis for Statistics, Second Edition
SCHOUTENS · Levy Processes in Finance: Pricing Financial Derivatives
SCOTT · Multivariate Density Estimation: Theory, Practice, and Visualization
* SEARLE · Linear Models
† SEARLE · Linear Models for Unbalanced Data
† SEARLE · Matrix Algebra Useful for Statistics
† SEARLE, CASELLA, and McCULLOCH · Variance Components
SEARLE and WILLETT · Matrix Algebra for Applied Economics
SEBER · A Matrix Handbook For Statisticians
† SEBER · Multivariate Observations
SEBER and LEE · Linear Regression Analysis, Second Edition
† SEBER and WILD · Nonlinear Regression
13. SENNOTT · Stochastic Dynamic Programming and the Control of Queueing Systems
* SERFLING · Approximation Theorems of Mathematical Statistics
SHAFER and VOVK · Probability and Finance: It’s Only a Game!
SHERMAN · Spatial Statistics and Spatio-Temporal Data: Covariance Functions and Directional
Properties
SILVAPULLE and SEN · Constrained Statistical Inference: Inequality, Order, and Shape Restrictions
SINGPURWALLA · Reliability and Risk: A Bayesian Perspective
SMALL and McLEISH · Hilbert Space Methods in Probability and Statistical Inference SRIVASTAVA ·
Methods of Multivariate Statistics
STAPLETON · Linear Statistical Models, Second Edition
STAPLETON · Models for Probability and Statistical Inference: Theory and Applications
STAUDTE and SHEATHER · Robust Estimation and Testing
STOYAN · Counterexamples in Probability, Second Edition
STOYAN, KENDALL, and MECKE · Stochastic Geometry and Its Applications, Second Edition
STOYAN and STOYAN · Fractals, Random Shapes and Point Fields: Methods of Geometrical Statistics
STREET and BURGESS · The Construction of Optimal Stated Choice Experiments:
Theory and Methods
STYAN · The Collected Papers of T. W. Anderson: 1943–1985
SUTTON, ABRAMS, JONES, SHELDON, and SONG · Methods for Meta-Analysis in
Medical Research
TAKEZAWA · Introduction to Nonparametric Regression
TAMHANE · Statistical Analysis of Designed Experiments: Theory and Applications
TANAKA · Time Series Analysis: Nonstationary and Noninvertible Distribution Theory
THOMPSON · Empirical Model Building: Data, Models, and Reality, Second Edition
THOMPSON · Sampling, Third Edition
THOMPSON · Simulation: A Modeler’s Approach
THOMPSON and SEBER · Adaptive Sampling
THOMPSON, WILLIAMS, and FINDLAY · Models for Investors in Real World Markets
TIERNEY · LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic
Graphics
TSAY · Analysis of Financial Time Series, Third Edition
UPTON and FINGLETON · Spatial Data Analysis by Example, Volume II: Categorical and Directional
Data
† VAN BELLE · Statistical Rules of Thumb, Second Edition
VAN BELLE, FISHER, HEAGERTY, and LUMLEY · Biostatistics: A Methodology for the Health
Sciences, Second Edition
VESTRUP · The Theory of Measures and Integration
VIDAKOVIC · Statistical Modeling by Wavelets
VIERTL · Statistical Methods for Fuzzy Data
VINOD and REAGLE · Preparing for the Worst: Incorporating Downside Risk in Stock Market
Investments
WALLER and GOTWAY · Applied Spatial Statistics for Public Health Data
WEISBERG · Applied Linear Regression, Third Edition
WEISBERG · Bias and Causation: Models and Judgment for Valid Comparisons
WELSH · Aspects of Statistical Inference
WESTFALL and YOUNG · Resampling-Based Multiple Testing: Examples and Methods for p-Value
Adjustment
* WHITTAKER · Graphical Models in Applied Multivariate Statistics
WINKER · Optimization Heuristics in Economics: Applications of Threshold Accepting
WOODWORTH · Biostatistics: A Bayesian Introduction
WOOLSON and CLARKE · Statistical Methods for the Analysis of Biomedical Data,Second Edition
WU and HAMADA · Experiments: Planning, Analysis, and Parameter Design Optimization, Second
Edition
WU and ZHANG · Nonparametric Regression Methods for Longitudinal Data Analysis
YIN · Clinical Trial Design: Bayesian and Frequentist Adaptive Methods
YOUNG, VALERO-MORA, and FRIENDLY · Visual Statistics: Seeing Data with Dynamic Interactive
Graphics
15. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic formats. For more information about Wiley products, visit our web site
at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Rencher, Alvin C, 1934–
Methods of multivariate analysis / Alvin C. Rencher, William F. Christensen, Department of Statistics,
Brigham Young University, Provo, UT. — Third Edition.
pages cm. — (Wiley series in probability and statistics)
Includes index.
ISBN 978-0-470-17896-6 (hardback)
1. Multivariate analysis. I. Christensen, William F., 1970– II. Title.
QA278.R45 2012
519.5’35—dc23
2012009793
Preface
We have long been fascinated by the interplay of variables in multivariate data and by the challenge of
unraveling the effect of each variable. Our continuing objective in the third edition has been to present the
power and utility of multivariate analysis in a highly readable format.
Practitioners and researchers in all applied disciplines often measure several variables on each subject
or experimental unit. In some cases, it may be productive to isolate each variable in a system and study it
separately. Typically, however, the variables are not only correlated with each other, but each variable is
influenced by the other variables as it affects a test statistic or descriptive statistics. Thus, in many
instances, the variables are intertwined in such a way that when analyzed individually they yield little
information about the system. Using multivariate analysis, the variables can be examined simultaneously
in order to access the key features of the process that produced them. The multivariate approach enables
us to (1) explore the joint performance of the variables and (2) determine the effect of each variable in the
presence of the others.
Multivariate analysis provides both descriptive and inferential procedures—we can search for patterns
in the data or test hypotheses about patterns of a priori interest. With multivariate descriptive techniques,
we can peer beneath the tangled web of variables on the surface and extract the essence of the system.
Multivariate inferential procedures include hypothesis tests that (1) process any number of variables
without inflating the Type I error rate and (2) allow for whatever intercorrelations the variables possess. A
wide variety of multivariate descriptive and inferential procedures is readily accessible in statistical
software packages.
Our selection of topics for this volume reflects years of consulting with researchers in many fields of
inquiry. A brief overview of multivariate analysis is given in Chapter 1. Chapter 2 reviews the
fundamentals of matrix algebra. Chapters 3 and 4 give an introduction to sampling from multivariate
populations. Chapters 5, 6, 7, 10, and 11 extend univariate procedures with one dependent variable
(including t-tests, analysis of variance, tests on variances, multiple regression, and multiple correlation) to
analogous multivariate techniques involving several dependent variables. A review of each univariate
procedure is presented before covering the multivariate counterpart. These reviews may provide key
insights that the student has missed in previous courses.
Chapters 8, 9, 12, 13, 14, 15, and 16 describe multivariate techniques that are not extensions of
univariate procedures. In Chapters 8 and 9, we find functions of the variables that discriminate among
groups in the data. In Chapters 12, 13, and 14 (new in the third edition), we find functions of the variables
that reveal the basic dimensionality and characteristic patterns of the data, and we discuss procedures for
finding the underlying latent variables of a system. In Chapters 15 and 16, we give methods for searching
for groups in the data, and we provide plotting techniques that show relationships in a reduced
dimensionality for various kinds of data.
In Appendix A, tables are provided for many multivariate distributions and tests. These enable the
reader to conduct an exact test in many cases for which software packages provide only approximate tests.
Appendix B gives answers and hints for most of the problems in the book.
Appendix C describes an ftp site that contains (1) all data sets and (2) SAS command files for all
examples in the text. These command files can be adapted for use in working problems or in analyzing
data sets encountered in applications.
16. To illustrate multivariate applications, we have provided many examples and exercises based on 60 real
data sets from a wide variety of disciplines. A practitioner or consultant in multivariate analysis gains
insights and acumen from long experience in working with data. It is not expected that a student can
achieve this kind of seasoning in a one-semester class. However, the examples provide a good start, and
further development is gained by working problems with the data sets. For example, in Chapters 12–14,
the exercises cover several typical patterns in the covariance or correlation matrix. The student’s intuition
is expanded by associating these covariance patterns with the resulting configuration of the principal
components or factors.
Although this is a methods book, we have included a few derivations. For some readers, an occasional
proof provides insights obtainable in no other way. We hope that instructors who do not wish to use
proofs will not be deterred by their presence. The proofs can easily be disregarded when reading the book.
Our objective has been to make the book accessible to readers who have taken as few as two statistical
methods courses. The students in our classes in multivariate analysis include majors in statistics and
majors from other departments. With the applied researcher in mind, we have provided careful intuitive
explanations of the concepts and have included many insights typically available only in journal articles or
in the minds of practitioners.
Our overriding goal in preparation of this book has been clarity of exposition. We hope that students
and instructors alike will find this multivariate text more comfortable than most. In the final stages of
development of each edition, we asked our students for written reports on their initial reaction as they
read each day’s assignment. They made many comments that led to improvements in the manuscript. We
will be very grateful if readers will take the time to notify us of errors or of other suggestions they might
have for improvements.
We have tried to use standard mathematical and statistical notation as far as possible and to maintain
consistency of notation throughout the book. We have refrained from the use of abbreviations and
mnemonic devices. These save space when one is reading a book page by page, but they are annoying to
those using a book as a reference.
Equations are numbered sequentially throughout a chapter; for example, (3.75)indicates the 75th
numbered equation in Chapter 3. Tables and figures are also numbered sequentially throughout a chapter
in the form “Table 3.9” or “Figure 3.1.” Examples are not numbered sequentially; each example is
identified by the same number as the section in which it appears and is placed at the end of the section.
When citing references in the text, we have used the standard format involving the year of publication.
For a journal article, the year alone suffices, for example, Fisher (1936). But for books, we have included a
page number, as in Seber (1984, p. 216).
This is the first volume of a two-volume set on multivariate analysis. The second volume is
entitled Multivariate Statistical Inference and Applications by Alvin Rencher (Wiley, 1998). The two
volumes are not necessarily sequential; they can be read independently. Al adopted the two-volume
format in order to (1) provide broader coverage than would be possible in a single volume and (2) offer the
reader a choice of approach.
The second volume includes proofs of many techniques covered in the first 13 chapters of the present
volume and also introduces additional topics. The present volume includes many examples and problems
using actual data sets, and there are fewer algebraic problems. The second volume emphasizes derivations
of the results and contains fewer examples and problems with real data. The present volume has fewer
references to the literature than the other volume, which includes a careful review of the latest
developments and a more comprehensive bibliography. In this third edition, we have occasionally
referred the reader to “Rencher (1998)” to note that added coverage of a certain subject is available in the
second volume.
We are indebted to many individuals in the preparation of the first two editions of this book. Al’s initial
exposure to multivariate analysis came in courses taught by Rolf Bargmann at the University of Georgia
and D.R. Jensen at Virginia Tech. Additional impetus to probe the subtleties of this field came from
research conducted with Bruce Brown at BYU. William’s interest and training in multivariate statistics
were primarily influenced by Alvin Rencher and Yasuo Amemiya. We thank these mentors and colleagues
and also thank Bruce Brown, Deane Branstetter, Del Scott, Robert Smidt, and Ingram Olkin for reading
various versions of the manuscript and making valuable suggestions.
We are grateful to the following colleagues and students at BYU who helped with computations and
typing: Mitchell Tolland, Tawnia Newton, Marianne Matis Mohr, Gregg Littlefield, Suzanne Kimball,
Wendy Nielsen, Tiffany Nordgren, David Whiting, Karla Wasden, Rachel Jones, Lonette Stoddard,
Candace B. McNaughton, J. D. Williams, and Jonathan Christensen. We are grateful to the many readers
17. who have pointed out errors or made suggestions for improvements. The book is better for their caring
and their efforts.
THIRD EDITION
For the third edition, we have added a new chapter covering confirmatory factor analysis (Chapter 14). We
have added new sections discussing Kronecker products and vec notation (Section 2.12), dynamic
graphics (Section 3.5), transformations to normality (Section 4.5), classification trees (Section 9.7.4),
seemingly unrelated regressions (Section 10.4.6), and prediction for multivariate multiple regression
(Section 10.6). Additionally, we have updated and revised the graphics throughout the book and have
substantially expanded the section discussing estimation for multivariate multiple regression (Section
10.4).
Many other additions and changes have been made in an effort to broaden the book’s scope and
improve its exposition. Additional problems have been added to accompany the new material. The new ftp
site for the third edition can be found
at:ftp://ftp.wiley.com/public/sci_tech_med/multivariate_analysis_3e.
We thank Jonathan Christensen for the countless ways he contributed to the revision, from updated
graphics to technical and formatting assistance. We are grateful to Byron Gajewski, Scott Grimshaw, and
Jeremy Yorgason, who have provided useful feedback on sections of the book that are new to this edition.
We are also grateful to Scott Simpson and Storm Atwood for invaluable editing assistance and to the
students of William’s multivariate statistics course at Brigham Young University for pointing out errors
and making suggestions for improvements in this edition. We are deeply appreciative of the support
provided by the BYU Department of Statistics for the writing of the book.
Al dedicates this volume to his wife, LaRue, who has supplied much needed support and
encouragement. William dedicates this volume to his wife Mary Elizabeth, who has provided both
encouragement and careful feedback throughout the writing process.
ALVIN C. RENCHER & WILLIAM F. CHRISTENSEN
Department of Statistics
Brigham Young University
Provo, Utah
Acknowledgments
We wish to thank the authors, editors, and owners of copyrights for permission to reproduce the following
materials:
Figure 3.8 and Table 3.2, Kleiner and Hartigan (1981), Reprinted by
permission of Journal of the American Statistical Association
Table 3.3, Colvin et al. (1997), Reprinted by permission of T. S.
Colvin and D. L. Karlen
Table 3.4, Kramer and Jensen (1969a), Reprinted by permission
of Journal of Quality Technology
Table 3.5, Reaven and Miller (1979), Reprinted by permission
ofDiabetologia
Table 3.6, Timm (1975), Reprinted by permission of Elsevier North-
Holland Publishing Company
Table 3.7, Elston and Grizzle (1962), Reprinted by permission
ofBiometrics
Table 3.8, Frets (1921), Reprinted by permission of Genetica
Table 3.9, O’Sullivan and Mahan (1966), Reprinted by permission
of American Journal of Clinical Nutrition
Table 4.2, Royston (1983), Reprinted by permission of Applied
Statistics
18. Table 5.1, Beall (1945), Reprinted by permission ofPsychometrika
Table 5.2, Hummel and Sligo (1971), Reprinted by permission
ofPsychological Bulletin
Table 5.3, Kramer and Jensen (1969b), Reprinted by permission
of Journal of Quality Technology
Table 5.5, Lubischew (1962), Reprinted by permission ofBiometrics
Table 5.6, Travers (1939), Reprinted by permission ofPsychometrika
Table 5.7, Andrews and Herzberg (1985), Reprinted by permission of
Springer-Verlag
Table 5.8, Tintner (1946), Reprinted by permission of Journal of the
American Statistical Association
Table 5.9, Kramer (1972), Reprinted by permission of the author
Table 5.10, Cameron and Pauling (1978), Reprinted by permission of
National Academy of Sciences
Table 6.2, Andrews and Herzberg (1985), Reprinted by permission of
Springer-Verlag
Table 6.3, Rencher and Scott (1990), Reprinted by permission
ofCommunications in Statistics: Simulation and Computation
Table 6.6, Posten (1962), Reprinted by permission of the author
Table 6.8, Crowder and Hand (1990, pp. 21–29), Reprinted by
permission of Routledge Chapman and Hall
Table 6.12, Cochran and Cox (1957), Timm (1980), Reprinted by
permission of John Wiley and Sons and Elsevier North-Holland
Publishing Company
Table 6.14, Timm (1980), Reprinted by permission of Elsevier North-
Holland Publishing Company
Table 6.16, Potthoff and Roy (1964), Reprinted by permission of
Biometrika Trustees
Table 6.17, Baten, Tack, and Baeder (1958), Reprinted by permission
of Quality Progress
Table 6.18, Keuls et al. (1984), Reprinted by permission ofScientia
Horticulturae
Table 6.19, Burdick (1979), Reprinted by permission of the author
Table 6.20, Box (1950), Reprinted by permission of Biometrics
Table 6.21, Rao (1948), Reprinted by permission of Biometrika
Trustees
Table 6.22, Cameron and Pauling (1978), Reprinted by permission of
National Academy of Sciences
Table 6.23, Williams and Izenman (1981), Reprinted by permission
of Colorado State University
Table 6.24, Beauchamp and Hoel (1973), Reprinted by permission
of Journal of Statistical Computation and Simulation
19. Table 6.25, Box (1950), Reprinted by permission of Biometrics
Table 6.26, Grizzle and Allen (1969), Reprinted by permission
ofBiometrics
Table 6.27, Crepeau et. al (1985), Reprinted by permission
ofBiometrics
Table 6.28, Zerbe (1979a), Reprinted by permission of Journal of the
American Statistical Association
Table 6.29, Timm (1980), Reprinted by permission of Elsevier North-
Holland Publishing Company
Table 7.1, Siotani et al. (1963), Reprinted by permission of Institute
of Statistical Mathematics
Table 7.2, Reprinted by permission of R. J. Freund.
Table 8.1, Kramer and Jensen (1969a), Reprinted by permission
of Journal of Quality Technology
Table 8.3, Reprinted by permission of G. R. Bryce and R. M. Barker
Table 10.1, Box and Youle (1955), Reprinted by permission
ofBiometrics
Tables 12.2, 12.3, and 12.4, Jeffers (1967), Reprinted by permission
of Applied Statistics
Table 13.1, Brown et al. (1984), Reprinted by permission ofJournal of
Pascal, Ada, and Modula
Correlation matrix in Example 13.6, Brown, Strong, and Rencher
(1973), Reprinted by permission of The Journal of the Acoustical
Society of America
Table 15.1, Hartigan (1975a), Reprinted by permission of John Wiley
and Sons
Table 15.3, Dawkins (1989), Reprinted by permission of The
American Statistician
Table 15.7, Hand et al. (1994), Reprinted by permission of D. J. Hand
Table 15.12, Sokol and Rohlf (1981), Reprinted by permission of W.
H. Freeman and Co.
Table 15.13, Hand et al. (1994), Reprinted by permission of D. J.
Hand
Table 16.1, Kruskal and Wish (1978), Reprinted by permission of
Sage Publications
Tables 16.2 and 16.5, Hand et al. (1994), Reprinted by permission of
D. J. Hand
Table 16.13, Edwards and Kreiner (1983), Reprinted by permission
of Biometrika
Table 16.15, Hand et al. (1994), Reprinted by permission of D. J.
Hand
Table 16.16, Everitt (1987), Reprinted by permission of the author
20. Table 16.17, Andrews and Herzberg (1985), Reprinted by permission
of Springer Verlag
Table 16.18, Clausen (1998), Reprinted by permission of Sage
Publications
Table 16.19, Andrews and Herzberg (1985), Reprinted by permission
of Springer Verlag
Table A.1, Mulholland (1977), Reprinted by permission of Biometrika
Trustees
Table A.2, D’Agostino and Pearson (1973), Reprinted by permission
of Biometrika Trustees
Table A.3, D’Agostino and Tietjen (1971), Reprinted by permission of
Biometrika Trustees
Table A.4, D’Agostino (1972), Reprinted by permission of Biometrika
Trustees
Table A.5, Mardia (1970, 1974), Reprinted by permission of
Biometrika Trustees
Table A.6, Barnett and Lewis (1978), Reprinted by permission of
John Wiley and Sons
Table A.7, Kramer and Jensen (1969a), Reprinted by permission
of Journal of Quality Technology
Table A.8, Bailey (1977), Reprinted by permission of Journal of the
American Statistical Association
Table A.9, Wall (1967), Reprinted by permission of the author,
Albuquerque, NM
Table A.10, Pearson and Hartley (1972) and Pillai (1964, 1965),
Reprinted by permission of Biometrika Trustees
Table A.11, Schuurmann et al. (1975), Reprinted by permission
ofJournal of Statistical Computation and Simulation
Table A.12, Davis (1970a, b, 1980), Reprinted by permission of
Biometrika Trustees
Table A.13, Kleinbaum, Kupper, and Muller (1988), Reprinted by
permission of PWS-KENT Publishing Company
Table A.14, Lee et al. (1977), Reprinted by permission of Elsevier
North-Holland Publishing Company
Table A.15, Mathai and Katiyar (1979), Reprinted by permission of
Biometrika Trustees
21. CHAPTER 1
INTRODUCTION
1.1 WHY MULTIVARIATE
ANALYSIS?
Multivariate analysis consists of a collection of methods that can be used when several measurements are
made on each individual or object in one or more samples. We will refer to the measurements
as variables and to the individuals or objects as units(research units, sampling units, or experimental
units) or observations. In practice, multivariate data sets are common, although they are not always
analyzed as such. But the exclusive use of univariate procedures with such data is no longer excusable,
given the availability of multivariate techniques and inexpensive computing power to carry them out.
Historically, the bulk of applications of multivariate techniques have been in the behavioral and
biological sciences. However, interest in multivariate methods has now spread to numerous other fields of
investigation. For example, we have collaborated on multivariate problems with researchers in education,
chemistry, environmental science, physics, geology, medicine, engineering, law, business, literature,
religion, public broadcasting, nursing, mining, linguistics, biology, psychology, and many other
fields. Table 1.1 shows some examples of multivariate observations.
Table 1.1 Examples of Multivariate Data
Units Variables
1. Students Several exam scores in a single course
2. Students Grades in mathematics, history, music, art, physics
3. People Height, weight, percentage of body fat, resting heart
rate
4. Skulls Length, width, cranial capacity
5. Companies Expenditures for advertising, labor, raw materials
6. Manufactured
items
Various measurements to check on compliance with
specifications
7. Applicants for bank
loans
Income, education level, length of residence, savings
account, current debt load
8. Segments of
literature
Sentence length, frequency of usage of certain words
and style characteristics
9. Human hairs Composition of various elements
10. Birds Lengths of various bones
The reader will notice that in some cases all the variables are measured in the same scale (see 1 and 2
in Table 1.1). In other cases, measurements are in different scales (see 3 in Table 1.1). In a few techniques
such as profile analysis (Sections 5.9 and 6.8), the variables must be commensurate, that is, similar in
scale of measurement; however, most multivariate methods do not require this.
Ordinarily the variables are measured simultaneously on each sampling unit. Typically, these variables
are correlated. If this were not so, there would be little use for many of the techniques of multivariate
analysis. We need to untangle the overlapping information provided by correlated variables and peer
beneath the surface to see the underlying structure. Thus the goal of many multivariate approaches
22. is simplification. We seek to express “what is going on” in terms of a reduced set of dimensions. Such
multivariate techniques are exploratory; they essentially generate hypotheses rather than test them.
On the other hand, if our goal is a formal hypothesis test, we need a technique that will (1) allow several
variables to be tested and still preserve the significance level and (2) do this for any intercorrelation
structure of the variables. Many such tests are available.
As the two preceding paragraphs imply, multivariate analysis is concerned generally with two
areas, descriptive and inferential statistics. In the descriptive realm, we often obtain optimal linear
combinations of variables. The optimality criterion varies from one technique to another, depending on
the goal in each case. Although linear combinations may seem too simple to reveal the underlying
structure, we use them for two obvious reasons: (1) mathematical tractability (linear approximations are
used throughout all science for the same reason) and (2) they often perform well in practice. These linear
functions may also be useful as a follow-up to inferential procedures. When we have a statistically
significant test result that compares several groups, for example, we can find the linear combination (or
combinations) of variables that led to rejection. Then the contribution of each variable to these linear
combinations is of interest.
In the inferential area, many multivariate techniques are extensions of univariate procedures. In such
cases we review the univariate procedure before presenting the analogous multivariate approach.
Multivariate inference is especially useful in curbing the researcher’s natural tendency to read too
much into the data. Total control is provided for experimentwise error rate; that is, no matter how many
variables are tested simultaneously, the value of α (the significance level) remains at the level set by the
researcher.
Some authors warn against applying the common multivariate techniques to data for which the
measurement scale is not interval or ratio. It has been found, however, that many multivariate techniques
give reliable results when applied to ordinal data.
For many years the applications lagged behind the theory because the computations were beyond the
power of the available desk-top calculators. However, with modern computers, virtually any analysis one
desires, no matter how many variables or observations are involved, can be quickly and easily carried out.
Perhaps it is not premature to say that multivariate analysis has come of age.
1.2 PREREQUISITES
The mathematical prerequisite for reading this book is matrix algebra. Calculus is not used [with a brief
exception in equation (4.29)]. But the basic tools of matrix algebra are essential, and the presentation in
Chapter 2 is intended to be sufficiently complete so that the reader with no previous experience can
master matrix manipulation up to the level required in this book.
The statistical prerequisites are basic familiarity with the normal distribution, t-tests, confidence
intervals, multiple regression, and analysis of variance. These techniques are reviewed as each is extended
to the analogous multivariate procedure.
This is a multivariate methods text. Most of the results are given without proof. In a few cases proofs
are provided, but the major emphasis is on heuristic explanations. Our goal is an intuitive grasp of
multivariate analysis, in the same mode as other statistical methods courses. Some problems are algebraic
in nature, but the majority involve data sets to be analyzed.
1.3 OBJECTIVES
We have formulated three objectives that we hope this book will achieve for the reader. These objectives
are based on long experience teaching a course in multivariate methods, consulting on multivariate
problems with researchers in many fields, and guiding statistics graduate students as they consulted with
similar clients.
The first objective is to gain a thorough understanding of the details of various multivariate techniques,
their purposes, their assumptions, their limitations, and so on. Many of these techniques are related, yet
they differ in some essential ways. These similarities and differences are emphasized.
23. The second objective is to be able to select one or more appropriate techniques for a given multivariate
data set. Recognizing the essential nature of a multivariate data set is the first step in a meaningful
analysis. Basic types of multivariate data are introduced in Section 1.4.
The third objective is to be able to interpret the results of a computer analysis of a multivariate data set.
Reading the manual for a particular program package is not enough to make an intelligent appraisal of the
output. Achievement of the first objective and practice on data sets in the text should help achieve the
third objective.
1.4 BASIC TYPES OF DATA AND ANALYSIS
We will list four basic types of (continuous) multivariate data and then briefly describe some possible
analyses. Some writers would consider this an oversimplification and might prefer elaborate tree
diagrams of data structure. However, many data sets can fit into one of these categories, and the
simplicity of this structure makes it easier to remember. The four basic data types are as follows:
1. A single sample with several variables measured on each sampling unit
(subject or object).
2. A single sample with two sets of variables measured on each unit.
3. Two samples with several variables measured on each unit.
4. Three or more samples with several variables measured on each unit.
Each data type has extensions, and various combinations of the four are possible. A few examples of
analyses for each case will now be given:
1. A single sample with several variables measured on each sampling unit:
a. Test the hypothesis that the means of the variables have specified values.
b. Test the hypothesis that the variables are uncorrelated and have a common
variance.
c. Find a small set of linear combinations of the original variables that
summarizes most of the variation in the data (principal components).
d. Express the original variables as linear functions of a smaller set of
underlying variables that account for the original variables and their inter-
correlations (factor analysis).
2. A single sample with two sets of variables measured on each unit:
a. Determine the number, the size, and the nature of relationships between
the two sets of variables (canonical correlation). For example, we may wish to
relate a set of interest variables to a set of achievement variables. How much
overall correlation is there between these two sets?
b. Find a model to predict one set of variables from the other set (multivariate
multiple regression).
3. Two samples with several variables measured on each unit:
a. Compare the means of the variables across the two samples (Hotelling’s T2-
test).
b. Find a linear combination of the variables that best separates the two
samples (discriminant analysis).
c. Find a function of the variables that will accurately allocate the units into
the two groups (classification analysis).
4. Three or more samples with several variables measured on each unit:
24. a. Compare the means of the variables across the groups (multivariate
analysis of variance).
b. Extension of 3b to more than two groups.
c. Extension of 3c to more than two groups.
CHAPTER 2
MATRIX ALGEBRA
2.1 INTRODUCTION
This chapter introduces the basic elements of matrix algebra used in the remainder of this book. It is
essentially a review of the requisite matrix tools and is not intended to be a complete development.
However, it is sufficiently self-contained so that those with no previous exposure to the subject should
need no other reference. Anyone unfamiliar with matrix algebra should plan to work most of the problems
entailing numerical illustrations. It would also be helpful to explore some of the problems involving
general matrix manipulation.
With the exception of a few derivations that seemed instructive, most of the results are given without
proof. Some additional proofs are requested in the problems. For the remaining proofs, see any general
text on matrix theory or one of the specialized matrix texts oriented to statistics, such as Graybill (1969),
Searle (1982), or Harville (1997).
2.2 NOTATION AND BASIC
DEFINITIONS
2.2.1 Matrices, Vectors, and Scalars
A matrix is a rectangular or square array of numbers or variables arranged in rows and columns. We use
uppercase boldface letters to represent matrices. All entries in matrices will be real numbers or variables
representing real numbers. The elements of a matrix are displayed in brackets. For example, the ACT
score and GPA for three students can be conveniently listed in the following matrix:
(2.1)
The elements of A can also be variables, representing possible values of ACT and GPA for three students:
(2.2)
In this double-subscript notation for the elements of a matrix, the first subscript indicates the row; the
second identifies the column. The matrix A in (2.2) could also be expressed as
(2.3)
where aij is a general element.
With three rows and two columns, the matrix A in (2.1) or (2.2) is said to be 3 × 2. In general, if a
matrix A has n rows and p columns, it is said to be n × p. Alternatively, we say the size of A is n × p.
25. A vector is a matrix with a single column or row. The following could be the test scores of a student in a
course in multivariate analysis:
(2.4)
Variable elements in a vector can be identified by a single subscript:
(2.5)
We use lowercase boldface letters for column vectors. Row vectors are expressed as
where x′ indicates the transpose of x. The transpose operation is defined in Section 2.2.3.
Geometrically, a vector with p elements identifies a point in a p-dimensional space. The elements in the
vector are the coordinates of the point. In (2.35) in Section 2.3.3, we define the distance from the origin to
the point. In Section 3.13, we define the distance between two vectors. In some cases, we will be interested
in a directed line segment or arrow from the origin to the point.
A single real number is called a scalar, to distinguish it from a vector or matrix. Thus 2, −4, and 125 are
scalars. A variable representing a scalar will usually be denoted by a lowercase nonbolded letter, such
as a = 5. A product involving vectors and matrices may reduce to a matrix of size 1 × 1, which then
becomes a scalar.
2.2.2 Equality of Vectors and Matrices
Two matrices are equal if they are the same size and the elements in corresponding positions are equal.
Thus if A = (aij) and B = (bij), then A = B if aij = bij for all i and j. For example, let
Then A = C. But even though A and B have the same elements, A ≠ B because the two matrices are not
the same size. Likewise, A ≠ D because a23 ≠ d23. Thus two matrices of the same size are unequal if they
differ in a single position.
2.2.3 Transpose and Symmetric Matrices
The transpose of a matrix A, denoted by A′, is obtained from A by interchanging rows and columns. Thus
the columns of A′ are the rows of A, and the rows of A′ are the columns of A. The following examples
illustrate the transpose of a matrix or vector:
26. The transpose operation does not change a scalar, since it has only one row and one column.
If the transpose operator is applied twice to any matrix, the result is the original matrix:
(2.6)
If the transpose of a matrix is the same as the original matrix, the matrix is said to besymmetric; that
is, A is symmetric if A = A′. For example,
Clearly, all symmetric matrices are square.
2.2.4 Special Matrices
The diagonal of a p × p square matrix A consists of the elements a11, a22, …, app. For example, in the
matrix
the elements 5, 9, and 1 lie on the diagonal. If a matrix contains zeros in all off-diagonal positions, it is
said to be a diagonal matrix. An example of a diagonal matrix is
This matrix could also be denoted as
(2.7)
A diagonal matrix can be formed from any square matrix by replacing off-diagonal elements by 0’s.
This is denoted by diag(A). Thus for the above matrix A, we have
(2.8)
A diagonal matrix with a 1 in each diagonal position is called an identity matrix and is denoted by I. For
example, a 3 × 3 identity matrix is given by
(2.9)
An upper triangular matrix is a square matrix with zeros below the diagonal, for example,
27. (2.10)
A lower triangular matrix is defined similarly.
A vector of 1’s will be denoted by j:
(2.11)
A square matrix of 1’s is denoted by J. For example, a 3 × 3 matrix J is given by
(2.12)
Finally, we denote a vector of zeros by 0 and a matrix of zeros by O. For example,
(2.13)
2.3 OPERATIONS
2.3.1 Summation and Product Notation
For completeness, we review the standard mathematical notation for sums and products. The sum of a
sequence of numbers a1, a2, …, an is indicated by
If the n numbers are all the same, then a = a + a + … + a = na. The sum of all the numbers in an
array with double subscripts, such as
is indicated by
This is sometimes abbreviated to
The product of a sequence of numbers a1, a2, …, an is indicated by
If the n numbers are all equal, the product becomes a = (a)(a) … (a) = an.
28. 2.3.2 Addition of Matrices and Vectors
If two matrices (or two vectors) are the same size, their sum is found by adding corresponding elements,
that is, if A is n × p and B is n × p, then C = A + B is also n × pand is found as (cij) = (aij + bij). For
example,
Similarly, the difference between two matrices or two vectors of the same size is found by subtracting
corresponding elements. Thus C = A − B is found as (cij) = (aij − bij). For example,
If two matrices are identical, their difference is a zero matrix; that is, A = B implies A −B = O. For
example,
Matrix addition is commutative:
(2.14)
The transpose of the sum (difference) of two matrices is the sum (difference) of the transposes:
(2.15)
(2.16)
(2.17)
(2.18)
2.3.3 Multiplication of Matrices and Vectors
In order for the product AB to be defined, the number of columns in A must be the same as the number of
rows in B, in which case A and B are said to be conformable. Then the (ij)th element of C = AB is
(2.19)
Thus cij is the sum of products of the ith row of A and the jth column of B. We therefore multiply each row
of A by each column of B, and the size of AB consists of the number of rows of A and the number of
columns of B. Thus, if A is n × m and B is m× p, then C = AB is n × p. For example, if
then
29. Note that A is 4 × 3, B is 3 × 2, and AB is 4 × 2. In this case, AB is of a different size than either A or B.
If A and B are both n × n, then AB is also n × n. Clearly, A2 is defined only if A is square.
In some cases AB is defined, but BA is not defined. In the above example, BAcannot be found
because B is 3 × 2 and A is 4 × 3 and a row of B cannot be multiplied by a column of A.
Sometimes AB and BA are both defined but are different in size. For example, if A is 2 × 4 and B is 4 × 2,
then AB is 2 × 2 and BA is 4 × 4. If A and B are square and the same size, then AB and BA are both
defined. However,
(2.20)
except for a few special cases. For example, let
Then
Thus we must be careful to specify the order of multiplication. If we wish to multiply both sides of a
matrix equation by a matrix, we must multiply “on the left” or “on the right” and be consistent on both
sides of the equation.
Multiplication is distributive over addition or subtraction:
(2.21)
(2.22)
(2.23)
(2.24)
Note that, in general, because of (2.20),
(2.25)
Using the distributive law, we can expand products such as (A − B)(C − D) to obtain
(2.26)
The transpose of a product is the product of the transposes in reverse order:
(2.27)
Note that (2.27) holds as long as A and B are conformable. They need not be square.
Multiplication involving vectors follows the same rules as for matrices. Suppose A isn × p, a is p ×
1, b is p × 1, and c is n × 1. Then some possible products are Ab, c′A, a′b, b′a, and ab′. For example, let
Then
30. Note that Ab is a column vector, c′A is a row vector, c′Ab is a scalar, and a′b = b′a. The triple
product c′Ab was obtained as c′(Ab). The same result would be obtained if we multiplied in the
order (c′A)b:
This is true in general for a triple product:
(2.28)
Thus multiplication of three matrices can be defined in terms of the product of two matrices, since
(fortunately) it does not matter which two are multiplied first. Note thatA and B must be conformable for
multiplication, and B and C must be conformable. For example, if A is n × p, B is p × q, and C is q × m,
then both multiplications are possible and the product ABC is n × m.
We can sometimes factor a sum of triple products on both the right and left sides. For example,
(2.29)
As another illustration, let X be n × p and A be n × n. Then
(2.30)
If a and b are both n × 1, then
(2.31)
is a sum of products and is a scalar. On the other hand, ab′ is defined for any size a andb and is a matrix,
either rectangular or square:
31. (2.32)
Similarly,
(2.33)
(2.34)
Thus a′a is a sum of squares and aa′ is a square (symmetric) matrix. The products a′aand aa′ are
sometimes referred to as the dot product and matrix product, respectively. The square root of the sum of
squares of the elements of a is the distance from the origin to the point a and is also referred to as
the length of a:
(2.35)
As special cases of (2.33) and (2.34), note that if j is n × 1, then
(2.36)
where j and J were defined in (2.11) and (2.12). If a is n × 1 and A is n × p, then
(2.37)
(2.38)
Thus a′j is the sum of the elements in a, j′A contains the column sums of A, and Ajcontains the row sums
of A. In a′j, the vector j is n × 1; in j′A, the vector j is n × 1; and in Aj, the vector j is p × 1.
Since a′b is a scalar, it is equal to its transpose:
(2.39)
This allows us to write (a′b)2 in the form
(2.40)
From (2.18), (2.26), and (2.39) we obtain
(2.41)
Note that in analogous expressions with matrices, however, the two middle terms cannot be combined:
If a and x1, x2, …, xn are all p × 1 and A is p × p, we obtain the following factoring results as extensions
of (2.21) and (2.29):
32. (2.42)
(2.43)
(2.44)
(2.45)
We can express matrix multiplication in terms of row vectors and column vectors. Ifai′ is the ith row
of A and bj is the jth column of B, then the (ij)th element of AB is ai′bj. For example, if A has three rows
and B has two columns,
then the product AB can be written as
(2.46)
This can be expressed in terms of the rows of A:
(2.47)
Note that the first column of AB in (2.46) is
and likewise the second column is Ab2. Thus AB can be written in the form
This result holds in general:
(2.48)
To further illustrate matrix multiplication in terms of rows and columns,
let matrix, x be a p × 1 vector, and S be a p × p matrix. Then
(2.49)
(2.50)
Any matrix can be multiplied by its transpose. If A is n × p, then
Similarly,
From (2.6) and (2.27), it is clear that both AA′ and A′A are symmetric.
In the above illustration for AB in terms of row and column vectors, the rows of Awere denoted by ai′
and the columns of B by bj. If both rows and columns of a matrix Aare under discussion, as in AA′
and A′A, we will use the notation ai′ for rows and a(j)for columns. To illustrate, if A is 3 × 4, we have
33. where, for example,
With this notation for rows and columns of A, we can express the elements of A′A or of AA′ as
products of the rows of A or of the columns of A. Thus if we write A in terms of its rows as
then we have
(2.51)
(2.52)
Similarly, if we express A in terms of columns as
then
(2.53)
34. (2.54)
Let A = (aij) be an n × n matrix and D be a diagonal matrix, D = diag(d1, d2, …, dn). Then, in the
product DA, the ith row of A is multiplied by di, and in AD, the jth column of A is multiplied by dj. For
example, if n = 3, we have
(2.55)
(2.56)
(2.57)
In the special case where the diagonal matrix is the identity, we have
(2.58)
If A is rectangular, (2.58) still holds, but the two identities are of different sizes.
The product of a scalar and a matrix is obtained by multiplying each element of the matrix by the
scalar:
(2.59)
For example,
(2.60)
35. (2.61)
Since caij = aijc, the product of a scalar and a matrix is commutative:
(2.62)
Multiplication of vectors or matrices by scalars permits the use of linear combinations, such as
If A is a symmetric matrix and x and y are vectors, the product
(2.63)
is called a quadratic form, while
(2.64)
is called a bilinear form. Either of these is, of course, a scalar and can be treated as such. Expressions such
as are permissible (assuming A is positive definite; see Section 2.7).
2.4 PARTITIONED MATRICES
It is sometimes convenient to partition a matrix into submatrices. For example, a partitioning of a
matrix A into four submatrices could be indicated symbolically as follows:
For example, a 4 × 5 matrix A could be partitioned as
where
If two matrices A and B are conformable and A and B are partitioned so that the submatrices are
appropriately conformable, then the product AB can be found by following the usual row-by-column
pattern of multiplication on the submatrices as if they were single elements; for example,
(2.65)
36. It can be seen that this formulation is equivalent to the usual row-by-column definition of matrix
multiplication. For example, the (1, 1) element of AB is the product of the first row of A and the first
column of B. In the (1, 1) element of A11B11 we have the sum of products of part of the first row of A and
part of the first column of B. In the (1, 1) element of A12B21 we have the sum of products of the rest of the
first row of Aand the remainder of the first column of B.
Multiplication of a matrix and a vector can also be carried out in partitioned form. For example,
(2.66)
where the partitioning of the columns of A corresponds to the partitioning of the elements of b. Note that
the partitioning of A into two sets of columns is indicated by a comma, A = (A1, A2).
The partitioned multiplication in (2.66) can be extended to individual columns of Aand individual
elements of b:
(2.67)
Thus Ab is expressible as a linear combination of the columns of A, the coefficients being elements of b.
For example, let
Then
Using a linear combination of columns of A as in (2.67), we obtain
We note that if A is partitioned as in (2.66), A = (A2, A2), the transpose is not equal to (A′1, A′2), but
rather
(2.68)
2.5 RANK
Before defining the rank of a matrix, we first introduce the notion of linear independence and
dependence. A set of vectors a1, a2, …, an is said to be linearly dependent if constants c1, c2, …, cn (not all
zero) can be found such that
(2.69)
If no constants c1, c2, …, cn can be found satisfying (2.69), the set of vectors is said to be linearly
independent.
37. If (2.69) holds, then at least one of the vectors ai can be expressed as a linear combination of the other
vectors in the set. Thus linear dependence of a set of vectors implies redundancy in the set. Among
linearly independent vectors there is no redundancy of this type.
The rank of any square or rectangular matrix A is defined as
rank(A) = number of linearly independent rows of A
= number of linearly independent columns of A.
It can be shown that the number of linearly independent rows of a matrix is always equal to the number of
linearly independent columns.
If A is n × p, the maximum possible rank of A is the smaller of n and p, in which caseA is said to be
of full rank (sometimes said full row rank or full column rank). For example,
has rank 2 because the two rows are linearly independent (neither row is a multiple of the other).
However, even though A is full rank, the columns are linearly dependent because rank 2 implies there are
only two linearly independent columns. Thus, by(2.69), there exist constants c1, c2, and c3 such that
(2.70)
By (2.67), we can write (2.70) in the form
or
(2.71)
A solution vector to (2.70) or (2.71) is given by any multiple of c = (14, −11, −12)′. Hence we have the
interesting result that a product of a matrix A and a vector c is equal to 0, even though A ≠ O and c ≠ 0.
This is a direct consequence of the linear dependence of the column vectors of A.
Another consequence of the linear dependence of rows or columns of a matrix is the possibility of
expressions such as AB = CB, where A ≠ C. For example, let
Then
All three of the matrices A, B, and C are full rank; but being rectangular, they have a rank deficiency in
either rows or columns, which permits us to construct AB = CB withA ≠ C. Thus in a matrix equation, we
cannot, in general, cancel matrices from both sides of the equation.
There are two exceptions to this rule. One involves a nonsingular matrix to be defined in Section 2.6.
The other special case occurs when the expression holds for all possible values of the matrix common to
both sides of the equation. For example,
(2.72)
To see this, let × = (1, 0, …, 0)′. Then the first column of A equals the first column of B. Now let × = (0, 1,
0, …, 0)′, and the second column of A equals the second column of B. Continuing in this fashion, we
obtain A = B.
Suppose a rectangular matrix A is n × p of rank p, where p < n. We typically shorten this statement to
“A is nx p of rank p < n.”
38. 2.6 INVERSE
If a matrix A is square and of full rank, then A is said to be nonsingular, and A has a unique inverse,
denoted by A−1, with the property that
(2.73)
For example, let
Then
If A is square and of less than full rank, then an inverse does not exist, and A is said to be singular.
Note that rectangular matrices do not have inverses as in (2.73), even if they are full rank.
If A and B are the same size and nonsingular, then the inverse of their product is the product of their
inverses in reverse order,
(2.74)
Note that (2.74) holds only for nonsingular matrices. Thus, for example, if A is n × p of rank p < n,
then A′A has an inverse, but (A′A)−1 is not equal to A−1(A′)−1 because A is rectangular and does not have
an inverse.
If a matrix is nonsingular, it can be canceled from both sides of an equation, provided it appears on the
left (right) on both sides. For example, if B is nonsingular, then
since we can multiply on the right by B−1 to obtain
Otherwise, if A, B, and C are rectangular or square and singular, it is easy to constructAB = CB,
with A ≠ C, as illustrated near the end of Section 2.5.
The inverse of the transpose of a nonsingular matrix is given by the transpose of the inverse:
(2.75)
If the symmetric nonsingular matrix A is partitioned in the form
then the inverse is given by
(2.76)
where . A nonsingular matrix of the form B + cc′, where B is nonsingular, has as
its inverse
(2.77)
39. 2.7 POSITIVE DEFINITE MATRICES
The symmetric matrix A is said to be positive definite if x′Ax > 0 for all possible vectors x (except x = 0).
Similarly, A is positive semidefinite if x′Ax ≥ 0 for all x ≠ 0. [A quadratic form x′Ax was defined
in (2.63).]
The diagonal elements aii of a positive definite matrix are positive. To see this, let x′ = (0, …, 0, 1, 0, …,
0) with a 1 in the ith position. Then x′Ax = aii > 0. Similarly, for a positive semidefinite matrix A, aii ≥ 0
for all i.
One way to obtain a positive definite matrix is as follows:
(2.78)
This is easily shown:
where z = Bx. Thus, which is positive (Bx cannot be 0 unless x = 0, because B is full
rank). If B is less than full rank, then by a similar argument, B′B is positive semidefinite.
Note that A = B′B is analogous to a = b2 in real numbers, where the square of any number (including
negative numbers) is positive.
In another analogy to positive real numbers, a positive definite matrix can be factored into a “square
root” in two ways. We give one method below in (2.79) and the other in Section 2.11.8.
A positive definite matrix A can be factored into
(2.79)
where T is a nonsingular upper triangular matrix. One way to obtain T is the Cholesky decomposition,
which can be carried out in the following steps.
Let A = (aij) and T = (tij) be n × n. Then the elements of T are found as follows:
For example, let
Then by the Cholesky method, we obtain
40. 2.8 DETERMINANTS
The determinant of an n × n matrix A is defined as the sum of all n! possible products of n elements such
that
1. Each product contains one element from every row and every column, and
2. The factors in each product are written so that the column subscripts
appear in order of magnitude and each product is then preceded by a plus or
minus sign according to whether the number of inversions in the row
subscripts is even or odd.
An inversion occurs whenever a larger number precedes a smaller one. The symbol n! is defined as
(2.80)
The determinant of A is a scalar denoted by |A| or by det(A). The preceding definition is not useful in
evaluating determinants, except in the case of 2 × 2 or 3 × 3 matrices. For larger matrices, other methods
are available for manual computation, but determinants are typically evaluated by computer. For a 2 × 2
matrix, the determinant is found by
(2.81)
For a 3 × 3 matrix, the determinant is given by
(2.82)
This can be found by the following scheme. The three positive terms are obtained by
and the three negative terms by
The determinant of a diagonal matrix is the product of the diagonal elements; that is, if D = diag(d1, d2,
…, dn), then
(2.83)
As a special case of (2.83), suppose all diagonal elements are equal, say,
Then
(2.84)
The extension of (2.84) to any square matrix A is
(2.85)
Because the determinant is a scalar, we can carry out operations such as
provided that |A| > 0 for |A|1/2 and that |A| ≠ 0 for 1/|A|.
If the square matrix A is singular, its determinant is 0:
(2.86)
41. If A is near singular, then there exists a linear combination of the columns that is close to 0, and |A| is
also close to 0. If A is nonsingular, its determinant is nonzero:
(2.87)
If A is positive definite, its determinant is positive:
(2.88)
If A and B are square and the same size, the determinant of the product is the product of the
determinants:
(2.89)
For example, let
Then
The determinant of the transpose of a matrix is the same as the determinant of the matrix, and the
determinant of the the inverse of a matrix is the reciprocal of the determinant:
(2.90)
(2.91)
If a partitioned matrix has the form
where A11 and A22 are square, but not necessarily the same size, then
(2.92)
For a general partitioned matrix,
where A11 and A22 are square and nonsingular (not necessarily the same size), the determinant is given by
either of the following two expressions:
(2.93)
(2.94)
Note the analogy of (2.93) and (2.94) to the case of the determinant of a 2 × 2 matrix as given by (2.81):
If B is nonsingular and c is a vector, then
(2.95)
2.9 TRACE
42. A simple function of an n × n matrix A is the trace, denoted by tr(A) and defined as the sum of the
diagonal elements of A; that is, . The trace is, of course, a scalar. For example, suppose
Then
The trace of the sum of two square matrices is the sum of the traces of the two matrices:
(2.96)
An important result for the product of two matrices is
(2.97)
This result holds for any matrices A and B where AB and BA are both defined. It is not necessary
that A and B be square or that AB equal BA. For example, let
Then
From (2.52) and (2.54), we obtain
(2.98)
where the aij′s are elements of the n × p matrix A.
2.10 ORTHOGONAL VECTORS AND
MATRICES
Two vectors a and b of the same size are said to be orthogonal if
(2.99)
Geometrically, orthogonal vectors are perpendicular [see (3.14) and the comments following (3.14)].
If a′a = 1, the vector a is said to be normalized. The vector a can always be normalized by dividing by its
length, . Thus
(2.100)
is normalized so that c′c = 1.
A matrix C = (c1, c2, …, cp) whose columns are normalized and mutually orthogonal is called
an orthogonal matrix. Since the elements of C′C are products of columns of C[see (2.54)], which have the
properties for all i and for all i = j, we have
(2.101)
If C satisfies (2.101), it necessarily follows that
(2.102)
from which we see that the rows of C are also normalized and mutually orthogonal. It is clear
from (2.101) and (2.102) that C−1 = C′ for an orthogonal matrix C.
We illustrate the creation of an orthogonal matrix by starting with
43. whose columns are mutually orthogonal. To normalize the three columns, we divide by the respective
lengths, , , and , to obtain
Note that the rows also became normalized and mutually orthogonal so that C satisfies
both (2.101) and (2.102).
Multiplication by an orthogonal matrix has the effect of rotating axes; that is, if a point x is transformed
to z = Cx, where C is orthogonal, then
(2.103)
and the distance from the origin to z is the same as the distance to x.
2.11 EIGENVALUES AND
EIGENVECTORS
2.11.1 Definition
For every square matrix A, a scalar λ and a nonzero vector × can be found such that
(2.104)
In (2.104), λ is called an eigenvalue of A and x is an eigenvector. To find λ and x, we write (2.104) as
(2.105)
If |A − λI| ≠ 0, then (A − λI) has an inverse and x = 0 is the only solution. Hence, in order to obtain
nontrivial solutions, we set |A − λI| = 0 to find values of λ that can be substituted into (2.105) to find
corresponding values of x. Alternatively, (2.69) and(2.71) require that the columns of A − λI be linearly
dependent. Thus in (A − λI)x = 0, the matrix A − λI must be singular in order to find a solution
vector x that is not 0.
The equation |A − λI| = 0 is called the characteristic equation. If A is n × n, the characteristic equation
will have n roots; that is, A will have n eigenvalues λ1, λ2, …, λn. The λ’s will not necessarily all be distinct
or all nonzero. However, if A arises from computations on real (continuous) data and is nonsingular, the
λ’s will all be distinct (with probability 1). After finding λ1, λ2, …, λn, the accompanying eigenvectors x1, x2,
…, xn can be found using (2.105).
If we multiply both sides of (2.105) by a scalar k and note by (2.62) that k and A − λIcommute, we
obtain
(2.106)
Thus if x is an eigenvector of A, kx is also an eigenvector, and eigenvectors are unique only up to
multiplication by a scalar. Hence we can adjust the length of x, but the direction from the origin is unique;
that is, the relative values of (ratios of) the components of x = (x1, x2, …, xn)′ are unique. Typically, the
eigenvector x is scaled so that x′x = 1.
To illustrate, we will find the eigenvalues and eigenvectors for the matrix
The characteristic equation is
44. from which λ1 = 3 and λ2 = 2. To find the eigenvector corresponding to λ1 = 3, we use(2.105),
As expected, either equation is redundant in the presence of the other, and there remains a single
equation with two unknowns, x1 = x2. The solution vector can be written with an arbitrary constant,
If c is set equal to to normalize the eigenvector, we obtain
Similarly, corresponding to λ2 = 2, we have
2.11.2 I + A and I − A
If λ is an eigenvalue of A and x is the corresponding eigenvector, then 1 + λ is an eigenvalue of I + A and 1
− λ is an eigenvalue of I − A. In either case, x is the corresponding eigenvector.
We demonstrate this for I + A:
2.11.3 tr(A) and |A|
For any square matrix A with eigenvalues λ1, λ2, …, λn, we have
(2.107)
(2.108)
Note that by the definition in Section 2.9, tr(A) is also equal to , but aii ≠ λi.
We illustrate (2.107) and (2.108) using the matrix
from the illustration in Section 2.11.1, for which λ1 = 3 and λ2 = 2. Using (2.107), we obtain
and from (2.108), we have
By definition, we obtain
45. 2.11.4 Positive Definite and Semidefinite
Matrices
The eigenvalues and eigenvectors of positive definite and positive semidefinite matrices have the
following properties:
1. The eigenvalues of a positive definite matrix are all positive.
2. The eigenvalues of a positive semidefinite matrix are positive or zero, with
the number of positive eigenvalues equal to the rank of the matrix.
It is customary to list the eigenvalues of a positive definite matrix in descending order: λ1 > λ2 > … > λp.
The eigenvectors x1, x2, …, xn are listed in the same order; x1corresponds to λ1, x2 corresponds to λ2, and
so on.
The following result, known as the Perron–Frobenius theorem, is of interest in Chapter 12: If all
elements of the positive definite matrix A are positive, then all elements of the first eigenvector are
positive. (The first eigenvector is the one associated with the first eigenvalue, λ1).
2.11.5 The Product AB
If A and B are square and the same size, the eigenvalues of AB are the same as those ofBA, although the
eigenvectors are usually different. This result also holds if AB andBA are both square but of different
sizes, as when A is n × p and B is p × n. (In this case, the nonzero eigenvalues of AB and BA will be the
same.)
2.11.6 Symmetric Matrix
The eigenvectors of an n × n symmetric matrix A are mutually orthogonal. It follows that if
the n eigenvectors of A are normalized and inserted as columns of a matrix C = (x1, x2, …, xn), then C is
orthogonal.
2.11.7 Spectral Decomposition
It was noted in Section 2.11.6 that if the matrix C = (x1, x2, …, xn) contains the normalized eigenvectors of
an n × n symmetric matrix A, then C is orthogonal. Therefore, by (2.102), I = CC′, which we can multiply
by A to obtain
We now substitute C = (x1, x2, …, xn):
(2.109)
where
(2.110)
The expression A = CDC′ in (2.109) for a symmetric matrix A in terms of its eigenvalues and eigenvectors
is known as the spectral decomposition of A.
46. Since C is orthogonal and C′C = CC′ = I, we can multiply (2.109) on the left by C′ and on the right
by C to obtain
(2.111)
Thus a symmetric matrix A can be diagonalized by an orthogonal matrix containing normalized
eigenvectors of A, and by (2.110) the resulting diagonal matrix contains eigenvalues of A.
2.11.8 Square Root Matrix
If A is positive definite, the spectral decomposition of A in (2.109) can be modified by taking the square
roots of the eigenvalues to produce a square root matrix,
(2.112)
where
(2.113)
The square root matrix A1/2 is symmetric and serves as the square root of A:
(2.114)
2.11.9 Square and Inverse Matrices
Other functions of A have spectral decompositions analogous to (2.112). Two of these are the square and
inverse of A. If the square matrix A has eigenvalues λ1, λ2, …, λnand accompanying eigenvectors x1, x2,
…, xn, then A2 has eigenvalues and eigenvectors x1, x2, …, xn. If A is nonsingular,
then A−1 has eigenvalues 1/λ1, 1/λ2, …, 1/λn and eigenvectors x1, x2, …, xn. If A is also symmetric, then
(2.115)
(2.116)
where C = (x1, x2, …, xn) has as columns the normalized eigenvectors of A (and of A2and A−1), D2 = diag
, and D−1 = diag(1/λ1, 1/λ2, …, 1/λn).
2.11.10 Singular Value Decomposition
In Section 2.11.7 we expressed a symmetric matrix in terms of its eigenvalues and eigenvectors in the
spectral decomposition. In a similar manner, we can express any (real) matrix A in terms of eigenvalues
and eigenvectors of A′A and AA′. Let A be an n× p matrix of rank k. Then the singular value
decomposition of A can be expressed as
(2.117)
where U is n × k, D is k × k, and V is p × k. The diagonal elements of the non-singular diagonal
matrix D = diag(λ1, λ2, …, λk) are the positive square roots of , which are the nonzero
eigenvalues of A′A or of AA′. The values λ1, λ2, …, λk are called the singular values of A. The k columns
of U are the normalized eigenvectors of AA′corresponding to the eigenvalues .
The k columns of V are the normalized eigenvectors of A′A corresponding to the
eigenvalues . Since the columns of U and V are (normalized) eigenvectors of symmetric
matrices, they are mutually orthogonal (see Section 2.11.6), and we have U′U = V′V = I.
47. 2.12 KRONECKER AND VEC
NOTATION
When manipulating matrices with block structure, it is often convenient to use Kronecker and vec
notation. The Kronecker product of an m × n matrix A with a p × qmatrix B is an mp × nq matrix that is
denoted A B and is defined as
(2.118)
For example, because of the block structure in the matrix
we can write
where
If A is an m × n matrix with columns a1, …, an, then we can refer to the elements of Ain vector (or
“vec”) form using
(2.119)
so that vec A is an mn × 1 vector.
If A is an m × m symmetric matrix, then the m2 elements in vec A will include m(m − 1)/2 pairs of
identical elements (since aij = aji). In such settings it is often useful to denote the vector half (or “vech”) of
a symmetric matrix in order to include only the unique elements of the matrix. If we separate elements
from different columns using semicolons, we can define the “vech” operator with
(2.120)
so that vech A is an m(m + 1)/2 × 1 vector. Note that vech A can be obtained by finding vec A and then
eliminating the m(m − 1)/2 elements above the diagonal of A.