High Breakdown Estimation of Multivariate Location and Scale With Missing Observations

Cheng, Tsung-Chi; Victoria-Feser, Maria-Pia
Published in British Journal of Mathematical and Statistical Psychology. 2002, vol. 55, no. 2, p. 317-335
Abstract: In this paper, we consider the problem of outliers in incomplete multivariate data when the aim is to estimate a measure of mean and covariance, as is the case, for example, in factor analysis. In such a situation, the ER algorithm of Little and Smith (1987), which combines the EM algorithm for missing data with a robust estimation step based on an M-estimator, could be used. However, the ER algorithm as originally proposed can fail to be robust in some cases, especially in high dimensions. We propose two alternatives to avoid this problem. One is to combine a small modification of the ER algorithm with a so-called high breakdown estimator as the starting point for the iterative procedure; the other is to base the estimation step of the ER algorithm on a high breakdown estimator. Among the high breakdown estimators that are built to keep their robustness properties even when the number of variables is relatively large, we consider the minimum covariance determinant (MCD) estimator and the t-biweight S-estimator. Simulated and real data are used to compare and illustrate the different procedures.
Keywords: Detection of outliers; Influence function; EM algorithm; ER algorithm; Forward search algorithm; High breakdown estimator; Minimum covariance determinant estimator; Missing values; T-biweight S-estimator; M-estimator; Robust statistics; Factor analysis
CHENG, Tsung-Chi, VICTORIA-FESER, Maria-Pia. High Breakdown Estimation of Multivariate Location and Scale With Missing Observations. In: British Journal of Mathematical and Statistical Psychology, 2002, vol. 55, n° 2, p. 317-335.
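
The abstract describes the ER idea only in words: an EM-type step that fills in missing values, alternated with a robust, M-estimator-based update of mean and covariance. The following NumPy sketch illustrates that general scheme under several assumptions that are not taken from the paper: the function name `er_sketch`, the Huber-type weight with cutoff `sqrt(q) + 1` (where `q` is the number of observed entries in a row), the simplified covariance normalisation, and the default starting values (coordinatewise median and diagonal variances) are all illustrative choices. It also assumes every row has at least one observed value. It is not the authors' procedure, nor the exact algorithm of Little and Smith (1987).

```python
import numpy as np

def er_sketch(X, n_iter=50, mu0=None, sigma0=None):
    """ER-style iteration on an (n, p) array with np.nan marking missing entries.

    Illustrative sketch only: weights, cutoff and normalisation are assumptions.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    obs = ~np.isnan(X)
    # Starting values; a high breakdown estimate could be passed via mu0/sigma0.
    mu = np.nanmedian(X, axis=0) if mu0 is None else np.array(mu0, dtype=float)
    if sigma0 is None:
        sigma = np.diag(np.nanvar(X, axis=0))
    else:
        sigma = np.array(sigma0, dtype=float)

    for _ in range(n_iter):
        w = np.empty(n)
        xhat = np.empty((n, p))
        corr = np.zeros((p, p))
        for i in range(n):
            o = obs[i]
            m = ~o
            r = X[i, o] - mu[o]
            soo_inv = np.linalg.inv(sigma[np.ix_(o, o)])
            # Mahalanobis distance computed on the observed coordinates only.
            d = np.sqrt(r @ soo_inv @ r)
            cutoff = np.sqrt(o.sum()) + 1.0  # assumed tuning constant
            w[i] = 1.0 if d <= cutoff else cutoff / d  # Huber-type weight
            xhat[i, o] = X[i, o]
            if m.any():
                # E-step: conditional mean of the missing part given the observed part.
                B = sigma[np.ix_(m, o)] @ soo_inv
                xhat[i, m] = mu[m] + B @ r
                # Conditional covariance of the missing part, downweighted like the row.
                cond_cov = sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
                corr[np.ix_(m, m)] += w[i] ** 2 * cond_cov
        # Robust update: weighted mean and weighted scatter of the completed data.
        mu = (w[:, None] * xhat).sum(axis=0) / w.sum()
        R = xhat - mu
        sigma = (np.einsum("i,ij,ik->jk", w ** 2, R, R) + corr) / (w ** 2).sum()
        sigma = (sigma + sigma.T) / 2.0  # guard against round-off asymmetry
    return mu, sigma
```

The `mu0` and `sigma0` arguments are a hypothetical hook for the first alternative mentioned in the abstract: starting the iteration from a high breakdown estimate rather than from a non-robust or weakly robust one. The sketch does not implement the MCD or the t-biweight S-estimator themselves.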

