Archive ouverte UNIGE | last documents
https://archive-ouverte.unige.ch/
Latest objects deposited in the Archive ouverte UNIGE

Simulation based bias correction methods for complex models
https://archive-ouverte.unige.ch/unige:100295
With ever-increasing data size and model complexity, an important challenge frequently encountered in constructing new estimators, or in implementing a classical one such as the maximum likelihood estimator, is the computational cost of the estimation procedure. To carry out estimation, approximate methods such as pseudo-likelihood functions or approximate estimating equations are increasingly used in practice, as these methods are typically easier to implement numerically, although they can lead to inconsistent and/or biased estimators. In this context, we extend and refine the known bias correction properties of two simulation based methods, indirect inference and the bootstrap, each with two alternatives. These results allow one to build a framework defining simulation based estimators that can be implemented for complex models. Indeed, starting from a biased or even inconsistent estimator, several simulation based methods can be used to define new estimators that are both consistent and have reduced finite sample bias. This framework includes the classical method of indirect inference for bias correction without requiring the specification of an auxiliary model. We demonstrate the equivalence between one version of indirect inference and the iterative bootstrap; both correct sample biases up to order n^{-3}. The iterative method can be thought of as a computationally efficient algorithm for solving the optimization problem of indirect inference. Our results provide different tools to correct the asymptotic as well as finite sample biases of estimators and give insight into which method should be applied for the problem at hand.
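The iterative bootstrap mentioned above admits a compact sketch. The example below is illustrative only (a hypothetical toy setup with made-up function names, not the paper's implementation): the plug-in variance estimator of a small normal sample, which is biased downward by a factor (n-1)/n, is corrected by repeatedly shifting the current value by the gap between the estimate on the data and its Monte Carlo average over samples simulated at the current value.

```python
import numpy as np

rng = np.random.default_rng(0)

def pi_hat(x):
    # Plug-in variance estimator (divides by n): biased downward by (n-1)/n.
    return float(np.mean((x - np.mean(x)) ** 2))

def iterative_bootstrap(x, n_iter=25, H=200):
    """Iterative bootstrap (sketch): shift the current value by the gap between
    the estimate on the data and the Monte Carlo average of the same estimator
    applied to H samples simulated at the current value."""
    n = len(x)
    pi_obs = pi_hat(x)
    theta = pi_obs  # start at the biased estimate
    for _ in range(n_iter):
        sims = [pi_hat(rng.normal(0.0, np.sqrt(theta), n)) for _ in range(H)]
        theta = theta + (pi_obs - np.mean(sims))
    return theta

x = rng.normal(0.0, 2.0, 10)        # small sample, true variance 4
theta_ib = iterative_bootstrap(x)   # converges near pi_hat(x) * n/(n-1)
```

In this toy the fixed point is known in closed form (the corrected value sits near pi_hat(x) * n/(n-1)), which makes the sketch easy to check; in the complex models targeted by the abstract no such closed form exists, which is precisely when the simulation-based correction earns its keep.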
The usefulness of the proposed approach is illustrated with the estimation of robust income distributions and generalized linear latent variable models.
(Deposited Wed, 13 Dec 2017 16:53:09 +0100)

Robust Filtering
https://archive-ouverte.unige.ch/unige:80765
Filtering methods are powerful tools for estimating the hidden state of a state-space model from observations available in real time. However, they are known to be highly sensitive to small misspecifications of the underlying model and to outliers in the observation process. In this article, we show that the methodology of robust statistics can be adapted to sequential filtering. We define a filter as robust if the relative error in the state distribution caused by misspecifications is uniformly bounded by a linear function of the perturbation size. Since standard filters are nonrobust even in the simplest cases, we propose robustified filters that provide accurate state inference in the presence of model misspecifications. The robust particle filter naturally mitigates the degeneracy problems that plague the bootstrap particle filter (Gordon, Salmond, and Smith) and its many extensions. We illustrate the good properties of robust filters in linear and nonlinear state-space examples. Supplementary materials for this article are available online.
(Deposited Fri, 19 Feb 2016 16:54:37 +0100)

Infinitesimal robustness for diffusions
https://archive-ouverte.unige.ch/unige:75145
We develop infinitesimally robust statistical procedures for general diffusion processes. We first prove the existence and uniqueness of the time-series influence function of conditionally unbiased M-estimators for ergodic and stationary diffusions, under weak conditions on the (martingale) estimating function used. We then characterize the robustness of M-estimators for diffusions and derive a class of conditionally unbiased optimal robust estimators.
To compute these estimators, we propose a general algorithm that exploits approximation methods for diffusions in the computation of the robust estimating function. Monte Carlo simulation shows the good performance of our robust estimators, and an application to the robust estimation of exchange rate dynamics within a target zone illustrates the methodology in a real-data setting.
(Deposited Sun, 13 Sep 2015 19:53:52 +0200)

Wavelet-Variance-Based Estimation for Composite Stochastic Processes
https://archive-ouverte.unige.ch/unige:38161
This article presents a new estimation method for the parameters of a time series model. We consider composite Gaussian processes that are the sum of independent Gaussian processes, each of which explains an important aspect of the time series, as is the case in engineering and the natural sciences. The proposed estimation method offers an alternative to classical likelihood-based estimation that is straightforward to implement and often the only feasible estimation method for complex models. The estimator is obtained by optimizing a criterion based on a standardized distance between the sample wavelet variance (WV) estimates and the model-based WV. Indeed, the WV provides a decomposition of the process variance across different scales, so that it contains information about different features of the stochastic model. We derive the asymptotic properties of the proposed estimator for inference and perform a simulation study to compare our estimator to the MLE and the LSE under different models. We also set sufficient conditions on composite models for our estimator to be consistent, conditions that are easy to verify.
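The wavelet-variance matching idea can be sketched in a few lines for the simplest building block of a composite model, white noise (an illustrative setup with hypothetical function names, not the article's estimator): with the half-difference Haar definition used below, the model-implied WV of white noise with variance sigma^2 at scale tau is sigma^2 / (2 tau), so matching empirical and model WV across scales reduces to a one-parameter least squares problem with a closed-form solution.

```python
import numpy as np

def haar_wv(x, tau):
    """Empirical Haar-type wavelet variance at scale tau: mean square of the
    half-differences of adjacent non-overlapping block means of length tau."""
    n = (len(x) // (2 * tau)) * 2 * tau
    blocks = x[:n].reshape(-1, tau).mean(axis=1)   # block means of length tau
    w = 0.5 * (blocks[0::2] - blocks[1::2])        # half-difference of adjacent means
    return float(np.mean(w ** 2))

def wv_match_white_noise(x, scales=(1, 2, 4, 8)):
    """WV-matching sketch: for white noise with variance s2, the WV above equals
    s2 / (2 * tau), so the matching estimator is a linear least squares fit."""
    nu = np.array([haar_wv(x, t) for t in scales])      # empirical WV per scale
    c = np.array([1.0 / (2.0 * t) for t in scales])     # model WV per unit variance
    return float(nu @ c / (c @ c))                      # argmin_s2 ||nu - s2*c||^2

rng = np.random.default_rng(1)
x = rng.normal(0.0, 3.0, 100_000)      # white noise, true variance 9
sigma2_hat = wv_match_white_noise(x)   # close to 9
```

The actual article matches the WV of richer composite processes (where each component leaves a signature at different scales) and standardizes the distance; the unweighted version above only shows the mechanics.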
We use the new estimator to estimate the parameters of the stochastic error of the sum of three first-order Gauss–Markov processes, by means of a sample of over 800,000 observations issued from gyroscopes that compose inertial navigation systems.
(Deposited Thu, 26 Jun 2014 09:47:52 +0200)

Structural tests in additive regression
https://archive-ouverte.unige.ch/unige:36576
Abstract not available.
(Deposited Tue, 13 May 2014 13:25:35 +0200)

Higher-Order Infinitesimal Robustness
https://archive-ouverte.unige.ch/unige:28784
Using the von Mises expansion, we study the higher-order infinitesimal robustness of a general M-functional and characterize its second-order properties. We show that second-order robustness is equivalent to the boundedness of both the estimator's estimating function and its derivative with respect to the parameter. It implies, at the same time, (i) variance robustness and (ii) robustness of higher-order saddlepoint approximations to the estimator's finite sample density. The proposed construction of second-order robust M-estimators is fairly general and potentially useful in a variety of relevant settings. Besides the theoretical contributions, we discuss the main computational issues and provide an algorithm for the implementation of second-order robust M-estimators. Finally, we illustrate our theory by Monte Carlo simulation and with a real-data estimation of the maximal losses of Nikkei 225 index returns. Our findings indicate that second-order robust estimators can improve on other widely applied robust estimators, in terms of efficiency and robustness, for moderate to small sample sizes and in the presence of deviations from ideal parametric models.
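The two boundedness conditions highlighted above (a bounded estimating function and a bounded derivative with respect to the parameter) are both satisfied by the classical Huber score for a location model, whose derivative in the parameter is 0 or -1. A plain Huber location M-estimator therefore serves as a minimal illustration; this is a generic sketch, not the paper's optimal second-order construction.

```python
import numpy as np

def huber_psi(r, c=1.345):
    # Bounded score: psi(r) = r clipped to [-c, c]. For the location model
    # psi(x - theta), the derivative in theta is 0 or -1, also bounded --
    # the two conditions the abstract links to second-order robustness.
    return np.clip(r, -c, c)

def huber_location(x, c=1.345, tol=1e-10, max_iter=500):
    """Location M-estimator solving sum(psi(x - theta)) = 0 by fixed-point
    iteration started at the median."""
    theta = float(np.median(x))
    for _ in range(max_iter):
        step = float(np.mean(huber_psi(x - theta, c)))
        theta += step
        if abs(step) < tol:
            break
    return theta
```

On symmetric data the estimator agrees with the mean; with a gross outlier it stays near the bulk of the data, which the unbounded least squares score cannot do.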
Supplementary materials for this article are available online.
(Deposited Fri, 05 Jul 2013 09:36:01 +0200)

Robust Estimation for Grouped Data
https://archive-ouverte.unige.ch/unige:23354
We investigate the robustness properties of the class of minimum power divergence estimators for grouped data, a class that contains the classical maximum likelihood estimators for grouped data. We find that the bias of these estimators due to deviations from the assumed underlying model can be large. Therefore, we propose a more general class of estimators that allows us to construct robust procedures. By analogy with Hampel's theorem, we define optimal bounded influence function estimators, and through a simulation study we show that under small model contaminations these estimators are more stable than the classical estimators for grouped data. Finally, we apply our results to a real example.
(Deposited Tue, 16 Oct 2012 11:12:31 +0200)

Robust Linear Model Selection by Cross-Validation
https://archive-ouverte.unige.ch/unige:23222
This article gives a robust technique for model selection in regression models, an important aspect of any data analysis involving regression. There is a danger that outliers will have an undue influence on the model chosen and distort any subsequent analysis. We provide a robust algorithm for model selection, using Shao's cross-validation methods for the choice of variables as a starting point. Because Shao's techniques are based on least squares, they are sensitive to outliers. We develop our robust procedure using the same cross-validation ideas as Shao, but with estimators that are optimal bounded-influence for prediction. We demonstrate the effectiveness of our robust procedure in providing protection against outliers both in a simulation study and in a real example.
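The idea of robust cross-validatory model choice can be sketched as follows. The sketch makes two simplifying assumptions that are not the article's: the bounded-influence fit is a plain Huber regression via iteratively reweighted least squares, and the held-out prediction error is a truncated squared residual, so that a single outlier cannot dominate the criterion.

```python
import numpy as np

def huber_fit(X, y, c=1.345, iters=30):
    """Bounded-influence linear fit via IRLS with Huber weights and a MAD scale
    (a simple stand-in for optimal bounded-influence estimators)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12   # robust scale
        w = np.clip(c * s / np.maximum(np.abs(r), 1e-12), None, 1.0)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return beta

def cv_robust_error(X, y, c=1.345):
    """Leave-one-out CV with a bounded loss on the held-out residual."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        beta = huber_fit(X[mask], y[mask], c)
        r = float(y[i] - X[i] @ beta)
        errs.append(min(r * r, c * c))   # truncated squared prediction error
    return float(np.mean(errs))

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 30)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=30)
y[0] = 100.0                             # gross outlier
X_full = np.column_stack([np.ones(30), x])   # intercept + slope
X_null = np.ones((30, 1))                    # intercept only
```

Despite the outlier, the robust criterion still prefers the correct model, whereas a least squares criterion can be driven by the single contaminated point.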
We contrast the results with those obtained by Shao's method, demonstrating a substantial improvement in choosing the correct model in the presence of outliers with little loss of efficiency at the normal model.
(Deposited Sat, 06 Oct 2012 21:37:06 +0200)

General Saddlepoint Approximations of Marginal Densities and Tail Probabilities
https://archive-ouverte.unige.ch/unige:23218
Saddlepoint approximations of marginal densities and tail probabilities of general nonlinear statistics are derived. These are based on the expansion of the statistic up to the second order. Their accuracy is shown in a variety of examples, including logit and probit models and rank estimators for regression.
(Deposited Sat, 06 Oct 2012 21:33:59 +0200)

Robust Bounded-Influence Tests in General Parametric Models
https://archive-ouverte.unige.ch/unige:23217
We introduce robust tests for testing hypotheses in a general parametric model. These are robust versions of the Wald, scores, and likelihood ratio tests and are based on general M estimators. Their asymptotic properties and influence functions are derived. It is shown that the stability of the level is obtained by bounding the self-standardized sensitivity of the corresponding M estimator. Furthermore, optimally bounded-influence tests are derived for the Wald- and scores-type tests. Applications to real and simulated data sets are given to illustrate the tests' performance.
(Deposited Sat, 06 Oct 2012 21:33:17 +0200)

A Robust Version of Mallows's Cp
https://archive-ouverte.unige.ch/unige:23216
We present a robust version of Mallows's Cp for regression models.
It is defined by RC_P = W_P/σ̂² − (U_P − V_P), where W_P = Σ_i ω_i² r_i² is a weighted residual sum of squares computed from a robust fit of model P, σ̂² is a robust and consistent estimator of σ² in the full model, and U_P and V_P are constants depending on the weight function and on the number of parameters in model P. Good subset models are those with RC_P close to V_P or smaller than V_P. When the weights are identically 1, W_P becomes the residual sum of squares of a least squares fit, and RC_P reduces to Mallows's Cp. The robust model selection procedure based on RC_P allows us to choose the models that fit the majority of the data, taking into account the presence of outliers and possible departures from the normality assumption on the error distribution. Together with the classical Cp, the robust version suggests several models from which we can choose.
(Deposited Sat, 06 Oct 2012 21:32:28 +0200)

General Saddlepoint Approximations with Applications to L Statistics
https://archive-ouverte.unige.ch/unige:23209
Saddlepoint approximations are extended to general statistics. The technique is applied to derive approximations to the density of linear combinations of order statistics, including trimmed means. A comparison with exact results shows the accuracy of these approximations even for very small sample sizes.
(Deposited Sat, 06 Oct 2012 21:23:53 +0200)

The Change-of-Variance Curve and Optimal Redescending M-Estimators
https://archive-ouverte.unige.ch/unige:23202
We define the change-of-variance curve (CVC) of location M-estimators in order to investigate the infinitesimal stability of the asymptotic variance. We also construct the so-called hyperbolic tangent estimators, proving their existence and performing certain numerical computations of their defining constants.
Their introduction is motivated by a theorem that shows they are the optimally robust redescending M-estimators in the sense of the CVC.
(Deposited Sat, 06 Oct 2012 20:58:26 +0200)

Direct Simultaneous Inference in Additive Models and its Application to Model Undernutrition
https://archive-ouverte.unige.ch/unige:23196
Abstract not available.
(Deposited Sat, 06 Oct 2012 20:46:21 +0200)

Do-validation for Kernel Density Estimation
https://archive-ouverte.unige.ch/unige:23192
Abstract not available.
(Deposited Sat, 06 Oct 2012 20:42:47 +0200)

[Review of:] Semiparametric and Nonparametric Econometrics
https://archive-ouverte.unige.ch/unige:23189
Abstract not available.
(Deposited Sat, 06 Oct 2012 20:35:44 +0200)

Robust Indirect Inference
https://archive-ouverte.unige.ch/unige:23050
In this article we develop robust indirect inference for a variety of models in a unified framework. We investigate the local robustness properties of indirect inference and derive the influence function of the indirect estimator, as well as the level and power influence functions of indirect tests. These tools are then used to design indirect inference procedures that are stable in the presence of small deviations from the assumed model. Although indirect inference was originally proposed for statistical models whose likelihood is difficult or even impossible to compute and/or to maximize, we use it here as a device to robustify the estimators and tests for models where this is not possible or is difficult with classical techniques such as M estimators.
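The flavor of indirect inference with robust auxiliary statistics can be sketched for a toy location-scale model. Everything below is an illustrative assumption, not the article's general construction: the model is N(mu, sigma^2), the auxiliary statistics are the median and the MAD (both bounded-influence), and the location-scale structure lets the simulated binding function be inverted in closed form.

```python
import numpy as np

def aux_stats(x):
    # Robust auxiliary statistics: median and MAD, both with bounded influence.
    med = np.median(x)
    return np.array([med, np.median(np.abs(x - med))])

def robust_indirect_normal(x, H=100, seed=3):
    """Toy robust indirect inference for N(mu, sigma^2): pick (mu, sigma) so the
    average auxiliary statistics of H samples simulated from the model match the
    auxiliary statistics of the data. With sim = mu + sigma * z for z ~ N(0, 1),
    the match is mu + sigma*E[med(z)] = med(x) and sigma*E[mad(z)] = mad(x)."""
    n = len(x)
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(H, n))                          # common random numbers
    s_obs = aux_stats(x)
    s_z = np.mean([aux_stats(zi) for zi in z], axis=0)   # simulated binding function
    sigma = s_obs[1] / s_z[1]
    mu = s_obs[0] - sigma * s_z[0]
    return float(mu), float(sigma)

rng_data = np.random.default_rng(4)
x = np.concatenate([rng_data.normal(2.0, 1.5, 1900), np.full(100, 50.0)])  # 5% outliers
mu_hat, sigma_hat = robust_indirect_normal(x)
```

Because the auxiliary statistics are bounded-influence, 5% gross contamination moves the estimates only slightly; the same construction with mean and standard deviation as auxiliary statistics would be ruined by the outliers.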
Examples from financial applications, time series, and spatial statistics are used for illustration.
(Deposited Thu, 27 Sep 2012 14:57:03 +0200)

Optimal Conditionally Unbiased Bounded-Influence Inference in Dynamic Location and Scale Models
https://archive-ouverte.unige.ch/unige:22949
This article studies the local robustness of estimators and tests for the conditional location and scale parameters in a strictly stationary time series model. We first derive optimal bounded-influence estimators for such settings under a conditionally Gaussian reference model. Based on these results, we obtain optimal bounded-influence versions of the classical likelihood-based tests for parametric hypotheses. We propose a feasible and efficient algorithm for the computation of our robust estimators, which uses analytical Laplace approximations to estimate the auxiliary recentering vectors that ensure Fisher consistency in robust estimation. This strongly reduces the computation time by avoiding the simulation of multidimensional integrals, a task that typically must be addressed in the robust estimation of nonlinear time series models. In Monte Carlo simulations of an AR(1)–ARCH(1) process, we show that our robust procedures maintain very high efficiency under ideal model conditions and at the same time perform very satisfactorily under several forms of departure from conditional normality. In contrast, classical pseudo-maximum likelihood inference procedures are found to be highly inefficient under such local model misspecifications.
These patterns are confirmed by an application to robust testing for autoregressive conditional heteroscedasticity.
(Deposited Tue, 18 Sep 2012 11:26:30 +0200)

Robust Inference for Generalized Linear Models
https://archive-ouverte.unige.ch/unige:22899
Starting from a natural class of robust estimators for generalized linear models based on the notion of quasi-likelihood, we define robust deviances that can be used for stepwise model selection as in the classical framework. We derive the asymptotic distribution of tests based on robust deviances, and we investigate the stability of their asymptotic level under contamination. The binomial and Poisson models are treated in detail. Two applications to real data and a sensitivity analysis show that the inference obtained by means of the new techniques is more reliable than that obtained by classical estimation and testing procedures.
(Deposited Wed, 12 Sep 2012 18:28:38 +0200)

Saddlepoint Test in Measurement Error Models
https://archive-ouverte.unige.ch/unige:22890
We develop second-order hypothesis testing procedures in functional measurement error models for small or moderate sample sizes, where the classical first-order asymptotic analysis often fails to provide accurate results. In functional models, no distributional assumptions are made on the unobservable covariates, which leads to semiparametric models. Our testing procedure is derived using saddlepoint techniques and is based on an empirical distribution estimation subject to the null hypothesis constraints, in combination with a set of estimating equations that avoid a distribution approximation.
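As a self-contained reminder of the saddlepoint technique on which such second-order procedures build (a textbook example, not the paper's test): the saddlepoint approximation to the density of the mean of n Exp(1) variables, checked against the exact gamma density. The cumulant generating function is K(t) = -log(1 - t), the saddlepoint solving K'(t) = x is t = 1 - 1/x, and the approximation sqrt(n / (2 pi K''(t))) exp(n (K(t) - t x)) is accurate even for small n.

```python
import math

def saddlepoint_mean_exp(x, n):
    """Saddlepoint approximation to the density of the mean of n Exp(1) variables.
    CGF of Exp(1): K(t) = -log(1 - t); the saddlepoint is t = 1 - 1/x."""
    t = 1.0 - 1.0 / x
    K = -math.log(1.0 - t)          # = log(x)
    K2 = 1.0 / (1.0 - t) ** 2       # K''(t) = x**2
    return math.sqrt(n / (2.0 * math.pi * K2)) * math.exp(n * (K - t * x))

def exact_mean_exp(x, n):
    # The mean of n Exp(1) variables is Gamma(n, scale 1/n):
    # f(x) = n^n x^(n-1) e^(-n x) / Gamma(n), computed on the log scale.
    return math.exp(n * math.log(n) + (n - 1) * math.log(x) - n * x - math.lgamma(n))
```

For this family the relative error of the approximation is constant in x (it equals the Stirling error of Gamma(n), about 1/(12n)), so with n = 10 the approximation is within about 1% of the exact density across the whole range.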
The validity of the method is proved in theorems for both simple and composite hypothesis tests, and is demonstrated through simulation and a farm-size data analysis.
(Deposited Wed, 12 Sep 2012 18:12:21 +0200)

Fast Robust Model Selection in Large Datasets
https://archive-ouverte.unige.ch/unige:12736
Large datasets are increasingly common in many research fields. In particular, in the linear regression context, it is often the case that a huge number of potential covariates are available to explain a response variable, and the first step of a reasonable statistical analysis is to reduce the number of covariates. This can be done with a forward selection procedure that includes selecting the variable to enter, deciding whether to retain it or stop the selection, and estimating the augmented model. Least squares plus t-tests can be fast, but the outcome of a forward selection may be suboptimal when there are outliers. In this paper, we propose a complete algorithm for fast robust model selection, including considerations for huge sample sizes. Since simply replacing the classical statistical criteria with robust ones is not computationally feasible, we develop simplified robust estimators, selection criteria, and testing procedures for linear regression. The robust estimator is a one-step weighted M-estimator that can be biased if the covariates are not orthogonal. We show that the bias can be made smaller by iterating the M-estimator one or more steps further. In the variable selection process, we propose a simplified robust criterion based on a robust t-statistic that we compare to a false discovery rate adjusted level. We carry out a simulation study to show the good performance of our approach.
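The combination of a few-step weighted M-estimator with a robust t-statistic can be sketched as follows. All helper names are hypothetical and the standard error is a crude weighted-least-squares approximation rather than the article's criterion; the point is only the mechanics: downweight outlying residuals, refit a few times (echoing the abstract's remark that extra iterations reduce the bias of the one-step estimator), then screen candidate coefficients by a robust t-statistic.

```python
import numpy as np

def huber_weights(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def few_step_wm(X, y, c=1.345, steps=3):
    """Few-step weighted M-estimator: start at least squares, then iterate
    weighted least squares with Huber weights on MAD-scaled residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(steps):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12        # robust residual scale
        w = huber_weights(r / s, c)
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)       # weighted normal equations
    return beta, w, s

def robust_t(X, y, j, c=1.345):
    """Approximate robust t-statistic for coefficient j, using a simple
    weighted-least-squares standard error (a sketch, not the paper's formula)."""
    beta, w, s = few_step_wm(X, y, c)
    cov_approx = np.linalg.inv(X.T @ (X * w[:, None]))
    se = s * np.sqrt(cov_approx[j, j])
    return float(beta[j] / se)

rng = np.random.default_rng(5)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 3.0 * x1 + rng.normal(size=n)
y[:25] += 20.0                                   # 5% gross outliers in the response
X = np.column_stack([np.ones(n), x1, x2])        # intercept, relevant x1, noise x2
```

In a forward selection, x1 (large robust |t|) would enter and x2 (small robust |t|) would be screened out against the FDR-adjusted level, even though the outliers would distort classical least squares t-tests.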
We also analyze two datasets and show that the results obtained with our method outperform those from robust LARS and random forests.
(Deposited Tue, 30 Nov 2010 15:46:00 +0100)

Goodness-of-fit for Generalized Linear Latent Variables Models
https://archive-ouverte.unige.ch/unige:6491
Generalized Linear Latent Variables Models (GLLVM) enable the modeling of relationships between manifest and latent variables, where the manifest variables are distributed according to a distribution of the exponential family (e.g. binomial or normal) or to the multinomial distribution (for ordinal manifest variables). These models are widely used in the social sciences. To test the appropriateness of a particular model, one needs to define a goodness-of-fit test statistic (GFI). In the normal case, one can use a likelihood ratio test or a modified version proposed by Satorra and Bentler (S&B GFI) that compares the sample covariance matrix to the estimated covariance matrix induced by the model. In the binary case, Pearson-type test statistics can be used if the number of observations is sufficiently large. In the other cases, including the case of mixed types of manifest variables, there exist GFIs based on a comparison between a pseudo sample covariance and the model covariance of the manifest variables. These types of GFI rely on latent variable models that assume the manifest variables are themselves induced by underlying normal variables (the underlying variable approach). The pseudo sample covariance matrices are then made of polychoric, tetrachoric, or polyserial correlations. In this article, we propose an alternative GFI that is more generally applicable. It is based on a distance comparison between the latent scores and the original data. This GFI takes into account the nature of each manifest variable and can in principle be applied in various situations, in particular to models with ordinal, and both discrete and continuous, manifest variables.
To compute the
(Deposited Tue, 04 May 2010 15:41:50 +0200)

Bounded-Influence Robust Estimation in Generalized Linear Latent Variable Models
https://archive-ouverte.unige.ch/unige:6460
Latent variable models are used for analyzing multivariate data. Recently, generalized linear latent variable models for categorical, metric, and mixed-type responses, estimated via maximum likelihood (ML), have been proposed. Model deviations, such as data contamination, are shown analytically, using the influence function, and through a simulation study to seriously affect ML estimation. This article proposes a robust estimator that is made consistent using the basic principle of indirect inference and can easily be implemented numerically. The performance of the robust estimator is significantly better than that of the ML estimators in terms of both bias and variance. A real example from a consumption survey is used to highlight the practical consequences of the choice of estimator.
(Deposited Tue, 04 May 2010 11:52:30 +0200)

High Breakdown Inference in the Mixed Linear Model
https://archive-ouverte.unige.ch/unige:6459
Mixed linear models are used to analyze data in many settings. These models have a multivariate normal formulation in most cases. The maximum likelihood estimator (MLE) or the residual MLE (REML) is usually chosen to estimate the parameters. However, these estimators are based on the strong assumption of exact multivariate normality. Welsh and Richardson have shown that they are not robust to small deviations from multivariate normality. This means that in practice a small proportion of the data (even a single observation) can drive the value of the estimates on its own. Because the model is multivariate, we propose a high-breakdown robust estimator for very general mixed linear models that include, for example, covariates.
This robust estimator belongs to the class of S-estimators, from which we can derive the asymptotic properties needed for inference. We also use it as a diagnostic tool to detect outlying subjects. We discuss the advantages of this estimator compared with other previously proposed robust estimators, and illustrate its performance with simulation studies and the analysis of three datasets. We also consider robust inference for multivariate hypotheses, as an alternative to the classical F-test, by using a robust score-type test statistic proposed by Heritier and Ronchetti, and study its properties through simulations and analysis of real data.
(Deposited Tue, 04 May 2010 11:51:58 +0200)