Preprints

by David Olive
        Preprint M-02-006
        Copyright May 2003, January 2018

  • Why the Rousseeuw Yohai Paradigm is One of the Largest and Longest Running Scientific Hoaxes in History hoax.pdf
    • Talk on prediction regions and intervals predtalk.pdf
    • The next preprint provides large sample theory for the elastic net and OLS variable selection, and simplifies large sample theory for ridge regression and lasso. Theory for 3 bootstrap confidence regions is given, and the coverage should be near the nominal for forward selection.
    • Jan. 2018 1st draft of Prediction and Statistical Learning (online course notes):
    • http://lagrange.math.siu.edu/Olive/slearnbk.htm
    • The following preprints have been submitted.
    • The next preprint gives prediction intervals that can be useful when the sample size is less than the number of variables. These prediction intervals are useful for comparing shrinkage estimators like forward selection and lasso.
    • Pelawa Watagoda and Olive (2018a), Comparing Shrinkage Estimators With Asymptotically Optimal Prediction Intervals picomp.pdf
    • The next preprint gives bootstrap confidence regions that can be used for bootstrap tests for OLS variable selection estimators. The tests also simulate well for lasso.
    • Pelawa Watagoda and Olive (2018b), Bootstrapping Multiple Linear Regression After Variable Selection piboottest.pdf
    • THE FOLLOWING PREPRINTS HAVE NOT YET BEEN SUBMITTED OR NEED TO BE RESUBMITTED.
    • The next preprint needs a lot of work but shows how to get prediction intervals for a large class of parametric regression models such as GLMs, and the PIs can work after model selection.
    • Olive and Rathnayake (2018c), Prediction Intervals for Some GLMs, GAMs, and Survival Regression Models pigam.pdf
    • The next preprint provides large sample theory for the elastic net and OLS variable selection. Theory for 3 bootstrap confidence regions is given, and the coverage should be near the nominal for forward selection. Prediction intervals are also given.
    • The preprint has too many ideas to be published in a major journal, so it will be broken into the 2 papers Pelawa Watagoda and Olive (2018ab) above.
    • The ideas also appear in the online course notes Prediction and Statistical Learning listed above.
    • Pelawa Watagoda and Olive (2018), Inference for Multiple Linear Regression After Model or Variable Selection vsinf.pdf
    • The next preprint provides large sample theory for OLS variable selection. Theory for elastic net, ridge regression and lasso is simplified.
    • This preprint will be incorporated into Pelawa Watagoda and Olive (2018b).
    • Pelawa Watagoda and Olive (2018c), Large Sample Theory for OLS Variable Selection Estimators enols.pdf
    • Model Selection, Prediction Intervals and Outlier Detection for Time Series mselpred.pdf
    • The following preprint had too many ideas to be published in a major journal but part of it resulted in the paper Olive (2018).
    • Highest Density Region Prediction hdrpred.pdf
    • This preprint shows how to visualize several important survival regression models in the background of the data.
    • Plots for Survival Regression sreg.pdf
    • 1D Regression onedreg.pdf
    • Graphical Aids for Regression. gaid.pdf
    • A Simple Plot for Model Assessment simp.pdf
    • THE FOLLOWING PREPRINT HAS BEEN INCORPORATED IN
    • Olive, D.J. (2017), Linear Regression and
    • Olive, D.J. (2017), Robust Multivariate Analysis, two Springer texts,
    • and in the Olive (2018) paper Applications of Hyperellipsoidal Prediction Regions (below).
    • This preprint shows the equivalence between a prediction region and a confidence region that can easily be bootstrapped. This method can be used for hypothesis testing, for robust statistics, and after variable selection.
    • Bootstrapping Hypothesis Tests and Confidence Regions vselboot.pdf
    • THE FOLLOWING 4 PREPRINTS WERE INCORPORATED IN
    • Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
    • This paper gives the first easily computed estimators of multivariate location and dispersion that have been shown to be sqrt(n) consistent and highly outlier resistant.
    • Olive, D.J., and Hawkins, D.M. (2010), Robust Multivariate Location and Dispersion rmld.pdf
    • This preprint shows how to improve low breakdown consistent regression estimators and outlier resistant estimators that do not have theory. The resulting estimator is the first easily computed regression estimator that has been shown to be sqrt(n) consistent and high breakdown. The response plot is very useful for detecting outliers.
    • Olive, D.J., and Hawkins, D.M. (2011), Practical High Breakdown Regression hbreg.pdf
    • Olive, D.J. (2013), Robust Multivariate Linear Regression robmreg.pdf
    • Olive, D.J. (2014), Robust Principal Component Analysis rpca.pdf
    • THE FOLLOWING TWO PREPRINTS HAVE BEEN CITED BY OTHER
    • AUTHORS, BUT WERE REVISED AND PUBLISHED.
    • Chang, J., and Olive, D.J. (2007), Resistant Dimension Reduction resdr.pdf
    • was revised and published as Chang and Olive (2010).
    • Applications of a Robust Dispersion Estimator rcovm.pdf
    • was revised and published as Zhang, Olive, and Ye (2012).
    • THE FOLLOWING SIX PREPRINTS HAVE BEEN CITED BY OTHER AUTHORS.
    • This paper shows that the bootstrap is not first order accurate unless the number of bootstrap samples B is proportional to the sample size n. For second order accuracy, B needs to be proportional to n^2. This was published in Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY, ch. 9.
    • Olive, D.J. (2011), The Number of Samples for Resampling Algorithms resamp.pdf
    • This preprint provides some of the most important theory in the field of robust statistics. The paper shows that a simple modification to the most used but inconsistent algorithms for robust statistics results in easily computed sqrt(n) consistent highly outlier resistant estimators. It was converted to the Robust Multivariate Location and Dispersion and Practical High Breakdown Regression preprints above. The material is in Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
    • Olive, D.J., and Hawkins, D.M. (2008), High Breakdown Multivariate Estimators hbrs.pdf
    • The material in the following preprint is in Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
    • Olive, D.J., and Hawkins, D.M. (2007), Robustifying Robust Estimators, preprint available from ppconc.pdf
    • For location scale families, estimators based on the median and mad have optimal robustness properties. Use He's cross checking technique to make an asymptotically efficient estimator.
    • Olive, D.J. (2006), Robust Estimators for Transformed Location-Scale Families. robloc.pdf
    • The material in the following preprint is in Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
    • Olive, D.J. (2005), A Simple Confidence Interval for the Median, preprint available from ppmedci.pdf
    • The June 2008 ROBUST STATISTICS NOTES are below. PLEASE CITE THIS WORK IF YOU USE IT. Much of this work is in
    • Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
    • Olive, D.J. (2008), Applied Robust Statistics, preprint available from (http://lagrange.math.siu.edu/Olive/run.pdf). robnotes.pdf
    • Web page with data sets and programs to go with the course notes. robust.html
    • Much of the course notes below is in Olive, D.J. (2017), Linear Regression, Springer, New York, NY.
    • Olive, D.J. (2010), Multiple Linear and 1D Regression. regbk.htm
    • TWO COMPETITORS FOR Casella and Berger (2002), Statistical Inference:
    • Olive, D.J. (2008), A Course in Statistical Theory, preprint available from (http://lagrange.math.siu.edu/Olive/infer.htm). infer.htm
    • Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY.
    • The Springer eBook is available on SpringerLink, Springer's online platform. http://dx.doi.org/10.1007/978-3-319-04972-4
    • TWO COMPETITORS FOR Kutner, Nachtsheim, Neter, and Li (2005), Applied Linear Statistical Models:
    • Olive, D.J. (2010), Multiple Linear and 1D Regression Models, preprint available from (http://lagrange.math.siu.edu/Olive/regbk.htm). regbk.htm
    • Olive, D.J. (2017a), Linear Regression, Springer, New York, NY.
    • The Springer eBook is available on SpringerLink, Springer's online platform. http://dx.doi.org/10.1007/978-3-319-55252-1
    • A COMPETITOR FOR Johnson and Wichern (2007), Applied Multivariate Statistical Analysis:
    • Olive, D.J. (2017b), Robust Multivariate Analysis, Springer, New York, NY.
    • The Springer eBook is available on SpringerLink, Springer's online platform. https://link.springer.com/book/10.1007%2F978-3-319-68253-2
    • Jan. 2013 1st draft of Robust Multivariate Analysis:
    • http://lagrange.math.siu.edu/Olive/multbk.htm
    • Here are some rejected Letters to the Editor. Erratum should have been published.
    • This slightly revised letter was sent to the Journal of Computational and Graphical Statistics about the latest Fake-MCD estimator of Hubert, Rousseeuw and Verdonk (2012). It pointed out that DetMCD is not the MCD estimator, that DetMCD has no theory, and that it will be a massive undertaking to modify the theory for concentration estimators in Olive and Hawkins (2010) to show whether DetMCD has any good properties.
    • Fake MCD fakemcd.pdf
    • This letter was sent to the Annals of Statistics regarding the Bali, Boente, Tyler and Wang (2011) bait and switch paper.
    • Fake Projection Estimator fakeproj.pdf
    • This letter was sent to The Annals of Statistics regarding the Salibian-Barrera and Yohai (2008) bait and switch paper. After a rejection, it was revised and sent to The American Statistician as a paper, but was rejected.
    • The Breakdown of Breakdown bdbd.pdf
    • THE NEXT 11 DOCUMENTS MAY BE OF MILD INTEREST, BUT WILL PROBABLY NEVER BE PUBLISHED.
    • The following 5 preprints have been incorporated into Olive (2013) "Plots for Generalized Additive Models."
    • Response Transformations for Models with Additive Errors rtrans.pdf
    • Response Plots and Related Plots for Regression rplot.pdf
    • Response Plots for Linear Models lm.pdf
    • Response Plots for Experimental Design rploted.pdf
    • Plots for Binomial and Poisson Regression gfit.pdf
    • Comments on Breakdown bkdn.pdf
    • Abuhassan, H. and Olive, D.J. (2008), Inference for the Pareto, Half Normal and Related Distributions. std.pdf
    • (long version of) Robustifying Robust Estimators lconc.pdf
    • Prediction intervals in the presence of outliers pi.pdf
    • This 1996 result grew into a 2002 JASA discussion paper. dense.pdf
    • This 1997 result on partitioning may be of mild interest. part.pdf
    • THIS IS MY PhD DISSERTATION: Olive, D.J. (1998), Applied Robust Statistics, Ph.D. Thesis, University of Minnesota. It shows my 1998 ideas on Robust Statistics. arsdiss.pdf
    • THE FOLLOWING ARE PREPRINTS OF PUBLISHED OR ACCEPTED PAPERS.
    • This paper shows how to bootstrap analogs of the one way MANOVA model where we do not assume equal covariance matrices.
    • Rupasinghe Arachchige Don, H.S., and Olive, D.J. (2018), Bootstrapping Analogs of the One Way MANOVA Test, Communications in Statistics, to appear. manova.pdf
    • This paper shows that applying the Olive (2013b) nonparametric prediction region to a bootstrap sample can result in a confidence region, and applying the prediction region to Yhat_f + e_i, where the e_i are residual vectors, results in a nonparametric prediction region for a future response vector Y_f for multivariate regression.
    • Olive, D.J. (2018), Applications of Hyperellipsoidal Prediction Regions, Statistical Papers, 59, 913-931. hpred.pdf
    • Olive, D.J., Pelawa Watagoda, L.C.R., and Rupasinghe Arachchige Don, H.S. (2015), Visualizing and Testing the Multivariate Linear Regression Model, International Journal of Statistics and Probability, 4, 126-137. vtmreg.pdf
    • This paper gives response plots, plots for response transformations and plots for detecting overdispersion for GAMs and GLMs.
    • Olive, D.J. (2013a), Plots for Generalized Additive Models, Communications in Statistics: Theory and Methods, 42, 2610-2628. gam.pdf R/Splus code: gamcode.txt
    • Olive, D.J. (2013b), Asymptotically Optimal Regression Prediction Intervals and Prediction Regions for Multivariate Data, International Journal of Statistics and Probability, 2, 90-100. apred.pdf
    • This paper describes the sqrt(n) consistent highly outlier resistant FCH, RFCH and RMVN estimators and gives an application for canonical correlation analysis.
    • Zhang, J., Olive, D.J., and Ye, P. (2012), Robust Covariance Matrix Estimation with Canonical Correlation Analysis, International Journal of Statistics and Probability, 1, 119-136. rcca.pdf
    • This paper shows that OLS partial F tests, originally meant for multiple linear regression, are useful for exploratory purposes for a much larger class of models, including generalized linear models and single index models.
    • Chang, J. and Olive, D.J. (2010), OLS for 1D Regression Models, Communications in Statistics: Theory and Methods, 39, 1869-1882. sindx.pdf
    • Olive, D.J. and Hawkins, D.M. (2007), Behavior of Elemental Sets in Regression, Statistics and Probability Letters, 77, 621-624. elem.pdf
    • This paper shows how to construct asymptotically optimal prediction intervals for regression models of the form Y = m(x) + e. The errors need to be iid unimodal and emphasis is on linear regression.
    • Olive, D.J. (2007), Prediction Intervals for Regression Models, Computational Statistics and Data Analysis, 51, 3115-3122. spi.pdf
    • This paper shows that the variable selection software originally meant for multiple linear regression gives useful results for a much larger class of models, including generalized linear models and single index models, if the Mallows' Cp criterion is used. For models I with k predictors, the screen Cp(I) < 2k is much more effective than the screen Cp(I) < k. Use response plots to show that the final submodel is similar to the original full model.
    • Olive, D.J. and Hawkins, D.M. (2005), Variable Selection for 1D Regression Models, Technometrics, 47, 43-50. varsel.pdf
    • Olive, D.J. (2005), Two Simple Resistant Regression Estimators, Computational Statistics and Data Analysis, 49, 809-819. mba.pdf
    • The MBA estimator is not as good as the FCH estimator in "High Breakdown Robust Estimators," but was the first easily computed estimator of multivariate location and dispersion shown (in 2004) to be sqrt(n) consistent and highly outlier resistant. See "Robustifying Robust Estimators" or "Applied Robust Statistics" for proofs.
    • Olive, D.J. (2004a), A Resistant Estimator of Multivariate Location and Dispersion, Computational Statistics and Data Analysis, 46, 99-102. rcov.pdf
    • The following paper suggests ways to robustify regression techniques for single index models and sliced inverse regression.
    • Olive, D.J. (2004b), Visualizing 1D Regression, in Theory and Applications of Recent Robust Methods, edited by M. Hubert, G. Pison, A. Struyf and S. Van Aelst, Series: Statistics for Industry and Technology, Birkhauser, Basel, 221-233. vreg.pdf
    • Olive, D.J., and Hawkins, D.M. (2003), Robust Regression with High Coverage, Statistics and Probability Letters, 63, 259-266. hcov.pdf
    • The following paper provides a simultaneous diagnostic for whether the data follows a multivariate normal distribution or some other elliptically contoured distribution. It also provides a nice way to estimate and visualize single index models.
    • Olive, D.J. (2002), Applications of Robust Distances for Regression, Technometrics, 44, 64-71. rdist.pdf
    • The following paper gives extremely important theoretical results. It shows that software implementations for estimators of robust regression and robust multivariate location and dispersion tend to be inconsistent with zero breakdown value. The commonly used elemental basic resampling algorithm draws K elemental sets. Each elemental fit is inconsistent, so the final estimator is inconsistent, regardless of how the algorithm chooses the elemental fit. The CM, GS, LMS, LQD, LTS, maximum depth, MCD, MVE, one step GM and GR, projection, S, tau, t type, and many other robust estimators are of little applied interest because they are impractical to compute. The "Robustifying Robust Estimators" paper shows how to modify some algorithms so that the resulting regression estimators are easily computed sqrt(n) consistent high breakdown estimators and the resulting multivariate location and dispersion estimators are sqrt(n) consistent with high outlier resistance.
    • Hawkins, D.M., and Olive, D.J. (2002), Inconsistency of Resampling Algorithms for High Breakdown Regression Estimators and a New Algorithm (with discussion), Journal of the American Statistical Association, 97, 136-148. incon.pdf
    • This paper gives a graphical method for estimating response transformations that can be used to complement or replace the numerical Box-Cox method.
    • Cook, R.D., and Olive, D.J. (2001), A Note on Visualizing Response Transformations, Technometrics, 43, 443-449. resp.pdf
    • Olive, D.J. (2001), High Breakdown Analogs of the Trimmed Mean, Statistics and Probability Letters, 51, 87-92. rloc.pdf
    • Hawkins, D.M., and Olive, D.J. (1999a), Improved Feasible Solution Algorithms for High Breakdown Estimation, Computational Statistics and Data Analysis, 30, 1-11. ifsa.pdf
    • Hawkins, D.M., and Olive, D. (1999b), Applications and Algorithms for Least Trimmed Sum of Absolute Deviations Regression, Computational Statistics and Data Analysis, 32, 119-134. lta.pdf
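
The Cp screen mentioned for Olive and Hawkins (2005) above can be illustrated with a small sketch. This is not code from the paper: it assumes the usual definition Cp(I) = SSE(I)/MSE + 2k - n, where MSE comes from the full model and k counts the constant, and it uses simulated data and hypothetical helper names (sse, cp_table). It only shows which submodels pass the screens Cp(I) < 2k and Cp(I) < k.

```python
import itertools
import numpy as np

def sse(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def cp_table(X, y):
    """Mallows' Cp for every submodel that keeps the constant (column 0).

    Returns a list of (columns, k, Cp), where k counts the constant.
    Assumed formula: Cp(I) = SSE(I)/MSE_full + 2k - n.
    """
    n, p = X.shape
    mse_full = sse(X, y) / (n - p)
    rows = []
    for size in range(p):
        for extra in itertools.combinations(range(1, p), size):
            cols = (0,) + extra
            k = len(cols)
            cp = sse(X[:, list(cols)], y) / mse_full + 2 * k - n
            rows.append((cols, k, cp))
    return rows

# Simulated data (an assumption for illustration): only x1 and x2 matter.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y = 1.0 + 2.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(size=n)

table = cp_table(X, y)
passes_2k = [(cols, cp) for cols, k, cp in table if cp < 2 * k]
passes_k = [(cols, cp) for cols, k, cp in table if cp < k]

print("submodels with Cp(I) < 2k:", len(passes_2k))
print("submodels with Cp(I) < k: ", len(passes_k))
```

With this definition the full model always has Cp equal to the number of columns p, so it passes Cp(I) < 2k but sits on the boundary of Cp(I) < k. Any submodel that passes the stricter screen also passes the looser one.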


Copyright © 2005, Board of Trustees, Southern Illinois University