- Why the Rousseeuw Yohai Paradigm is One of the Largest
and Longest Running Scientific Hoaxes in History
hoax.pdf
-
- Talk on prediction regions and intervals
predtalk.pdf
-
- The following preprints have been submitted.
-
- Jin and Olive (2023), Large Sample Theory for Some Ridge-Type Regression Estimators
ridgetype.pdf
-
- Welagedara, Haile, and Olive (2024), ARIMA Model Selection and Prediction Intervals
tspi.pdf
-
- Olive, Alshammari, Pathiranage, and Hettige (2024), Testing with the One Component Partial Least Squares and the Marginal Maximum Likelihood Estimators
hdwls.pdf
-
- THE FOLLOWING PREPRINTS HAVE NOT YET BEEN SUBMITTED OR NEED TO BE RESUBMITTED.
-
- The following 3 preprints may be ready for submission by August 2025.
-
- Abid and Olive (2024), Some Simple High Dimensional One and Two Sample Tests
hd1samp.pdf
-
- Olive and Quaye (2024), Testing Poisson Regression and Related Models with the One Component Partial Least Squares Estimator
hdpois.pdf
-
- Olive (2024a), High Dimensional Binary Regression and Classification
hdbreg.pdf
-
- The following preprint may be ready for submission by August 2026.
-
- Olive (2024b), Testing Multivariate Linear Regression with Univariate OPLS Estimators
hdmreg.pdf
-
- The following preprint had too many ideas to be published in a major journal, but part of it resulted in the paper Olive (2018) below.
-
- Highest Density Region Prediction
hdrpred.pdf
-
- This preprint shows how to visualize several important survival regression
models in the background of the data.
- Plots for Survival Regression
sreg.pdf
-
- 1D Regression
onedreg.pdf
- Graphical Aids for Regression.
gaid.pdf
- A Simple Plot for Model Assessment
simp.pdf
-
- THE FOLLOWING PREPRINT HAS BEEN INCORPORATED IN
- Olive, D.J. (2017), Linear Regression and
- Olive, D.J. (2017), Robust Multivariate Analysis, two Springer texts,
- and in the Olive (2018) paper Applications of Hyperellipsoidal Prediction Regions (below).
-
- This preprint shows the equivalence between a prediction region and a confidence region that can easily be bootstrapped. This method can be used for hypothesis testing, for robust statistics, and after variable selection. See the Pelawa Watagoda and Olive (2021a,b) published papers below and the Rathnayake and Olive (2023) preprint for better theory.
- Bootstrapping Hypothesis Tests and Confidence Regions
vselboot.pdf
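- As a rough illustration only (not the preprint's own method): in one dimension, applying a prediction region to the bootstrap sample of a statistic reduces to taking an interval between quantiles of the bootstrap replicates, i.e. the familiar percentile interval. A minimal Python sketch, with all names and data invented here:

```python
import random
import statistics

def boot_percentile_ci(data, stat=statistics.mean, B=2000, conf=0.95, seed=0):
    """One-dimensional analog of applying a nonparametric prediction
    region to the bootstrap sample of a statistic: the interval between
    the lower and upper quantiles of the bootstrap replicates."""
    rng = random.Random(seed)
    n = len(data)
    # resample the data with replacement B times and recompute the statistic
    reps = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(B))
    lo_idx = int((1 - conf) / 2 * B)
    hi_idx = min(B - 1, int((1 + conf) / 2 * B))
    return reps[lo_idx], reps[hi_idx]

data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1, 2.0, 2.2]
lo, hi = boot_percentile_ci(data)
print(lo <= statistics.mean(data) <= hi)
```

The multivariate confidence regions of the preprint replace these quantiles with a cutoff on distances of the bootstrap replicates from their center.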
-
- THE FOLLOWING 4 PREPRINTS WERE INCORPORATED IN
- Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
-
- This paper gives the first easily computed estimators of multivariate
location and dispersion that have been shown to be sqrt(n) consistent
and highly outlier resistant.
- Olive, D.J., and Hawkins, D.M. (2010), Robust Multivariate Location and Dispersion
rmld.pdf
-
- This preprint shows how to improve low breakdown consistent regression estimators
and outlier resistant estimators that do not have theory. The resulting estimator
is the first easily computed regression estimator that has been shown
to be sqrt(n) consistent and high breakdown. The response plot
is very useful for detecting outliers.
- Olive, D.J., and Hawkins, D.M. (2011), Practical High Breakdown Regression
hbreg.pdf
-
- Olive, D.J. (2013), Robust Multivariate Linear Regression
robmreg.pdf
-
- Olive, D.J. (2014), Robust Principal Component Analysis
rpca.pdf
-
- THE FOLLOWING TWO PREPRINTS HAVE BEEN CITED BY OTHER
- AUTHORS, BUT WERE REVISED AND PUBLISHED.
-
- Chang, J., and Olive, D.J. (2007), Resistant Dimension Reduction
resdr.pdf
- was revised and published as Chang and Olive (2010).
- Applications of a Robust Dispersion Estimator
rcovm.pdf
- was revised and published as Zhang, Olive, and Ye (2012).
-
- THE FOLLOWING SIX PREPRINTS HAVE BEEN CITED BY OTHER AUTHORS.
-
- This paper shows that the bootstrap is not first order accurate
unless the number of bootstrap samples B is proportional to the sample size n.
For second order accuracy, B needs to be proportional to n^2. This was published
in Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY, ch. 9.
- Olive, D.J. (2011), The Number of Samples for Resampling Algorithms
resamp.pdf
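- The claim can be read as a rule of thumb for choosing B; in the sketch below only the growth rates come from the paper, while the proportionality constant and the floor are invented:

```python
def num_boot_samples(n, order=1, c=1.0, B_min=1000):
    """Rule of thumb following the paper's claim: take B proportional to n
    for first order accuracy and proportional to n^2 for second order
    accuracy.  The constant c and the floor B_min are invented here."""
    return max(B_min, int(c * n ** order))

print(num_boot_samples(50))             # small n: the floor B_min applies
print(num_boot_samples(5000))           # first order: B grows like n
print(num_boot_samples(5000, order=2))  # second order: B grows like n^2
```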
-
- This preprint provides some of the most important theory in the field of robust statistics.
The paper shows that a simple modification to the most used but inconsistent algorithms
for robust statistics results in easily computed sqrt(n) consistent, highly outlier resistant estimators.
It was converted into the Robust Multivariate Location and Dispersion and Practical High Breakdown Regression preprints
above. The material is in Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
-
- Olive, D.J., and Hawkins, D.M. (2008), High Breakdown Multivariate Estimators
hbrs.pdf
-
- The material in the following preprint is in
Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
- Olive, D.J., and Hawkins, D.M. (2007), Robustifying Robust Estimators, preprint
available from
ppconc.pdf
- For location scale families, estimators based on the median and MAD have optimal robustness
properties. He's cross-checking technique can be used to obtain an asymptotically efficient estimator.
- Olive, D.J. (2006), Robust Estimators for Transformed Location-Scale Families.
robloc.pdf
- The material in the following preprint is in
Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
- Olive, D.J. (2005), A Simple Confidence Interval for the Median, preprint
available from
ppmedci.pdf
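- For illustration, a classical distribution-free confidence interval for the median can be built from order statistics and the Binomial(n, 1/2) distribution; this is the standard sign-test interval, sketched here only to convey the idea, and the interval in the preprint may differ in its details:

```python
import math

def median_ci(data, conf=0.95):
    """Distribution-free CI for the population median from order
    statistics: widen the symmetric interval [X_(d), X_(n-d+1)] until its
    exact Binomial(n, 1/2) coverage reaches conf.  This is the classical
    sign-test interval, not necessarily the preprint's interval."""
    x = sorted(data)
    n = len(x)
    # P(X = k) for X ~ Binomial(n, 1/2)
    pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    d = n // 2
    while d > 1 and sum(pmf[d:n - d + 1]) < conf:
        d -= 1
    return x[d - 1], x[n - d]

print(median_ci(list(range(1, 11))))
```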
- The June 2008 ROBUST STATISTICS NOTES are below.
PLEASE CITE THIS WORK IF YOU USE IT. Much of this work is in
- Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
- Olive, D.J. (2008), Applied Robust Statistics,
preprint available from (http://parker.ad.siu.edu/Olive/run.pdf).
robnotes.pdf
-
- Web page with data sets and programs to go with the course notes.
robust.html
-
- MORE ONLINE COURSE NOTES.
-
- The next preprint simplifies large sample theory for the elastic net, ridge regression, and lasso. A new variable selection estimator with simple theory that is easy to bootstrap is given. Theory for 3 bootstrap confidence regions is given, and the coverage should be near the nominal level for the new estimator. Need to update ch. 4 variable selection material as in Rathnayake and Olive (2023).
- Jan. 2023: Webpage for second draft of Olive, D.J. (2023), Prediction and Statistical Learning.
- http://parker.ad.siu.edu/Olive/slearnbk.htm
-
- Need to incorporate Rathnayake and Olive (2023) variable selection material into Chapter 10.
- Jan. 2023: Webpage for first draft of Math 584 notes Olive, D.J. (2023), Theory for Linear Models.
- http://parker.ad.siu.edu/Olive/linmodbk.htm
-
- Need to update the variable selection material as in Rathnayake and Olive (2023).
- Jan. 2023: Webpage for first draft of Math 473 notes Olive, D.J. (2023), Survival Analysis. http://parker.ad.siu.edu/Olive/survbk.htm
-
- Jan. 2023: Webpage for first draft of Olive, D.J. (2023), Large Sample Theory.
http://parker.ad.siu.edu/Olive/lsampbk.htm
-
- Need to update ch. 10 variable selection material as in Rathnayake and Olive (2023).
- Jan. 2023: Webpage for first draft of Olive, D.J. (2023), Robust Statistics.
http://parker.ad.siu.edu/Olive/robbook.htm
-
- WEBPAGES AND COURSE NOTES TO GO WITH THREE PUBLISHED BOOKS
-
- TWO COMPETITORS FOR Casella and Berger (2002), Statistical Inference:
- Olive, D.J. (2008), A Course in Statistical Theory,
preprint available from (http://parker.ad.siu.edu/Olive/infer.htm).
infer.htm
- Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY.
- The Springer eBook is available on SpringerLink, Springer's online platform.
http://dx.doi.org/10.1007/978-3-319-04972-4
-
- TWO COMPETITORS FOR Kutner, Nachtsheim, Neter, and Li (2005), Applied Linear Statistical Models:
- Olive, D.J. (2010), Multiple Linear and 1D Regression Models,
preprint available from (http://parker.ad.siu.edu/Olive/regbk.htm).
regbk.htm
- Olive, D.J. (2017a), Linear Regression, Springer, New York, NY.
- The Springer eBook is available on SpringerLink, Springer's online platform.
http://dx.doi.org/10.1007/978-3-319-55252-1
-
- A COMPETITOR FOR Johnson and Wichern (2007), Applied Multivariate Analysis:
- Olive, D.J. (2017b), Robust Multivariate Analysis, Springer, New York, NY.
- The Springer eBook is available on SpringerLink, Springer's online platform.
https://link.springer.com/book/10.1007%2F978-3-319-68253-2
-
- Jan. 2013 1st draft of Robust Multivariate Analysis:
- http://parker.ad.siu.edu/Olive/multbk.htm
-
-
- Here are some rejected Letters to the Editor. An erratum should have been published.
-
- This slightly revised letter was sent to the Journal of Computational and Graphical Statistics about the latest Fake-MCD estimator of Hubert, Rousseeuw and Verdonck (2012). It pointed out that DetMCD is not the MCD estimator, that DetMCD has no theory, and that it will be a massive undertaking to modify the theory for concentration estimators in Olive and Hawkins (2010) to show whether DetMCD has any good properties.
- Fake MCD
fakemcd.pdf
-
- This letter was sent to the Annals of Statistics regarding the
Bali, Boente, Tyler and Wang (2011) bait and switch paper.
- Fake Projection Estimator
fakeproj.pdf
-
- This Letter was sent to The Annals of Statistics regarding the
Salibian-Barrera and Yohai (2008) bait and switch paper. After a rejection,
it was revised and sent to the American Statistician as a paper, but rejected.
- The Breakdown of Breakdown
bdbd.pdf
-
- THE NEXT 11 DOCUMENTS MAY BE OF MILD INTEREST, BUT WILL PROBABLY NEVER BE PUBLISHED.
-
- The following 5 preprints have been incorporated into the published paper
Olive (2013) "Plots for Generalized Additive Models."
- Response Transformations for Models with Additive Errors
rtrans.pdf
- Response Plots and Related Plots for Regression
rplot.pdf
- Response Plots for Linear Models
lm.pdf
- Response Plots for Experimental Design
rploted.pdf
- Plots for Binomial and Poisson Regression
gfit.pdf
-
- Comments on Breakdown
bkdn.pdf
- Abuhassan, H. and Olive, D.J. (2008), Inference for the Pareto, Half Normal and
Related Distributions.
std.pdf
-
- (long version of) Robustifying Robust Estimators
lconc.pdf
- Prediction intervals in the presence of outliers
pi.pdf
- This 1996 result grew into a 2002 JASA discussion paper.
dense.pdf
- This 1997 result on partitioning may be of mild interest.
part.pdf
-
- THIS IS MY PhD DISSERTATION: Olive, D.J. (1998), Applied Robust Statistics,
Ph.D. Thesis, University of Minnesota. It shows my 1998 ideas on Robust Statistics.
The Figures are missing and the page numbers differ from the original dissertation.
arsdiss.pdf
-
- THE FOLLOWING ARE PREPRINTS OF PUBLISHED OR ACCEPTED PAPERS.
-
- The following preprint greatly increases the scope of data splitting for regression, and finds the large sample theory for OPLS.
-
- Olive, D.J., and Zhang, L. (2024), One Component Partial Least Squares, High Dimensional Regression, Data Splitting, and the Multitude of Models,
Communications in Statistics: Theory and Methods, to appear.
opls.pdf
-
- The following preprint gives the large sample theory for some ARMA model selection estimators. The preprint also shows how to use bootstrap confidence regions for hypothesis testing.
-
- Haile and Olive (2024), Bootstrapping ARMA Time Series Models after Model Selection, Communications in Statistics: Theory and Methods, 53, 8255-8270.
tsboot.pdf
-
- Welagedara, W.A.D.M. and Olive, D.J. (2024), Calibrating and Visualizing Some Bootstrap Confidence Regions, Axioms, 13(10), 659.
-
- This paper shows how to get better cutoffs for many common tests,
and gives a new weighted least squares method.
-
- Rajapaksha, K.W.G.D.H. and Olive, D.J. (2024), Wald Type Tests with the Wrong Dispersion Matrix, Communications in Statistics: Theory and Methods, 53, 2236-2251.
waldtype.pdf
-
- The following preprint gives a data splitting prediction region
and shows how to predict the random walk.
-
- Haile, Zhang, and Olive (2024), Predicting Random Walks and
a Data Splitting Prediction Region, Stats, 7(1), 23-33.
rwalkpi.pdf
-
- This paper gives the large sample theory for many variable selection estimators for several important regression models. A new estimator that does not have selection bias is given. The preprint also shows how to use bootstrap confidence regions for hypothesis testing for the usual and new variable selection estimators.
-
- Rathnayake, R.C. and Olive, D.J. (2023), Bootstrapping Some GLM and Survival Regression Variable Selection Estimators, Communications in Statistics: Theory and Methods, 52, 2625-2645.
bootglm.pdf
-
R code:
Rcodebootglm.pdf
-
-
- This paper shows how to get prediction intervals for a large class of parametric regression models such as GLMs, GAMs, and survival regression models. The PIs can work after variable selection and if the number of predictors is larger than the sample size.
-
- Olive, D.J, Rathnayake, R.C., and Haile, M.G. (2022), Prediction Intervals for GLMs, GAMs, and Some Survival Regression Models, Communications in Statistics: Theory and Methods, 51, 8012-8026.
pigam.pdf
R code:
Rcodepigam.pdf
-
- This paper gives prediction intervals that can be useful when
the sample size is less than the number of variables. These prediction intervals are useful for comparing shrinkage estimators like forward selection and lasso. Large sample theory for lasso, the elastic net, and ridge regression is simplified. New large sample theory for many OLS variable selection estimators is given. The theory shows that lasso variable selection is sqrt(n) consistent when lasso is consistent.
-
- Pelawa Watagoda, L.C.R. and Olive, D.J. (2021b), Comparing Six Shrinkage Estimators With Large Sample Theory and Asymptotically Optimal Prediction Intervals, Statistical Papers, 62, 2407-2431.
picomp.pdf
-
- This paper gives theory for three useful bootstrap confidence regions. We use betahatImin0 to denote the variable selection estimator, but we are using the usual estimator betahatVS and a new estimator betahatMIX; the paper would be clearer if context were not needed to decide which estimator betahatImin0 denotes. The large sample theory for betahatMIX is derived, and it is asymptotically equivalent to that of betahatVS only under strong regularity conditions. See the above paper and Rathnayake and Olive (2020) for better theory.
-
- Pelawa Watagoda, L.C.R. and Olive, D.J. (2021a), Bootstrapping Multiple Linear Regression After Variable Selection, Statistical Papers, 62, 681-700.
piboottest.pdf
-
- This paper shows how to bootstrap analogs of the one way MANOVA model where we
do not assume equal covariance matrices.
-
- Rupasinghe Arachchige Don, H.S., and Olive, D.J. (2019), Bootstrapping Analogs of
the One Way MANOVA Test, Communications in Statistics: Theory and Methods, 48, 5546-5558.
manova.pdf
-
- This paper shows that applying the Olive (2013b) nonparametric prediction region to
a bootstrap sample can result in a confidence region, and applying the prediction
region to Yhat_f + e_i, where the e_i are residual vectors, results in a nonparametric
prediction region for a future response vector Y_f for multivariate regression.
-
- Olive, D.J. (2018), Applications of Hyperellipsoidal Prediction Regions,
Statistical Papers, 59, 913-931.
hpred.pdf
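- The nonparametric prediction region idea can be sketched in the bivariate case as follows. This is a bare-bones illustration without the correction factors of the published paper, all names are invented here, and the data are simulated:

```python
import random

def mahal_sq_2d(pts, x):
    """Squared Mahalanobis distance of a 2D point x from the sample mean
    and sample covariance of pts (the 2x2 inverse is written out by hand
    to keep the sketch dependency-free)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mx, x[1] - my
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def region_cutoff(pts, conf=0.95):
    """Cutoff for a hyperellipsoidal prediction region: the conf quantile
    of the points' own squared distances (no finite-sample correction)."""
    d2 = sorted(mahal_sq_2d(pts, p) for p in pts)
    return d2[min(len(d2) - 1, int(conf * len(d2)))]

rng = random.Random(1)
pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
cut = region_cutoff(pts)
print(mahal_sq_2d(pts, (0.0, 0.0)) <= cut)    # near the center: inside
print(mahal_sq_2d(pts, (10.0, 10.0)) <= cut)  # far away: outside
```

Applying the same cutoff to distances of bootstrap replicates from their center gives the confidence region application, and applying it to Yhat_f + e_i gives the prediction region for a future response vector.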
-
- Olive, D.J., Pelawa Watagoda, L.C.R., and Rupasinghe Arachchige Don, H.S. (2015),
Visualizing and Testing the Multivariate Linear Regression Model,
International Journal of Statistics and Probability, 4, 126-137.
vtmreg.pdf
-
- This paper gives response plots, plots for response transformations
and plots for detecting overdispersion for GAMs and GLMs.
- Olive, D.J. (2013a), Plots for Generalized Additive Models, Communications in
Statistics: Theory and Methods, 42, 2610-2628.
gam.pdf
R/Splus code:
gamcode.txt
-
- Olive, D.J. (2013b), Asymptotically Optimal Regression Prediction Intervals and Prediction Regions
for Multivariate Data, International Journal of Statistics and Probability, 2, 90-100.
apred.pdf
-
- This paper describes the sqrt(n) consistent highly outlier resistant
FCH, RFCH and RMVN estimators and gives an application for canonical correlation analysis.
- Zhang, J., Olive, D.J., and Ye, P. (2012), Robust Covariance Matrix Estimation with
Canonical Correlation Analysis, International Journal of Statistics and Probability, 1, 119-136.
rcca.pdf
-
- This paper shows that OLS partial F tests, originally meant for multiple linear
regression, are useful for exploratory purposes for a much larger class of
models, including generalized linear models and single index models.
- Chang, J. and Olive, D.J. (2010), OLS for 1D Regression Models, Communications in
Statistics: Theory and Methods, 39, 1869-1882.
sindx.pdf
-
- Olive, D.J. and Hawkins, D.M. (2007), Behavior of Elemental Sets in Regression,
Statistics and Probability Letters, 77, 621-624.
elem.pdf
-
- This paper shows how to construct asymptotically optimal prediction intervals
for regression models of the form Y = m(x) + e. The errors need to be iid from a unimodal
distribution, and the emphasis is on linear regression.
- Olive, D.J. (2007), Prediction Intervals for Regression Models, Computational Statistics
and Data Analysis, 51, 3115-3122.
spi.pdf
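- The quantile idea behind such intervals can be sketched as follows; the paper's interval includes a finite-sample correction factor that this illustration omits, and the toy fit below (predicting with the sample mean) is invented for the example:

```python
def residual_pi(y, yhat, y_f_hat, conf=0.95):
    """Prediction interval for a new response under Y = m(x) + e by
    shifting the prediction y_f_hat by quantiles of the residuals.
    Simplified sketch only: no finite-sample correction factor."""
    res = sorted(yi - fi for yi, fi in zip(y, yhat))
    n = len(res)
    a = (1 - conf) / 2
    lo = res[max(0, int(a * n))]          # lower residual quantile
    hi = res[min(n - 1, int((1 - a) * n))]  # upper residual quantile
    return y_f_hat + lo, y_f_hat + hi

# toy "fit": m(x) is estimated by the sample mean for every case
y = [3.1, 2.8, 3.4, 3.0, 2.9, 3.2, 3.3, 2.7, 3.0, 3.1]
ybar = sum(y) / len(y)
lo, hi = residual_pi(y, [ybar] * len(y), ybar)
print(lo < ybar < hi)
```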
-
- This paper shows that the variable selection software originally meant
for multiple linear regression gives useful results for a much larger class of
models, including generalized linear models and single index models, if the Mallows' Cp criterion is used.
For models I with k predictors, the screen Cp(I) < 2k is much
more effective than the screen Cp(I) < k. Use response plots to show that the
final submodel is similar to the original full model.
- Olive, D.J. and Hawkins, D.M. (2005), Variable Selection for 1D Regression Models,
Technometrics, 47, 43-50.
varsel.pdf
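- The Cp screen itself is a one-line filter over candidate submodels; a hedged sketch, with the candidate submodels and their Cp values invented for illustration:

```python
def cp_screen(candidates, mult=2.0):
    """Keep submodels I with k predictors satisfying Cp(I) < mult * k.
    The paper argues the screen Cp(I) < 2k is much more effective than
    Cp(I) < k.  Candidates are (name, k, cp) tuples; the data below are
    invented."""
    return [name for name, k, cp in candidates if cp < mult * k]

cands = [("I1", 2, 3.5), ("I2", 3, 7.0), ("I3", 4, 6.1)]
print(cp_screen(cands))        # screen Cp(I) < 2k
print(cp_screen(cands, 1.0))   # stricter screen Cp(I) < k
```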
-
- Olive, D.J. (2005), Two Simple Resistant Regression Estimators, Computational Statistics
and Data Analysis, 49, 809-819.
mba.pdf
-
- The MBA estimator is not as good as the FCH estimator in "High Breakdown Robust
Estimators," but was the first easily computed estimator of multivariate
location and dispersion shown (in 2004) to be sqrt(n) consistent and highly outlier resistant.
See "Robustifying Robust Estimators" or "Applied Robust Statistics" for proofs.
- Olive, D.J. (2004a), A Resistant Estimator of Multivariate Location and Dispersion,
Computational Statistics and Data Analysis, 46, 99-102.
rcov.pdf
-
- The following paper suggests ways to robustify regression techniques for single index models
and sliced inverse regression.
- Olive, D.J. (2004b), Visualizing 1D Regression, in
Theory and Applications of Recent Robust Methods, edited by M. Hubert, G. Pison, A. Struyf and
S. Van Aelst, Series: Statistics for Industry and Technology, Birkhauser, Basel, 221-233.
vreg.pdf
-
- Olive, D.J., and Hawkins, D.M. (2003), Robust Regression with High Coverage,
Statistics and Probability Letters, 63, 259-266.
hcov.pdf
-
- The following paper provides a simultaneous diagnostic for whether the data
follows a multivariate normal distribution or some other elliptically contoured distribution.
It also provides a nice way to estimate and visualize single index models.
- Olive, D.J. (2002), Applications of Robust Distances for Regression, Technometrics, 44, 64-71.
rdist.pdf
-
- The following paper gives extremely important theoretical results.
It shows that software implementations for estimators of robust regression and
robust multivariate location and dispersion tend to be inconsistent with zero breakdown value.
The commonly used elemental basic resampling algorithm draws K elemental sets. Each
elemental fit is inconsistent, so the final estimator is inconsistent, regardless
of how the algorithm chooses the elemental fit.
The CM, GS, LMS, LQD, LTS, maximum depth, MCD, MVE, one step GM and GR, projection, S, tau,
t type, and many other robust estimators are of little applied interest because they are
impractical to compute. The "Robustifying Robust Estimators" paper shows how to modify some
algorithms so that the resulting regression estimators are easily computed sqrt(n) consistent
high breakdown estimators and the resulting multivariate location and dispersion estimators
are sqrt(n) consistent with high outlier resistance.
- Hawkins, D.M., and Olive, D.J. (2002), Inconsistency of Resampling Algorithms for High
Breakdown Regression Estimators and a New Algorithm (with discussion), Journal of the American
Statistical Association, 97, 136-148.
incon.pdf
-
- This paper gives a graphical method for estimating response transformations
that can be used to complement or replace the numerical Box-Cox method.
- Cook, R.D., and Olive, D.J. (2001), A Note on Visualizing Response Transformations,
Technometrics, 43, 443-449.
resp.pdf
-
- Olive, D.J. (2001), High Breakdown Analogs of the Trimmed Mean, Statistics and Probability
Letters, 51, 87-92.
rloc.pdf
-
- Hawkins, D.M., and Olive, D.J. (1999a), Improved Feasible Solution Algorithms for
High Breakdown Estimation, Computational Statistics and Data Analysis, 30, 1-11.
ifsa.pdf
-
- Hawkins, D.M., and Olive, D. (1999b), Applications and Algorithms for Least Trimmed Sum
of Absolute Deviations Regression, Computational Statistics and Data Analysis, 32, 119-134.
lta.pdf