Performs single or double bootstrap (or bootknife) resampling and calculates confidence intervals. -- Function File: CI = bootci (NBOOT, BOOTFUN, D) -- Function File: CI = bootci (NBOOT, BOOTFUN, D1,...,DN) -- Function File: CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE) -- Function File: CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE) -- Function File: CI = bootci (...,'type', TYPE) -- Function File: CI = bootci (...,'type', 'stud', 'nbootstd', NBOOTSTD) -- Function File: CI = bootci (...,'type', 'cal', 'nbootcal', NBOOTCAL) -- Function File: CI = bootci (...,'alpha', ALPHA) -- Function File: CI = bootci (...,'strata', STRATA) -- Function File: CI = bootci (...,'loo', LOO) -- Function File: CI = bootci (...,'seed', SEED) -- Function File: CI = bootci (...,'Options', PAROPT) -- Function File: [CI, BOOTSTAT] = bootci (...) 'CI = bootci (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples from the rows of a data sample D and returns 95% confidence intervals (CI) for the bootstrap statistics computed by BOOTFUN [1]. BOOTFUN is a function handle (e.g. specified with @), or a string indicating the function name. The third input argument, data D (a column vector or a matrix), is used as input for BOOTFUN. The bootstrap resampling method yields first-order balance [2-3]. 'CI = bootci (NBOOT, BOOTFUN, D1,...,DN)' is as above except that the third and subsequent numeric input arguments are data vectors that are used to create inputs for bootfun. 'CI = bootci (NBOOT, {BOOTFUN, D}, NAME, VALUE)' is as above but includes setting optional parameters using Name-Value pairs. 'CI = bootci (NBOOT, {BOOTFUN, D1, ..., DN}, NAME, VALUE)' is as above but includes setting optional parameters using NAME-VALUE pairs. bootci can take a number of optional parameters as NAME-VALUE pairs: 'CI = bootci (..., 'alpha', ALPHA)' where ALPHA sets the lower and upper bounds of the confidence interval(s). The value of ALPHA must be between 0 and 1. The nominal lower and upper percentiles of the confidence intervals CI are then 100*(ALPHA/2)% and 100*(1-ALPHA/2)% respectively, and nominal central coverage of the intervals is 100*(1-ALPHA)%. The default value of ALPHA is 0.05. 'CI = bootci (..., 'type', TYPE)' computes bootstrap confidence interval CI using one of the following methods: <> 'norm' or 'normal': Using bootstrap bias and standard error [4]. <> 'per' or 'percentile': Percentile method [1,4]. <> 'basic': Basic bootstrap method [1,4]. <> 'bca': Bias-corrected and accelerated method [5,6] (Default). <> 'stud' or 'student': Studentized bootstrap (bootstrap-t) [1,4]. <> 'cal': Calibrated percentile method (by double bootstrap [7]). Note that when BOOTFUN is the mean, percentile, basic and bca intervals are automatically expanded using Student's t-distribution in order to improve coverage for small samples [8]. The bootstrap-t method includes an additive correction to stabilize the variance when the sample size is small [9]. 'CI = bootci (..., 'type', 'stud', 'nbootstd', NBOOTSTD)' computes the Studentized bootstrap confidence intervals CI, with the standard errors of the bootstrap statistics estimated automatically using resampling methods. NBOOTSTD is a positive integer value > 0 defining the number of resamples. Standard errors are computed using NBOOTSTD bootstrap resamples. The default value of NBOOTSTD is 100. 'CI = bootci (..., 'type', 'cal', 'nbootcal', NBOOTCAL)' computes the calibrated percentile bootstrap confidence intervals CI, with the calibrated percentiles of the bootstrap statistics estimated from NBOOTCAL bootstrap data samples. NBOOTCAL is a positive integer value. The default value of NBOOTCAL is 199. 'CI = bootci (..., 'strata', STRATA)' sets STRATA, which are identifiers that define the grouping of the DATA rows for stratified bootstrap resampling. STRATA should be a column vector or cell array with the same number of rows as the DATA. 'CI = bootci (..., 'loo', LOO)' is a logical scalar that specifies whether the resamples of size n should be obtained by sampling from the original data (false) or from Leave-One-Out (LOO) jackknife samples (true) of the data - otherwise known as bootknife resampling [10]. Default is false. 'CI = bootci (..., 'seed', SEED)' initialises the Mersenne Twister random number generator using an integer SEED value so that bootci results are reproducible. 'CI = bootci (..., 'Options', PAROPT)' specifies options that govern if and how to perform bootstrap iterations using multiple processors (if the Parallel Computing Toolbox or Octave Parallel package is available). This argument is a structure with the following recognised fields: <> 'UseParallel': If true, use parallel processes to accelerate bootstrap computations on multicore machines, specifically non-vectorized function evaluations, double bootstrap resampling and jackknife function evaluations. Default is false for serial computation. In MATLAB, the default is true if a parallel pool has already been started. <> 'nproc': nproc sets the number of parallel processes '[CI, BOOTSTAT] = bootci (...)' also returns the bootstrap statistics used to calculate the confidence intervals CI. '[CI, BOOTSTAT, BOOTSAM] = bootci (...)' also returns BOOTSAM, a matrix of indices from the bootstrap. Each column in BOOTSAM corresponds to one bootstrap sample and contains the row indices of the values drawn from the nonscalar data argument to create that sample. Bibliography: [1] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. New York, NY: Chapman & Hall [2] Davison et al. (1986) Efficient Bootstrap Simulation. Biometrika, 73: 555-66 [3] Booth, Hall and Wood (1993) Balanced Importance Resampling for the Bootstrap. The Annals of Statistics. 21(1):286-298 [4] Davison and Hinkley (1997) Bootstrap Methods and their Application. (Cambridge University Press) [5] Efron (1987) Better Bootstrap Confidence Intervals. JASA, 82(397): 171-185 [6] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. New York, NY: Chapman & Hall [7] Hall, Lee and Young (2000) Importance of interpolation when constructing double-bootstrap confidence intervals. Journal of the Royal Statistical Society. Series B. 62(3): 479-491 [8] Hesterberg, Tim (2014), What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum, http://arxiv.org/abs/1411.5279 [9] Polansky (2000) Stabilizing bootstrap-t confidence intervals for small samples. Can J Stat. 28(3):501-516 [10] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling vs. Smoothing; Proceedings of the Section on Statistics & the Environment. Alexandria, VA: American Statistical Association. bootci (version 2023.07.04) Author: Andrew Charles Penn https://www.researchgate.net/profile/Andrew_Penn/ Copyright 2019 Andrew Charles Penn This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 95% BCa bootstrap confidence intervals for the mean ci = bootci (1999, @mean, data)
Produces the following output
ci = 23.616 34.358
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 95% calibrated percentile bootstrap confidence intervals for the mean ci = bootci (1999, {@mean, data}, 'type', 'cal','nbootcal',199) ## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 23.975 34.269
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 95% calibrated percentile bootstrap confidence intervals for the median ## with smoothing ci = bootci (1999, {@smoothmedian, data}, 'type', 'cal', 'nbootcal', 199) ## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 25.040 36.477
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 90% percentile bootstrap confidence intervals for the variance ci = bootci (1999, {{@var,1}, data}, 'type', 'per', 'alpha', 0.1)
Produces the following output
ci = 96.629 235.910
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 90% BCa bootstrap confidence intervals for the variance ci = bootci (1999, {{@var,1}, data}, 'type', 'bca', 'alpha', 0.1)
Produces the following output
ci = 117.01 260.73
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41]'; ## 90% Studentized bootstrap confidence intervals for the variance ci = bootci (1999, {{@var,1}, data}, 'type', 'stud', ... 'nbootstd', 50, 'alpha', 0.1) ## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 108.55 297.71
The following code
## Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## 90% calibrated percentile bootstrap confidence intervals for the variance ci = bootci (1999, {{@var,1}, data}, 'type', 'cal', 'nbootcal', ... 199, 'alpha', 0.1) ## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 111.53 268.13
The following code
## Input bivariate dataset x = [2.12,4.35,3.39,2.51,4.04,5.1,3.77,3.35,4.1,3.35, ... 4.15,3.56, 3.39,1.88,2.56,2.96,2.49,3.03,2.66,3].'; y = [2.47,4.61,5.26,3.02,6.36,5.93,3.93,4.09,4.88,3.81, ... 4.74,3.29,5.55,2.82,4.23,3.23,2.56,4.31,4.37,2.4].'; ## 95% BCa bootstrap confidence intervals for the correlation coefficient ci = bootci (1999, @cor, x, y) ## Please be patient, the calculations will be completed soon...
Produces the following output
ci = 0.5050 0.8633
The following code
## Spatial Test Data from Table 14.1 of Efron and Tibshirani (1993) ## An Introduction to the Bootstrap in Monographs on Statistics and Applied ## Probability 57 (Springer) ## AIM: ## To construct 90% nonparametric bootstrap confidence intervals for var(A,1) ## var(A,1) = 171.5 ## Exact intervals based on Normal theory are [118.4, 305.2]. ## Calculations using Matlab's 'Statistics and Machine Learning toolbox' ## (R2020b) ## ## A = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... ## 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## varfun = @(A) var(A, 1); ## rng('default'); % For reproducibility ## rng('default'); ci1 = bootci (19999,{varfun,A},'alpha',0.1,'type','norm'); ## rng('default'); ci2 = bootci (19999,{varfun,A},'alpha',0.1,'type','per'); ## rng('default'); ci4 = bootci (19999,{varfun,A},'alpha',0.1,'type','bca'); ## rng('default'); ci5 = bootci (19999,{varfun,A},'alpha',0.1,'type','stud'); ## ## Summary of results from Matlab's 'Statistics and Machine Learning toolbox' ## (R2020b) ## ## method | 0.05 | 0.95 | length | shape | ## -------------------|--------|--------|--------|-------| ## ci1 - normal | 108.9 | 247.4 | 138.5 | 1.21 | ## ci2 - percentile | 97.6 | 235.8 | 138.2 | 0.87 | ## ci4 - BCa | 114.9 | 260.5 | 145.6 | 1.57 |* ## ci5 - bootstrap-t | 46.7 | 232.5 | 185.8 | 0.49 |** ## -------------------|--------|--------|--------|-------| ## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 | ## ## * Bug in the fx0 subfunction of bootci ## ** Bug in the bootstud subfunction of bootci ## Calculations using the 'boot' and 'bootstrap' packages in R ## ## library (boot) # Functions from Davison and Hinkley (1997) ## A <- c(48,36,20,29,42,42,20,42,22,41,45,14,6, ... ## 0,33,28,34,4,32,24,47,41,24,26,30,41); ## n <- length(A) ## var.fun <- function (d, i) { ## # Function to compute the population variance ## n <- length (d); ## return (var (d[i]) * (n - 1) / n) }; ## boot.fun <- function (d, i) { ## # Compute the estimate ## t <- var.fun (d, i); ## # Compute sampling variance of the estimate using Tukey's jackknife ## n <- length (d); ## U <- empinf (data=d[i], statistic=var.fun, type="jack", stype="i"); ## var.t <- sum (U^2 / (n * (n - 1))); ## return ( c(t, var.t) ) }; ## set.seed(1) ## var.boot <- boot (data=A, statistic=boot.fun, R=19999, sim='balanced') ## ci1 <- boot.ci (var.boot, conf=0.90, type="norm") ## ci2 <- boot.ci (var.boot, conf=0.90, type="perc") ## ci3 <- boot.ci (var.boot, conf=0.90, type="basic") ## ci4 <- boot.ci (var.boot, conf=0.90, type="bca") ## ci5 <- boot.ci (var.boot, conf=0.90, type="stud") ## ## library (bootstrap) # Functions from Efron and Tibshirani (1993) ## set.seed(1); ## ci4a <- bcanon (A, 19999, var.fun, alpha=c(0.05,0.95)) ## set.seed(1); ## ci5a <- boott (A, var.fun, nboott=19999, nbootsd=499, perc=c(.05,.95)) ## ## Summary of results from 'boot' and 'bootstrap' packages in R ## ## method | 0.05 | 0.95 | length | shape | ## -------------------|--------|--------|--------|-------| ## ci1 - normal | 109.6 | 246.7 | 137.1 | 1.22 | ## ci2 - percentile | 97.9 | 234.8 | 136.9 | 0.86 | ## ci3 - basic | 108.3 | 245.1 | 136.8 | 1.16 | ## ci4 - BCa | 116.0 | 260.7 | 144.7 | 1.60 | ## ci4a - BCa | 115.8 | 260.6 | 144.8 | 1.60 | ## ci5 - bootstrap-t | 112.0 | 291.8 | 179.8 | 2.02 | ## ci5a - bootstrap-t | 116.1 | 290.9 | 174.8 | 2.16 | ## -------------------|--------|--------|--------|-------| ## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 | ## Calculations using the 'statistics-resampling' package for Octave/Matlab ## ## A = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... ## 0 33 28 34 4 32 24 47 41 24 26 30 41].'; ## ci1 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','norm','seed',1); ## ci2 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','per','seed',1); ## ci3 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','basic','seed',1); ## ci4 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','bca','seed',1); ## ci5 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','stud',... ## 'nbootstd',100,'seed',1); ## ci6 = bootci (19999,{{@var,1},A},'alpha',0.1,'type','cal', ... ## 'nbootcal',499,'seed',1); ## ## Summary of results from 'statistics-resampling' package for Octave/Matlab ## ## method | 0.05 | 0.95 | length | shape | ## -------------------|--------|--------|--------|-------| ## ci1 - normal | 110.1 | 246.2 | 136.1 | 1.22 | ## ci2 - percentile | 98.1 | 234.7 | 136.6 | 0.86 | ## ci3 - basic | 108.4 | 245.0 | 136.1 | 1.17 | ## ci4 - BCa | 116.1 | 259.3 | 143.2 | 1.59 | ## ci5 - bootstrap-t | 114.0 | 290.3 | 176.3 | 2.07 | ## ci6 - calibrated | 115.3 | 276.4 | 161.1 | 1.87 | ## -------------------|--------|--------|--------|-------| ## parametric - exact | 118.4 | 305.2 | 186.8 | 2.52 | ## ## Simulation results for constructing 90% confidence intervals for the ## variance of a population N(0,1) from 1000 random samples of size 26 ## (analagous to the situation above). Simulation performed using the ## bootsim script with nboot of 1999. ## ## method | coverage | lower | upper | length | shape | ## ---------------------|----------|--------|--------|--------|-------| ## normal | 81.5% | 3.0% | 15.5% | 0.77 | 1.21 | ## percentile | 81.5% | 0.9% | 17.6% | 0.76 | 0.91 | ## basic | 81.1% | 2.5% | 16.4% | 0.78 | 1.09 | ## BCa | 84.2% | 5.4% | 10.4% | 0.86 | 1.82 | ## bootstrap-t | 89.2% | 4.3% | 6.5% | 0.99 | 2.15 | ## calibrated | 87.4% | 4.2% | 8.4% | 0.91 | 2.03 | ## ---------------------|----------|--------|--------|--------|-------| ## parametric - exact | 90.8% | 3.7% | 5.5% | 0.99 | 2.52 |
gives an example of how 'bootci' is used.
Package: statistics-resampling