Tally test statistics — statTally • mosaic

Tally test statistics from data and from multiple draws from a simulated null distribution

statTally(
  sample,
  rdata,
  FUN,
  direction = NULL,
  alternative = c("default", "two.sided", "less", "greater"),
  sig.level = 0.1,
  system = c("gg", "lattice"),
  shade = "navy",
  alpha = 0.1,
  binwidth = NULL,
  bins = NULL,
  fill = "gray80",
  color = "black",
  center = NULL,
  stemplot = dim(rdata)[direction] < 201,
  q = c(0.5, 0.9, 0.95, 0.99),
  fun = function(x) x,
  xlim,
  quiet = FALSE,
  ...
)

Arguments

sample: sample data
rdata: a matrix of randomly generated data under null hypothesis.
FUN: a function that computes the test statistic from a data set. The default value does nothing, making it easy to use this to tabulate precomputed statistics into a null distribution. See the examples.
direction: 1 or 2 indicating whether samples in rdata are in rows (1) or columns (2).
alternative: one of default, two.sided, less, or greater
sig.level: significance threshold for wilcox.test used to detect lack of symmetry
system: graphics system to use for the plot
shade: a color to use for shading.
alpha: opacity of shading.
binwidth: bin width for histogram.
bins: number of bins for histogram.
fill: fill color for histogram.
color: border color for histogram.
center: center of null distribution
stemplot: indicates whether a stem plot should be displayed
q: quantiles of sampling distribution to display
fun: same as FUN so you don't have to remember if it should be capitalized
xlim: limits for the horizontal axis of the plot.
quiet: a logicial indicating whether the text output should be suppressed
...: additional arguments passed to lattice::histogram() or ggplot2::geom_histogram()

Value

A lattice or ggplot showing the sampling distribution.

As side effects, information about the empirical sampling distribution and (optionally) a stem plot are printed to the screen.

Examples

# is my spinner fair?
x <- c(10, 18, 9, 15)   # counts in four cells
rdata <- rmultinom(999, sum(x), prob = rep(.25, 4))
statTally(x, rdata, fun = max, binwidth = 1)  # unusual test statistic
#> 
#> Null distribution appears to be asymmetric. (p = 0.00144) 
#> 
#> Test statistic applied to sample data =  18 
#> 
#> Quantiles of test statistic applied to random data: 
#> 50% 90% 95% 99% 
#>  17  20  21  23  
#> 
#> Of the  1000  samples (1 original +  999  random), 
#> 	 131 ( 13.1 % ) had test stats = 18 
#> 	 321 ( 32.1 % ) had test stats >= 18 

statTally(x, rdata, fun = var, shade = "red", binwidth = 2)  # equivalent to chi-squared test
#> 
#> Null distribution appears to be asymmetric. (p = 7.94e-06) 
#> 
#> Test statistic applied to sample data =  18 
#> 
#> Quantiles of test statistic applied to random data: 
#>      50%      90%      95%      99% 
#> 10.66667 28.00000 36.66667 51.33333  
#> 
#> Of the  1000  samples (1 original +  999  random), 
#> 	 24 ( 2.4 % ) had test stats = 18 
#> 	 261 ( 26.1 % ) had test stats >= 18 

# Can also be used with test stats that are precomputed.
if (require(mosaicData)) {
D <- diffmean( age ~ sex, data = HELPrct); D
nullDist <- do(999) * diffmean( age ~ shuffle(sex), data = HELPrct)
statTally(D, nullDist)
statTally(D, nullDist, system = "lattice")
}
#> Using parallel package.
#>   * Set seed with set.rseed().
#>   * Disable this message with options(`mosaic:parallelMessage` = FALSE)
#> 
#> Null distribution appears to be symmetric. (p =  0.894 ) 
#> 
#> Test statistic applied to sample data =  -0.7841 
#> 
#> Quantiles of test statistic applied to random data: 
#>         50%         90%         95%         99% 
#> -0.06220626  1.02679488  1.34615364  1.81014154  
#> 
#> Of the  1000  samples (1 original +  999  random), 
#> 	 9 ( 0.9 % ) had test stats = -0.7841 
#> 	 205 ( 20.5 % ) had test stats <= -0.7841 
#> 	 194 ( 19.4 % ) had test stats >= 0.6597 
#> 
#> Null distribution appears to be symmetric. (p =  0.894 ) 
#> 
#> Test statistic applied to sample data =  -0.7841 
#> 
#> Quantiles of test statistic applied to random data: 
#>         50%         90%         95%         99% 
#> -0.06220626  1.02679488  1.34615364  1.81014154  
#> 
#> Of the  1000  samples (1 original +  999  random), 
#> 	 9 ( 0.9 % ) had test stats = -0.7841 
#> 	 205 ( 20.5 % ) had test stats <= -0.7841 
#> 	 194 ( 19.4 % ) had test stats >= 0.6597