Commit

clean up a bit, including historical remarks
git-svn-id: https://svn.r-project.org/R/trunk@87470 00db46b3-68df-0310-9c12-caf00c1e9a41
ripley committed Dec 27, 2024
1 parent 77699b5 commit f39c4f6
Showing 2 changed files with 24 additions and 31 deletions.
29 changes: 11 additions & 18 deletions src/library/stats/man/fisher.test.Rd
@@ -1,6 +1,6 @@
% File src/library/stats/man/fisher.test.Rd
% Part of the R package, https://www.R-project.org
-% Copyright 1995-2022 R Core Team
+% Copyright 1995-2024 R Core Team
% Distributed under GPL 2 or later

\name{fisher.test}
@@ -24,17 +24,17 @@ fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
\item{workspace}{an integer specifying the size of the workspace
used in the network algorithm. In units of 4 bytes. Only used for
non-simulated p-values larger than \eqn{2 \times 2}{2 by 2} tables.
-Since \R version 3.5.0, this also increases the internal stack size
-which allows larger problems to be solved, however sometimes needing
-hours. In such cases, \code{simulate.p.values=TRUE} may be more
+This also increases the internal stack size
+which allows larger problems to be solved, sometimes needing
+hours. In such cases, \code{simulate.p.values = TRUE} may be more
reasonable.}
\item{hybrid}{a logical. Only used for larger than \eqn{2 \times 2}{2 by 2}
tables, in which cases it indicates whether the exact probabilities
(default) or a hybrid approximation thereof should be computed.}
\item{hybridPars}{a numeric vector of length 3, by default describing
\dQuote{Cochran's conditions} for the validity of the chi-squared
approximation, see \sQuote{Details}.}
-\item{control}{a list with named components for low level algorithm
+\item{control}{a list with named components for low-level algorithm
control. At present the only one used is \code{"mult"}, a positive
integer \eqn{\ge 2} with default 30 used only for larger than
\eqn{2 \times 2}{2 by 2} tables. This says how many times as much
@@ -56,7 +56,7 @@ fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
p-values by Monte Carlo simulation, in larger than \eqn{2 \times
2}{2 by 2} tables.}
\item{B}{an integer specifying the number of replicates used in the
-Monte Carlo test.}
+Monte Carlo test when \code{simulate.p.value} is true.}
}
\value{
A list with class \code{"htest"} containing the following components:
@@ -83,7 +83,7 @@ fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
length. Incomplete cases are removed, vectors are coerced into
factor objects, and the contingency table is computed from these.
-For \eqn{2 \times 2}{2 by 2} cases, p-values are obtained directly
+For \eqn{2 \times 2}{2 by 2} tables, p-values are obtained directly
using the (central or non-central) hypergeometric
distribution. Otherwise, computations are based on a C version of the
FORTRAN subroutine \code{FEXACT} which implements the network developed by
@@ -105,7 +105,7 @@ fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
alternative for a one-sided test is based on the odds ratio, so
\code{alternative = "greater"} is a test of the odds ratio being bigger
than \code{or}.
-%
+
Two-sided tests are based on the probabilities of the tables, and take
as \sQuote{more extreme} all tables with probabilities less than or
equal to that of the observed table, the p-value being the sum of such
@@ -120,16 +120,9 @@ fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
cells have expected counts at least 5 (\code{= expect}), otherwise
the exact calculation is used. A corresponding \code{if()} decision
is made for all sub-tables considered.
-%
-Accidentally, \R has used \code{180} instead of \code{80} as
-\code{percent}, i.e., \code{hybridPars[2]} in \R versions between
-3.0.0 and 3.4.1 (inclusive), i.e., the 2nd of the \code{hybridPars}
-(all of which used to be hard-coded previous to \R 3.5.0).
-Consequently, in these versions of \R, \code{hybrid=TRUE} never made a
-difference.
In the \eqn{r \times c}{r x c} case with \eqn{r > 2} or \eqn{c > 2},
-internal tables can get too large for the exact test in which case an
+internal tables can be too large for the exact test in which case an
error is signalled. Apart from increasing \code{workspace}
sufficiently, which then may lead to very long running times, using
\code{simulate.p.value = TRUE} may then often be sufficient and hence
@@ -243,7 +236,7 @@ MP6 <- rbind(
c(1,1,2,0,0,0,1),
c(0,1,1,1,1,0,0))
fisher.test(MP6)
-# Exactly the same p-value, as Cochran's conditions are never met:
-fisher.test(MP6, hybrid=TRUE)
+# Exactly the same p-value, as Cochran's conditions are not met:
+fisher.test(MP6, hybrid = TRUE)
}
\keyword{htest}
26 changes: 13 additions & 13 deletions tests/Examples/stats-Ex.Rout.save
@@ -1,7 +1,7 @@

-R Under development (unstable) (2024-11-12 r87322) -- "Unsuffered Consequences"
+R Under development (unstable) (2024-12-25 r87467) -- "Unsuffered Consequences"
Copyright (C) 2024 The R Foundation for Statistical Computing
-Platform: aarch64-apple-darwin24.1.0
+Platform: aarch64-apple-darwin24.2.0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
@@ -5124,7 +5124,7 @@ function (V)
r[seq.int(from = 1L, by = p + 1L, length.out = p)] <- 1
r
}
-<bytecode: 0x10b83cec8>
+<bytecode: 0x12090f330>
<environment: namespace:stats>
> stopifnot(all.equal(Cl, cov2cor(cov(longley))),
+ all.equal(cor(longley, method = "kendall"),
@@ -7662,8 +7662,8 @@ data: MP6
p-value = 0.03929
alternative hypothesis: two.sided

-> # Exactly the same p-value, as Cochran's conditions are never met:
-> fisher.test(MP6, hybrid=TRUE)
+> # Exactly the same p-value, as Cochran's conditions are not met:
+> fisher.test(MP6, hybrid = TRUE)

Fisher's Exact Test for Count Data hybrid using asym.chisq. iff (exp=5,
perc=80, Emin=1)
@@ -7772,7 +7772,7 @@ attr(,".Environment")
> environment(as.formula("y ~ x"))
<environment: R_GlobalEnv>
> environment(as.formula("y ~ x", env = new.env()))
-<environment: 0x10d923d98>
+<environment: 0x10699fb68>
>
>
> ## Create a formula for a model with a large number of variables:
@@ -12714,14 +12714,14 @@ attr(,"class")
$linkfun
function (mu)
mu^lambda
-<bytecode: 0x10dce8030>
-<environment: 0x10dcf1fd8>
+<bytecode: 0x12056a680>
+<environment: 0x120575bb0>

$linkinv
function (eta)
pmax(eta^(1/lambda), .Machine$double.eps)
-<bytecode: 0x10dce7ee0>
-<environment: 0x10dcf1fd8>
+<bytecode: 0x12056a530>
+<environment: 0x120575bb0>

>
>
@@ -17080,8 +17080,8 @@ Step function with continuity 'f'= 0.2 , 3 knots at
> unclass(sfun0)
function (v)
.approxfun(x, y, v, method, yleft, yright, f, na.rm)
-<bytecode: 0x12c5bc600>
-<environment: 0x10d8ab2d8>
+<bytecode: 0x115d97208>
+<environment: 0x106858c98>
attr(,"call")
stepfun(1:3, y0, f = 0)
> ls(envir = environment(sfun0))
@@ -19647,7 +19647,7 @@ Number of Fisher Scoring iterations: 6
> cleanEx()
> options(digits = 7L)
> base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n")
-Time elapsed: 2.868 0.238 3.12 0 0
+Time elapsed: 2.846 0.215 3.067 0 0
> grDevices::dev.off()
null device
1
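
Note on the fisher.test.Rd wording touched in this commit: the sketch below is one way to see how `workspace`, `simulate.p.value`, `B`, and `hybrid` relate for a table larger than 2 by 2. The `MP6` table from the help page is only partly visible in this diff, so the 3 by 3 table `tab` here is made-up illustrative data, not taken from the R sources, and the variable names are hypothetical.

```r
# Hypothetical 3 x 3 contingency table (illustrative data only).
tab <- matrix(c(3, 1, 0,
                1, 3, 1,
                0, 1, 3), nrow = 3, byrow = TRUE)

# Exact p-value via the network algorithm; 'workspace' is given in
# units of 4 bytes and only matters for tables larger than 2 x 2.
exact <- fisher.test(tab, workspace = 200000)

# Monte Carlo p-value; 'B' is only used when simulate.p.value = TRUE.
set.seed(1)
sim <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)

# Hybrid approximation with the default hybridPars (Cochran's conditions).
hyb <- fisher.test(tab, hybrid = TRUE)

# Compare the three p-values side by side.
print(c(exact = exact$p.value, simulated = sim$p.value, hybrid = hyb$p.value))
```

For tables where the exact computation signals a workspace error or runs very long, the simulated variant is the alternative the help text points to; raising `B` should reduce the Monte Carlo error at the cost of run time.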