Fisher's Exact Test

A Statistical Test used to determine if there are nonrandom associations between two Categorical Variables. Let there exist two such variables $X$ and $Y$, with $m$ and $n$ observed states, respectively. Now form an $m\times n$ Matrix in which the entries $a_{ij}$ represent the number of observations in which $x=i$ and $y=j$. Calculate the row and column sums $R_i$ and $C_j$, respectively, and the total sum

\begin{displaymath}
N=\sum_i R_i=\sum_j C_j
\end{displaymath}

of the Matrix. Then calculate the conditional Likelihood (P-Value) of getting the actual matrix given the particular row and column sums, given by

\begin{displaymath}
P_{\rm crit}={(R_1!R_2!\cdots R_m!)(C_1!C_2!\cdots C_n!)\over N!\prod_{i,j} a_{ij}!}
\end{displaymath}

(which is a Hypergeometric Distribution in the $2\times 2$ case, and a multivariate generalization of it otherwise). Now find all possible Matrices of Nonnegative Integers consistent with the row and column sums $R_i$ and $C_j$. For each one, calculate the associated conditional probability using the formula for $P_{\rm crit}$ above (these probabilities must sum to 1). Then the P-Value of the test is given by the sum of all of these probabilities which are $\leq P_{\rm crit}$.
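The procedure just described can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `table_probability`, `compositions`, `tables_with_margins`, and `fisher_exact_p` are hypothetical, and the brute-force enumeration is only practical for small tables. Exact rational arithmetic is assumed here so that the comparison with $P_{\rm crit}$ is not affected by rounding when two matrices have equal probability.

```python
from fractions import Fraction
from math import factorial


def table_probability(table):
    """Conditional probability of `table` given its row and column sums."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    numerator = 1
    for s in row_sums + col_sums:
        numerator *= factorial(s)
    denominator = factorial(sum(row_sums))
    for row in table:
        for a in row:
            denominator *= factorial(a)
    return Fraction(numerator, denominator)


def compositions(total, caps):
    """Yield tuples of nonnegative integers summing to `total`,
    where entry k is at most caps[k]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_margins(row_sums, col_sums):
    """Yield every nonnegative integer matrix with the given margins."""
    if len(row_sums) == 1:
        # The last row is forced: it must use up the remaining column sums.
        if sum(col_sums) == row_sums[0]:
            yield [list(col_sums)]
        return
    for first_row in compositions(row_sums[0], col_sums):
        remaining = [c - a for c, a in zip(col_sums, first_row)]
        for rest in tables_with_margins(row_sums[1:], remaining):
            yield [list(first_row)] + rest


def fisher_exact_p(table):
    """Sum the probabilities of all matrices no more likely than `table`."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    p_crit = table_probability(table)
    return sum(
        p
        for p in map(table_probability, tables_with_margins(row_sums, col_sums))
        if p <= p_crit
    )
```

The functions return `Fraction` values; calling `float()` on the result gives the usual decimal P-Value.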


The test is most commonly applied to $2\times 2$ Matrices, and is computationally unwieldy for large $m$ or $n$.


For an example application of the test, let $X$ be a journal, say either Mathematics Magazine or Science, and let $Y$ be the topic, either mathematics or biology, of an article appearing in a given issue of one of these journals. If Mathematics Magazine has five articles on mathematics and one on biology, and Science has none on mathematics and four on biology, then the relevant matrix would be

\begin{displaymath}
\matrix{
& {\it Math.\ Mag.} & {\it Science} & \cr
{\rm mathematics}\hfill & 5 & 0 & R_1=5\cr
{\rm biology}\hfill & 1 & 4 & R_2=5\cr
& C_1=6 & C_2=4 & N=10.\cr}
\end{displaymath}

Computing $P_{\rm crit}$ gives

\begin{displaymath}
P_{\rm crit}={(5!)^2\,6!\,4!\over 10!\,(5!\,0!\,1!\,4!)}=0.0238,
\end{displaymath}

and the other possible matrices and their $P$s are
\begin{eqnarray*}
\left[\begin{array}{cc}4 & 1\\  2 & 3\end{array}\right]\quad P &=& 0.2381\\
\left[\begin{array}{cc}3 & 2\\  3 & 2\end{array}\right]\quad P &=& 0.4762\\
\left[\begin{array}{cc}2 & 3\\  4 & 1\end{array}\right]\quad P &=& 0.2381\\
\left[\begin{array}{cc}1 & 4\\  5 & 0\end{array}\right]\quad P &=& 0.0238,
\end{eqnarray*}

which indeed sum to 1, as required. The sum of the $P$-values less than or equal to $P_{\rm crit}=0.0238$ (here, those of the observed matrix and of the last matrix listed) is then 0.0476 which, because it is less than 0.05, is Significant at the 5% level. Therefore, in this case, there would be a statistically significant association between the journal and the type of article appearing.
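The arithmetic of this example can be checked with a short Python sketch specialized to the $2\times 2$ case. It assumes the equivalent binomial-coefficient form of the probability, $P={R_1\choose a_{11}}{R_2\choose a_{21}}/{N\choose C_1}$, which follows from the factorial formula above. The function name `fisher_2x2` and its return layout are hypothetical choices made for illustration.

```python
from math import comb


def fisher_2x2(a, b, c, d):
    """Return (p_value, outcomes) for the 2x2 table [[a, b], [c, d]].

    `outcomes` lists (table, probability) for every table sharing the margins.
    """
    r1, r2 = a + b, c + d          # row sums
    c1 = a + c                     # first column sum
    n = r1 + r2                    # total number of observations
    denom = comb(n, c1)

    def weight(x):
        # Integer numerator of the hypergeometric probability that the
        # top-left entry equals x once all margins are fixed.
        return comb(r1, x) * comb(r2, c1 - x)

    observed = weight(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible top-left entries
    outcomes = []
    extreme = 0
    for x in range(lo, hi + 1):
        w = weight(x)
        outcomes.append(([[x, r1 - x], [c1 - x, r2 - c1 + x]], w / denom))
        if w <= observed:          # exact integer comparison, so ties are safe
            extreme += w
    return extreme / denom, outcomes


p_value, outcomes = fisher_2x2(5, 0, 1, 4)
for table, p in outcomes:
    print(table, f"{p:.4f}")
print(f"P-value = {p_value:.4f}")   # P-value = 0.0476
```

Comparing the integer weights rather than the floating-point probabilities is a deliberate choice: the observed matrix and the matrix with entries 1, 4, 5, 0 have exactly equal probability, and a rounded comparison could wrongly exclude one of them.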



© 1996-9 Eric W. Weisstein
1999-05-26