Binomial Distribution


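The probability of obtaining exactly $n$ successes in $N$ Bernoulli Trials, each with success probability $p$ and failure probability $q\equiv 1-p$, is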

$\begin{displaymath} P(n\vert N) = {N\choose n}p^n(1-p)^{N-n} = {N!\over n!(N-n)!} p^nq^{N-n}. \end{displaymath}$

(1)
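Equation (1) can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the values $N=10$, $p=0.3$ are illustrative assumptions, not part of the original text.

```python
from math import comb


def binomial_pmf(n: int, N: int, p: float) -> float:
    """P(n|N) from Eq. (1): probability of exactly n successes in N trials."""
    q = 1.0 - p
    return comb(N, n) * p**n * q**(N - n)


# Illustrative parameters (assumed for this example only).
N, p = 10, 0.3
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]

# The probabilities over n = 0..N should sum to 1 up to rounding error.
total = sum(pmf)
print(f"sum of P(n|N) over n = {total:.12f}")
```

Using `math.comb` keeps the binomial coefficient exact as an integer before it is multiplied by the floating-point powers of $p$ and $q$.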

The probability of obtaining more successes than the $n$ observed is

$\begin{displaymath} P=\sum_{k=n+1}^N {N\choose k}p^k(1-p)^{N-k}=I_p(n+1,N-n), \end{displaymath}$

(2)

where

$\begin{displaymath} I_x(a,b)\equiv {B(x; a,b)\over B(a,b)}, \end{displaymath}$

(3)

$B(a,b)$ is the Beta Function, and $B(x;a,b)$ is the incomplete Beta Function. The Characteristic Function is

$\begin{displaymath} \phi(t)=(q+pe^{it})^N. \end{displaymath}$

(4)
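Equations (2) and (4) can both be checked numerically. The sketch below compares the direct tail sum with $I_p(n+1,N-n)$, where the incomplete Beta Function is integrated by Simpson's rule (an assumed stand-in for a library routine, adequate only for $a,b\geq 1$), and compares $\langle e^{itn}\rangle$ with the closed form, taking the exponent to be the number of trials $N$. All function names and parameter values are illustrative assumptions.

```python
import cmath
from math import comb, gamma


def binomial_pmf(n, N, p):
    return comb(N, n) * p**n * (1 - p)**(N - n)


def tail_probability(n, N, p):
    """Left-hand side of Eq. (2): probability of more than n successes."""
    return sum(binomial_pmf(k, N, p) for k in range(n + 1, N + 1))


def regularized_incomplete_beta(x, a, b, steps=2000):
    """I_x(a,b) = B(x;a,b)/B(a,b), Eq. (3).

    B(x;a,b) is the integral of t^(a-1) (1-t)^(b-1) from 0 to x, done here
    with composite Simpson's rule (steps must be even; assumes a, b >= 1).
    """
    h = x / steps

    def f(t):
        return t**(a - 1) * (1 - t)**(b - 1)

    s = f(0.0) + f(x)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    incomplete = s * h / 3
    complete = gamma(a) * gamma(b) / gamma(a + b)  # B(a,b)
    return incomplete / complete


def char_fn_sum(t, N, p):
    """<exp(itn)> evaluated as a direct sum over the distribution."""
    return sum(cmath.exp(1j * t * n) * binomial_pmf(n, N, p)
               for n in range(N + 1))


def char_fn_closed(t, N, p):
    """Closed form (q + p e^{it})^N."""
    return ((1 - p) + p * cmath.exp(1j * t)) ** N


# Illustrative parameters (assumed for this example only).
N, p, n, t = 10, 0.3, 3, 0.7
tail_direct = tail_probability(n, N, p)
tail_beta = regularized_incomplete_beta(p, n + 1, N - n)
phi_gap = abs(char_fn_sum(t, N, p) - char_fn_closed(t, N, p))
print(tail_direct, tail_beta, phi_gap)
```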

The Moment-Generating Function for the distribution is

$\displaystyle M(t)$	$\textstyle =$	$\displaystyle \langle e^{tn}\rangle = \sum_{n=0}^N e^{tn}{N\choose n}p^nq^{N-n}$
	$\textstyle =$	$\displaystyle \sum_{n=0}^N {N\choose n}(pe^t)^n(1-p)^{N-n}$
	$\textstyle =$	$\displaystyle [pe^t+(1-p)]^N$	(5)
$\displaystyle M'(t)$	$\textstyle =$	$\displaystyle N[pe^t+(1-p)]^{N-1}(pe^t)$	(6)
$\displaystyle M''(t)$	$\textstyle =$	$\displaystyle N(N-1)[pe^t+(1-p)]^{N-2}(pe^t)^2$
	$\textstyle \phantom{=}$	$\displaystyle +N[pe^t+(1-p)]^{N-1}(pe^t).$	(7)

The Mean is

$\begin{displaymath} \mu = M'(0) = N(p+1-p)^{N-1}p = Np. \end{displaymath}$

(8)
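The closed forms (5) to (8) can be checked against the defining sum. This is a minimal sketch; the function names, the step size `h`, and the values $N=12$, $p=0.25$ are assumptions made for illustration.

```python
from math import comb, exp


def mgf_sum(t, N, p):
    """M(t) = <e^{tn}> as a direct sum over n = 0..N."""
    return sum(exp(t * n) * comb(N, n) * p**n * (1 - p)**(N - n)
               for n in range(N + 1))


def mgf_closed(t, N, p):
    """Eq. (5): M(t) = [p e^t + (1 - p)]^N."""
    return (p * exp(t) + (1 - p)) ** N


def mgf_d1(t, N, p):
    """Eq. (6): first derivative M'(t)."""
    return N * (p * exp(t) + (1 - p)) ** (N - 1) * p * exp(t)


def mgf_d2(t, N, p):
    """Eq. (7): second derivative M''(t)."""
    base = p * exp(t) + (1 - p)
    return (N * (N - 1) * base ** (N - 2) * (p * exp(t)) ** 2
            + N * base ** (N - 1) * p * exp(t))


# Illustrative parameters (assumed for this example only).
N, p, h = 12, 0.25, 1e-5

# A central difference of M(t) at t = 0 should approximate the mean Np.
mean_fd = (mgf_closed(h, N, p) - mgf_closed(-h, N, p)) / (2 * h)
mean_exact = mgf_d1(0.0, N, p)        # equals Np
second_moment = mgf_d2(0.0, N, p)     # equals N(N-1)p^2 + Np
print(mean_fd, mean_exact, second_moment)
```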

The Moments about 0 are

$\displaystyle \mu_1'$	$\textstyle =$	$\displaystyle \mu=Np$	(9)
$\displaystyle \mu_2'$	$\textstyle =$	$\displaystyle Np(1-p+Np)$	(10)
$\displaystyle \mu_3'$	$\textstyle =$	$\displaystyle Np(1-3p+3Np+2p^2-3Np^2+N^2p^2)$	(11)
$\displaystyle \mu_4'$	$\textstyle =$	$\displaystyle Np(1-7p+7Np+12p^2-18Np^2+6N^2p^2$
	$\textstyle \phantom{=}$	$\displaystyle -6p^3+11Np^3-6N^2p^3+N^3p^3),$	(12)

so the Moments about the Mean are

$\displaystyle \mu_2$	$\textstyle =$	$\displaystyle \sigma^2 = [N(N-1)p^2+Np]-(Np)^2$
	$\textstyle =$	$\displaystyle N^2p^2-Np^2+Np-N^2p^2$
	$\textstyle =$	$\displaystyle Np(1-p) = Npq$	(13)
$\displaystyle \mu_3$	$\textstyle =$	$\displaystyle \mu_3'-3\mu_2'\mu_1'+2(\mu_1')^3$
	$\textstyle =$	$\displaystyle Np(1-p)(1-2p)$	(14)
$\displaystyle \mu_4$	$\textstyle =$	$\displaystyle \mu_4'-4\mu_3'\mu_1'+6\mu_2'(\mu_1')^2-3(\mu_1')^4$
	$\textstyle =$	$\displaystyle Np(1-p)[3p^2(2-N)+3p(N-2)+1].$	(15)
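The conversion from Moments about 0 to Moments about the Mean can be verified by brute force. The sketch below computes the raw moments as direct sums, applies the standard conversion formulas, and compares the results with (13) to (15); names and the values $N=8$, $p=0.35$ are illustrative assumptions.

```python
from math import comb


def raw_moment(r, N, p):
    """r-th Moment about 0, computed as a direct sum over n = 0..N."""
    return sum(n**r * comb(N, n) * p**n * (1 - p)**(N - n)
               for n in range(N + 1))


def central_from_raw(N, p):
    """Moments about the Mean from the Moments about 0."""
    m1, m2, m3, m4 = (raw_moment(r, N, p) for r in (1, 2, 3, 4))
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return mu2, mu3, mu4


def central_closed(N, p):
    """Closed forms (13), (14), (15)."""
    q = 1 - p
    mu2 = N * p * q
    mu3 = N * p * q * (1 - 2 * p)
    mu4 = N * p * q * (3 * p**2 * (2 - N) + 3 * p * (N - 2) + 1)
    return mu2, mu3, mu4


# Illustrative parameters (assumed for this example only).
N, p = 8, 0.35
from_sums = central_from_raw(N, p)
from_formulas = central_closed(N, p)
print(from_sums)
print(from_formulas)
```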

The Skewness and Kurtosis are

$\displaystyle \gamma_1$	$\textstyle =$	$\displaystyle {\mu_3\over\sigma^3} = {Np(1-p)(1-2p)\over [Np(1-p)]^{3/2}}$
	$\textstyle =$	$\displaystyle {1-2p\over\sqrt{Np(1-p)}}={q-p\over \sqrt{Npq}}$	(16)
$\displaystyle \gamma_2$	$\textstyle =$	$\displaystyle {\mu_4\over \sigma^4}-3={6p^2-6p+1\over Np(1-p)} = {1-6pq\over Npq}.$	(17)
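As a quick illustration of (16) and (17), the following sketch computes the Skewness and Kurtosis from central moments evaluated as direct sums and compares them with the closed forms. The function names and the values $N=20$, $p=0.3$ are assumptions for this example.

```python
from math import comb, sqrt


def central_moment(k, N, p):
    """k-th Moment about the Mean Np, as a direct sum over n = 0..N."""
    mean = N * p
    return sum((n - mean)**k * comb(N, n) * p**n * (1 - p)**(N - n)
               for n in range(N + 1))


def skewness_kurtosis_direct(N, p):
    sigma2 = central_moment(2, N, p)
    gamma1 = central_moment(3, N, p) / sigma2**1.5
    gamma2 = central_moment(4, N, p) / sigma2**2 - 3
    return gamma1, gamma2


def skewness_kurtosis_closed(N, p):
    """Eqs. (16) and (17)."""
    q = 1 - p
    return (q - p) / sqrt(N * p * q), (1 - 6 * p * q) / (N * p * q)


# Illustrative parameters (assumed for this example only).
N, p = 20, 0.3
direct = skewness_kurtosis_direct(N, p)
closed = skewness_kurtosis_closed(N, p)
print(direct, closed)
```

Both quantities shrink as $N$ grows, and the Skewness vanishes at $p=1/2$, where the distribution is symmetric.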

An approximation to the binomial distribution for large $N$ can be obtained by expanding about the value $\tilde n$ where $P(n)$ is a maximum, i.e., where $dP/dn=0$. Since the Logarithm function is Monotonic, we can instead choose to expand the Logarithm. Let $n\equiv \tilde n+\eta$, then

$\begin{displaymath} \ln[P(n)] = \ln [P(\tilde n)]+B_1\eta +{\textstyle{1\over 2}}B_2\eta^2+{\textstyle{1\over 3!}} B_3\eta^3+\ldots, \end{displaymath}$

(18)

where

$\begin{displaymath} B_k\equiv \left[{d^k \ln[P(n)]\over dn^k}\right]_{n=\tilde n}. \end{displaymath}$

(19)

But we are expanding about the maximum, so, by definition,

$\begin{displaymath} B_1 =\left[{d \ln[P(n)]\over dn}\right]_{n=\tilde n} = 0. \end{displaymath}$

(20)

This also means that $B_2$ is negative, so we can write $B_2=-\vert B_2\vert$. Now, taking the Logarithm of (1) gives

$\begin{displaymath} \ln[P(n)] = \ln N!-\ln n!-\ln(N-n)!+n\ln p+(N-n)\ln q. \end{displaymath}$

(21)

For large $n$ and $N-n$ we can use Stirling's Approximation

$\begin{displaymath} \ln(n!)\approx n\ln n-n, \end{displaymath}$

(22)

$\displaystyle {d[\ln(n!)]\over dn}$	$\textstyle \approx$	$\displaystyle (\ln n+1)-1 = \ln n$	(23)
$\displaystyle {d[\ln(N-n)!]\over dn}$	$\textstyle \approx$	$\displaystyle {d\over dn} [(N-n)\ln(N-n)-(N-n)]$
	$\textstyle =$	$\displaystyle \left[{-\ln(N-n)+(N-n) {-1\over N-n} +1}\right]$
	$\textstyle =$	$\displaystyle -\ln(N-n),$	(24)

and

$\begin{displaymath} {d\ln[P(n)]\over dn} \approx -\ln n+\ln(N-n)+\ln p-\ln q. \end{displaymath}$

(25)
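The quality of approximation (25) can be probed numerically. The sketch below extends $\ln P(n)$ to real $n$ through the log-gamma function (an assumption made here so that a derivative is defined), takes a central difference, and compares it with the Stirling-based expression. Names, the step size, and the values $N=1000$, $p=0.3$, $n=350$ are illustrative.

```python
from math import lgamma, log


def log_pmf(n, N, p):
    """ln P(n) from Eq. (21), using ln(x!) = lgamma(x + 1) for real x."""
    return (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
            + n * log(p) + (N - n) * log(1 - p))


def dlogp_numeric(n, N, p, h=1e-3):
    """Central-difference estimate of d ln P / dn."""
    return (log_pmf(n + h, N, p) - log_pmf(n - h, N, p)) / (2 * h)


def dlogp_stirling(n, N, p):
    """Eq. (25): -ln n + ln(N - n) + ln p - ln q."""
    return -log(n) + log(N - n) + log(p) - log(1 - p)


# Illustrative parameters (assumed for this example only).
N, p, n = 1000, 0.3, 350
numeric = dlogp_numeric(n, N, p)
approx = dlogp_stirling(n, N, p)
print(numeric, approx, numeric - approx)
```

The two values differ by terms of order $1/n$ and $1/(N-n)$, which is the size of the correction dropped by the leading-order Stirling formula.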

To find $\tilde n$, set this expression to 0 and solve for $n$:

$\begin{displaymath} \ln\left({{N-\tilde n\over \tilde n} {p\over q}}\right)= 0 \end{displaymath}$

(26)

$\begin{displaymath} {N-\tilde n\over\tilde n} {p\over q} =1 \end{displaymath}$

(27)

$\begin{displaymath} (N-\tilde n)p=\tilde n q \end{displaymath}$

(28)

$\begin{displaymath} \tilde n(q+p)=\tilde n=Np, \end{displaymath}$

(29)

since $p+q=1$. We can now find the terms in the expansion

$\displaystyle B_2$	$\textstyle \equiv$	$\displaystyle \left[{d^2 \ln[P(n)]\over dn^2}\right]_{n=\tilde n} = -{1\over \tilde n}-{1\over N-\tilde n}$
	$\textstyle =$	$\displaystyle -{1\over Np}-{1\over N(1-p)} = -{1\over N} \left({{1\over p}+{1\over q}}\right)$
	$\textstyle =$	$\displaystyle -{1\over N}\left({p+q\over pq}\right)=-{1\over Npq} = -{1\over Np(1-p)}$	(30)
$\displaystyle B_3$	$\textstyle \equiv$	$\displaystyle \left[{d^3 \ln[P(n)]\over dn^3}\right]_{n=\tilde n} = {1\over \tilde n^2}-{1\over(N-\tilde n)^2}$
	$\textstyle =$	$\displaystyle {1\over N^2p^2}-{1\over N^2q^2} = {q^2-p^2\over N^2p^2q^2}$
	$\textstyle =$	$\displaystyle {(1-2p+p^2)-p^2\over N^2p^2(1-p)^2} = {1-2p\over N^2p^2(1-p)^2}$	(31)
$\displaystyle B_4$	$\textstyle \equiv$	$\displaystyle \left[{d^4 \ln[P(n)]\over dn^4}\right]_{n=\tilde n} = -{2\over \tilde n^3}-{2\over(N-\tilde n)^3}$
	$\textstyle =$	$\displaystyle -2\left({{1\over N^3p^3}+{1\over N^3q^3}}\right)=-{2(p^3+q^3)\over N^3p^3q^3}$
	$\textstyle =$	$\displaystyle -{2(p^2-pq+q^2)\over N^3p^3q^3}$
	$\textstyle =$	$\displaystyle -{2[p^2-p(1-p)+(1-2p+p^2)]\over N^3p^3(1-p)^3}$
	$\textstyle =$	$\displaystyle -{2(3p^2-3p+1)\over N^3p^3(1-p)^3}.$	(32)
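The expansion coefficients can be cross-checked by evaluating the derivatives of (25) directly at $\tilde n=Np$ and comparing with the closed forms in $p$ and $q$. This is a minimal sketch; the function name and the values $N=200$, $p=0.3$ are assumptions for illustration.

```python
from math import isclose


def expansion_coefficients(N, p):
    """Return (direct, closed) triples for B_2, B_3, B_4 at n~ = Np."""
    q = 1 - p
    nt = N * p  # location of the maximum, n~ = Np

    # Successive derivatives of -ln n + ln(N - n) + ln p - ln q at n = n~.
    direct = (
        -1 / nt - 1 / (N - nt),
        1 / nt**2 - 1 / (N - nt)**2,
        -2 / nt**3 - 2 / (N - nt)**3,
    )

    # The same quantities written in terms of p and q = 1 - p.
    closed = (
        -1 / (N * p * q),
        (1 - 2 * p) / (N**2 * p**2 * q**2),
        -2 * (3 * p**2 - 3 * p + 1) / (N**3 * p**3 * q**3),
    )
    return direct, closed


# Illustrative parameters (assumed for this example only).
N, p = 200, 0.3
direct, closed = expansion_coefficients(N, p)
for name, d, c in zip(("B2", "B3", "B4"), direct, closed):
    print(name, d, c)
```

Each successive coefficient carries one more power of $1/N$, which is why the higher-order terms become negligible for large $N$.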

Now, treating the distribution as continuous,

$\begin{displaymath} \lim_{N\to\infty} \sum_{n=0}^N P(n) \approx \int P(n)\,dn = \int_{-\infty}^\infty P(\tilde n+\eta)\,d\eta = 1. \end{displaymath}$

(33)

Since each term is of order $1/N \sim 1/\sigma^2$ smaller than the previous, we can ignore terms higher than $B_2$, so

$\begin{displaymath} P(n)=P(\tilde n)e^{-\vert B_2\vert\eta^2/2}. \end{displaymath}$

(34)

The probability must be normalized, so

$\begin{displaymath} \int_{-\infty}^\infty P(\tilde n)e^{-\vert B_2\vert\eta^2/2}\,d\eta = P(\tilde n) \sqrt{2\pi\over \vert B_2\vert} = 1, \end{displaymath}$

(35)

and

$\displaystyle P(n)$	$\textstyle =$	$\displaystyle \sqrt{\vert B_2\vert\over 2\pi} e^{-\vert B_2\vert(n-\tilde n)^2/2}$
	$\textstyle =$	$\displaystyle {1\over \sqrt{2\pi Npq}}\mathop{\rm exp}\nolimits \left[{-{(n-Np)^2\over 2Npq}}\right].$	(36)
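To see how close (36) comes to the exact distribution, the sketch below compares the two over all $n$ for one choice of parameters. The exact probabilities are computed through `math.lgamma` to avoid overflow in the binomial coefficient; names and the values $N=1000$, $p=0.3$ are illustrative assumptions.

```python
from math import exp, lgamma, log, pi, sqrt


def binomial_pmf(n, N, p):
    """Exact P(n), evaluated through logarithms for numerical stability."""
    log_p = (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
             + n * log(p) + (N - n) * log(1 - p))
    return exp(log_p)


def gaussian_approx(n, N, p):
    """Eq. (36): Gaussian with mean Np and variance Npq."""
    var = N * p * (1 - p)
    return exp(-(n - N * p)**2 / (2 * var)) / sqrt(2 * pi * var)


# Illustrative parameters (assumed for this example only).
N, p = 1000, 0.3
max_abs_error = max(abs(binomial_pmf(n, N, p) - gaussian_approx(n, N, p))
                    for n in range(N + 1))
peak = gaussian_approx(round(N * p), N, p)
print(f"peak height ~ {peak:.5f}, largest absolute error ~ {max_abs_error:.2e}")
```

The residual error is dominated by the neglected $B_3$ term, so it is largest a little away from the peak and decreases as $N$ increases.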

Defining $\sigma^2\equiv Npq$ ,

$\begin{displaymath} P(n) = {1\over\sigma\sqrt{2\pi}}\mathop{\rm exp}\nolimits \left[{-{(n-\tilde n)^2\over 2\sigma^2}}\right], \end{displaymath}$

(37)

which is a Gaussian Distribution. For $p \ll 1$ , a different approximation procedure shows that the binomial distribution approaches the Poisson Distribution. The first Cumulant is

$\begin{displaymath} \kappa_1=Np, \end{displaymath}$

(38)

and subsequent Cumulants are given by the Recurrence Relation

$\begin{displaymath} \kappa_{r+1}=pq{d\kappa_r\over dp}. \end{displaymath}$

(39)
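The Recurrence Relation (39) can be iterated symbolically by treating each Cumulant as a polynomial in $p$. The sketch below stores coefficients in ascending powers of $p$ and starts from $\kappa_1=Np$, with $N$ the number of trials; the representation and all names are assumptions made for illustration.

```python
def cumulant_polys(N, r_max):
    """Coefficient lists (ascending powers of p) for kappa_1 .. kappa_{r_max}."""
    kappas = [[0, N]]  # kappa_1 = N p
    for _ in range(r_max - 1):
        prev = kappas[-1]
        # Differentiate with respect to p.
        deriv = [k * c for k, c in enumerate(prev)][1:]
        # Multiply by pq = p - p^2.
        nxt = [0] * (len(deriv) + 2)
        for k, c in enumerate(deriv):
            nxt[k + 1] += c
            nxt[k + 2] -= c
        kappas.append(nxt)
    return kappas


def evaluate(poly, p):
    """Evaluate a coefficient list at a given p."""
    return sum(c * p**k for k, c in enumerate(poly))


# Illustrative parameters (assumed for this example only).
N, p = 7, 0.3
q = 1 - p
k1, k2, k3, k4 = cumulant_polys(N, 4)
print(k2)  # coefficients of N p - N p^2
print(k3)  # coefficients of N p - 3 N p^2 + 2 N p^3
print(evaluate(k4, p))
```

The second and third Cumulants reproduce $\mu_2=Npq$ and $\mu_3=Npq(1-2p)$, and the fourth equals $\mu_4-3\mu_2^2=Npq(1-6pq)$, consistent with (17).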

Let $x$ and $y$ be independent binomial Random Variables characterized by parameters $n, p$ and $m, p$. The Conditional Probability of $x$ given that $x+y=k$ is

$P(x=i \vert x+y=k) = {P(x=i, x+y=k)\over P(x+y=k)} = {P(x=i, y=k-i)\over P(x+y=k)} = {P(x=i)P(y = k-i)\over P(x+y=k)}$

$= {{n\choose i}p^i(1-p)^{n-i}{m\choose k-i}p^{k-i}(1-p)^{m-(k-i)} \over {n+m\choose k}p^k(1-p)^{n+m-k}} = {{n\choose i}{m\choose k-i}\over{n+m\choose k}}.\quad$ (40)

Note that this is a Hypergeometric Distribution!
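Result (40) is easy to confirm numerically: the conditional probability built from binomial probabilities should match the hypergeometric form and should not depend on $p$. The following is a minimal sketch; the function names and the values $n=6$, $m=9$, $k=5$ are illustrative assumptions.

```python
from math import comb


def binomial_pmf(i, n, p):
    return comb(n, i) * p**i * (1 - p)**(n - i)


def conditional(i, k, n, m, p):
    """P(x = i | x + y = k) for independent x ~ B(n, p), y ~ B(m, p)."""
    joint = binomial_pmf(i, n, p) * binomial_pmf(k - i, m, p)
    return joint / binomial_pmf(k, n + m, p)  # x + y ~ B(n + m, p)


def hypergeometric(i, k, n, m):
    """Right-hand side of Eq. (40)."""
    return comb(n, i) * comb(m, k - i) / comb(n + m, k)


# Illustrative parameters (assumed for this example only).
n, m, k = 6, 9, 5
gaps = [abs(conditional(i, k, n, m, p) - hypergeometric(i, k, n, m))
        for p in (0.2, 0.7)
        for i in range(k + 1)]
print(max(gaps))
```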

References

Beyer, W. H. CRC Standard Mathematical Tables, 28th ed. Boca Raton, FL: CRC Press, p. 531, 1987.

Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. ``Incomplete Beta Function, Student's Distribution, F-Distribution, Cumulative Binomial Distribution.'' §6.2 in Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 219-223, 1992.

Spiegel, M. R. Theory and Problems of Probability and Statistics. New York: McGraw-Hill, pp. 108-109, 1992.
