info prev up next book cdrom email home

Birthday Problem

Consider the probability $Q_1(n,d)$ that no two people out of a group of $n$ will have matching birthdays out of $d$ equally possible birthdays. Start with an arbitrary person's birthday, then note that the probability that the second person's birthday is different is $(d-1)/d$, that the third person's birthday is different from the first two is $[(d-1)/d][(d-2)/d]$, and so on, up through the $n$th person. Explicitly,

$\displaystyle Q_1(n,d)$ $\textstyle =$ $\displaystyle {d-1\over d}{d-2\over d}\cdots{d-(n-1)\over d}$  
  $\textstyle =$ $\displaystyle {(d-1)(d-2)\cdots[d-(n-1)]\over d^{n-1}}.$ (1)

But this can be written in terms of Factorials as
\end{displaymath} (2)

so the probability $P_2(n,365)$ that two people out of a group of $n$ do have the same birthday is therefore
P_2(n,d)=1-Q_1(n,d)=1-{d!\over (d-n)!d^n}.
\end{displaymath} (3)

If 365-day years have been assumed, i.e., the existence of leap days is ignored, then the number of people needed for there to be at least a 50% chance that two share birthdays is the smallest $n$ such that $P_2(n,365)\geq 1/2$. This is given by $n=23$, since

$\displaystyle P_2(23,365)$ $\textstyle =$ $\displaystyle {38093904702297390785243708291056390518886454060947061\over 75091883268515350125426207425223147563269805908203125}$  
  $\textstyle \approx$ $\displaystyle 0.507297.$ (4)

The number $n$ of people needed to obtain $P_2(n,d)\geq 1/2$ for $d=1$, 2, ..., are 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, ... (Sloane's A033810).

The probability $P_2(n,d)$ can be estimated as

$\displaystyle P_2(n,d)$ $\textstyle \approx$ $\displaystyle 1-e^{-n(n-1)/2d}$ (5)
  $\textstyle \approx$ $\displaystyle 1-\left({1-{n\over 2d}}\right)^{n-1},$ (6)

where the latter has error
\epsilon <{n^3\over 6(d-n+1)^2}
\end{displaymath} (7)

(Sayrafiezadeh 1994).

\begin{figure}\begin{center}\BoxedEPSF{BirthdayProblem.epsf scaled 800}\end{center}\end{figure}

In general, let $Q_i(n,d)$ denote the probability that a birthday is shared by exactly $i$ (and no more) people out of a group of $n$ people. Then the probability that a birthday is shared by $k$ or more people is given by

P_k(n,d)=1-\sum_{i=1}^{k-1} Q_i(n,d).
\end{displaymath} (8)

$Q_2$ can be computed explicitly as

$\displaystyle Q_2(n,d)$ $\textstyle =$ $\displaystyle {n!\over d^n} \sum_{i=1}^{\left\lfloor{n/2}\right\rfloor } {1\over 2^i}{d\choose i}{d-i\choose n-2i}$  
  $\textstyle =$ $\displaystyle {n!\over d^n} \sum_{i=1}^{\left\lfloor{n/2}\right\rfloor } {d!\over 2^i i!(n-2i)!(d-n+i)!}$  
  $\textstyle =$ $\displaystyle {(-1)^n\over d^n}\left[{2^{-n/2}\Gamma(1+n)P_n^{(-d)}({\textstyle{1\over 2}}\sqrt{2})-{\Gamma(1+d)\over\Gamma(1+d-n)}}\right],$ (9)

where ${n\choose m}$ is a Binomial Coefficient, $\Gamma(n)$ is a Gamma Function, and $P_n^{(\lambda)}(x)$ is an Ultraspherical Polynomial. This gives the explicit formula for $P_3(n,d)$ as

$\displaystyle P_3(n,d)$ $\textstyle =$ $\displaystyle 1-Q_1(n,d)-Q_2(n,d)$  
  $\textstyle =$ $\displaystyle 1+{(-1)^{n+1}\Gamma(n+1)P_n^{(-d)}(2^{-1/2})\over 2^{n/2} d^n}.$ (10)

$Q_3(n,d)$ cannot be computed in entirely closed form, but a partially reduced form is
$Q_3(n,d)={\Gamma(d+1)\over d^n}\left[{{(-1)^nF({\textstyle{9\over 8}})-F(-{\textstyle{9\over 8}})\over\Gamma(d-n+1)}}\right.$
$ \left.{+(-1)^n\Gamma(1+n)\sum_{i=1}^{\left\lfloor{n/3}\right\rfloor } {(-3)^{-...
...(i-d)}({\textstyle{1\over 2}}\sqrt{2}\,)\over\Gamma(d-i+1)\Gamma(i+1)}}\right],$

$F=F(n,d,a)\equiv 1-{}_3F_2\left[{\!\!\matrix{{\textstyle{1\over 3}}(1-n), {\tex...
...\cr {\textstyle{1\over 2}}(d-n+1), {\textstyle{1\over 2}}(d-n+2)\cr}; a}\right]$

and ${}_3F_2(a,b,c;d,e;z)$ is a Generalized Hypergeometric Function.

In general, $Q_k(n,d)$ can be computed using the Recurrence Relation

Q_k(n,d)=\sum_{i=1}^{\left\lfloor{n/k}\right\rfloor } \left[...
...m_{j=1}^{k-1} Q_j(n-k,d-i){(d-i)^{n-ik}\over d^{n-ik}}}\right]
\end{displaymath} (13)

(Finch). However, the time to compute this recursive function grows exponentially with $k$ and so rapidly becomes unwieldy. The minimal number of people to give a 50% probability of having at least $n$ coincident birthdays is 1, 23, 88, 187, 313, 460, 623, 798, 985, 1181, 1385, 1596, 1813, ... (Sloane's A014088; Diaconis and Mosteller 1989).

A good approximation to the number of people $n$ such that $p=P_k(n,d)$ is some given value can be given by solving the equation

ne^{-n/(dk)}=\left[{d^{k-1}k!\ln\left({1\over 1-p}\right)\left({1-{n\over d(k+1)}}\right)}\right]^{1/k}
\end{displaymath} (14)

for $n$ and taking $\left\lceil{n}\right\rceil $, where $\left\lceil{n}\right\rceil $ is the Ceiling Function (Diaconis and Mosteller 1989). For $p=0.5$ and $k=1$, 2, 3, ..., this formula gives $n=1$, 23, 88, 187, 313, 459, 722, 797, 983, 1179, 1382, 1592, 1809, ..., which differ from the true values by from 0 to 4. A much simpler but also poorer approximation for $n$ such that $p=0.5$ for $k<20$ is given by
\end{displaymath} (15)

(Diaconis and Mosteller 1989), which gives 86, 185, 307, 448, 606, 778, 965, 1164, 1376, 1599, 1832, ... for $k=3$, 4, ....

The ``almost'' birthday problem, which asks the number of people needed such that two have a birthday within a day of each other, was considered by Abramson and Moser (1970), who showed that 14 people suffice. An approximation for the minimum number of people needed to get a 50-50 chance that two have a match within $k$ days out of $d$ possible is given by

n(k,d)=1.2\sqrt{d\over 2k+1}
\end{displaymath} (16)

(Sevast'yanov 1972, Diaconis and Mosteller 1989).

See also Birthday Attack, Coincidence, Small World Problem


Abramson, M. and Moser, W. O. J. ``More Birthday Surprises.'' Amer. Math. Monthly 77, 856-858, 1970.

Ball, W. W. R. and Coxeter, H. S. M. Mathematical Recreations and Essays, 13th ed. New York: Dover, pp. 45-46, 1987.

Bloom, D. M. ``A Birthday Problem.'' Amer. Math. Monthly 80, 1141-1142, 1973.

Bogomolny, A. ``Coincidence.''

Clevenson, M. L. and Watkins, W. ``Majorization and the Birthday Inequality.'' Math. Mag. 64, 183-188, 1991.

Diaconis, P. and Mosteller, F. ``Methods of Studying Coincidences.'' J. Amer. Statist. Assoc. 84, 853-861, 1989.

Feller, W. An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed. New York: Wiley, pp. 31-32, 1968.

Finch, S. ``Puzzle #28 [June 1997]: Coincident Birthdays.''

Gehan, E. A. ``Note on the `Birthday Problem.''' Amer. Stat. 22, 28, Apr. 1968.

Heuer, G. A. ``Estimation in a Certain Probability Problem.'' Amer. Math. Monthly 66, 704-706, 1959.

Hocking, R. L. and Schwertman, N. C. ``An Extension of the Birthday Problem to Exactly $k$ Matches.'' College Math. J. 17, 315-321, 1986.

Hunter, J. A. H. and Madachy, J. S. Mathematical Diversions. New York: Dover, pp. 102-103, 1975.

Klamkin, M. S. and Newman, D. J. ``Extensions of the Birthday Surprise.'' J. Combin. Th. 3, 279-282, 1967.

Levin, B. ``A Representation for Multinomial Cumulative Distribution Functions.'' Ann. Statistics 9, 1123-1126, 1981.

McKinney, E. H. ``Generalized Birthday Problem.'' Amer. Math. Monthly 73, 385-387, 1966.

Mises, R. von. ``Über Aufteilungs--und Besetzungs-Wahrscheinlichkeiten.'' Revue de la Faculté des Sciences de l'Université d'Istanbul, N. S. 4, 145-163, 1939. Reprinted in Selected Papers of Richard von Mises, Vol. 2 (Ed. P. Frank, S. Goldstein, M. Kac, W. Prager, G. Szegö, and G. Birkhoff). Providence, RI: Amer. Math. Soc., pp. 313-334, 1964.

Riesel, H. Prime Numbers and Computer Methods for Factorization, 2nd ed. Boston, MA: Birkhäuser, pp. 179-180, 1994.

Sayrafiezadeh, M. ``The Birthday Problem Revisited.'' Math. Mag. 67, 220-223, 1994.

Sevast'yanov, B. A. ``Poisson Limit Law for a Scheme of Sums of Dependent Random Variables.'' Th. Prob. Appl. 17, 695-699, 1972.

Sloane, N. J. A. A014088 and A033810 in ``An On-Line Version of the Encyclopedia of Integer Sequences.''

Stewart, I. ``What a Coincidence!'' Sci. Amer. 278, 95-96, June 1998.

Tesler, L. ``Not a Coincidence!''

info prev up next book cdrom email home

© 1996-9 Eric W. Weisstein