info prev up next book cdrom email home

Tree Searching

N.B. A detailed on-line essay by S. Finch was the starting point for this entry.


In database structures, two quantities are generally of interest: the average number of comparisons required to

1. Find an existing random record, and

2. Insert a new random record into a data structure.
Some constants which arise in the theory of digital tree searching are
$\displaystyle \alpha$ $\textstyle \equiv$ $\displaystyle \sum_{k=1}^\infty {1\over 2^k-1}=1.6066951524\ldots$ (1)
$\displaystyle \beta$ $\textstyle \equiv$ $\displaystyle \sum_{n=1}^\infty {1\over(2^n-1)^2}=1.1373387363\ldots.$ (2)

Erdös (1948) proved that $\alpha$ is Irrational. The expected number of comparisons for a successful search is
$\displaystyle E$ $\textstyle =$ $\displaystyle {\ln n\over\ln 2}+{\gamma-1\over\ln 2}-\alpha+{\textstyle{3\over 2}}+\delta(n)+{\mathcal O}(n^{-1/2})$ (3)
  $\textstyle \sim$ $\displaystyle \lg n-0.716644\ldots+\delta(n),$ (4)

and for an unsuccessful search is
$\displaystyle E$ $\textstyle =$ $\displaystyle {\ln n\over\ln 2}+{\gamma\over\ln 2}-\alpha+{\textstyle{1\over 2}}+\delta(n)+{\mathcal O}(n^{-1/2})$ (5)
  $\textstyle \sim$ $\displaystyle \lg n-0.273948\ldots+\delta(n).$ (6)

Here $\delta(n)$, $\epsilon(s)$, and $\rho(n)$ are small-amplitude periodic functions, and Lg is the base 2 Logarithm. The Variance for searching is
\begin{displaymath}
V\sim {1\over 12}+{\pi^2+6\over 6(\ln 2)^2}-\alpha-\beta+\epsilon(s)\sim 2.844383\ldots+\epsilon(s)
\end{displaymath} (7)

and for inserting is
\begin{displaymath}
V\sim {1\over 12}+{\pi^2\over 6(\ln 2)^2}-\alpha-\beta+\epsilon(s)\sim 0.763014\ldots+\epsilon(s).
\end{displaymath} (8)

The expected number of pairs of twin vacancies in a digital search tree is


\begin{displaymath}
\left\langle{A_n}\right\rangle{}=\left[{\theta+1-{1\over Q}\...
...ha^2-\alpha}\right)+\rho(n)}\right]n+{\mathcal O}(\sqrt{n}\,),
\end{displaymath} (9)

where


$\displaystyle Q$ $\textstyle \equiv$ $\displaystyle \prod_{k=1}^\infty \left({1-{1\over 2^k}}\right)=0.2887880950\ldots$ (10)
  $\textstyle =$ $\displaystyle {1\over 3}-{1\over 3\cdot 7}+{1\over 3\cdot 5\cdot 15}-{1\over 3\cdot 5\cdot 15\cdot 21}+\ldots$ (11)
  $\textstyle =$ $\displaystyle \mathop{\rm exp}\nolimits \left[{-\sum_{n=1}^\infty {1\over n(2^n-1)}}\right]$ (12)
  $\textstyle =$ $\displaystyle \sqrt{2\pi\over\ln 2}\,\mathop{\rm exp}\nolimits \left({{\ln 2\ov...
...\left[{1-\mathop{\rm exp}\nolimits \left({-{4\pi^2 n\over\ln 2}}\right)}\right]$ (13)

and
$\displaystyle \theta$ $\textstyle =$ $\displaystyle \sum_{k=1}^\infty {k 2^{k+1}\over 1\cdot 3\cdot 7\cdot 16\cdots(2^k-1)} \sum_{j=1}^k {1\over 2^j-1}$  
  $\textstyle =$ $\displaystyle 7.7431319855\ldots.$ (14)

(Flajolet and Sedgewick 1986). The linear Coefficient of $\left\langle{A_n}\right\rangle{}$ fluctuates around
\begin{displaymath}
c=\theta+1-{1\over Q}\left({{1\over\ln 2}+\alpha^2-\alpha}\right)=0.3720486812\ldots,
\end{displaymath} (15)

which can also be written


\begin{displaymath}
c={1\over\ln 2}\int_0^\infty {x\over 1+x}{dx\over(1+x)(1+{\t...
...(1+{\textstyle{1\over 4}}x)(1+{\textstyle{1\over 8}}x)\cdots}.
\end{displaymath} (16)

(Flajolet and Richmond 1992).


References

Finch, S. ``Favorite Mathematical Constants.'' http://www.mathsoft.com/asolve/constant/bin/bin.html

Finch, S. ``Favorite Mathematical Constants.'' http://www.mathsoft.com/asolve/constant/dig/dig.html

Finch, S. ``Favorite Mathematical Constants.'' http://www.mathsoft.com/asolve/constant/qdt/qdt.html

Flajolet, P. and Richmond, B. ``Generalized Digital Trees and their Difference-Differential Equations.'' Random Structures and Algorithms 3, 305-320, 1992.

Flajolet, P. and Sedgewick, R. ``Digital Search Trees Revisited.'' SIAM Review 15, 748-767, 1986.

Knuth, D. E. The Art of Computer Programming, Vol. 3: Sorting and Searching, 2nd ed. Reading, MA: Addison-Wesley, pp. 21, 134, 156, 493-499, and 580, 1973.



info prev up next book cdrom email home

© 1996-9 Eric W. Weisstein
1999-05-26