## distance between two vertices in a random binary tree

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

### distance between two vertices in a random binary tree

Consider the following method of creating a random binary tree, given a set of leaf vertices.

1. Put all leaf vertices in a bucket called U
2. While there is more than one vertex in U
2. a. Remove two and attach them to a new internal vertex v
2. b. Put v back in U
3. return the only vertex in U as the root of our binary tree

What's the expected value of the distance between two vertices in the tree with n vertices?

So far, I've noticed that an vertex gets more distant from the root every time it is selected in step 2. a. of the algorithm. This happens with probability ~2/(n-k+1) on the kth iteration of the algorithm. Therefore the expected distance from leaf to root is 2 log n

Now consider two vertices u, v, with lowest common ancestor L.
Notice dist(u,v) = dist(u, root) + dist(v, root) - 2 dist(u, root) = 2 (log n - dist(L, root))
I'm having trouble computing the term dist(L, root)

Here's what i've tried so far:
Since dist(u,L) + dist(L, root) = dist(u, root)
Ex(dist(L,root)) = 2 log n - Ex(dist(u,L))
Ex(dist(u,L)) is the sum of P(u and v not connected at step k) from 1 to n-1
P(u and v not joined at step k) = product of (1 - P(u and v are not picked at step j)) from 1 to k
= product of (1 - 1/(n-j choose 2)) from 1 to k
which is messy

Therefore i'm not sure how to compute this term.

Nitrodon
Posts: 497
Joined: Wed Dec 19, 2007 5:11 pm UTC

### Re: distance between two vertices in a random binary tree

The factors (1 - 1/(n-j choose 2)) in that last line can be manipulated into a form that makes them much easier to multiply together.