Consider the following method of creating a random binary tree, given a set of leaf vertices.

1. Put all leaf vertices in a bucket called U

2. While there is more than one vertex in U

2. a. Remove two and attach them to a new internal vertex v

2. b. Put v back in U

3. return the only vertex in U as the root of our binary tree

What's the expected value of the distance between two vertices in the tree with n vertices?

So far, I've noticed that an vertex gets more distant from the root every time it is selected in step 2. a. of the algorithm. This happens with probability ~2/(n-k+1) on the kth iteration of the algorithm. Therefore the expected distance from leaf to root is 2 log n

Now consider two vertices u, v, with lowest common ancestor L.

Notice dist(u,v) = dist(u, root) + dist(v, root) - 2 dist(u, root) = 2 (log n - dist(L, root))

I'm having trouble computing the term dist(L, root)

Here's what i've tried so far:

Since dist(u,L) + dist(L, root) = dist(u, root)

Ex(dist(L,root)) = 2 log n - Ex(dist(u,L))

Ex(dist(u,L)) is the sum of P(u and v not connected at step k) from 1 to n-1

P(u and v not joined at step k) = product of (1 - P(u and v are not picked at step j)) from 1 to k

= product of (1 - 1/(n-j choose 2)) from 1 to k

which is messy

Therefore i'm not sure how to compute this term.

## distance between two vertices in a random binary tree

**Moderators:** gmalivuk, Moderators General, Prelates

### Re: distance between two vertices in a random binary tree

The factors (1 - 1/(n-j choose 2)) in that last line can be manipulated into a form that makes them much easier to multiply together.

### Who is online

Users browsing this forum: No registered users and 15 guests