So, I’m working through the SICP book (which is really good) to refresh my scheme knowledge. The math behind one of the exercises confused me though — while the math I have to do to solve the problem is pretty trivial (it took me <5 minutes), I’m really confused as to where this came from. It’s got to do with calculating Fibonacci numbers in logarithmic, as opposed to linear time. Here’s the text of the exercise:

Spoiler:

Recall the transformation of the state variables a and b in the fib-iter process of section 1.2.2: a <- a + b and b <- a. Call this transformation T, and observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n). In other words, the Fibonacci numbers are produced by applying Tn, the nth power of the transformation T, starting with the pair (1,0). Now consider T to be the special case of p = 0 and q = 1 in a family of transformations Tpq where Tpq transforms the pair (a,b) according to a <- bq + aq + ap and b <- bp + aq. Show that if we apply such a transformation Tpq twice, the effect is the same as using a single transformation Tp'q' of the same form, and compute p' and q' in terms of p and q. This gives us an explicit way to square these transformations, and thus we can compute Tn using successive squaring, as in the fast-expt procedure. Put this all together to complete the following procedure, which runs in a logarithmic number of steps.

I understand the math, but I have no idea where it comes from. It’s pretty obvious that the transformations a <- a + b and b <- a are a case of Tpq, but where does that come from? The initial transformations could be a case of a whole load of different transformation families. Where does the specific one in the text come from, and why does it work? Couldn’t there be some other family of transformations that works?

I’m really not much of a math guy — I’ve had some college calculus, but no linear algebra or anything ‘beyond’ that in a standard curriculum.

Each step is a transformation of a 2-vector (a, b) into a new 2-vector (a+b, a). This is a simple linear transformation, so you can find a 2x2-matrix M such that M * (a, b) = (a+b, a) = T(a, b) Repeated application of a linear transformation is simply repeated multiplication T^n(a, b) = M * M * ... * M * (a, b) = M^n * (a, b) In other words: F_n = first element of M^n * (1, 0)

At this point, you already have a logarithmic algorithm, because you can calculate M^n in logarithmic time by repeated squaring, but you'd need 8 multiplications and 4 additions per squaring. You can optimize further by taking a good look at M and its squares and you'll eventually arrive at your simplified formula involving p and q.