- Given a number n, repeatedly replace n with the sum of the squares of its digits.
- This will eventually reach a cycle of numbers of at most 3 digits: the cycle will either be 1 --> 1 --> 1 --> etc, or consist of number(s) greater than 1.
- If the cycle is 1, the number is happy.

2 and 4 are the only known happy bases. Wikipedia claims that searches for happy bases have gone as high as 500,000,000; the citation for this claim leads to OEIS A161872, where the claim is made with no further citation.

To determine whether a base b is a happy base, we first note that applying the happy transformation to any number of 4 or more digits yields a lesser number. It therefore suffices to check all numbers up to 3 digits. Furthermore, the maximum result of the happy transformation of a number of 3 or fewer digits is 3 * (b-1)

^{2}. It therefore suffices to check all numbers less than or equal to 3 * (b-1)

^{2}.

We can therefore write the following Python code to find the smallest unhappy number in base b, or return 0 if b is a happy base:

Code: Select all

`def happybase(b):`

if b < 5: return (0, 2, 0, 2, 0)[b]

for n in xrange(2, 3 * (b-1)**2 + 1):

f, g = n, n

while g != 1:

f, x = divmod(f, b)

f, y = divmod(f, b)

f = f*f + y*y + x*x

g, x = divmod(g, b)

g, y = divmod(g, b)

g = g*g + y*y + x*x

g, x = divmod(g, b)

g, y = divmod(g, b)

g = g*g + y*y + x*x

if f == g != 1: return n

return 0

We use the cycle-finding trick employed in Pollard's rho algorithm to determine when a happy sequence has entered a cycle. We can also do this in C for a significant speedup:

Code: Select all

`long unsigned happybase(long unsigned b) { `

long unsigned n, f, g, x, y, z;

if (b == 1) return 2;

z = 3 * (b-1) * (b-1);

for (n = 2; n <= z; n++) {

f = g = n;

while (g != 1) {

x = f % b; f /= b;

y = f % b; f /= b;

f = f*f + y*y + x*x;

x = g % b; g /= b;

y = g % b; g /= b;

g = g*g + y*y + x*x;

x = g % b; g /= b;

y = g % b; g /= b;

g = g*g + y*y + x*x;

if (f == g && g != 1) return n;

}

}

return 0;

}

The C code has the obvious limitation that the variables will overflow when b is sufficiently large, and the 500,000,000 in that OEIS comment is getting close to that point. We also find that as b gets large, evaluating this function gets increasingly slower (in an average sense). So I'd like to do three things:

- Find a more efficient algorithm to determine whether a number is happy in a given base.
- Rewrite the C code to use arbitrary-precision integers. I have no idea how to do this.
- Find some theorems that can tell us whether a number n is happy in base b without having to evaluate n's happy sequence in base b. For example, if log
_{2}( log_{n}b ) is an integer, then n is happy in base b.