Large Numbers

First page . . . Back to page 4 . . . Forward to page 6 . . . Last page (page 11)

The Mega and the Moser

These numbers were described by Hugo Steinhaus and Leo Moser in 1951 book Mathematical Snapshots[35]. Using an iteration remarkably similar to the Jain iterated vargita-samvargita[31], these numbers show how easy it is to create a notation for extremely large numbers.

In the earlier "Steinhaus polygon notation", there were 3 special symbols: a triangle, a square, and a circle.

X inside a triangle equals X^X
X inside a square equals X inside X concentric triangles
X inside a circle equals X inside X concentric squares

Steinhaus called 2 inside a circle "Mega" and 10 in a circle "Megiston".

(Mathematical Snapshots was reprinted in 1969, called the "third American edition"; I learned about Megiston at a university library in June 1979.)

Later, in what is now called "Steinhaus-Moser notation", the circle was replaced with a pentagon and higher-order polygons (hexagons, etc.) were added to the system.

As before, X inside a triangle equals X^X
As before, X inside a square equals X inside X concentric triangles
X inside a pentagon equals X inside X concentric squares
X inside a hexagon equals X inside X concentric pentagons
and so on.

The "Mega" is now represented by 2 inside a pentagon, and "Megiston" is 10 inside a pentagon. A new, much larger quantity called "Moser's number" is "2 inside a megagon", where a "megagon" is a polygon with a Mega sides.

Here is the notation in functional form:

sm(a,n,p) = { a^a for n = 1, p = 3 { { sm(a,a,p-1) for n = 1, p > 3 { { sm(sm(a,1,p),n-1,p) for n > 1

and here are a few values:

sm(2,1,3) = 2^2 = 4 sm(2,2,3) = sm(sm(2,1,3),1,3) = sm(4,1,3) = 4^4 = 256 sm(2,1,4) = sm(2,2,3) = 256 mega = sm(2,1,5) = sm(2,2,4) = sm(sm(2,1,4),1,4) = sm(256,1,4) = sm(256,256,3) = sm(256^256,255,3) = sm((256^256)^(256^256),254,3) = sm([(256^256)^(256^256)]^[(256^256)^(256^256)],253,3) = ... megiston = sm(10,1,5) moser = sm(2,1,Mega)

We can approximate the sm function in terms of the hy function for small values of p:

sm(n,1,3) = n^n = hy(n,3,n) = hy(n,4,2) sm(n,2,3) = (n^n)^(n^n) = n^(n*n^n) = n^(n^(n+1)) ~= hy(n,4,3+epsilon) sm(n,3,3) = sm(n,2,3)^sm(n,2,3) = (n^(n^(n+1)))^(n^(n^(n+1))) = n^(n^(n+1) * n^(n^(n+1))) = n^(n^(1+n+n^(n+1))) ~= n^(n^(n^(n+1+epsilon))) ~= hy(n,4,4+epsilon) by induction sm(n,x,3) ~= hy(n,4,x+1+epsilon) sm(n,1,4) = sm(n,n,3) ~= hy(n,4,n+1+epsilon) sm(n,2,4) = sm(sm(n,1,4),1,4) ~= sm(hy(n,4,n+1+epsilon),1,4) ~= hy(hy(n,4,n+1+epsilon),4,hy(n,4,n+1+epsilon)+1+epsilon) "intuitively", sm(n,1,4) ~= hy(n,4,n+1+epsilon) sm(n,1,5) ~= hy(n,5,n^2)

The value of Mega can be computed by hypercalc's internal BASIC interpreter with the following code:

10 let mega = 256; 20 for n=1 to 256; 40 let mega = mega ^ mega; 80 next n 160 print "Mega = ", mega 320 end

The first few steps of this generate the numbers:

256 = 2⁸ = 2^2³
256²⁵⁶ = 2²⁰⁴⁸ = 2^2¹¹ ≈ 3.231700607153×10⁶¹⁶
(256²⁵⁶)^{(256²⁵⁶)} = 2^2²⁰⁵⁹ ≈ 10^{(1.992373902866×10⁶¹⁹)}
[(256²⁵⁶)^{(256²⁵⁶)}]^{[(256²⁵⁶)^{(256²⁵⁶)}]} = 2^{2^{2059+2²⁰⁵⁹}} ≈ 10^{10^{(1.992373902866×10⁶¹⁹)}}

Each time through the loop there are twice as many 256's — so there are 2²⁵⁶ 256's in mega. However, the parentheses are grouped differently from the power towers discussed above. After two times through the loop, for example, it's (256²⁵⁶)^(256²⁵⁶). That is not as big as 256^{(256^{(256²⁵⁶)})} — the former is 10^{(1.992373902866×10⁶¹⁹)}, the latter is 10^{[10^{(7.78271055807×10⁶¹⁶)}]}. This discrepancy continues, with the result that the mega is about 10^(10^(10^...^(1.992373902866×10⁶¹⁹)...)), with 255 10's before the "1.992373902866×10⁶¹⁹" part. For reasons explained here, that is equivalent to what you get if you replace all but the last few 10's with nearly any other number between say 2 and 1000. hypercalc's final answer is:

255 PT ( 1.992373902865×10⁶¹⁹ )

which represents a power tower with 255 powers of 10 ("255 P.T.") and 1.992373...×10⁶¹⁹ at the top.

Fast Growing Hierarchies

We have a lot of larger finite and well-defined numbers to come. As we proceed it becomes steadily harder to put them in ascending order. This is where it becomes really useful to use a formally-defined function hierarchy, as discussed earlier.

The (unfortunately somewhat poorly-named) phrase "fast growing hierarchy" refers to a family of construction and proof techniques that together can be used to demonstrate a lot of useful results that are important for working out the relative sizes of the large numbers to be discussed below.

Many examples of function hierarchies are possible, depending on the purpose for which they are to be used. An often-used example is the Hardy hierarchy, which begins (using n and m to refer to finite integers):

H₀(n) = n
H₁(n) = n+1
H₂(n) = n+2
H_m(n) = n+m
H_ω(n) = H_n(n) = 2n
H_ω+1(n) = H_ω(n+1) = 2n+2
H_ω+2(n) = H_ω+1(n+1) = H_ω(n+2) = 2n+4
H_ω+m(n) = 2n+2m
. . .
H_ω×2(n) = 4n
H_ω×3(n) = 8n
H_ω×m(n) = 2^m×n
H_ω²(n) = 2ⁿ×n
. . .
H_ω³(n) ≈ n↑↑{n}
H_ω⁴(n) ≈ n↑↑↑{n}
H_ω⁵(n) ≈ n↑↑↑↑{n}
H_ω^ω+1(n) ≈ {n,n,n} = n{n}n
. . .

In the last line here, "n{n}n" is using the Bowers abbreviation for n up-arrows, and "{n,n,n}" is a 3-element Bowers array.

Another function hierarchy popular with "googologists" is commonly called "the" fast-growing hierarchy, and often that name is used without regard to the fact that there is no single definition of a structure to derive all the functions that the googologists are trying to use under that name. Nevertheless, there is agreement on how the hierarchy begins, and it relates closely to the Goodstein sequences we will discuss next. Here is how it begins:

f₀(n) = n+1
f₁(n) = f₀ⁿ(n) = n×2
f₂(n) = f₁ⁿ(n) = n×2ⁿ
f₃(n) = f₂ⁿ(n) = n×2ⁿ×2^n×2ⁿ... (with n terms in the series)
f₄(n) ≈ n↑↑↑n
f₅(n) ≈ n↑↑↑↑n
. . .

These functions fall into a pattern that should look familiar:

*f₀() performs addition K+n,
f₁() performs multiplication K n,
f₂() slightly exceeds exponentiation Kⁿ,
f₃() slightly exceeds tetration or "hyper4" k↑n
. . .

And the next functions after that are equivalent in growth rate to the higher "hyper" operators. We also see that the f() functions with integer subscripts f_i() are similar to the Hardy functions with subscript ωⁱ.

As in our earlier examples there are an infinite number of functions with integer indices like this followed by many more with cardinal infinite indices.

This function hierarchy is quite well described in the Numberphile video TREE vs Graham's number.

That's enough on this for now, but we'll revisit this rich topic later.

Goodstein Sequences

Reuben Goodstein in 1944 proved [32] a remarkable theorem about the finite limit of an iterative process that seems to most observers to be destined to go on forever.

Higher than any of the finite-indexed f_n() functions, but not quite as high as the single-argument Ackermann function, is gw(n), the length of the weak Goodstein sequence starting with n. After we discuss that we'll move on to other things, and eventually come back to Goodstein sequences again using the "strong" definition.

"Goodstein sequence of n" = the sequence of integers constructed as in Goodstein's Theorem from a starting value of n.

The length of a Goodstein sequence, and the highest value achieved, are both guaranteed to be finite, and are given by a fast-growing function in n. Two definitions are common:

gw(n) = highest value in the Goodstein sequence of n.

gw(n) = number of terms in the Goodstein sequence of n (counting from the initial term equal to n and the first term equal to 0); or equivalently: the value of the base b of the term with a value of 1.

There are two ways to do this, "weak" and "strong". We'll start with the "weak" version, where the numbers grow less quickly and the sequence gets to 0 sooner.

Start with a number, like 7; and start with base 2. Write out the number "in base 2" by breaking it up into a sum of "digits" multiplied by powers of 2:

7 = 1×2² + 1×2¹ + 1×2⁰

Now "increase the base to 3" by changing the 2's to 3's (but leave the exponents alone):

1×3² + 1×3¹ + 1×3⁰

Now interpret this as a number, which is the new (larger) value:

1×3² + 1×3¹ + 1×3⁰ = 13

Now subtract one:

12 = 1×3² + 1×3¹ + 0×3⁰

12 is the second term in the sequence, and 3 is the base.

Now repeat the process: Change the 3's to 4's, compute a new value, and subtract one:

1×4² + 1×4¹ + 0×4⁰
1×4² + 1×4¹ + 0×4⁰ = 20
19 = 1×4² + 0×4¹ + 3×4⁰ ←*

Repeating this a few more times, we get:

27 = 1×5² + 0×5¹ + 2×5⁰
37 = 1×6² + 0×6¹ + 1×6⁰
49 = 1×7² + 0×7¹ + 0×7⁰
63 = 7×8¹ + 7×8⁰ ←*

Clearly every time we make a new term in the sequence, the base is increasing by one. This means that if we want to know the length of the sequence we can just look at the value of the base. We'll do this because it makes it easier to work out how long these sequences will be. It also means:

Every time the b⁰ term of the sum decreases, b becomes b+1

We're just adding one, but we will say that there is a function f(b) = b+1. Expressing it that way is more important than it may seem at the moment. Let's do the next 8 steps:

69 =             7×9¹ + 6×9⁰
75 =             7×10¹ + 5×10⁰
81 =             7×11¹ + 4×11⁰
87 =             7×12¹ + 3×12⁰
93 =             7×13¹ + 2×13⁰
99 =             7×14¹ + 1×14⁰
105 =             7×15¹ + 0×15⁰
111 =             6×16¹ + 15×16⁰    ←*

Also, at this point, notice the lines marked with "←*". These are times where the rightmost "digit", the b⁰ term of the sum, has switched from 0×b⁰ to (b-1)b⁰. Notice that the base is 4 the first time, and it is 8 the next time, and then 16 the time after that. It doubles each time.

Here is another example, starting with 27 (in base 2, 11011₂) and writing things more concisely with digits and the base as a subscript:

                     11011₂ = 27₁₀
11011₃ = 112₁₀. 11010₃ = 111₁₀
11010₄ = 324₁₀. 11003₄ = 323₁₀    ←*
11003₅ = 753₁₀. 11002₅ = 752₁₀
11002₆ = 1514₁₀. 11001₆ = 1513₁₀
11001₇ = 2745₁₀.
11000₈ = 4608₁₀. 10777₈ = 4607₁₀    ←*
10777₉ = 7198₁₀.
10776₁₀ = 10776
10775₁₁ = 15570₁₀
10774₁₂ = 21832₁₀
10773₁₃ = 29838₁₀
10772₁₄ = 39888₁₀
10771₁₅ = 52306₁₀
10770₁₆ = 67440₁₀. 1076F₁₆ = 67439₁₀    ←*
1076F₁₇ = 85661₁₀
...

Again the lines marked with "←*" show where the lowest digit was a zero, and again the base doubles from 4 to 8 and then to 16.

This doubling is a universal phenomenon; it happens in any Goodstein sequence even if we were to start with a base other than 2. The reason is pretty easy to see: When it is 0×b⁰, we're going to need to subtract one from the next "digit" to the left, which takes one step to do, and then we have b-1 in the rightmost "digit" and it will therefore take another b-1 steps to get this down to zero. So we spend exactly b steps, and during this time the base goes up from b to 2b.

Every time we go from ... + k×b¹ + 0×b⁰ to ... + (k-1)×b¹ + 0×b⁰, b becomes 2b

This time there is a function g(b) = 2b, which is equivalent to repeating the function f(b) = b+1 a total of b times.

For similar reasons, the number of steps needed each time a b² digit decreases are expressed by a function h(b) that is a repetition of the g() function:

Every time we go from ... + k×b² + 0×b¹ + 0×b⁰ to (k-1)×b² + 0×b¹ + 0×b⁰, b becomes 2^bb.

The new function h(b) is 2^bb, which is exactly what you expect to get if you start with b and double it b times.

The f(), g(), and h() functions we have defined so far fall into a pattern that should look familiar:

f(n) = n+1 = f₀(n)
g(n) = 2n = f₁(n)
h(n) = 2ⁿn = f₂(n)

These are none other than the first three functions of the fast-growing hierarchy described earlier.

Of course, the next-higher digits will grow the sequence length at a rate like that of tetration and the higher "hyper" operators. In general, if our starting value is 2ⁿ then the sequence will start with 111111₃ (with n 1's) and the length of the sequence will involve the n^th hyper operator. Very approximately,

gw(2ⁿ) ≈ 2↑↑...↑(2ⁿ) (with n-2 arrows)
= hy(2, n, 2ⁿ)

The much more slowly growing "Goodstein numbers" g_n(n), as seen in Sloane's sequence A266202, considers the n^th term in the "Goodstein sequence of n" as defined above. So for example, when we started with 7 and iterated 7 times we had 69 = 7×9¹, so 69 is the 7^th member of sequence A266202 (the starting 0 is considered the 0^th member of the sequence). The corresponding sequence for the "strong" definition of Goodstein iteration is A266201, and there the 7^th term is 37665879.

For more, you can skip ahead a bit to Goodstein sequences (strong).

First page . . . Back to page 4 . . . Forward to page 6 . . . Last page (page 11)

Japanese readers should see: 巨大数論 (from @kyodaisuu on Twitter)

If you like this you might also enjoy my numbers page.

s.30