# Large Numbers

First page . . . Back to page 3 . . . Forward to page 5 . . . Last page (page 9)

## The Mega and the Moser

These numbers were constructed by Hugo Steinhaus and Leo Moser (in the late 1960's or earlier, exact dates unknown) just to show that it is easy to create a notation for extremely large numbers.

In the earlier "Steinhaus polygon notation", there were 3 special symbols: a triangle, a square, and a circle.

X inside a triangle equals XX
X inside a square equals X inside X concentric triangles
X inside a circle equals X inside X concentric squares

Steinhaus called 2 inside a circle "Mega" and 10 in a circle "Megiston".

Later, in what is now called "Steinhaus-Moser notation", the circle was replaced with a pentagon and higher-order polygons (hexagons, etc.) were added to the system.

As before, X inside a triangle equals XX
As before, X inside a square equals X inside X concentric triangles
X inside a pentagon equals X inside X concentric squares
X inside a hexagon equals X inside X concentric pentagons
and so on.

The "Mega" is now represented by 2 inside a pentagon, and "Megiston" is 10 inside a pentagon. A new, much larger quantity called "Moser's number" is "2 inside a megagon", where a "megagon" is a polygon with a Mega sides.

(Both versions of the notation predate 1983; I learned about the Moser's number at an academic library in June 1979.)

Here is the notation in functional form:

sm(a,n,p) = { a^a for n = 1, p = 3 { { sm(a,a,p-1) for n = 1, p > 3 { { sm(sm(a,1,p),n-1,p) for n > 1

and here are a few values:

sm(2,1,3) = 2^2 = 4 sm(2,2,3) = sm(sm(2,1,3),1,3) = sm(4,1,3) = 4^4 = 256 sm(2,1,4) = sm(2,2,3) = 256 mega = sm(2,1,5) = sm(2,2,4) = sm(sm(2,1,4),1,4) = sm(256,1,4) = sm(256,256,3) = sm(256^256,255,3) = sm((256^256)^(256^256),254,3) = sm([(256^256)^(256^256)]^[(256^256)^(256^256)],253,3) = ... megiston = sm(10,1,5) moser = sm(2,1,Mega)

We can approximate the sm function in terms of the hy function for small values of p:

sm(n,1,3) = n^n = hy(n,3,n) = hy(n,4,2) sm(n,2,3) = (n^n)^(n^n) = n^(n*n^n) = n^(n^(n+1)) ~= hy(n,4,3+epsilon) sm(n,3,3) = sm(n,2,3)^sm(n,2,3) = (n^(n^(n+1)))^(n^(n^(n+1))) = n^(n^(n+1) * n^(n^(n+1))) = n^(n^(1+n+n^(n+1))) ~= n^(n^(n^(n+1+epsilon))) ~= hy(n,4,4+epsilon) by induction sm(n,x,3) ~= hy(n,4,x+1+epsilon) sm(n,1,4) = sm(n,n,3) ~= hy(n,4,n+1+epsilon) sm(n,2,4) = sm(sm(n,1,4),1,4) ~= sm(hy(n,4,n+1+epsilon),1,4) ~= hy(hy(n,4,n+1+epsilon),4,hy(n,4,n+1+epsilon)+1+epsilon)   "intuitively", sm(n,1,4) ~= hy(n,4,n+1+epsilon) sm(n,1,5) ~= hy(n,5,n^2)

The value of Mega can be computed by hypercalc's internal BASIC interpreter with the following code:

10 let mega = 256; 20 for n=1 to 256; 40 let mega = mega ^ mega; 80 next n 160 print "Mega = ", mega 320 end

The first few steps of this generate the numbers:

256 = 28 = 223
256256 = 22048 = 2211 ≈ 3.231700607153×10616
(256256)(256256) = 222059 ≈ 10(1.992373902866×10619)
[(256256)(256256)][(256256)(256256)] = 222059+22059 ≈ 1010(1.992373902866×10619)

Each time through the loop there are twice as many 256's — so there are 2256 256's in mega. However, the parentheses are grouped differently from the power towers discussed above. After two times through the loop, for example, it's (256256)^(256256). That is not as big as 256(256(256256)) — the former is 10(1.992373902866×10619), the latter is 10[10(7.78271055807×10616)]. This discrepancy continues, with the result that the mega is about 10^(10^(10^...^(1.992373902866×10619)...)), with 255 10's before the "1.992373902866×10619" part. For reasons explained here, that is equivalent to what you get if you replace all but the last few 10's with nearly any other number between say 2 and 1000. hypercalc's final answer is:

255 PT ( 1.992373902865×10619 )

which represents a power tower with 255 powers of 10 ("255 P.T.") and 1.992373...×10619 at the top.

## Friedman Sequences

In a 1998 paper14, Harvey Friedman describes the problem of finding the longest sequence of letters (chosen from a set of N allowed letters) such that no subsequence of letters i through 2i occurs anywhere further on in the sequence. For 1 letter {A} the maximum length is 3: AAA. For 2 letters {A, B} the longest sequence is 11: ABBBAAAAAAA. For 3 letters {A, B, C} the longest sequence is very very long, but not infinite.

He then goes on to show how to construct proofs of lower bounds for N-character sequences using certain special (N+1)-character sequences. With help from R. Dougherty, he found a lower bound for the N=3 case, A7198(158386) = ack-rm(7198,158386) = ack(7198,2,158386) = hy(2,7199,158386) = 2(7199)158386, where x(7199)y represents the 7199th hyper operator.13

This value is less than the "Graham-Rothschild number" and the other "Graham's number"s (which we'll go into next). However, the N=4 case gives a result that is immensely bigger than all the versions of Graham's number. Friedman describes these relations at .

## The "Graham-Rothschild Number"

The original genuine "Graham's number", from a 1971 paper by Graham and Rothschild , is an upper bound for a problem in Ramsey theory, (graph colouring, combinatorics).

The problem is to determine the lowest dimension of a hypercube such that if the lines joining all pairs of corners are two-coloured, a planar graph K4 of one colour will be forced. (K4 is a totally-connected graph with 4 edges, topologically equivalent to the 4 vertices and 6 edges of a tetrahedron. The planar requirement means that its 4 vertices are in one plane, meaning that it can't be just any 4 vertices of the hypercube, but diagonal planes are okay. See the article by Sbiis Saibian  for an extremely thorough description with lots of illustrations.)

In 1971 Graham and Rothschild proved that an answer exists using an upper bound F(F(F(F(F(F(F(12,3),3),3),3),3),3),3), where F(n) is defined:

F(m,n) = { 2^n for m=1, n>=2 { { 4 for m>=1, n=2 { { F(m-1,F(m,n-1)) for m>=2, n>=3 Graham and Gardner's 1971 description of the number

Using the generalised hyper function, F(F(F(F(F(F(F(12,3),3),3),3),3),3),3) is equivalent to lgn(7), wherelgn(n) is defined:

lgn(n) = { hy(2,12,3) for n = 1 { { hy(1, lgn(n-1), 3) for n > 1

In other words, using up-arrows:

lgn(1) = 2↑↑↑↑↑↑↑↑↑↑↑↑3     (12 up-arrows)     = hy(2, 12, 3)
lgn(2) = 2↑↑↑...↑↑↑3     (with lgn(1) up-arrows)
lgn(3) = 2↑↑↑...↑↑↑3     (with lgn(2) up-arrows)
. . .
lgn(7) = 2↑↑↑...↑↑↑3     (with lgn(6) up-arrows)

This is the original number from Graham's proof.

As related by mathematician John Baez , Martin Gardner wanted to describe it the Mathematical Games column of Scientific American, but Graham and Gardner agreed that the definition was a little too complex, and so they invented an even larger number...

## The "Graham-Gardner Number"

This is the number commonly known as "Graham's number", but it is more attributable to Martin Gardner than to Graham. The number from the 1971 paper by Graham and Rothschild, described above, is an "upper bound". Upper bounds are "true" so long as the thing they're supposed to be "bounding" is provably smaller. That means you can use a larger number for an upper bound, and it's true too. If I tell you that 22×3=64 is an upper bound for my problem, then it's safe for you to tell someone else that 223=256 is an upper bound for the problem, because 256 > 64.

So Gardner and Graham came up with the definition for a larger upper bound, which was popularised by Martin Gardner in 1977. This came to be known as "Graham's number", but I call it the "Graham-Gardner number". Its value is is gn(64), where gn() is defined as follows:

gn(n) = { hy(3,6,3) for n = 1 { { hy(3, gn(n-1)+2, 3) for n > 1

This illustration from the Graham's number Wikipedia article is a popular way to illustrate the definition concisely: The "curly braces" indicate that the number of up-arrows in each "layer" is counted by the number immediately below, with 4 arrows in the bottom layer.

Todd Cesere and Tim Chow have both proven that the "Graham-Gardner number" is bigger than the Moser, and in fact even gn(2) is much bigger than the Moser. Tim's proofis outlined on a page by Susan Stepney, here.

Since gn(1) is 3↑↑↑↑3, which is smaller than 2↑↑↑↑↑↑↑↑↑↑↑↑3, it follows that the "Graham-Rothschild number" is also bigger than Moser:

Moser << Graham-Rothschild << Graham-Gardner << Graham-Conway

This last variant, "Graham-Conway" is probably the least-known, but also the largest...

## The "Graham-Conway Number"

Conway and Guy's book The Book of Numbers  includes a description of a "Graham's number" which is inexact, and which differs from the other "Graham's number" definitions above. The exact text from  is:

 Graham's number is 4↑↑ . . . ↑↑4, where the number of arrows is 4↑↑ . . . ↑↑4, where the number of arrows is . . . et cetera . . . (where the number of lines like that is 64) It lies between 3→3→64→2 and 3→3→65→2.

(The last bit with the right-arrows is using Conway's Chained Arrow Notation). The problem with this description is that it doesn't tell you how many up-arrows is in the last of the "lines like that", though we know there are 64 lines. Sbiis Saibian has a thorough page on Graham's number  and calls this the "Graham-Conway number". He makes an educated guess that the number of up-arrows on the last 4↑..↑4 line should be the same as in the Graham-Gardner definition, i.e. 4 up-arrows:

Graham-Conway Number = Gc(64)
where Gc(n) is defined for all positive integer n as follows:
Gc(1) = 4↑↑↑↑4
Gc(n+1) = 4↑↑...↑↑4   with Gc(n) up-arrows

With this clarification, the definition of the "Graham-Conway number" is just that of the "Graham-Gardner number" except that you change the 3's to 4's. This definition has a sort of elegance in that everything is 4's, even the number of lines can be expressed as 64=4×4×4 if you're really trying to use 4's everywhere.

Clearly the "Graham-Conway number" is larger than the others, and it's the largest "Graham's number" I've seen, notwithstanding xkcd: How to horrify mathematicians

This strip is using a 2-argument Ackermann function. There are several two-argument definitions in my listing (Versions of Ackermann's Function), and we could choose any of them, so we may as well choose one that is easily turned into up-arrows, such as the one by Buck in 1963. This gives A(gn(64),gn(64)) = 2↑↑...↑↑gn(64), where the number of up-arrows is gn(64)-2. Thus it's "a little bigger" than gn(65). If xkcd wanted to horrify mathematicians a little more, and with fewer symbols, they might be better off defining a "single-argument" Ackermann function a(x)=A(x,x), then subscripting one of the g's by this a() function:

ga(g64) =     AAUGHHHH!!!

## Superclasses

Now let's take a break from the specific examples and functions for a moment to describe another type of "number class" that is evident in the way people describe large numbers. We'll call these superclasses.

I first saw the concept in Eliezer Yudkowsky's description8 of what I call the "Graham-Gardner number", apparently adapted from the very similar description by Martin Gardner16. I'm adapting it further to illustrate the superclasses.

(1) Let's start with 3↑3. This is 33 = 27, a number that is small enough for anyone to visualise.

(2) Next, consider 3↑↑3 = 33. This is 333 = 327 = 7625597484987, about 7 trillion. Even 7 billion would be too big for anyone to visualise, but most people can understand it as being comparable to something familiar. (For example, it is about 1000 times the world population and about a tenth the number of cells in a human body.)

(3) 3↑↑↑3 or 33, is 3(33), which is 37625597484987. That would be a power tower of 3's, 7 trillion levels high. Now we are far beyond the ability to understand — nothing in the real world is this numerous, and even combinations of things (such as the number of ways to shuffle every particle in the known universe) only require, at most, 4 or 5 levels of a power-tower to express. However, even though the quantity is beyond understanding, the procedure for computing it can be visualised: Start with n=1. Now replace n with 3n. Replace n with 3n again, and so on — repeat seven trillion times. This is a procedure that one can visualise doing, and it can even be completed in a reasonable amount of time using a fast computer and a suitable representation format such as level-index (be advised: some rounding may occur).

(4) Now consider 3↑↑↑↑3, or 33, the first step in the "Graham-Gardner number" definition. This is 3(33), which is 3(3(3(...(33))...)), a chain of 7 trillion 3's and hyper4 operators. Every one of these requires performing the exponent operation an unimaginable number of times. So now, the procedure is beyond human ability to visualise, although it can be understood. Start with n=3. Now replace n with an exponential tower of threes of height n. Repeat 33 - 2 more times, where 33 is the huge number whose calculation we visualised in the previous paragraph! Martin Gardner, also writing about "Graham's number" (the one here I call the "Graham-Gardner number"), said:

3↑↑↑↑3 is unimaginably larger than 3↑↑↑3, but it is still small as finite numbers go, since most finite numbers are very much larger.

(5) We have just seen four numbers in a sequence: 3↑3, 3↑↑3, 3↑↑↑3, and 3↑↑↑↑3. Now consider the formula for the "Graham-Gardner number". Begin with x = 3↑↑↑↑3, the number whose calculation procedure cannot even be visualised — then increase the number of up-arrows from 4 to x. Then increase the number of up-arrows to this new, larger value of x again. Then — repeat 61 more times! Here's what Yudkowsky had to say:

Graham's number is far beyond my ability to grasp. I can describe it, but I cannot properly appreciate it. (Perhaps Graham can appreciate it, having written a mathematical proof that uses it.) This number is far larger than most people's conception of infinity. I know that it was larger than mine. ...

This example illustrates the numbers (including those on my numbers page and on this page up to this point) can be seen as going from one level of abstraction to another. Here are the definitions of the superclasses:

Superclass 1: The number can be visualised. (Example: 27)

Superclass 2: The number cannot be visualised but it can be understood. (Example: 3↑↑3 ≈ 7 trillion)

Superclass 3: The number cannot be understood, but the procedure for computing it can be visualised. (Example: 3↑↑↑3)

Superclass 4: The procedure cannot be visualised, but the procedure can be understood. (3↑↑↑↑3)

Superclass 5: The procedure for generating the number is so abstract that it cannot be understood (by whoever's talking, by the author, by yourself, by some given group or people, etc.).

To complete the analogy to the normal classes, I will also add:

Superclass 0: The number can be "experienced" even by certain animals. (Example: 2)

Like the classes, the superclasses have important qualities that distinguish them, and this insight is what makes "superclass" a useful distinction. Each requires an additional amount of mental effort to ensure that the number is understood well anough to answer questions such as which number is larger, or which algorithm grows faster.

Notice that the first three (superclass 0, 1 and 2) are roughly equivalent to class 0, class 1 and class 2 above. The division points between them will be similar for most readers, but the upper range of superclass 2 will probably vary from one reader to another. I think I "understand" numbers about as big as the size of the visible universe, although I find it harder on some days than on others. There are probably people who have an understanding of such things as the size of the universe after cosmic inflation, or the size that would be needed for a spontaneous origin of life.

The first three superclasses can all be calculated fairly easily, using familiar techniques or modest towers of exponents — in the latter case sizes can easily be compared by looking at the number of levels in the tower and the value of the number at the top. Practical tools like hypercalc exist to actually work with these numbers.

Superclass 3 begins wherever your ability to "understand" ends. However, the procedure for computing it can still be visualised. Somewhere within superclass 3, the practical tools like hypercalc stop working. At the top end of superclass 3, we are well beyond the ability to work with actual numbers and have to start using symbols and words, because the numbers outstrip the available tools. However, it is still possible to make fairly simple proofs to show how one function converts to another. An example is proving that the higer-valued hyper4 operator ab grows about as quickly as the lower-valued hyper5 operator ab.

With superclass 4, such proofs become more difficult. This is why it is a bit tougher to compare the "Graham-Rothschild number" to the Moser. It can be worked out by most who have time and patience, and the explanation can be followed and understood fairly easy, but it probably isn't too easy to remember.

With Graham's number (the original "Graham-Rothschild number" or its more well-known variant the "Graham-Gardner number"), and other numbers of similar size and larger which we are about to get into, most readers will have stepped firmly into superclass 5. The last paragraph in the Yudkowsky quote captures what superclass 5 is about: perception and understanding are so difficult that it is hard even to compare the number to infinity. I cannot say where your superclass 5 begins, but once you reach it, it is better to stop trying to grasp the quantities in any direct way and instead use rigorous deduction, or trust others who have done so.

I have found my own boundary between 4 and 5 varies widely from one day to another. Clearly there are those (such as the people who worked on the numbers we are about to discuss) for whom "superclass 4" extends far higher than mine. I suggest it might be useful to define:

Superclass 6: The number is so big that no-one can understand its definition well enough to know anything useful about it.

In other words, superclass 6 is finite, but so large that no person, nor humanity as a whole, stands a chance of discovering anything useful about it. This probably happens with values of the Busy Beaver function with an argument greater than a few hundred.

## Conway's Chained Arrow Notation

This notation is based on Knuth's up-arrow notation. In this notation, three or more numbers are joined by right-arrows. The arrow is not an ordinary dyadic operator. You can have three or more numbers joined by arrows, in which case the arrows don't act separately, the whole chain has to be considered as a unit. It might be thought of as a function with a variable number of arguments, or perhaps a function whose single argument is an ordered list or vector.

Here is Conway's notation, adapted from Susan Stepney's excellent web page on large numbers:

A. Three-number chains are equivalent to the generalised hyper operator:

abc = hy(a, c+2, b) = a(c+2)b
= a ↑↑..↑↑ b, with c up-arrows

B. A chain with a final 1 is equivalent to the same chain with the 1 removed:

ab→...→x→1 = ab→...→x

C. Conway is silent about the meaning of a two-element chain7, but a definition is necessary. The most logical definition combines the two previous rules and assumes that they work in reverse:

ab = ab = ab→1 (by rule A)
= ab (by rule B)
therefore the reverse is true: ab = ab = ab

D. The last two numbers in a chain can be dropped if the second-to-last is a 1:

ab→...→x→1→(z+1) = ab→...→x

E. The last number in the chain can be reduced in size by 1 by taking the second-to-last number and replacing it with a copy of the entire chain with its second-to-last number reduced by 1:

ab→...→x→(y+1)→(z+1)
= ab→...→x→ (ab→...→xy→(z+1)) →z

F. If the last number in the chain is 2, rules D and E combine into:

ab→...→x→ (y+1) → 2
= ab→...→x→ (ab→...→xy→2) →1
= ab→...→x→ (ab→...→xy→2)
= ab→...→x→ (ab→...→x→ (ab→...→x→(y-1)→2) →1 )
= ab→...→x→ (ab→...→x→ (ab→...→x→(y-1)→2))

and in general,

ab→...→x→ (y+1) → 2
= ab→...→x→ (     y times
ab→...→x
) y times

G. A more general expansion of rule E follows:

ab→...→x→2→ (z+1)
= ab→...→x→ (ab→...→x→1→(z+1)) →z
= ab→...→x→ (ab→...→x ) →z

ab→...→x→3→ (z+1)
= ab→...→x→ (ab→...→x→2→(z+1)) →z
= ab→...→x→ (ab→...→x→ (ab→...→x) →z) →z

ab→...→x→4→ (z+1)
= ab→...→x→ (ab→...→x→3→(z+1)) →z
= ab→...→x→ (ab→...→x→ (ab→...→x→ (ab→...→x) →z) →z) →z

and in general,

ab→...→x→ (y+1) → (z+1)
= ab→...→x→ (     y times
ab→...→x
) →z     y times

The parentheses can only be removed after the chain inside the parentheses has been evaluated into a single number. Remember, the arrows are all taken together at once. You can't group them to evaluate, you can only reduce terms off the end using one of the rules above. abcd is not equivalent to (abc)→d, a→(bcd), a→(bc)→d, ab→(cd) or anything else like that.

The wikipedia article, Conway chained arrow notation, gives this set of rules which is equivalent to the above but probably a bit easier to use:

1. A single-number "chain" is equal to itself.

2. The chain ab represents the value ab.

3. If X is any chain (of one or more numbers), the chain Xa→1 is equivalent to Xa.

4a. The chain X→1→b for any value of b, is equivalent to X.

4b. The chain Xab for any values of a and b where both are greater than 1, is equivalent to X→(X→(a-1)→b)→(b-1).

4c. By repeating rule 4b we can see that any chain Xab with a and b both larger than 1 eventually turns into a chain X→c where "c" is a the value of the recursive construction X→(X→(X→(...(X→1→1)...)))→(b-1))→(b-1) with the number of nested parentheses depending on a.

Regarding 2's at the beginning

Note that 2↑↑↑...↑↑↑2 = 4 regardless of the number of up-arrows. Since any chain starting with 2→2→... will eventually end up as 2→2→c for some large c, it follows that any chain starting with 2→2→... is equal to 4.

Regarding 1's

Any chain starting with 1 eventually ends up as 1→ab for some a and b, which is 1↑↑↑...↑↑↑a with b arrows. This is just a power of 1, so any chain starting with 1 is equal to 1.

Any chain with a 1 in it someplace else, like abc→1→def, reduces to abc→1→x for some x, and this then immediately reduces to abc. So if you have anything with a 1 in it, you can remove the 1 and everything after it.

Some Correlations

The rules for Conway chained arrows express them in terms of Knuth's up-arrow notation and recursion; we can also express some of the other functions and numbers discussed above in terms of chained arrows:

Ackermann's function is equivalent to a 3-element chain:

A(m, n) = ( 2→(n+3)→(m-2) ) - 3

The "Graham-Gardner number" is not directly equal to a simple chained-arrow representation, but its definition in terms of the function gn(n) can be re-stated in terms of an interated function fi(n):

f1(n) = f(n) = 3→3→n = 3↑↑..↑↑3 with n up-arrows
f2(n) = f(f(n)) = 3→3→(3→3→n)
f3(n) = f(f(f(n))) = 3→3→(3→3→(3→3→n))
f4(n) = f(f(f(f(n)))), etc.

which gives the "Graham-Gardner number" G equal to f64(4) because the construction of Graham-Gardner starts with 3↑↑↑↑3 = 3→3→4, and we increase the number of arrows 63 times.

Now start with f64(1), and note that we can reverse rule 4c above (with "3→3" as the value of X), and show that:

f64(1) = 3→3→(3→3→(...3→3→(3→3→1))...))   with 64 sets of 3→3
= 3→3→(3→3→(...3→3→(3→3)→1)...)→1)→1
= 3→3→64→2

f64(27) = 3→3→(3→3→(...3→3→(3→3→27))...))   with 64 sets of 3→3
= 3→3→(3→3→(...3→3→(3→3→(3→3)))...))
= 3→3→65→2

Since f64(1) < f64(4) < f64(27), the "Graham-Gardner number" must be between 3→3→64→2 and 3→3→65→2. See here for a somewhat different discussion of the same thing.

Partial Ordering

One may speculate on the general problem of determining which is the larger of two chains ab→...→c and xy→...→z. We can begin to make answer that question for some of the shorter chains (most of which is simply a restatement and re-ordering of the examples in my partial ordering for Knuth up-arrows):

First, as noted above we can remove any 1's and anything coming after a 1. Also, if the chain starts with 2→2 the whole thing can be replaced with 4.

For two-item chains, the following are clear:

2→3 is smaller than 3→2
2→4 is the same as 4→2
for any other a and b, if b is greater than a then ab is greater than ba.

Now let's look at three-item chains:

For any a, the chain a→2→2 = a↑↑2 = aa. So for these, the one with a larger a is the larger chain.

If the first two items are the same, as in comparing abc to abd, both are like a↑↑↑...↑↑↑b but one has more arrows. So if a and b are both greater than 2, then the one with the larger d is larger.

Similarly if the first and last items are the same, as in comparing abd to acd, we are comparing two things with the same number of arrows (a↑↑↑...↑↑↑c versus a↑↑↑...↑↑↑d) and clearly the one with the larger number is larger.

A similar argument shows that if we are comparing acd to bcd, where only the first number differs, the one with the larger first number is larger. This is a generalization of the a→2→2 case above.

2→3→2 = 2↑↑3 = 2↑2↑2 = 2↑4, but 2→2→3 = 2↑↑↑2 = 2↑↑2 = 2↑2 = 4. Since 2→2→anything is 4, it's clear that if we have two 2's, the arrangement with the larger-than-2 number in the middle is larger. Also, 3→2→2 = 3↑↑2 = 3↑3 = 27, so this is the largest of the three combinations of two 2's and a 3.

There are three ways to combine a 2 and two 3's:

3→3→2 = 3↑↑3 = 3↑27; and 3→2→3 = 3↑↑↑2 = 3↑↑3 = 3↑27, so these are the same.

2→3→3 = 2↑↑↑3 = 2↑↑2↑↑2 = 2↑↑4 = 2↑2↑2↑2 = 2↑2↑4 = 2↑16

Now let's look at the combinations of 2, 3 and 4. There are 6 of these, and I'll show them here with the largest first:

3→2→4 = 3↑↑↑↑2 = 3↑↑↑3 = 3↑↑(3↑↑3) = 3↑↑(3↑3↑3) = 3↑↑(3↑27), a power-tower of height 3↑{27}.

2→3→4 = 2↑↑↑↑3 = 2↑↑↑(2↑↑↑2) = 2↑↑↑4 = 2↑↑(2↑↑(2↑↑2)) = 2↑↑(2↑↑4) = 2↑↑16, a tower of height 16.

2→4→3 = 2↑↑↑4, the same as 2→3→4 (a tower of height 16).

3→4→2 = 3↑↑4 = 3↑3↑3↑3, a tower of height 4.

4→2→3 = 4↑↑↑2 = 4↑↑4 = 4↑4↑4 = 4↑256.

4→3→2 = 4↑↑3 = 4↑4↑4 = 4↑256.

The thing to notice is that the winners have the 4 at the end, and among them the one with the 3 first is a lot larger. Let's try it again with bigger numbers all around:

4→3→5 = 4↑↑↑↑↑3 = 4↑↑↑↑(4↑↑↑↑4) = 4↑↑↑↑(4↑↑↑4↑↑↑4↑↑↑4)
3→4→5 = 3↑↑↑↑↑4 = 3↑↑↑↑(3↑↑↑↑(3↑↑↑↑3)) = 3↑↑↑↑(3↑↑↑↑(3↑↑↑3↑↑↑3))

Here it appears that 3→4→5 is going to end up being larger than 4→3→5. This is a reflection of the general rule for partial ordering of the hyper operator described above.

## Jonathan Bowers' Extended Operators

Jonathan Bowers has defined a whole series of notations that surpass everything mentioned so far. We can start by defining his extended operators, which are kind of analogous to the hyper operators. In fact, he starts with the hyper operators, which can be thought of as the original, "unextended" set of operators. Instead of putting the operator number inside a raised circle, he uses a pair of "angle-brackets":

a <n> b = ab = hy(a,n,b)

So, <1> is addition, a <1> b = a+b; <2> is multiplication and so on.

Using this notation makes it easier to make the operator itself a variable or expression, and unlike using the hy() function it retains the look of applying an operator (because the operator part is in the middle where it "belongs". For example:

a <1+2> b = a <3> b = ab
gn(1) = 3 <6> 3
gn(2) = 3 <3 <6> 3> 3
Mega ≈ 256 <4> 2
Moser ≈ Mega <Mega+1> 2 ≈ (256 <4> 2) <256 <4> 2> 2

Here is his first extended operator:

a <<1>> 2 = a <a> a
a <<1>> 3 = a <a <a> a> a
a <<1>> 4 = a <a <a <a> a> a> a
and so on
a <<1>> b     is "a expanded to b".

If you wish, you might prefer to represent this operator in a way similar to the higher hyper operators but with a 1 inside two circles (which I'll enlarge here for clarity):

ab

Since the two circles are hard to see in normal font sizes, I'll instead use the hyper1 operator symbol inside a set of parentheses: a()b.

Using this notation, the "Graham-Gardner number" is shown to be between 3()65 and 3()66 (the gn(x) function is as defined here):

3()2 = 3 <3> 3
gn(1) = 3 <6> 3
3()3 = 3 <3 <3> 3> 3 = 3 <27> 3
gn(2) = 3 <3 <6> 3> 3
and so on
3()65
gn(64) = "Graham-Gardner number"
3()66

Going back to Bowers' notation, here is the definition of the next extended operator. It is a right-associative recursive iteration of the first:

a <<2>> 2 = a <<1>> a
a <<2>> 3 = a <<1>> (a <<1>> a)
a <<2>> 4 = a <<1>> (a <<1>> (a <<1>> a))
and so on
a <<2>> b = a <<1>> (a <<2>> (b-1))
a <<2>> b     is called "a multiexpanded to b".

Note the similarity of the definition a <<2>> b = a <<1>> (a <<2>> (b-1)) to the corresponding definition for a hyper operator, e.g. ab = a(a(b-1)).

The subsequent extended operators are defined in a similar way, each in terms of the previous.

a <<3>> b     is called "a powerexpanded to b".
a <<4>> b     is called "a expandotetrated to b".

Then Bowers defines a third series of extended operators, in a similar way:

a <<<1>>> 2 = a <<a>> a
a <<<1>>> 3 = a <<a <<a>> a>> a
a <<<1>>> 4 = a <<a <<a <<a>> a>> a>> a
and so on
a <<<1>>> b     is "a exploded to b".

then the next operator:

a <<<2>>> 2 = a <<<1>>> a
a <<<2>>> 3 = a <<<1>>> (a <<<1>>> a)
a <<<2>>> 4 = a <<<1>>> (a <<<1>>> (a <<<1>>> a))
and so on
a <<<2>>> b     is called "a multiexploded to b".

and so on:

a <<<3>>> b     is called "a powerexploded to b".
a <<<4>>> b     is called "a explodotetrated to b".

You can see that this generalises easily to a function of four numbers, a, b, the number inside the angle-brackets and a number telling how many angle-brackets there are. This can be written as a function, f(a,b,c,d) or something like that — but Bowers wanted to go further.

First page . . . Back to page 3 . . . Forward to page 5 . . . Last page (page 9) s.11