Large Numbers -- Notes (page 4) at MROB

First page . . . Back to page 3 . . . Forward to page 5 . . . Last page (page 8)

Notes on The Busy Beaver Function

The "Busy Beaver game" was originally defined[50] by Tibor Rado at Ohio State in 1962. That same paper shows the technique of "chaining" Turing machines to construct machines that generate f(f(...(x))) marks on the tape, where f(x) represents the calculation performed by one of the machines, and based on this they show that BB(100)≥(((7!)!)!)! ≈ 10^{10^{7.46115084×10¹⁶⁴⁷⁷}}

In 1964 Milton Green developed a lower bound for the Busy Beaver function that was published [51] in 1964. In this paper he shows that for each additional two states, machines can produce numbers of a size comparable to the next-higher function in a hierarchy that is very similar to the Ackermann or "Hyper" function series.

Heiner Marxen and Juergen Buntrock discovered many candidates for actual Busy Beaver machine definitions for the 5-state and 6-state problems. A few of them are listed here, including the machine that was eventually proven to be the record holder for "number of steps before halting".

Regarding the general case of BB(n) as n increases, Marxen and Buntrock described Green's lower bound as "non-trivial (not primitive recursive)". This lower bound can be calculated but is too complex to state as a single expression in terms of n. When n=8 the method gives

BB(8) ≥ 3 × (7 × 3⁹² - 1) / 2 ≈ 8.248×10⁴⁴

Another similar result, quoted by Dewdney in his Scientific American column (Aug 1984)¹, gives a lower bound for BB(12) that is slightly larger than 4096^④166, where ^④ is the hyper4 function AKA "tetration".

Ray Kurzweil, in his 1999 book The Age of Spiritual Machines, discusses the Busy Beaver function in a footnote. I believe he was quoting the results of Green or Marxen and Buntrock, because his description is suggestive of the Ackermann function and because of the near-agreement of the values for BB(8). Kurzweil states that a 6-state Turing machine "can add" and BB(6) is 35; that a 7-state machine can multiply and BB(7) is 22961, and that an 8-state machine can exponentiate and BB(8) is "about 10⁴³". Then he goes on to say that BB(10) requires a notation in which the height of a stack of exponents is given by another stack of exponents whose height is given by... and so on; this might be equivalent to the "hyper5" function a↑↑↑b. (Presumably BB(9) would allow something in between like hyper4 (tetration), making a few successive values of BB(N) calculate values comparable to the generalized hyper operator of order N-5.) It soon outgrows this though, as Kurzweil says that BB(12) requires an "even more exotic notation". He finally proposes that the limits of human intelligence will be reached well before we could comprehend BB(100).

The Kurzweil lower bounds were soon exceeded by later developments. BB(6) candidates have been found through a decades-long effort to thoroughly search all possibilities.

Direct and Thorough Searching

The direct approach to finding actual values of the busy beaver function is to simulate Turing machines with a computer program, trying each possible set of rules to see what it does. The first impediment to this is that the number of machines to test grows so quickly. Lin and Rado give the forumla (2N-1)4N^2N-2, which incorporates some of the rules listed below:

N (2N-1) 4N^2N-2
1: 1
2: 192
3: 103680
4: 117440512
5: 230400000000
6: 697437190619136
7: 3018837446159761408
8: 17708874310761169551360
etc.

To actually implement a complete test of all Turing machines of a given state, you have to deal with many different types of cases. Many are easy to deal with and many are very difficult.

(If your browser supports Java you can try the Turing machine simulators here or here or here. If you have Linux, you can use my own simulator written in perl)

Since the tape is symmetrical, left and right can be interchanged. These two programs differ only in the directions of the arrows, so one can be ignored:

1,_: 2,1,> 1,1: 2,1,<
2,_: 1,1,< 2,1: H,1,>

1,_: 2,1,< 1,1: 2,1,>
2,_: 1,1,> 2,1: H,1,<

the latter program is skipped by setting as a rule that the "1,_" state always moves to the right.

If the "1,_" state writes a "_" and moves to state N, the machine will find itself in state N with a completely empty tape. Such a machine is identical to a machine in which states 1 and N are switched, except that it takes one extra step. Thus, for the purpose of the Busy Beaver problem, the first machine can be ignored. This is accomplished by setting a rule that the "1,_" state should always write a 1.

If the "1,_" state moves to state N, and if N is greater than 2, states N and 2 can be swapped and the resulting machine is equivalent, and as a result one of these two machines can be ignored. This is accomplished by setting a rule that the "1,_" state always proceeds to state 1 or 2.

If the "1,_" state goes to state 1, then the machine will stay in state 1 forever and write 1's forever, and therefore will never halt. Thus, such machines can be skipped. Combined with the previous rules, this completely defines the actions for the "1,_" state: It should be "1,_: 2,1,>".

The aforegoing rules establish that the "2,_" action will be the second to execute. Therefore it should not halt: if it did the machine would never use the "1,1" action, which could be used by having the "2,_" action perform "1,1,<".

Furthermore, if the "2,_" action goes to state 1 or state 2, it should not move to the right because if it did it would create an infinite-loop machine (such a machine would continually move right, either staying in state 2 or alternating between states 1 and 2.).

No state should have both of its actions go to itself (i.e. not change state) because that would result in an infinite loop as soon as that state is entered. An example of this is if the "3,_" action and the "3,1" action both went to state 3.

Except for state 1 (which is by definition the starting state) and state 2 (which, if using the aforegoing rules, is the state entered from the "1,_" state), all the other states can be renumbered in arbitrary order, and the resulting machines all have the same behavior. This is important for machines with 4 or more states and allows lots of programs to be eliminated quickly.

If the program doesn't have any states that contain a HALT action, we know it will never halt and it can be skipped.

Programs containing two or more HALT actions can be skipped because only one of the HALTs will ever be used (whichever one it gets to first) and the other HALTs would be superfluous.

If the HALT writes a 0, the same machine with a HALT that writes a 1 will be at least as good or better; thus the 0-writing version can be ignored.

A machine that moves left when HALTing is the same as a machine that moves right when HALTing, thus only one of these two possibilities needs to be looked at. This is accomplished by setting a rule that the HALT always moves to the right.

Some programs contain unused states. For example:

1,_: 1,1,> 1,1: 1,1,<
2,_: 2,1,> 2,1: H,1,>

State 1 always goes to state 1, so state 2 is never reached. Programs like this can be skipped because there is always a way to modify the program so that it uses the extra state and produces at least as much output.

A similar example is:

1,_:    2,1,>        1,1:    2,1,<
2,_:    1,1,<        2,1:    1,1,<
3,_:    4,1,>        3,1:    4,1,<
4,_:    H,1,>        4,1:    3,1,<

There is no way to get past states 1 and 2. Such machines are eliminated by looking for loops in the state graph.

Some programs halt very quickly, for example:

1,_:    2,1,>        1,1:    2,1,<
2,_:    3,1,>        2,1:    3,1,<
3,_:    4,1,>        3,1:    4,1,<
4,_:    H,1,>        4,1:    H,1,>

This program halts quickly because each state goes to a higher-numbered state, and the highest state (state 4) always halts. Based on that alone you know it will run for only 4 steps. (However, it doesn't hurt much to run such machines because they don't take long to run)

Based on all these examples, it seems as though it ought to be easy to search for the busy beaver using brute force. The trouble is in the programs that have to be run to determine what happens. Some exhibit very complex behavior. Only the simplest behaviors can be detected automatically.

If a program is simulated, and the simulation shows that the program halts before it has used all of its rules, then the unused rules are irrelevant and we know right away that all other programs that differ only in these unused rules will have the same behavior. This allows those other programs to be skipped.

Any programs that don't halt real quickly have to be analyzed in some way, usually by human inspection.

The simplest behaviors are like this trivial example:

1,_: 2,1,> 1,1: 2,1,<
2,_: 1,1,> 2,1: H,1,>

which will keep writing 1's and moving to the right.

Many programs have similar behavior, writing some continuous pattern and moving steadily and repeatedly either to the left or to the right. Here's a slightly less trivial example — it moves forever to the left in a two steps forward, one step back manner, leaving a trail of alternating 1's and 0's:

1,_:    2,1,>        1,1:    3,1,<
2,_:    3,1,<        2,1:    H,1,>
3,_:    1,_,<        3,1:    3,_,<

Automatically identifying such programs can be difficult, because sometimes the pattern it generates is complex and it takes a large number of steps to repeat each cycle.

Some programs write a fixed number of 1's but never halt, for example:

1,_: 2,1,> 1,1: H,1,>
2,_: 2,_,> 2,1: 1,1,>

which writes a single 1 and then goes off forever to the right.

Some programs grow forever but don't move steadily in one direction or the other. For example:

1,_:    2,1,>        1,1:    1,1,<
2,_:    3,1,>        2,1:    2,1,>
3,_:    1,_,<        3,1:    H,1,>

It produces an ever-expanding row of 1's in both directions, by moving back and forth. Each time it gets to an end it writes a new 1 and then reverses direction. State 3 is never reached while on a 1, so it never halts. Many programs behave in a similar manner but produce more complicated patterns or move in more complicated ways.

Some programs grow forever without any repeating pattern. The simplest example is this machine:

1,_:    2,1,>        1,1:    1,1,<
2,_:    1,_,<        2,1:    2,_,>
3,_:    1,_,<        3,1:    H,1,>

which is a binary counter. It keeps moving back and forth producing each binary number in sequence. The string of 1's and 0's is always changing, has a pattern that never repeats and grows forever (although, of course, the growth is very slow).

The next example is a little more complicated. It writes a string of 1's to the right, and simultaneously updates a binary number on the left; the binary number tells how many 1's have been written to the right:

1,_:    2,1,>        1,1:    1,1,<
2,_:    3,_,>        2,1:    2,_,>
3,_:    4,1,>        3,1:    3,1,>
4,_:    5,_,<        4,1:    H,1,>
5,_:    1,_,<        5,1:    5,1,<

Here is another machine (given by Marxen and Buntrock) that works like a binary counter, but produces a more complex pattern. Whenever all the 1's are to the left of the head (which occurs often) the binary value is recognizable. The digits are coded in groups of 3 cells: what would be a "1" digit appears as "111" and what would be a "0" appears as "_11":

1,_:    2,1,>        1,1:    1,1,<
2,_:    1,_,<        2,1:    3,_,>
3,_:    3,_,<        3,1:    4,1,>
4,_:    5,1,>        4,1:    1,_,<
5,_:    2,_,>        5,1:    H,1,>

Many other programs move around erratically, writing seemingly random patterns of 1's and 0's and transitioning between states in a seemingly random order, not moving steadily in any direction. Here's an example, the 5-state machine found by Schult in 1983:

1,_:    2,1,>        1,1:    3,_,<
2,_:    3,1,>        2,1:    4,1,>
3,_:    1,1,<        3,1:    2,_,>
4,_:    5,_,>        4,1:    H,1,>
5,_:    3,1,<        5,1:    1,1,>

This machine was a record-holder for a while in the 5-state "contest". In order to halt it has to be in state 4 with a 1 on the tape. The only way to be in state 4 is coming from state 2 with a 1 on the tape. The only way to be in state 2 is coming from state 1 with a 0 on the tape, or state 3 with a 1 on the tape. All other states lead to states 1 and 3, which is where it spends most of its time. When you run it, it appears to fill the tape with ever longer patterns of "111_111_111_...", then backtrack over those patterns and convert them into "1_1_1_1_1_...", but every time it completes a cycle there is a little difference, like an extra 1 at the left or right end. It runs for 134,467 steps before finally producing just the right pattern of 1's and 0's to lead itself into its HALT state.

There are many programs like this that run for a long time and eventually halt, and many others that have been simulated for a long time without following any type of pattern that anyone has been able to figure out. Such machines resemble this one:

1,_:    2,1,>        1,1:    2,1,<
2,_:    3,1,<        2,1:    5,_,>
3,_:    4,_,<        3,1:    1,_,>
4,_:    1,1,>        4,1:    4,_,<
5,_:    H,1,>        5,1:    3,_,>

Its behavior was sufficiently complex as to elude Marxen and Buntrock, but not their program. Designed to analyze Turing machines and prove that they do or do not halt, the program did manage to create a proof that this machine does not halt. The trouble is, the proof is so complex that it is difficult to check by hand.

Some programs run so long before halting that direct "one-step-at-a-time" simulation isn't fast enough to show how long they run. Many machines can be run to completion with first-order acceleration techniques (skipping through any repeated sequence of states and/or tape cells). However, many others require a combination of different types of acceleration. One good example was a certain machine that takes 8,690,333,381,690,951 (over 8×10¹⁵) steps to halt. When automated acceleration is too difficult, "manual" (human) analysis can sometimes be used to prove that a machine halts and calculate the relevant numbers. An example of this follows.

Busy Beaver Candidate Record-Setters

The list of Busy Beavers and BB Candidate Record-Setters has been moved to its own article, History of Busy Beaver Turing Machine Records.

6-state Records

In 2000 Marxen and Buntrock found a machine that performs iterated exponentiation, the equivalent of hyper4, leaving 6×10⁴⁶² ones on the tape — far exceeding the "35" achieved by Kurzweil's construction described earlier. Pavel Kropitz raised the BB(6) lower bound to 3.515×10¹⁸²⁶⁷ in 2010. This record held until May 2022, when Kropitz alternated with Ligocki in quick succession, raising the BB(6) record to 6.0×10³⁹⁴⁵⁶, 2×10⁹⁸⁶⁴¹, 10^646456993, 10^{10^{10^{10²⁰⁸²³}}}, and a staggering 10↑↑15. 6-state machines perform tetration, while Kurzweil's machines would need 9 states to do the same.

The 1997 Marxen-Buntrock BB₆ Record-Setter

For a more thorough and rigorous analysis read this page.

In 1997, Marxen and Buntrock found a Turing machine that set the record BB₆ ≥ 95,524,079.

As described in the article, the machine arrives several times at a pattern that is a single continuous set of N_i ones, with the head in state A and pointing at the 2^nd 1 from the left. Each time this happens the number of 1's has increased by a multiple of about 5/2. Then, after the 21^st such occurrence, the machine changes half the 1's to 0's and halts. It has calculated a number that is about 2.5²⁰ and somehow "magically" knows to stop rather than just running forever. This is achieved via a "modulo test" on the number of 1's each of the 21 times, which has 4 possible answers and the "halting" answer happens for the first time after the 20^th repetition.

Collatz Iteration in Busy Beavers

The 1997 Marxen-Buntrock machine serves as an example of an algorithm that performs repeated multiplications and a test using modulo arithmetic. The modulo test is computing the value of the remainder when a number is divided by some other (usually small) number. For example, 143 modulo 4 is 3, because there is a remainder of 3 when dividing by 4: 143 = 35×4+3.

A "Collatz mapping"² is an iterated function on integers, where at each iteration a modulo test is performed. The simplest modulo test is a test for odd or even, equivalent to noting the remainder after dividing by 2. Based on the modulo test, the Collatz mapping selects from two or more possible functions (such as 5X/2 or 3X+1).

The simplest and best-known Collatz mapping is the "3N+1 problem", which states that for any X, the next value of X in the iteration is:

		X/2	if X is even
		3X + 1	if X is odd

Such iterations are notable for the fact that the behavior of the iteration is very difficult to characterize. Does any value for X cause the iteration to grow forever, or do they all always end up in the cycle 4, 2, 1, 4, 2, 1, ...? Some values of X grow pretty high.

The 1997 Marxen-Buntrock BB(6) Turing machine differs from the standard Collatz mapping in having a "halt" condition as one possible outcome. It also differs from the 3N+1 problem in that all of the linear functions grow the number about 2.5×, thus there is no question of it ever getting smaller — only how long will it grow before it stops?

Later BB(6) Records

Marxen and Buntrock beat their 1997 record in Oct 2000 with a machine discussed in the next section).