Proceed to Safety

First page . . . Back to page 3 . . . Forward to page 5 . . . Last page (page 8)

Notes on The Busy Beaver Function

The "Busy Beaver game" was originally defined[43] by Tibor Rado at Ohio State in 1962. That same paper shows the technique of "chaining" Turing machines to construct machines that generate f(f(...(x))) marks on the tape, where f(x) represents the calculation performed by one of the machines, and based on this they show that BB(100)>=(((7!)!)!)! ≈ 10107.46115084×1016477

In 1964 Milton Green developed a lower bound for the Busy Beaver function that was published [44] in 1964. In this paper he shows that for each additional two states, machines can produce numbers of a size comparable to the next-higher function in a hierarchy that is very similar to the Ackermann or "Hyper" function series.

Heiner Marxen and Juergen Buntrock described Green's lower bound as "non-trivial (not primitive recursive)". This lower bound can be calculated but is too complex to state as a single expression in terms of n. When n=8 the method gives

BB(8) >= 3 × (7 × 392 - 1) / 2 ≈ 8.248×1044

Another similar result, quoted by Dewdney in his Scientific American column (Aug 1984)1, gives a lower bound for BB(12) that is slightly larger than 4096166, where is the hyper4 function AKA "tetration".

Ray Kurzweil, in his 1999 book The Age of Spiritual Machines, discusses the Busy Beaver function in a footnote. I believe he was quoting the results of Green or Marxen and Buntrock, because his description is suggestive of the Ackermann function and because of the near-agreement of the values for BB(8). Kurzweil states that a 6-state Turing machine "can add" and BB(6) is 35; that a 7-state machine can multiply and BB(7) is 22961, and that an 8-state machine can exponentiate and BB(8) is "about 1043". Then he goes on to say that BB(10) requires a notation in which the height of a stack of exponents is given by another stack of exponents whose height is given by... and so on; this might be equivalent to the "hyper5" function a↑↑↑b. (Presumably BB(9) is something in between like hyper4, and in general the successive values of BB(N) calculate values comparable to the generalized hyper operator or order N-5.) He says that BB(12) requires an "even more exotic notation", and that the limits of human intelligence are reached well before BB(100).

It has since been shown that at least one possible BB(6) candidate performs iterated exponentiation, the equivalent of hyper4.

The direct approach to finding actual values of the busy beaver function is to simulate Turing machines with a computer program, trying each possible set of rules to see what it does. The first impediment to this is that the number of machines to test grows so quickly. Lin and Rado give the forumla (2N-1)4N2N-2, which incorporates some of the rules listed below:

N (2N-1) 4N2N-2
1: 1
2: 192
3: 103680
4: 117440512
5: 230400000000
6: 697437190619136
7: 3018837446159761408
8: 17708874310761169551360

To actually implement a complete test of all Turing machines of a given state, you have to deal with many different types of cases. Many are easy to deal with and many are very difficult.

(If your browser supports Java you can try the Turing machine simulators here or here or here. If you have Linux, you can use my own simulator written in perl)

Since the tape is symmetrical, left and right can be interchanged. These two programs differ only in the directions of the arrows, so one can be ignored:

1,_:    2,1,>        1,1:    2,1,<
2,_:    1,1,<        2,1:    H,1,>

1,_:    2,1,<        1,1:    2,1,>
2,_:    1,1,>        2,1:    H,1,<

the latter program is skipped by setting as a rule that the "1,_" state always moves to the right.

If the "1,_" state writes a "_" and moves to state N, the machine will find itself in state N with a completely empty tape. Such a machine is identical to a machine in which states 1 and N are switched, except that it takes one extra step. Thus, for the purpose of the Busy Beaver problem, the first machine can be ignored. This is accomplished by setting a rule that the "1,_" state should always write a 1.

If the "1,_" state moves to state N, and if N is greater than 2, states N and 2 can be swapped and the resulting machine is equivalent, and as a result one of these two machines can be ignored. This is accomplished by setting a rule that the "1,_" state always proceeds to state 1 or 2.

If the "1,_" state goes to state 1, then the machine will stay in state 1 forever and write 1's forever, and therefore will never halt. Thus, such machines can be skipped. Combined with the previous rules, this completely defines the actions for the "1,_" state: It should be "1,_: 2,1,>".

The aforegoing rules establish that the "2,_" action will be the second to execute. Therefore it should not halt: if it did the machine would never use the "1,1" action, which could be used by having the "2,_" action perform "1,1,<".

Furthermore, if the "2,_" action goes to state 1 or state 2, it should not move to the right because if it did it would create an infinite-loop machine (such a machine would continually move right, either staying in state 2 or alternating between states 1 and 2.).

No state should have both of its actions go to itself (i.e. not change state) because that would result in an infinite loop as soon as that state is entered. An example of this is if the "3,_" action and the "3,1" action both went to state 3.

Except for state 1 (which is by definition the starting state) and state 2 (which, if using the aforegoing rules, is the state entered from the "1,_" state), all the other states can be renumbered in arbitrary order, and the resulting machines all have the same behavior. This is important for machines with 4 or more states and allows lots of programs to be eliminated quickly.

If the program doesn't have any states that contain a HALT action, we know it will never halt and it can be skipped.

Programs containing two or more HALT actions can be skipped because only one of the HALTs will ever be used (whichever one it gets to first) and the other HALTs would be superfluous.

If the HALT writes a 0, the same machine with a HALT that writes a 1 will be at least as good or better; thus the 0-writing version can be ignored.

A machine that moves left when HALTing is the same as a machine that moves right when HALTing, thus only one of these two possibilities needs to be looked at. This is accomplished by setting a rule that the HALT always moves to the right.

Some programs contain unused states. For example:

1,_:    1,1,>        1,1:    1,1,<
2,_:    2,1,>        2,1:    H,1,>

State 1 always goes to state 1, so state 2 is never reached. Programs like this can be skipped because there is always a way to modify the program so that it uses the extra state and produces at least as much output.

A similar example is:

1,_:    2,1,>        1,1:    2,1,<
2,_:    1,1,<        2,1:    1,1,<
3,_:    4,1,>        3,1:    4,1,<
4,_:    H,1,>        4,1:    3,1,<

There is no way to get past states 1 and 2. Such machines are eliminated by looking for loops in the state graph.

Some programs halt very quickly, for example:

1,_:    2,1,>        1,1:    2,1,<
2,_:    3,1,>        2,1:    3,1,<
3,_:    4,1,>        3,1:    4,1,<
4,_:    H,1,>        4,1:    H,1,>

This program halts quickly because each state goes to a higher-numbered state, and the highest state (state 4) always halts. Based on that alone you know it will run for only 4 steps. (However, it doesn't hurt much to run such machines because they don't take long to run)

Based on all these examples, it seems as though it ought to be easy to search for the busy beaver using brute force. The trouble is in the programs that have to be run to determine what happens. Some exhibit very complex behavior. Only the simplest behaviors can be detected automatically.

If a program is simulated, and the simulation shows that the program halts before it has used all of its rules, then the unused rules are irrelevant and we know right away that all other programs that differ only in these unused rules will have the same behavior. This allows those other programs to be skipped.

Any programs that don't halt real quickly have to be analyzed in some way, usually by human inspection.

The simplest behaviors are like this trivial example:

1,_:    2,1,>        1,1:    2,1,<
2,_:    1,1,>        2,1:    H,1,>

which will keep writing 1's and moving to the right.

Many programs have similar behavior, writing some continuous pattern and moving steadily and repeatedly either to the left or to the right. Here's a slightly less trivial example — it moves forever to the left in a two steps forward, one step back manner, leaving a trail of alternating 1's and 0's:

1,_:    2,1,>        1,1:    3,1,<
2,_:    3,1,<        2,1:    H,1,>
3,_:    1,_,<        3,1:    3,_,<

Automatically identifying such programs can be difficult, because sometimes the pattern it generates is complex and it takes a large number of steps to repeat each cycle.

Some programs write a fixed number of 1's but never halt, for example:

1,_:    2,1,>        1,1:    H,1,>
2,_:    2,_,>        2,1:    1,1,>

which writes a single 1 and then goes off forever to the right.

Some programs grow forever but don't move steadily in one direction or the other. For example:

1,_:    2,1,>        1,1:    1,1,<
2,_:    3,1,>        2,1:    2,1,>
3,_:    1,_,<        3,1:    H,1,>

It produces an ever-expanding row of 1's in both directions, by moving back and forth. Each time it gets to an end it writes a new 1 and then reverses direction. State 3 is never reached while on a 1, so it never halts. Many programs behave in a similar manner but produce more complicated patterns or move in more complicated ways.

Some programs grow forever without any repeating pattern. The simplest example is this machine:

1,_:    2,1,>        1,1:    1,1,<
2,_:    1,_,<        2,1:    2,_,>
3,_:    1,_,<        3,1:    H,1,>

which is a binary counter. It keeps moving back and forth producing each binary number in sequence. The string of 1's and 0's is always changing, has a pattern that never repeats and grows forever (although, of course, the growth is very slow).

The next example is a little more complicated. It writes a string of 1's to the right, and simultaneously updates a binary number on the left; the binary number tells how many 1's have been written to the right:

1,_:    2,1,>        1,1:    1,1,<
2,_:    3,_,>        2,1:    2,_,>
3,_:    4,1,>        3,1:    3,1,>
4,_:    5,_,<        4,1:    H,1,>
5,_:    1,_,<        5,1:    5,1,<

Here is another machine (given by Marxen and Buntrock) that works like a binary counter, but produces a more complex pattern. Whenever all the 1's are to the left of the head (which occurs often) the binary value is recognizable. The digits are coded in groups of 3 cells: what would be a "1" digit appears as "111" and what would be a "0" appears as "_11":

1,_:    2,1,>        1,1:    1,1,<
2,_:    1,_,<        2,1:    3,_,>
3,_:    3,_,<        3,1:    4,1,>
4,_:    5,1,>        4,1:    1,_,<
5,_:    2,_,>        5,1:    H,1,>

Many other programs move around erratically, writing seemingly random patterns of 1's and 0's and transitioning between states in a seemingly random order, not moving steadily in any direction. Here's an example, the 5-state machine found by Schult in 1983:

1,_:    2,1,>        1,1:    3,_,<
2,_:    3,1,>        2,1:    4,1,>
3,_:    1,1,<        3,1:    2,_,>
4,_:    5,_,>        4,1:    H,1,>
5,_:    3,1,<        5,1:    1,1,>

This machine was a record-holder for a while in the 5-state "contest". In order to halt it has to be in state 4 with a 1 on the tape. The only way to be in state 4 is coming from state 2 with a 1 on the tape. The only way to be in state 2 is coming from state 1 with a 0 on the tape, or state 3 with a 1 on the tape. All other states lead to states 1 and 3, which is where it spends most of its time. When you run it, it appears to fill the tape with ever longer patterns of "111_111_111_...", then backtrack over those patterns and convert them into "1_1_1_1_1_...", but every time it completes a cycle there is a little difference, like an extra 1 at the left or right end. It runs for 134,467 steps before finally producing just the right pattern of 1's and 0's to lead itself into its HALT state.

There are many programs like this that run for a long time and eventually halt, and many others that have been simulated for a long time without following any type of pattern that anyone has been able to figure out. Such machines resemble this one:

1,_:    2,1,>        1,1:    2,1,<
2,_:    3,1,<        2,1:    5,_,>
3,_:    4,_,<        3,1:    1,_,>
4,_:    1,1,>        4,1:    4,_,<
5,_:    H,1,>        5,1:    3,_,>

Its behavior was sufficiently complex as to elude Marxen and Buntrock, but not their program. Designed to analyze Turing machines and prove that they do or do not halt, the program did manage to create a proof that this machine does not halt. The trouble is, the proof is so complex that it is difficult to check by hand.

Some programs run so long before halting that direct "one-step-at-a-time" simulation isn't fast enough to show how long they run. Many machines can be run to completion with first-order acceleration techniques (skipping through any repeated sequence of states and/or tape cells). However, many others require a combination of different types of acceleration. The current record-holder for 6-state machines, which takes 8,690,333,381,690,951 (over 8×1015) steps to halt, is a good example. And, when all types of automated acceleration fall short, "maunal" (human) analysis can sometimes be used to prove that a machine halts and calculate the relevant numbers. (An example of this follows the table).

The list of Busy Beavers and BB Candidate Record-Setters has been moved to its own article, History of Busy Beaver Turing Machine Records.

Analysis of the 1997 Marxen-Buntrock BB6 Record-Setter

This section describes the machine that set the record BB6 >= 95,524,079. It is a 6-state candidate discovered by Marxen and Buntrock in 1997. (The Oct 2000 record-holder is discussed in the next section)

The machine's operation lends itself easily to analysis, which reveals a beautiful combination of a few simple processes that interact in just the right way. The machine implements an iterated function that grows at an exponential rate and uses a test condition based on a low-probability, deterministic, and statistically random condition to determine when to stop growing. As a result, it grows to a very large size, so large that it takes a really sophisticated Turing machine simulator to show that it halts by brute-force iteration.

Simulation of the first few steps shows that it creates a row of 1's and repeatedly expands it to a row of 1's about 2.5 times longer. The exit condition depends on the number of 1's modulo 4 (the remainder of the number when divided by 4). But the number of 1's increases in a way that also depends in its value modulo 4, with the result that the value modulo 4 changes in a "chaotic" manner. This means that the row of 1's can multiply many times before the machine halts. Furthermore, the process of performing the mod-4 test is combined with the process of performing the division by 2 (which is then followed by a multiplication by 5). This combination of functions — accomplishing two things simultaneously with the same code — is part of what makes this much complexity possible in a Turing machine with only 6 states.

To analyze, we started by finding an example of a simple state that the machine enters over and over again. I call such a state "canonical" because it it has a simple pattern and recurs multiple times during the machine's history. As already implied, the canonical state in this case is a single continuous set of N 1's. In addition we specify that the machine is in state 1 and the head is positioned over any of the 1's. It doesn't matter which of the 1's the head is on, because in all such states the machine just moves to the right, staying in state 1 until it gets to the end of the row of 1's.

This condition occurs at steps 4, 37, 259, and several more times as listed below. After it reaches the right-hand end of the row of 1's, it adds two more 1's to the right and reverses direction, then plows through the rest of the 1's changing them into a "1_1_" pattern (this is easy to see in the machine's state table: it cycles through states 3,4,5,6, each time moving to the left and writing alternating 1 and _). When the head reaches the left end, what happens next depends on the value of N modulo 4. Here is what the tape will end up looking like for each of the 4 possibilities:

N = 0 mod 4 (4, 8, 12, etc.): 1(10)M11
N = 1 mod 4 (5, 9, 13, etc.): Halts with 1(10)M11 on the tape
N = 2 mod 4 (6, 10, 14, etc.): 111(10)M11
N = 3 mod 4 (7, 11, 15, etc.): 111(10)M11

(Where M is equal to N/2, rounded down.) As you can see, only values of N that are 1 more than a multiple of 4 will result in a halt.

In the remaining cases the machine begins a process that takes about 5 N2 / 4 steps, during which it repeatedly shuttles back and forth turning each copy of '1_' into '11' and then appending a '111' to the left, thereby adding a total of 3M 1's to the left. When this is done the tape is once again a single solid row of 1's, with the machine in state 1. The new value of N for this new, longer string of 1's is different for each of the three cases:

N = 0 mod 4: 3 M + 1 + 2 M + 2 = 5 N / 2 + 3 = (1 or 3) mod 4
N = 1 mod 4: (machine will have halted)
N = 2 mod 4: 3 M + 3 + 2 M + 2 = 5 N / 2 + 5 = (0 or 2) mod 4
N = 3 mod 4: 3 M + 3 + 2 M + 2 = 5 (N-1) / 2 + 5 = (0 or 2) mod 4

Notice that the formula for the new N involves a division by 2, and adds either 3 or 5. Because of the division by 2, the new N's mod-4 value depends on the old N's value modulo 8:

N = 0 mod 8: new N = 3 mod 4
N = 1 mod 8: (machine will have halted)
N = 2 mod 8: new N = 2 mod 4
N = 3 mod 8: new N = 2 mod 4
N = 4 mod 8: new N = 1 mod 4
N = 5 mod 8: (machine will have halted)
N = 6 mod 8: new N = 0 mod 4
N = 7 mod 8: new N = 0 mod 4

As you can see, out of the 6 possible cases, only one of them leads to an N=1 mod 4 value (the 'halting' value) for the new N. This is an example of a generalized Collatz mapping2. (The parameters are: T(x) = floor(mi.x / d) + Xi, where d=4, i such that x = i mod d, mi = {10,0,10,10}, Xi = {3,1,5,3}, and initial value x=3).

A "Collatz mapping" is an iterated function on integers, where at each iteration you perform a modulo test (like the remainder after dividing by 2) and then select from one of a number of linear functions (such as X/2 or 3X+1) based on what the remainder was. The simplest and best-known Collatz mapping is the "3N+1 problem", which states that for any X, the next value of X in the iteration is:

X/2 if X is even
3X + 1 if X is odd

Such iterations are notable for the fact that the behavior of the iteration is very difficult to characterize. Does any value for X cause the iteration to grow forever, or do they all always end up in the cycle 4, 2, 1, 4, 2, 1, ...? Some values of X grow pretty high.

Our Turing machine differs from the standard Collatz mapping in having a "halt" condition as one possible outcome. It also differs from the 3N+1 problem in that all of the linear functions grow, thus there is no question of it ever getting smaller — only how long will it grow before it stops?

Here are all the N values, and for the first 12 cases the step number (found by simulation) at which a row of N 1's is first seen. (The precise number of steps for each N can be calculated using a little more analysis, but it's not important for answering the halting question). Also given are the mod-4 value and the value of M. Notice in particular the fairly erratic pattern of mod-4 values.

at step N i such that
N = i mod 4
4 3 3 1
37 10 2 5
259 30 2 15
1661 80 0 40
10,224 203 3 101
63,057 510 2 255
392,779 1280 0 640
2,449,742 3203 3 1601
15,294,575 8010 2 4005
95,566,797 20,030 2 10,015
597,248,199 50,080 0 25,040
3,732,606,762 125,203 3 62,601
313,010 2 156,505
782,530 2 391,265
1,956,330 2 978,165
4,890,830 2 2,445,415
12,227,080 0 6,113,540
30,567,703 3 15,283,851
76,419,260 0 38,209,630
191,048,153 1 95,524,076

It might seem a little disappointing to some readers to note that this machine, a record-setter for the number of 1's produced, actually produces as many as twice that record, only to cut the number almost in half just before halting.

First page . . . Back to page 3 . . . Forward to page 5 . . . Last page (page 8)

Robert Munafo's home pages on AWS    © 1996-2024 Robert P. Munafo.    about    contact
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Details here.