Hyper4 Iterated Exponential Function
Contents
The "Lower" Hyper4 Function a④b
An Exact Solution of the Inverses of a④b
The Higher Hyper4 Function ("tetration") a↑↑b = a④b
My Extension to Real-Valued Arguments
The main topic of this page is a function, hyper4, that takes two arguments and performs a repeated exponentiation operation. hyper4 is to ordinary exponentiation as exponentiation is to multiplication. Becase it is a function of two arguments, I also sometimes call it an operator.
For completeness I will begin by describing each of the more familiar operators, starting with the most basic.
Addition a+b = a①b
Addition can be thought of as an iterated act of adding one:
a+b = a+1+1+1+1...
where there are b copies of 1 on the right-hand side of the = sign.
Here is a table of the addition function for arguments from 0 to 6. It is the first part of a larger table that most grammar-school students learn:
|
By antidiagonals, this table is sequence A3056.
The rows and columns are all A0027 (omitting one or more initial terms).
The main diagonal is A5843.
The superdiagonal and subdiagonal are both A5408.
Multiplication a×b = a②b
Multiplication is iterated addition. These two definitions are equivalent:
a×b = ((a+a)+a)+...+a
or:
a×b = a+...+(a+(a+a))
where in each case there are b copies of a on the right-hand side of the = sign.
Here is a table of the multiplication function for arguments from 0 to 6. It is the first part of a larger table that most grammar-school students learn:
|
By antidiagonals, this table is sequence A4247.
The rows are A0004, A0027, A5843, A8585, A8586, A8587, A8588, etc.
Each column is the same as the corresponding row.
The main diagonal is A0290.
The superdiagonal and subdiagonal are both A2378.
Exponentiation ab = a↑b = a②b
Exponentiation is iterated multiplication. It is normally written ab but on this page I will use the ↑ symbol, like this: a↑b. These two definitions are equivalent:
a↑b = ((a×a)×a)×...×a
or:
a↑b = a×...×(a×(a×a))
where in each case there are b copies of a on the right-hand side of the = sign.
Here is a table1 of the exponentiation function for arguments from 0 to 6:
|
By antidiagonals, this table is sequence A3992.
The rows are A0007, A0012, A0079, A0244, A0302, A0351, A0400, etc.
The columns are A0012, A0027, A0290, A0578, A0583, A0584, A1014, etc.
The main diagonal is A0312.
The superdiagonal is A7778 and the subdiagonal is A0169.
Lower Hyper4 Function a④b
There are two ways to extend the series to a fourth operator beyond exponentiation, depending on how you group the terms2. Because of this I need to use different symbols to avoid confusion, and for one of the definitions there are two alternative symbols. To explain my "number in a circle" symbols, I should point out that in discussing these operators in general, it is useful to have an unlimited set of operator symbols. For this reason I sometimes use these alternate symbols for the three operators that have already been discussed:
a+b = a①b
a×b = a②b
a↑b = a③b
The defintions of the two types of hyper4 are:
a④b = ((a↑a)↑a)↑...↑a
and:
a↑↑b = a④b = a↑...↑(a↑(a↑a))
where in each case there are b copies of a on the right-hand side.
Both versions produce the same answer for a↑↑1 and a↑↑2 (a and a↑a respectively) but they differ beyond that. The first one produces smaller values so I call it the "lower hyper4 operator"; the other is the "higher" hyper4 operator. When necessary to distinguish between the two I use a④b and a④b respectively. For the latter function, the notation a↑↑b is more common in other literature.
Here is a table of the "lower" hyper4 function for arguments from 0 to 6:
|
By antidiagonals, this table is sequence A171881 (more terms here).
The first row is not in OEIS; second is A0012. Ignoring the 0 column, the next four rows are A1146, A55777, A137840, and A137841. The following rows are not in OEIS.
The first column is not an integer sequence. The next three columns are A0027, A0312, and A2489; the following columns are not in OEIS.
The main diagonal (without the initial term) is A89210.
The superdiagonal is A2488, and the subdiagonal is not in OEIS.
a④b can be generalized easily to real and even complex arguments through the identity a④b=a↑(a↑(b-1)). So for example, π④e is 3581.875516... and e④π is 4979.003621...
An Exact Solution of the Inverses of a④b
One might wonder how to compute the a④b function "backwards", computing the equivalent of a hyper4 "logarithm" or "root". More precisely, given the equation:
z = x④y = xx(y-1)
derive a solution for y in terms of x and z (the "hyper4logatithm") and a solution for x in terms of y and z (the "hyper4root").
The first one (getting y in terms of x and z) is pretty straightforward. Starting with z = x(x(y-1)), we merely need to take a (normal) logarithm to base x twice, then add 1:
z = xx(y-1)
logx(z) = x(y-1)
logx(logx(z)) = y-1
y = 1 + logx(logx(z))
That was pretty easy. How about the hyper4root?
Solving for x is a little trickier, because we have an x to the power of x (a "self-exponential" or "coupled exponential" function) which has no closed-form root in the standard elementary functions. However there is always a root to y=xx, even if x is negative or complex (in which case the root is usually complex). You just have to use the Lambert W function, which is defined to be the inverse of a=b eb:
a=b eb if and only if b = W(a) (see footnote4)
To solve z = x④y for x in terms of y and z, begin by defining k=y-1 (which will be easy to undo later). This gives us
z = xxk
Next, define l to be ln(x) (the latural logarithm of x), or equivalently x=el. Substituting this in for the x's, we get
z = (el)(el)k = e(l ek l)
We can take the latural log of both sides:
ln(z) = l ek l
than multiply by k, giving us
k ln(z) = k l ek l
This is now recognizably similar to the definition of the Lambert W function above, with a=k ln(z) and b=k l. So we can go ahead and use the W function to express the relation as:
k l = W(k ln(z))
Now replacing k and l with their definitions in terms of y and x respectively, we have
(y-1)ln(x) = W((y-1)ln(z))
which readily leads to
x = e(W((y-1)ln(z)) / (y-1))
Since the Lambert W function is defined for (almost) any complex argument, as are the logarithm ln(a) and exponential ea, the lower hyper4 function can be used just as freely as each of those other functions.
The Higher Hyper4 Function ("tetration") a↑↑b = a④b
As described above, the higher hyper4 operator defines a↑↑b as:
a↑↑b = a④b = a↑...↑(a↑(a↑a))
where there are b copies of a on the right-hand side.
Here is a table of values:
|
In the right column are three values beginning with "3 PT". This is an abbreviation for "three powers of ten". For example, "3 PT 8.0723×10153" is 1010108.0723×10153.
Reading across each row, it is easy to see that, once b gets large enough, each time you increase b by one, the value of a↑↑b increases to 10a↑↑b. But of course, it should be aa↑↑b, not 10a↑↑b. In fact the values are different, but they are so close that the first few digits of the number at the top of the "power tower" are the same. This is due to the power tower paradox described below.
By antidiagonals, this table is sequence A171882 (more terms here).
First four rows are A0035, A0012, A14221, and A14222; following rows are not in OEIS.
First four columns are A0012, A0027, A0312, and A2488; following columns are not in OEIS.
The higher hyper4 operator cannot be extended in any obvious and self-consistent way to the reals (to compute, for example, π④e). This issue is discussed at length below, with possible solutions.
My Extension of the hyper4 Function to Real-Valued Arguments
The hyperexponential (also called "tetration" or "power tower") x④y is defined for integer y>0 by:
x④1 = x
x④y = x↑(x④(y-1)) for y>1
The symbol ④ is explained above; see also this description) of the hyperN family of operators.
Several times, people have asked me: how can one define x④y for real y? For example, what is π④e)?
There are three other functions that have been extended to the reals in ways that seem promising: the factorials by the Gamma function, hyperfactorials by the K-function, and the lower (Simon and Plouffe 1995) superfactorial by Barnes' G-function. All of these are defined in terms of an integral or an infinite product and they all have a "recurrence relation" like Gamma(x+1) = x Gamma(x).
Defining x④y in terms of an infinite product, or even an infinite product of infinite products of integrals, seems like it will not work since it needs to behave more like an infinite exponential "tower". Using this method for the hyper4 function would probably require using an integral that acts like the xy function in one asymptote and like a lower function in the other.
Daniel Geisler has collected a lot of information related to this issue at his website, tetration.org. The best solution I have found so far was developed by I. N. Galidakis3. It uses the concept of a power tower with gradually growing exponents for example, it definines x④0.8=x0.8, x④0.9=x0.9, x④1.0=x1.0=xx0.0, x④1.1=xx0.1, x④1.2=xx0.2, etc. While this function is continuous, its 1st, 2nd, etc. derivatives are not, except for the 1st derivative when x=e.
Here is my attempt at the problem:
Begin by imagining an infinite power tower of the form:
x④y = x↑(x↑(a↑(b↑(c↑...↑(1↑(1↑(1)))...))))
where there are a finite number of x's on the bottom, and an infinite number of 1's on the top. The 1's can be completely ignored, and all you need to look at are the x's and the special terms a, b, c, ... in the middle. These terms have values between 1 and x and are called "transition terms". (Galidakis' method fits this model but only has one transition term.)
Now imagine that the x's at the bottom are actually just a tiny bit smaller than x, and that the 1's at the top are actually a tiny bit larger than 1. In fact, every term of the power tower is between 1 and x. The terms shown as x are just very very close to x.
We want something that is well-behaved, continuous and smooth. That means, essentially, x>a>b>c>1 and as y increases, the values of a, b, c increase gradually until eventually a=x. To achieve this result I decided to use the error function erf(z) = (2/√π)integral(0...z)[e-(z2)dt]
The terms in the power tower need to vary from x to 1 in a smooth manner with two asymptotes and an exponential bias. This is accomplished by the function:
g(x,z) = x((1+erf(2z-1))/2)
This function increases monotonically from 1 for z<<0 to x for z>>1, and most of the transition from 1 to x occurs in the range 0<z<1. Here are examples for x=2 and x=3:
|
All terms end up being between 1 and x, and there are none that are precisely equal to 1 or to x. However, they get arbitrarily close.
Then I define a finite approximation power-tower:
T(x,y,n) = g(x,y) ↑ (g(x,y-1) ↑ (g(x,y-2) ↑ (... ↑ (g(x,y-n+1)))))
for example: T(x,y,3) = g(x,y)↑(g(x,y-1)↑g(x,y-2))
Then finally, the first approximation to a real-valued hyper4 or "tetration" operator is:
x④y ≈ H(x,y) = limit(n->infinity) T(x,y,n)
Here are some examples:
|
Notice that when y is an integer, there are two terms (1.894 and 1.056 in this example) that are almost equivalent to an x. (But not exact: 1.8941.056 is a little less than 2; more on this later.)
Also notice that when y is a half-integer, one of the terms is the square root of x and two others (1.997 and 1.002 in this example) just about "cancel out".
Finally, notice that the example for 2④3 is the same as the one for 2④2, except that all the values have moved up one step in the power tower.
Computer implemenation of this function yields the following values:
|
Immediately you should notice that H(3,1), H(2,3) etc. do not have the expected integer values. This results from the fact that the terms given by the g() function do not cancel out in the way that they need to, and for sufficiently low values of y, the first term is not x. To compensate for this, two techniques are used.
The first technique relies on the expected property that
x④(y-1) = logx(x④y)
The technique is to calculate x④y2 for y2=y, or y2=y+1, or y2=y+2, using whatever value of y2 is high enough to ensure that the power tower has at least one term equal to x, then take the log of the result an appropriate number of times. More formally:
H2(x,y,0) = H(x,y),
H2(x,y,n) = logx(H2(x,y+1,n-1)) for n>0
and
x④y ≈ H3(x,y) = limit(n->inf) H2(x,y,n)
The second adjustment that needs to be done is to add a constant bias to y:
H4(x,y) = H3(x,y+Kx)
The bias Kx depends on x but not on y; it corrects for the fact that the transition terms do not "cancel out" perfectly. To compute Kx any value of y can be used, for simplicity set y=1 and find the value of Kx such that H4(x,y)=x.
Results
Applying these techniques yields the following:
|
* In the second column, the entries that have * instead of a number represent values larger than 102210≈1010308. This was the limit of my computer program, which handles large numbers by computing their logarithm to base 10; 10308 is the limit of IEEE double-precision floating point.
Other special values:
H4(e,0.5) = 1.654052630799
H4(e,e) = 3713.155056406
H4(e,π) = 4341201053.37
H4(π,0.5) = 1.750536646695
Although this function definition seems to work well, more work is needed before it can be considered a true extension of x↑↑y:
- The limit needs to be well-defined for arbitrarily large y. The error function converges quickly, but since it's being used for terms in a power tower, it might not converge fast enough if the power tower is tall. Due to this, even when the "bias" is perfectly adjusted for a small value of y the same bias will not produce the expected result for larger y. However, the formulas seem to behave well for the values I have tested, which all fall within the range 1 < x④y < 1010308.
- The bias is different for different values of x. For x greater than about 2.834 the bias is positive, otherwise it is negative. Beyond establishing that this crossover point is not e=2.71828... I have not investigated further. Perhaps there is a "better" g(x,z) would involve a function similar to the error function but with a slightly different shape so that a bias is not necessary. If the bias is unavoidable, a convincing argument of that fact would be needed.
- Further extension would be needed to handle y<1, x<1 or for complex x and y.
The Power-Tower Paradox
This section is largely based on an earlier explanation I wrote specifically for HyperCalc entitled Non-Intuitive Results When Working With Huge Numbers.
If you spend a while exploring the values of a↑↑b using an unlimited-range program like HyperCalc, you will probably start noticing some paradoxical results and might even start to think the calculator is giving wrong answers.
For example, I will use HyperCalc to compute 27 to the power of googolplex:
Go ahead -- just TRY to make me overflow! _ _ |_| . . ._ _ ._ _ ._ | _ | | | | | ) (-` | ( ,-| | ( ~ ~ 7 |~ ~' ~ ~ `~` ~ ~ -' Enter expressions, or type 'help' for help. C1 = googol = 10^100 R1 = googol: 1 x 10 ^ 100 C2 = googolplex = 10^googol R2 = googolplex: 10 ^ ( 1 x 10 ^ 100 ) C3 = 27^googolplex R3 = 10 ^ [ 10 ^ ( 1 x 10 ^ 100 ) ]Apparently HyperCalc thinks that
271010100 = 101010100
This is clearly wrong and it doesn't even seem to be a good approximation. What's going on?
First of all, note that HyperCalc, like any normal calculator, is trying to display all its answers in a form that involves using 10 as the base. A standard form like 1010...X is useful because it makes different types of calculations easily comparable. A user might want to know which is bigger: 100! or 444. 100! is about 9.33×10157. Is that bigger than 444? We need to either convert 100! to some power of 4, or convert 444 to some power of 10. If you have a calculator that goes high enough, it will tell you that 444 is about 1.34×10154.
Let's assume that HyperCalc cannot be trusted and calculate the correct answer ourselves. In order to compare the two, we need to express them in terms of the same base. Let's use base 10, because one of the numbers, 101010100, is already expressed in terms of base 10. So we need to determine the values of A and B such that:
271010100 = 1010A×10B
The first step is express the power of 27 as a power of 10 with a product in the exponent, using the formula xy = 10(log10(x)×y) :
271010100 = 10(log1027×1010100)
log1027 is about 1.4313637641589, so we have
271010100 = 101.4313637641589×1010100)
Now we have a base of 10 but the exponent still needs work. The next step is to express the product as a sum in the next-higher exponent; this time the formula we use is x×y = 10(log(x)+log(y)) :
101.4313637641589×1010100 = 1010(log101.4313637641589 + 10100)
log101.4313637641589 is about 0.1557500185891, and if we add this to 10100 we get
1010(0.1557500185891 + 10100)
= 10101000...000.1557500185891
= 1010(1.000...0001557500185891×10100)
where there are 94 more 0's in place of each of the "...". Now we have:
271010100 = 1010(1.000...0001557500185891×10100)
again with 94 missing 0's.
Now we've expressed the value of 27googolplex precisely enough to see HyperCalc's error and look how small the error is! The error manifests as the digits "155" in "1.000...000155". To even be able to see this "155", we need to use 104 digits of precision because that "1.000...000155" is 104 digits long. In the original calculation giving 271010100 = 101010100, HyperCalc used the default precision which is the native precision of the system, which is typically 16 digits of accuracy. Those 16 digits are taken up by the 1 and the first 15 0's so when it gets to the step of adding 0.155 to 1.0×10100, it just rounds off the answer to 1.0×10100.
R3 = 10 ^ [ 10 ^ ( 1 x 10 ^ 100 ) ]
HyperCalc can also use the arbitrary-precision bc to do its math at higher precisions, accessed by setting the special variable scale. We can use that to get a more accurate answer for 27googolplex. Here is the result:
C1 = scale=120 C1 = 27^(10^(10^100)) R1 = 10 ^ ( 1.43136376415898731186 x 10 ^ 10000000000000000000000000000000\ 000000000000000000000000000000000000000000000000000000000000000000000 )Two things happened here: Because 120 digits of precision have been specified, HyperCalc can now express googol exactly, so it displays it as a 1 followed by 100 zeros. This means in turn that we no longer get our desired values of A and B, but instead get the intermediate result 1.4313637641589×1010100 that we encountered during the derivation above.
To get the actual value of our desired A, we can take the logarithm to base 10 twice. Continuing the above session:
R1 = 10 ^ ( 1.43136376415898731186 x 10 ^ 10000000000000000000000000000000\ 000000000000000000000000000000000000000000000000000000000000000000000 ) C2 = log(%) R2 = 1.43136376415898731186 x 10 ^ 100000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000 C3 = log(%) R3 = 100000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000.1557500185891198647Here we went to 120 digits of precision to get an answer that is pretty close to the original paradoxical answer. That requires a fair amount of computational work, and if the original problem statement had been greater, it would take even more work. Resolving the difference between 2710101000000 and 1010101000000 would take over 1000000 digits of calculation, and resolving the difference between 2710101000000000000 and 1010101000000000000 would require over 1000000000000 digits. And clearly if we just add one more 10 to those towers of exponents, all hope of avoiding roundoff-error is lost. This point is explored further on my large numbers page as part of the uncomparably larger discussion.
A Lifetime Interest in hyper4
A childhood obsession with the family of dyadic operators (increment, add, multiply, etc...) began in 1973 or 1974 when I was 9 or 10 years old. At age 12 or 13 I began trying to make slide rules to compute the "lower hyper4 function" defined by the left-associative repetition of exponents: x④y = (((x↑x)↑x)↑...)↑x and did not succeed for the first few attempts. I discovered the formula x④y=x↑(x↑(y-1)) at age 14 and had a working slide rule a couple months later. In that part of my life (which I call "my classical period") I never could crack the right-associative "higher hyper4 function", and the obsession eventually faded. The Mandelbrot set eventually took its place as my primary mathematical obsession.
By my late 30's I had seen common threads between some of the other generalized functions Gamma (for the factorials), Barnes' G-function (for the lower superfactorial), and the K-function (for the hyperfactorials).
Eventually I decided to just put the assumptions down on paper and manually construct a function that would meet all of them. What I have now is not wholly satisfactory because it requires an adjustment for the base.)
See Also
The hyper4 function is also discussed extensively on my large numbers page.
A171881: 0, 1, 1, 2, 1, 1, 3, 4, 1, 1, 4, 27, 16, 1, 1, 5, 256, 19683, 256, 1, 1, 6, 3125, 4294967296, 7625597484987, 65536, 1, 1, 7, 46656, 298023223876953125, 340282366920938463463374607431768211456, 443426488243037769948249630619149892803, 4294967296, 1, 1, 8, 823543, ...
A171882: 1, 1, 0, 1, 1, 1, 1, 2, 1, 0, 1, 3, 4, 1, 1, 1, 4, 27, 16, 1, 0, 1, 5, 256, 7625597484987, 65536, 1, 1, 1, 6, 3125, 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096, (next term is 37625597484987≈1.26*103638334640024), ...
Footnotes
2 : After growing tired of the exponentiation function I moved on to hyper4, for which I invented the name "powerlog". Because an 7th-year maths textbook I had been given defined exponentiation as: ab=(..(a×a)×a...)×a, I decided to use the same type of definition for hyper4, with the result that I ended up working with the lower hyper4 function. I was in the 8th grade (middle school) when I derived the identity a④b=aa(b-1); later in the same year I created a custom slide rule for calculating this function.
s.13
Google+
mrob27