// f107_o.cpp - float_107 type for C++ // // Copyright (C) 1998-2013 Robert P. Munafo. // // This program is free software; you can redistribute it and/or // modify it under the terms of the GNU General Public License as // published by the Free Software Foundation; either version 2 of the // License, or (at your option) any later version. // // This program is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU // General Public License (immediately following this text) for more // details. // // To receive a copy of the GNU General Public License, write to the // Free Software Foundation, Inc., 59 Temple Place, Suite 330, // Boston, MA 02111-1307 USA /* F107_FMA setting is in f107_o.h This file, along with its header file f107_o.h, define a floating-point type built on IEEE double (float53) by using two doubles to represent a single quantity. The effective (implicit) mantissa precision is 107 bits, which comes from 2 times the 53 bits of the underlying type, plus the sign bit of the less-significant part. This is less than the 113 (implicit) bits supplied by the standard (IEEE 754r) binary128 quadruple-precision float type. The algorithms in this class defintion have been described in academic papers since the early 1970's. Dekker, "A Floating-Point Technique for Extending the Available Precision" (1971) gives the three-FLOP algorithm implemented by normalize_22 and called "FAST-TWO-SUM" in most of the literature starting with Shewchuk. The formula used in add_112 is given by Knuth, "Seminumerical Algorithms" (1981) in section 4.2.2. Priest (1992, "On Properties of Floating Point Arithmetics: Numerical Stability and the Cost of Accurate Computations") presents parts of these algorithms. Shewchuk (1997, "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates", Shewchuk1997.pdf) credits Dekker and Knuth and adds algorithms for summing and scaling expansions. Hida et al. (2000, "Quad-Double Arithmetic: Algorithms, Implementation, and Application", Hida2000Quad.pdf ; and 2007, "Library for Double-Double and Quad-Double Arithmetic", Hida2007Libr.pdf) give the algorithms in a more refined form, from which I created those you see here. I keep this code in a path called "shared/proj/libs/f107_o" in my home directory. In the following notes you should change the pathnames to match your own setup. To use in UNIX (makefile) and MacOS X (Xcode): - Add '-I$hd/shared/proj/libs/f107_o' to the compile line (where $hd is $ENV{"HOME"}) - Select an "Include Option" below. To use in CodeWarrior running in Classic mode: - Add 'Macintosh HD:Users::shared:proj:libs:f107_o' to the "access paths". - Add a "Prefix file" in "C/C++ Language" for each target in the project, and make sure the prefix file defines "__MWCC__" (and "USE_CONS_CLASSIC" if you intend to call pr_raw_double_107, print_f107_a or f107_to_digits) - Select an "Include Option" below. Include Options: - single-source programs: If your entire program consists of a single ".cpp" source file, include f107 with: #include "f107_o.cpp" - multiple-source programs: If your project has multiple source files: 1. Add #include "f107_o.h" to each source file that needs it 2. Add f107_o.cpp to the list of source files for compilation (in your Makefile, config.h, configure.in, aclocal.m4, the .xcodeproj file, etc.) EXAMPLES To find code that uses f107, use: "gofer 'include.*f107_o.[ch]' ch" Present clients of f107 include (in chronological order): PMZ-Classic, the Power MANDELZOOM project built in CodeWarrior under MacOS 8 and later. It goes back to the early 1990s (including my original multiple precision floating-point types f120_o and f100_o in 1994 and f150_o in 2002) but only began using f107 in 2007. cons (in .../libs/cons): This is the console emulator for MacOS 9.x and earlier. I added support for f107 output while I was adding f107 to PMZ-Classic. fplab (in .../proj/fplab), begun 20070228. This is the testbed for new higher-precision types built upon f107 (currently including f161 and f215). tmz-core (in .../proj/tmz-core), my current Mandelbrot rendering utility. It began using f107 on 20080311. m3000 (in .../proj/math), which uses f107 in the Floretions routines (added 20091120) and in an A097486 routine (added 20091221 and later moved to its own project) mflor (in .../proj/math), the interactive Floretions calculator. This was a spinoff of m3000, and began using f107 on 20091122. A097486 (in .../proj/A097486), begun 20100108. MDZ (in .../proj/MDZ), the interactive Mandelbrot viewer for X Windows systems (including Linux and Mac OS X) by James Morris. I added f107_o code on 20100222. Rily-PI (in .../proj/rily-pi), begun 20101103. zeta (in .../proj/zeta). I added some f107_o code for the Gamma function on 20120110. REVISION HISTORY 20070227 Add pr_raw_double and f107_to_digits 20070304 Begin converting to all-inlined operation. Fix a bug in sub_112 (preserved in the image "20070304 f107_o bug") Inline a ton of stuff, making it about twice as fast for Mandelbrot iteration. 20070309 A few bug-fixes and minor optimizations, mostly in the relational operators, coming out of implementation of f161. 20070311 Add F107_FMA switch, producing a dramatic improvement in speed for the multiply operations. 20070315 Add F107_APPROX switch and the non-sloppy versions of all add, sub, mul and square operations. 20070320 Add a bunch of INLINE_xxx switches to select whether or not various different routines are declared inline. 20070331 Begin the process of moving the inlined operators and functions inside the class declaration, which makes for less repetition in the syntax. Functions like normalize_22 have to be declared "static" and called with the prefix f107_o:: for it to compile properly. 20070404 Finish the work begun on 0331. Components "hi" and "lo" become "c0" and "c1". Fix trunc (I think) 20080311 Add == operators. Add *(f107_o, int), which currently causes "ambiguous overload" errors if you try to use it. 20080312 Fix a bug in f107_intpow; add f107_spf and f107_sqrt 20090305 Add a more complete error message to f107_spf() 20091113 Use f107_u8, f107_s16, etc. instead of unsigned8, signed16, etc. In the header file, don't define the macro to use FMA (fused multiply-add) if being compiled by #gcc#. 20091114 Add f107_strncpy and minor changes to avoid compiler warnings. 20091221 Add *= and /= operators. 20100108 f107_spf now handles positive exponents! 20100223 A few changes to make f107 more suitable for publication. 20100224 Add promotion constructor f161_o(const f107_o &); 20101017 Eliminate a compiler warning in pr_raw_double_107 20120110 Add f107_spfg, a string formatter that works like snprintf() with a format string like "%10.3g". Add sine and cosine routines. 20120111 Add floor() and pow(). Add exp() and some infinity and NaN handling. 20120112 The sqrt routine "f107_sqrt" is now called simply "sqrt". This overloads it with the C math library "double sqrt(double x)", which presents problems if you call it with a literal argument like "sqrt(2.0)", which will invoke the double version. However, this is an issue with all the overloaded operators too (most notably when using inexact decimal constants like "myf107 = 0.1;" or when using a constant in the denominator of division, like "myf107 = 2.0/3.0;"). Since we have this problem with overloaded = and / anyway, I've decided we may as well overload sqrt() too. Clean up declarations and parameter types (e.g. removing lvalue specifier '&' wherever it is not needed). Add assignment and constructor from string, and +, - operators with string arguments, so you can do: f107_o x = "2.3"; x = x + "0.1"; 20120113 Add sinh, currently pretty slow for many arguments (although it does give proper results) Add tanh (incomplete: does not yet converge for |argument|>1.0) and use it for exp() (for which its convergence is adequate). This saves us a few mults and a lot of divides in exp(), which previously was generating the factorial denominators on the fly. 20120114 Reorder some of the transcendental functions and add restrict_tanh(), which is a version of tanh(x) that only works for |x|<1.0, and can be used by exp() if I wish. 20120116 Several changes to restore compatibility with CodeWarrior (and in general, avoid dependence on stdio and newer ANSI C features): add F107_ISINF and F107_ISNAN macros, f107_isinf and f107_isnan functions, and f107_snprinf_int(). 20120117 Add f107_isinf_raw and f107_isnan_raw, and improved testing code for both. Add multiply operators that take a string constant as an argument; f107_spfg now handles NaNs. 20120118 f107_intpow now handles negative exponent 20120312 Add a bunch of prototypes (to avoid errors in PMZ) 20120313 A few changes to f107_is_inf, and try to init K107_infinity in a way that works on 80-bit 68881 long double. 20130818 Fix string-reversal bug in f107_snprinf_int that made multi-digit exponents print backwards, e.g. 1.23e+27 would be printed as "1.23e+72" 20131031 Fix several bugs in floor and one in intpow (thanks Adam Milazzo!) 20230410 If compiled with -mavx2 -mfma on an Intel machine, f107_o.h now infers that it can use the intrinsics _mm_set_sd, _mm_fmsub_pd, _mm_cvtsd_f64 */ /* "cons" is a library I wrote that provides printf-like functions in a Macintosh window, for running under the old pre-UNIX MacOS that did not have its own stdio facilities. Note that it is only used in this routine for a couple debugging routines and error cases. */ #ifdef USE_CONS_CLASSIC # include #else # include #endif #include "f107_o.h" #include #ifdef __GNUC__ # define F107_ISINF(x) ((int)isinf(x)) # define F107_ISNAN(x) ((int)isnan(x)) #else # define F107_ISINF(x) (f107_isinf(x)) # define F107_ISNAN(x) (f107_isnan_raw(x)) #endif /* The constant K107_p27_1 is 2^27+1, which is used to split a double into two parts for the multiply routine. */ const f107_pf K107_p27_1 = 134217729.0; f107_o K107_ln_2; f107_pf K107_infinity = 1.0e300 * 1.0e300; f107_pf K107_ninfinity = -(1.0e300 * 1.0e300); f107_pf K107_NaN = K107_infinity + K107_ninfinity; long st_taylor_base; long st_taylor_sum; int f107_need_init = 1; int f107_big_endian = 0; // init and support routines /* Format of the 64-bit IEEE floating-point data type (C builtin type "double"). Here, bytes are numbered as if accessed on a little-endian system. ----- word 1: bytes 4 through 7 ----- ---- word 0: bytes 0 through 3 ---- ----7---- ----6---- ---5---- ---4---- ---3---- ---2---- ---1---- ---0---- S EEEEEEE EEEE MMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM S = sign bit (1 for negative) E = 11-bit excess-1024 exponent M = 52-bit mantissa (53 bits if you count the implied leading '1' bit) For our purposes, the lower component of an f107_o and the two lower components of an f161_o have 54 bits of precision. Those components can be either positive or negative, so they effectively use the sign bit as a 54th bit. For example -- imagine that the value being represented is somewhere in the range (2^52..2^53-1). For such a value, the upper half of the dd_real has a least significant bit whose magnitude is 1.0. The lower half represents the fractional part, and will be a value from -0.5 to 0.5. Within this range, the sign bit selects one or the other of two ranges of size 0.5, and the exponent selects a range of size 0.25 (e.g. 0.00..0.25 or 0.25..0.5). The 52 mantissa bits select a value within this size 0.25. Thus, there are a total of 54 bits of resolution. This is why we call it "f107": the effective mantissa precision is 53 + 54 = 107. See: http://en.wikipedia.org/wiki/IEEE_754-1985#Examples */ int f107_isinf(f107_pf x) { int pf; if (x > 0.0) { pf = 1; } else if (x < 0) { pf = -1; } else { /* Zero or NAN */ return 0; } if ((x != x) && pf) { /* On 68881, infinity is not equal to itself */ return pf; } if ((x * 0.5) == x) { /* Any finite value divided by 2 (even the smallest denormal) should give a different value for the answer */ return pf; } return 0; } int f107_isinf_raw(f107_pf x) { f107_u32 hi, lo; if (f107_big_endian) { hi = F107_WORD_0(x); lo = F107_WORD_1(x); } else { hi = F107_WORD_1(x); lo = F107_WORD_0(x); } if (lo) { return 0; } if (hi == 0x7ff00000) { return 1; } else if (hi == 0xfff00000) { return -1; } return 0; } int f107_isnan_raw(f107_pf x) { f107_u32 hi, lo; if (f107_big_endian) { hi = F107_WORD_0(x); lo = F107_WORD_1(x); } else { hi = F107_WORD_1(x); lo = F107_WORD_0(x); } // printf("f107_isnan_raw: %g %08x %08x\n", x, hi, lo); if ((hi & 0x7ff00000) != 0x7ff00000) { /* Normal number */ // printf("f107_isnan_raw: normal\n"); return 0; } if (((hi & 0x000fffff) == 0) && (lo == 0)){ /* Infinity */ // printf("f107_isnan_raw: infinity\n"); return 0; } // printf("f107_isnan_raw: NaN\n"); return 1; } int f107_isnan(f107_pf x) { if (x == x) { return 0; } return 1; } void f107_fatal_error(const char * s, f107_s16 err, f107_s16 param) { #ifdef USE_CONS_CLASSIC cons_put_str((char *) s); cons_put_str(" (err="); cons_put_s16(err); cons_put_str(", param="); cons_put_s16(param); cons_put_str(")\n"); #else fprintf(stderr, "%s (err=%d, param=%d)\n", s, err, param); exit(-1); #endif } void f107_o_init(void) { f107_pf t1; int i; printf("%c", 0); // prevents erroneous -Wuninitialized message in GCC 11 /* Test sizeof our integer types */ if (sizeof(f107_pf) != 8) { f107_fatal_error("f107_o_init: sizeof(f107_pf) != 8", 1, 0); } if (sizeof(f107_u8) != 1) { f107_fatal_error("f107_o_init: sizeof(f107_u8) != 1", 1, 1); } if (sizeof(f107_s16) != 2) { f107_fatal_error("f107_o_init: sizeof(f107_s16) != 2", 1, 2); } if (sizeof(f107_u32) != 4) { f107_fatal_error("f107_o_init: sizeof(f107_u32) != 4", 1, 3); } /* Determine endian-ness */ t1 = 1.0; if (F107_WORD_0(t1) == 0x3ff00000) { f107_big_endian = 1; } else if (F107_WORD_1(t1) == 0x3ff00000) { f107_big_endian = 0; } else { f107_fatal_error("f107_o_init: cannot determine endian-ness", 2, 0); } // printf("init: %g %g %g\n", K107_infinity, K107_ninfinity, K107_NaN); K107_ln_2 = f107_sscan((char *) "0.693147180559945309417232121458176568"); /* Create values that "should be" infinite and NaN, then test our internal isinf and isnan functions */ for(i=0; i<2; i++) { if (f107_isinf(K107_infinity) <= 0) /* Should be 1 */ { f107_fatal_error( "f107_o_init: isinf test failed for initial K107_infinity", 3, i); } if (f107_isinf_raw(K107_infinity) <= 0) /* Should be 1 */ { f107_fatal_error( "f107_o_init: isinf_raw test failed for initial K107_infinity", 4, i); } if (f107_isinf(K107_ninfinity) >= 0) /* Should be -1 */ { f107_fatal_error( "f107_o_init: isinf test failed for initial K107_ninfinity", 5, i); } if (f107_isinf_raw(K107_ninfinity) >= 0) /* Should be i1 */ { f107_fatal_error( "f107_o_init: isinf_raw test failed for initial K107_ninfinity", 6, i); } if (f107_isnan(K107_NaN) == 0) /* Should be 1 */ { f107_fatal_error( "f107_o_init: isnan test failed for initial K107_NaN", 7, i); } if (f107_isnan_raw(K107_NaN) == 0) /* Should be 1 */ { f107_fatal_error( "f107_o_init: isnan_raw test failed for initial K107_NaN", 8, i); } K107_infinity = 1.0e300; K107_infinity *= K107_infinity; K107_ninfinity = - K107_infinity; K107_NaN = K107_infinity + K107_ninfinity; } st_taylor_base = 0; st_taylor_sum = 0; f107_need_init = 0; } void f107_stats(void) { #ifdef USE_CONS_CLASSIC cons_put_str("Average taylors: "); cons_put_f4_p2(((float) st_taylor_sum)/((float) st_taylor_base), 3, 3); cons_put_str("\n"); #else printf("Average taylors: %g\n", ((double) st_taylor_sum)/((double) st_taylor_base)); #endif } /* print a double in "raw" format. If sexp is true it will print the sign and exponent. If mant is true, it will print the mantissa. If highm is true, the implied high bit of the mantissa will be shown. (See description of IEEE-64 double format above.) */ void pr_raw_double_107(double d, int sexp, int hm, int mant) { f107_u8 *c; f107_s16 e; f107_s16 i; char * hexd = ((char *) "0123456789ABCDEF"); e = 0; // avoids a warning if (d == 0.0) { #ifdef USE_CONS_CLASSIC cons_put_str("+ 0 0000000000000."); #else printf("%s", "+ 0 0000000000000."); #endif return; } c = (f107_u8 *) &d; i = 1; if ((*((char *) (&i))) == 1) { // little-endian: reverse it f107_u8 t; t = c[0]; c[0] = c[7]; c[7] = t; t = c[1]; c[1] = c[6]; c[6] = t; t = c[2]; c[2] = c[5]; c[5] = t; t = c[3]; c[3] = c[4]; c[4] = t; } else { // big-endian } if (sexp) { /* Print sign */ if (*c & 0x80) { #ifdef USE_CONS_CLASSIC cons_putc('-'); #else printf("%c", '-'); #endif } else { #ifdef USE_CONS_CLASSIC cons_putc('+'); #else printf("%c", '+'); #endif } e = ((*c) & 0x7f) << 4; } c++; if (sexp) { #ifdef USE_CONS_CLASSIC cons_put_str("2^"); #else printf("%s", "2^"); #endif e = e | ((*c) >> 4); e = e - 1023; #ifdef USE_CONS_CLASSIC cons_put_s16(e); #else printf("%d", e); #endif } if (sexp && (hm | mant)) { #ifdef USE_CONS_CLASSIC cons_putc(' '); #else printf("%c", ' '); #endif } if (mant) { if (hm) { f107_u8 c1, c2; c1 = (*c) & 0xf; c2 = (c1 & 1) << 7; #ifdef USE_CONS_CLASSIC cons_putc(hexd[0x08 + (c1 >> 1)]); #else printf("%c", hexd[0x08 + (c1 >> 1)]); #endif for(i=0; i<6; i++) { c++; c1 = *c; #ifdef USE_CONS_CLASSIC cons_put_x8(c2 | (c1 >> 1)); #else printf("%02X", c2 | (c1 >> 1)); #endif c2 = (c1 & 1) << 7; } // Display the last bit as a ' or a . if (c2) { #ifdef USE_CONS_CLASSIC cons_putc('\''); #else printf("%c", '\''); #endif } else { #ifdef USE_CONS_CLASSIC cons_putc('.'); #else printf("%c", '.'); #endif } } else { #ifdef USE_CONS_CLASSIC cons_put_x4(*c); #else printf("1%c", hexd[*c]); #endif for(i=0; i<6; i++) { c++; #ifdef USE_CONS_CLASSIC cons_put_x8(*c); #else printf("%02X", *c); #endif } } } } // Internal routines void print_f107_a(f107_o a, int cr) { /* print the fields of the f107 structure */ #ifdef USE_CONS_CLASSIC cons_put_str("["); cons_put_f8_p2(a.c0,20,3); cons_put_str("|"); cons_put_f8_p2(a.c1,20,3); cons_put_str("]"); if (cr) { cons_put_str("\n"); } #else // printf("[%20.17g|%20.17g]", a.c0, a.c1); printf("["); pr_raw_double_107(a.c0,1,1,1); printf("|"); pr_raw_double_107(a.c1,1,1,1); printf("]"); if (cr) { printf("\n"); } #endif } /* The division algorithm begins with an approximation using the pf type's division operation; it then computes the error created by this and subtracts it out. Paraphrasing from Hida (2007) and adapting the description to fit our two-component implementation: Division is done by the familiar long division algorithm. Let a=(a0,a1) and b=(b0,b1) be f107 numbers. We can first compute an approximate quotient q0=a0/b0. We then compute the remainder r=a-q0xb, and compute the correction term q1=r0/b0. Full f107 precision is used for parts of this calculation since most of the bits will be canceled when doing the subtraction. q0 ________ b0 b1 ) a0 a1 */ void div_f107(f107_o a, f107_o b, f107_o &res) { f107_pf s0, s1, q0, q1; f107_o r; q0 = a.c0 / b.c0; r = b * ((f107_o) q0); s0 = f107_o::sub_112(a.c0, r.c0, s1); s1 -= r.c1; s1 += a.c1; q1 = (s0+s1)/(b.c0); r.c0 = f107_o::normalize_22(q0, q1, r.c1); res.c0 = r.c0; res.c1 = r.c1; } // creation: all are in f107_o.h // assignment: all are in f107_o.h // Conversion operators: all are in f107_o.h // building blocks for add: all are in f107_o.h // add: the more common operators are in f107_o.h f107_o operator + ( const f107_o lhs, const char * rhs ) { f107_o rval, sum; rval = f107_sscan((char *) rhs); sum = lhs + rval; return sum; } f107_o operator + ( const char * lhs, const f107_o rhs ) { f107_o lval, sum; lval = f107_sscan((char *) lhs); sum = lval + rhs; return sum; } // sub: the more common operators are in f107_o.h f107_o operator - ( const f107_o lhs, const char * rhs ) { f107_o rval, diff; rval = f107_sscan((char *) rhs); diff = lhs - rval; return diff; } f107_o operator - ( const char * lhs, const f107_o rhs ) { f107_o lval, diff; lval = f107_sscan((char *) lhs); diff = lval - rhs; return diff; } // comparison: all are in f107_o.h // scale: all are in f107_o.h // building blocks for mul: all are in f107_o.h // multiply: the more common operators are in f107_o.h // operator * // // input: f107 lhs; char * rhs // output: f107 (a * b) // // convert rhs from string, then multiply at f107 precision f107_o operator * ( const f107_o lhs, const char * rhs ) { f107_o rval, sum; rval = f107_sscan((char *) rhs); sum = lhs * rval; return sum; } // operator * // // input: char * lhs; f107 rhs // output: f107 (a * b) // // convert lhs from string, then multiply at f107 precision f107_o operator * ( const char * lhs, const f107_o rhs ) { f107_o lval, sum; lval = f107_sscan((char *) lhs); sum = lval * rhs; return sum; } // multiply by constants: all are in f107_o.h // divide f107_o operator / ( const f107_o numer, const f107_o denom ) { f107_o res; div_f107(numer, denom, res); return res; /* Here, on iMac G4, I get the compiler error: /Users/myname/shared/proj/libs/f107_o/f107_o.cpp: In function `f107_o operator/(const f107_o&, const f107_o&)': /Users/myname/shared/proj/libs/f107_o/f107_o.cpp:334: Internal compiler error in emit_move_insn, at expr.c:2783 Please submit a full bug report, with preprocessed source if appropriate. */ } // Note we want to use full precision because this operator is often used // to calculate reciprocals of precisely representable values, e.g. // computing the exact value of 1/3 f107_o operator / ( const f107_o lhs, f107_pf rhs ) { f107_o res; f107_o b; b.c0 = rhs; b.c1 = 0; div_f107(lhs, b, res); return res; } f107_o operator / ( f107_pf lhs, const f107_o rhs ) { f107_o res; f107_o a; a.c0 = lhs; a.c1 = 0; div_f107(a, rhs, res); return res; } // unary operators are inline // Modify-assign operators f107_o & f107_o :: operator += ( f107_o rhs ) { *this = *this + rhs; return *this; } f107_o & f107_o :: operator += ( f107_pf rhs ) { *this = *this + rhs; return *this; } f107_o & f107_o :: operator -= ( f107_o rhs ) { *this = *this - rhs; return *this; } f107_o & f107_o :: operator -= ( f107_pf rhs ) { *this = *this - rhs; return *this; } f107_o & f107_o :: operator *= ( f107_o rhs ) { *this = *this * rhs; return *this; } f107_o & f107_o :: operator *= ( f107_pf rhs ) { *this = *this * rhs; return *this; } f107_o & f107_o :: operator /= ( f107_o rhs ) { *this = *this / rhs; return *this; } f107_o & f107_o :: operator /= ( f107_pf rhs ) { *this = *this / rhs; return *this; } // %%% it would be nice to add <<= and >>= operators to perform // the scale_2p function. // Comparison (relational) operators -- many are inlined int operator >= ( const f107_o lhs, const f107_o rhs ) { int res; res = f107_o::cmp_f107(lhs, rhs); res = (res >= 0); return res; } int operator >= ( const f107_o lhs, f107_pf rhs ) { f107_o r; int res; r = rhs; res = (lhs >= r); return res; } int operator >= ( f107_pf lhs, const f107_o rhs ) { f107_o l; int res; l = lhs; res = (l >= rhs); return res; } int operator <= ( const f107_o lhs, const f107_o rhs ) { int res; res = f107_o::cmp_f107(lhs, rhs); res = (res <= 0); return res; } int operator <= ( const f107_o lhs, f107_pf rhs ) { f107_o r; int res; r = rhs; res = (lhs <= r); return res; } int operator <= ( f107_pf lhs, const f107_o rhs ) { f107_o l; int res; l = lhs; res = (l <= rhs); return res; } int operator != ( const f107_o lhs, const f107_o rhs ) { int res; res = f107_o::cmp_f107(lhs, rhs); res = (res != 0); return res; } int operator != ( const f107_o lhs, f107_pf rhs ) { return ((lhs.c0 != rhs) || (lhs.c1 != 0)); } int operator != ( f107_pf lhs, const f107_o rhs ) { return ((lhs != rhs.c0) || (rhs.c1 != 0)); } int operator == ( const f107_o lhs, const f107_o rhs ) { int res; res = f107_o::cmp_f107(lhs, rhs); res = (res == 0); return res; } int operator == ( const f107_o lhs, f107_pf rhs ) { return ((lhs.c0 == rhs) && (lhs.c1 == 0)); } int operator == ( f107_pf lhs, const f107_o rhs ) { return ((lhs == rhs.c0) && (rhs.c1 == 0)); } // Functions f107_o square ( const f107_o z ) { f107_o rv; rv = f107_o::f107_native_square(z); return rv; } f107_o fabs ( const f107_o z ) { f107_o rv; if (z.c0 < 0.0) { rv.c0 = -(z.c0); rv.c1 = -(z.c1); } else if (z.c0 > 0.0) { rv = z; } else if (z.c1 < 0.0) { rv.c0 = -(z.c0); rv.c1 = -(z.c1); } else { rv = z; } return rv; } f107_o trunc ( const f107_o z ) { f107_o rv; // input: 123.0+0.456 123.0-0.456 123.0+0.978 123.0-0.978 rv.c0 = trunc(z.c0); // 123.0 123.0 123.0 123.0 rv.c1 = trunc(z.c1); // 0.000 -0.000 0.000 -0.000 // output: 123.0+0.000 123.0-0.000 123.0+0.000 123.0-0.000 return rv; } f107_o floor(const f107_o x) { f107_pf t0, t1; f107_o rv; int s; // sign flag t0 = x.c0; t1 = x.c1; /* If input is negative we'll negate it and compute the ceiling function */ if (t0 < 0) { t0 = -t0; t1 = -t1; s = 1; } else { s = 0; } // sample input: 123.0+0.456 123.0-0.456 123.0+0.978 123.0-0.978 0.123-9e-4 t0 = floor(t0); // 123.0 123.0 123.0 123.0 0.000 if (t0 != 0) { t1 = floor(t1); // 0.000 -1.000 0.000 -1.000 rv.c0 = f107_o::normalize_22(t0, t1, rv.c1); // output: 123.0+0.000 122.0+0.000 123.0+0.000 122.0+0.000 } else { rv.c0 = t0; rv.c1 = 0; // 0.000+0.0 } if (s) { /* Original input was less than 0, and we flipped its sign, so we really want to be computing the ceiling function of that, then flip the sign back. */ rv = rv + 1.0; rv = - rv; } return rv; } // Continued-fraction approximation void cfa(f107_o x, f107_s16 order, f107_o *n, f107_o *d) { *n = trunc(x); *d = 1.0; if (order > 0) { f107_o n1, d1, x1, x2, x3; *n = trunc(x); x1 = x - (*n); x3 = 1.0; x2 = x3 / x1; cfa(x2, order-1, &n1, &d1); *n = (n1 * (*n)) + d1; *d = n1; } } /* Raise any value to an integer power. */ f107_o f107_intpow(f107_o x, int p) { int i; f107_o rv; f107_o pow2; int recip = 0; pow2 = x; if (p < 0) { p = -p; recip = 1; if (p < 0) { /* p was MININT */ p = p / 2; p = -p; pow2 = x * x; } } rv = 1.0; i = p; while(i > 0) { if (i & 1) { rv = rv * pow2; } i >>= 1; if (i) { /* We could do better here by a special-case for x near 1.0 */ pow2 = pow2 * pow2; } } if (recip) { return(1.0 / rv); } return(rv); } /* End of f107_intpow */ /* Square root, using the trusty old Babylonian method. This is similar to Newton's method, and uses the fact that if a is approximately the square root of x, then the average of a and x/a will be closer to the true square root. If we start with a equal to the basetype's square root of the high component of x, then in fact only two iterations are needed to get full precision. */ f107_o sqrt(f107_o x) { f107_o a, b; if (x <= 0.0) { a = 0.0; } else { a = (f107_o) sqrt(x.c0); b = x / a; a = (a + b) / 2.0; #if (!(F107_APPROX)) b = x / a; a = (a + b) / 2.0; #endif } return(a); } /* Your typical Taylor series sine-cosine routine */ f107_o f107_sincos(f107_o xq, f107_o zq, f107_o uq, double i) { double k; int n; f107_o prev; xq = xq * xq; // x^2 prev = zq; k = 2.0; for(n=30; n>0; n--) { // char out[40]; f107_spfg(out, sizeof(out), 0, 32, zq); // printf("%2d %s\n", n, out); uq = -uq * xq/(k*(k+i)); // -x^2/2 or -x^3/6 zq = zq + uq; if (zq == prev) { n = 0; // exit loop early } else { prev = zq; k += 2.0; } } return zq; } f107_o sin(f107_o x) { int neg = 0; f107_o rv; if (x < 0.0) { x = -x; neg = 1; } rv = f107_sincos(x, x, x, 1.0); if (neg) { rv = - rv; } return(rv); } f107_o cos(f107_o x) { if (x < 0.0) { x = -x; } return(f107_sincos(x, 1.0, 1.0, -1.0)); } /* This routine returns the first N significant digits of an f107_o value. It goes to fairly thorough measures to ensure that rounding is done properly. You supply a string with a little bit more than the needed amount of space, and (if desired) an integer in which to store the exponent. */ void f107_to_digits(f107_o x, char *s, int *expn, int *sign, int precision) { int dig1 = precision + 1; /* number of digits to compute */ f107_o r; // "remainder", the portion not yet turned into digits. f107_o pw; int e; /* exponent */ int i, d; int sgn; if (F107_ISINF(x.c0) || (F107_ISINF(x.c1))) { if (x.c0 < 0) { f107_strncpy(s, (char *) "-inf", precision); } else { f107_strncpy(s, (char *) "+inf", precision); } return; } if (F107_ISNAN(x.c0) || (F107_ISNAN(x.c1))) { f107_strncpy(s, (char *) "NaN", precision); return; } sgn = 0; if (x < 0.0) { x = - x; sgn = 1; } if (sign) { *sign = sgn; } r = fabs(x); if (x == 0.0) { /* x == 0.0 */ if (expn) { *expn = 0; } for (i = 0; i < precision; i++) s[i] = '0'; return; } /* First determine the (approximate) exponent. */ e = (int) (floor(log10(fabs(x.c0)))); if (e < -300) { r = r * f107_intpow(f107_o(10.0), 300); pw = f107_intpow(f107_o(10.0), (e + 300)); r = r / pw; } else if (e > 0) { pw = f107_intpow(f107_o(10.0), e); r = r / pw; } else if (e < 0) { pw = f107_intpow(f107_o(10.0), -e); r = r * pw; } /* Fix exponent if we are off by one */ if (r >= 10.0) { r = r / 10.0; e++; } else if (r < 1.0) { r = r * 10.0; e--; } if (r >= 10.0 || r < 1.0) { #ifdef USE_CONS_CLASSIC cons_put_str("f107_to_digits: exponent adjust failed.\n"); #else fprintf(stderr, "f107_to_digits: exponent adjust failed.\n"); #endif return; } /* Extract the digits */ for (i = 0; i < dig1; i++) { d = ((int) (r.c0)); r = r - ((f107_o) d); r = r * 10.0; s[i] = (char) (d + '0'); } /* Fix negative digits. */ for (i = dig1-1; i > 0; i--) { if (s[i] < '0') { s[i-1]--; s[i] += 10; } } if (s[0] <= '0') { #ifdef USE_CONS_CLASSIC cons_put_str("f107_to_digits: non-positive leading digit.\n"); #else fprintf(stderr, "f107_to_digits: non-positive leading digit.\n"); #endif return; } /* Round, handle carry */ if (s[dig1-1] >= '5') { s[dig1-2]++; i = dig1-2; while (i > 0 && s[i] > '9') { s[i] -= 10; s[--i]++; } } /* If first digit is 10, shift everything. */ if (s[0] > '9') { e++; for (i = precision; i >= 2; i--) s[i] = s[i-1]; s[0] = '1'; s[1] = '0'; } s[precision] = 0; if (expn) *expn = e; } /* End of f107_to_digits */ /* Copy a string as if by strncpy */ void f107_strncpy(char * to, char * fr, int n) { int i; for(i=0; (i0) { len--; x2 = x2 / 10; } if (len <= 0) { *s++ = 0; return; } /* Write digits in reverse order */ len = 0; /* len now counts length of digits output */ x2 = x; while(x2 > 0) { s[len] = '0' + (x2 % 10); len++; x2 = x2 / 10; } /* Trailing null */ s[len] = 0; /* Reverse the string in place */ for(i=0; i 35) { precision = 35; } else if (precision < 1) { precision = 1; } s = s1; f107_to_digits(x, s2, &exponent, &sign, precision); f107_strncpy(s3, s2, 39); s3[39] = 0; if (sign) { *s++ = '-'; } else if (sign_flag == '+') { *s++ = '+'; } if ((exponent > 0) && (exponent < precision)) { i = 0; *s++ = s2[i++]; while (exponent) { *s++ = s2[i++]; exponent--; } *s++ = '.'; for(; i 35) { precision = 35; } else if (precision < 1) { precision = 1; } f107_to_digits(x, s2, &exponent, &sign, precision); if (sign) { *s++ = '-'; length--; } else if (sign_flag == '+') { *s++ = '+'; length--; } if ((exponent > 0) && (exponent < precision)) { /* It can be formatted as a normal number without an exponent field */ /* length needs to be at least enough for lead digits and trailing null */ if (length < exponent+2) { f107_strncpy(s, (char *) "fmt-err", length); return; } i = 0; /* Emit lead digit */ *s++ = s2[i++]; length--; /* Emit the rest of the digits before the decimal point */ while (exponent) { *s++ = s2[i++]; length--; exponent--; } if (length > 1) { *s++ = '.'; length--; } for(; s2[i] && (i 1); i++) { *s++ = s2[i]; length--; } *s++ = 0; return; } else if (exponent == 0) { /* Values from 1.00000 to 9.99999 */ /* length needs to be at least enough for lead digit, decimal point and trailing null */ if (length < 3) { f107_strncpy(s, (char *) "!!!", length); return; } *s++ = s2[0]; length--; /* Lead digit */ *s++ = '.'; length--; for(i=1; s2[i] && (i 1); i++) { *s++ = s2[i]; length--; } *s++ = 0; return; } else if ((exponent < 0) && (exponent > -5)) { /* Values like 0.12345, 0.012345, etc. */ /* Length needs to be enough for leading 0's, decimal point, one significant digit, and trailing null */ if (length+exponent < 3) { f107_strncpy(s, (char *) "fmt-err", length); return; } /* Add a suitable number of zeros */ *s++ = '0'; length--; *s++ = '.'; length--; exponent++; while ((exponent < 0) && (length > 1)) { *s++ = '0'; length--; exponent++; } for(i=0; s2[i] && (i 1); i++) { *s++ = s2[i]; length--; } *s++ = 0; return; } /* General case: use scientific notation */ /* Get the exponent as a string */ if (exponent >= 0) { sexp[0] = '+'; f107_snprinf_int(sexp+1, sizeof(sexp)-1, exponent); } else { f107_snprinf_int(sexp, sizeof(sexp), exponent); } /* Here we are a little forgiving about the length: We'll deduct the space we need for the signed exponent and 'e', then emit as many digits as we can. Note that s2[] already has the requested number of digits. So we just need to make sure the length can accomodate the lead digit, decimal point and exponent. */ if (length - (1 + 1 + 1 + f107_strlen(sexp)) < 1) { f107_strncpy(s, (char *) "fmt-err", length); return; } /* Reserve space for 'e' and exponent and null */ length = length - (1 + f107_strlen(sexp) + 1); *s++ = s2[0]; length--; /* Lead digit */ *s++ = '.'; length--; for(i=1; (i 0); i++) { *s++ = s2[i]; length--; } *s++ = 'e'; length--; /* We now have just enough room for the exponent */ for(i=0; sexp[i]; i++) { *s++ = sexp[i]; } *s++ = 0; } // End of f107_spfg /* f107_sscan extracts an f107 value from a string in much the same way as sscanf with the "%f" format. Commas are ignored; signs are optional for the mantissa and exponent, and the decimal point and exponent are optional. For the opposite conversion, use f107_spf or f107_spfg. %%% NOTE: We do not recognize "NaN", "Inf", "+inf", etc. but we should. */ f107_o f107_sscan(char * s) { float f_digit; f107_o n_digit; int sign, exponent, e_sign; int in_mant, in_fraction; char c; f107_o mantissa; f107_o frac_pow10, k10; f107_o rv; int digits; digits = 0; k10 = 10; frac_pow10 = 1.0; mantissa = 0.0; sign = 0; e_sign = 0; in_fraction = 0; in_mant = 1; exponent = 0; while(*s) { c = *s; if (c == ' ') { // Just ignore spaces (I have seen these used as digit separators // for long constants like Pi) } else if (c == ',') { // Commas are used as digit separators in Europe (see for example // the coordinates at http://www.colinlmiller.com/fractals/gallery.htm) } else if (c == '-') { if (in_mant) { sign = 1; } else { e_sign = 1; } } else if (c == '+') { // no action needed } else if ((c >= '0') && (c <= '9')) { if (in_mant) { f_digit = ((float) (c - '0')); n_digit = f_digit; mantissa = mantissa * k10 + n_digit; if (in_fraction) { frac_pow10 = frac_pow10 * k10; } } else { exponent = (exponent * 10) + ((int) ((c - '0'))); } digits++; } else if (c == '.') { in_fraction = 1; } else if ((c == 'e') || (c == 'E')) { in_mant = 0; } s++; } if (e_sign) { rv = (mantissa / frac_pow10) / f107_intpow(k10, exponent); } else { rv = (mantissa / frac_pow10) * f107_intpow(k10, exponent); } if (sign) { rv = - rv; } return(rv); } // End of f107_sscan f107_pf f107_pow2_int(int x) { f107_pf rv, k2; k2 = 2.0; if (x < 0) { x = -x; k2 = 0.5; } rv = 1.0; while(x) { if (x & 1) { rv *= k2; } x = x >> 1; k2 = k2 * k2; } return rv; } /* Coefficients for exponential power series (Taylor) approximation to tanh(x). See oeis.org/A002430, oeis.org/A036279. If you want the denominators to be factorials, use A000182 for numerators. To get these numbers in maxima: fpprec:40; bfloat(taylor(tanh(x),x,0,60)); */ #define F107_TANH_NCOEF 30 f107_o tanh_coef[F107_TANH_NCOEF] = { "1.0", // 1 "-0.3333333333333333333333333333333333333333", // -1/3 "0.1333333333333333333333333333333333333333", // 2/15 "-0.05396825396825396825396825396825396825397", // -17/315 "0.02186948853615520282186948853615520282187", // 62/2835 "-0.008863235529902196568863235529902196568863", // -1382/155925 "0.003592128036572481016925461369905814350258", // 21844/6081075 "-0.001455834387051318268249485180702111919043", // -929569/638512875 "0.0005900274409455859813780759937000210887543", // 6404582/10854718875 "-0.0002391291142435524814857314588851128093205", // -443861162/1856156927625 "9.691537956929450325595875000388809377859e-5", "-3.92783238833168340533708080931233350952e-5", "1.591890506932896474074427981657004305531e-5", "-6.451689215655430763190842315302558044632e-6", "2.61477115129075455426359425640974109737e-6", "-1.059726832010465435091355394124777619004e-6", "4.294911078273805854820351280397272242615e-7", "-1.740661896357164777986229231330272921937e-7", "7.054636946400968325214518544480556569837e-8", "-2.859136662305253908331094534141646943575e-8", "1.15876444327988522001979320290917903963e-8", "-4.696295398230901628819043472467672382366e-9", "1.903336833931275921319706595462384034737e-9", "-7.71393363535906229047893307216272854899e-10", "3.12633954589208704577812001344089995544e-10", "-1.267057693030540106009628808122343862488e-10", "5.135191408039367740076049513054295176989e-11", "-2.081214686770047419897227195060971086049e-11", "8.434845419094338293849148188558706453424e-12", "-3.418514086811155814038680292485683222778e-12" }; f107_o restrict_tanh(f107_o x) { f107_o taylor; int neg; if (f107_need_init) { f107_o_init(); } /* Handle infinity and NaN first */ if (F107_ISINF(x.c0) || (F107_ISINF(x.c1))) { taylor.c0 = taylor.c1 = FP_INFINITE; return taylor; } if (F107_ISNAN(x.c0) || (F107_ISNAN(x.c1))) { return x; } /* Handle negative argument (I do this just to enforce "symmetry": tanh(-x)+tanh(x) will always equal 0) */ neg = 0; if (x < 0.0) { x = -x; neg = 1; } // char fmt[40]; // In the limit of sufficiently large arguments, sinh(x)=cosh(x)=e^x/2 // and therefore tanh(x)=1 // This happens when e^x + e^-x = e^x * (1 + 2^-107) // 1 + e^-2x = 1 + 2^-107 // e^-2x = 2^-107 // -2x = ln(2^-107) // x = ln(2^107) / 2 = 37.0833741599570740538219 if (x.c0 > 37.08337415995707405) { taylor = 1.0; /* Here I should have another case for arguments bigger than about 1.0, where it is more prudent to compute tanh(y) with y=1/x. Based on: tanh(x) = sinh(x)/cosh(x) = (e^x-e^-x)/(e^x+e^-x) = ((e^2x-1)/e^-2x)/((e^2x+1)/e^-2x) = (e^2x-1)/(e^2x+1) tanh(x)-1 = (e^2x - 1 - e^2x - 1) / (e^2x+1) = -2/(e^2x+1) tanh(x) = 1 - 2/(e^2x+1) I give maxima: taylor(1-2/(%e^(2/x)+1),x,0,10); but it isn't too helpful. */ } else { /* (%i1) taylor(tanh(x),x,0,15); 3 5 7 9 11 13 15 x 2 x 17 x 62 x 1382 x 21844 x 929569 x (%o1)/T/ x - -- + ---- - ----- + ----- - -------- + --------- - ---------- 3 15 315 2835 155925 6081075 638512875 + . . . Coefficients are declared globally above. */ // char fmt[40]; f107_o xpow, prev; f107_o x2 = x * x; int i; taylor = x; // t = x xpow = x; prev = 0.0; i = 1; while ((i < F107_TANH_NCOEF) && (taylor != prev)) { prev = taylor; xpow = xpow * x2; taylor = taylor + tanh_coef[i] * xpow; i++; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("i=%d, taylor=%s\n", i, fmt); } } /* If the argument was negative, the answer is reciprocal. */ if (neg) { taylor = - taylor; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("neg: taylor=%s\n", fmt); } return(taylor); } // End of restrict_tanh() f107_o exp(f107_o x) { f107_o taylor; f107_o p2; f107_pf pow2; int i, recip, square; if (f107_need_init) { f107_o_init(); } /* Handle infinity and NaN first */ if (F107_ISINF(x.c0) || (F107_ISINF(x.c1))) { if (F107_ISINF(x.c0) < 0) { taylor.c0 = taylor.c1 = 0.0; return taylor; } taylor.c0 = taylor.c1 = x.c0; return taylor; } if (F107_ISNAN(x.c0) || (F107_ISNAN(x.c1))) { return x; } /* Handle negative argument */ recip = 0; if (x < 0.0) { x = -x; recip = 1; } /* Handle overflow and underflow */ if (x.c0 > 709.78271289338399684) { taylor = 0.0; if (recip) { taylor.c0 = taylor.c1 = FP_INFINITE; } return taylor; } // Remove exact powers of 2, which will be multiplied back in at the end. p2 = trunc(x / K107_ln_2); x = x - (K107_ln_2 * p2); pow2 = f107_pow2_int((int) (p2.c0)); //printf("p2.c0 == %g, pow2 == %g\n", p2.c0, pow2); /* Next we divide by 2 a few times, this will be squared back at the end. Doing this allows the Taylor series to converge more quickly. Note that dividing by 2 is lossless, but squaring is not. This makes x very small, so we cannot compute exp(x) without losing significant digits. Therefore the Taylor series and squaring it back up will be performed on values of e^x-1. (This is in contrast to Hida (2007) which implies that they work on e^x directly). */ square = 0; for(i=0; (i<5) && (x.c0 > 0.0027); i++) { // 0.0027 ~ ln(2)/256 x = f107_o::f107_native_half(x); square++; } #if 1 //char fmt[40]; // Used in the commented-out debug printf { /* Perform Taylor series sum for exp(x)-1 */ f107_o numer, denom, prev; f107_pf k; st_taylor_base++; taylor = 0.0; prev = 1.0; /* this just needs to be different from the initial value of 'taylor', to get the loop started */ numer = 1.0; denom = 1.0; k = 0.0; while (taylor != prev) { st_taylor_sum++; prev = taylor; // Step 1 Step 2 Step 3 Step 4 k = k + 1.0; // 1.0 2.0 3.0 4.0 numer = numer * x; // x x^2 x^3 x^4 denom = denom * k; // 1.0 2.0 6.0 24.0 taylor = taylor + numer/denom;// x x+x^2/2 ..+x^3/6 ..+x^4/24 //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("k=%g, taylor=%s\n", k, fmt); } } #else /* Use the identity: exp(x) = (1 + t) / (1 - t) where t = tanh(x/2) We actually want exp(x)-1, to we convert the formula as follows: exp(x) = (1 + t) / (1 - t) = (2 + t - 1) / (1 - t) = 2 / (1 - t) + (t - 1) / (1 - t) = 2 / (1 - t) - 1 so exp(x) - 1 = 2 / (1 - t) - 2 */ taylor = restrict_tanh(f107_o::f107_native_half(x)); taylor = 2.0/(1.0 - taylor) - 2.0; #endif /* Square it back up to fill the range [0..ln(2)] */ while (square > 0) { /* The value coming in is exp(x)-1, and we want to square the value of exp(x), so we have to compute exp(x)^2-1. We could do this by adding 1, squaring, and subtracting 1 back. But if we add 1 we'll lose precision. Since ((1+x)^2)-1 = 1 + 2x + x^2 - 1 = 2x + x^2, we don't need to add or subtract the 1; we can avoid loss-of-significance by just computing 2x+x^2. */ taylor = f107_o::f107_native_2x(taylor) + f107_o::f107_native_square(taylor); square -= 1; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("squared: taylor=%s\n", fmt); } /* Now we can safely add the 1.0 back in */ taylor = taylor + 1.0; /* Scale it back up by the power of 2 we removed earlier. This is a lossless operation. */ // taylor.c0 *= pow2; taylor.c1 *= pow2; taylor = f107_o::f107_native_scale(taylor, pow2); //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("scaled: taylor=%s\n", fmt); /* If the argument was negative, the answer is reciprocal. */ if (recip) { taylor = 1.0 / taylor; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("recip: taylor=%s\n", fmt); } return(taylor); } f107_o log(f107_o target) { f107_o root; //char fmt[40]; if (F107_ISINF(target.c0) || (F107_ISINF(target.c1))) { if (F107_ISINF(target.c0) < 0) { // log of -inf root.c0 = root.c1 = sqrt(-1.0); // NaN return root; } // log of +inf root.c0 = root.c1 = target.c0; return root; } if (F107_ISNAN(target.c0) || (F107_ISNAN(target.c1))) { return target; } if (target < 0.0) { root.c0 = root.c1 = sqrt(-1.0); // NaN return root; } if (target == 0.0) { root.c0 = root.c1 = log(0); // -infinity return root; } /* First guess */ root = log(target.c0); //f107_spfg(fmt, sizeof(fmt), 0, 32, root); //printf("root 1=%s\n", fmt); root = root + target * exp(-root) - 1.0; //f107_spfg(fmt, sizeof(fmt), 0, 32, root); //printf("root 2=%s\n", fmt); #if (!(F107_APPROX)) root = root + target * exp(-root) - 1.0; //f107_spfg(fmt, sizeof(fmt), 0, 32, root); //printf("root 3=%s\n", fmt); #endif return root; } f107_o sinh(f107_o x) { f107_o taylor; int neg; if (f107_need_init) { f107_o_init(); } /* Handle infinity and NaN first */ if (F107_ISINF(x.c0) || (F107_ISINF(x.c1))) { taylor.c0 = taylor.c1 = FP_INFINITE; return taylor; } if (F107_ISNAN(x.c0) || (F107_ISNAN(x.c1))) { return x; } /* Handle negative argument (I do this just to enforce "symmetry": sinh(-x)+sinh(x) will always equal 0) */ neg = 0; if (x < 0.0) { x = -x; neg = 1; } /* Handle overflow and underflow */ if (x.c0 > 710.4758600739439421) { if (neg) { taylor.c0 = taylor.c1 = -FP_INFINITE; } else { taylor.c0 = taylor.c1 = FP_INFINITE; } return taylor; } // char fmt[40]; // In the limit of sufficiently large arguments, sinh(x)=cosh(x)=e^x/2 // This happens when e^x + e^-x = e^x * (1 + 2^-107) // 1 + e^-2x = 1 + 2^-107 // e^-2x = 2^-107 // -2x = ln(2^-107) // x = ln(2^107) / 2 = 37.0833741599570740538219 if (x.c0 > 37.08337415995707405) { taylor = exp(x - K107_ln_2); } else { /* (%i1) taylor(sinh(x),x,0,12); 3 5 7 9 11 x x x x x (%o1)/T/ x + -- + --- + ---- + ------ + -------- + . . . 6 120 5040 362880 39916800 */ f107_o numer, denom, prev; f107_o k; f107_o x2 = x * x; taylor = x; // t = x numer = x; denom = 1.0; k = 1.0; prev = 0.0; while (taylor != prev) { prev = taylor; // Step 1 Step 2 Step 3 k = k + 1.0; // 2.0 4.0 6.0 denom = denom * k; // 2.0 24.0 720.0 k = k + 1.0; // 3.0 5.0 7.0 denom = denom * k; // 6.0 120.0 5040.0 numer = numer * x2; // x^3 x^5 x^7 taylor = taylor + numer/denom;// x+x^3/6 x+x^3/6+3^5/120 ..+x^7/5040 //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("k=%g, taylor=%s\n", k.c0, fmt); } } /* If the argument was negative, the answer is reciprocal. */ if (neg) { taylor = - taylor; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("neg: taylor=%s\n", fmt); } return(taylor); } // End of sinh() f107_o tanh(f107_o x) { f107_o taylor; int neg; if (f107_need_init) { f107_o_init(); } /* Handle infinity and NaN first */ if (F107_ISINF(x.c0) || (F107_ISINF(x.c1))) { taylor.c0 = taylor.c1 = FP_INFINITE; return taylor; } if (F107_ISNAN(x.c0) || (F107_ISNAN(x.c1))) { return x; } /* Handle negative argument (I do this just to enforce "symmetry": tanh(-x)+tanh(x) will always equal 0) */ neg = 0; if (x < 0.0) { x = -x; neg = 1; } // In the limit of sufficiently large arguments, sinh(x)=cosh(x)=e^x/2 // and therefore tanh(x)=1 // This happens when e^x + e^-x = e^x * (1 + 2^-107) // 1 + e^-2x = 1 + 2^-107 // e^-2x = 2^-107 // -2x = ln(2^-107) // x = ln(2^107) / 2 = 37.0833741599570740538219 if (x.c0 > 37.08337415995707405) { taylor = 1.0; } else if (x.c0 > 1.0) { /* tanh(x) = 1 - 2/(e^2x+1) */ taylor = exp(f107_o::f107_native_2x(x)); taylor = 1.0 - 2.0/(taylor + 1.0); } else { /* For x <1.0 we can get away with a normal Taylor series. (%i1) taylor(tanh(x),x,0,15); 3 5 7 9 11 13 15 x 2 x 17 x 62 x 1382 x 21844 x 929569 x (%o1)/T/ x - -- + ---- - ----- + ----- - -------- + --------- - ---------- 3 15 315 2835 155925 6081075 638512875 + . . . Coefficients are declared globally above. */ // char fmt[40]; f107_o xpow, prev; f107_o x2 = x * x; int i; taylor = x; // t = x xpow = x; prev = 0.0; i = 1; while ((i < F107_TANH_NCOEF) && (taylor != prev)) { prev = taylor; xpow = xpow * x2; taylor = taylor + tanh_coef[i] * xpow; i++; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("i=%d, taylor=%s\n", i, fmt); } } /* If the argument was negative, the answer is reciprocal. */ if (neg) { taylor = - taylor; //f107_spfg(fmt, sizeof(fmt), 0, 32, taylor); //printf("neg: taylor=%s\n", fmt); } return(taylor); } // End of tanh() // end of f107_o.cpp