The implementation of expf() explains how approximation in the range [0 - 0.34] is done. The comment describes the "Reme" algorithm for constructing the polynomial. This is a typo and should be the "Remez" algorithm. The remez algorithm (or minimax) is used to calculate the coefficients of polynomials in other implementations of exp(0 and log().
See more:
https://en.wikipedia.org/wiki/Remez_algorithm
C compilers may fold const values at compile time, so expressions
which try to elicit underflow/overflow by performing simple
arithemetic on suitable values will not generate the required
exceptions.
Work around this by replacing code which does these arithmetic
operations with calls to the existing __math_xflow functions that are
designed to do this correctly.
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
libm/math: Pass sign to __math_xflow instead of muliplying result
While testing the exp function we noticed some errors at the specified
magnitude. Within this range the exp function returns the input value +1
as an output. We chose to run a test of 1m exponentially spaced values
in the ranges [-2^-27,-2^-32] and [2^-32,2^-27] which showed 7603 and
3912 results with an error of >=0.5 ULP (compared with MPFR in 128 bit)
with the highest being 0.56 ULP and 0.53 ULP.
It's easy to fix by changing the magnitude at which the input value +1
is returned from <2^-28 to <2^-32 and using the polynomial instead. This
reduces the number of results with an error of >=0.5 ULP to 485 and 479
in above tests, all of which are exactly 0.5 ULP.
As we were already checking on exp we also took a look at expf. For expf
the magnitude where the input value +1 is returned can be increased from
<2^-28 to <2^-23 without accuracy loss for a slight performance
improvement. To ensure this was the correct value we tested all values
in the ranges [-2^-17,-2^-28] and [2^-28,2^-17] (~92.3m values each).
The new implementations are provided under !__OBSOLETE_MATH, they use
ISO C99 code. There are several settings, with the default one the
worst case error in nearest rounding mode is 0.509 ULP for exp and
0.507 ULP for exp2 when a multiply and add is contracted into an fma.
They use a shared 2 KB lookup table, on aarch64 .text+.rodata size
of libm.a is increased by 1868 bytes. The w_*.c wrappers are disabled
for the new code as it takes care of error handling inline.
The old exp2(x) code used to be just pow(2,x) so the speedup there
is more significant.
The file name has no special prefix to avoid any name collision with
existing files.
Improvements on Cortex-A72:
exp latency: 3.2x
exp thruput: 4.1x
exp2 latency: 7.8x
exp2 thruput: 18.8x
compiler warnings found this way.
* libc/stdio/freopen.c (_freopen_r): Fix bug setting _flags.
* libc/include/stdio.h (_rename): Define when building newlib.
* libc/include/sys/signal.h (_kill): Ditto.
* libc/include/sys/stat.h (_mkdir): Ditto.
* libc/include/sys/time.h (_gettimeofday): Ditto.
* libc/include/sys/times.h (_times): Ditto.
* libc/include/sys/wait.h (_wait): Ditto.
* libc/locale/lmessages.c (empty): Don't define for Cygwin.
* libc/locale/lmonetary.c (cnv): Ditto.
* libc/locale/nl_langinfo.c (nl_langinfo): Ditto for variable s.
* libc/posix/collate.c: Throughout cast to avoid compiler warning.
* libc/posix/engine.c (matcher): Initialize dp to avoid compiler
warning.
* libc/posix/glob.c: Disable on Cygwin. Explain why.
* libc/posix/regcomp.c: Fix "uninitialized" compiler warnings.
(dissect): Deliberately silence gcc compiler warning. Add comment to
explain why.
* libc/posix/wordexp.c (wordexp): Remove num_bytes variable since result
is never used.
* libc/posix/popen.c (popen): Ditto for variable last.
* libc/reent/mkdirr.c: Include sys/stat.h.
* libc/reent/renamer.c: Include stdio.h.
* libc/search/hash.c: Throughout use underscored variants of the stat
function family.
(init_hash): Add missing definition for the __USE_INTERNAL_STAT64 case.
* libc/search/hash_bigkey.c (__big_insert): Add parenthesis to avoid
compiler warning.
* libc/search/hash_page.c (overflow_page): Initalize freep to NULL to
avoid compiler warning.
* libc/stdio/asiprintf.c (_asiprintf_r): Cast unsigned char * to char *
to avoid compiler warning.
(asiprintf): Ditto.
* libc/stdio/asprintf.c (_asprintf_r): Ditto.
(asprintf): Ditto.
* libc/stdio/vasiprintf.c (_vasiprintf_r): Ditto.
* libc/stdio/vasprintf.c (_vasprintf_r): Ditto.
* libc/stdio/mktemp.c (_gettemp): Cast to unsigned char in call to
isdigit to avoid compiler warning.
* libc/stdio/vfprintf.c (_VFPRINTF_R): Initialize variables used for
grouping to avoid compiler warning. Only define and set nseps and
nrepeats if they are really used.
* libc/stdio/vfwprintf.c (_VFWPRINTF_R): Ditto. Only define state if
it is really used.
* libc/stdio/vfscanf.c (u_char): Revert to be defined as unsigned char.
(__SVFSCANF_R): Cast fmt in call to __mbtowc.
* libc/stdlib/mbtowc_r.c (JIS_state_table): Disable when building
Cygwin.
(JIS_action_table): Ditto.
* libc/stdlib/wctomb_r.c (__utf8_wctomb): Add parenthesis to avoid
compiler warning.
* libc/string/strcasestr.c: Deliberately silence gcc compiler warning.
Add comment to explain why.
* libc/time/strptime.c (strptime): Cast to unsigned char in calls to
isspace to avoid compiler warning.
* libm/math/e_atan2.c (__ieee754_atan2): Add parenthesis to avoid
compiler warning.
* libm/math/e_exp.c (__ieee754_exp): Initialize k to 0 to avoid
compiler warning. Drop setting it to 0 later.
* libm/math/ef_exp.c (__ieee754_expf): Ditto.
* libm/math/e_pow.c (__ieee754_pow): Add braces to avoid compiler
warning.
* libm/math/ef_pow.c (__ieee754_powf): Ditto.
* libm/math/er_lgamma.c (__ieee754_lgamma_r): Initialize nadj to 0 to
avoid compiler warning.
* libm/math/erf_lgamma.c (__ieee754_lgammaf_r): Ditto.
* libm/math/e_rem_pio2.c (__ieee754_rem_pio2): Ditto for variable z.
* libm/common/sf_round.c (roundf): Remove signbit variable since result
is never used.