newlib-cygwin

Commit Graph

Author	SHA1	Message	Date
Szabolcs Nagy	cbe50607fb	Fix code style and comments of new math code Synchronize code style and comments with Arm Optimized Routines, there are no code changes in this patch. This ensures different projects using the same code have consistent code style so bug fix patches can be applied more easily.	2018-07-06 10:29:01 +02:00
Szabolcs Nagy	fb929067db	New exp and exp2 implementations The new implementations are provided under !__OBSOLETE_MATH, they use ISO C99 code. There are several settings, with the default one the worst case error in nearest rounding mode is 0.509 ULP for exp and 0.507 ULP for exp2 when a multiply and add is contracted into an fma. They use a shared 2 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased by 1868 bytes. The w_*.c wrappers are disabled for the new code as it takes care of error handling inline. The old exp2(x) code used to be just pow(2,x) so the speedup there is more significant. The file name has no special prefix to avoid any name collision with existing files. Improvements on Cortex-A72: exp latency: 3.2x exp thruput: 4.1x exp2 latency: 7.8x exp2 thruput: 18.8x	2018-06-27 15:40:49 +02:00

Author

SHA1

Message

Date

Szabolcs Nagy

cbe50607fb

Fix code style and comments of new math code

Synchronize code style and comments with Arm Optimized Routines, there
are no code changes in this patch.  This ensures different projects using
the same code have consistent code style so bug fix patches can be applied
more easily.

2018-07-06 10:29:01 +02:00

Szabolcs Nagy

fb929067db

New exp and exp2 implementations

The new implementations are provided under !__OBSOLETE_MATH, they use
ISO C99 code.  There are several settings, with the default one the
worst case error in nearest rounding mode is 0.509 ULP for exp and
0.507 ULP for exp2 when a multiply and add is contracted into an fma.
They use a shared 2 KB lookup table, on aarch64 .text+.rodata size
of libm.a is increased by 1868 bytes.  The w_*.c wrappers are disabled
for the new code as it takes care of error handling inline.

The old exp2(x) code used to be just pow(2,x) so the speedup there
is more significant.

The file name has no special prefix to avoid any name collision with
existing files.

Improvements on Cortex-A72:
exp latency: 3.2x
exp thruput: 4.1x
exp2 latency: 7.8x
exp2 thruput: 18.8x

2018-06-27 15:40:49 +02:00

2 Commits