newlib-cygwin

Commit Graph

Author

SHA1

Message

Date

Szabolcs Nagy

393a1cb4ea

Move __HAVE_FAST_FMA to math_config.h

Define it consistently with other HAVE_* macros that only affect code
using math_config.h.  This is also closer to the Arm Optimized Routines
code.

2018-07-06 10:29:01 +02:00
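As an illustration of how a HAVE_*-style FMA switch is typically consumed, here is a minimal sketch. The macro name EXAMPLE_HAVE_FAST_FMA and the function example_poly2 are hypothetical and are not taken from math_config.h; the sketch assumes only the standard C99 FP_FAST_FMA macro from math.h, which an implementation may define when fma() is about as fast as a separate multiply and add.

```c
#include <math.h>

/* Hypothetical configuration macro, for illustration only.  It is not
   the macro defined in newlib's math_config.h.  */
#ifndef EXAMPLE_HAVE_FAST_FMA
# ifdef FP_FAST_FMA
#  define EXAMPLE_HAVE_FAST_FMA 1
# else
#  define EXAMPLE_HAVE_FAST_FMA 0
# endif
#endif

/* Evaluate c0 + c1*r + c2*r*r with Horner's scheme.  The fused path
   rounds once per step; the fallback rounds after every multiply and
   every add.  */
double example_poly2(double r, double c0, double c1, double c2)
{
#if EXAMPLE_HAVE_FAST_FMA
    return fma(fma(c2, r, c1), r, c0);
#else
    return (c2 * r + c1) * r + c0;
#endif
}
```

Keeping such a switch in one private configuration header means only the code that includes that header sees it, which matches the intent stated in the commit message.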

Szabolcs Nagy

07e2c32828

New log2 implementation

The new implementation is provided under !__OBSOLETE_MATH, it uses
ISO C99 code.  With default settings the worst case error in nearest
rounding mode is 0.547 ULP with inlined fma and fma contraction.  It uses
a 1 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased
by 1584 bytes.

Note that the math.h header defines log2(x) to be log(x)/Ln2, this is
not changed, so the new code is only used if that macro is suppressed.

Improvements on Cortex-A72:
latency: 2.0x
throughput: 2.2x

2018-06-27 15:40:49 +02:00

2 Commits