Commit Graph

393 Commits

Author SHA1 Message Date
Keith Packard via Newlib a44bc679a4 libm/machine/arm: Add optimized fmaf and fma when available
When HAVE_FAST_FMAF is set, use the vfma.f32 instruction, when
HAVE_FAST_FMA is set, use the vfma.f64 instruction.

Usually the compiler built-ins will already have inlined these
instructions, but provide these symbols for cases where that doesn't
work instead of falling back to the (inaccurate) common code versions.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-10 21:04:12 +02:00
Keith Packard via Newlib 0c1989070e libm: Detect fast fmaf support
Anything with fast FMA is assumed to have fast FMAF, along with
32-bit arms that advertise 32-bit FP support and __ARM_FEATURE_FMA

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-10 21:01:46 +02:00
Keith Packard via Newlib 432b331c79 libm: ARM without HW double does not have fast FMA
32-bit ARM processors with HW float (but not HW double) may define
__ARM_FEATURE_FMA, but that only means they have fast FMA for 32-bit
floats.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-10 21:01:46 +02:00
Keith Packard via Newlib 73b02710ec libm/math: ensure that expf(-huge) sets FE_UNDERFLOW exception
It was calling __math_uflow(0) instead of __math_uflowf(0), which
resulted in no exception being set on machines with exception support
for float but not double.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-10 10:31:36 +02:00
Keith Packard via Newlib c3ce8405c1 libm: Control errno support with _IEEE_LIBM configuration parameter
This removes the run-time configuration of errno support present in
portions of the math library and unifies all of the compile-time errno
configuration under a single parameter so that the whole library
is consistent.

The run-time support provided by _LIB_VERSION is no longer present in
the public API, although it is still used internally to disable errno
setting in some functions. Now that it is a constant, the compiler should
remove that code when errno is not supported.

This removes s_lib_ver.c as _LIB_VERSION is no longer variable.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-05 22:23:02 +02:00
Keith Packard via Newlib e108d04432 libm/math: Don't modify __ieee754_pow return values in pow
The __ieee754 functions already return the right value in exception
cases, so don't modify those. Setting the library to _POSIX_/_IEEE_
mode now only affects whether errno is modified.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-05 22:16:31 +02:00
Keith Packard via Newlib 98a4f8de47 libm/math: Set errno to ERANGE for pow(0, -y)
POSIX says that the errno for pow(0, -y) should be ERANGE instead of
EDOM.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/pow.html

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-05 22:16:31 +02:00
Keith Packard via Newlib 2eafcc78df libm/math: Make yx functions set errno=ERANGE for x=0
The y0, y1 and yn functions need separate conditions when x is zero as
that returns ERANGE instead of EDOM.

Also stop adjusting the return value from the __ieee754_y* functions
as that is already correct and we were just breaking it.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-05 22:16:31 +02:00
Keith Packard via Newlib 905aa4c013 libm/math: set errno to ERANGE at gamma poles
For POSIX, gamma(i) (i non-positive integer) should set errno to
ERANGE instead of EDOM.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-05 22:16:31 +02:00
Keith Packard via Newlib bb166cfc3e libm/common: Set WANT_ERRNO based on _IEEE_LIBM value
_IEEE_LIBM is the configuration value which controls whether the
original libm functions modify errno. Use that in the new math code as
well so that the resulting library is internally consistent.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-08-04 19:30:45 +02:00
Keith Packard 12ad9a46df libm/math: Use __math_xflow in obsolete math code [v2]
C compilers may fold const values at compile time, so expressions
which try to elicit underflow/overflow by performing simple
arithemetic on suitable values will not generate the required
exceptions.

Work around this by replacing code which does these arithmetic
operations with calls to the existing __math_xflow functions that are
designed to do this correctly.

Signed-off-by: Keith Packard <keithp@keithp.com>

----

v2:
	libm/math: Pass sign to __math_xflow instead of muliplying result
2020-08-03 13:29:27 +02:00
Sebastian Huber ba283d8777 arm: Fix include to avoid undefined reference
ld: libm.a(lib_a-fesetenv.o): in function `fesetenv':
newlib/libm/machine/arm/fesetenv.c:38: undefined reference to `vmsr_fpscr'

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2020-07-29 16:24:13 +02:00
Eshan dhawan 3ca4325968 arm: Split fenv.c into multiple files
Use the already existing stub files if possible.  These files are
necessary to override the stub implementation with the machine-specific
implementation through the build system.

Reviewed-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-29 06:58:17 +02:00
Eshan dhawan b7a6e02dc6 arm: Fix fenv support
The previous fenv support for ARM used the soft-float implementation of
FreeBSD.  Newlib uses the one from libgcc by default.  They are not
compatible.  Having an GCC incompatible soft-float fenv support in
Newlib makes no sense.  A long-term solution could be to provide a
libgcc compatible soft-float support.  This likely requires changes in
the GCC configuration.  For now, provide a stub implementation for
soft-float multilibs similar to RISC-V.

Move implementation to one file and delete now unused files.  Hide
implementation details.  Remove function parameter names from header
file to avoid name conflicts.

Provide VFP support if __SOFTFP__ is not defined like glibc.

Reviewed-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-29 06:58:17 +02:00
Corinna Vinschen f095752167 libm: machine: Add missing sparc and mips configuration
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2020-07-03 10:45:44 +02:00
Eshan dhawan via Newlib 65918715a0 mips fenv support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-03 10:41:45 +02:00
Eshan dhawan via Newlib 03bf9f431c SPARC fenv support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-03 10:41:45 +02:00
Eshan dhawan via Newlib fd5e27d362 fenv aarch64 support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-02 12:12:39 +02:00
Eshan dhawan via Newlib a97bdf100f fenv support arm
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-06-09 21:13:17 -04:00
Jeff Johnston bc5087298d Regenerate libm/machine configuration files for powerpc 2020-06-09 20:59:04 -04:00
Eshan dhawan via Newlib e6ce6f1430 hard float support for PowerPC taken from FreeBSD
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-06-03 11:17:47 +02:00
Keith Packard via Newlib 6295d75913 newlib/libm/math: Make pow/powf return qnan for snan arg
The IEEE spec for pow only has special case for x**0 and 1**y when x/y
are quiet NaN. For signaling NaN, the general case applies and these functions
should signal the invalid exception and return a quiet NaN.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-03-26 12:21:33 +01:00
Keith Packard via Newlib 3439f3b0e9 newlib/libm/common: Don't re-convert float to bits in modf/modff
These functions shared a pattern of re-converting the argument to bits
when returning +/-0. Skip that as the initial conversion still has the
sign bit.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-03-26 12:21:33 +01:00
Keith Packard via Newlib 61cd34c1bf newlib/libm/common: Fix modf/modff returning snan
Recent GCC appears to elide multiplication by 1, which causes snan
parameters to be returned unchanged through *iptr. Use the existing
conversion of snan to qnan to also set the correct result in *iptr
instead.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-03-26 12:21:33 +01:00
Joseph S. Myers 5e24839658 Fix spurious underflow exceptions for Bessel functions for double(from glibc bug 14155)
This fix comes from glibc, from files which originated from
	the same place as the newlib files. Those files in glibc carry
	the same license as the newlib files.

Bug 14155 is spurious underflow exceptions from Bessel functions for
large arguments.  (The correct results for large x are roughly
constant * sin or cos (x + constant) / sqrt (x), so no underflow
exceptions should occur based on the final result.)

There are various places underflows may occur in the intermediate
calculations that cause the failures listed in that bug.  This patch
fixes problems for the double version where underflows occur in
calculating the intermediate functions P and Q (in particular, x**-12
gets computed while calculating Q).  Appropriate approximations are
used for P and Q for arguments at least 0x1p28 and above to avoid the
underflows.

For sufficiently large x - 0x1p129 and above - the code already has a
cut-off to avoid calculating P and Q at all, which means the
approximations -0.125 / x and 0.375 / x can't themselves cause
underflows calculating Q.  This cut-off is heuristically reasonable
for the point beyond which Q can be neglected (based on expecting
around 0x1p-64 to be the least absolute value of sin or cos for large
arguments representable in double).

The float versions use a cut-off 0x1p17, which is less heuristically
justifiable but should still only affect values near zeroes of the
Bessel functions where these implementations are intrinsically
inaccurate anyway (bugs 14469-14472), and should serve to avoid
underflows (the float underflow for jn in bug 14155 probably comes
from the recurrence to compute jn).  ldbl-96 uses 0x1p129, which may
not really be enough heuristically (0x1p143 or so might be safer - 143
= 64 + 79, number of mantissa bits plus total number of significant
bits in representation) but again should avoid underflows and only
affect values where the code is substantially inaccurate anyway.
ldbl-128 and ldbl-128ibm share a completely different implementation
with no such cut-off, which I propose to fix separately.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-03-26 12:21:33 +01:00
Fabian Schriever 6b0c1e7cc8 Fix hypotf missing mask in hi+lo decomposition
Add the missing mask for the decomposition of hi+lo which caused some
errors of 1-2 ULP.

This change is taken over from FreeBSD:
95436ce20d

Additionally I've removed some variable assignments which were never
read before being overwritten again in the next 2 lines.
2020-03-19 16:46:17 +01:00
Fabian Schriever 4ad9ba42fc Fix modf/f for NaN input
For NaN input the modf/f procedures should return NaN instead of zero
with the sign of the input.
2020-03-19 16:34:26 +01:00
Fabian Schriever 9e8da7bd21 Fix for k_tan.c specific inputs
This fix for k_tan.c is a copy from fdlibm version 5.3 (see also
http://www.netlib.org/fdlibm/readme), adjusted to use the macros
available in newlib (SET_LOW_WORD).

This fix reduces the ULP error of the value shown in the fdlibm readme
(tan(1.7765241907548024E+269)) to 0.45 (thereby reducing the error by
1).

This issue only happens for large numbers that get reduced by the range
reduction to a value smaller in magnitude than 2^-28, that is also
reduced an uneven number of times. This seems rather unlikely given that
one ULP is (much) larger than 2^-28 for the values that may cause an
issue.  Although given the sheer number of values a double can
represent, it is still possible that there are more affected values,
finding them however will be quite hard, if not impossible.

We also took a look at how another library (libm in FreeBSD) handles the
issue: In FreeBSD the complete if branch which checks for values smaller
than 2^-28 (or rather 2^-27, another change done by FreeBSD) is moved
out of the kernel function and into the external function. This means
that the value that gets checked for this condition is the unreduced
value. Therefore the input value which caused a problem in the
fdlibm/newlib kernel tan will run through the full polynomial, including
the careful calculation of -1/(x+r). So the difference is really whether
r or y is used. r = y + p with p being the result of the polynomial with
1/3*x^3 being the largest (and magnitude defining) value. With x being
<2^-27 we therefore know that p is smaller than y (y has to be at least
the size of the value of x last mantissa bit divided by 2, which is at
least x*2^-51 for doubles) by enough to warrant saying that r ~ y.  So
we can conclude that the general implementation of this special case is
the same, FreeBSD simply has a different philosophy on when to handle
especially small numbers.
2020-03-18 10:05:11 +01:00
Fabian Schriever c56f53a2a0 Fix truncf for sNaN input
Make line 47 in sf_trunc.c reachable. While converting the double
precision function trunc to the single precision version truncf an error
was introduced into the special case. This special case is meant to
catch both NaNs and infinities, however qNaNs and infinities work just
fine with the simple return of x (line 51). The only error occurs for
sNaNs where the same sNaN is returned and no invalid exception is
raised.
2020-03-11 12:10:58 +01:00
Joel Sherrill 91a8d0c907 i386/fenv.c: Include fenv.c implementation shared with x86_64, not stub 2020-03-10 16:05:59 +01:00
Fabian Schriever 18b4e0e518 Fix error in fdim/f for infinities
The comparison c == FP_INFINITE causes the function to return +inf as it
expects x = +inf to always be larger than y. This shortcut causes
several issues as it also returns +inf for the following cases:
 - fdim(+inf, +inf), expected (as per C99): +0.0
 - fdim(-inf, any non NaN), expected: +0.0

I don't see a reason to keep the comparison as all the infinity cases
return the correct result using just the ternary operation.
2020-03-10 15:11:23 +01:00
Fabian Schriever a8a40ee575 Fix error in exp in magnitude [2e-32,2e-28]
While testing the exp function we noticed some errors at the specified
magnitude. Within this range the exp function returns the input value +1
as an output. We chose to run a test of 1m exponentially spaced values
in the ranges [-2^-27,-2^-32] and [2^-32,2^-27] which showed 7603 and
3912 results with an error of >=0.5 ULP (compared with MPFR in 128 bit)
with the highest being 0.56 ULP and 0.53 ULP.

It's easy to fix by changing the magnitude at which the input value +1
is returned from <2^-28 to <2^-32 and using the polynomial instead. This
reduces the number of results with an error of >=0.5 ULP to 485 and 479
in above tests, all of which are exactly 0.5 ULP.

As we were already checking on exp we also took a look at expf. For expf
the magnitude where the input value +1 is returned can be increased from
<2^-28 to <2^-23 without accuracy loss for a slight performance
improvement. To ensure this was the correct value we tested all values
in the ranges [-2^-17,-2^-28] and [2^-28,2^-17] (~92.3m values each).
2020-03-09 10:12:25 +01:00
Fabian Schriever d4bcecb3e9 Fix error in float trig. function range reduction
The single-precision trigonometric functions show rather high errors in
specific ranges starting at about 30000 radians. For example the sinf
procedure produces an error of 7626.55 ULP with the input
5.195880078125e+04 (0x474AF6CD) (compared with MPFR in 128bit
precision). For the test we used 100k values evenly spaced in the range
of [30k, 70k]. The issues are periodic at higher ranges.

This error was introduced when the double precision range reduction was
first converted to float. The shift by 8 bits always returns 0 as iq is
never higher than 255.

The fix reduces the error of the example above to 0.45 ULP, highest
error within the test set fell to 1.31 ULP, which is not perfect, but
still a significant improvement. Testing other previously erroneous
ranges no longer show particularly large accuracy errors.
2020-03-03 16:45:22 +01:00
Fabian Schriever cef36220f2 Fix error in powf for (-1.0, NaN) input
Prevent confusion between -1.0 and 1.0 in powf. The corresponding
similar error was previously fixed for pow (see commit bb25dd1b)
2020-03-02 16:46:03 +01:00
Joel Sherrill fbaa096772 x86_64/i386 fenv: Replace symlink with include fenv_stub.c
Having symlinks for these files led to an issue reported to the RTEMS
Project that showed up using some tar for native Windows to unpack the
newlib sources.  It creates symlinks in the tar file as copies of the
files the symlinks point to.  If the links appear in the tar file before
the source exists, it cannot copy the file.

The solution in this patch is to convert the files that are symbolic
links into simple files which include the file they were linked to.
This should be more portable and avoids the symbolinc link problem.
2020-02-25 16:42:19 +01:00
Nicolas Brunie bb25dd1b0f pow: fix pow(-1.0, NaN)
I think I may have encountered a bug in the implementation of pow:
pow(-1.0, NaN) returns 1.0 when it should return NaN.
Because ix is used to check input vs 1.0 rather than hx, -1.0 is
mistaken for 1.0
2020-02-14 10:12:25 +01:00
Keith Packard 10058b98e7 Typo in license terms for newlib/libm/common/log2.c
The closing quotes were in the wrong place

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-02-06 11:58:50 +01:00
Jeff Johnston 4e78f8ea16 Bump up newlib release to 3.3.0 2020-01-21 15:17:43 -05:00
Jeff Johnston 1afb22a120 Bump up release to 3.2.0 for yearly snapshot 2020-01-02 14:56:24 -05:00
Keith Packard 7a526cdc28 libm: switch sf_log1p from double error routines to float
sf_log1p was using __math_divzero and __math_invalid, which
drag in a pile of double-precision code. Switch to using the
single-precision variants. This also required making those
available in __OBSOLETE_MATH mode.

Signed-off-by: Keith Packard <keithp@keithp.com>
2019-12-02 10:00:32 +01:00
Dimitar Dimitrov a1f617466d PRU: Align libmath to PRU ABI
The TI proprietary toolchain uses nonstandard names for some math
library functions. In order to achieve ABI compatibility between
GNU and TI toolchains, add support for the TI function names.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2019-10-31 15:02:33 -04:00
Jeff Johnston 0764a2eab8 Fix some generated files 2019-10-31 14:52:04 -04:00
Jeff Johnston cfc4955234 Add patch from Joel Sherrill for i386 and x86_64 fenv support 2019-10-08 16:59:04 -04:00
Joel Sherrill 1082cd8ea2 fe_dfl_env.c: Fix typo in comment 2019-09-03 09:53:38 -05:00
Joel Sherrill 91172ce591 fenv: Include documentation in generated .info file 2019-08-15 12:04:50 +02:00
Jon Turney b2990cae9e
fenv: Fix typo-ed variable name in documentation 2019-08-13 12:29:30 +01:00
Jon Turney 5624c18785
fenv: Fix mangled makedoc markup
See makedoc.c:657:
Variables are marked up as '<[foo]>'.
Code is marked up as '<<foo>>'.
2019-08-13 12:29:30 +01:00
Jon Turney be095dde8a
fenv: fe_dfl_env.c doesn't contain any documentation
fe_dfl_env.c doesn't contain any documentation, so drop it from makedoc
processing.
2019-08-13 12:29:29 +01:00
Joel Sherrill 744b90c996 Regenerated files from fenv.h addition 2019-08-09 17:49:16 +02:00
Joel Sherrill eae68bfc87 Add default implementation of fenv.h and all methods
The default implementation of the fenv.h methods return
	-EOPNOTSUP. Some of these have implementations appropriate
        for soft-float.

	The intention of the new fenv.h is that it be portable
	and that architectures provide their own implementation
	of sys/fenv.h.
2019-08-09 17:49:16 +02:00
Joel Sherrill 03f802846f Miscellaneous Makefile.in regenerated 2019-08-09 17:49:16 +02:00
Joel Sherrill 3e5302714f common/math_errf.c: Enable compilation of __math_oflowf
This resolved linking errors when using methods such as
	expm1().
2019-07-26 14:54:29 -05:00
Jeff Johnston 0d24a86822 Set errno in expm1{,f} / log1p{,f}
2019-07-09  Joern Rennecke  <joern.rennecke@riscy-ip.com>

	* libm/common/s_expm1.c ("math_config.h"): Include.
	(expm1): Use __math_oflow to set errno.
	* libm/common/s_log1p.c ("math_config.h"): Include.
	(log1p): Use __math_divzero and __math_invalid to set errno.
	* libm/common/sf_expm1.c ("math_config.h"): Include.
	(expm1f): Use __math_oflow to set errno.
	* libm/common/sf_log1p.c ("math_config.h"): Include.
	(log1pf): Use __math_divzero and __math_invalid to set errno.
2019-07-09 13:06:59 -04:00
Jozef Lawrynowicz b644774b8f Use nanf() instead of nan() in single-precision float libm math functions
This patch reduces code size for a few single-precision float math
functions, by using nanf() instead of nan() where required.
2019-01-23 10:46:30 +01:00
Jozef Lawrynowicz d451d9ec78 Use HUGE_VALF instead of HUGE_VAL in single-precision float libm math functions
This patch replaces instances of "(float).*HUGE_VAL" with a direct usage of
HUGE_VALF, which is also defined in math.h.
2019-01-23 10:46:30 +01:00
Jozef Lawrynowicz 7db203304e Remove HUGE_VAL definition from libm math functions
This patch removes the definitions of HUGE_VAL from some of the float math
functions. HUGE_VAL is defined in newlib/libc/include/math.h, so it is not
necessary to have a further definition in the math functions.
2019-01-23 10:46:30 +01:00
Jozef Lawrynowicz b14a879d85 Remove matherr, and SVID and X/Open math library configurations
Default math library configuration is now IEEE
2019-01-23 10:46:24 +01:00
Jeff Johnston 5726873100 Bump release to 3.1.0 for yearly snapshot 2018-12-31 23:40:11 -05:00
Szabolcs Nagy df6915f029 Fix powf overflow handling in non-nearest rounding mode
The threshold value at which powf overflows depends on the rounding mode
and the current check did not take this into account. So when the result
was rounded away from zero it could become infinity without setting
errno to ERANGE.

Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp

If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set,
which is fine in nearest rounding mode, but

  powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode
  powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode

and the previous implementation did not set errno in these cases.

The fix tries to avoid affecting the common code path or calling a
function that may introduce a stack frame, so float arithmetics is used
to check the rounding mode and the threshold is selected accordingly.
2018-12-10 16:51:05 +01:00
Corinna Vinschen 682c4a9f1e Implement nanl in newlib only
Drop Cygwin-specific nanl in favor of a generic implementation
in newlib.  Requires GCC 3.3 or later.

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2018-10-10 17:49:53 +02:00
Wilco Dijkstra 71e187bc07 Update Arm copyright notices in new math files
While working on the strstr patch I noticed several copyright headers
of the new math functions are missing closing quotes after ``AS IS.
I've added these.  Also update spellings of Arm Ltd in a few places
(but still use ARM LTD in upper case portion).  Finally add SPDX
identifiers to make everything consistent.
2018-09-28 11:03:55 +01:00
Szabolcs Nagy 877a386d76 Fix the documentation comment of checkint
checkint in pow is not supposed to be used with 0, inf or nan inputs.
2018-09-18 14:12:18 -04:00
Szabolcs Nagy f92a4c5d2d Document the log table generation method
Add comments with enough detail so the log lookup tables can be recreated.
2018-09-06 13:34:13 +02:00
Jon Beniston 86c31ae47b math_config.h: Fix signed overflow warning for 16-bit targets 2018-09-03 09:41:26 +02:00
Jon Beniston fcc1e7039f e_scalb.c: Call scalbln instead of scalbn on 16-bit targets to ensure constant fits in an int. 2018-09-03 09:41:23 +02:00
Jon Beniston a9cfb33b6c Add --disable-newlib-fno-builtin to allow compilation without -fno-builtin for smaller and faster code. 2018-08-31 15:40:42 -04:00
Keith Packard 088a45cdf6 Remove unused variable 'one' from sf_cos.c
Defined, never mentioned.

Signed-off-by: Keith Packard <keithp@keithp.com>
2018-08-29 15:57:27 +02:00
Wilco Dijkstra 8f1259a6ef Improve sincosf comments
Improve comments in sincosf implementation to make the code easier
to understand.  Rename the constant pi64 to pi63 since it's actually
PI * 2^-63.  Add comments for fields of sincos_t structure.  Add comments
describing implementation details to reduce_fast.
2018-08-16 13:17:44 +02:00
Corinna Vinschen 054ff18f5f newlib: don't use __visibility__ attrribute on Cygwin
gcc doesn't support visibility attribute on PE/COFF platforms

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2018-08-08 10:50:19 +02:00
Corinna Vinschen 2d87d95f12 newlib: fix various gcc warnings
* unused variables
* potentially used uninitialized
* suggested bracketing
* misleading indentation

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2018-08-08 10:50:19 +02:00
Szabolcs Nagy 81dc535bb9 Remove float compare option from sincosf
PREFER_FLOAT_COMPARISON setting was not correct as it could raise
spurious exceptions.  Fixing it is easy: just use ISLESS(x, y) instead
of abstop12(x) < abstop12(y) with appropriate non-signaling definition
for ISLESS.  However it seems this setting is not very useful (there is
only minor performance difference on various architectures), so remove
this option for now.
2018-07-11 17:16:04 +02:00
Szabolcs Nagy 358f3c61d6 Fix the documentation comments for log_inline in pow
There was a typo and the arguments were not explained clearly.
2018-07-11 17:16:04 +02:00
Szabolcs Nagy 138575c9b9 Fix namespace issues in sinf, cosf and sincosf
Use const sincos_t for clarity instead of making the typedef const.
Use __inv_pi4 and __sincosf_table to avoid namespace issues with
static linking.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy 2805b07fa1 Fix large ulp error in pow without fma very near 1.0
The !HAVE_FAST_FMA code path split r = z/c - 1 into r = rhi + rlo such
that when z = 1-tiny and c = 1 then rlo and rhi could have much larger
magnitude than r which later caused large rounding errors.

So do a nearest rounding instead of truncation at the split.

In newlib with default settings this was observable on some arm targets
that enable the new math code but has no fma.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy 6a85e1a4e5 Change the return type of converttoint and document the semantics
The roundtoint and converttoint internal functions are only called with small
values, so 32 bit result is enough for converttoint and it is a signed int
conversion so the natural return type is int32_t.

The original idea was to help the compiler keeping the result in uint64_t,
then it's clear that no sign extension is needed and there is no accidental
undefined or implementation defined signed int arithmetics.

But it turns out gcc does a good job with inlining so changing the type has
no overhead and the semantics of the conversion is less surprising this way.
Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top
bits were never usable and the existing code ensures that only the bottom
32 bits of the conversion result are used.

In newlib with default settings only aarch64 is affected and there is no
significant code generation change with gcc after the patch.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy 73a3e95ff2 Remove unused TOINT_RINT and TOINT_SHIFT macros
Only have separate code paths for TOINT_INTRINSICS and !TOINT_INTRINSICS.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy 393a1cb4ea Move __HAVE_FAST_FMA to math_config.h
Define it consistently with other HAVE_* macros that only affect code
using math_config.h.  This is also closer to the Arm Optimized Routines
code.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy cbe50607fb Fix code style and comments of new math code
Synchronize code style and comments with Arm Optimized Routines, there
are no code changes in this patch.  This ensures different projects using
the same code have consistent code style so bug fix patches can be applied
more easily.
2018-07-06 10:29:01 +02:00
Szabolcs Nagy b99d49e506 New pow implementation
The new implementation is provided under !__OBSOLETE_MATH, it uses
ISO C99 code.  With default settings the worst case error in nearest
rounding mode is 0.54 ULP with inlined fma and fma contraction.  It uses
a 4 KB lookup table in addition to the table in exp_data.c, on aarch64
.text+.rodata size of libm.a is increased by 2295 bytes.

Improvements on Cortex-A72:
latency: 3.3x
thruput: 4.9x
2018-06-27 15:40:49 +02:00
Szabolcs Nagy 07e2c32828 New log2 implementation
The new implementation is provided under !__OBSOLETE_MATH, it uses
ISO C99 code.  With default settings the worst case error in nearest
rounding mode is 0.547 ULP with inlined fma and fma contraction.  It uses
a 1 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased
by 1584 bytes.

Note that the math.h header defines log2(x) to be log(x)/Ln2, this is
not changed, so the new code is only used if that macro is suppressed.

Improvements on Cortex-A72:
latency: 2.0x
thruput: 2.2x
2018-06-27 15:40:49 +02:00
Szabolcs Nagy e5791079c6 New log implementation
The new implementations are provided under !__OBSOLETE_MATH, it uses
ISO C99 code.  With default settings the worst case error in nearest
rounding mode is 0.519 ULP with inlined fma and fma contraction.  It uses
a 2 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased
by 1703 bytes.  The w_log.c wrapper is disabled since error handling is
inline in the new code.

New __HAVE_FAST_FMA and __HAVE_FAST_FMA_DEFAULT feature macros were
added to enable selecting between the code path that uses fma and the
one that does not.  Targets supposed to set __HAVE_FAST_FMA_DEFAULT
if they have single instruction fma and the compiler can actually
inline it (gcc has __FP_FAST_FMA macro but that does not guarantee
inlining with -fno-builtin-fma).

Improvements on Cortex-A72:
latency: 1.9x
thruput: 2.3x
2018-06-27 15:40:49 +02:00
Szabolcs Nagy fb929067db New exp and exp2 implementations
The new implementations are provided under !__OBSOLETE_MATH, they use
ISO C99 code.  There are several settings, with the default one the
worst case error in nearest rounding mode is 0.509 ULP for exp and
0.507 ULP for exp2 when a multiply and add is contracted into an fma.
They use a shared 2 KB lookup table, on aarch64 .text+.rodata size
of libm.a is increased by 1868 bytes.  The w_*.c wrappers are disabled
for the new code as it takes care of error handling inline.

The old exp2(x) code used to be just pow(2,x) so the speedup there
is more significant.

The file name has no special prefix to avoid any name collision with
existing files.

Improvements on Cortex-A72:
exp latency: 3.2x
exp thruput: 4.1x
exp2 latency: 7.8x
exp2 thruput: 18.8x
2018-06-27 15:40:49 +02:00
Szabolcs Nagy cfbcbd1c95 Use uint32_t sign argument to math error functions
This change is equivalent to the commit
c65db17340
and only affects code that is from the Arm optimized-routines project.

It does not affect the observable behaviour, but the code generation
can be different on 64bit targets.  The intention is to make the
portable semantics of the code obvious by using a fixed size type.
2018-06-27 15:40:49 +02:00
Corinna Vinschen b14daac482 Revert "Remove -fno-builtin to allow gcc to inline functions such as fabs, floor, creal, imag."
This reverts commit c077b9de99.

Yet another accidental commit...
2018-06-26 10:17:04 +02:00
Jon Beniston c077b9de99 Remove -fno-builtin to allow gcc to inline functions such as fabs, floor, creal, imag. 2018-06-25 13:31:51 +02:00
Wilco Dijkstra 3baadb9912 Improve performance of sinf/cosf/sincosf
Here is the correct patch with both filenames and int cast fixed:

This patch is a complete rewrite of sinf, cosf and sincosf.  The new version
is significantly faster, as well as simple and accurate.
The worst-case ULP is 0.56072, maximum relative error is 0.5303p-23 over all
4 billion inputs.  In non-nearest rounding modes the error is 1ULP.

The algorithm uses 3 main cases: small inputs which don't need argument
reduction, small inputs which need a simple range reduction and large inputs
requiring complex range reduction.  The code uses approximate integer
comparisons to quickly decide between these cases - on some targets this may
be slow, so this can be configured to use floating point comparisons.

The small range reducer uses a single reduction step to handle values up to
120.0.  It is fastest on targets which support inlined round instructions.

The large range reducer uses integer arithmetic for simplicity.  It does a
32x96 bit multiply to compute a 64-bit modulo result.  This is more than
accurate enough to handle the worst-case cancellation for values close to
an integer multiple of PI/4.  It could be further optimized, however it is
already much faster than necessary.

Simple benchmark showing speedup factor on AArch64 for various ranges:

range	0.7853982	sinf	1.7	cosf	2.2	sincosf	2.8
range	1.570796	sinf	1.9	cosf	1.9	sincosf	2.7
range	3.141593	sinf	2.0	cosf	2.0	sincosf	3.5
range	6.283185	sinf	2.3	cosf	2.3	sincosf	4.2
range	125.6637	sinf	2.9	cosf	3.0	sincosf	5.1
range	1.1259e15	sinf	26.8	cosf	26.8	sincosf	45.2

ChangeLog:
2018-05-18  Wilco Dijkstra  <wdijkstr@arm.com>

        * newlib/libm/common/Makefile.in: Regenerated.
        * newlib/libm/common/Makefile.am: Add sinf.c, cosf.c, sincosf.c
        sincosf.h, sincosf_data.c. Add -fbuiltin -fno-math-errno to CFLAGS.
        * newlib/libm/common/math_config.h: Add HAVE_FAST_ROUND, HAVE_FAST_LROUND,
        roundtoint, converttoint, force_eval_float, force_eval_double, eval_as_float,
        eval_as_double, likely, unlikely.
        * newlib/libm/common/cosf.c: New file.
        * newlib/libm/common/sinf.c: Likewise.
        * newlib/libm/common/sincosf.h: Likewise.
        * newlib/libm/common/sincosf.c: Likewise.
        * newlib/libm/common/sincosf_data.c: Likewise.
        * newlib/libm/math/sf_cos.c: Add #if to build conditionally.
        * newlib/libm/math/sf_sin.c: Likewise.
        * newlib/libm/math/wf_sincos.c: Likewise.

--
2018-06-21 09:37:04 +02:00
Corinna Vinschen cfe8c6c504 Revert "Improve performance of sinf/cosf/sincosf"
This reverts commit fca80a9d1b.

Accidentally pushed a preliminary version
2018-06-21 09:36:39 +02:00
Jon Beniston b7d9d27b0e libm/common/s_round.c (round): Add cast for 16-bit CPUs 2018-06-21 09:31:13 +02:00
Wilco Dijkstra fca80a9d1b Improve performance of sinf/cosf/sincosf
This patch is a complete rewrite of sinf, cosf and sincosf.  The new version
is significantly faster, as well as simple and accurate.
The worst-case ULP is 0.56072, maximum relative error is 0.5303p-23 over all
4 billion inputs.  In non-nearest rounding modes the error is 1ULP.

The algorithm uses 3 main cases: small inputs which don't need argument
reduction, small inputs which need a simple range reduction and large inputs
requiring complex range reduction.  The code uses approximate integer
comparisons to quickly decide between these cases - on some targets this may
be slow, so this can be configured to use floating point comparisons.

The small range reducer uses a single reduction step to handle values up to
120.0.  It is fastest on targets which support inlined round instructions.

The large range reducer uses integer arithmetic for simplicity.  It does a
32x96 bit multiply to compute a 64-bit modulo result.  This is more than
accurate enough to handle the worst-case cancellation for values close to
an integer multiple of PI/4.  It could be further optimized, however it is
already much faster than necessary.

Simple benchmark showing speedup factor on AArch64 for various ranges:

range	0.7853982	sinf	1.7	cosf	2.2	sincosf	2.8
range	1.570796	sinf	1.9	cosf	1.9	sincosf	2.7
range	3.141593	sinf	2.0	cosf	2.0	sincosf	3.5
range	6.283185	sinf	2.3	cosf	2.3	sincosf	4.2
range	125.6637	sinf	2.9	cosf	3.0	sincosf	5.1
range	1.1259e15	sinf	26.8	cosf	26.8	sincosf	45.2

ChangeLog:
2018-06-18  Wilco Dijkstra  <wdijkstr@arm.com>

        * newlib/libm/common/Makefile.in: Regenerated.
        * newlib/libm/common/Makefile.am: Add sinf.c, cosf.c, sincosf.c
        sincosf.h, sincosf_data.c. Add -fbuiltin -fno-math-errno to CFLAGS.
        * newlib/libm/common/math_config.h: Add HAVE_FAST_ROUND, HAVE_FAST_LROUND,
        roundtoint, converttoint, force_eval_float, force_eval_double, eval_as_float,
        eval_as_double, likely, unlikely.
        * newlib/libm/common/cosf.c: New file.
        * newlib/libm/common/sinf.c: Likewise.
        * newlib/libm/common/sincosf.h: Likewise.
        * newlib/libm/common/sincosf.c: Likewise.
        * newlib/libm/common/sincosf_data.c: Likewise.
        * newlib/libm/math/sf_cos.c: Add #if to build conditionally.
        * newlib/libm/math/sf_sin.c: Likewise.
        * newlib/libm/math/wf_sincos.c: Likewise.

--
2018-06-19 09:44:28 +02:00
Matthias Kannwischer fcfea0ae2d fix llrint and lrint for 52 <= exponent <= 62 2018-05-29 15:59:48 +02:00
Jeff Johnston e928275566 Use _LDBL_EQ_DBL in nexttowardf.c
2018-05-07  Tom de Vries  <tom@codesourcery.com>

	* libm/common/nexttowardf.c: Use _LDBL_EQ_DBL instead of
	_LDBL_EQ_DOUBLE.
2018-05-07 12:22:12 -04:00
Jeff Johnston cd31fbb2ae Add nvptx port.
- From: Cesar Philippidis <cesar@codesourcery.com>
  Date: Tue, 10 Apr 2018 14:43:42 -0700
  Subject: [PATCH] nvptx port

  This port adds support for Nvidia GPU's, which are primarily used as
  offload accelerators in OpenACC and OpenMP.
2018-04-13 15:42:37 -04:00
Jeff Johnston fffd2770db Bump release to 3.0.0 for yearly snapshot
- major release required due to removal of K&R support
2018-01-18 13:07:45 -05:00
Yaakov Selkowitz 7192f84096 ansification: remove _HAVE_STDC
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:30 -06:00
Yaakov Selkowitz 70ee6b17df ansification: remove _EXFUN, _EXFUN_NOTHROW
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:29 -06:00
Yaakov Selkowitz 9087163804 ansification: remove _DEFUN
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:26 -06:00
Yaakov Selkowitz fff27f8429 ansification: remove _DEFUN_VOID
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:19 -06:00
Yaakov Selkowitz 0bda30e1ff ansification: remove _CONST
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:08 -06:00
Yaakov Selkowitz 6783860a2e ansification: remove _AND
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:05 -06:00
Jim Wilson c874f1145f newlib: Don't do double divide in powf.
* Use 0.0f instead of 0.0 in divide.
2017-12-13 11:33:19 +01:00