This isn't strictly necessary, but it makes for much clearer logs as
to what the target is doing, and provides cache vars for anyone who
wants to force the test a different way, and it lets the build cache
its own results when rerunning config.status.
Restore the call to AC_NO_EXECUTABLES -- I naively assumed in commit
2e9aa5f56c ("newlib: update preprocessor
configure checks") that checking for a preprocessor would not involve
linking code. Unfortunately, autoconf will implicitly check that the
compiler "works" before allowing it to be used, and that involves a
link test, and that fails because newlib provides the C library which
is needed to pass a link test.
There is some code in NEWLIB_CONFIGURE specifically to help mitigate
these, but it's not kicking in here for some reason, so let's just add
the AC_NO_EXECUTABLES call back until we can unwind that custom logic.
Additionally, we have to call AC_PROG_CPP explicitly. This was being
invoked later on, but only in the use_libtool=yes codepath, and that
is almost never enabled.
When we had configure scripts in subdirs, the newlib_basedir value
was computed relative to that, and it'd be the same when used in the
Makefile in the same dir. With many subdir configure scripts removed,
the top-level configure & Makefile can't use the same relative path.
So switch the subdir Makefiles over to abs_newlib_basedir when they
use -I to find source headers.
Do this for all subdirs, even ones with configure scripts and where
newlib_basedir works. This makes the code consistent, and avoids
surprises if the configure script is ever removed in the future as
part of merging to the higher level.
Some of the subdirs were using -I$(newlib_basedir)/../newlib/ for
some reason. Collapse those too since newlib_basedir points to the
newlib source tree already.
When using the top-level configure script but subdir Makefiles, the
newlib_basedir value gets a bit out of sync: it's relative to where
configure lives, not where the Makefile lives. Move the abs setting
from the top-level configure script into acinclude.m4 so we can rely
on it being available everywhere. Although this commit doesn't use
it anywhere, just lays the groundwork.
The machine configure scripts are all effectively stub scripts that
pass the higher level options to its own makefile. The only one doing
any custom tests was nds32. The rest were all effectively the same as
the libm/ configure script.
So instead of recursively running configure in all of these subdirs,
generate their makefiles from the top-level configure. For nds32,
deploy a pattern of including subdir logic via m4:
m4_include([machine/nds32/acinclude.m4])
Even its set of checks are very small -- it does 2 preprocessor tests
and sets up 2 makefile conditionals.
Some of the generated machine makefiles have a bunch of extra stuff
added to them, but that's because they were inconsistent in their
configure libtool calls. The top-level has it, so it exports some
new vars to the ones that weren't already.
The machine/{configure,Makefile} files exist only to fan out to the
specific machine/$arch/ subdir. We already have all that same info
in the libm/ dir itself, so by moving the recursive configure and
make calls into it, we can cut off this logic entirely and save the
overhead.
For arches that don't have a machine subdir, it means they can skip
the logic entirely.
The nds32 & spu dirs are using compile tests to look for some
preprocessor defines, but we don't need to compile the code,
just preprocess it. So switch to AC_PREPROC_IFELSE.
The sh dir is using a preprocessor test via grep, but let's
switch it to AC_PREPROC_IFELSE too to be consistent.
This should allow us to drop the uncommon AC_NO_EXECUTABLES call.
It's unclear why this was added originally, but assuming it was needed
20 years ago, it shouldn't be explicitly required nowadays. Current
versions of autotools already take care of exporting LDFLAGS to the
Makefile as needed (things are actually getting linked). That's why
the configure diffs show LDFLAGS still here, but shifted to a diff
place in the output list. A few dirs stop exporting LDFLAGS, but
that's because they don't do any linking, only compiling, so it's
correct.
As for the use of $ldflags instead of the standard $LDFLAGS, I can't
really explain that at all. Just use the right name so users don't
have to dig into why their setting isn't respected, and then use a
non-standard name instead. Adjust the testsuite to match.
Now that we require a recent version of autoconf, we can rely on this
macro working. We shift the call in configure.ac down a little to
help keep the generated diff minimal -- there should be no functional
difference otherwise. This is because the autoconf macros will call
a bunch of standard toolchain macros first, and arguably the current
code is incorrect in how it does its testing.
Since AM_INIT_AUTOMAKE calls AC_PROG_AWK, and some configure.ac
scripts call it too, we end up testing for awk multiple times. If
we change NEWLIB_CONFIGURE to require the macro instead, then it
makes sure it's always expanded, but only once.
While we're here, do the same thing with AC_PROG_INSTALL since it
is also called by AM_INIT_AUTOMAKE, although it doesn't currently
result in duplicate configure checks.
The AC_LIBTOOL_WIN32_DLL macro has been deprecated for a while and code
should call LT_INIT with win32-dll instead. Update the calls to match.
The generated code is noisy not because of substantial differences, but
because the order of some macros change (i.e. instead of calling AS and
then CC, CC is called first and then AS).
Since automake already sets per-library CCASFLAGS to $(AM_CCASFLAGS)
by default, there's no need to explicitly set it here.
Many of these dirs don't have .S files in the first place, so the rule
doesn't even do anything. That can easily be seen when Makefile.in has
no changes as a result.
For the dirs with .S files, the custom rules are the same as the pattern
.S.o rules, so this is a nice cleanup.
The only dir that was adding extra flags (newlib/libc/machine/mn10300/)
to the per-library setting can have it moved to the global AM_CCASFLAGS
since the subdir only has one target. Although the setting just adds
extra debugging flags, so maybe it should be deleted in general.
There are a few dirs that we leave the redundant setting in place. This
is to workaround an automake limitation in subdirs that support building
with & w/out libtool:
https://www.gnu.org/software/automake/manual/html_node/Objects-created-both-with-libtool-and-without.html
This matches what the other GNU toolchain projects have done already.
The generated diff in practice isn't terribly large. This will allow
more use of subdir local.mk includes due to fixes & improvements that
came after the 1.11 release series.
The newlib & libgloss dirs are already generated using autoconf-2.69.
To avoid merging new code and/or accidental regeneration using diff
versions, leverage config/override.m4 to pin to 2.69 exactly. This
matches what gcc/binutils/gdb are already doing.
The README file already says to use autoconf-2.69.
To accomplish this, it's just as simple as adding -I flags to the
top-level config/ dir when running aclocal. This is because the
override.m4 file overrides AC_INIT to first require the specific
autoconf version before calling the real AC_INIT.
Libtool needs to get BSD-format (or MS-format) output out of the system
nm, so that it can scan generated object files for symbol names for
-export-symbols-regex support. Some nms need specific flags to turn on
BSD-formatted output, so libtool checks for this in its AC_PATH_NM.
Unfortunately the code to do this has a pair of interlocking flaws:
- it runs the test by doing an nm of /dev/null. Some platforms
reasonably refuse to do an nm on a device file, but before now this
has only been worked around by assuming that the error message has a
specific textual form emitted by Tru64 nm, and that getting this
error means this is Tru64 nm and that nm -B would work to produce
BSD-format output, even though the test never actually got anything
but an error message out of nm -B. This is fixable by nm'ing *nm
itself* (since we necessarily have a path to it).
- the test is entirely skipped if NM is set in the environment, on the
grounds that the user has overridden the test: but the user cannot
reasonably be expected to know that libtool wants not only nm but
also flags forcing BSD-format output. Worse yet, one such "user" is
the top-level Cygnus configure script, which neither tests for
nor specifies any BSD-format flags. So platforms needing BSD-format
flags always fail to set them when run in a Cygnus tree, breaking
-export-symbols-regex on such platforms. Libtool also needs to
augment $LD on some platforms, but this is done unconditionally,
augmenting whatever the user specified: the nm check should do the
same.
One wrinkle: if the user has overridden $NM, a path might have been
provided: so we use the user-specified path if there was one, and
otherwise do the path search as usual. (If the nm specified doesn't
work, this might lead to a few extra pointless path searches -- but
the test is going to fail anyway, so that's not a problem.)
(Tested with NM unset, and set to nm, /usr/bin/nm, my-nm where my-nm is a
symlink to /usr/bin/nm on the PATH, and /not-on-the-path/my-nm where
*that* is a symlink to /usr/bin/nm.)
ChangeLog
2021-09-27 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27967
* libtool.m4 (LT_PATH_NM): Try BSDization flags with a user-provided
NM, if there is one. Run nm on itself, not on /dev/null, to avoid
errors from nms that refuse to work on non-regular files. Remove
other workarounds for this problem. Strip out blank lines from the
nm output.
This reports common symbols like GNU nm, via a type code of 'C'.
ChangeLog
2021-09-27 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27967
* libtool.m4 (lt_cv_sys_global_symbol_pipe): Augment symcode for
Solaris 11.
AR from older binutils doesn't work with --plugin and rc:
[hjl@gnu-cfl-2 bin]$ touch foo.c
[hjl@gnu-cfl-2 bin]$ ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/10/liblto_plugin.so rc libfoo.a foo.c
[hjl@gnu-cfl-2 bin]$ ./ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/10/liblto_plugin.so rc libfoo.a foo.c
./ar: no operation specified
[hjl@gnu-cfl-2 bin]$ ./ar --version
GNU ar (Linux/GNU Binutils) 2.29.51.0.1.20180112
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
[hjl@gnu-cfl-2 bin]$
Check if AR works with --plugin and rc before passing --plugin to AR and
RANLIB.
PR ld/27173
* configure: Regenerated.
* libtool.m4 (_LT_CMD_OLD_ARCHIVE): Check if AR works with
--plugin and rc before enabling --plugin.
config/
PR ld/27173
* gcc-plugin.m4 (GCC_PLUGIN_OPTION): Check if AR works with
--plugin and rc before enabling --plugin.
libiberty/
PR ld/27173
* configure: Regenerated.
zlib/
PR ld/27173
* configure: Regenerated.
The configure scripts were regenerated with 2.69 for the newlib-4.2.0
release in 484d2ebf8d, but the aclocal
files were not. Do that now to avoid confusion between the two as to
which version of autoconf was used.
Since automake deprecated the INCLUDES name in favor of AM_CPPFLAGS,
change all existing users over. The generated code is the same since
the two variables have been used in the same exact places by design.
There are other cleanups to be done, but lets focus on just renaming
here so we can upgrade to a newer automake version w/out triggering
new warnings.
The 'cygnus' option was removed from automake 1.13 in 2012, so the
presence of this option prevents that or a later version of automake
being used.
A check-list of the effects of '--cygnus' from the automake 1.12
documentation, and steps taken (where possible) to preserve those
effects (See also this thread [1] for discussion on that):
[1] https://lists.gnu.org/archive/html/bug-automake/2012-03/msg00048.html
1. The foreign strictness is implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
2. The options no-installinfo, no-dependencies and no-dist are implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
Future work: Remove no-dependencies and any explicit header dependencies,
and use automatic dependency tracking instead. Are there explicit rules
which are now redundant to removing no-installinfo and no-dist?
3. The macro AM_MAINTAINER_MODE is required.
Already present in newlib/acinclude.m4
Note that maintainer-mode is still disabled by default.
4. Info files are always created in the build directory, and not in the
source directory.
This appears to be an error in the automake documentation describing
'--cygnus' [2]. newlib's info files are generated in the source
directory, and no special steps are needed to keep doing that.
[2] https://lists.gnu.org/archive/html/bug-automake/2012-04/msg00028.html
5. texinfo.tex is not required if a Texinfo source file is specified.
(The assumption is that the file will be supplied, but in a place that
automake cannot find.)
This effect is overriden by an explicit setting of the TEXINFO_TEX
variable (the directory part of which is fed into texi2X via the
TEXINPUTS environment variable).
6. Certain tools will be searched for in the build tree as well as in the
user's PATH. These tools are runtest, expect, makeinfo and texi2dvi.
For obscure automake reasons, this effect of '--cygnus' is not active
for makeinfo in newlib's configury.
However, there appears to be top-level configury which selects in-tree
runtest, expect and makeinfo, if present. So, if that works as it
appears, this effect is preserved. If not, this may cause problem if
anyone is building those tools in-tree.
This effect is not preserved for texi2dvi. This may cause problems if
anyone is building texinfo in-tree.
If needed, explicit checks for those tools looking in places relative to
$(top_srcdir)/../ as well as in PATH could be added.
7. The check target doesn't depend on all.
This effect is not preseved. The check target now depends on the all
target.
This concern seems somewhat academic given the current state of the
testsuite.
Also note that this doesn't touch libgloss.
Add all the effects of 'cygnus' for which there exists an explicit way
to request that behaviour:
* Implied foreign strictness and options no-installinfo, no-dependencies
and no-dist are added to AM_INIT_AUTOMAKE in newlib/acinclude.m4.
* macro AM_MAINTAINER_MODE is added to newlib/acinclude.m4.
* For the implied TEXINFO_TEX of '$(top_srcdir)/../texinfo/texinfo.tex',
an explicit TEXINFO_TEX is always relative to $(srcdir), so write the
same pathname in that form.
This is to prepare for the removal of the automake option '--cygnus'.
- Currently, frexpl() supports only the following cases.
1) LDBL_MANT_DIG == 64 or 113
2) 'long double' is equivalent to 'double'
This patch add support for LDBL_MANT_DIG == 53.
The code accessing the floating point control/status register, namely
#define __cfc1(__fcsr) __asm __volatile("cfc1 %0, $31" : "=r" (__fcsr)
does not compile with mips16. This changed the makefile to pass -mno-mips16 to avoid the following
compiler error:
mips-mti-elf fails with "Error: unrecognized opcode `cfc1 $3,$31'"
- Currently, printf("%La\n", 1e1000L) crashes with segv due to lack
of frexpl() function. With this patch, frexpl() function has been
implemented in libm to solve this issue.
Addresses: https://sourceware.org/pipermail/newlib/2021/018718.html
libm/machine/i386/f_ldexp.S:30: Warning: no instruction mnemonic suffix given and no register operands; using default for `fild'
libm/machine/i386/f_ldexpf.S:30: Warning: no instruction mnemonic suffix given and no register operands; using default for `fild'
fix this by adding the l mnemonic suffix
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
cc Aldy Hernandez <aldyh@redhat.com> and Andrew MacLeod <amacleod@redhat.com>,
they are author of new VRP analysis for GCC, just to make sure I didn't
mis-understanding or mis-interpreting anything on GCC site.
GCC 11 have better value range analysis, that give GCC more confidence
to perform more aggressive optimization, but it cause scalbn/scalbnf get
wrong result.
Using scalbn to demostrate what happened on GCC 11, see comments with VRP
prefix:
```c
double scalbn (double x, int n)
{
/* VRP RESULT: n = [-INF, +INF] */
__int32_t k,hx,lx;
...
k = (hx&0x7ff00000)>>20;
/* VRP RESULT: k = [0, 2047] */
if (k==0) {
/* VRP RESULT: k = 0 */
...
k = ((hx&0x7ff00000)>>20) - 54;
if (n< -50000) return tiny*x; /*underflow*/
/* VRP RESULT: k = -54 */
}
/* VRP RESULT: k = [-54, 2047] */
if (k==0x7ff) return x+x; /* NaN or Inf */
/* VRP RESULT: k = [-54, 2046] */
k = k+n;
if (k > 0x7fe) return huge*copysign(huge,x); /* overflow */
/* VRP RESULT: k = [-INF, 2046] */
/* VRP RESULT: n = [-INF, 2100],
because k + n <= 0x7fe is false, so:
1. -INF < [-54, 2046] + n <= 0x7fe(2046) < INF
2. -INF < [-54, 2046] + n <= 2046 < INF
3. -INF < n <= 2046 - [-54, 2046] < INF
4. -INF < n <= [0, 2100] < INF
5. n = [-INF, 2100] */
if (k > 0) /* normal result */
{SET_HIGH_WORD(x,(hx&0x800fffff)|(k<<20)); return x;}
if (k <= -54) {
/* VRP OPT: Evaluate n > 50000 as true...*/
if (n > 50000) /* in case integer overflow in n+k */
return huge*copysign(huge,x); /*overflow*/
else return tiny*copysign(tiny,x); /*underflow*/
}
k += 54; /* subnormal result */
SET_HIGH_WORD(x,(hx&0x800fffff)|(k<<20));
return x*twom54;
}
```
However give the input n = INT32_MAX, k = k+n will overflow, and then we
expect got `huge*copysign(huge,x)`, but new VRP optimization think
`n > 50000` is never be true, so optimize that into `tiny*copysign(tiny,x)`.
so the solution here is to moving the overflow handle logic before `k = k + n`.
- compiler is sometimes optimizing out the rounding check in
e_sqrt.c and ef_sqrt.c which uses two constants to create
an inexact operation
- there is a similar constant operation in s_tanh.c/sf_tanh.c
- make the one and tiny constants volatile to stop this
Use the more official fesetenv(FE_DFL_ENV) from _dll_crt0, thus
allowing to drop the _feinitialise declaration from fenv.h.
Provide a no-op _feinitialise in Cygwin as exportable symbol for really
old applications when _feinitialise was called from mainCRTStartup in
crt0.o.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Drop the Cygwin-specific fenv.cc and fenv.h file and use the equivalent
newlib functionality now, so we have at least one example of a user for
this new mechanism.
fenv.c: allow _feinitialise to be called from Cygwin startup code
fenv.h: add declarations for fegetprec and fesetprec for Cygwin only.
Fix a comment.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
So far the build mechanism in newlib only allowed to either define
machine-specific headers, or headers shared between all machines.
In some cases, architectures are sufficiently alike to share header
files between them, but not with other architectures. A good example
is ix86 vs. x86_64, which share certain traits with each other, but
not with other architectures.
Introduce a new configure variable called "shared_machine_dir". This
dir can then be used for headers shared between architectures.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This Patch removes Soft Float code from MIPS.
Instead It adds the soft float code from RISCV
The code came from FreeBSD and assumes the FreeBSD softfp
implementation not the one with GCC. That was an overlooked and
fixed in the other fenv code already.
Signed-off-by: Eshan Dhawan <eshandhawan51@gmail.com>
This patch fixes the error found by Paul Zimmermann (see
https://homepages.loria.fr/PZimmermann/papers/#accuracy) regarding x
close to 1 and rather large y (specifically he found the case
powf(0x1.ffffeep-1,-0x1.000002p+27) which returns +Inf instead of the
correct value). We found 2 more values for x which show the same faulty
behaviour, and all 3 are fixed with this patch. We have tested all
combinations for x in [+1.fffdfp-1, +1.00020p+0] and y in
[-1.000007p+27, -1.000002p+27] and [1.000002p+27,1.000007p+27].
The current gamma, gamma_r, gammaf and gammaf_r functions return
|gamma(x)| instead of ln(|gamma(x)|) due to a change made back in 2002
to the __ieee754_gamma_r implementation. This patch fixes that, making
all of these functions map too their lgamma equivalents.
To fix the underlying bug, the __ieee754_gamma functions have been
changed to return gamma(x), removing the _r variants as those are no
longer necessary. Their names have been changed to __ieee754_tgamma to
avoid potential confusion from users.
Now that the __ieee754_tgamma functions return the correctly signed
value, the tgamma functions have been modified to use them.
libm.a now exposes the following gamma functions:
ln(|gamma(x)|):
__ieee754_lgamma_r
__ieee754_lgammaf_r
lgamma
lgamma_r
gamma
gamma_r
lgammaf
lgammaf_r
gammaf
gammaf_r
lgammal (on machines where long double is double)
gamma(x):
__ieee754_tgamma
__ieee754_tgammaf
tgamma
tgammaf
tgammal (on machines where long double is double)
Additional aliases for any of the above functions can be added if
necessary; in particular, I'm not sure if we need to include
__ieee754_gamma*_r functions (which would return ln(|(gamma(x)|).
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
Switch commit message to ASCII
For RISC-V targets without hardware FMA support, include the
common fma implementation to provide that API.
Signed-off-by: Keith Packard <keithp@keithp.com>
Like ARM, some RISC-V implementations have hardware sqrt. Support for
that can be detected at compile time, which the code did. However, the
filenames were incorrect so that both the risc-v specific and general
code were getting included in the resulting library.
Fix this by following the ARM model and #include'ing the general code
when the architecture-specific support is not available.
Signed-off-by: Keith Packard <keithp@keithp.com>
This is required to avoid colliding with files built from libm/common
that would end up with the same object name.
When libm.a was constructed from the individual sub-libraries, the
contents of the libm/common files would be replaced by that from
libm/machine/arm with the same name.
Signed-off-by: Keith Packard <keithp@keithp.com>
When HAVE_FAST_FMAF is set, use the vfma.f32 instruction, when
HAVE_FAST_FMA is set, use the vfma.f64 instruction.
Usually the compiler built-ins will already have inlined these
instructions, but provide these symbols for cases where that doesn't
work instead of falling back to the (inaccurate) common code versions.
Signed-off-by: Keith Packard <keithp@keithp.com>
Anything with fast FMA is assumed to have fast FMAF, along with
32-bit arms that advertise 32-bit FP support and __ARM_FEATURE_FMA
Signed-off-by: Keith Packard <keithp@keithp.com>
32-bit ARM processors with HW float (but not HW double) may define
__ARM_FEATURE_FMA, but that only means they have fast FMA for 32-bit
floats.
Signed-off-by: Keith Packard <keithp@keithp.com>
It was calling __math_uflow(0) instead of __math_uflowf(0), which
resulted in no exception being set on machines with exception support
for float but not double.
Signed-off-by: Keith Packard <keithp@keithp.com>
This removes the run-time configuration of errno support present in
portions of the math library and unifies all of the compile-time errno
configuration under a single parameter so that the whole library
is consistent.
The run-time support provided by _LIB_VERSION is no longer present in
the public API, although it is still used internally to disable errno
setting in some functions. Now that it is a constant, the compiler should
remove that code when errno is not supported.
This removes s_lib_ver.c as _LIB_VERSION is no longer variable.
Signed-off-by: Keith Packard <keithp@keithp.com>
The __ieee754 functions already return the right value in exception
cases, so don't modify those. Setting the library to _POSIX_/_IEEE_
mode now only affects whether errno is modified.
Signed-off-by: Keith Packard <keithp@keithp.com>
The y0, y1 and yn functions need separate conditions when x is zero as
that returns ERANGE instead of EDOM.
Also stop adjusting the return value from the __ieee754_y* functions
as that is already correct and we were just breaking it.
Signed-off-by: Keith Packard <keithp@keithp.com>
_IEEE_LIBM is the configuration value which controls whether the
original libm functions modify errno. Use that in the new math code as
well so that the resulting library is internally consistent.
Signed-off-by: Keith Packard <keithp@keithp.com>
C compilers may fold const values at compile time, so expressions
which try to elicit underflow/overflow by performing simple
arithemetic on suitable values will not generate the required
exceptions.
Work around this by replacing code which does these arithmetic
operations with calls to the existing __math_xflow functions that are
designed to do this correctly.
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
libm/math: Pass sign to __math_xflow instead of muliplying result
ld: libm.a(lib_a-fesetenv.o): in function `fesetenv':
newlib/libm/machine/arm/fesetenv.c:38: undefined reference to `vmsr_fpscr'
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Use the already existing stub files if possible. These files are
necessary to override the stub implementation with the machine-specific
implementation through the build system.
Reviewed-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
The previous fenv support for ARM used the soft-float implementation of
FreeBSD. Newlib uses the one from libgcc by default. They are not
compatible. Having an GCC incompatible soft-float fenv support in
Newlib makes no sense. A long-term solution could be to provide a
libgcc compatible soft-float support. This likely requires changes in
the GCC configuration. For now, provide a stub implementation for
soft-float multilibs similar to RISC-V.
Move implementation to one file and delete now unused files. Hide
implementation details. Remove function parameter names from header
file to avoid name conflicts.
Provide VFP support if __SOFTFP__ is not defined like glibc.
Reviewed-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
The IEEE spec for pow only has special case for x**0 and 1**y when x/y
are quiet NaN. For signaling NaN, the general case applies and these functions
should signal the invalid exception and return a quiet NaN.
Signed-off-by: Keith Packard <keithp@keithp.com>
These functions shared a pattern of re-converting the argument to bits
when returning +/-0. Skip that as the initial conversion still has the
sign bit.
Signed-off-by: Keith Packard <keithp@keithp.com>
Recent GCC appears to elide multiplication by 1, which causes snan
parameters to be returned unchanged through *iptr. Use the existing
conversion of snan to qnan to also set the correct result in *iptr
instead.
Signed-off-by: Keith Packard <keithp@keithp.com>
This fix comes from glibc, from files which originated from
the same place as the newlib files. Those files in glibc carry
the same license as the newlib files.
Bug 14155 is spurious underflow exceptions from Bessel functions for
large arguments. (The correct results for large x are roughly
constant * sin or cos (x + constant) / sqrt (x), so no underflow
exceptions should occur based on the final result.)
There are various places underflows may occur in the intermediate
calculations that cause the failures listed in that bug. This patch
fixes problems for the double version where underflows occur in
calculating the intermediate functions P and Q (in particular, x**-12
gets computed while calculating Q). Appropriate approximations are
used for P and Q for arguments at least 0x1p28 and above to avoid the
underflows.
For sufficiently large x - 0x1p129 and above - the code already has a
cut-off to avoid calculating P and Q at all, which means the
approximations -0.125 / x and 0.375 / x can't themselves cause
underflows calculating Q. This cut-off is heuristically reasonable
for the point beyond which Q can be neglected (based on expecting
around 0x1p-64 to be the least absolute value of sin or cos for large
arguments representable in double).
The float versions use a cut-off 0x1p17, which is less heuristically
justifiable but should still only affect values near zeroes of the
Bessel functions where these implementations are intrinsically
inaccurate anyway (bugs 14469-14472), and should serve to avoid
underflows (the float underflow for jn in bug 14155 probably comes
from the recurrence to compute jn). ldbl-96 uses 0x1p129, which may
not really be enough heuristically (0x1p143 or so might be safer - 143
= 64 + 79, number of mantissa bits plus total number of significant
bits in representation) but again should avoid underflows and only
affect values where the code is substantially inaccurate anyway.
ldbl-128 and ldbl-128ibm share a completely different implementation
with no such cut-off, which I propose to fix separately.
Signed-off-by: Keith Packard <keithp@keithp.com>
Add the missing mask for the decomposition of hi+lo which caused some
errors of 1-2 ULP.
This change is taken over from FreeBSD:
95436ce20d
Additionally I've removed some variable assignments which were never
read before being overwritten again in the next 2 lines.
This fix for k_tan.c is a copy from fdlibm version 5.3 (see also
http://www.netlib.org/fdlibm/readme), adjusted to use the macros
available in newlib (SET_LOW_WORD).
This fix reduces the ULP error of the value shown in the fdlibm readme
(tan(1.7765241907548024E+269)) to 0.45 (thereby reducing the error by
1).
This issue only happens for large numbers that get reduced by the range
reduction to a value smaller in magnitude than 2^-28, that is also
reduced an uneven number of times. This seems rather unlikely given that
one ULP is (much) larger than 2^-28 for the values that may cause an
issue. Although given the sheer number of values a double can
represent, it is still possible that there are more affected values,
finding them however will be quite hard, if not impossible.
We also took a look at how another library (libm in FreeBSD) handles the
issue: In FreeBSD the complete if branch which checks for values smaller
than 2^-28 (or rather 2^-27, another change done by FreeBSD) is moved
out of the kernel function and into the external function. This means
that the value that gets checked for this condition is the unreduced
value. Therefore the input value which caused a problem in the
fdlibm/newlib kernel tan will run through the full polynomial, including
the careful calculation of -1/(x+r). So the difference is really whether
r or y is used. r = y + p with p being the result of the polynomial with
1/3*x^3 being the largest (and magnitude defining) value. With x being
<2^-27 we therefore know that p is smaller than y (y has to be at least
the size of the value of x last mantissa bit divided by 2, which is at
least x*2^-51 for doubles) by enough to warrant saying that r ~ y. So
we can conclude that the general implementation of this special case is
the same, FreeBSD simply has a different philosophy on when to handle
especially small numbers.
Make line 47 in sf_trunc.c reachable. While converting the double
precision function trunc to the single precision version truncf an error
was introduced into the special case. This special case is meant to
catch both NaNs and infinities, however qNaNs and infinities work just
fine with the simple return of x (line 51). The only error occurs for
sNaNs where the same sNaN is returned and no invalid exception is
raised.
The comparison c == FP_INFINITE causes the function to return +inf as it
expects x = +inf to always be larger than y. This shortcut causes
several issues as it also returns +inf for the following cases:
- fdim(+inf, +inf), expected (as per C99): +0.0
- fdim(-inf, any non NaN), expected: +0.0
I don't see a reason to keep the comparison as all the infinity cases
return the correct result using just the ternary operation.
While testing the exp function we noticed some errors at the specified
magnitude. Within this range the exp function returns the input value +1
as an output. We chose to run a test of 1m exponentially spaced values
in the ranges [-2^-27,-2^-32] and [2^-32,2^-27] which showed 7603 and
3912 results with an error of >=0.5 ULP (compared with MPFR in 128 bit)
with the highest being 0.56 ULP and 0.53 ULP.
It's easy to fix by changing the magnitude at which the input value +1
is returned from <2^-28 to <2^-32 and using the polynomial instead. This
reduces the number of results with an error of >=0.5 ULP to 485 and 479
in above tests, all of which are exactly 0.5 ULP.
As we were already checking on exp we also took a look at expf. For expf
the magnitude where the input value +1 is returned can be increased from
<2^-28 to <2^-23 without accuracy loss for a slight performance
improvement. To ensure this was the correct value we tested all values
in the ranges [-2^-17,-2^-28] and [2^-28,2^-17] (~92.3m values each).
The single-precision trigonometric functions show rather high errors in
specific ranges starting at about 30000 radians. For example the sinf
procedure produces an error of 7626.55 ULP with the input
5.195880078125e+04 (0x474AF6CD) (compared with MPFR in 128bit
precision). For the test we used 100k values evenly spaced in the range
of [30k, 70k]. The issues are periodic at higher ranges.
This error was introduced when the double precision range reduction was
first converted to float. The shift by 8 bits always returns 0 as iq is
never higher than 255.
The fix reduces the error of the example above to 0.45 ULP, highest
error within the test set fell to 1.31 ULP, which is not perfect, but
still a significant improvement. Testing other previously erroneous
ranges no longer show particularly large accuracy errors.