E.g. known errors:
x = 0x1.000002c5e2e99p+0, y = 0x1.c9eee35374af6p+31 had an error of 639
ULP and is now correctly rounded.
x = 0x1.fffffd2e3e669p-1, y = 0x1.344c9823eb66cp+32 had an error of 428
ULP and is now correctly rounded.
__loadlocale never sets errno, but newlocale is supposed to
return ENOENT if the locale isn't valid.
Fixes: aefd8b5b51 ("Implement newlocale, freelocale, duplocale, uselocale")
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
If a CPU implements EL2 as its highest exception level then programs
using newlib may start in hypervisor mode. In that state it is not
trivial to switch into the various EL1 modes to configure the
individual exception stacks, so do not try.
The existing code checked if there was a chunk in free_list and
that the tail was not the next chunk.
The check if there is a chunk is not needed since it's already
known but the case of a single chunk in free_list needs to be
handled differently.
Commit 89eb4bce15 was pretty half-hearted, missing
the codepage character type tables and wctomb/mbtowc
mappings.
Fixes: 89eb4bce15 ("Cygwin: support KOI8-T codeset")
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Since Windows Vista, locale handling is converted from using numeric
locale identifiers (LCID) to using ISO5646 locale strings. In the
meantime Windows introduced new locales which don't even have a LCID
attached. Those were unusable in Cygwin because locale information
for these locales required to call the new locale functions taking
a locale string.
Convert Cygwin to drop LCIDs and use Windows ISO5646 locales instead.
The last place using LCIDs is the __set_charset_from_locale function.
Checking numerically is easier and uslay faster than checking strings.
However, this function is clearly a TODO
Used on Linux as default codeset for Tajik. There's no matching
Windows codepage, so fake it as CP103.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Set __OBSOLETE_MATH_DEFAULT to 0 if 'd' extension is supported (i.e.
__riscv_flen == 64).
Base on the comment for __OBSOLETE_MATH_DEFAULT:
> ... it assumes that the toolchain has ISO C99 support (hexfloat
> literals, standard fenv semantics), the target has IEEE-754 conforming
> binary32 float and binary64 double (not mixed endian) representation,
> standard SNaN representation, double and single precision arithmetics
> has similar latency and it has no legacy SVID matherr support, only
> POSIX errno and fenv exception based error handling.
Signed-off-by: Hau Hsu <hau.hsu@sifive.com>
This patch is for the sake of gnulib.
gnulib implements some form of a thread-safe setlocale variant
called setlocale_null_r, which is supposed to return the locale
strings in a thread-safe manner. This only succeeds if the system's
setlocale already handles this thread-safe, otherwise gnulib adds
some locking on its own.
Newlib's setlocale always writes the global string array holding the
LC_ALL value anew on each invocation of setlocale(LC_ALL, NULL).
Since that doesn't allow to call setlocale(LC_ALL, NULL) in a
thread-safe manner, so locking in gnulib is required.
And here's the problem...
The lock is decorated as dllexport when building for Cygwin. This
collides with the default behaviour of ld to export all symbols.
If it finds one decorated symbol, it will only export this symbol
to the DLL import lib.
Change setlocale so that it writes the global string array
holding the LC_ALL value at the time the locale gets changed.
On setlocale(LC_ALL, NULL), just return the pointer to the
global LC_ALL string array, just as in GLibc. The burden of
doing so is negligibly for all targets, but adds thread-safety
for gnulib's setlocal_null_r() function, and gnulib can drop
the lock entirely when building for Cygwin.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
When compiling Newlib for arm targets with GCC 12.1 onward, the
passing of architecture extension information to the assembler is
automatic, making the use of .fpu and .arch_extension directives
in assembly files redundant.
With older versions of GCC, however, these directives must be
hard-coded into the `arm/setjmp.S' file to allow the assembly of
instructions concerning the storage and subsequent reloading of the
floating point registers to/from the jump buffer, respectively.
This patch conditionally adds the `.fpu vfpxd' and `.arch_extension
mve' directives based on compile-time preprocessor macros concerning
GCC version and target architectural features, such that both the
assembly and linking of setjmp.S succeeds for older versions of
Newlib.
We don't have floating-point exception or non-default rounding mode
support for the RISC-V soft-float environment, `feraiseexcept' and
`fesetround' do nothing unless the `__riscv_flen' macro has been set.
Therefore following ISO C language requirements[1] only define macros
for soft float that correspond to actually supported floating-point
environment features, removing failures from GCC testing such as:
FAIL: gcc.dg/torture/fp-int-convert-timode-3.c -O0 execution test
FAIL: gcc.dg/torture/fp-int-convert-timode-4.c -O0 execution test
References:
[1] "Programming languages -- C", ISO/IEC 9899:2023, working draft --
September 3, 2022, Section 7.6 "Floating-point environment <fenv.h>"
Fixes: 7040b2de08 ("Add RISC-V port for libm")
Signed-off-by: Maciej W. Rozycki <macro@embecosm.com>
This is still not properly resolving <https://gcc.gnu.org/PR85463>
'[nvptx] "exit" in offloaded region doesn't terminate process', but is
one step into that direction, and allows for simplifying some GCC code.
... as implemented for GCN in 'newlib/libc/sys/amdgcn/*' files, but (for now)
still adding to the catch-all 'newlib/libc/machine/nvptx/misc.c' file.
This is necessary for the GCC/Fortran I/O system, for example.
Co-authored-by: Andrew Stubbs <ams@codesourcery.com>
Such a hard-coded ELIX level restriction is only being applied for nvptx
newlib -- but we'd actually like higher levels' functions available there,
too. (Users continue to be able to override this via newlib 'configure',
as for every other newlib target.)
This already enables GCC test cases that currently FAIL due to
'unresolved symbol strndup' ('gcc.dg/builtin-dynamic-object-size-0.c'), or
'unresolved symbol mempcpy' ('gcc.dg/torture/pr45636.c'), for example.
Co-authored-by: Andrew Stubbs <ams@codesourcery.com>
Given that nvptx newlib currently restricts itself to ELIX level 1, this
is not already a problem. However, in the following we'd like to lift
that restriction, and then run into:
[...]/newlib/libc/ssp/stack_protector.c: In function ‘__stack_chk_init’:
[...]/newlib/libc/ssp/stack_protector.c:31:1: sorry, unimplemented: global constructors not supported on this target
31 | }
| ^
GCC patch "nvptx: Support global constructors/destructors via 'collect2'"
has been posted, but not yet accepted. Until that is resolved, use the
same manual SSP setup as for GCN.
This implements a set of vectorized math routines to be used by the
compiler auto-vectorizer. Versions for vectors with 2 lanes up to
64 lanes (in powers of 2) are provided.
These routines are based on the scalar versions of the math routines in
libm/common, libm/math and libm/mathfp. They make extensive use of the GCC
C vector extensions and GCN-specific builtins in GCC.
The libgloss port has been reaching back into newlib internals for a
single header whose contents have been frozen for almost a decade.
To break this backwards libgloss->newlib dependency, move the acle
header to the srcroot include/ so everyone can use the same copy.
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags. Save the PAC value in the jump buffer so that
longjmp can only return to the authenticated location.
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strlen:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strcmp:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
Augment the arm_asm.h header file to simplify function prologues and
epilogues whilst adding support for PACBTI enablement via macros for
hand-written assembly functions. For PACBTI, both prologues/epilogues
as well as cfi-related directives are automatically amended
accordingly, depending on the compile-time mbranch-protection argument
values.
It defines the following preprocessor macros:
* HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
leaf functions.
* PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
to the stack irrespective of whether the ip register is clobbered in
the function or not.
* STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
the push list as necessary in the prologue to ensure stack
alignment preservation at the start of assembly function. The
epilogue behavior is likewise affected by this flag, ensuring any
pushed dummy registers also get popped on function return.
It also defines the following assembler macros:
* prologue: In addition to pushing any callee-saved registers onto
the stack, it generates any requested pacbti instructions.
Pushed registers are specified via the optional `first', `last',
`push_ip' and `push_lr' macro argument parameters.
when a single register number is provided, it pushes that
register. When two register numbers are provided, they specify a
rage to save. If push_ip and/or push_lr are non-zero, the
respective registers are also saved. Stack alignment is requested
via the `align` argument, which defaults to the value of
STACK_ALIGN_ENFORCE, unless manually overridden.
For example:
prologue push_ip=1 -> push {ip}
prologue push_ip=1, align8=1 -> push {r2, ip}
prologue push_ip=1, push_lr=1 -> push {ip, lr}
prologue 1 -> push {r1}
prologue 1, align8=1 -> push {r0, r1}
prologue 1 push_ip=1 -> push {r1, ip}
prologue 1 4 -> push {r1-r4}
prologue 1 4 push_ip=1 -> push {r1-r4, ip}
* epilogue: pops registers off the stack and emits pac key signing
instruction, if requested. The `first', `last', `push_ip',
`push_lr' and `align' function as per the prologue macro,
generating pop instead of push instructions.
Stack alignment is enforced via the following helper macro
call-chain:
{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
_preprocess_reglist1 -> {_prologue|_epilogue}
Finally, the necessary cfi directives for adding debug information
to prologue and epilogue are generated via the following macros:
* cfisavelist - prologue macro helper function, generating
necessary .cfi_offset directives associated with push instruction.
Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
to generate the following:
push {r1-r2, ip}
.cfi_adjust_cfa_offset 12
.cfi_offset 143, -4
.cfi_offset 2, -8
.cfi_offset 1, -12
* cfirestorelist - epilogue macro helper function, emitting
.cfi_restore instructions prior to resetting the cfa offset. As
such, calling `epilogue 1 2 push_ip=1' will produce:
pop {r1-r2, ip}
.cfi_register 143, 12
.cfi_restore 2
.cfi_restore 1
.cfi_def_cfa_offset 0
... so that all of 'exit', '_exit', '_Exit' work. 'exit' thus becomes the
standard 'newlib/libc/stdlib/exit.c' -- and functions registered via 'atexit'
are now called at return from 'main' or manual 'exit' invocation.
It seems there is a swapped logic in one of the subcases of
setjmp.S for MIPS: when the FPU registers are 64-bit within
a 32-bit aligned jmp_buf, the code realigns the pointers
before doing 64-bit writes, but the branch logic is swapped:
we must avoid the address adjustement when bit 2 is zero
(that is, the address is already 8-byte aligned).
This always triggers an address error when run, as tested
on a MIPS VR4300 with O64 ABI.
The implementation of expf() explains how approximation in the range [0 - 0.34] is done. The comment describes the "Reme" algorithm for constructing the polynomial. This is a typo and should be the "Remez" algorithm. The remez algorithm (or minimax) is used to calculate the coefficients of polynomials in other implementations of exp(0 and log().
See more:
https://en.wikipedia.org/wiki/Remez_algorithm
This implements a set of vectorized math routines to be used by the
compiler auto-vectorizer. Versions for vectors with 2 lanes up to
64 lanes (in powers of 2) are provided.
These routines are based on the scalar versions of the math routines in
libm/common, libm/math and libm/mathfp. They make extensive use of the GCC
C vector extensions and GCN-specific builtins in GCC.
As per the arm Procedure Call Standard for the Arm Architecture
section 6.1.2 [1], VFP registers s16-s31 (d8-d15, q4-q7) must be
preserved across subroutine calls.
The current setjmp/longjmp implementations preserve only the core
registers, with the jump buffer size too small to store the required
co-processor registers.
In accordance with the C Library ABI for the Arm Architecture
section 6.11 [2], this patch sets _JBTYPE to long long adjusting
_JBLEN to 20.
It also emits vfp load/store instructions depending on architectural
support, predicated at compile time on ACLE feature-test macros.
[1] https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst
[2] https://github.com/ARM-software/abi-aa/blob/main/clibabi32/clibabi32.rst