Zfinx/Zdinx are new extensions ratified in 2022, it similar to F/D extensions,
support hard float operation for single/double precision, but the difference
between Zfinx/Zdinx and F/D is Zfinx/Zdinx is operating under general purpose
registers rather than dedicated floating-point registers.
This patch improve the hard float support detection for RISC-V port, so
that Zfinx/Zdinx can have better/right performance.
Co-authored-by: Jesse Huang <jesse.huang@sifive.com>
Disable at least m68010 and m68020. These processors certainly do not
like unaligned accesses.
Signed-off-by: Remy Bohmer <linux@bohmer.net>
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Signed-off-by: Austin Morton <austinpmorton@gmail.com>
Signed-off-by: Chris Packham <judge.packham@gmail.com>
Rename s_nearbyint.c, s_fdim.c and s_scalbln.c to remove conflicts
Remove functions that are not needed from above files
Modify include paths
Add includes missing in cygwin build
Add missing types
Create Makefiles
Create header files to resolve dependencies between directories
Modify some instances of unsigned long to uint64_t for 32 bit platforms
Add HAVE_FPMATH_H
When compiling Newlib for arm targets with GCC 12.1 onward, the
passing of architecture extension information to the assembler is
automatic, making the use of .fpu and .arch_extension directives
in assembly files redundant.
With older versions of GCC, however, these directives must be
hard-coded into the `arm/setjmp.S' file to allow the assembly of
instructions concerning the storage and subsequent reloading of the
floating point registers to/from the jump buffer, respectively.
This patch conditionally adds the `.fpu vfpxd' and `.arch_extension
mve' directives based on compile-time preprocessor macros concerning
GCC version and target architectural features, such that both the
assembly and linking of setjmp.S succeeds for older versions of
Newlib.
We don't have floating-point exception or non-default rounding mode
support for the RISC-V soft-float environment, `feraiseexcept' and
`fesetround' do nothing unless the `__riscv_flen' macro has been set.
Therefore following ISO C language requirements[1] only define macros
for soft float that correspond to actually supported floating-point
environment features, removing failures from GCC testing such as:
FAIL: gcc.dg/torture/fp-int-convert-timode-3.c -O0 execution test
FAIL: gcc.dg/torture/fp-int-convert-timode-4.c -O0 execution test
References:
[1] "Programming languages -- C", ISO/IEC 9899:2023, working draft --
September 3, 2022, Section 7.6 "Floating-point environment <fenv.h>"
Fixes: 7040b2de08 ("Add RISC-V port for libm")
Signed-off-by: Maciej W. Rozycki <macro@embecosm.com>
This is still not properly resolving <https://gcc.gnu.org/PR85463>
'[nvptx] "exit" in offloaded region doesn't terminate process', but is
one step into that direction, and allows for simplifying some GCC code.
... as implemented for GCN in 'newlib/libc/sys/amdgcn/*' files, but (for now)
still adding to the catch-all 'newlib/libc/machine/nvptx/misc.c' file.
This is necessary for the GCC/Fortran I/O system, for example.
Co-authored-by: Andrew Stubbs <ams@codesourcery.com>
The libgloss port has been reaching back into newlib internals for a
single header whose contents have been frozen for almost a decade.
To break this backwards libgloss->newlib dependency, move the acle
header to the srcroot include/ so everyone can use the same copy.
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags. Save the PAC value in the jump buffer so that
longjmp can only return to the authenticated location.
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strlen:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strcmp:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
Augment the arm_asm.h header file to simplify function prologues and
epilogues whilst adding support for PACBTI enablement via macros for
hand-written assembly functions. For PACBTI, both prologues/epilogues
as well as cfi-related directives are automatically amended
accordingly, depending on the compile-time mbranch-protection argument
values.
It defines the following preprocessor macros:
* HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
leaf functions.
* PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
to the stack irrespective of whether the ip register is clobbered in
the function or not.
* STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
the push list as necessary in the prologue to ensure stack
alignment preservation at the start of assembly function. The
epilogue behavior is likewise affected by this flag, ensuring any
pushed dummy registers also get popped on function return.
It also defines the following assembler macros:
* prologue: In addition to pushing any callee-saved registers onto
the stack, it generates any requested pacbti instructions.
Pushed registers are specified via the optional `first', `last',
`push_ip' and `push_lr' macro argument parameters.
when a single register number is provided, it pushes that
register. When two register numbers are provided, they specify a
rage to save. If push_ip and/or push_lr are non-zero, the
respective registers are also saved. Stack alignment is requested
via the `align` argument, which defaults to the value of
STACK_ALIGN_ENFORCE, unless manually overridden.
For example:
prologue push_ip=1 -> push {ip}
prologue push_ip=1, align8=1 -> push {r2, ip}
prologue push_ip=1, push_lr=1 -> push {ip, lr}
prologue 1 -> push {r1}
prologue 1, align8=1 -> push {r0, r1}
prologue 1 push_ip=1 -> push {r1, ip}
prologue 1 4 -> push {r1-r4}
prologue 1 4 push_ip=1 -> push {r1-r4, ip}
* epilogue: pops registers off the stack and emits pac key signing
instruction, if requested. The `first', `last', `push_ip',
`push_lr' and `align' function as per the prologue macro,
generating pop instead of push instructions.
Stack alignment is enforced via the following helper macro
call-chain:
{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
_preprocess_reglist1 -> {_prologue|_epilogue}
Finally, the necessary cfi directives for adding debug information
to prologue and epilogue are generated via the following macros:
* cfisavelist - prologue macro helper function, generating
necessary .cfi_offset directives associated with push instruction.
Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
to generate the following:
push {r1-r2, ip}
.cfi_adjust_cfa_offset 12
.cfi_offset 143, -4
.cfi_offset 2, -8
.cfi_offset 1, -12
* cfirestorelist - epilogue macro helper function, emitting
.cfi_restore instructions prior to resetting the cfa offset. As
such, calling `epilogue 1 2 push_ip=1' will produce:
pop {r1-r2, ip}
.cfi_register 143, 12
.cfi_restore 2
.cfi_restore 1
.cfi_def_cfa_offset 0
... so that all of 'exit', '_exit', '_Exit' work. 'exit' thus becomes the
standard 'newlib/libc/stdlib/exit.c' -- and functions registered via 'atexit'
are now called at return from 'main' or manual 'exit' invocation.
It seems there is a swapped logic in one of the subcases of
setjmp.S for MIPS: when the FPU registers are 64-bit within
a 32-bit aligned jmp_buf, the code realigns the pointers
before doing 64-bit writes, but the branch logic is swapped:
we must avoid the address adjustement when bit 2 is zero
(that is, the address is already 8-byte aligned).
This always triggers an address error when run, as tested
on a MIPS VR4300 with O64 ABI.
As per the arm Procedure Call Standard for the Arm Architecture
section 6.1.2 [1], VFP registers s16-s31 (d8-d15, q4-q7) must be
preserved across subroutine calls.
The current setjmp/longjmp implementations preserve only the core
registers, with the jump buffer size too small to store the required
co-processor registers.
In accordance with the C Library ABI for the Arm Architecture
section 6.11 [2], this patch sets _JBTYPE to long long adjusting
_JBLEN to 20.
It also emits vfp load/store instructions depending on architectural
support, predicated at compile time on ACLE feature-test macros.
[1] https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst
[2] https://github.com/ARM-software/abi-aa/blob/main/clibabi32/clibabi32.rst
Call __builtin_gcn_get_stack_limit and __builtin_gcn_first_call_this_thread_p
to reduce dependency on some register/layout assumptions by using the new
GCC mainline (GCC 13) builtins, if they are available. If not, the existing
code is used.
The first attempt to support the 64-bit mode had two bugs:
1. The saved general-purpose register 31 value was overwritten with the saved
link register value.
2. The link register was saved and restored using 32-bit instructions.
Use 64-bit store/load instructions to save/restore the link register. Make
sure that the general-purpose register 31 and the link register storage areas
do not overlap.
In a follow up patch, struct _reent is optionally replaced by dedicated
thread-local objects. In this case,_REENT is optionally defined to NULL. Add
the _REENT_IS_NULL() macro to disable this check on demand.
Add a _REENT_CLEANUP() macro to encapsulate access to the
__cleanup member of struct reent. This will help to replace the
struct member with a thread-local storage object in a follow up
patch.
Add a _REENT_STDERR() macro to encapsulate access to the _stderr
member of struct reent. This will help to replace the struct
member with a thread-local storage object in a follow up patch.
Add a _REENT_STDOUT() macro to encapsulate access to the _stdout
member of struct reent. This will help to replace the struct
member with a thread-local storage object in a follow up patch.
Add a _REENT_STDIN() macro to encapsulate access to the _stdin
member of struct reent. This will help to replace the struct
member with a thread-local storage object in a follow up patch.
Add a _REENT_ERRNO() macro to encapsulate the access to the
_errno member of struct reent. This will help to replace the
structure member with a thread-local storage object in a follow
up patch.
Replace uses of __errno_r() with _REENT_ERRNO(). Keep __errno_r() macro for
potential users outside of Newlib.
Remove the pointer indirection through the read-only _global_impure_ptr and
directly use a externally visible _impure_data object of type struct _reent.
This enables the static initialization of global data structures in a follow up
patch. In addition, we get rid of a machine-specific file.
The recent makefile reorganization broke the amdgcn port by creating
duplicate __malloc_lock symbols. This patch fixes the problem by renaming
the malloc_support.c file to mlock.c, thus overriding the default symbol
properly. Actually, I'm not sure how this ever worked?
Convert all the libc/ subdir makes into the top-level Makefile. This
allows us to build all of libc from the top Makefile without using any
recursive make calls. This is faster and avoids the funky lib.a logic
where we unpack subdir archives to repack into a single libc.a. The
machine override logic is maintained though by way of Makefile include
ordering, and source file accumulation in libc_a_SOURCES.
There's a few dummy.c files that are no longer necessary since we aren't
doing the lib.a accumulating, so punt them.
The winsup code has been pulling the internal newlib ssp library out,
but that doesn't exist anymore, so change that to pull the objects.
Rather than define per-object rules in the Makefile, have small files
that define & include the right content. This simplifies the build
rules, and makes understanding the source a little easier (imo) as it
makes all the subdirs behave the same: you have 1 source file and it
produces 1 object. It's also about the same amount of boiler plate,
without having to define custom build rules that can fall out of sync.
We also realign the free & pvalloc definitions: common code puts these
in malloc.o & valloc.o respectively, not in free.o & pvalloc.o objects.
This will also be important as we merge the libc.a build into the top
dir since it relies on a single flat list of objects for overrides.
The mallopt symbol is defined in tiny-malloc.c, not mallocr.c, but
the Makefile in here tries to compile it out of the latter. This
leads to mallopt never being defined.
The build also creates mallinfo.o & mallopt.o & mallstats.o objects
to override common ones, but the common dir doesn't use these names.
Instead, it places these all in mstats.o.
So move the build define logic to a dedicated file and compile it
directly to make things a bit simpler while fixing the missing func
and aligning objects with the cmomon code.
This kills off the last configure script under libc/ and folds it
into the top newlib configure script. The a lot of the logic was
already in the top configure script, so move what's left into a
libc/acinclude.m4 file.
Remove dependency on __sdidinit member of struct _reent to check
object initialization. Like __sdidinit, the __cleanup member of
struct _reent is initialized in the __sinit() function. Checking
initialization against __cleanup serves the same purpose and will
reduce overhead in the __sfp() function in a follow up patch.
The crt0.o was handled in a subdir-by-subdir basis: it would be compiled
in one (e.g. libc/sys/$arch/), then copied up one level (libc/sys/), then
copied up another (libc/) before finally being copied & installed in the
top newlib dir. The libc/sys/ copy was cleaned up, and then the top dir
was changed to copy it directly out of the libc/sys/$arch/ dir. But the
libc/sys/ copy to libc/ was left behind. Clean that up now too.
When migrating the manual to the top-level, the include order was
sorted by name of the subdir. But this changed the chapter order
of the manual in the process. Change the sorting back to match
existing chapters and update the comments to explain.
Using xxx_LIBADD, xxx_DEPENDENCIES, and EXTRA_xxx_SOURCES is one way of
conditionally including files into a target. But it's meant more for the
cases where the variables added to LIBADD & DEPENDENCIES are constructed
via substitution (e.g. AC_SUBST) or other dynamic methods. With Automake
conditionals, then the much simpler form is to conditionally append to
the xxx_SOURCES variable and let Automake sort everything out.
Replace the custom build rules (which require copying & pasting from the
current Makefile) with small stub files. This allows us to drop the rules
entirely and let Automake provide everything.
These subdirs don't actually use anything from libm. The common dir
in particular only has 4 header files, and none are included here.
The xstormy16 code has a comment mentioning why this hack is here, but
it refers to code that was removed when its configure script was merged
up a level.