Commit Graph

533 Commits

Author SHA1 Message Date
Mike Frysinger 8fc6b4b30e newlib: regen aclocal.m4 after autoconf update
The configure scripts were regenerated with 2.69 for the newlib-4.2.0
release in 484d2ebf8d, but the aclocal
files were not.  Do that now to avoid confusion between the two as to
which version of autoconf was used.
2022-01-12 07:01:18 -05:00
Sebastian Huber ebe756e466 powerpc/setjmp: Improve RTEMS support
For some RTEMS multilibs, the FPU and Altivec units are disabled during
interrupt handling.  Do not save and restore the corresponding registers in
this case.
2022-01-11 09:15:03 +01:00
Mike Frysinger ed20821a40 newlib: migrate from INCLUDES to AM_CPPFLAGS
Since automake deprecated the INCLUDES name in favor of AM_CPPFLAGS,
change all existing users over.  The generated code is the same since
the two variables have been used in the same exact places by design.

There are other cleanups to be done, but let's focus on just renaming
here so we can upgrade to a newer automake version w/out triggering
new warnings.
2022-01-05 20:29:53 -05:00
Jeff Johnston 484d2ebf8d Update newlib to 4.2.0 2021-12-31 12:46:13 -05:00
Jon Turney bfcabeb876 newlib: Regenerate autotools files 2021-12-29 22:45:06 +00:00
Jon Turney a4e734fcdb newlib: Remove automake option 'cygnus'
The 'cygnus' option was removed from automake 1.13 in 2012, so the
presence of this option prevents that or a later version of automake
being used.

A check-list of the effects of '--cygnus' from the automake 1.12
documentation, and steps taken (where possible) to preserve those
effects (See also this thread [1] for discussion on that):

[1] https://lists.gnu.org/archive/html/bug-automake/2012-03/msg00048.html

1. The foreign strictness is implied.

Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4

2. The options no-installinfo, no-dependencies and no-dist are implied.

Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4

Future work: Remove no-dependencies and any explicit header dependencies,
and use automatic dependency tracking instead.  Are there explicit rules
which are now redundant to removing no-installinfo and no-dist?

3. The macro AM_MAINTAINER_MODE is required.

Already present in newlib/acinclude.m4

Note that maintainer-mode is still disabled by default.

4. Info files are always created in the build directory, and not in the
source directory.

This appears to be an error in the automake documentation describing
'--cygnus' [2]. newlib's info files are generated in the source
directory, and no special steps are needed to keep doing that.

[2] https://lists.gnu.org/archive/html/bug-automake/2012-04/msg00028.html

5. texinfo.tex is not required if a Texinfo source file is specified.
(The assumption is that the file will be supplied, but in a place that
automake cannot find.)

This effect is overridden by an explicit setting of the TEXINFO_TEX
variable (the directory part of which is fed into texi2X via the
TEXINPUTS environment variable).

6. Certain tools will be searched for in the build tree as well as in the
user's PATH. These tools are runtest, expect, makeinfo and texi2dvi.

For obscure automake reasons, this effect of '--cygnus' is not active
for makeinfo in newlib's configury.

However, there appears to be top-level configury which selects in-tree
runtest, expect and makeinfo, if present. So, if that works as it
appears, this effect is preserved. If not, this may cause problems if
anyone is building those tools in-tree.

This effect is not preserved for texi2dvi. This may cause problems if
anyone is building texinfo in-tree.

If needed, explicit checks for those tools looking in places relative to
$(top_srcdir)/../ as well as in PATH could be added.

7. The check target doesn't depend on all.

This effect is not preserved. The check target now depends on the all
target.

This concern seems somewhat academic given the current state of the
testsuite.

Also note that this doesn't touch libgloss.
2021-12-29 22:45:04 +00:00
Jon Turney 8e166351b3 newlib: Regenerate autotools files 2021-12-29 22:45:03 +00:00
Jon Turney 639cb7ec1a newlib: Regenerate all autotools files
Regenerate all aclocal.m4, configure and Makefile.in files.
2021-12-09 21:41:35 +00:00
Mike Frysinger 6226bad0ea change _COMPILING_NEWLIB to _LIBC
Use the same name as glibc & gnulib to indicate "newlib itself is
being compiled".  This also harmonizes the codebase a bit in that
_LIBC was already used in places instead of _COMPILING_NEWLIB.

Building for bfin-elf, mips-elf, and x86_64-pc-cygwin produces
the same object code.
2021-11-15 19:32:23 -05:00
Mike Frysinger 372093689c define _COMPILING_NEWLIB for all targets when compiling
The _COMPILING_NEWLIB symbol is for declaring "the code is being
compiled for newlib itself" so headers can change behavior vs the
header being used by users (who should get the normal clean API).
Unfortunately, this symbol is defined inconsistently leading to it
only being useful for a few subsections of the tree.

Pull it out so that it's defined all the time for all targets.
2021-11-11 17:26:45 -05:00
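As a rough illustration of the mechanism described in 372093689c, a header can test the symbol to decide which view of its API to expose. Only _COMPILING_NEWLIB comes from the commit message; EXAMPLE_INTERNAL_API and example_sees_internal_api() are hypothetical names used purely for this sketch.

```c
/* Hypothetical header fragment: EXAMPLE_INTERNAL_API and
   example_sees_internal_api() are illustrative names, not newlib's. */
#ifdef _COMPILING_NEWLIB
/* Building the library itself: internal declarations may be exposed. */
# define EXAMPLE_INTERNAL_API 1
#else
/* Ordinary user code: only the clean public API is visible. */
# define EXAMPLE_INTERNAL_API 0
#endif

/* Reports which view of the header this translation unit received. */
int example_sees_internal_api(void)
{
  return EXAMPLE_INTERNAL_API;
}
```

Compiled as ordinary user code (without -D_COMPILING_NEWLIB), the function reports the public view; a library build that defines the symbol for every target would get the internal view everywhere.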
Mike Frysinger 328e1b1a3d newlib: mips: delete glibc-specific logic
This code looks like it's written to be copied & pasted between different
C libraries and relies on _LIBC only being used with glibc.  This will
break when newlib changes from _COMPILING_NEWLIB to _LIBC, so delete
the glibc-specific logic ahead of time.
2021-11-09 19:21:13 -05:00
Mike Frysinger 59e83de0b1 libgloss/newlib: update configure.ac in Makefile.in files
The maintainer rules refer to configure.in directly, so update that
after renaming all the configure.ac files.
2021-11-06 14:14:49 -04:00
Mike Frysinger 920617998e libgloss/newlib: rename configure.in to configure.ac
The .in name has been deprecated for a long time in favor of .ac.
2021-09-13 10:14:37 -04:00
Roger Sayle 6bb96d13a2 nvptx: Emulate clock and other machine stubs.
This patch to the libc/machine/nvptx port of newlib implements an
approximation of "clock" and provides some additional stub routines.
These changes not only reduce the number of (link) failures in the GCC
testsuite when targeting nvptx-none, but also allow the NIST scimark4
benchmark to compile and run without modification.

newlib already contains support for backends to provide their own
clock implementations via -DCLOCK_PROVIDED.  That functionality is
used here to return an approximate elapsed time based on the NVidia
GPU's clock64 cycle counter.  Although not great, this is better than
the current behaviour of link error from the unresolved symbol
_times_r.

The other part of the patch is to add a small number of stub functions
to nvptx's misc.c.  Adding isatty, for example, resolves linking
problems in libc from the dependency in __smakebuf_r, and the sync
stub, for example, fixes the failure with GCC's
testsuite/gfortran.dg/ISO_Fortran_binding_14.f90 [which simply tests
that gfortran can call a/any C function].

newlib/
        configure.host: Add -DCLOCK_PROVIDED to newlib_cflags on nvptx*.

newlib/libc/machine/nvptx
        Makefile.am: Add clock.c to lib_a_SOURCES.
        clock.c: New source file to implement/approximate clock().
        misc.c: Add stubs for fstat, isatty, open, sync and unlink.
2021-08-25 10:20:27 +02:00
Richard Earnshaw 2a3a03972b aarch64: support binary mode for opening files
Newlib for aarch64 uses libgloss for the backend.  One common libgloss
implementation is the 'rdimon' implementation, which uses the Arm
Semihosting protocol.  In order to support a remote host that runs on
Windows we need to know whether a file is to be opened in binary or
text mode.  That means that we need to preserve this information via
O_BINARY until we know what the libgloss binding will be.

This patch simply copies the arm implementation from sys/arm/sys and
puts it in machine/aarch64/sys, because we don't have a 'sys' subtree
on aarch64.
2021-05-26 15:17:11 +01:00
Corinna Vinschen cc19109af9 Cygwin: don't export _feinitialise from newlib
Use the more official fesetenv(FE_DFL_ENV) from _dll_crt0, which allows
dropping the _feinitialise declaration from fenv.h.

Provide a no-op _feinitialise in Cygwin as exportable symbol for really
old applications when _feinitialise was called from mainCRTStartup in
crt0.o.

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2021-04-13 12:55:34 +02:00
Corinna Vinschen 3b22d72255 fenv: drop Cygwin-specific implementation in favor of newlib code
Drop the Cygwin-specific fenv.cc and fenv.h file and use the equivalent
newlib functionality now, so we have at least one example of a user for
this new mechanism.

fenv.c: allow _feinitialise to be called from Cygwin startup code

fenv.h: add declarations for fegetprec and fesetprec for Cygwin only.
        Fix a comment.

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2021-04-13 12:55:34 +02:00
Corinna Vinschen 05753071c0 fenv: Move shared x86 sys/fenv.h from x86_64 to shared_x86
drop matching symlink in i386

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2021-04-13 12:55:33 +02:00
Corinna Vinschen 79ac4237dc fenv: add missing declarations to x86 fenv.h
feenableexcept, fedisableexcept and fegetexcept were
accidentally missing in the x86 fenv.h

Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
2021-04-13 12:55:33 +02:00
Marcus Comstedt 26478769a6 RISC-V: Fix optimized strcmp on big endian 2021-02-25 12:14:18 +01:00
Eshan dhawan 55a6e49a08 Removed Soft float from MIPS
This patch removes the soft-float code from MIPS and instead adds the
soft-float code from RISC-V.

The code came from FreeBSD and assumes the FreeBSD softfp
implementation, not the one shipped with GCC. That was overlooked and
has already been fixed in the other fenv code.

Signed-off-by: Eshan Dhawan <eshandhawan51@gmail.com>
2021-02-05 10:32:16 +01:00
Jeff Johnston 415fdd4279 Bump up newlib version to 4.1.0 2020-12-18 18:50:49 -05:00
Sebastian Huber 6cc47c4c33 arm: Fix memchr() for Armv8-R
The Cortex-R52 processor is an Armv8-R processor with a NEON unit.  This
fix prevents conflicting architecture profiles A/R errors issued by the
linker.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2020-12-14 16:10:30 -05:00
Jeff Johnston 14123c991b Bump newlib release to 4.0.0 2020-12-11 14:37:12 -05:00
Jojo R 8315a90822 Port of C-SKY for newlib
Contributor list:  

  - Lifang Xia <lifang_xia@c-sky.com>  
  - Jojo R <jiejie_rong@c-sky.com>
  - Xianmiao Qu <xianmiao_qu@c-sky.com>
  - Yunhai Shang <yunhai_shang@c-sky.com>
2020-09-23 15:08:59 -04:00
Eshan dhawan b7a6e02dc6 arm: Fix fenv support
The previous fenv support for ARM used the soft-float implementation of
FreeBSD.  Newlib uses the one from libgcc by default.  They are not
compatible.  Having a GCC-incompatible soft-float fenv support in
Newlib makes no sense.  A long-term solution could be to provide a
libgcc compatible soft-float support.  This likely requires changes in
the GCC configuration.  For now, provide a stub implementation for
soft-float multilibs similar to RISC-V.

Move implementation to one file and delete now unused files.  Hide
implementation details.  Remove function parameter names from header
file to avoid name conflicts.

Provide VFP support if __SOFTFP__ is not defined like glibc.

Reviewed-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-29 06:58:17 +02:00
PkmX via Newlib 123b806523 riscv: fix integer wraparound in memcpy
This patch fixes a bug in RISC-V's memcpy implementation where an
integer wraparound occurs when src + size < 8 * sizeof(long), causing
the word-sized copy loop to be incorrectly entered.

Signed-off-by: Chih-Mao Chen <cmchen@andestech.com>
2020-07-27 10:14:34 +02:00
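The kind of wraparound described in 123b806523 can be sketched in portable C. This is not the code from the RISC-V memcpy; the function names, the use of uintptr_t values in place of pointers, and the exact loop condition are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes handled by one iteration of the hypothetical word-copy loop. */
#define EXAMPLE_BLOCK (8 * sizeof(long))

/* Buggy form: when end < EXAMPLE_BLOCK the subtraction wraps around to
   a huge unsigned value, so the test wrongly succeeds. */
int enters_loop_buggy(uintptr_t cur, uintptr_t end)
{
  return cur < end - EXAMPLE_BLOCK;
}

/* Safer form (assumes cur <= end): compare the remaining distance
   instead, which cannot wrap. */
int enters_loop_fixed(uintptr_t cur, uintptr_t end)
{
  return end - cur > EXAMPLE_BLOCK;
}
```

For a buffer ending at a very low address, such as cur = 16 and end = 24, the buggy test enters the loop while the rearranged test does not; for ordinary addresses both forms agree.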
Eshan dhawan via Newlib 104caeb7b1 Removed #ifndef _ARM_PCS_VFP_ from sys/fenv.h for arm
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-06 13:18:28 +02:00
Eshan dhawan via Newlib 65918715a0 mips fenv support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-03 10:41:45 +02:00
Eshan dhawan via Newlib 03bf9f431c SPARC fenv support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-03 10:41:45 +02:00
Eshan dhawan via Newlib fd5e27d362 fenv aarch64 support
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-07-02 12:12:39 +02:00
Eshan dhawan via Newlib a97bdf100f fenv support arm
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-06-09 21:13:17 -04:00
Eshan dhawan via Newlib e6ce6f1430 hard float support for PowerPC taken from FreeBSD
Signed-off-by: Eshan dhawan <eshandhawan51@gmail.com>
2020-06-03 11:17:47 +02:00
Richard Earnshaw f973a7d8be arm: Finish moving newlib to unified syntax for Thumb1
Most code in newlib already uses unified syntax, but just a couple of
laggards remain.  This patch removes these and means that the entire
code base has now been converted.
2020-03-02 13:33:11 +00:00
Keith Packard 9042d0ce65 Use remove-advertising-clause script to edit BSD licenses
This edits licenses held by Berkeley and NetBSD, both of which
have removed the advertising requirement from their licenses.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-01-29 19:03:31 +01:00
Jeff Johnston 4e78f8ea16 Bump up newlib release to 3.3.0 2020-01-21 15:17:43 -05:00
Keith Packard 5377a84776 riscv: Map between ieeefp.h exception bits and RISC-V FCSR bits
If we had architecture-specific exception bits, we could just set them
to match the processor, but instead ieeefp.h is shared by all targets
so we need to map between the public values and the register contents.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-01-21 10:28:35 +01:00
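The mapping described in 5377a84776 might look roughly like the sketch below. The PUB_* values are hypothetical placeholders chosen so that they differ from the hardware layout (the real public values live in ieeefp.h); the HW_* values follow the fflags bit layout of the RISC-V FCSR register. The function names are illustrative.

```c
/* Hypothetical public flag values, deliberately different from the
   hardware layout; the real ones are defined in ieeefp.h. */
#define PUB_INVALID   0x01u
#define PUB_DIVZERO   0x02u
#define PUB_OVERFLOW  0x04u
#define PUB_UNDERFLOW 0x08u
#define PUB_INEXACT   0x10u

/* RISC-V fflags bits as they appear in the FCSR register. */
#define HW_NX 0x01u /* inexact */
#define HW_UF 0x02u /* underflow */
#define HW_OF 0x04u /* overflow */
#define HW_DZ 0x08u /* divide by zero */
#define HW_NV 0x10u /* invalid operation */

/* Translate public exception bits into register bits. */
unsigned pub_to_hw(unsigned pub)
{
  unsigned hw = 0;
  if (pub & PUB_INVALID)   hw |= HW_NV;
  if (pub & PUB_DIVZERO)   hw |= HW_DZ;
  if (pub & PUB_OVERFLOW)  hw |= HW_OF;
  if (pub & PUB_UNDERFLOW) hw |= HW_UF;
  if (pub & PUB_INEXACT)   hw |= HW_NX;
  return hw;
}

/* Translate register bits back into public exception bits. */
unsigned hw_to_pub(unsigned hw)
{
  unsigned pub = 0;
  if (hw & HW_NV) pub |= PUB_INVALID;
  if (hw & HW_DZ) pub |= PUB_DIVZERO;
  if (hw & HW_OF) pub |= PUB_OVERFLOW;
  if (hw & HW_UF) pub |= PUB_UNDERFLOW;
  if (hw & HW_NX) pub |= PUB_INEXACT;
  return pub;
}
```

Because the two functions are inverses of each other, a value written through pub_to_hw() and read back through hw_to_pub() is unchanged.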
Keith Packard 8e74c7119f riscv: Add 'break' statements to fpsetround switch
This makes the fpsetround function actually do something rather than
just return -1 due to the default 'fall-through' behavior of the switch
statement.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-01-21 10:28:35 +01:00
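The fall-through problem fixed in 8e74c7119f can be reproduced with a small model. The sketch below is not the RISC-V fpsetround source: the mode numbers, the function names, and the int that stands in for the hardware register are all assumptions.

```c
/* Buggy model: without 'break', every case falls through to 'default',
   so the function returns -1 and never updates the register. */
int set_round_buggy(int mode, int *reg)
{
  int rm;
  switch (mode)
    {
    case 0: rm = 0; /* falls through */
    case 1: rm = 1; /* falls through */
    case 2: rm = 2; /* falls through */
    case 3: rm = 3; /* falls through */
    default: return -1;
    }
  *reg = rm; /* never reached */
  return 0;
}

/* Fixed model: each case leaves the switch, so the register is
   written and 0 is returned for every valid mode. */
int set_round_fixed(int mode, int *reg)
{
  int rm;
  switch (mode)
    {
    case 0: rm = 0; break;
    case 1: rm = 1; break;
    case 2: rm = 2; break;
    case 3: rm = 3; break;
    default: return -1;
    }
  *reg = rm;
  return 0;
}
```

Calling both versions with the same valid mode shows the difference: the buggy one leaves the stand-in register untouched and reports failure, the fixed one stores the requested mode.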
Keith Packard 954504ea14 riscv: Use current pseudo-instructions to access the FCSR register
Use fscsr and frcsr to store and read the FCSR register instead of
fssr and frsr.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-01-21 10:28:35 +01:00
Jeff Johnston 1afb22a120 Bump up release to 3.2.0 for yearly snapshot 2020-01-02 14:56:24 -05:00
Anthony Green b481c11e5a Optimize setjmp/longjmp for moxie.
We don't need to save/restore every register -- just those
we don't expect to be trashed by function calls.
2019-12-20 09:00:26 -05:00
Anthony Green 31227ba53d Fix setjmp/longjmp for the moxie port.
These functions need to save and restore the stack frame, because
that's where the return address is stored.
2019-12-13 13:08:06 -05:00
Kwok Cheung Yeung d14714c690 Stash reent marker in upper bits of s1 on AMD GCN
s[0:3] contain a descriptor used to set up the initial value of the
stack, but only the lower 48 bits of s[0:1] are currently used.
The reent marker is currently set in s3, but by stashing it in the
upper 16 bits of s[0:1] instead, s3 can be freed up for other purposes.
2019-11-08 10:34:28 +01:00
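The packing idea in d14714c690 (a 16-bit marker kept in the unused top bits of a 64-bit value whose lower 48 bits carry the descriptor) can be sketched in C. The real change is made in GCN assembly on the s[0:1] register pair; the function names and the sample values here are hypothetical.

```c
#include <stdint.h>

/* Mask selecting the 48 descriptor bits that are actually in use. */
#define EXAMPLE_LOW48_MASK 0x0000FFFFFFFFFFFFULL

/* Store a 16-bit marker in the upper 16 bits, keeping the low 48. */
uint64_t stash_marker(uint64_t desc, uint16_t marker)
{
  return (desc & EXAMPLE_LOW48_MASK) | ((uint64_t)marker << 48);
}

/* Recover the marker from the upper 16 bits. */
uint16_t read_marker(uint64_t packed)
{
  return (uint16_t)(packed >> 48);
}

/* Recover the original 48-bit descriptor. */
uint64_t read_low48(uint64_t packed)
{
  return packed & EXAMPLE_LOW48_MASK;
}
```

Both fields survive a round trip, which is what allows the register that previously held the marker to be reused for something else.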
Dimitar Dimitrov 0c7734673a Initial PRU port for libgloss and newlib
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2019-10-31 14:47:19 -04:00
Joel Sherrill 9e06ba1ac3 riscv/sys/fenv.h: Add missing extern for fe_dfl_env_p 2019-10-09 11:00:45 -05:00
Jeff Johnston cfc4955234 Add patch from Joel Sherrill for i386 and x86_64 fenv support 2019-10-08 16:59:04 -04:00
Christos Gentsos 175b215e05 Optimize epilogue sequence for architectures with POP interworking.
ARMv5 and above support arm/thumb interworking using POP, so we can
improve the exit sequence in this case.
2019-10-07 14:38:14 +01:00
Joel Sherrill c711371384 riscv/include/fenv.h: Use shared fenv.h.
libc/include/fenv.h was a direct copy of this file.
2019-09-03 09:52:34 -05:00
Joel Sherrill 03f802846f Miscellaneous Makefile.in regenerated 2019-08-09 17:49:16 +02:00
Kito Cheng 654398db84 RISC-V: Fix header guard for sys/fenv.h 2019-08-02 09:34:39 +02:00
Martin Erik Werner 739e89cbe6 or1k: Avoid write outside setjmp buf & shrink buf
Update the offsets used to save registers into the setjmp jmp_buf
structure in order to:

* Avoid writing the supervision register outside the buffer and thus
  clobbering something on the stack. Previously the supervision register
  was written at offset 124 while the buffer was of length 124.

* Shrink the jmp_buf down to the size actually needed, by avoiding holes
  at the locations of omitted registers.
2019-06-27 12:51:54 +02:00
Martin Erik Werner 8b080534ca or1k: Correct longjmp return value
Invert equality check instruction to correct the return value handling
in longjmp.

The return value should be the value of the second argument to longjmp,
unless the argument value was 0 in which case it should be 1.

Previously, longjmp would set return value 1 if the second argument was
non-zero, and 0 if it was 0, which was incorrect.
2019-06-27 09:09:37 +02:00
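The return-value rule described in 8b080534ca is easy to state in C, even though the actual fix is a one-instruction change in or1k assembly. The two functions below are hypothetical models of the value that setjmp appears to return after a longjmp, before and after the fix.

```c
/* Corrected rule: return the second argument of longjmp, except that
   0 is replaced by 1. */
int longjmp_result_fixed(int val)
{
  return val == 0 ? 1 : val;
}

/* Behavior before the fix, as described in the commit message:
   1 for any non-zero argument, 0 for a zero argument. */
int longjmp_result_buggy(int val)
{
  return val != 0 ? 1 : 0;
}
```

The two models only agree for an argument of 1; for example, an argument of 5 yields 5 with the corrected rule but 1 with the old behavior.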
Jeff Johnston eb429ad509 Fix __getreent stack calculations for AMD GCN
From: Andrew Stubbs <ams@codesourcery.com>

Fix a bug in which the high part of 64-bit values was being corrupted, leading
to erroneous stack overflow errors. The problem was only that the mixed-size
calculations were being treated as signed when they should be unsigned.
2019-06-07 13:57:45 -04:00
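The class of bug described in eb429ad509 can be illustrated in C: when a 32-bit quantity is widened as signed before being combined with a 64-bit value, a set top bit is sign-extended and disturbs the high half. This is only a model of the idea; the function names are hypothetical, and the real fix is in the GCN __getreent assembly. The signed variant relies on the usual two's-complement conversion that GCC and Clang implement.

```c
#include <stdint.h>

/* Buggy model: the 32-bit operand is widened as a signed value, so
   offsets with the top bit set become negative 64-bit numbers. */
uint64_t add_offset_signed(uint64_t base, uint32_t off)
{
  return base + (uint64_t)(int64_t)(int32_t)off;
}

/* Fixed model: the 32-bit operand is zero-extended before the add. */
uint64_t add_offset_unsigned(uint64_t base, uint32_t off)
{
  return base + (uint64_t)off;
}
```

With a base of 0x100000000 and an offset of 0x80000000, the unsigned version gives 0x180000000, while the signed version produces a different value whose high part no longer reflects the base.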
Jim Wilson 5c86f0da5f RISC-V: Add size optimized memcpy, memmove, memset and strcmp.
This patch adds implementations of memcpy, memmove, memset and strcmp
optimized for size. The changes have been tested in
riscv/riscv-gnu-toolchain by riscv-dejagnu with
riscv-sim.exp/riscv-sim-nano.exp.
2019-05-22 17:36:57 -07:00
Jozef Lawrynowicz 1e6c561d48 Implement reduced code size "tiny" printf and puts
"tiny" printf is derived from _vfprintf_r in libc/stdio/nano-vfprintf.c.
"tiny" puts has been implemented so that it just calls write, without
any other processing.
Support for buffering, reentrancy and streams has been removed from
these functions to achieve reduced code size.

This reduced code size implementation of printf and puts can be enabled
in an application by passing "--wrap printf" and "--wrap puts" to the
GNU linker. This will replace references to "printf" and "puts" in user
code with "__wrap_printf" and "__wrap_puts" respectively.
If there is no implementation of these __wrap* functions in user code,
these "tiny" printf and puts implementations will be linked into the
final executable.

The wrapping mechanism is supposed to be invisible to the user:
- A GCC wrapper option such as "-mtiny-printf" will be added to alias
  these wrap commands.
- If the user is unaware of the "tiny" implementation, and chooses to
  implement their own __wrap_printf and __wrap_puts, their own
  implementation will be automatically chosen over the "tiny" printf and
  puts from the library.

Newlib must be configured with --enable-newlib-nano-formatted-io for
the "tiny" printf and puts functions to be built into the library.

Code size reduction examples:
printf("Hello World\n")
  baseline - msp430-elf-gcc gcc-8_3_0-release
     text    data     bss
     5638     214      26
  "tiny" puts enabled
     text    data     bss
      714      90      20

printf("Hello %d\n", a)
  baseline - msp430-elf-gcc gcc-8_3_0-release
     text    data     bss
    10916     614      28
  "tiny" printf enabled
     text    data     bss
     4632     280      20
2019-04-15 14:22:33 +02:00
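To make the wrapping mechanism in 1e6c561d48 concrete, here is a sketch of a user-supplied __wrap_puts in the same spirit as the "tiny" puts (it just calls write). It is not the library's implementation. It assumes a POSIX-style write() and that the program is linked with the GNU linker's wrap option for puts, for example by passing -Wl,--wrap=puts through the GCC driver.

```c
#include <string.h>
#include <unistd.h>

/* With "--wrap puts", the linker redirects calls to puts() in user
   code to this symbol.  No buffering, streams or reentrancy support. */
int __wrap_puts(const char *s)
{
  size_t len = strlen(s);

  /* Write the string itself, then the trailing newline puts() adds. */
  if (write(STDOUT_FILENO, s, len) != (ssize_t)len)
    return -1; /* negative value on error, like EOF */
  if (write(STDOUT_FILENO, "\n", 1) != 1)
    return -1;
  return 0;    /* non-negative value on success */
}
```

As the commit message notes, a wrapper defined in user code like this one takes precedence over the "tiny" version from the library.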
Jozef Lawrynowicz 2af6ad9f05 Copy prerequisite file for "tiny" printf implementation
Use newlib/libc/stdio/nano-vfprintf.c as baseline for tiny-printf.c
2019-04-15 14:22:30 +02:00
Andrew Stubbs e8b23909e4 Add missing includes.
These missing includes were causing build warnings, but also a real bug in
which the "size" parameter to "write" was being passed in 32-bit, whereas it
ought to be 64-bit.  This led to intermittent bad behaviour.
2019-03-25 16:44:10 +01:00
Jozef Lawrynowicz b14a879d85 Remove matherr, and SVID and X/Open math library configurations
Default math library configuration is now IEEE
2019-01-23 10:46:24 +01:00
Jeff Johnston 1787e9d033 AMD GCN Port contributed by Andrew Stubbs <ams@codesourcery.com>
Add support for the AMD GCN GPU architecture.  This is primarily intended for
use with OpenMP and OpenACC offloading.  It can also be used for stand-alone
programs, but this is intended mostly for testing the compiler and is not
expected to be useful in general.

The GPU architecture is highly parallel, and therefore Newlib must be
configured to use dynamic re-entrancy, and thread-safe malloc.

The only I/O available is via a shared-memory interface provided by libgomp
and the gcn-run tool included with GCC.  At this time this is limited to
stdout, argc/argv, and the return code.
2019-01-15 10:48:08 -05:00
Jeff Johnston 5726873100 Bump release to 3.1.0 for yearly snapshot 2018-12-31 23:40:11 -05:00
Wilco Dijkstra df7824d1a4 Fix issue with dst bias in memset
This patch fixes an issue in the previous memset loop change. If the
zva size is >= 256 and there are more than 64 bytes left in the
tail, we could enter the loop and thus need to rebias dst by 32 as
well.

Since no known CPUs use this size this can't be tested natively, so I've
tested it on a simulator initialized with a large zva size.

2018-11-08 16:45:19 +00:00
Wilco Dijkstra d80db60066 Adjust writeback in non-zero memset
This fixes an inefficiency in the non-zero memset.  Delaying the writeback
until the end of the loop is slightly faster on some cores - this shows
~5% performance gain on Cortex-A53 when doing large non-zero memsets.

Tested against the GLIBC testsuite.
2018-11-06 14:59:51 +00:00
Sebastian Huber da418955f5 Move common <sys/dirent.h> content to <dirent.h>
Move common content of the various <sys/dirent.h> and the latest FreeBSD
<dirent.h> to <dirent.h>.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-10-11 08:29:16 +02:00
Jon Beniston a9cfb33b6c Add --disable-newlib-fno-builtin to allow compilation without -fno-builtin for smaller and faster code. 2018-08-31 15:40:42 -04:00
Keith Packard 82dfae9ab0 Use __inhibit_loop_to_libcall in all memset/memcpy implementations
This macro selects a compiler option that disables recognition of
common memset/memcpy patterns and converting those to direct
memset/memcpy calls.

Signed-off-by: Keith Packard <keithp@keithp.com>
2018-08-29 16:05:37 +02:00
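The reason for the macro in 82dfae9ab0 is that a compiler may recognize a byte-fill loop inside memset itself and replace it with a call to memset, which would recurse. The sketch below shows the idea with a hypothetical EXAMPLE_INHIBIT macro. The commit message does not name the compiler option; the use of GCC's -fno-tree-loop-distribute-patterns through the optimize attribute is an assumption here.

```c
#include <stddef.h>

/* Assumption: the option involved is GCC's
   -fno-tree-loop-distribute-patterns.  EXAMPLE_INHIBIT is a
   hypothetical stand-in, not newlib's macro. */
#if defined(__GNUC__) && !defined(__clang__)
# define EXAMPLE_INHIBIT \
  __attribute__((__optimize__("-fno-tree-loop-distribute-patterns")))
#else
# define EXAMPLE_INHIBIT
#endif

/* A plain byte-fill loop that the compiler is asked not to turn back
   into a library memset call. */
EXAMPLE_INHIBIT void *example_memset(void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  while (n--)
    *p++ = (unsigned char)c;
  return dst;
}
```

On compilers other than GCC the macro expands to nothing, so the function still builds and behaves the same; only the protection against pattern recognition is lost.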
Siddhesh Poyarekar d02cc7a09d strcmp.S: Improve performance for misaligned strings
Replace the simple byte-wise compare in the misaligned case with a
dword compare with page boundary checks in place.  For simplicity I've
chosen a 4K page boundary so that we don't have to query the actual
page size on the system.

This results in up to 3x improvement in performance in the unaligned
case on falkor and about 2.5x improvement on mustang as measured using
bench-strcmp in glibc.
2018-07-13 13:27:54 +02:00
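The page boundary check mentioned in d02cc7a09d guards against a wide load touching the next page. A minimal C sketch of such a check follows, assuming the 4K page size chosen in the commit and an 8-byte load; the macro and function names are hypothetical, and the real code is AArch64 assembly.

```c
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u /* fixed 4K assumption from the commit */
#define EXAMPLE_LOAD_SIZE 8u    /* one dword load */

/* Returns non-zero if an EXAMPLE_LOAD_SIZE-byte load starting at addr
   would extend past the end of its 4K page. */
int load_crosses_page(uintptr_t addr)
{
  uintptr_t offset = addr & (EXAMPLE_PAGE_SIZE - 1);
  return offset > EXAMPLE_PAGE_SIZE - EXAMPLE_LOAD_SIZE;
}
```

A load at page offset 4088 still fits (it covers bytes 4088 to 4095), while one at offset 4089 would spill into the following page and has to be handled more carefully.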
Siddhesh Poyarekar 2d9f35c2cc memcmp.S: optimize for medium to large sizes
This improved memcmp provides a fast path for compares up to 16 bytes
and then compares 16 bytes at a time, thus optimizing loads from both
sources.  The glibc memcmp microbenchmark retains performance (with an
error of ~1ns) for smaller compare sizes and reduces up to 31% of
execution time for compares up to 4K on the APM Mustang.  On Qualcomm
Falkor this improves to almost 48%, i.e. it is almost 2x improvement
for sizes of 2K and above.
2018-07-13 13:27:54 +02:00
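The "16 bytes at a time" strategy from 2d9f35c2cc can be modelled in portable C. This is only a sketch of the general technique, not a translation of memcmp.S: the function names are hypothetical, the fast path for short inputs is reduced to a byte loop, and memcpy stands in for unaligned loads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte comparison used for short tails and to locate the
   first differing byte inside a mismatching chunk. */
static int example_bytewise(const unsigned char *a, const unsigned char *b,
                            size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;
  return 0;
}

/* Compare 16 bytes per iteration using two 8-byte loads per source. */
int example_chunked_memcmp(const void *pa, const void *pb, size_t n)
{
  const unsigned char *a = pa;
  const unsigned char *b = pb;

  while (n >= 16)
    {
      uint64_t a0, a1, b0, b1;
      memcpy(&a0, a, 8);
      memcpy(&a1, a + 8, 8);
      memcpy(&b0, b, 8);
      memcpy(&b1, b + 8, 8);
      if (a0 != b0 || a1 != b1)
        return example_bytewise(a, b, 16);
      a += 16;
      b += 16;
      n -= 16;
    }
  return example_bytewise(a, b, n);
}
```

Falling back to the byte loop on a mismatch keeps the result independent of endianness, at the cost of re-reading at most 16 bytes.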
Siddhesh Poyarekar f44eee8f1b Improve strncmp for mutually misaligned inputs
The mutually misaligned inputs on aarch64 are compared with a simple
byte copy, which is not very efficient.  Enhance the comparison
similar to strcmp by loading a double-word at a time.  The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark in glibc is as follows:

falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)

All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.
2018-07-13 13:27:54 +02:00
Jeff Johnston cd31fbb2ae Add nvptx port.
- From: Cesar Philippidis <cesar@codesourcery.com>
  Date: Tue, 10 Apr 2018 14:43:42 -0700
  Subject: [PATCH] nvptx port

  This port adds support for Nvidia GPUs, which are primarily used as
  offload accelerators in OpenACC and OpenMP.
2018-04-13 15:42:37 -04:00
Sebastian Huber 1658a57715 epiphany: Additional setjmp() and longjmp() syms
At least with Binutils 2.30 and GCC 7.3 we need symbol definitions
without the leading underscore.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-01-31 08:17:19 +01:00
Jeff Johnston fffd2770db Bump release to 3.0.0 for yearly snapshot
- major release required due to removal of K&R support
2018-01-18 13:07:45 -05:00
Yaakov Selkowitz 7192f84096 ansification: remove _HAVE_STDC
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:30 -06:00
Yaakov Selkowitz 70ee6b17df ansification: remove _EXFUN, _EXFUN_NOTHROW
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:29 -06:00
Yaakov Selkowitz 9087163804 ansification: remove _DEFUN
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:26 -06:00
Yaakov Selkowitz 67ee0cac4c ansification: remove _VOID
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:20 -06:00
Yaakov Selkowitz fff27f8429 ansification: remove _DEFUN_VOID
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:19 -06:00
Yaakov Selkowitz 670b01da7f ansification: remove _CAST_VOID
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:17 -06:00
Yaakov Selkowitz e6321aa6a6 ansification: remove _PTR
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:16 -06:00
Yaakov Selkowitz eea249da3b ansification: remove _PARAMS
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:13 -06:00
Yaakov Selkowitz 0bda30e1ff ansification: remove _CONST
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:08 -06:00
Yaakov Selkowitz 6783860a2e ansification: remove _AND
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2018-01-17 11:47:05 -06:00
Jon Turney c006fd459f makedoc: make errors visible
Discard QUICKREF sections, rather than writing them to stderr
Discard MATHREF sections, rather than discarding as an error
Pass NOTES sections through to texinfo, rather than discarding as an error
Don't redirect makedoc stderr to .ref file
Remove makedoc output on error
Remove .ref files from CLEANFILES
Regenerate Makefile.ins

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
2017-12-07 11:54:11 +00:00
Yaakov Selkowitz 1f1e477554 powerpc: remove TRAD_SYNOPSIS
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2017-12-01 03:41:50 -06:00
Yaakov Selkowitz ddd22ee069 nds32: remove TRAD_SYNOPSIS
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2017-12-01 03:41:50 -06:00
Yaakov Selkowitz 4e8c64b928 microblaze: remove TRAD_SYNOPSIS
Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>
2017-12-01 03:41:50 -06:00
Kito Cheng 6864c08b94 Change license to FreeBSD License for RISC-V
- To prevent confusion about which BSD license variant is used (2- or
   3-clause), change the license to the FreeBSD license, which
   unambiguously refers to the 2-clause license.
2017-08-21 11:08:54 +02:00
Kito Cheng 363dbb9e44 Add RISC-V port for newlib
Contributor list:
    - Andrew Waterman  <andrew@sifive.com>
    - Palmer Dabbelt  <palmer@dabbelt.com>
    - Kito Cheng  <kito.cheng@gmail.com>
    - Scott Beamer  <sbeamer@eecs.berkeley.edu>
2017-08-16 18:00:58 -04:00
Richard Earnshaw d6cac3e1da [arm] Fix strcpy for unified syntax on ARMv4t thumb.
ARMv4t does not support mov between two low registers.  Now that we use
unified syntax, mov instructions need converting to movs.
2017-07-21 11:23:27 +01:00
Ian Tessier via newlib 4bce7ecbe1 arm: Update strcpy.c to use UAL syntax.
With this change the arm platform can now be fully compiled with Clang.

Tested by comparing the output with GCC 4.8.2, and Clang 4.0, using a
variety of arches, big/little endianness, and arm/thumb mode to verify
the generated assembly output matches between GCC vs Clang with UAL, and
also GCC with UAL vs GCC with non-UAL, for all preprocessor code blocks.

The only difference found is an extra nop at the end of the function
when compiled with GCC using armv7-a/thumb/little-endian/-O2 compared to
Clang. The nop is not emitted when compiled in big-endian mode.
2017-07-20 16:18:29 +02:00
Wilco Dijkstra c86063bdc0 Optimized memcmp
This is an optimized memcmp for AArch64.  This is a complete rewrite
using a different algorithm.  The previous version split into cases
where both inputs were aligned, the inputs were mutually aligned and
unaligned using a byte loop.  The new version combines all these cases,
while small inputs of less than 8 bytes are handled separately.

This allows the main code to be sped up using unaligned loads since
there are now at least 8 bytes to be compared.  After the first 8 bytes,
align the first input.  This ensures each iteration does at most one
unaligned access and mutually aligned inputs behave as aligned.
After the main loop, process the last 8 bytes using unaligned accesses.

This improves performance of (mutually) aligned cases by 25% and
unaligned by >500% (yes >6 times faster) on large inputs.

ChangeLog:
2017-06-28  Wilco Dijkstra  <wdijkstr@arm.com>

        * newlib/libc/machine/aarch64/memcmp.S (memcmp):
        Rewrite of optimized memcmp.

GLIBC benchtests/bench-memcmp.c performance comparison for Cortex-A53:

Length    1, alignment  1/ 1:		153%
Length    1, alignment  1/ 1:		119%
Length    1, alignment  1/ 1:		154%
Length    2, alignment  2/ 2:		121%
Length    2, alignment  2/ 2:		140%
Length    2, alignment  2/ 2:		121%
Length    3, alignment  3/ 3:		105%
Length    3, alignment  3/ 3:		105%
Length    3, alignment  3/ 3:		105%
Length    4, alignment  4/ 4:		155%
Length    4, alignment  4/ 4:		154%
Length    4, alignment  4/ 4:		161%
Length    5, alignment  5/ 5:		173%
Length    5, alignment  5/ 5:		173%
Length    5, alignment  5/ 5:		173%
Length    6, alignment  6/ 6:		145%
Length    6, alignment  6/ 6:		145%
Length    6, alignment  6/ 6:		145%
Length    7, alignment  7/ 7:		125%
Length    7, alignment  7/ 7:		125%
Length    7, alignment  7/ 7:		125%
Length    8, alignment  8/ 8:		111%
Length    8, alignment  8/ 8:		130%
Length    8, alignment  8/ 8:		124%
Length    9, alignment  9/ 9:		160%
Length    9, alignment  9/ 9:		160%
Length    9, alignment  9/ 9:		150%
Length   10, alignment 10/10:		170%
Length   10, alignment 10/10:		137%
Length   10, alignment 10/10:		150%
Length   11, alignment 11/11:		160%
Length   11, alignment 11/11:		160%
Length   11, alignment 11/11:		160%
Length   12, alignment 12/12:		146%
Length   12, alignment 12/12:		168%
Length   12, alignment 12/12:		156%
Length   13, alignment 13/13:		167%
Length   13, alignment 13/13:		167%
Length   13, alignment 13/13:		173%
Length   14, alignment 14/14:		167%
Length   14, alignment 14/14:		168%
Length   14, alignment 14/14:		168%
Length   15, alignment 15/15:		168%
Length   15, alignment 15/15:		173%
Length   15, alignment 15/15:		173%
Length    1, alignment  0/ 0:		134%
Length    1, alignment  0/ 0:		127%
Length    1, alignment  0/ 0:		119%
Length    2, alignment  0/ 0:		94%
Length    2, alignment  0/ 0:		94%
Length    2, alignment  0/ 0:		106%
Length    3, alignment  0/ 0:		82%
Length    3, alignment  0/ 0:		87%
Length    3, alignment  0/ 0:		82%
Length    4, alignment  0/ 0:		115%
Length    4, alignment  0/ 0:		115%
Length    4, alignment  0/ 0:		122%
Length    5, alignment  0/ 0:		127%
Length    5, alignment  0/ 0:		119%
Length    5, alignment  0/ 0:		127%
Length    6, alignment  0/ 0:		103%
Length    6, alignment  0/ 0:		100%
Length    6, alignment  0/ 0:		100%
Length    7, alignment  0/ 0:		82%
Length    7, alignment  0/ 0:		91%
Length    7, alignment  0/ 0:		87%
Length    8, alignment  0/ 0:		111%
Length    8, alignment  0/ 0:		124%
Length    8, alignment  0/ 0:		124%
Length    9, alignment  0/ 0:		136%
Length    9, alignment  0/ 0:		136%
Length    9, alignment  0/ 0:		136%
Length   10, alignment  0/ 0:		136%
Length   10, alignment  0/ 0:		135%
Length   10, alignment  0/ 0:		136%
Length   11, alignment  0/ 0:		136%
Length   11, alignment  0/ 0:		136%
Length   11, alignment  0/ 0:		135%
Length   12, alignment  0/ 0:		136%
Length   12, alignment  0/ 0:		136%
Length   12, alignment  0/ 0:		136%
Length   13, alignment  0/ 0:		135%
Length   13, alignment  0/ 0:		136%
Length   13, alignment  0/ 0:		136%
Length   14, alignment  0/ 0:		136%
Length   14, alignment  0/ 0:		136%
Length   14, alignment  0/ 0:		136%
Length   15, alignment  0/ 0:		136%
Length   15, alignment  0/ 0:		136%
Length   15, alignment  0/ 0:		136%
Length    4, alignment  0/ 0:		115%
Length    4, alignment  0/ 0:		115%
Length    4, alignment  0/ 0:		115%
Length   32, alignment  0/ 0:		127%
Length   32, alignment  7/ 2:		395%
Length   32, alignment  0/ 0:		127%
Length   32, alignment  0/ 0:		127%
Length    8, alignment  0/ 0:		111%
Length    8, alignment  0/ 0:		124%
Length    8, alignment  0/ 0:		124%
Length   64, alignment  0/ 0:		128%
Length   64, alignment  6/ 4:		475%
Length   64, alignment  0/ 0:		131%
Length   64, alignment  0/ 0:		134%
Length   16, alignment  0/ 0:		128%
Length   16, alignment  0/ 0:		119%
Length   16, alignment  0/ 0:		128%
Length  128, alignment  0/ 0:		129%
Length  128, alignment  5/ 6:		475%
Length  128, alignment  0/ 0:		130%
Length  128, alignment  0/ 0:		129%
Length   32, alignment  0/ 0:		126%
Length   32, alignment  0/ 0:		126%
Length   32, alignment  0/ 0:		126%
Length  256, alignment  0/ 0:		127%
Length  256, alignment  4/ 8:		545%
Length  256, alignment  0/ 0:		126%
Length  256, alignment  0/ 0:		128%
Length   64, alignment  0/ 0:		171%
Length   64, alignment  0/ 0:		171%
Length   64, alignment  0/ 0:		174%
Length  512, alignment  0/ 0:		126%
Length  512, alignment  3/10:		585%
Length  512, alignment  0/ 0:		126%
Length  512, alignment  0/ 0:		127%
Length  128, alignment  0/ 0:		129%
Length  128, alignment  0/ 0:		128%
Length  128, alignment  0/ 0:		129%
Length 1024, alignment  0/ 0:		125%
Length 1024, alignment  2/12:		611%
Length 1024, alignment  0/ 0:		126%
Length 1024, alignment  0/ 0:		126%
Length  256, alignment  0/ 0:		128%
Length  256, alignment  0/ 0:		127%
Length  256, alignment  0/ 0:		128%
Length 2048, alignment  0/ 0:		125%
Length 2048, alignment  1/14:		625%
Length 2048, alignment  0/ 0:		125%
Length 2048, alignment  0/ 0:		125%
Length  512, alignment  0/ 0:		126%
Length  512, alignment  0/ 0:		127%
Length  512, alignment  0/ 0:		127%
Length 4096, alignment  0/ 0:		125%
Length 4096, alignment  0/16:		125%
Length 4096, alignment  0/ 0:		125%
Length 4096, alignment  0/ 0:		125%
Length 1024, alignment  0/ 0:		126%
Length 1024, alignment  0/ 0:		126%
Length 1024, alignment  0/ 0:		126%
Length 8192, alignment  0/ 0:		125%
Length 8192, alignment 63/18:		636%
Length 8192, alignment  0/ 0:		125%
Length 8192, alignment  0/ 0:		125%
Length   16, alignment  1/ 2:		317%
Length   16, alignment  1/ 2:		317%
Length   16, alignment  1/ 2:		317%
Length   32, alignment  2/ 4:		395%
Length   32, alignment  2/ 4:		395%
Length   32, alignment  2/ 4:		398%
Length   64, alignment  3/ 6:		475%
Length   64, alignment  3/ 6:		475%
Length   64, alignment  3/ 6:		477%
Length  128, alignment  4/ 8:		479%
Length  128, alignment  4/ 8:		479%
Length  128, alignment  4/ 8:		479%
Length  256, alignment  5/10:		543%
Length  256, alignment  5/10:		539%
Length  256, alignment  5/10:		543%
Length  512, alignment  6/12:		585%
Length  512, alignment  6/12:		585%
Length  512, alignment  6/12:		585%
Length 1024, alignment  7/14:		611%
Length 1024, alignment  7/14:		611%
Length 1024, alignment  7/14:		611%
2017-06-29 20:36:35 +02:00
Sebastian Pop 9938a64ca9 aarch64: optimize the unaligned case of memcmp
This brings to newlib a performance improvement that we developed in Bionic
libc.  That change has been submitted for review to Bionic libc:
https://android-review.googlesource.com/418279

A similar patch has been submitted for review in glibc:
https://sourceware.org/ml/libc-alpha/2017-06/msg01143.html

Patch written by Vikas Sinha and Sebastian Pop.

Performance was measured with bionic-benchmarks on a HiKey (aarch64, 8x A53)
board. The existing benchmarks show no performance change, and the new
memcmp benchmark for unaligned inputs shows an improvement. The new
benchmark has been submitted for review at
https://android-review.googlesource.com/414860

For the unaligned benchmark, performance improves by about 18% for the
smallest data set (8 bytes) and by a factor of about 4.5 for the largest
data set (64k).

The baseline run uses the libc from /system/lib64. The bionic libc
built with this patch is in /data.

hikey:/data # export LD_LIBRARY_PATH=/system/lib64
hikey:/data # ./bionic-benchmarks --benchmark_filter='BM_string_memcmp*'
Run on (8 X 2.4 MHz CPU s)
Benchmark                                Time           CPU Iterations
----------------------------------------------------------------------
BM_string_memcmp/8                      30 ns         30 ns   22955680    251.07MB/s
BM_string_memcmp/64                     57 ns         57 ns   12349184   1076.99MB/s
BM_string_memcmp/512                   305 ns        305 ns    2297163   1.56496GB/s
BM_string_memcmp/1024                  571 ns        571 ns    1225211   1.66912GB/s
BM_string_memcmp/8k                   4307 ns       4306 ns     162562   1.77177GB/s
BM_string_memcmp/16k                  8676 ns       8675 ns      80676   1.75887GB/s
BM_string_memcmp/32k                 19233 ns      19230 ns      36394   1.58695GB/s
BM_string_memcmp/64k                 36986 ns      36984 ns      18952   1.65029GB/s
BM_string_memcmp_aligned/8             199 ns        199 ns    3519166   38.3336MB/s
BM_string_memcmp_aligned/64            386 ns        386 ns    1810734   158.073MB/s
BM_string_memcmp_aligned/512          1735 ns       1734 ns     403981   281.525MB/s
BM_string_memcmp_aligned/1024         3200 ns       3200 ns     218838   305.151MB/s
BM_string_memcmp_aligned/8k          25084 ns      25080 ns      28180   311.507MB/s
BM_string_memcmp_aligned/16k         51730 ns      51729 ns      13521   302.057MB/s
BM_string_memcmp_aligned/32k        103228 ns     103228 ns       6782   302.727MB/s
BM_string_memcmp_aligned/64k        207117 ns     207087 ns       3450   301.806MB/s
BM_string_memcmp_unaligned/8           339 ns        339 ns    2070998   22.5302MB/s
BM_string_memcmp_unaligned/64         1392 ns       1392 ns     502796   43.8454MB/s
BM_string_memcmp_unaligned/512        9194 ns       9194 ns      76133   53.1104MB/s
BM_string_memcmp_unaligned/1024      18325 ns      18323 ns      38206   53.2963MB/s
BM_string_memcmp_unaligned/8k       148579 ns     148574 ns       4713   52.5831MB/s
BM_string_memcmp_unaligned/16k      298169 ns     298120 ns       2344   52.4118MB/s
BM_string_memcmp_unaligned/32k      598813 ns     598797 ns       1085    52.188MB/s
BM_string_memcmp_unaligned/64k     1196079 ns    1196083 ns        540   52.2539MB/s

hikey:/data # export LD_LIBRARY_PATH=/data
hikey:/data # ./bionic-benchmarks --benchmark_filter='BM_string_memcmp*'
Run on (8 X 2.4 MHz CPU s)
Benchmark                                Time           CPU Iterations
----------------------------------------------------------------------
BM_string_memcmp/8                      30 ns         30 ns   23209918   252.802MB/s
BM_string_memcmp/64                     57 ns         57 ns   12348447   1076.95MB/s
BM_string_memcmp/512                   305 ns        305 ns    2296878   1.56471GB/s
BM_string_memcmp/1024                  572 ns        571 ns    1224426    1.6689GB/s
BM_string_memcmp/8k                   4309 ns       4308 ns     162491   1.77109GB/s
BM_string_memcmp/16k                  9348 ns       9345 ns      74894   1.63285GB/s
BM_string_memcmp/32k                 18329 ns      18322 ns      38249    1.6656GB/s
BM_string_memcmp/64k                 36992 ns      36981 ns      18952   1.65045GB/s
BM_string_memcmp_aligned/8             199 ns        199 ns    3513925   38.3162MB/s
BM_string_memcmp_aligned/64            386 ns        386 ns    1814038   158.192MB/s
BM_string_memcmp_aligned/512          1735 ns       1735 ns     402279   281.502MB/s
BM_string_memcmp_aligned/1024         3204 ns       3202 ns     218761   304.941MB/s
BM_string_memcmp_aligned/8k          25577 ns      25569 ns      27406   305.548MB/s
BM_string_memcmp_aligned/16k         52143 ns      52123 ns      13522   299.769MB/s
BM_string_memcmp_aligned/32k        105169 ns     105127 ns       6637    297.26MB/s
BM_string_memcmp_aligned/64k        206508 ns     206383 ns       3417   302.835MB/s
BM_string_memcmp_unaligned/8           282 ns        282 ns    2482953    27.062MB/s
BM_string_memcmp_unaligned/64          542 ns        541 ns    1298317    112.77MB/s
BM_string_memcmp_unaligned/512        2152 ns       2152 ns     325267   226.915MB/s
BM_string_memcmp_unaligned/1024       4025 ns       4025 ns     173904   242.622MB/s
BM_string_memcmp_unaligned/8k        32276 ns      32271 ns      21818    242.09MB/s
BM_string_memcmp_unaligned/16k       65970 ns      65970 ns      10554   236.851MB/s
BM_string_memcmp_unaligned/32k      131241 ns     131242 ns       5129    238.11MB/s
BM_string_memcmp_unaligned/64k      266159 ns     266160 ns       2661   234.821MB/s
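
The relative change on the unaligned benchmark can be checked directly from
the two tables above. The following is a minimal sketch, not part of the
patch: the helper names speedup and print_unaligned_speedups are
hypothetical, and the timings are copied from the Time column of the
BM_string_memcmp_unaligned/8 and BM_string_memcmp_unaligned/64k rows.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: ratio of baseline time to patched time.
   A value of 2.0 means the patched libc needs half the time. */
static double speedup(double before_ns, double after_ns)
{
    assert(after_ns > 0.0);
    return before_ns / after_ns;
}

/* Print the ratios for the smallest and largest unaligned data sets.
   First argument: /system/lib64 run, second argument: /data run. */
static void print_unaligned_speedups(void)
{
    printf("unaligned/8:   %.2fx\n", speedup(339.0, 282.0));
    printf("unaligned/64k: %.2fx\n", speedup(1196079.0, 266159.0));
}
```

Calling print_unaligned_speedups() from a main function reports a ratio of
roughly 1.2 for the 8-byte case and roughly 4.5 for the 64k case.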
2017-06-26 10:22:40 +02:00
Prakhar Bahuguna 21ff2cf930 Fix minor issues in memchr NEON implementation 2017-06-07 12:16:15 +02:00
Sebastian Huber 2693c1db69 Move ARM access.c from machine to sys
The implementation of the POSIX access() function is not machine
specific, unlike memcpy() and similar functions.  Move it back to the
system domain.  This avoids problems caused by the include search order
of the Newlib/GCC build, which picks up machine includes before system
includes.
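
To illustrate why such a function belongs to the system domain, here is a
minimal sketch of an access()-like check written only in terms of stat().
It is an assumption for illustration and not the contents of access.c: the
name sketch_access is hypothetical, and the permission test is simplified
(it ignores the caller's uid and gid). Nothing in it depends on the CPU
architecture, only on the system interface.

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical sketch of an access()-like check built only on stat().
   Simplification: a permission counts as granted if any of the user,
   group, or other bits is set, so this is not a full POSIX access(). */
static int sketch_access(const char *path, int mode)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                      /* stat() has already set errno */

    /* F_OK is 0, so an existence-only query skips all three tests. */
    if ((mode & R_OK) && !(st.st_mode & (S_IRUSR | S_IRGRP | S_IROTH)))
        goto denied;
    if ((mode & W_OK) && !(st.st_mode & (S_IWUSR | S_IWGRP | S_IWOTH)))
        goto denied;
    if ((mode & X_OK) && !(st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)))
        goto denied;

    return 0;

denied:
    errno = EACCES;
    return -1;
}
```

For example, sketch_access(path, F_OK) returns 0 when the path exists and
-1 when stat() fails.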

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2017-05-25 12:34:53 -04:00
Prakhar Bahuguna c47c9bdc1b Optimise memchr for NEON-enabled processors 2017-04-06 18:19:20 +02:00
Catherine Moore 571c69656a Use .syntax unified instead of .syntax divided. 2017-03-30 17:18:12 +02:00
Kyrill Tkachov 52a6da816f arm: Fix addressing in optpld macro
Patch b219285f87 introduced a syntax error in the PLD instruction.  The
pld argument must be written in square brackets because it is a memory
address, like so: pld [r1].  With that patch the newlib build fails for
armv7-a targets.  This patch fixes the build failures.
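
The bracketed operand form can be sketched in GNU C inline assembly as
follows. This is an illustration under stated assumptions, not the code in
strcpy.c or strlen-stub.c: the names prefetch_hint and strlen_with_prefetch
are hypothetical, the __ARM_ARCH >= 7 guard is an assumption chosen so that
pld is known to be available, and non-ARM targets fall back to the
GCC/Clang builtin so the example still builds.

```c
#include <stddef.h>

/* Hypothetical helper: issue a data prefetch hint for the address p. */
static inline void prefetch_hint(const void *p)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    /* The operand is a memory address, so it goes in square brackets:
       this expands to something like "pld [r1]", never "pld r1". */
    __asm__ volatile ("pld [%0]" : : "r" (p));
#elif defined(__GNUC__)
    __builtin_prefetch(p);      /* GCC/Clang fallback on other targets */
#else
    (void) p;                   /* no prefetch hint available */
#endif
}

/* Byte-at-a-time strlen that prefetches the start of the string first. */
static size_t strlen_with_prefetch(const char *s)
{
    const char *end = s;

    prefetch_hint(s);
    while (*end != '\0')
        end++;
    return (size_t) (end - s);
}
```

The prefetch is only a hint, so the result of strlen_with_prefetch() is the
same whether or not the target honors it.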

Tested by making sure the newlib build completes successfully.

2016-01-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

    * libc/machine/arm/strcpy.c (strcpy): Fix PLD assembly syntax.
    * libc/machine/arm/strlen-stub.c (strlen): Likewise.
2017-01-26 16:29:36 +01:00
Pat Pannuto 3ebc26958e arm: Remove RETURN macro
LTO can re-order top-level assembly blocks, which can cause this
macro definition to appear after its use (or not at all), causing
compilation failures. On modern toolchains (armv4t+), assembly
should write `bx lr` in all cases, and linkers will transparently
convert them to `mov pc, lr`, allowing us to simply remove the
macro.
  (source: https://groups.google.com/forum/#!topic/comp.sys.arm/3l7fVGX-Wug
   and verified empirically)

For the armv4.S file, preserve this macro to maximize backwards
compatibility.
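
The difference between the fragile and the robust pattern can be sketched
as follows. This is a minimal illustration, not code from newlib: the
function name add_one is hypothetical, the ARM branch assumes a GNU
toolchain targeting ARM state on armv4t or later, and other targets fall
back to plain C so the example still builds.

```c
/* Fragile pattern (described only, not compiled here): one file-scope asm
   block defines ".macro RETURN", and a different asm block uses RETURN.
   If LTO reorders or drops the defining block, the use fails to assemble. */

#if defined(__arm__) && !defined(__thumb__)
/* Robust pattern: the asm block is self-contained and returns with a
   literal "bx lr", so it does not depend on any other top-level block. */
__asm__ (
    "    .text\n"
    "    .arm\n"
    "    .align 2\n"
    "    .global add_one\n"
    "    .type add_one, %function\n"
    "add_one:\n"
    "    add r0, r0, #1\n"
    "    bx lr\n"
    "    .size add_one, . - add_one\n"
);
int add_one(int x);
#else
/* Fallback for non-ARM hosts so the sketch compiles everywhere. */
int add_one(int x)
{
    return x + 1;
}
#endif
```

Either branch gives add_one(41) == 42; only the ARM branch exercises the
assembly form discussed above.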
2017-01-25 13:32:09 +01:00
Pat Pannuto b219285f87 arm: Remove optpld macro
LTO can re-order top-level assembly blocks, which can cause this
macro definition to appear after its use (or not at all), causing
compilation failures. As the macro has very few uses, simply removing
it by inlining is a simple fix.

N.B. one of the macro invocations in strlen-stub.c was already
guarded by the relevant #define, so it is simply converted directly
to a pld.
2017-01-25 13:32:09 +01:00
Pat Pannuto e7332409cc Remove unneeded references to arm_asm.h
This should result in no functional changes; it simply removes references
to arm_asm.h from files that did not use anything from that file.
2017-01-25 13:32:09 +01:00
Jeff Johnston 61f181d6b8 Bump release to 2.5.0 for yearly snapshot. 2016-12-22 21:33:54 -05:00