In order to maintain consistency both within machine/arm and between
machine/arm and machine/aarch64, rename the 'c' stub to -stub.c.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
This patch flattens the condition code selection used in strlen in an
attempt to make the guarding condition for each alternative
implementation clearer and to structure the logic in a manner that
makes it easier to maintain complementary logic between the
alternative 'C' and assembler implementations.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
ARM newlib has various strcmp implementations that use .cfi_*
directives to generate unwind information.
The effect of this is that the generated objects contain .eh_frame
sections. However, ARM uses its own unwind info format, not
.eh_frame, which is generated by ARM-specific directives, not .cfi_*.
The .eh_frame sections are useless, but also not removed by strip and
may be loaded into memory at runtime.
This patch fixes this by using .cfi_sections .debug_frame (as in
glibc) so that the directives generate .debug_frame instead.
.debug_frame is useful for the debugger, can be removed by strip, and
is not loaded into memory at runtime.
* libc/machine/arm/strcmp-arm-tiny.S: Use .cfi_sections
.debug_frame.
* libc/machine/arm/strcmp-armv4.S: Likewise.
* libc/machine/arm/strcmp-armv4t.S: Likewise.
* libc/machine/arm/strcmp-armv6.S: Likewise.
* libc/machine/arm/strcmp-armv6m.S: Likewise.
* libc/machine/arm/strcmp-armv7.S: Likewise.
* libc/machine/arm/strcmp-armv7m.S: Likewise.
The patch cleans up the auto configury mechanism used to select
different implementations of memchr for various architecture versions.
The approach here is to remove the selection of memchr within automake
and instead use complimentary logic in memchr-stub.c and memchr.S to
choose between the gerneric memchr.c implementation or one of the
architecture specific implementations.
This patch also changes the selection criteria inline with the
previous proposal here:
https://sourceware.org/ml/newlib/2015/msg00752.html
but using the ACLE predefines.
Regressed for armv7-a armv5 armv8-a, correct selection of memcpy
implementation by manual inspection of a test program built for these
three architectures.
This patch cleans up the auto configury mechanism used to select
different implementations of memcpy for various architecture versions.
The approach here is to remove the selection of memcpy within automake
and instead use complimentary logic in memcpy-stub.c and memcpy.S to
choose between the generic memcpy.c implemenation or one of the
architecture specific memcpy*.S implemenations.
Regressed for armv7-a armv5 armv8-a, correct selection of memcpy
implementation by manual inspection of a test program built for these
three architectures.
This revised patch flips the remaining preprocessor logic in
memcpy-stub.c to use ACLE defines as requested in the previous review
and removes the now disused HAVE_ARMV7A and HAVE_ARMV8A configure.in
support.
The newlib configury logic that detects architecture version and
chooses an appropriate memcpy implementation does not consider
ARMv8-a.
This patch adds configury logic to detect ARMv8-a along with the
associated changes in Makefile.am and memcpy.
Hi!
I've got the situation, that the function strlen() occurs twice in libc.a
(building newlib for ARM-V7a and Size-Optimized).
In newlib/libc/machine/arm/strlen.c there are the pre-processor stetements ...
#if defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED) || \
(defined (__thumb__) && !defined (__thumb2__))
/*...*/
#else
#if !(defined(_ISA_ARM_7) || defined(__ARM_ARCH_6T2__))
/*...*/
#endif
and in newlib/libc/machine/arm/strlen-armv7.S the "exclude" begins with
/* NOTE: This ifdef MUST match the ones in arm/strlen.c
We fallback to the one in arm/strlen.c for size optimised or
for older architectures. */
#if defined(_ISA_ARM_7) || defined(__ARM_ARCH_6T2__) && \
!(defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED) || \
(defined (__thumb__) && !defined (__thumb2__)))
But this is not completely contrary to arm/strlen.c (see above)!
To fix the logical statement in arm/strlen-armv7.S there are parentheses needed
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This is an optimized memset for AArch64. Memset is split into 4 main
cases: small sets of up to 16 bytes, medium of 16..96 bytes which are
fully unrolled. Large memsets of more than 96 bytes align the
destination and use an unrolled loop processing 64 bytes per
iteration. Memsets of zero of more than 256 use the dc zva
instruction, and there are faster versions for the common ZVA sizes 64
or 128. STP of Q registers is used to reduce codesize without loss of
performance.
This is an optimized memset for AArch64. Memset is split into 4 main
cases: small sets of up to 16 bytes, medium of 16..96 bytes which are
fully unrolled. Large memsets of more than 96 bytes align the
destination and use an unrolled loop processing 64 bytes per
iteration. Memsets of zero of more than 256 use the dc zva
instruction, and there are faster versions for the common ZVA sizes 64
or 128. STP of Q registers is used to reduce codesize without loss of
performance.
This is an optimized memcpy for AArch64. Copies are split into 3 main
cases: small copies of up to 16 bytes, medium copies of 17..96 bytes
which are fully unrolled. Large copies of more than 96 bytes align
the destination and use an unrolled loop processing 64 bytes per
iteration. In order to share code with memmove, small and medium
copies read all data before writing, allowing any kind of overlap. On
a random copy test memcpy is 40.8% faster on A57 and 28.4% on A53.
This is an optimized memmove for AArch64. All copies of up to 96
bytes and all backward copies are done by the new memcpy. The only
remaining case is large forward copies which are done in the same way
as the memcpy loop, but copying from the end rather than the start.
improvements. Adjust to allow building as stpcpy.
* libc/machine/aarch64/stpcpy.S: New file.
* libc/machine/aarch64/stpcpy-stub.c: New file.
* libc/machine/aarch64/Makefile.am (lib_a_SOURCES): Build stpcpy.
* libc/machine/aarch64/Makefile.in: Regenerated.
* libc/machine/aarch64/strrchr-stub.c: New file.
* libc/machine/aarch64/Makefile.am: Add them to build list.
* libc/machine/aarch64/Makefile.in: Regenerated.
from the 64-bit _JBTYPE definition.
* libc/machine/mips/setjmp.S: Re-work the o32 FP64 support to match
the now one-and-only supported o32 FP64 ABI extension. Also
support o32 FPXX.
Found by:
find -name '*.h' |xargs grep -i 'attribute.*(([a-z]'
For an example of the type of bugs this causes, try compiling this valid
C11 program (it's valid because 'noreturn' is reserved for use in the
user namespace unless you include <stdnoreturn.h>):
$ cat foo.c
#define noreturn __attribute__((noreturn))
#include <stdlib.h>
$ gcc -c -o foo.o -Wall foo.c
In file included from /usr/include/stdlib.h:11:0,
from foo.c:2:
foo.c:1:18: error: expected ')' before '__attribute__'
#define noreturn __attribute__((noreturn))
^
/usr/include/stdlib.h:66:28: error: expected ',' or ';' before ')' token
_VOID _EXFUN(abort,(_VOID) _ATTRIBUTE ((noreturn)));
^
* libc/machine/spu/spu_timer_internal.h: Decorate attribute names
with __, for namespace safety.
* libc/machine/xscale/machine/profile.h: Likewise.
* libc/include/stdlib.h: Likewise.
* libc/include/_ansi.h: Likewise.
* libc/include/sys/unistd.h: Likewise.
* libc/sys/linux/linuxthreads/libc-symbols.h: Likewise.
* libc/sys/linux/linuxthreads/internals.h: Likewise.
* libc/sys/linux/machine/i386/weakalias.h: Likewise.
* libc/sys/linux/machine/i386/dl-procinfo.h: Likewise.
* libc/sys/linux/machine/i386/dl-machine.h: Likewise.
* libc/sys/linux/libc-symbols.h: Likewise.
* libc/sys/linux/iconv/gconv_charset.h: Likewise.
* libc/sys/linux/include/resolv.h: Likewise.
* libc/sys/linux/sys/unistd.h: Likewise.
* libc/sys/linux/dl/atomicity.h: Likewise.
* libc/sys/linux/dl/dynamic-link.h: Likewise.
* libc/sys/linux/dl/ldsodefs.h: Likewise.