2016-04-18 Thomas Preud'homme <thomas.preudhomme@arm.com>
* libc/machine/arm/strlen-stub.c: Check capabilities of architecture
to decide which Thumb implementation to use and fall back to C
implementation for architecture not supporting Thumb mode.
* libc/machine/arm/strlen.S: Likewise.
Introduce <machine/_endian.h> to allow target-specific customization of
<machine/endian.h> via the
* _LITTLE_ENDIAN,
* _BIG_ENDIAN,
* _PDP_ENDIAN, and
* _BYTE_ORDER.
defines. Add definitions expected by FreeBSD to
<machine/endian.h> like
* _QUAD_HIGHWORD,
* _QUAD_LOWWORD,
* __bswap16(),
* __bswap32(),
* __bswap64(),
* __htonl(),
* __htons(),
* __ntohl(), and
* __ntohs().
Also, if __BSD_VISIBLE is set, define
* LITTLE_ENDIAN,
* BIG_ENDIAN,
* PDP_ENDIAN, and
* BYTE_ORDER.
Targets that define __machine_host_to_from_network_defined in
<machine/_endian.h> must provide their own implementation of
* __htonl(),
* __htons(),
* __ntohl(), and
* __ntohs(),
otherwise a default implementation is provided by <machine/endian.h>.
With GCC, defines that map to the compiler builtins are used.
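The fallback described above can be sketched roughly as follows. This is
only an illustration: the sketch_* names and the
SKETCH_TARGET_PROVIDES_NET_ORDER guard are hypothetical and are not the
identifiers used by <machine/endian.h>, and the use of
__builtin_bswap16/__builtin_bswap32 assumes a GCC-compatible compiler.

```c
#include <stdint.h>

/* Hypothetical stand-ins for __bswap16()/__bswap32(). */
static inline uint16_t sketch_bswap16(uint16_t x)
{
#if defined(__GNUC__)
    /* Assumption: a GCC-compatible compiler that provides this builtin. */
    return __builtin_bswap16(x);
#else
    return (uint16_t)((x >> 8) | (x << 8));
#endif
}

static inline uint32_t sketch_bswap32(uint32_t x)
{
#if defined(__GNUC__)
    return __builtin_bswap32(x);
#else
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* A target that supplies its own conversions would define this
   (hypothetical) guard and provide the four functions itself. */
#ifndef SKETCH_TARGET_PROVIDES_NET_ORDER

/* Runtime probe so the sketch does not depend on any endian macro. */
static inline int sketch_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Network byte order is big-endian: swap only on little-endian hosts. */
static inline uint32_t sketch_htonl(uint32_t x)
{
    return sketch_is_little_endian() ? sketch_bswap32(x) : x;
}

static inline uint16_t sketch_htons(uint16_t x)
{
    return sketch_is_little_endian() ? sketch_bswap16(x) : x;
}

/* The conversion is its own inverse. */
static inline uint32_t sketch_ntohl(uint32_t x) { return sketch_htonl(x); }
static inline uint16_t sketch_ntohs(uint16_t x) { return sketch_htons(x); }

#endif /* !SKETCH_TARGET_PROVIDES_NET_ORDER */
```

With this sketch, sketch_bswap32(0x11223344u) yields 0x44332211u on any
host, and the first byte in memory of sketch_htonl(0x01020304u) is 0x01
regardless of the host byte order.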
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
libgloss:
* arm/Makefile.in: Add newlib/libc/machine/arm to the include path if
newlib is present.
* arm/arm.h: Include acle-compat.h.
(THUMB_V7_V6M): Rename to ...
(PREFER_THUMB): This. Use ACLE macro __ARM_ARCH_ISA_ARM instead of
__ARM_ARCH_6M__ to decide whether to define it.
(THUMB1_ONLY): Define for Thumb-1 only targets.
(THUMB_V7M_V6M): Rename to ...
(THUMB_VXM): This. Defined based on __ARM_ARCH_ISA_ARM, excluding
ARMv7.
* arm/crt0.S: Use THUMB1_ONLY rather than __ARM_ARCH_6M__,
!__ARM_ARCH_ISA_ARM rather than THUMB_V7M_V6M for fp enabling, and
PREFER_THUMB rather than THUMB_V7_V6M. Rename other occurrences of
THUMB_V7M_V6M to THUMB_VXM.
* arm/linux-crt0.c: Likewise.
* arm/redboot-crt0.S: Likewise.
* arm/swi.h: Likewise.
* arm/trap.S: Likewise.
newlib:
* libc/machine/arm/memcpy-stub.c: Use ACLE macros __ARM_ARCH_ISA_THUMB
and __ARM_ARCH_ISA_ARM to check for Thumb-2 only targets rather than
__ARM_ARCH and __ARM_ARCH_PROFILE.
* libc/machine/arm/memcpy.S: Likewise.
* libc/machine/arm/setjmp.S: Likewise for Thumb-1 only target and
include acle-compat.h.
* libc/machine/arm/strcmp.S: Likewise for Thumb-1 and Thumb-2 only
target and include acle-compat.h.
* libc/sys/arm/arm.h: Include acle-compat.h.
(THUMB_V7_V6M): Rename to ...
(PREFER_THUMB): This. Use ACLE macro __ARM_ARCH_ISA_ARM instead of
__ARM_ARCH_6M__ to decide whether to define it.
(THUMB1_ONLY): Define for Thumb-1 only targets.
(THUMB_V7M_V6M): Rename to ...
(THUMB_VXM): This. Defined based on __ARM_ARCH_ISA_ARM, excluding
ARMv7.
* libc/sys/arm/crt0.S: Use PREFER_THUMB rather than THUMB_V7_V6M and
rename THUMB_V7M_V6M into THUMB_VXM.
* libc/sys/arm/swi.h: Likewise.
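As an illustration of the feature-based checks named in the entries
above, the following sketch classifies the compilation target from the
two ACLE predefines alone. The SKETCH_* names and sketch_classify() are
hypothetical and are not the macros defined in arm.h; the sketch relies
only on the ACLE rule that __ARM_ARCH_ISA_ARM is defined when the ARM
instruction set is available and that __ARM_ARCH_ISA_THUMB is 1 for
Thumb-1 and 2 for Thumb-2. On older compilers that do not predefine
these macros, something like acle-compat.h would have to be included
first.

```c
/* Hypothetical classification; not newlib's actual macros. */
enum sketch_isa_class {
    SKETCH_NOT_ARM32,    /* neither ACLE ISA macro is defined */
    SKETCH_ARM_CAPABLE,  /* ARM instruction set is available */
    SKETCH_THUMB1_ONLY,  /* only Thumb-1 is available */
    SKETCH_THUMB2_ONLY   /* Thumb-2 is available, ARM is not */
};

static inline enum sketch_isa_class sketch_classify(void)
{
#if defined(__ARM_ARCH_ISA_ARM)
    /* ARM state exists, whether or not Thumb is also supported. */
    return SKETCH_ARM_CAPABLE;
#elif defined(__ARM_ARCH_ISA_THUMB) && __ARM_ARCH_ISA_THUMB == 1
    return SKETCH_THUMB1_ONLY;
#elif defined(__ARM_ARCH_ISA_THUMB) && __ARM_ARCH_ISA_THUMB == 2
    return SKETCH_THUMB2_ONLY;
#else
    /* Not a 32-bit ARM target, or a compiler without ACLE predefines. */
    return SKETCH_NOT_ARM32;
#endif
}
```

The point of testing features rather than names such as
__ARM_ARCH_6M__ is that a new Thumb-only architecture falls into the
right branch without the list of architecture names being extended.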
Reformulate the strcmp-armv7.S selection logic around the architecture
features required by the implementation code rather than (some)
version of the architecture that exposes those features.
The patch moves the inline ASM thumb2 -Os implementation out into its
own .S file.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
The patch moves the inline ASM thumb1 -O2 implementation out into its
own .S file.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
The patch adds strlen.S to contain the complementary preprocessor
logic to strlen-stub.c intended to provide #inclusion of alternative
.S implementations.
Initially we just include the existing strlen-armv7.S implementation.
We rewrite _ISA_ARMV7 in both strlen.S and strlen-stub.c to use the
existing underlying definition from arm_asm.h in order to avoid
including that file. This is in effect the first step towards a move
to ACLE predefines only.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
In order to maintain consistency both within machine/arm and between
machine/arm and machine/aarch64, rename the 'c' stub to -stub.c.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
This patch flattens the condition code selection used in strlen in an
attempt to make the guarding condition for each alternative
implementation clearer and to structure the logic in a manner that
makes it easier to maintain complementary logic between the
alternative 'C' and assembler implementations.
Tested by building newlib and comparing libc.a binaries before and
after for all permutations of:
Architectures:
armv4 armv4t armv5 armv5t armv5te armv6 armv6j armv6k
armv6z armv6kz armv6t2 armv6-m armv6s-m armv7 armv7-a
armv7ve armv7-r armv7-m armv7e-m armv8-a iwmmxt iwmmxt2
ISAs:
thumb arm
Optimization Levels:
Os O2
Excluding:
armv6s-m -mthumb
armv6-m -mthumb
armv6zk -mthumb
armv6z -mthumb
armv6k -mthumb
armv6j -mthumb
ARM newlib has various strcmp implementations that use .cfi_*
directives to generate unwind information.
The effect of this is that the generated objects contain .eh_frame
sections. However, ARM uses its own unwind info format, not
.eh_frame, which is generated by ARM-specific directives, not .cfi_*.
The .eh_frame sections are useless, but also not removed by strip and
may be loaded into memory at runtime.
This patch fixes this by using .cfi_sections .debug_frame (as in
glibc) so that the directives generate .debug_frame instead.
.debug_frame is useful for the debugger, can be removed by strip, and
is not loaded into memory at runtime.
* libc/machine/arm/strcmp-arm-tiny.S: Use .cfi_sections
.debug_frame.
* libc/machine/arm/strcmp-armv4.S: Likewise.
* libc/machine/arm/strcmp-armv4t.S: Likewise.
* libc/machine/arm/strcmp-armv6.S: Likewise.
* libc/machine/arm/strcmp-armv6m.S: Likewise.
* libc/machine/arm/strcmp-armv7.S: Likewise.
* libc/machine/arm/strcmp-armv7m.S: Likewise.
The patch cleans up the auto configury mechanism used to select
different implementations of memchr for various architecture versions.
The approach here is to remove the selection of memchr within automake
and instead use complementary logic in memchr-stub.c and memchr.S to
choose between the generic memchr.c implementation or one of the
architecture specific implementations.
This patch also changes the selection criteria in line with the
previous proposal here:
https://sourceware.org/ml/newlib/2015/msg00752.html
but using the ACLE predefines.
Regression tested for armv7-a, armv5 and armv8-a; correct selection of
the memcpy implementation was checked by manual inspection of a test
program built for these three architectures.
This patch cleans up the auto configury mechanism used to select
different implementations of memcpy for various architecture versions.
The approach here is to remove the selection of memcpy within automake
and instead use complementary logic in memcpy-stub.c and memcpy.S to
choose between the generic memcpy.c implementation or one of the
architecture specific memcpy*.S implementations.
Regression tested for armv7-a, armv5 and armv8-a; correct selection of
the memcpy implementation was checked by manual inspection of a test
program built for these three architectures.
This revised patch flips the remaining preprocessor logic in
memcpy-stub.c to use ACLE defines as requested in the previous review
and removes the now disused HAVE_ARMV7A and HAVE_ARMV8A configure.in
support.
The newlib configury logic that detects architecture version and
chooses an appropriate memcpy implementation does not consider
ARMv8-a.
This patch adds configury logic to detect ARMv8-a along with the
associated changes in Makefile.am and memcpy.
Hi!
I have a situation where the function strlen() occurs twice in libc.a
(when building newlib for ARMv7-A with size optimization).
In newlib/libc/machine/arm/strlen.c there are the preprocessor statements ...
#if defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED) || \
(defined (__thumb__) && !defined (__thumb2__))
/*...*/
#else
#if !(defined(_ISA_ARM_7) || defined(__ARM_ARCH_6T2__))
/*...*/
#endif
and in newlib/libc/machine/arm/strlen-armv7.S the "exclude" begins with
/* NOTE: This ifdef MUST match the ones in arm/strlen.c
We fallback to the one in arm/strlen.c for size optimised or
for older architectures. */
#if defined(_ISA_ARM_7) || defined(__ARM_ARCH_6T2__) && \
!(defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED) || \
(defined (__thumb__) && !defined (__thumb2__)))
But this is not the exact complement of the condition in arm/strlen.c (see above)!
To fix the logical expression in arm/strlen-armv7.S, parentheses are needed.
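
The mismatch can be checked mechanically. The sketch below models the
two guards as boolean functions and counts the configurations in which
both files would define strlen(). The sketch_* names are hypothetical;
"small" stands for the whole size-optimised or Thumb-1 test. Because &&
binds more tightly than ||, the guard as written in strlen-armv7.S
parses as shown in sketch_asm_as_written().

```c
#include <stdbool.h>

typedef bool (*sketch_cond)(bool isa_arm_7, bool arch_6t2, bool small);

/* arm/strlen.c defines strlen() when small, or when neither
   _ISA_ARM_7 nor __ARM_ARCH_6T2__ is defined. */
static inline bool sketch_c_file(bool isa_arm_7, bool arch_6t2, bool small)
{
    return small || !(isa_arm_7 || arch_6t2);
}

/* Guard as written: A || B && !S is parsed as A || (B && !S). */
static inline bool sketch_asm_as_written(bool isa_arm_7, bool arch_6t2,
                                         bool small)
{
    return isa_arm_7 || (arch_6t2 && !small);
}

/* Guard with the added parentheses: (A || B) && !S. */
static inline bool sketch_asm_parenthesized(bool isa_arm_7, bool arch_6t2,
                                            bool small)
{
    return (isa_arm_7 || arch_6t2) && !small;
}

/* Number of configurations where both files define strlen(). */
static inline int sketch_double_definitions(sketch_cond asm_file)
{
    int count = 0;
    for (int bits = 0; bits < 8; bits++) {
        bool a = bits & 1, b = bits & 2, s = bits & 4;
        if (sketch_c_file(a, b, s) && asm_file(a, b, s))
            count++;
    }
    return count;
}

/* Number of configurations where neither file defines strlen(). */
static inline int sketch_missing_definitions(sketch_cond asm_file)
{
    int count = 0;
    for (int bits = 0; bits < 8; bits++) {
        bool a = bits & 1, b = bits & 2, s = bits & 4;
        if (!sketch_c_file(a, b, s) && !asm_file(a, b, s))
            count++;
    }
    return count;
}
```

In this model the guard as written gives two double definitions (both
with _ISA_ARM_7 defined and "small" true, which matches the reported
ARMv7-A size-optimised build), while the parenthesized guard gives none
and leaves no configuration without a definition.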
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
* libc/machine/arm/strcmp-armv4.S: New file.
* libc/machine/arm/strcmp-armv4t.S: New file.
* libc/machine/arm/strcmp-armv6.S: New file.
* libc/machine/arm/strcmp-armv7.S: New file.
* libc/machine/arm/strcmp-armv7m.S: New file.
* libc/machine/arm/strcmp.S: Replace with wrapper for various
implementations.
* libc/machine/arm/Makefile.am (strcmp.o, strcmp.obj): Add
dependencies.
* libc/machine/arm/Makefile.in: Regenerated.
Adjust the conditions for entering the aligned copy loop to
improve performance on mutually misaligned buffer copies.
2013-07-01 Will Newton <will.newton@linaro.org>
* libc/machine/arm/memcpy-armv7a.S: Adjust entry to
aligned loop to improve misaligned copy performance.
Import the latest version of strlen from the Linaro cortex-strings
package. This version is faster across a variety of block size and
alignments on ARMv7.
newlib/ChangeLog:
2013-06-21 Will Newton <will.newton@linaro.org>
* libc/machine/arm/strlen-armv7.S: Import latest strlen
code from Linaro cortex-strings.
* libc/machine/arm/memcpy-stub.c: Use generic memcpy if unaligned
access is not enabled.
* libc/machine/arm/memcpy.S: Faster memcpy implementation for
Cortex A15 cores using NEON and VFP if available.
memchr.S.
* libc/machine/arm/arm_asm.h: Add ifdef to allow it to be included
in .S files.
* libc/machine/arm/memchr-stub.c: New file - just selects what to
compile.
* libc/machine/arm/memchr.S: New file - ARMv6t2/v7 version.
* libc/machine/arm/Makefile.am (lib_a_SOURCES): Add strlen-armv7.S.
* libc/machine/arm/strlen-armv7.S: New file.
* libc/machine/arm/strlen.c: Add ifdef around the optimised code so
it isn't built for v7 or 6t2.
* libc/machine/arm/Makefile.in: Regenerate.
memcpy function optimized for the cortex-a15.
* libc/machine/arm/memcpy-stub.c: New file.
* libc/machine/arm/Makefile.am (lib_a_SOURCES): Add memcpy-stub.c,
memcpy.S.
* libc/machine/arm/Makefile.in: Regenerate.