Update AArch64 assembly string routines from:
https://github.com/ARM-software/optimized-routines
commit 0cf84f26b6b8dcad8287fe30a4dcc1fdabd06560
Author: Sebastian Huber <sebastian.huber@embedded-brains.de>
Date: Thu Jul 27 17:14:57 2023 +0200
string: Fix corrupt GNU_PROPERTY_TYPE (5) size
For ELF32 the notes alignment is 4 and not 8.
Add license and copyright information to COPYING.NEWLIB as entry (56).
The mutually misaligned inputs on aarch64 are compared with a simple
byte copy, which is not very efficient. Enhance the comparison
similar to strcmp by loading a double-word at a time. The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark in glibc is as follows:
falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)
All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.