newlib-cygwin

mirror of git://sourceware.org/git/newlib-cygwin.git synced 2025-02-22 08:46:17 +08:00


Wilco Dijkstra 473f1a3a5d Improve performance of strstr

v3: Add support for read ahead using strnlen, giving an additional 25% speedup
on large inputs (both short and long needles).

This patch significantly improves performance of strstr by using Sunday's
Quick-Search algorithm.  Due to its simplicity it has the best average
performance of string matching algorithms on almost all inputs.  It uses a
bad-character shift table to skip past mismatches.
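As an illustration of the bad-character shift idea, here is a minimal sketch of Sunday's Quick-Search over length-delimited buffers. It is not the code from this commit: the function name `quick_search` and its signature are hypothetical, the table uses `size_t` entries for clarity, and 8-bit chars are assumed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the newlib implementation.
   Assumes CHAR_BIT == 8, so 256 table entries cover every byte value. */
static const char *
quick_search (const char *hs, size_t hs_len, const char *ne, size_t ne_len)
{
  size_t shift[256];
  size_t i;

  if (ne_len == 0)
    return hs;
  if (hs_len < ne_len)
    return NULL;

  /* A byte that does not occur in the needle lets us skip past it.  */
  for (i = 0; i < 256; i++)
    shift[i] = ne_len + 1;
  /* Later positions overwrite earlier ones, so the rightmost
     occurrence of each byte determines the shift.  */
  for (i = 0; i < ne_len; i++)
    shift[(unsigned char) ne[i]] = ne_len - i;

  i = 0;
  while (i + ne_len <= hs_len)
    {
      if (memcmp (hs + i, ne, ne_len) == 0)
        return hs + i;
      if (i + ne_len == hs_len)
        break;  /* no byte after the window to look up */
      /* Shift according to the byte just past the current window.  */
      i += shift[(unsigned char) hs[i + ne_len]];
    }
  return NULL;
}
```

The shift depends only on the byte immediately after the current window, which is what keeps the inner loop so simple.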

The needle length is limited to 254, so every shift value fits in a single
byte.  This reduces the shift table memory 4 to 8 times, lowering
preprocessing overhead and minimizing cache effects.  Because the needle
length is bounded by a constant, the worst-case performance is linear in the
haystack length.
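To make the memory argument concrete, the sketch below builds a byte-wide shift table. The names `build_shift_table` and `MAX_QS_NEEDLE` are hypothetical and 8-bit chars are assumed; only the limit of 254 comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_QS_NEEDLE 254  /* limit stated in the commit message */

/* Hypothetical helper: fill a byte-wide bad-character table.
   Returns 1 on success, 0 if the needle is empty or too long
   for byte-sized shifts.  Assumes CHAR_BIT == 8.  */
static int
build_shift_table (uint8_t shift[256], const unsigned char *ne, size_t ne_len)
{
  size_t i;

  if (ne_len == 0 || ne_len > MAX_QS_NEEDLE)
    return 0;

  /* The largest shift is ne_len + 1 <= 255, so it fits in one byte.  */
  memset (shift, (int) (ne_len + 1), 256);
  for (i = 0; i < ne_len; i++)
    shift[ne[i]] = (uint8_t) (ne_len - i);
  return 1;
}
```

With `uint8_t` entries the table occupies 256 bytes. The same table with `size_t` entries would take 1024 or 2048 bytes on targets with a 4-byte or 8-byte `size_t`, which is presumably where the 4 to 8 times figure comes from.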

Larger needles are processed by the Two-Way algorithm.  The macro AVAILABLE
has been improved to use strnlen to read the input in chunks.  This results
in a 2.5 times speedup for large needles, reducing the performance drop when
the Quick-Search algorithm can't be used.
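The chunked read-ahead can be sketched as a function rather than a macro. This is an assumption-laden illustration, not the actual AVAILABLE macro: the name `available`, the `READ_AHEAD` constant and its value are hypothetical, and the caller is assumed to keep `pos <= *known_len`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for strnlen */
#endif
#include <stddef.h>
#include <string.h>

#define READ_AHEAD 512  /* hypothetical chunk size */

/* Return nonzero if the `need` bytes starting at hs[pos] all lie before
   the terminating NUL.  *known_len counts the haystack bytes already
   known to be non-NUL and is extended in chunks when required.
   Assumes pos <= *known_len.  */
static int
available (const char *hs, size_t *known_len, size_t pos, size_t need)
{
  if (pos + need <= *known_len)
    return 1;  /* already scanned far enough, no memory access needed */

  /* Scan ahead by more than strictly needed so that later calls
     usually hit the fast path above.  strnlen stops at the NUL.  */
  *known_len += strnlen (hs + *known_len, need + READ_AHEAD);
  return pos + need <= *known_len;
}
```

Scanning a whole chunk at once amortizes the cost of the NUL search over many window positions instead of rescanning at every step.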

The code for 1-4 byte needles has been simplified and now uses unsigned
char.  Since the optimized code relies on 8-bit chars, we defer to the
size-optimized implementation if CHAR_BIT > 8.
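The following sketch shows one way a 2-byte needle search can rely on `unsigned char` and be guarded by a `CHAR_BIT` check. The function `find2` is hypothetical and requires a needle of exactly two non-NUL bytes; in the `CHAR_BIT > 8` branch the library `strstr` merely stands in for a size-optimized fallback.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if CHAR_BIT > 8
/* Stand-in for a generic fallback when chars are wider than 8 bits.  */
static const char *
find2 (const char *hs, const char *ne)
{
  return strstr (hs, ne);
}
#else
/* Hypothetical helper: ne must hold exactly two non-NUL bytes.  */
static const char *
find2 (const char *hs, const char *ne)
{
  /* unsigned char avoids sign extension for bytes >= 0x80, which
     would otherwise corrupt the packed 16-bit values below.  */
  const unsigned char *h = (const unsigned char *) hs;
  const unsigned char *n = (const unsigned char *) ne;
  uint32_t want = ((uint32_t) n[0] << 8) | n[1];
  uint32_t have;

  if (h[0] == 0)
    return NULL;
  have = h[0];
  for (h++; *h != 0; h++)
    {
      /* Keep the last two haystack bytes in a rolling 16-bit window.  */
      have = ((have << 8) | *h) & 0xffff;
      if (have == want)
        return (const char *) h - 1;
    }
  return NULL;
}
#endif
```

Packing two bytes into 16 bits only works when each char is exactly 8 bits wide, which is why the optimized path is compiled out otherwise.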

On AArch64, finding a set of randomly chosen 8-character words in 256 bytes
of English text is 14 times faster.  For longer haystacks the gain is well
over 20 times.

The size-optimized strstr has also been rewritten from scratch to improve
performance.  On the same test the performance gain is 69%.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

--

2018-10-18 19:51:39 +02:00

ambiguous.t

generated width data, Unicode 10.0

2018-03-12 10:17:20 +01:00

bcmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

bcopy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

bzero.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

combining.t

generated width data, Unicode 10.0

2018-03-12 10:17:20 +01:00

explicit_bzero.c

Add explicit_bzero()

2016-03-18 12:33:40 +01:00

ffsl.c

Add ffsl(), ffsll(), fls(), flsl(), flsll()

2017-07-05 13:49:48 +02:00

ffsll.c

Add ffsl(), ffsll(), fls(), flsl(), flsll()

2017-07-05 13:49:48 +02:00

fls.c

Add ffsl(), ffsll(), fls(), flsl(), flsll()

2017-07-05 13:49:48 +02:00

flsl.c

Add ffsl(), ffsll(), fls(), flsl(), flsll()

2017-07-05 13:49:48 +02:00

flsll.c

Add ffsl(), ffsll(), fls(), flsl(), flsll()

2017-07-05 13:49:48 +02:00

gnu_basename.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

index.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

local.h

ansification: remove _EXFUN, _EXFUN_NOTHROW

2018-01-17 11:47:29 -06:00

Makefile.am

string: add wmempcpy

2017-11-30 04:06:49 -06:00

Makefile.in

makedoc: make errors visible

2017-12-07 11:54:11 +00:00

memccpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memcmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memcpy.c

Use __inhibit_loop_to_libcall in all memset/memcpy implementations

2018-08-29 16:05:37 +02:00

memmem.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memmove.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

mempcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memrchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

memset.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

mkunidata

fix/enhance Unicode table generation scripts

2018-03-14 10:44:32 +01:00

mkwide

width data generation

2018-03-12 10:17:20 +01:00

mkwidthA

width data generation

2018-03-12 10:17:20 +01:00

rawmemchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

rindex.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

stpcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

stpncpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

str-two-way.h

* lib/str-two-way.h (two_way_long_needle): Avoid bug with long

2010-10-06 09:29:35 +00:00

strcasecmp_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strcasecmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcasestr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strchrnul.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcoll_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strcoll.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strcspn.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strdup_r.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strdup.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strerror_r.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strerror.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strings.tex

Add man page entry for strnstr.c.

2017-08-30 15:10:07 +02:00

strlcat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strlcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strlen.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strlwr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strncasecmp_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strncasecmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strncat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strncmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strncpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strndup_r.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strndup.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strnlen.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strnstr.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strpbrk.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strrchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strsep.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strsignal.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strspn.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strstr.c

Improve performance of strstr

2018-10-18 19:51:39 +02:00

strtok_r.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strtok.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strupr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

strverscmp.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strxfrm_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

strxfrm.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

swab.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

timingsafe_bcmp.c

Add timingsafe_bcmp()

2016-03-18 12:33:40 +01:00

timingsafe_memcmp.c

Add timingsafe_memcmp()

2016-03-18 12:33:40 +01:00

u_strerr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

uniset

width data generation

2018-03-12 10:17:20 +01:00

wcpcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcpncpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscasecmp_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

wcscasecmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcschr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscoll_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

wcscoll.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcscspn.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsdup.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

wcslcat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcslcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcslen.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsncasecmp_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

wcsncasecmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsncat.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsncmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsncpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsnlen.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcspbrk.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsrchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsspn.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsstr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcstok.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcstrings.tex

string: add wmempcpy

2017-11-30 04:06:49 -06:00

wcswidth.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcsxfrm_l.c

string: remove TRAD_SYNOPSIS

2017-12-01 03:41:52 -06:00

wcsxfrm.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wcwidth.c

use generated width data

2018-03-12 10:17:20 +01:00

wide.t

generated width data, Unicode 10.0

2018-03-12 10:17:20 +01:00

WIDTH-A

width data generation

2018-03-12 10:17:20 +01:00

wmemchr.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wmemcmp.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wmemcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wmemmove.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wmempcpy.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

wmemset.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00

xpg_strerror_r.c

ansification: remove _DEFUN

2018-01-17 11:47:26 -06:00