mirror of
git://sourceware.org/git/newlib-cygwin.git
synced 2025-01-19 04:49:25 +08:00
473f1a3a5d
v3: Add support for read ahead using strnlen, giving an additional 25% speedup on large inputs (both short and long needles). This patch significantly improves performance of strstr by using Sunday's Quick-Search algorithm. Due to its simplicity it has the best average performance of string matching algorithms on almost all inputs. It uses a bad-character shift table to skip past mismatches. The needle length is limited to 254 - this reduces the shift table memory 4 to 8 times, lowering preprocessing overhead and minimizing cache effects. The limit also implies its worst-case performance is linear. Larger needles are processed by the Two-Way algorithm. The macro AVAILABLE has been improved to use strnlen to read the input in chunks. This results in a 2.5 times speedup for large needles, reducing the performance drop when the Quick-Search algorithm can't be used. The code for 1-4 byte needles has been simplified and now uses unsigned char. Since the optimized code relies on 8-bit chars, we defer to the size-optimized implementation if CHAR_BIT > 8. The performance gain of finding a set of randomly chosen words of size 8 in 256 bytes of English text is 14 times on AArch64. For longer haystacks the gain is well over 20 times. The size-optimized strstr has also been rewritten from scratch to improve performance. On the same test the performance gain is 69%. Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test (https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c). --