This patch significantly improves performance of memmem using a novel
modified Horspool algorithm. Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.
By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
Small needles up to size 2 use a dedicated linear search. Very long needles
use the Two-Way algorithm (to avoid increasing stack size inlining is now disabled).
The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8 (this is even faster than the recently improved strstr
algorithm, so I'll update that in the near future).
The size-optimized memmem has also been rewritten from scratch to get a
2.7x performance gain.
Tested against GLIBC testsuite and randomized tests.
Message-Id: <DB5PR08MB1030649D051FA8532A4512C883B20@DB5PR08MB1030.eurprd08.prod.outlook.com>
- Turns out, the definition of POSIX unlink semantics is half-hearted
so far: It's not possible to link an open file HANDLE if it has
been deleted with POSIX semantics, nor is it possible to remove
the delete disposition. This breaks linkat on an O_TMPFILE.
Tested with W10 1809.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
On Windows 10 1803 and later, create dirs under the Cygwin
installation dir as case sensitive, if WSL is installed.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- short-circuit most code in unlink_nt since it's not necessary
anymore if FILE_DISPOSITION_POSIX_SEMANTICS is supported.
- Immediately remove O_TMPFILE from filesystem after creation.
Disable code for now because we have to implement /proc/self/fd
opening by handle first, lest linkat fails.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Newer FAT32 and exFAT add FILE_SUPPORTS_ENCRYPTION to their
flags which wasn't handled by Cygwin yet.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Various new file info class members adding important POSIX semantics
have been added with W10 1709. We may want to utilize them, so add
a matching wincaps.
Rearrange checking the W10 build number to prefer the latest builds
over the older builds. Rename wincap_10 to wincap_10_1507 for
enhanced clarity.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- Add missing members added in later OS versions
- Rearrange accompanying FILE_foo_INFORMATION structs
ordered by info class
- Add promising FILE_foo_INFORMATION structs of later
Windows 10 releases plus accompanying enums
- Drop "Checked on 64 bit" comments since that's self-evident
these days
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
FreeBSD uses a 64-bit ino_t since 2017-05-23. We need this for the
pipe() support in libbsd.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Various structures exported by sysctl_rtsock() contain padding fields
which were not being zeroed.
Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: ae
MFC after: 3 days
Security: kernel memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18333
The check for the TEB being allocated beyond the first 2GB area is not
valid anymore. At least on W10 WOW64, the TEB is allocated in the
lower 2GB even in large-address aware executables. Use VirtualQuery
instead. It fails for invalid addresses so that's a simple enough test.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The threshold value at which powf overflows depends on the rounding mode
and the current check did not take this into account. So when the result
was rounded away from zero it could become infinity without setting
errno to ERANGE.
Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp
If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set,
which is fine in nearest rounding mode, but
powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode
powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode
and the previous implementation did not set errno in these cases.
The fix tries to avoid affecting the common code path or calling a
function that may introduce a stack frame, so float arithmetics is used
to check the rounding mode and the threshold is selected accordingly.
for {n,u,m}stosbt
Integer overflows and wrong constants limited the accuracy of these
functions and created situatiosn where sbttoXs(Xstosbt(Y)) != Y. This
was especailly true in the ns case where we had millions of values
that were wrong.
Instead, used fixed constants because there's no way to say ceil(X)
for integer math. Document what these crazy constants are.
Also, use a shift one fewer left to avoid integer overflow causing
incorrect results, and adjust the equasion accordingly. Document this.
Allow times >= 1s to be well defined for these conversion functions
(at least the Xstosbt). There's too many users in the tree that they
work for >= 1s.
This fixes a failure on boot to program firmware on the mlx4
NIC. There was a msleep(1000) in the code. Prior to my recent rounding
changes, msleep(1000) worked, but msleep(1001) did not because the old
code rounded to just below 2^64 and the new code rounds to just above
it (overflowing, causing the msleep(1000) to really sleep 1ms).
A test program to test all cases will be committed shortly. The test
exaustively tries every value (thanks to bde for the test).
Sponsored by: Netflix, Inc
Differential Revision: https://reviews.freebsd.org/D18051
of the result rather than the floor(). Returning the floor means that
sbttoX(Xtosbt(y)) != y for almost all values of y. In practice, this
results in a difference of at most 1 in the lsb of the sbintime_t. This
difference is meaningless for all current users of these functions, but
is important for the newly introduced sysctl conversion routines which
implicitly rely on the transformation being idempotent.
Sponsored by: Netflix, Inc
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
and decimal time units. Use them in some existing code that is
vulnerable to roundoff errors.
The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call. The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the trucated value of 4 in SBT_1NS. (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
While reformatting the script, backticks `` were replaced with
brackets $(). This in turn invalidated the \\( ... \\) expressions in the
sed script because backslash resolution in $() works differently from
backslash resolution in ``. Only a single backslash is valid now.
While at it, fix up the uname(2) date representation when building a
snapshot.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
when calling clocks too early in DLL init, the vtables are not correctly
set up for some reason. Calls to init() from now() fail because the init
pointer in the vtable is NULL.
Real life example is mintty which runs into a minor problem at startup,
triggering a system_printf call. Strace is another problem, it's called
the first time prior to any class initialization.
Workaround is to make sure that no virtual methods are called in an
early stage. Make init() non-virtual and convert resolution() to a
virtual method instead. Add a special non-virtual
clk_monotonic_t::strace_usecs.
While at it:
- Inline internal-only methods.
- Drop the `inited' member. Convert period/ticks_per_sec toa union.
Initialize period/ticks_per_sec via InterlockeExchange64.
- Fix GetTickCount64 usage. No, it's not returning ticks but
milliseconds since boot (unbiased).
- Fix comment indentation.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Use whatever native unit the system provides for the resolution of
a timer to avoid rounding problems
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- Drop hires_[nm]s clocks, rename hires.h to clock.h.
- Implement clk_t class as an extensible clock class in new file clock.cc.
- Introduce get_clock(clock_id) returning a pointer to the clk_t instance
for clock_id. Provide the following methods along the lines of the former
hires classes:
void clk_t::nsecs (struct timespec *);
ULONGLONG clk_t::nsecs ();
LONGLONG clk_t::usecs ();
LONGLONG clk_t::msecs ();
void clk_t::resolution (struct timespec *);
- Add CLOCK_REALTIME_COARSE, CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC_COARSE
and CLOCK_BOOTTIME clocks.
- Allow clock_nanosleep, pthread_condattr_setclock and timer_create to use
all new clocks (both clocks should be usable with a small tweak, though).
- Bump DLL major version to 2.12.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- Add CLOCK_REALTIME_COARSE, CLOCK_MONOTONIC_RAW,
CLOCK_MONOTONIC_COARSE and CLOCK_BOOTTIME
- Guard new values with __GNU_VISIBLE
- Add CLOCK_REALTIME_COARSE as (clockid_t) 0 for simplicity
(It allows to have all values < 8 and so be used as array
index into an array of clocks)
- Fix macro bracketing
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
clock_setres is a questionable function only existing on QNX.
Disable the function, just return success for CLOCK_REALTIME
to maintain backward compatibility.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The current method to make hires_ns priming thread-safe isn't
thread-safe. Rather than hoping that running the thread in
TIME_CRITICAL priority is doing the right thing, use a spinlock.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
LARGE_INTEGER has QuadPart anyway, no reason to compute the
64 bit value from HighPart and LowPart.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
A previous commit introduced the ability to use the semi-hosting
SYS_EXIT_EXTENDED operation to libgloss, this commit adds the same
ability to the sys/arm/ backend so that building newlib only will
provide the same capabilities.
The toplevel makefile used by binutils/gcc/newlib/etc has install-pdf and
install-html targets, but they fail because libgloss doesn't support them.
Tested with an arm-eabi combined tree build and install, and verifying that
the install-pdf and install-html targets now work, and that the pdf and html
doc files are now in the install tree.
libgloss/
* Makefile.in (install-html, install-pdf): New.
* doc/Makefile.in (htmldir, pdfdir): New.
(porting.ps): Delete white space on blank line.
(install-pdf, install-html): New.
The _exit function currently passes -1 as a "sig" to the _kill function as an
invalid signal number so that _kill can distinguish between an abort and a
standard exit.
For boards using the SYS_EXIT_EXTENDED semi-hosting operation to return a
status code, this means that the "status" paramter to _exit is ignored and the
return code is always -1.
https://developer.arm.com/docs/100863/latest/semihosting-operations/sys_exit_extended-0x20
This patch puts shared code between _kill and _exit into a new function
_kill_shared that takes the semi-hosting "reason" to use (if semi-hosting is
available) as an argument.
For semi-hosting _kill_shared provides that "reason".
Without the "sig" argument being used to distinguish between a normal and
abnormal exit, the _exit function can provide the return code to be used if the
SYS_EXIT_EXTENDED operation is available.
Hence the exit code can be returned.
This patch fixes an issue in the previous memset loop change. If the
zva size is >= 256 and there are more than 64 bytes left in the
tail, we could enter the loop and thus need to rebias dst by 32 as
well.
Since no known CPUs use this size this can't be tested natively, so I've
tested it on a simulator initialized with a large zva size.
--
Do not define __ATTRIBUTE_IMPURE_PTR__ for RTMES on the v850 target.
The previous definition lead to the following linker error in
combination with -fdata-sections:
relocation truncated to fit: R_V810_GPWLO_1 against symbol
`_global_impure_ptr' defined in .rodata._global_impure_ptr section in
libc.a(lib_a-impure.o)
relocation truncated to fit: R_V810_GPWLO_1 against symbol `_impure_ptr'
defined in .data._impure_ptr section in libc.a(lib_a-impure.o)
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
The <machine/param.h> header file exposes some unrelated stuff not
covered by C or POSIX. Avoid its use in <sys/_cpuset.h> since it is
included in <rtems.h>.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
This fixes an ineffiency in the non-zero memset. Delaying the writeback
until the end of the loop is slightly faster on some cores - this shows
~5% performance gain on Cortex-A53 when doing large non-zero memsets.
Tested against the GLIBC testsuite.
fhandler_socket_wsock::set_socket_handle calls set_flags after
setting the O_NONBLOCK/O_CLOEXEC flags, thus overwriting them.
It also turns out that fhandler_socket_wsock::init_events is called
too late. The inheritence flags are changed before creating the
socket event handling objects. Thus, inheritence flags for
those objects are wrong with SOCK_CLOEXEC.
Fix this by reordering the calls and setting the file flags through
fhandler_base::set_flags.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>