The patch fixing the alignment of _cygtls::context accidentally
pushed the desperate attempt to automate the alignment by using
another, non-working variation of attribute((aligned)). Drop it.
Fixes: dcab768cb9 ("Cygwin: cygtls: fix context alignment")
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
In the nano version of malloc, when the last chunk is to be extended,
there is no need to acount for the header again as it's already taken
into account in the overall "alloc_size" at the beginning of the
function.
Contributed by STMicroelectronics
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
When using nano malloc and the remaning heap space is not big enough to
fullfill the allocation, malloc will attempt to merge the last chunk in
the free list with a new allocation in order to create a bigger chunk.
This is successful, but the chunk still remains in the free_list, so
any later call to malloc can give out the same region without it first
being freed.
Possible sequence to verify:
void *p1 = malloc(3000);
void *p2 = malloc(4000);
void *p3 = malloc(5000);
void *p4 = malloc(6000);
void *p5 = malloc(7000);
free(p2);
free(p4);
void *p6 = malloc(35000);
free(p6);
void *p7 = malloc(42000);
void *p8 = malloc(32000);
Without the change, p7 and p8 points to the same address.
Requirement, after malloc(35000), there is less than 42000 bytes
available on the heap.
Contributed by STMicroelectronics
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
When __SINGLE_THREAD__ is not defined, stdin, stdout and stderr needs
to have their _lock instance initialized. The __sfp() method is not
invoked for the 3 mentioned fds thus, the std() method needs to handle
the initialization of the lock.
This is more or less a revert of 382550072b
Contributed by STMicroelectronics
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
A hang was encountered, apparently triggered by commit 63b503916d,
changing tls_pathbufs from malloc'ed to HeapAlloc'ed memory. After
lengthy debugging it transpired that adding the heap handle to the
tls_pathbuf struct added 8 bytes to the cygtls area, thus moving
the "context" member by 8 bytes, too, so it was suddently unaligned.
Fix this for now by changing the alignment.
Fix this once and for all, by adding code to the gentls_offsets script
to check if the alignment of the "context" member is 16 bytes. If not,
print a matching error message, remove the just generated file, and exit
with error.
FIXME: It would be really nice to find a way to auomate the correct
alignment of the "context" member, but I don't see any way to use
alignment attributes to get what we need here.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
while debugging a problem introduced in commit
63b503916d ("Cygwin: tls_pathbuf: Use Windows heap")
a hang in fork was encountered using the original implementation
of tls_pathbuf:
Using tmp_pathbuf inside the code block guarded by __malloc_trylock
may call malloc from tmp_pathbuf::w_get and thus trying to lock an
exclusive SRW lock recursively, which results in a deadlock.
Allocate a small SECURITY_ATTRIBUTES block on the stack rather than
allocating a 64K tmp_pathbuf. This avoids the potential malloc call.
Drop the __malloc_trylock call entirely. There must not be a malloc
call inside the frok::parent block guarded by __malloc_lock, and
just trying to lock is too dangerous inside fork while other threads
might actually chage the content of the heap. Additionally, add a
comment frowning on malloc usage inside tis code block.
Fixes: 44a79a6eca ("Cygwin: convert malloc lock to SRWLOCK")
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
PR 29515 points out our documentation builds are broken, let's just move
over to the new non-recursive builds.
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Rather than using malloc/free for the buffers, we're now using
HeapAlloc/HeapFree on a HEAP_NO_SERIALIZE heap created for this
thread.
Advantages:
- Less contention. Our malloc/free doesn't scale well in
multithreaded scenarios
- Even faster heap allocation by using a non serialized heap.
- Internal, local, temporary data not cluttering the user heap.
- Internal, local, temporary data not copied over to child process
at fork().
Disadvantage:
- A forked process has to start allocating temporary buffers from
scratch. However, this should be alleviated by the fact that
buffer allocation usually reaches its peak very early in process
runtime, so the longer the proceess runs, the less buffers have
to allocated, and, only few processes don't exec after fork
anyway.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
shmat may call shmget. shmget locks by itself as necessary,
so there's no reason to keep the lock active and recurse into
the lock. Use SRWLOCK and only lock as required.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
So far we use a single muto to guard three different datastructures
inside class authz_ctx: the authz HANDLE, the user context HANDLE
and the context cache list. Split the single muto into three
independent SRWLOCKs and guard all datastrcutures as necessary to
avoid thread contention.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The old technique was from a time when we had to reduce stack pressure
by moving 64K buffers elsewhere. It was implemented using a static
global buffer, guarded by a muto. However, that adds a lock which may
unnecessarily serialize threads.
Use Windows heap buffers per invocation instead. HeapAlloc/HeapFree are
pretty fast, scale nicely in multithreaded scenarios and don't serialize
threads unnecessarily.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
CloseHandle gets redefined to a macro calling an internal function
in debug.h when building with -DDEBUGGING, but profiler has no access
to that function.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This patch makes syscalls for SH architecture respecting the global option
"--disable-newlib-supplied-syscalls". This is useful when a bare-metal
toolchain is needed.
Signed-off-by: Yilin Sun <imi415@imi.moe>
This simple testcase:
locale_t st = newlocale(LC_ALL_MASK, "C", (locale_t)0);
locale_t st2 = newlocale(LC_CTYPE_MASK, "en_US.UTF-8", st);
is sufficient to reproduce a crash in _newlocale_r. After the first call
to newlocale, `st' points to __C_locale, which is const. When using `st'
as locale base in the second call, _newlocale_r tries to set pointers
inside base to NULL. This is bad if base is __C_locale, obviously.
Add a test to avoid trying to overwrite pointer values inside base if
base is __C_locale.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This way, the sem API is all in the same place, even if the
underlying semaphore class is still in thread.cc.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Rename CygwinCreateThread to create_posix_thread and move
from miscfuncs.cc to create_posix_thread.cc, inbcluding all
related functions. Analogue for the prototypes.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Currently it is possible for symlink_info::check to return -1 in case
we're searching for foo and find foo.lnk that is not a Cygwin symlink.
This contradicts the new meaning attached to a negative return value
in commit 19d59ce75d. Fix this by setting "res" to 0 at the beginning
of the main loop and not seting it to -1 later.
Also fix the commentary preceding the function definition to reflect
the current behavior.
Addresses: https://cygwin.com/pipermail/cygwin/2022-August/252030.html
provide entire internal and external pthread API from inside the
same file.
While I dislike to have another even larger file, this is basically
cleaning up the source and grouping the external API into useful
chunks. Splitting the file cleanly is tricky due to usage of inline
methods is_good_object and verifyable_object_isvalid.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
So far, wmemset used the C implemantation from newlib. Let's use
the optimized assembler code instead.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>