Previously, fprintf() on a wide-oriented stream crashes or outputs
garbage. This is because a narrow char string which can be odd bytes
in length is cast into a wide char string which should be even
bytes in length in __sprint_r/__sfputs_r based on the __SWID flag.
As a result, if the length is odd bytes, the reading buffer runs over
the buffer length, which causes a crash. If the length is even bytes,
garbage is printed.
With this patch, any output to the stream which is set to different
orientation fails with error just like glibc. Note that it behaves
differently from other libc implementations such as BSD, musl and
Solaris.
Reviewed-by: Corinna Vinschen <corinna@vinschen.de>
Signed-off-by: Takashi Yano <takashi.yano@nifty.ne.jp>
Add a _REENT_ERRNO() macro to encapsulate the access to the
_errno member of struct reent. This will help to replace the
structure member with a thread-local storage object in a follow
up patch.
Replace uses of __errno_r() with _REENT_ERRNO(). Keep __errno_r() macro for
potential users outside of Newlib.
svfwscanf replaces getwc and ungetwc_r. The comments in the code talk
about avoiding file operations, but they also need to bypass the
mbtowc calls as svfwscanf operates on wchar_t, not multibyte data,
which is a more important reason here; they would not work correctly
otherwise.
The ungetwc replacement has code which uses the 3 byte FILE _ubuf
field, but if wchar_t is 32-bits, this field is not large enough to
hold even one wchar_t value. Building in this mode generates warnings
about array overflow:
In file included from ../../newlib/libc/stdio/svfiwscanf.c:35:
../../newlib/libc/stdio/vfwscanf.c: In function '_sungetwc_r.isra':
../../newlib/libc/stdio/vfwscanf.c:316:12: warning: array subscript 4294967295 is above array bounds of 'unsigned char[3]' [-Warray-bounds]
316 | fp->_p = &fp->_ubuf[sizeof (fp->_ubuf) - sizeof (wchar_t)];
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../../newlib/libc/stdio/stdio.h:46,
from ../../newlib/libc/stdio/vfwscanf.c:82,
from ../../newlib/libc/stdio/svfiwscanf.c:35:
../../newlib/libc/include/sys/reent.h:216:17: note: while referencing '_ubuf'
216 | unsigned char _ubuf[3]; /* guarantee an ungetc() buffer */
| ^~~~~
However, the vfwscanf code *never* ungets data before the start of the
scanning operation, and *always* ungets data which matches the input
at that point, so the code always hits the block which backs up over
the input data and never hits the block which uses the _ubuf field.
In addition, the svfwscanf code will always start with the unget
buffer empty, so the ungetwc replacement never needs to support an
unget buffer at all.
Simplify the code by removing support for everything other than
backing up over the input data, leaving the check to make sure it
doesn't get underflowed in case the vfscanf code has a bug in it.
Signed-off-by: Keith Packard <keithp@keithp.com>
This edits licenses held by Berkeley and NetBSD, both of which
have removed the advertising requirement from their licenses.
Signed-off-by: Keith Packard <keithp@keithp.com>
newlib's vfwscanf(3) (or specifically, __SVFWSCANF_R()) fails to correctly set
the assignment-suppressing character (`*') flag[1] which, when present in the
formatting string, results in undefined behaviour comprising retrieving and
dereferencing a pointer that was not supplied by the caller as such or at all.
When compared to the vfscanf(3) implementation, this would appear to be over
the missing goto match_failure statement preceded by the flags test seen below.
Hence, this patch (re)introduces it.
[1] <http://pubs.opengroup.org/onlinepubs/009695399/functions/fwscanf.html>
--
Old BSD bug: While ^ is recognized and the set of matching characters
is negated, the code neglects to increment the pointer pointing to the
matching characters. Thus, on a negation expression like %[^xyz], the
matching doesn't only stop at x, y, or z, but incorrectly also on ^.
Fix this by setting the start pointer after recognizing the ^.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
* vfscanf: per POSIX, if the target type is wchar_t, the width is
counted in (multibyte) characters, not in bytes.
* vfscanf: Handle UTF-8 multibyte sequences converted to surrogate
pairs on UTF-16 systems.
* vfwscanf: Don't count high surrogates in input against field width
counting. Per POSIX, input is
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The width value keeps the maximum field width. This is the maximum
field width of the *input*. It's *never* to be used in conjunction
with the number of bytes or characters written to the output argument.
However, especially in vfwscanf, the code is partially taken from
NetBSD which erroneously subtracts the number of multibyte chars
written to the argument from the width variable, thus potentially
subtracting up to MB_CUR_MAX from width for a single character in
the input stream.
To make matters worse, the previous patch adding %m added basically
the same mistake for 'c' type input.
Fix it.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
* The new code is guarded with _WANT_IO_POSIX_EXTENSIONS, but
this is automatically enabled with _WANT_IO_C99_FORMATS for now.
* vfscanf neglects to implement %l[, so %ml[ is not implemented yet
either.
* Sidenote: vfwscanf doesn't allow ranges in %[ yet. Strictly this
is allowed per POSIX, but it differes from vfscanf as well as from
glibc.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The special handling of %\0 in [w]scanf is flawed. It's just a
matching failure and should be handled as such. scanf also
fakes an int input value on %X with X being an invalid conversion
char. This is also just a matching failure and should be handled
the same way as %\0.
There's no indication of the reason for this "disgusting
backwards compatibility hacks" in the logs, given this
code made it into newlib before setting up the CVS repo.
Just handle these cases identically as matching failures.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Move all locale category structure definitions into setlocale.h and remove
other headers in locale subdir. Create inline accessor functions for
current category struct pointers and use throughout. Use pointers to
"C" locale category structs by default in __global_locale.
Signed-off by: Corinna Vinschen <corinna@vinschen.de>
secure stream related critical section against thread cancellation.
(_newlib_flockfile_exit): Ditto.
(_newlib_sfp_lock_end): Ditto.
(_newlib_sfp_lock_start): Ditto for the list of streams.
(_newlib_sfp_lock_exit): Ditto.
(_newlib_sfp_lock_end): Ditto.
Use aforementioned macros in place of _flockfile/_funlockfile
and __sfp_lock_acquire/__sfp_lock_release throughout the code.
* libc/stdio/fclose.c: Explicitely disable and re-enable thread
cancellation. Explain why.
* libc/stdio/freopen.c: Ditto.
* libc/stdio64/freopen64.c: Ditto.
changes of flags and fp lock.
* libc/stdio/freopen.c: Ditto.
* libc/stdio/freopen64.c: Ditto.
* libc/stdio/fgetc.c: Revert change from 2009-04-24, remove sfp locks
which guard entire function to avoid potential deadlocks when using
stdio functions in multiple thraeds.
* libc/stdio/fgets.c: Ditto.
* libc/stdio/fgetwc.c: Ditto.
* libc/stdio/fgetws.c: Ditto.
* libc/stdio/fread.c: Ditto.
* libc/stdio/fseek.c: Ditto.
* libc/stdio/getc.c: Ditto.
* libc/stdio/getdelim.c: Ditto.
* libc/stdio/gets.c: Ditto.
* libc/stdio/vfscanf.c: Ditto.
* libc/stdio/vfwscanf.c: Ditto.
* libc/stdio/fflush.c (_fflush_r): Split out core functionality into
new function __sflush_r. Just lock file and call __sflush_r from here.
* libc/stdio/fwalk.c (_fwalk): Remove static helper function and move
functionality back into main function. Don't walk a file with flags
value of 1. Add comment.
(_fwalk_reent): Ditto.
* libc/stdio/local.h (__sflush_r): Declare.
* libc/stdio/refill.c (__srefill): Before calling fwalk, set flags
value to 1 so this file pointer isn't walked. Revert flags afterwards
and call __sflush_r for this fp if necessary. Add comments.