Commit 89eb4bce15 was pretty half-hearted, missing
the codepage character type tables and wctomb/mbtowc
mappings.
Fixes: 89eb4bce15 ("Cygwin: support KOI8-T codeset")
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Add a _REENT_ERRNO() macro to encapsulate the access to the
_errno member of struct reent. This will help to replace the
structure member with a thread-local storage object in a follow
up patch.
Replace uses of __errno_r() with _REENT_ERRNO(). Keep __errno_r() macro for
potential users outside of Newlib.
- Remove charset parameter from low level __foo_wctomb/__foo_mbtowc calls.
- Instead, create array of function for ISO and Windows codepages to point
to function which does not require to evaluate the charset string on
each call. Create matching helper functions. I.e., __iso_wctomb,
__iso_mbtowc, __cp_wctomb and __cp_mbtowc are functions returning the
right function pointer now.
- Create __WCTOMB/__MBTOWC macros utilizing per-reent locale and replace
calls to __wctomb/__mbtowc with calls to __WCTOMB/__MBTOWC.
- Drop global __wctomb/__mbtowc vars.
- Utilize aforementioned changes in Cygwin to get rid of charset in other,
calling functions and simplify the code.
- In Cygwin restrict global cygheap locale info to the job performed
by internal_setlocale. Use UTF-8 instead of ASCII on the fly in
internal conversion functions.
- In Cygwin dll_entry, make sure to initialize a TLS area with a NULL
_REENT->_locale pointer. Add comment to explain why.
Signed-off by: Corinna Vinschen <corinna@vinschen.de>
compiler warnings found this way.
* libc/stdio/freopen.c (_freopen_r): Fix bug setting _flags.
* libc/include/stdio.h (_rename): Define when building newlib.
* libc/include/sys/signal.h (_kill): Ditto.
* libc/include/sys/stat.h (_mkdir): Ditto.
* libc/include/sys/time.h (_gettimeofday): Ditto.
* libc/include/sys/times.h (_times): Ditto.
* libc/include/sys/wait.h (_wait): Ditto.
* libc/locale/lmessages.c (empty): Don't define for Cygwin.
* libc/locale/lmonetary.c (cnv): Ditto.
* libc/locale/nl_langinfo.c (nl_langinfo): Ditto for variable s.
* libc/posix/collate.c: Throughout cast to avoid compiler warning.
* libc/posix/engine.c (matcher): Initialize dp to avoid compiler
warning.
* libc/posix/glob.c: Disable on Cygwin. Explain why.
* libc/posix/regcomp.c: Fix "uninitialized" compiler warnings.
(dissect): Deliberately silence gcc compiler warning. Add comment to
explain why.
* libc/posix/wordexp.c (wordexp): Remove num_bytes variable since result
is never used.
* libc/posix/popen.c (popen): Ditto for variable last.
* libc/reent/mkdirr.c: Include sys/stat.h.
* libc/reent/renamer.c: Include stdio.h.
* libc/search/hash.c: Throughout use underscored variants of the stat
function family.
(init_hash): Add missing definition for the __USE_INTERNAL_STAT64 case.
* libc/search/hash_bigkey.c (__big_insert): Add parenthesis to avoid
compiler warning.
* libc/search/hash_page.c (overflow_page): Initalize freep to NULL to
avoid compiler warning.
* libc/stdio/asiprintf.c (_asiprintf_r): Cast unsigned char * to char *
to avoid compiler warning.
(asiprintf): Ditto.
* libc/stdio/asprintf.c (_asprintf_r): Ditto.
(asprintf): Ditto.
* libc/stdio/vasiprintf.c (_vasiprintf_r): Ditto.
* libc/stdio/vasprintf.c (_vasprintf_r): Ditto.
* libc/stdio/mktemp.c (_gettemp): Cast to unsigned char in call to
isdigit to avoid compiler warning.
* libc/stdio/vfprintf.c (_VFPRINTF_R): Initialize variables used for
grouping to avoid compiler warning. Only define and set nseps and
nrepeats if they are really used.
* libc/stdio/vfwprintf.c (_VFWPRINTF_R): Ditto. Only define state if
it is really used.
* libc/stdio/vfscanf.c (u_char): Revert to be defined as unsigned char.
(__SVFSCANF_R): Cast fmt in call to __mbtowc.
* libc/stdlib/mbtowc_r.c (JIS_state_table): Disable when building
Cygwin.
(JIS_action_table): Ditto.
* libc/stdlib/wctomb_r.c (__utf8_wctomb): Add parenthesis to avoid
compiler warning.
* libc/string/strcasestr.c: Deliberately silence gcc compiler warning.
Add comment to explain why.
* libc/time/strptime.c (strptime): Cast to unsigned char in calls to
isspace to avoid compiler warning.
* libm/math/e_atan2.c (__ieee754_atan2): Add parenthesis to avoid
compiler warning.
* libm/math/e_exp.c (__ieee754_exp): Initialize k to 0 to avoid
compiler warning. Drop setting it to 0 later.
* libm/math/ef_exp.c (__ieee754_expf): Ditto.
* libm/math/e_pow.c (__ieee754_pow): Add braces to avoid compiler
warning.
* libm/math/ef_pow.c (__ieee754_powf): Ditto.
* libm/math/er_lgamma.c (__ieee754_lgamma_r): Initialize nadj to 0 to
avoid compiler warning.
* libm/math/erf_lgamma.c (__ieee754_lgammaf_r): Ditto.
* libm/math/e_rem_pio2.c (__ieee754_rem_pio2): Ditto for variable z.
* libm/common/sf_round.c (roundf): Remove signbit variable since result
is never used.
(lc_message_charset): Ditto.
(loadlocale): Set charset of the "C" locale to "UTF-8" on Cygwin.
* libc/stdlib/mbtowc_r.c (__mbtowc): Default to __utf8_mbtowc on
Cygwin.
* libc/stdlib/wctomb_r.c (__wctomb): Default to __utf8_wctomb on
Cygwin.
recognizes 0x8e and 0x8f lead bytes.
(_iseucjp2): Rename from _iseucjp.
* libc/stdlib/mbtowc_r.c (__eucjp_mbtowc): Convert JIS-X-0212
triplebyte sequences as well.
* libc/stdlib/wctomb_r.c (__eucjp_wctomb): Convert to JIS-X-0212
triplebyte sequences as well.
_MB_CAPABLE systems.
* libc/ctype/iswblank.c: Ditto.
* libc/ctype/iswcntrl.c: Ditto.
* libc/ctype/iswprint.c: Ditto.
* libc/ctype/iswpunct.c: Ditto.
* libc/ctype/iswspace.c: Ditto.
* libc/ctype/jp2uc.c (__jp2uc): On Cygwin, just return c.
Explain why.
* libc/ctype/towlower.c: Ditto.
* libc/ctype/towupper.c: Ditto.
* libc/include/sys/config.h: Define _MB_EXTENDED_CHARSETS_ISO
and _MB_EXTENDED_CHARSETS_WINDOWS if _MB_EXTENDED_CHARSETS_ALL is
defined. Define _MB_EXTENDED_CHARSETS_ALL on Cygwin only for now.
* libc/include/sys/reent.h (struct _reent): Mark _current_category
and _current_locale as unused.
* libc/locale/locale.c: Add new charset support to documentation.
Include ../stdio/local.h from here.
(lc_ctype_charset): Set to "ASCII" by default.
(lc_message_charset): Ditto.
(_setlocale_r): Don't set _current_category and _current_locale.
(loadlocale): Add Cygwin codepage support. On _MB_CAPABLE
systems, set __mbtowc and __wctomb function pointers to function
corresponding with current charset. Don't allow non-existant
ISO-8859-12 charset. Add support for Windows singlebyte codepages.
On Cygwin, add support for GBK, CP949, and BIG5. On Cygwin,
call __set_ctype() in case the catorgy is LC_CTYPE. Don't set
_current_category and _current_locale.
* libc/stdlib/Makefile.am (GENERAL_SOURCES): Add sb_charsets.c.
* libc/stdlib/Makefile.in: Regenerate.
* libc/stdlib/local.h: Add prototype for __locale_charset.
Add prototypes for __mbtowc and __wctomb pointers.
Add prototypes for charset-specific _wctomb_r and _mbtowc_r
functions.
Declare tables and functions from sb_charsets.c.
* libc/stdlib/mbtowc_r.c (__mbtowc): Define. Set to __ascii_mbtowc
by default.
(_mbtowc_r): Just call __mbtowc from here.
(__ascii_mbtowc): New function.
(__iso_mbtowc): New function.
(__cp_mbtowc): New function.
(__utf8_mbtowc): New function.
(__sjis_mbtowc): New function. Disable on Cygwin.
(__eucjp_mbtowc): New function. Disable on Cygwin.
(__jis_mbtowc): New function. Disable on Cygwin.
* libc/stdlib/sb_charsets.c: New file, adding singlebyte to UTF
conversion tables for all ISO and CP charsets.
(__iso_8859_index): New function.
(__cp_index): New function.
* libc/stdlib/wctomb_r.c (__wctomb): Define. Set to __ascii_wctomb
by default.
(_wctomb_r): Just call __wctomb from here.
(__ascii_wctomb): New function.
(__utf8_wctomb): New function.
(__sjis_wctomb): New function. Disable on Cygwin.
(__eucjp_wctomb): New function. Disable on Cygwin.
(__jis_wctomb): New function. Disable on Cygwin.
(__iso_wctomb): New function.
(__cp_wctomb): New function.
invalid character sequence.
* libc/stdlib/mbtowc_r.c (_mbtowc_r): Fix compiler warning due to
missing declaration of __locale_charset.
* libc/stdlib/wctomb_r.c (_wctomb_r): Ditto.
* libc/stdlib/wctomb_r.c (_wctomb_r): When checking single-byte
charset, cast wchar to size_t in case wchar_t is signed.
* libc/stdlib/wctomb.c (wctomb): Add similar single-byte check.
* libc/stdlib/wctomb_r.c (_wctomb_r): Return EILSEQ in case of an
invalid wchar. Return -1 if wchar doesn't fit into singlebyte
value in case of using a singlebyte charset.
sequences since they are invalid in the Unicode standard.
Handle surrogate pairs in case of wchar_t == UTF-16.
* wctomb_r.c (_wctomb_r): Don't convert invalid Unicode wchar_t
values beyond 0x10ffff into UTF-8 chars. Handle surrogate pairs in
case of wchar_t == UTF-16.
* libc/include/sys/_types.h (_mbstate_t): Changed to use
unsigned char internally.
* libc/sys/linux/sys/_types.h: Ditto.
* libc/include/sys/reent.h
* libc/stdlib/mblen.c (mblen): Use function-specific state
value from default reentrancy structure.
* libc/stdlib/mblen_r.c (_mblen_r): If return code from
_mbtowc_r is less than 0, reset state __count value and
return -1.
* libc/stdlib/mbrlen.c (mbrlen): If the input state pointer
is NULL, use the function-specific pointer provided in the
default reentrancy structure.
* libc/stdlib/mbrtowc.c: Add reentrant form of function.
If input state pointer is NULL, use function-specific area
provided in reentrancy structure.
* libc/stdlib/mbsrtowcs.c: Ditto.
* libc/stdlib/wcrtomb.c: Ditto.
* libc/stdlib/wcsrtombs.c: Ditto.
* libc/stdlib/mbstowcs.c: Reformat.
* libc/stdlib/wcstombs.c: Ditto.
* libc/stdlib/mbstowcs_r.c (_mbstowcs_r): If an error occurs,
reset the state's __count value and return -1.
* libc/stdlib/mbtowc.c: Ditto.
* libc/stdlib/mbtowc_r.c (_mbtowc_r): Add restartable functionality.
If number of bytes is used up before completing a valid multibyte
character, return -2 and save the state.
* libc/stdlib/wctomb_r.c (_wctomb_r): Define __state as __count
and change some __count references to __state for clarity.
* libc/stdlib/abort.c: changed description: uses "raise" instead of
"getpid" and "kill"; added: uses "write" and "_exit".
Also included unistd.h for "_exit" prototype.
* libc/stdlib/system.c: included unistd.h for "execve" prototype,
reent.h for "_fork_r" and "_wait_r" prototypes.
(do_system): changed extern char *environ[] to POSIX-friendly
extern char **environ.
* libc/stdlib/wctomb_r.c: included string.h for "strlen" and "strcmp"
prototypes.
* libc/stdlib/remove.c: included reent.h for "_unlink_r" prototype.
* libc/reent/execr.c: included sys/wait.h for "wait" prototype.
* libc/reent/fstatr.c: included sys/stat.h for "fstat" prototype.
* libc/reent/openr.c: included fcntl.h for "open" prototype.
* libc/reent/signalr.c: included signal.h for "kill" prototype,
unistd.h for "getpid" prototype.
* libc/reent/statr.c: included sys/stat.h for "stat" prototype.
* libc/reent/timer.c: included sys/time.h for "gettimeofday" prototype.
* libc/unix/getut.c (utmpname): removed local, incorrect "strdup"
prototype. Also included stdlib.h for "abort", string.h for
"strdup" and "strncmp" prototypes.
* libc/unix/getlogin.c: included string.h for "strncmp", "memset", and
"strncpy", unistd.h for "read" and "close" prototypes.
* libc/posix/execvp.c: included string.h for "strchr", "strlen", and
"strcat" prototypes.