Problem:
After a locale created by 'duplocale' is passed to 'uselocale',
referencing 'MB_CUR_MAX', which the preprocessor expands to
'__locale_mb_cur_max()', causes a segmentation fault.
Using a locale obtained directly from 'newlocale' does not trigger
the crash, so the problem lies in 'duplocale'.
$ echo $LANG
ja_JP.UTF-8
$ cat test.c
#include <stdlib.h>
#include <locale.h>
volatile int var;
int main(void) {
  locale_t const loc = newlocale(LC_ALL_MASK, "", NULL);
  locale_t const dup = duplocale(loc);
  locale_t const old = uselocale(dup);
  var = MB_CUR_MAX; /* <-- crashes here */
  uselocale(old);
  freelocale(dup);
  freelocale(loc);
  return 0;
}
$ gcc test.c
$ ./a
Segmentation fault (core dumped)
# Note: "core dumped" in the message above was originally printed in
# Japanese; it is translated here so the report can be posted in English.
Bug:
At the beginning of '__loadlocale' (newlib/libc/locale/locale.c:501),
there is a check that skips the remaining work if nothing has changed:
> /* Avoid doing everything twice if nothing has changed. */
> if (!strcmp (new_locale, loc->categories[category]))
> return loc->categories[category];
Meanwhile, in the function '_duplocale_r' (newlib/libc/locale/
duplocale.c), '__loadlocale' is called as follows:
> /* If the object is not a "C" locale category, copy it. Just call
> __loadlocale. It knows what to do to replicate the category. */
> tmp_locale.lc_cat[i].ptr = NULL;
> tmp_locale.lc_cat[i].buf = NULL;
> if (!__loadlocale (&tmp_locale, i, tmp_locale.categories[i]))
> goto error;
In this call the skip check becomes
!strcmp(tmp_locale.categories[i], tmp_locale.categories[i]),
which is always true. As a result, '__loadlocale' never performs the
actual load for 'duplocale', and the category pointers that were just
set to NULL stay NULL.
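
The effect of the skip check can be reproduced outside newlib. The
following is a minimal sketch, not the newlib code: 'toy_locale',
'toy_load' and 'toy_dup_is_loaded' are hypothetical names that only mimic
the early-exit and reload-under-own-name patterns quoted above.

```c
#include <string.h>

/* Hypothetical stand-in for one locale category: its name plus a pointer
   to the loaded category data (NULL means "not loaded"). */
struct toy_locale {
  char name[32];
  const void *data;
};

static const int toy_payload = 42;   /* pretend category data */

/* Mimics the early-exit pattern: if the requested name equals the stored
   name, return without loading anything. */
const char *toy_load(struct toy_locale *loc, const char *new_name)
{
  if (!strcmp(new_name, loc->name))
    return loc->name;                /* skipped: loc->data is untouched */
  strncpy(loc->name, new_name, sizeof loc->name - 1);
  loc->name[sizeof loc->name - 1] = '\0';
  loc->data = &toy_payload;
  return loc->name;
}

/* Mimics the duplication pattern: clear the data pointer, then ask the
   loader to reload the category under its own stored name.
   Returns 1 if data was loaded, 0 if the pointer is still NULL. */
int toy_dup_is_loaded(const char *name)
{
  struct toy_locale tmp;

  strncpy(tmp.name, name, sizeof tmp.name - 1);
  tmp.name[sizeof tmp.name - 1] = '\0';
  tmp.data = NULL;
  toy_load(&tmp, tmp.name);          /* strcmp(x, x) == 0: nothing loads */
  return tmp.data != NULL;
}
```

In this sketch 'toy_dup_is_loaded' always returns 0, because the loader
compares the stored name with itself. Any later dereference of the data
pointer would then hit NULL, which matches the crash described above.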
Fix:
The call to '__loadlocale' in '_duplocale_r' is modified so that the
skip check no longer short-circuits the load.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Keep __ctype_ptr__ available on Cygwin only, for backward compatibility
with existing apps referencing it via the ctype macros.
Otherwise initialize __global_locale.ctype_ptr and __C_locale.ctype_ptr
and use them throughout.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Only access "C" locale using the new __get_C_locale inline function.
Enable __global_locale for !_MB_CAPABLE targets. Accommodate !_MB_CAPABLE
targets in new locale code.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Add global const __C_locale for reference purposes.
Bump Cygwin API minor number and DLL major version number to 2.6.0.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This allows looping through the structs and buffers. Also
rearrange definitions to follow order of LC_xxx values.
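
As a rough illustration of why an array layout helps, the sketch below
frees every category buffer in one loop. It is only an assumption about
the general shape: the 'lc_cat', 'ptr' and 'buf' names are borrowed from
the duplocale code, while 'toy_locale_obj', 'TOY_NCAT' and 'toy_free_all'
are hypothetical.

```c
#include <stdlib.h>

enum { TOY_NCAT = 3 };               /* hypothetical number of categories */

/* One slot per category: pointer to the active data and the heap buffer
   backing it (NULL when the category uses static "C" data). */
struct toy_cat {
  const void *ptr;
  void *buf;
};

struct toy_locale_obj {
  struct toy_cat lc_cat[TOY_NCAT];
};

/* Release all heap buffers in one pass; returns how many were freed. */
int toy_free_all(struct toy_locale_obj *loc)
{
  int freed = 0;

  for (int i = 0; i < TOY_NCAT; ++i)
    if (loc->lc_cat[i].buf)
      {
        free(loc->lc_cat[i].buf);
        loc->lc_cat[i].buf = NULL;
        loc->lc_cat[i].ptr = NULL;
        ++freed;
      }
  return freed;
}
```

With separately named members, the same cleanup would need one
hand-written branch per category.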
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Don't use global variables. This allows loadlocale to be called from
the yet-to-be-created newlocale().
Rename _thr_locale_t to __locale_t (these locales are not restricted
to threads so the name is misleading).
Along these lines, fix _set_ctype to take a __locale_t as parameter.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- Remove charset parameter from low level __foo_wctomb/__foo_mbtowc calls.
- Instead, create arrays of functions for ISO and Windows codepages, so
that each entry points to a function which does not need to evaluate
the charset string on each call. Create matching helper functions, i.e.,
__iso_wctomb, __iso_mbtowc, __cp_wctomb and __cp_mbtowc are now functions
returning the right function pointer.
- Create __WCTOMB/__MBTOWC macros utilizing per-reent locale and replace
calls to __wctomb/__mbtowc with calls to __WCTOMB/__MBTOWC.
- Drop global __wctomb/__mbtowc vars.
- Utilize the aforementioned changes in Cygwin to get rid of charset in
other calling functions and simplify the code.
- In Cygwin restrict global cygheap locale info to the job performed
by internal_setlocale. Use UTF-8 instead of ASCII on the fly in
internal conversion functions.
- In Cygwin dll_entry, make sure to initialize a TLS area with a NULL
_REENT->_locale pointer. Add comment to explain why.
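
The idea of resolving the charset once and then calling through a
function pointer can be sketched as follows. This is not the newlib
implementation: the 'toy_' names, the table contents and the integer ids
are hypothetical, and the converters handle only single-byte output.

```c
#include <stddef.h>

/* Hypothetical converter type: writes one wide char as bytes, returns the
   byte count or -1 if the character is not representable. */
typedef int (*toy_wctomb_fn)(char *out, unsigned wc);

static int toy_ascii_wctomb(char *out, unsigned wc)
{
  if (wc > 0x7F)
    return -1;
  out[0] = (char) wc;
  return 1;
}

static int toy_latin1_wctomb(char *out, unsigned wc)
{
  if (wc > 0xFF)
    return -1;
  out[0] = (char) wc;
  return 1;
}

/* Table mapping a charset id to its converter. */
static const struct { int id; toy_wctomb_fn fn; } toy_table[] = {
  { 0, toy_ascii_wctomb },
  { 1, toy_latin1_wctomb },
};

/* Helper that returns the right function pointer for an id, or NULL.
   It is meant to run once, when the locale is set, not per character. */
toy_wctomb_fn toy_pick_wctomb(int id)
{
  for (size_t i = 0; i < sizeof toy_table / sizeof toy_table[0]; ++i)
    if (toy_table[i].id == id)
      return toy_table[i].fn;
  return NULL;
}
```

A caller would store the result of 'toy_pick_wctomb' in its locale
object and call through it for each character, so the charset is not
evaluated again on each conversion.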
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Move all locale category structure definitions into setlocale.h and remove
other headers in locale subdir. Create inline accessor functions for
current category struct pointers and use throughout. Use pointers to
"C" locale category structs by default in __global_locale.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Introduce first cut of struct _thr_locale_t used for the locale_t definition.
Introduce global instance called __global_locale used by default.
Introduce internal inline functions __get_global_locale, __get_locale_r,
__get_current_locale.
Remove usage of global variables in favor of accessor functions pointing to
__global_locale for now. Include all local headers in locale subdir from
setlocale.h to get single include for internal locale access.
Introduce __CTYPE_PTR macro to replace direct access to __ctype_ptr__
and use throughout in isxxx functions.
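
The accessor approach might look roughly like the sketch below. It is an
illustration under assumptions, not the newlib headers: 'toy_loc',
'toy_reent' and the 'toy_get_*' functions are hypothetical, and the
fallback from a NULL per-context pointer to the global instance is an
assumed behaviour rather than something this change states.

```c
#include <stddef.h>

/* Hypothetical locale object with a single field for illustration. */
struct toy_loc {
  int mb_cur_max;
};

/* Hypothetical per-context state; a NULL locale means "use the global". */
struct toy_reent {
  struct toy_loc *locale;
};

/* Single global instance used by default. */
static struct toy_loc toy_global = { 1 };

static inline struct toy_loc *toy_get_global(void)
{
  return &toy_global;
}

/* Callers go through this accessor instead of touching a global variable,
   so the lookup rule can change later without editing every caller. */
static inline struct toy_loc *toy_get_locale_r(struct toy_reent *r)
{
  return r->locale ? r->locale : toy_get_global();
}
```

Code that previously read a global directly would call
'toy_get_locale_r' instead and pick up a per-context locale once one is
installed.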
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
accidental declaration of __numeric_load_locale.
* libc/locale/locale.c: Include timelocal.h to get declaration of
__time_load_locale.
(__set_locale_from_locale_alias): Fix return type.
(__locale_msgcharset): Avoid compiler warnings.
(_localeconv_r): Ditto.
* libc/locale/locale.c (current_categories): On Cygwin, set LC_CTYPE
to C.UTF-8 to match initial __wctomb and __mbtowc settings.
(lc_ctype_charset): On Cygwin, initialize to "UTF-8".
(loadlocale): Remove unused Cygwin-specific code.
if __HAVE_LOCALE_INFO_EXTENDED__ is defined.
* libc/include/langinfo.h (enum __nl_item): New type. Define all
native values accessible through nl_langinfo. Define previously
existing POSIX-compatible values as macros as well.
* libc/include/stdlib.h (__mb_cur_max): Drop declaration.
(__locale_mb_cur_max): Declare.
(MB_CUR_MAX): Re-define calling __locale_mb_cur_max.
* libc/locale/Makefile.am (ELIX_SOURCES): Add lctype.c.
* libc/locale/Makefile.in: Regenerate.
* libc/locale/lctype.c: New file to define and load LC_CTYPE category.
* libc/locale/lctype.h: New file, matching header.
* libc/locale/lmessages.c (_C_messages_locale): Add default values for
wide char members.
(__messages_load_locale): Add _C_messages_locale in call to
__set_lc_messages_from_win.
* libc/locale/lmessages.h (struct lc_messages_T): Add wide char members.
* libc/locale/lmonetary.c (_C_monetary_locale): Add default values for
wide char members.
(__monetary_load_locale): Add _C_monetary_locale in call to
__set_lc_monetary_from_win.
* libc/locale/lmonetary.h (struct lc_monetary_T): Add wide char members.
Add numerical values for international currency formatting per
POSIX-1.2008, if __HAVE_LOCALE_INFO_EXTENDED__ is defined.
* libc/locale/lnumeric.c (_C_numeric_locale): Add default values for
wide char members.
(__numeric_load_locale): Add _C_numeric_locale in call to
__set_lc_numeric_from_win.
* libc/locale/lnumeric.h (struct lc_numeric_T): Add wide char members.
* libc/locale/locale.c (loadlocale): Return doing nothing if category
locale didn't change. Convert category if chain to switch statement.
Call __ctype_load_locale in LC_CTYPE case.
(__locale_charset): Add (but disable for now) returning codeset from
__get_current_ctype_locale.
(__locale_mb_cur_max): Add (but disable for now) returning mb_cur_max
from __get_current_ctype_locale.
(__locale_msgcharset): Add returning codeset from
__get_current_messages_locale.
(_localeconv_r): Accommodate int_XXX values.
* libc/locale/nl_langinfo.c (nl_ext): New array to define what is to
be returned for non-POSIX values.
(nl_langinfo): Return correct codeset for each locale category. Return
extended values if __HAVE_LOCALE_INFO_EXTENDED__ is defined.
* libc/locale/timelocal.c (_C_time_locale): Add default values for
wide char members.
(__time_load_locale): Add _C_time_locale in call to
__set_lc_time_from_win.
* libc/locale/timelocal.h (struct lc_time_T): Add wide char members.
* libc/stdio/vfwprintf.c (_VFWPRINTF_R): Use wide char decimal point
and thousands_sep if __HAVE_LOCALE_INFO_EXTENDED__ is defined.
* libc/time/strftime.c: Rework to accommodate availability of wide char
strings in LC_TIME category if __HAVE_LOCALE_INFO_EXTENDED__ is defined.
Cygwin only: Allow GB2312 and EUC-CN as alternative codeset names
for GBK. Add to documentation.
* libc/locale/nl_langinfo.c (nl_langinfo): On Cygwin, translate EUCCN
to GB2312.
reason for using __CYGWIN__.
(lconv): Remove _CONST entirely.
(loadlocale): Guard calls to function loading locale-specific
category data with __HAVE_LOCALE_INFO__ rather than __CYGWIN__.
* libc/sys/config.h (__HAVE_LOCALE_INFO__): Define for Cygwin.
parameters for wide char to multibyte conversion. Call
__set_lc_messages_from_win on Cygwin.
* libc/locale/lmessages.h: Make C++-safe.
(__messages_load_locale): Change declaration.
* libc/locale/lmonetary.c (__monetary_load_locale): Use
_monetary_locale_buf as buffer pointer.
* libc/locale/lnumeric.c (__numeric_load_locale): Use
_numeric_locale_buf as buffer pointer.
* libc/locale/timelocal.c (__time_load_locale): Use time_locale_buf
as buffer pointer.
* libc/locale/locale.c (loadlocale): Enable loading LC_MESSAGES data
on Cygwin.
support to documentation.
(__set_locale_from_locale_alias): Declare when built for Cygwin.
(loadlocale): On Cygwin, if locale can't be recognized, call
__set_locale_from_locale_alias to check for locale alias.
Define FAIL macro to replace `return NULL' statements. Replace
throughout.
(_CTYPE_GEORGIAN_PS_255): Define.
(_CTYPE_PT154_128_254): Define.
(_CTYPE_PT154_255): Define.
(__ctype_cp): Add array members for above ctype definitions.
* libc/locale/locale.c (loadlocale): Make TIS-620 charset name
available for all targets. Add guards for setting the conversion
function pointers. Add support for GEORGIAN-PS and PT154 charsets.
Change documentation to reflect current behaviour more closely.
* libc/locale/nl_langinfo.c (nl_langinfo): On Cygwin, translate
"CP101" to "GEORGIAN-PS" and "CP102" to "PT154".
* libc/stdlib/sb_charsets.c (__cp_conv): Add conversion arrays
for GEORGIAN-PS and PT154.
(__cp_index): Map invalid Windows codepage number 101 to
GEORGIAN-PS conversion array, 102 to PT154 conversion array.
parameters for wide char to multibyte conversion. Call
__set_lc_monetary_from_win on Cygwin.
* libc/locale/lmonetary.h: Make C++-safe.
(__monetary_load_locale): Change declaration.
* libc/locale/lnumeric.c (__numeric_load_locale): Take additional
parameters for wide char to multibyte conversion. Call
__set_lc_numeric_from_win on Cygwin.
* libc/locale/lnumeric.h: Make C++-safe.
(__numeric_load_locale): Change declaration.
* libc/locale/locale.c (lconv): De-constify for Cygwin.
(__set_charset_from_locale): Rename from
__set_charset_from_codepage. Take locale as parameter instead of
a codepage.
(loadlocale): Allow "EUC-JP" for "EUCJP" and "EUC-KR" for "EUCKR".
Change documentation accordingly. Enable LC_COLLATE, LC_MONETARY,
LC_NUMERIC, and LC_TIME handling on Cygwin.
(_localeconv_r): On Cygwin, copy values from monetary and numeric
domain if change has been noted.
* libc/locale/nl_langinfo.c (nl_langinfo): Accommodate change of
am/pm layout in struct lc_time_T.
* libc/locale/timelocal.c (_C_time_locale): Accommodate
redefinition of am/pm members.
(__time_load_locale): Take additional parameters for wide char
to multibyte conversion. Call __set_lc_time_from_win on Cygwin.
* libc/locale/timelocal.h: Make C++-safe.
(struct lc_time_T): Convert am and pm to an am_pm array for easier
consumption by strftime and strptime.
(__time_load_locale): Change declaration.
* libc/time/strftime.c: Change documentation to reflect changes to
strftime. Remove locale constant strings in favor of access to
locale-specific data.
(_ctloc): Define access method for locale-specific data.
(TOLOWER): Define for tolower conversion.
(strftime): Throughout, convert locale-specific formats to use
locale-specific data. Add GNU-specific "%P" format.
* libc/time/strptime.c: Remove locale constant strings in favor of
access to locale-specific data.
(_ctloc): Define access method for locale-specific data.
(strptime): Throughout, convert locale-specific formats to use
locale-specific data.
(lc_message_charset): Ditto.
(loadlocale): Set charset of the "C" locale to "UTF-8" on Cygwin.
* libc/stdlib/mbtowc_r.c (__mbtowc): Default to __utf8_mbtowc on
Cygwin.
* libc/stdlib/wctomb_r.c (__wctomb): Default to __utf8_wctomb on
Cygwin.
* libc/stdlib/sb_charsets.c (__micro_atoi): Allow five-digit codepage
numbers.
* libc/locale/locale.c (loadlocale): Set MB_CUR_MAX to 1 for KOI8
charsets.
* libc/stdlib/local.h (__cp_conv): Remove incorrect number of codepages.
* libc/locale/locale.c: Update documentation.
(loadlocale): Map "KOI8-R" and "KOI8-U" to CP20866 and CP21866.
2009-08-24 Andy Koppe <andy.koppe@gmail.com>
* libc/stdlib/sb_charsets.c (__cp_conv): Add KOI8-R (Russian, CP20866)
and KOI8-U (Ukrainian, CP21866) to Windows codepage conversion tables.
* libc/ctype/ctype_cp.h (__ctype_cp): Likewise for ctype tables.