E.g. arm ABI requires -fshort-enums for bare-metal toolchains.
Given there are only 29 category enums, the compiler chooses an
8 bit enum type, so a size of 11 bits for the bitfield leads to
a compile time error:
error: width of 'cat' exceeds its type
enum category cat: 11;
^~~
Fix this by aligning the size of the category members to byte
borders.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Scripts do not try to acquire Unicode data by best-effort magic anymore.
Options supported:
-h for help
-i to copy Unicode data from /usr/share/unicode/ucd first
-u to download Unicode data from unicode.org first
If (despite of -i or -u if given) the necessary Unicode files are not
available locally, table generation is skipped, but no error code is
returned, so not to obstruct the build process if called from a Makefile.
E.g. arm ABI requires -fshort-enums for bare-metal toolchains.
Given there are only 29 category enums, the compiler chooses an
8 bit enum type, so a size of 11 bits for the bitfield leads to
a compile time error:
error: width of 'cat' exceeds its type
enum category cat: 11;
^~~
Fix this by aligning the size of the category members to byte
borders.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
touupper and toulower didn't return a value in all cases. Worse,
this only broke Cygwin when building without optimization for debug
purposes.
Why GCC neglects to notice this is a mystery.
While at it, fix formatting.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The tow* functions use an included case conversion table which can be
generated from Unicode data.
The isw* functions use a character categories table (provided by
categories.c) which can be generated from Unicode data.
Delegation between current-locale and specific-locale-dependent functions
was reverted towards the generic locale-dependent functions (*_l.c);
this is however only relevant on systems with non-Unicode wide character
locales, thus not on Cygwin.
Table categories.t and tag enumeration categories.cat provide
character class data for most of the isw* functions.
These data are generated from Unicode data.
Discard QUICKREF sections, rather than writing them to stderr
Discard MATHREF sections, rather than discarding as an error
Pass NOTES sections through to texinfo, rather than discarding as an error
Don't redirect makedoc stderr to .ref file
Remove makedoc output on error
Remove .ref files from CLEANFILES
Regenerate Makefile.ins
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
_ctype_b is defined in ctype_.c as a const char array for non cygwin
targets allowing negative ctype index but as a char array for the same
targets in ctype_.h, giving type conflict at compile time. This is
because the cygwin targets are not treated specially in the latter file.
This patch adds the necessary logic for cygwin targets in ctype_.h.
Keep __ctype_ptr__ available on Cygwin only, for backward compatibility
with existing apps referencing it via the ctype macros.
Otherwise initialize __global_locale.ctype_ptr and __C_locale.ctype_ptr
and use them throughout.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The former __locale_charset always fetched the current locale's charset.
We need the per-locale charset, too, in future. Rename __locale_charset
to __current_locale_charset and change __locale_charset to take a
locale_t as parameter. Accommodate througout.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
make pdf on arm-none-eabi targets fails to build after the reorganization in
baf0c9fcb56e5cf8f54357bf8d8646b51b236886 to fold is*_l documentation in their
is* counterpart. This is due two issues:
1) newlib/libc/ctype/ctype.tex still including the def file for the long versions
2) missing angle brackets in .c files for some of is*_l functions
This patch fixes the issues and allows make pdf to succeeds.
Don't use global variables. This allows to call loadlocale from
the yet to be created newlocale().
Rename _thr_locale_t to __locale_t (these locales are not restricted
to threads so the name is misleading).
Along these lines, fix _set_ctype to take a __locale_t as parameter.
Signed-off by: Corinna Vinschen <corinna@vinschen.de>
- Remove charset parameter from low level __foo_wctomb/__foo_mbtowc calls.
- Instead, create array of function for ISO and Windows codepages to point
to function which does not require to evaluate the charset string on
each call. Create matching helper functions. I.e., __iso_wctomb,
__iso_mbtowc, __cp_wctomb and __cp_mbtowc are functions returning the
right function pointer now.
- Create __WCTOMB/__MBTOWC macros utilizing per-reent locale and replace
calls to __wctomb/__mbtowc with calls to __WCTOMB/__MBTOWC.
- Drop global __wctomb/__mbtowc vars.
- Utilize aforementioned changes in Cygwin to get rid of charset in other,
calling functions and simplify the code.
- In Cygwin restrict global cygheap locale info to the job performed
by internal_setlocale. Use UTF-8 instead of ASCII on the fly in
internal conversion functions.
- In Cygwin dll_entry, make sure to initialize a TLS area with a NULL
_REENT->_locale pointer. Add comment to explain why.
Signed-off by: Corinna Vinschen <corinna@vinschen.de>
Introduce first cut of struct _thr_locale_t used for the locale_t definition.
Introduce global instance called __global_locale used by default.
Introduce internal inline functions __get_global_locale, __get_locale_r,
__get_current_locale.
Remove usage of global variables in favor of accessor functions pointing to
__global_locale for now. Include all local headers in locale subdir from
setlocale.h to get single include for internal locale access.
Introduce __CTYPE_PTR macro to replace direct access to __ctype_ptr__
and use throughout in isxxx functions.
Signed-off by: Corinna Vinschen <corinna@vinschen.de>
These source files have makedoc markup, but aren't listed to be chewed by
makedoc. I am assuming that is accidental.
Future work: Note that stdio/fseeko.c, stdio/ftello.c and common/s_isnand.c have
makedoc markup, but duplicate stdio/fseek.c, stdio/ftell.c and common/s_isnan.c
respectively.
2015-06-23 Jon Turney <jon.turney@dronecode.org.uk>
* libc/ctype/Makefile.am (CHEWOUT_FILES): Add isblank.def.
* libc/ctype/ctype.tex: Include isblank and add to menu.
* libc/posix/Makefile.am (CHEWOUT_FILES): Add posix_spawn.def.
* libc/posix/posix.tex: Include posix_spawn and add to menu.
* libc/stdio64/Makefile.am (CHEWOUT_FILES): Add fdopen.def.
* libc/stdio64/stdio64.tex: Include fdopen64 and add to menu.
* libc/stdio64/fdopen64.c: Improve one-line description.
* libc/string/Makefile.am (CHEWOUT_FILES): Add strchrnul.def.
* libc/string/strings.tex: Include strchrnul and add to menu.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Scan CID 60023).
* libc/ctype/iswalpha.c (iswalpha): Add bounds check to avoid
out-of-bounds read from utf8 tables (CID 59949).
* libc/locale/ldpart.c (__part_load_locale): Add 1 byte to size of lbuf.
Write NUL into the last byte to accommodate split_lines (CID 60047).
* libc/include/sys/features.h: Redefine compilation environment
definitions for Cygwin to cover 64 bit Cygwin.
* libc/ctype/ctype_.c (_ctype_): Fix definition for 64 bit Cygwin.
* libc/include/machine/setjmp.h: Change definition of _JBLEN to allow
different values for 32 bit and 64 bit Cygwin.
* libc/include/reent.h (stat64): Define as stat under Cygwin, instead
of as __stat64. Undef stat64 if not building Newlib.
* libc/include/sys/stat.h (stat64): Define as stat under Cygwin.
* libc/ctype/iswprint.c (iswprint): Ditto.
* libc/ctype/iswpunct.c (iswpunct): Drop standalone implementation.
Define in terms of other wctype functions instead.
* libc/ctype/towlower.c (towlower): Update to Unicode 5.2. Add comment
to explain how to fetch the data from the Unicode database.
* libc/ctype/towupper.c (towupper): Ditto.
* libc/ctype/utf8alpha.h: Ditto.
* libc/ctype/utf8print.h: Ditto.
* libc/ctype/utf8punct.h: Remove.
* libc/ctype/iswcntrl.c (iswcntrl): Add comment to explain how to
fetch the data from the Unicode database.
U+00A0 and U+200B. Add Unicode character U+180E. Add comment
to explain how to generate from Unicode data file.
* libc/ctype/iswspace.c (iswspace): Ditto.
(_CTYPE_GEORGIAN_PS_255): Define.
(_CTYPE_PT154_128_254): Define.
(_CTYPE_PT154_255): Define.
(__ctype_cp): Add array members for above ctype definitions.
* libc/locale/locale.c (loadlocale): Make TIS-620 charset name
available for all targets. Add guards for setting the conversion
function pointers. Add support for GEORGIAN-PS and PT154 charsets.
Change documentation to reflect current behaviour more closely.
* libc/locale/nl_langinfo.c (nl_langinfo): On Cygwin, translate
"CP101" to "GEORGIAN-PS" and "CP102" to "PT154".
* libc/stdlib/sb_charsets.c (__cp_conv): Add conversion arrays
for GEORGIAN-PS and PT154.
(__cp_index): Map invalid Windows codepage number 101 to
GEORGIAN-PS conversion array, 102 to PT154 conversion array.