Except for the "C" or "POSIX" locale, checking for start <= finish
by comparing character values is wrong. Range start must be <= range
finish in terms of the locale's collating order. So make sure to always
call wcscoll(), even in the "C"/"POSIX" locale, where wcscoll is
equivalent to wcscmp anyway.
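A minimal sketch of such a check, assuming a hypothetical helper name
range_is_valid (not taken from the actual patch) and that the range
endpoints are single wide characters:

```c
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only: returns true if
   start..finish is a valid bracket range under the collating order of
   the current LC_COLLATE locale. */
bool
range_is_valid (wchar_t start, wchar_t finish)
{
  /* wcscoll() compares strings, so wrap each endpoint in a
     NUL-terminated one-character string. */
  wchar_t s[2] = { start, L'\0' };
  wchar_t f[2] = { finish, L'\0' };

  /* In the "C"/"POSIX" locale this behaves like wcscmp(). */
  return wcscoll (s, f) <= 0;
}
```

In the "C" locale, range_is_valid (L'a', L'z') holds while
range_is_valid (L'z', L'a') does not.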
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
computejumps() moves g->charjump to a position relative to the value of
CHAR_MIN. As such, g->charjump doesn't necessarily point to the address
actually allocated. While regfree() takes that into account, the
low-memory handling in regcomp_internal() doesn't. Fix that by freeing
the actually allocated address, as in regfree().
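The pointer shift and the matching free can be sketched as follows. The
struct and function names are hypothetical stand-ins, not the real regex
internals; only the CHAR_MIN offset idea is taken from the text above:

```c
#include <limits.h>
#include <stdlib.h>

#define NC (CHAR_MAX - CHAR_MIN + 1)

/* Hypothetical stand-in for the relevant part of the regex guts. */
struct guts_sketch
{
  int *charjump;
};

/* Allocate the jump table and shift the pointer so that it can be
   indexed directly with (possibly negative) char values. */
int
alloc_jumps (struct guts_sketch *g)
{
  int *base = malloc ((NC + 1) * sizeof (int));

  if (base == NULL)
    return -1;
  for (int i = 0; i <= NC; i++)
    base[i] = 0;
  /* After this, charjump[CHAR_MIN] is the first allocated element. */
  g->charjump = &base[-(CHAR_MIN)];
  return 0;
}

/* Free the table.  The shifted pointer must not be passed to free();
   undo the shift first to get the address malloc() returned. */
void
free_jumps (struct guts_sketch *g)
{
  if (g->charjump != NULL)
    {
      free (&g->charjump[CHAR_MIN]);
      g->charjump = NULL;
    }
}
```

Calling free (g->charjump) directly would hand free() an address in the
middle of the allocation whenever CHAR_MIN is negative.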
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
...and use __wcollate_range_cmp. This will have to be tweaked further
when supporting collation symbols...
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Since Windows Vista, locale handling has been converted from using numeric
locale identifiers (LCIDs) to using RFC 5646 locale strings. In the
meantime, Windows introduced new locales which don't even have an LCID
attached. Those were unusable in Cygwin because fetching locale information
for these locales requires calling the new locale functions taking
a locale string.
Convert Cygwin to drop LCIDs and use Windows RFC 5646 locales instead.
The last place using LCIDs is the __set_charset_from_locale function.
Checking numerically is easier and usually faster than checking strings.
However, this function is clearly a TODO.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=179721
Now that FreeBSD has eventually picked up the bug report, after
only 5 years, rename __collate_range_cmp to __wcollate_range_cmp
as suggested all along, and make it type safe (wint_t instead of
wchar_t for hopefully obvious reasons...)
While at it, drop __collate_load_error and fix the checks for
it in glob and fnmatch.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
- call mbrtowi instead of mbrtowc
- drop Cygwin-only surrogate handling from wgetnext and xmbrtowc since
it's encapsulated in mbrtowi.
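For context, the surrogate handling in question boils down to combining
a UTF-16 high/low surrogate pair into one code point. A minimal sketch
with a hypothetical helper name; this is not the mbrtowi implementation:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: combine a UTF-16 high
   surrogate (0xD800..0xDBFF) and low surrogate (0xDC00..0xDFFF) into
   the code point they encode.  The caller is assumed to have validated
   both ranges already. */
uint32_t
combine_surrogates (uint16_t high, uint16_t low)
{
  return 0x10000u
	 + ((uint32_t) (high - 0xD800u) << 10)
	 + (uint32_t) (low - 0xDC00u);
}
```

For example, the pair 0xD83D 0xDE00 yields U+1F600.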
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
This patch has been inspired by the Linux kernel patch
294f69e662d1 compiler_attributes.h: Add 'fallthrough' pseudo keyword for switch/case use
written by Joe Perches <joe AT perches DOT com> based on an idea from
Dan Carpenter <dan DOT carpenter AT oracle DOT com>. The following text
is from the original log message:
Reserve the pseudo keyword 'fallthrough' for the ability to convert the
various case block /* fallthrough */ style comments to appear to be an
actual reserved word with the same gcc case block missing fallthrough
warning capability.
All switch/case blocks now should end in one of:
    break;
    fallthrough;
    goto <label>;
    return [expression];
    continue;
In C mode, GCC has supported the __fallthrough__ attribute since 7.1,
the same release in which the warning and the comment parsing were introduced.
Cygwin-only: add an explicit -Wimplicit-fallthrough=5 to the build
flags.
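A minimal sketch of how such a pseudo keyword can be defined and used.
The macro definition below is an assumption modeled on the description
above; the definition actually used in the tree may differ. The function
classify is purely illustrative:

```c
/* Assumed definition: use the statement attribute where GCC provides
   it (7.1 and later), otherwise fall back to an empty statement. */
#if defined (__GNUC__) && __GNUC__ >= 7
# define fallthrough __attribute__ ((__fallthrough__))
#else
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Illustrative only: every case block ends in break or fallthrough. */
int
classify (int c)
{
  int score = 0;

  switch (c)
    {
    case 0:
      score += 1;
      fallthrough;
    case 1:
      score += 10;
      break;
    default:
      score = -1;
      break;
    }
  return score;
}
```

With -Wimplicit-fallthrough=5, comments are no longer accepted as
fallthrough markers, so only the attribute silences the warning.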
The former __locale_charset always fetched the current locale's charset.
In future we need the per-locale charset, too. Rename __locale_charset
to __current_locale_charset and change __locale_charset to take a
locale_t as parameter. Accommodate throughout.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Move all locale category structure definitions into setlocale.h and remove
other headers in locale subdir. Create inline accessor functions for
current category struct pointers and use throughout. Use pointers to
"C" locale category structs by default in __global_locale.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Introduce first cut of struct _thr_locale_t used for the locale_t definition.
Introduce global instance called __global_locale used by default.
Introduce internal inline functions __get_global_locale, __get_locale_r,
__get_current_locale.
Remove usage of global variables in favor of accessor functions pointing to
__global_locale for now. Include all local headers in locale subdir from
setlocale.h to get single include for internal locale access.
Introduce __CTYPE_PTR macro to replace direct access to __ctype_ptr__
and use throughout in isxxx functions.
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
* collate.h: New header.
(__collate_range_cmp): Declare.
(__collate_load_error): Define.
* glob.cc: Pull in latest version from FreeBSD. Simplify and reduce
Cygwin-specific changes.
* regex/regcomp.c: Include collate.h on Cygwin as well.
(__collate_range_cmp): Move from here...
* nlsfuncs.cc (__collate_range_cmp): ...to here.
* miscfuncs.cc (thread_wrapper): Fix typo in comment.
(CygwinCreateThread): Take dead zone of Windows stack into account.
Change how the stack is committed and how guard pages are handled.
Explain how and why.
* thread.h (PTHREAD_DEFAULT_STACKSIZE): Change definition. Explain why.
outside of the base plane to UTF-8. Call throughout instead of
wcrtomb.
(wgetnext): Handle surrogate pairs on UTF-16 systems.
* regex/regexec.c (xmbrtowc): Ditto.
(NONCHAR): Better cast here to make the test work. Move comment
from step here.
(matcher): Disable skipping initial string in multibyte case.
* regex/regcomp.c (p_bracket): Don't simplify singleton in the invert
case.
(p_b_term): Handle early end of pattern after dash in bracket
expression.
(singleton): Don't ignore the wides just because there's already a
singleton in the single byte chars. Fix condition for a singleton
wide accordingly.
(findmust): Check for LC_CTYPE charset, rather than LC_COLLATE charset.
* regex2.h (CHIN): Fix condition in the icase & invert case.
(ISWORD): Fix wrong cast to unsigned char.
* Makefile.in (install-headers): Remove extra command to install
regex.h.
(uninstall-headers): Remove extra command to uninstall regex.h.
* nlsfuncs.cc (collate_lcid): Make externally available to allow
access to collation internals from regex functions.
(collate_charset): Ditto.
* wchar.h: Add __cplusplus guards to make C-clean.
* include/regex.h: New file, replacing regex/regex.h. Remove UCB
advertising clause.
* regex/COPYRIGHT: Accommodate BSD license. Remove UCB advertising
clause.
* regex/cclass.h: Remove.
* regex/cname.h: New file from FreeBSD.
* regex/engine.c: Ditto.
(NONCHAR): Tweak for Cygwin.
* regex/engine.ih: Remove.
* regex/mkh: Remove.
* regex/regcomp.c: New file from FreeBSD. Tweak slightly for Cygwin.
Import required collate internals from nlsfuncs.cc.
(p_ere_exp): Add GNU-specific \< and \> handling for word boundaries.
(p_simp_re): Ditto.
(__collate_range_cmp): Define.
(p_b_term): Use Cygwin-specific collate internals.
(findmust): Ditto.
* regex/regcomp.ih: Remove.
* regex/regerror.c: New file from FreeBSD. Fix a few compiler warnings.
* regex/regerror.ih: Remove.
* regex/regex.7: New file from FreeBSD. Remove UCB advertising clause.
* regex/regex.h: Remove. Replaced by include/regex.h.
* regex/regexec.c: New file from FreeBSD. Fix a few compiler warnings.
* regex/regfree.c: New file from FreeBSD.
* regex/tests: Remove.
* regex/utils.h: New file from FreeBSD.
* strace.cc (strace::microseconds): Use hires class for calculating times.
* sync.h (new_muto): Use NO_COPY explicitly in declaration.
* times.cc (gettimeofday): Reflect change in usecs argument.
(hires::usecs): Ditto. Changed name from utime.
* winsup.h (NO_COPY): Add nocommon attribute to force setting aside space for
variable.
* regcomp.c (REQUIRE): Add a void cast to bypass a warning.
(NM): New variable.
(OBSOLETE_FUNCTIONS): Ditto.
(NEW_FUNCTIONS): Ditto.
(install-headers): Install regex.h.
(install-man): New target.
(install): Use new target.
(DLL_OFILES): Add v8_reg* stuff.
(new-cygwin1.dll): Eliminate stamp-cygwin-lib creation.
(libcygwin.a): Remove obsolete functions from import lib. Add new functions.
* configure.in: Detect 'nm' tool.
* configure: Regenerate.
* cygwin.din: Export posix_reg* functions. Eliminate export of v8 reg*
functions. This is now handled in object files themselves.
* regex/*: New files.
* regexp/v8_*.c: New files, renamed from non v8_ equivalents.