newlib-cygwin

mirror of git://sourceware.org/git/newlib-cygwin.git synced 2025-02-28 12:05:47 +08:00

Author	SHA1	Message	Date
Corinna Vinschen	006520ca2b	newlib: enable new math functions on Cygwin Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-06-27 15:53:51 +02:00
Szabolcs Nagy	b99d49e506	New pow implementation The new implementation is provided under !__OBSOLETE_MATH, it uses ISO C99 code. With default settings the worst case error in nearest rounding mode is 0.54 ULP with inlined fma and fma contraction. It uses a 4 KB lookup table in addition to the table in exp_data.c, on aarch64 .text+.rodata size of libm.a is increased by 2295 bytes. Improvements on Cortex-A72: latency: 3.3x thruput: 4.9x	2018-06-27 15:40:49 +02:00
Szabolcs Nagy	07e2c32828	New log2 implementation The new implementation is provided under !__OBSOLETE_MATH, it uses ISO C99 code. With default settings the worst case error in nearest rounding mode is 0.547 ULP with inlined fma and fma contraction. It uses a 1 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased by 1584 bytes. Note that the math.h header defines log2(x) to be log(x)/Ln2, this is not changed, so the new code is only used if that macro is suppressed. Improvements on Cortex-A72: latency: 2.0x thruput: 2.2x	2018-06-27 15:40:49 +02:00
Szabolcs Nagy	e5791079c6	New log implementation The new implementations are provided under !__OBSOLETE_MATH, it uses ISO C99 code. With default settings the worst case error in nearest rounding mode is 0.519 ULP with inlined fma and fma contraction. It uses a 2 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased by 1703 bytes. The w_log.c wrapper is disabled since error handling is inline in the new code. New __HAVE_FAST_FMA and __HAVE_FAST_FMA_DEFAULT feature macros were added to enable selecting between the code path that uses fma and the one that does not. Targets are supposed to set __HAVE_FAST_FMA_DEFAULT if they have single instruction fma and the compiler can actually inline it (gcc has __FP_FAST_FMA macro but that does not guarantee inlining with -fno-builtin-fma). Improvements on Cortex-A72: latency: 1.9x thruput: 2.3x	2018-06-27 15:40:49 +02:00
Szabolcs Nagy	fb929067db	New exp and exp2 implementations The new implementations are provided under !__OBSOLETE_MATH, they use ISO C99 code. There are several settings, with the default one the worst case error in nearest rounding mode is 0.509 ULP for exp and 0.507 ULP for exp2 when a multiply and add is contracted into an fma. They use a shared 2 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased by 1868 bytes. The w_*.c wrappers are disabled for the new code as it takes care of error handling inline. The old exp2(x) code used to be just pow(2,x) so the speedup there is more significant. The file name has no special prefix to avoid any name collision with existing files. Improvements on Cortex-A72: exp latency: 3.2x exp thruput: 4.1x exp2 latency: 7.8x exp2 thruput: 18.8x	2018-06-27 15:40:49 +02:00
Szabolcs Nagy	cfbcbd1c95	Use uint32_t sign argument to math error functions This change is equivalent to the commit `c65db17340` and only affects code that is from the Arm optimized-routines project. It does not affect the observable behaviour, but the code generation can be different on 64bit targets. The intention is to make the portable semantics of the code obvious by using a fixed size type.	2018-06-27 15:40:49 +02:00
Takashi Yano	048490485a	Fix Unicode table. * (mkcategories): Fix a bug that outputs incorrect Unicode category table for code point ranges. * (categories.t): Rebuild it using the bug-fixed mkcategories. This fixes the problem reported in the following post. https://cygwin.com/ml/cygwin/2018-06/msg00248.html	2018-06-26 10:19:12 +02:00
Corinna Vinschen	b14daac482	Revert "Remove -fno-builtin to allow gcc to inline functions such as fabs, floor, creal, imag." This reverts commit c077b9de99c6980a0c1631ec2938f6ff2cf0c289. Yet another accidental commit...	2018-06-26 10:17:04 +02:00
Jon Beniston	c077b9de99	Remove -fno-builtin to allow gcc to inline functions such as fabs, floor, creal, imag.	2018-06-25 13:31:51 +02:00
Wilco Dijkstra	3baadb9912	Improve performance of sinf/cosf/sincosf Here is the correct patch with both filenames and int cast fixed: This patch is a complete rewrite of sinf, cosf and sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.56072, maximum relative error is 0.5303p-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases - on some targets this may be slow, so this can be configured to use floating point comparisons. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. Simple benchmark showing speedup factor on AArch64 for various ranges: range 0.7853982 sinf 1.7 cosf 2.2 sincosf 2.8 range 1.570796 sinf 1.9 cosf 1.9 sincosf 2.7 range 3.141593 sinf 2.0 cosf 2.0 sincosf 3.5 range 6.283185 sinf 2.3 cosf 2.3 sincosf 4.2 range 125.6637 sinf 2.9 cosf 3.0 sincosf 5.1 range 1.1259e15 sinf 26.8 cosf 26.8 sincosf 45.2 ChangeLog: 2018-05-18 Wilco Dijkstra <wdijkstr@arm.com> * newlib/libm/common/Makefile.in: Regenerated. * newlib/libm/common/Makefile.am: Add sinf.c, cosf.c, sincosf.c sincosf.h, sincosf_data.c. Add -fbuiltin -fno-math-errno to CFLAGS. * newlib/libm/common/math_config.h: Add HAVE_FAST_ROUND, HAVE_FAST_LROUND, roundtoint, converttoint, force_eval_float, force_eval_double, eval_as_float, eval_as_double, likely, unlikely. * newlib/libm/common/cosf.c: New file. * newlib/libm/common/sinf.c: Likewise. * newlib/libm/common/sincosf.h: Likewise. * newlib/libm/common/sincosf.c: Likewise. * newlib/libm/common/sincosf_data.c: Likewise. * newlib/libm/math/sf_cos.c: Add #if to build conditionally. * newlib/libm/math/sf_sin.c: Likewise. * newlib/libm/math/wf_sincos.c: Likewise. --	2018-06-21 09:37:04 +02:00
Corinna Vinschen	cfe8c6c504	Revert "Improve performance of sinf/cosf/sincosf" This reverts commit fca80a9d1b3fa6620cdaccec6b726eef1a6530a1. Accidentally pushed a preliminary version	2018-06-21 09:36:39 +02:00
Jon Beniston	b7d9d27b0e	libm/common/s_round.c (round): Add cast for 16-bit CPUs	2018-06-21 09:31:13 +02:00
Wilco Dijkstra	fca80a9d1b	Improve performance of sinf/cosf/sincosf This patch is a complete rewrite of sinf, cosf and sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.56072, maximum relative error is 0.5303p-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases - on some targets this may be slow, so this can be configured to use floating point comparisons. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. Simple benchmark showing speedup factor on AArch64 for various ranges: range 0.7853982 sinf 1.7 cosf 2.2 sincosf 2.8 range 1.570796 sinf 1.9 cosf 1.9 sincosf 2.7 range 3.141593 sinf 2.0 cosf 2.0 sincosf 3.5 range 6.283185 sinf 2.3 cosf 2.3 sincosf 4.2 range 125.6637 sinf 2.9 cosf 3.0 sincosf 5.1 range 1.1259e15 sinf 26.8 cosf 26.8 sincosf 45.2 ChangeLog: 2018-06-18 Wilco Dijkstra <wdijkstr@arm.com> * newlib/libm/common/Makefile.in: Regenerated. * newlib/libm/common/Makefile.am: Add sinf.c, cosf.c, sincosf.c sincosf.h, sincosf_data.c. Add -fbuiltin -fno-math-errno to CFLAGS. * newlib/libm/common/math_config.h: Add HAVE_FAST_ROUND, HAVE_FAST_LROUND, roundtoint, converttoint, force_eval_float, force_eval_double, eval_as_float, eval_as_double, likely, unlikely. * newlib/libm/common/cosf.c: New file. * newlib/libm/common/sinf.c: Likewise. * newlib/libm/common/sincosf.h: Likewise. * newlib/libm/common/sincosf.c: Likewise. * newlib/libm/common/sincosf_data.c: Likewise. * newlib/libm/math/sf_cos.c: Add #if to build conditionally. * newlib/libm/math/sf_sin.c: Likewise. * newlib/libm/math/wf_sincos.c: Likewise. --	2018-06-19 09:44:28 +02:00
Thomas Kindler	9dd3c3b0ad	newlib: getopt now permutes multi-flag options correctly Previously, "test 1 2 3 -a -b -c" was permuted to "test -a -b -c 1 2 3", but "test 1 2 3 -abc" was left as "test 1 2 3 -abc". Signed-off-by: Thomas Kindler <mail+newlib@t-kindler.de>	2018-06-18 18:45:44 +02:00
Jeff Johnston	4a3d0a5a5d	Fix issue with malloc_extend_top - when calculating a correction to align next brk to page boundary, ensure that the correction is less than a page size - if allocating the correction fails, ensure that the top size is set to brk + sbrk_size (minus any front alignment made) Signed-off-by: Jeff Johnston <jjohnstn@redhat.com>	2018-05-29 10:16:48 -04:00
Matthias Kannwischer	fcfea0ae2d	fix llrint and lrint for 52 <= exponent <= 62	2018-05-29 15:59:48 +02:00
Freddie Chopin	3305f35570	Fix 32-bit overflow in mktime() when time_t is 64-bits long When converting number of days since epoch (32-bits) to seconds, calculations using 32-bit `long` overflow for years above 2038. Solve this by casting number of days to `time_t` just before final multiplication. Signed-off-by: Freddie Chopin <freddie.chopin@gmail.com>	2018-05-29 15:27:03 +02:00
Jeff Johnston	e928275566	Use _LDBL_EQ_DBL in nexttowardf.c 2018-05-07 Tom de Vries <tom@codesourcery.com> * libm/common/nexttowardf.c: Use _LDBL_EQ_DBL instead of _LDBL_EQ_DOUBLE.	2018-05-07 12:22:12 -04:00
Jeff Johnston	cd31fbb2ae	Add nvptx port. - From: Cesar Philippidis <cesar@codesourcery.com> Date: Tue, 10 Apr 2018 14:43:42 -0700 Subject: [PATCH] nvptx port This port adds support for Nvidia GPUs, which are primarily used as offload accelerators in OpenACC and OpenMP.	2018-04-13 15:42:37 -04:00
Corinna Vinschen	27652b608d	strtod: Convert 64 bit double to 64 bit int during computation The gdtoa implementation uses the type long, defined as Long, in lots of code. For historical reasons newlib defines Long as int32_t instead. This works fine, as long as floating point exceptions are not enabled. The conversion to 32 bit int can lead to a FE_INVALID situation. Example: const char *str = "121645100408832000.0"; char *ptr; feenableexcept (FE_INVALID); strtod (str, &ptr); This leads to the following situation in strtod double aadj; Long L; [...] L = (Long)aadj; For instance, on x86_64 the code here is cvttsd2si %xmm0,%eax At this point, aadj is 2529648000.0 in our example. The conversion to 32 bit %eax results in a negative int value, thus the conversion is invalid. With feenableexcept (FE_INVALID), a SIGFPE is raised. Fix this by always using 64 bit ints here if double is not a 32 bit type to avoid this type of FP exceptions. Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-04-09 11:31:04 +02:00
Corinna Vinschen	1ee6654e50	newlib: fix iswupper_l in !_MB_CAPABLE case Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-03-27 12:35:27 +02:00
Thomas Wolff	fc59da00c8	comments to document struct caseconv_entry explain design of compact (packed) struct caseconv_entry, in case it needs to be modified for future Unicode versions	2018-03-26 12:01:50 +02:00
Thomas Wolff	b49ce5af1b	newlib: fix indentation in toulower Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-03-26 10:00:16 +02:00
Hakan Lindqvist	3ce38df8d1	Reduce qsort stack consumption Classical function call recursion wastes a lot of stack space. Each recursion level requires a full stack frame comprising all local variables and additional space as dictated by the processor calling convention. This implementation instead stores the variables that are unique for each recursion level in a parameter stack array, and uses iteration to emulate recursion. Function call recursion is not used until the array is full. To ensure the stack consumption isn't worsened by this design, the size of the parameter stack array is chosen to be similar to the stack frame excluding the array. Each function call recursion level can handle 8 iterative recursion levels. Stack consumption will worsen when sorting tiny arrays that do not need recursion (of 6 elements or less). It will be about equal for up to 15 elements, and be an improvement for larger arrays. The best case improvement is a stack size reduction down to about one quarter of the stack consumption before the change. A design where the parameter stack array is large enough for the worst case recursion level was rejected because it would worsen the stack consumption when sorting arrays smaller than about 1500 elements. The worst case is 31 levels on a 32-bit system. A design with a dynamic parameter array size was rejected because of limitations in some compilers.	2018-03-16 10:21:23 +01:00
Hakan Lindqvist	0045445ad6	Ensure qsort recursion depth is bounded The qsort algorithm splits the input array in three parts. The left and right parts may need further sorting. One of them is sorted by recursion, the other by iteration. This update ensures that it is the smaller part that is chosen for recursion. By choosing the smaller part, each recursion level will handle less than half the array of the previous recursion level. Hence the recursion depth is bounded to be less than log2(n) i.e. 1 level per significant bit in the array size n. The update also includes code comments explaining the algorithm.	2018-03-16 10:21:23 +01:00
Joel Sherrill	948db3e4b7	Correct prototypes of pthread_mutex_getprioceiling() and pthread_setschedparam()	2018-03-15 09:25:45 -05:00
Richard Earnshaw	0bb8697333	[arm] Fix syscalls.c for newlib embedded syscalls builds Newlib has a build configuration where syscalls can be directly embedded in the newlib library rather than relying on libgloss. This configuration was broken recently by an update to the libgloss support for Arm that was not propagated to the syscalls interface in newlib itself. This patch restores the build. It's essentially a copy of https://sourceware.org/ml/newlib/2018/msg00128.html but there are some other minor cleanups and changes that I've made at the same time. None of those cleanups affect functionality. The prototypes of the following functions have been updated: _link, _sbrk, _getpid, _write, _swiwrite, _lseek, _swilseek, _read and _swiread. Signed-off-by: Richard Earnshaw <Richard.Earnshaw@arm.com>	2018-03-15 09:55:11 +00:00
Yaakov Selkowitz	829820af6e	ssp: fix wchar.h with -std=c99 https://sourceware.org/ml/newlib/2018/msg00261.html Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-03-14 10:46:32 -05:00
Yaakov Selkowitz	e494b56035	Fix alloc_align and alloc_size macros for multiple arguments https://sourceware.org/ml/newlib/2018/msg00263.html This is a follow-up to commit 4564b30f331a067e71b25308ac7c8a85ceb4b122. Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com>	2018-03-14 10:17:51 -05:00
Corinna Vinschen	134f93f313	ctype: align size of category bit fields to small targets needs E.g. arm ABI requires -fshort-enums for bare-metal toolchains. Given there are only 29 category enums, the compiler chooses an 8 bit enum type, so a size of 11 bits for the bitfield leads to a compile time error: error: width of 'cat' exceeds its type enum category cat: 11; ^~~ Fix this by aligning the size of the category members to byte borders. Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-03-14 11:38:24 +01:00
Corinna Vinschen	edcf783dc2	Revert "ctype: align size of category bit fields to small targets needs" This reverts commit e98d3eb3eb9b6abd897e102031a14b7057641a65. It has accidentally included some work in progress.	2018-03-14 11:36:06 +01:00
Thomas Wolff	44d90834fb	fix/enhance Unicode table generation scripts Scripts do not try to acquire Unicode data by best-effort magic anymore. Options supported: -h for help -i to copy Unicode data from /usr/share/unicode/ucd first -u to download Unicode data from unicode.org first If (despite -i or -u if given) the necessary Unicode files are not available locally, table generation is skipped, but no error code is returned, so as not to obstruct the build process if called from a Makefile.	2018-03-14 10:44:32 +01:00
Corinna Vinschen	e98d3eb3eb	ctype: align size of category bit fields to small targets needs E.g. arm ABI requires -fshort-enums for bare-metal toolchains. Given there are only 29 category enums, the compiler chooses an 8 bit enum type, so a size of 11 bits for the bitfield leads to a compile time error: error: width of 'cat' exceeds its type enum category cat: 11; ^~~ Fix this by aligning the size of the category members to byte borders. Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-03-14 10:36:38 +01:00
Corinna Vinschen	e186dc8661	towctrans_l: Always return a value from helper functions touupper and toulower didn't return a value in all cases. Worse, this only broke Cygwin when building without optimization for debug purposes. Why GCC neglects to notice this is a mystery. While at it, fix formatting. Signed-off-by: Corinna Vinschen <corinna@vinschen.de>	2018-03-13 22:09:30 +01:00
Joel Sherrill	5b97e36239	rtems/.../dirent.h: Add alphasort() prototype	2018-03-13 09:11:47 -05:00
Jon Turney	4564b30f33	Correct alloc_size annotation on reallocarray() Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2018-03-13 09:04:56 -05:00
Thomas Wolff	c8d96a96ea	make target for explicit Unicode data tables generation Run 'make unidata' in newlib target directory to generate Unicode data tables for libc functions wcwidth, tow* and isw*.	2018-03-12 12:09:44 +01:00
Thomas Wolff	a352730004	character data generation	2018-03-12 11:39:50 +01:00
Thomas Wolff	41f72ab4d7	use generated character data The tow* functions use an included case conversion table which can be generated from Unicode data. The isw* functions use a character categories table (provided by categories.c) which can be generated from Unicode data. Delegation between current-locale and specific-locale-dependent functions was reverted towards the generic locale-dependent functions (*_l.c); this is however only relevant on systems with non-Unicode wide character locales, thus not on Cygwin.	2018-03-12 11:39:42 +01:00
Thomas Wolff	3ccfb407af	generated character category data, Unicode 10.0 Table categories.t and tag enumeration categories.cat provide character class data for most of the isw* functions. These data are generated from Unicode data.	2018-03-12 11:09:31 +01:00
Thomas Wolff	402daa2f80	generated case conversion data, Unicode 10.0 Table caseconv.t provides case conversion data for the tow* functions, especially towupper and towlower. These data are generated from Unicode data.	2018-03-12 11:09:31 +01:00
Thomas Wolff	37132125bc	width data generation	2018-03-12 10:17:20 +01:00
Thomas Wolff	8e8fd6c849	use generated width data	2018-03-12 10:17:20 +01:00
Thomas Wolff	71291047e2	generated width data, Unicode 10.0 These tables provide character width properties for use by the wcwidth/wcswidth functions. They are generated from Unicode.	2018-03-12 10:17:20 +01:00
Sebastian Huber	f641474cb2	RTEMS: Use int for _CLOCKID_T_ Linux and FreeBSD use int as well. In addition, this fixes an Ada incompatibility problem on 64-bit targets. See also GCC: gcc/ada/libgnarl/s-osinte__rtems.ads Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-03-06 11:40:16 +01:00
Sebastian Huber	a9c8434527	Make _CLOCKID_T_ system configurable Let systems optionally provide the _CLOCKID_T_ type via <machine/_types.h>. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-03-06 11:40:16 +01:00
Thomas Wolff	f92f048528	Locale modifier @cjkwide to adjust ambiguous-width in non-CJK locales Locale modifier @cjkwide makes Unicode "ambiguous width" characters wide. So ambiguous width characters can be enforced to have width 2 even in non-CJK locales. This gives e.g. users of "Powerline symbols" the opportunity to adjust their width to the desired behaviour (and the behaviour apparently expected by some tools) without having to set a CJK locale and without losing consistency of terminal character width with wcwidth/wcswidth locale width.	2018-03-05 17:15:12 +01:00
Our Air Quality	b7520b14d5	Add global stdio streams support for reent small.	2018-03-01 18:05:31 -05:00
Jaap de Wolff	8329f4867b	add forward declaration to __cxa_atexit to aeabi_atexit, to prevent warnings	2018-02-16 12:16:07 +01:00
Jaap de Wolff	337cee51ca	Add prototype to _malloc_lock() and *unlock() to malloc.h, and include this from nano-mallocr.c	2018-02-16 12:16:07 +01:00
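
The bound described in commit 0045445ad6 ("Ensure qsort recursion depth is bounded") can be illustrated with a small sketch. This is not newlib's qsort: it assumes a plain two-way Lomuto partition on an int array instead of the three-way split the commit mentions, and the names sort_range, sort_ints and max_depth_seen are hypothetical. It only shows the idea of recursing into the smaller part and iterating over the larger one, so that each recursion level handles less than half of the previous range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instrumentation: deepest recursion level reached so far. */
static int max_depth_seen;

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition of a[lo..hi) around a[hi-1]; returns the pivot's final index. */
static size_t partition(int *a, size_t lo, size_t hi)
{
    int pivot = a[hi - 1];
    size_t store = lo;
    for (size_t i = lo; i + 1 < hi; i++) {
        if (a[i] < pivot)
            swap_int(&a[i], &a[store++]);
    }
    swap_int(&a[store], &a[hi - 1]);
    return store;
}

/* Sort a[lo..hi): recurse into the smaller part, loop over the larger one. */
static void sort_range(int *a, size_t lo, size_t hi, int depth)
{
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    while (hi - lo > 1) {
        size_t p = partition(a, lo, hi);
        if (p - lo < hi - (p + 1)) {
            sort_range(a, lo, p, depth + 1);     /* left part is smaller */
            lo = p + 1;                          /* continue with right part */
        } else {
            sort_range(a, p + 1, hi, depth + 1); /* right part is smaller or equal */
            hi = p;                              /* continue with left part */
        }
    }
}

/* Hypothetical entry point for the sketch. */
void sort_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    max_depth_seen = 0;
    sort_range(a, 0, n, 0);
}
```

Calling sort_ints on an already sorted array, which is the worst case for this pivot choice, still keeps the recursion shallow, because the large remainder is always handled by the loop and never by a nested call.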


2841 Commits
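
The mktime() fix in commit 3305f35570 comes down to widening the day count before the final multiplication. The sketch below is an illustration only, not the newlib code: int32_t stands in for a 32-bit `long`, int64_t stands in for a 64-bit `time_t`, and the names SECS_PER_DAY, days_fit_in_32_bits and days_to_seconds are assumptions made for the example.

```c
#include <stdint.h>

#define SECS_PER_DAY 86400

/* Returns 1 if days * SECS_PER_DAY still fits in a signed 32-bit integer. */
static int days_fit_in_32_bits(int32_t days)
{
    return days >= INT32_MIN / SECS_PER_DAY && days <= INT32_MAX / SECS_PER_DAY;
}

/* Cast first, then multiply, so the product is computed in 64 bits. */
static int64_t days_to_seconds(int32_t days)
{
    return (int64_t)days * SECS_PER_DAY;
}
```

Day 24855 after the epoch (2038-01-19) is the last one whose second count fits in 32 bits. For 2040-01-01, which is day 25567, the widened multiplication yields 2208988800 seconds, a value a 32-bit product cannot represent.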