Add support for the AMD GCN GPU architecture. This is primarily intended for
use with OpenMP and OpenACC offloading. It can also be used for stand-alone
programs, but this is intended mostly for testing the compiler and is not
expected to be useful in general.
The GPU architecture is highly parallel, and therefore Newlib must be
configured to use dynamic re-entrancy, and thread-safe malloc.
The only I/O available is a via a shared-memory interface provided by libgomp
and the gcn-run tool included with GCC. At this time this is limited to
stdout, argc/argv, and the return code.
for {n,u,m}stosbt
Integer overflows and wrong constants limited the accuracy of these
functions and created situatiosn where sbttoXs(Xstosbt(Y)) != Y. This
was especailly true in the ns case where we had millions of values
that were wrong.
Instead, used fixed constants because there's no way to say ceil(X)
for integer math. Document what these crazy constants are.
Also, use a shift one fewer left to avoid integer overflow causing
incorrect results, and adjust the equasion accordingly. Document this.
Allow times >= 1s to be well defined for these conversion functions
(at least the Xstosbt). There's too many users in the tree that they
work for >= 1s.
This fixes a failure on boot to program firmware on the mlx4
NIC. There was a msleep(1000) in the code. Prior to my recent rounding
changes, msleep(1000) worked, but msleep(1001) did not because the old
code rounded to just below 2^64 and the new code rounds to just above
it (overflowing, causing the msleep(1000) to really sleep 1ms).
A test program to test all cases will be committed shortly. The test
exaustively tries every value (thanks to bde for the test).
Sponsored by: Netflix, Inc
Differential Revision: https://reviews.freebsd.org/D18051
of the result rather than the floor(). Returning the floor means that
sbttoX(Xtosbt(y)) != y for almost all values of y. In practice, this
results in a difference of at most 1 in the lsb of the sbintime_t. This
difference is meaningless for all current users of these functions, but
is important for the newly introduced sysctl conversion routines which
implicitly rely on the transformation being idempotent.
Sponsored by: Netflix, Inc
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
and decimal time units. Use them in some existing code that is
vulnerable to roundoff errors.
The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call. The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the trucated value of 4 in SBT_1NS. (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
- Add CLOCK_REALTIME_COARSE, CLOCK_MONOTONIC_RAW,
CLOCK_MONOTONIC_COARSE and CLOCK_BOOTTIME
- Guard new values with __GNU_VISIBLE
- Add CLOCK_REALTIME_COARSE as (clockid_t) 0 for simplicity
(It allows to have all values < 8 and so be used as array
index into an array of clocks)
- Fix macro bracketing
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
Do not define __ATTRIBUTE_IMPURE_PTR__ for RTMES on the v850 target.
The previous definition lead to the following linker error in
combination with -fdata-sections:
relocation truncated to fit: R_V810_GPWLO_1 against symbol
`_global_impure_ptr' defined in .rodata._global_impure_ptr section in
libc.a(lib_a-impure.o)
relocation truncated to fit: R_V810_GPWLO_1 against symbol `_impure_ptr'
defined in .data._impure_ptr section in libc.a(lib_a-impure.o)
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
The malloc, alloc_size and alloc_aligned attributes must be only used in
case the function returns the pointer to the allocated memory.
See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87683
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Move common content of the various <sys/dirent.h> and the latest FreeBSD
<dirent.h> to <dirent.h>.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Make the POSIX O_CLOEXEC, O_NOFOLLOW, O_DIRECTORY, O_EXEC, and O_SEARCH
open() flags available also to non-Cygwin systems.
Make the BSD/glibc O_DIRECT open() flag available also to non-Cygwin
systems.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
These attributes help static analysis tools to produce less false
positives, e.g. double free warnings.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
When __HAVE_LOCALE_INFO__ is not selected, directly access the
existing _ctype_ variable from __locale_ctype_ptr() and
__locale_ctype_ptr_l(), eliminating the need for any locale or reent
structure
Signed-off-by: Keith Packard <keithp@keithp.com>
v2:
locale: fix conflict with __locale_ctype_ptr macro
If we are building without __HAVE_LOCALE_INFO__, there is a
macro providing __locale_ctype_ptr which in turn fouls up this
declaration.
Signed-off-by: Michael Lyle <mlyle@lyle.org>
This makes sure any system-defined limits are specified before the
defaults are checked. Without this, ARG_MAX and PATH_MAX end up
getting the default definitions from limits.h rather than the defines
from syslimits.h. This could potentially cause problems when
different files used different values for the same name.
Signed-off-by: Keith Packard <keithp@keithp.com>
Standard headers shouldn't use non-reserved identifiers as parameter
names in function declarations, because programs could in theory
define macros with such names before including a header.
Rack includes the following features: - A different SACK processing
scheme (the old sack structures are not used). - RACK (Recent
acknowledgment) where counting dup-acks is no longer done instead time
is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss
Probe) where we will probe for tail-losses to attempt to try not to take
a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS -
PRR (partial rate reduction) see the RFC.
Once built into your kernel, you can select this stack by either
socket option with the name of the stack is "rack" or by setting
the global sysctl so the default is rack.
Note that any connection that does not support SACK will be kicked
back to the "default" base FreeBSD stack (currently known as "default").
To build this into your kernel you will need to enable in your
kernel:
makeoptions WITH_EXTRA_TCP_STACKS=1
options TCPHPTS
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D15525
Add __nl_item to <sys/_types.h> for FreeBSD compatibility. Use it in
<langinfo.h> and the Cygwin <nl_types.h>. Make the enum __nl_item in
<langinfo.h> anonymous.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Restore FreeBSD compatibility for __alloc_size() and __alloc_align().
This is a follow-up to commit e494b56035.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
At least on GCC7 calling __alloc_size(x) twice is not equivalent to
calling using the attribute once with two arguments. The later is the
documented use in GCC documentation so add a new alloc_size(n, x)
alternative to cover for the few places where it is used: basically:
calloc(3), reallocarray(3) and mallocarray(9).
Submitted by: Mark Millard
MFC after: 3 days
Reference:
http://docs.freebsd.org/cgi/mid.cgi?F227842D-6BE2-4680-82E7-07906AF61CD7
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
GCC only activates C11 keywords in C mode, not C++ mode. This means
that when targeting an older C++ standard, we cannot fall back to using
_Static_assert(). In this case, do define _Static_assert() as a macro
that uses a typedef'ed array.
Discussed in: r322875 commit thread
Reported by: Mark MIllard
MFC after: 1 month
The new implementations are provided under !__OBSOLETE_MATH, it uses
ISO C99 code. With default settings the worst case error in nearest
rounding mode is 0.519 ULP with inlined fma and fma contraction. It uses
a 2 KB lookup table, on aarch64 .text+.rodata size of libm.a is increased
by 1703 bytes. The w_log.c wrapper is disabled since error handling is
inline in the new code.
New __HAVE_FAST_FMA and __HAVE_FAST_FMA_DEFAULT feature macros were
added to enable selecting between the code path that uses fma and the
one that does not. Targets supposed to set __HAVE_FAST_FMA_DEFAULT
if they have single instruction fma and the compiler can actually
inline it (gcc has __FP_FAST_FMA macro but that does not guarantee
inlining with -fno-builtin-fma).
Improvements on Cortex-A72:
latency: 1.9x
thruput: 2.3x
Previously, "test 1 2 3 -a -b -c" was permuted to "test -a -b -c 1 2 3",
but "test 1 2 3 -abc" was left as "test 1 2 3 -abc".
Signed-off-by: Thomas Kindler <mail+newlib@t-kindler.de>
- From: Cesar Philippidis <cesar@codesourcery.com>
Date: Tue, 10 Apr 2018 14:43:42 -0700
Subject: [PATCH] nvptx port
This port adds support for Nvidia GPU's, which are primarily used as
offload accelerators in OpenACC and OpenMP.