This reverts commit 9068f6ea697b1b28ad1326a4c7a9ba86f08b985e.
The underlying macro needs to be reworked to avoid problems with control
flow statements.
Reported by: rlibby
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well. No functional change
intended.
Reviewed by: cem, kib, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32029
These allow one to non-destructively iterate over the set or clear bits
in a bitset. The motivation is that we have several code fragments
which iterate over a CPU set like this:
while ((cpu = CPU_FFS(&cpus)) != 0) {
cpu--;
CPU_CLR(cpu, &cpus);
<do something>;
}
This is slow since CPU_FFS begins the search at the beginning of the
bitset each time. On amd64 and arm64, CPU sets have size 256, so there
are four limbs in the bitset and we do a lot of unnecessary scanning.
A second problem is that this is destructive, so code which needs to
preserve the original set has to make a copy. In particular, we have
quite a few functions which take a cpuset_t parameter by value, meaning
that each call has to copy the 32 byte cpuset_t.
The new macros address both problems.
Reviewed by: cem, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32028
iflib now supports mapping each (TX,RX) queue pair to the same CPU
(default), to separate CPUs, or to a pair of physical and logical CPUs
that share the same L2 cache. The mapping mechanism supports unequal
numbers of TX and RX queues, with the excess queues always being
mapped to consecutive physical CPUs. When the platform cannot
distinguish between physical and logical CPUs, all are treated as
physical CPUs. See the comment on get_cpuid_for_queue() for the
entire matrix.
The following device-specific tunables influence the mapping process:
dev.<device>.<unit>.iflib.core_offset (existing)
dev.<device>.<unit>.iflib.separate_txrx (existing)
dev.<device>.<unit>.iflib.use_logical_cores (new)
The following new, read-only sysctls provide visibility of the mapping
results:
dev.<device>.<unit>.iflib.{t,r}xq<n>.cpu
When an iflib driver allocates TX softirqs without providing reference
RX IRQs, iflib now binds those TX softirqs to CPUs using the above
mapping mechanism (that is, treats them as if they were TX IRQs).
Previously, such bindings were left up to the grouptaskqueue code and
thus fell outside of the iflib CPU mapping strategy.
Reviewed by: kbowling
Tested by: olivier, pkelsey
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D24094
That is, provide wrappers around the atomic_testandclear and
atomic_testandset primitives.
Submitted by: jeff
Reviewed by: cem, kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22702
An upcoming patch to use the bitset macros for tracking vm page
dump information could conceivably need more than INT_MAX bits.
Expand the bit type to long so that the extra range is available
on 64-bit platforms where it would most likely be needed.
CPUSET_COUNT and DOMAINSET_COUNT are also modified to remain of
type `int`.
Reviewed by: kib, markj
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26190
s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.
Don't supply a NAND (not (A and B)) operation at this time.
Discussed with: jeff
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22791
We have a couple optimizations for when the bitset is known to be just
one word. But with dynamically sized bitsets, it was actually more work
to determine the size than just to do the necessary computation. Now,
only use the optimization when the size is known to be constant.
Reviewed by: markj
Discussed with: jeff
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22639
This code has not been updated since 2016, and it looks like it has
rotted quite a bit since. It does not build against the current set
of phoenix sources -- I had to hack both the kernel headers and the
newlib headers up to get it to build, and I still have no idea if it
actually links or runs. It seems like the project itself has moved
away from newlib and to its own C library:
https://phoenix-rtos.com/documentation/libc/README.md
So since there's no interest from the phoenix folks to maintain this,
and it has a significant amount of non-standard code that we try to
keep up-to-date (without actually testing it), just punt it all.
I've had this lying around for probably a year or two at this point.
It just changes all the instance of "errno" from a common symbol to an
extern. I can't offhand recall where the actual definition is, but it
certainly exists in the generic code.
Convert all the libc/ subdir makes into the top-level Makefile. This
allows us to build all of libc from the top Makefile without using any
recursive make calls. This is faster and avoids the funky lib.a logic
where we unpack subdir archives to repack into a single libc.a. The
machine override logic is maintained though by way of Makefile include
ordering, and source file accumulation in libc_a_SOURCES.
There's a few dummy.c files that are no longer necessary since we aren't
doing the lib.a accumulating, so punt them.
The winsup code has been pulling the internal newlib ssp library out,
but that doesn't exist anymore, so change that to pull the objects.
This kills off the last configure script under libc/ and folds it
into the top newlib configure script. The a lot of the logic was
already in the top configure script, so move what's left into a
libc/acinclude.m4 file.
Remove dependency on __sdidinit member of struct _reent to check
object initialization. Like __sdidinit, the __cleanup member of
struct _reent is initialized in the __sinit() function. Checking
initialization against __cleanup serves the same purpose and will
reduce overhead in the __sfp() function in a follow up patch.
The crt0.o was handled in a subdir-by-subdir basis: it would be compiled
in one (e.g. libc/sys/$arch/), then copied up one level (libc/sys/), then
copied up another (libc/) before finally being copied & installed in the
top newlib dir. The libc/sys/ copy was cleaned up, and then the top dir
was changed to copy it directly out of the libc/sys/$arch/ dir. But the
libc/sys/ copy to libc/ was left behind. Clean that up now too.
When migrating the manual to the top-level, the include order was
sorted by name of the subdir. But this changed the chapter order
of the manual in the process. Change the sorting back to match
existing chapters and update the comments to explain.
The top-level newlib dir already takes care of recursing into the
sys/xxx/include/ subdirs and installing any headers found, so the
rtems subdir doesn't need to do this itself.
This is used in a bunch of places, but nowhere is it ever set, and
nowhere can I find any documentation, nor can I find any other project
using it. So delete the flags to simplify.
These targets don't actually cross-compile -- they try to pull some
objects out of the host's /lib/libc.a, /lib/libm.a, and /lib/crt0.o
directly and merge them into newlib's own libraries. This is hard
to keep working and impossible to test. Considering the vintage of
such targets, and gcc dropping them many many years ago, drop them
from newlib too. This will make cleaning up the build a lot easier.
The machine/{configure,Makefile} files exist only to fan out to the
specific machine/$arch/ subdir. We already have all that same info
in the phoenix/ dir itself, so by moving the recursive configure and
make calls into it, we can cut off this logic entirely and save the
overhead.
These were never added to the tree, and as we transition from autoconf
to automake, it really wants the latter subdirs to always exist. These
don't, so delete the logic.
This was only ever used for i?86-pc-linux-gnu targets, but that's been
broken for years, and has since been dropped. So clean this up too.
This also deletes the funky objectlist logic since it only existed for
the libtool libraries. Since it was the only thing left in the small
Makefile.shared file, we can punt that too.
This was only used by the i?86-pc-linux-gnu target which we've removed,
and even though it's using a "sys/linux/" dir to make it sound like it
only depends on the Linux kernel, it's actually tied to glibc APIs built
on top of Linux. Since the code relies on internal glibc APIs and has
been broken for some time, punt it all. If someone wants to bring it
back, they can try and actually keep the Linux-vs-glibc APIs separate.
Now that we use AC_NO_EXECUTABLES, and we require a recent version of
autoconf, we don't need to define our own copies of these macros. So
switch to the standard AC_PROG_CC.
This logic was added to libc & libm to get it working again after some
reworks in the CPP handling, but now that that's settled, let's move
this to the common newlib configure logic. This will make it easier
to consolidate all the configure calls into the top-level newlib dir.
This does create a lot of noise in the generate scripts, but that's
because of the ordering of the calls, not because of correctness. We
will try to draw that back down in follow up commits as we modernize
the toolchain calls in here.
THe stdio subdir is actually required by the documentation. The
stdio/def is handled dynamically, but libc.texi always expects it
to be included, and fails if it isn't. So making it required when
building docs is safe.
The xdr subdir is handled dynamically, but it doesn't include any
docs, so the dynamic logic isn't (currently) adding any value. So
making it required when building docs is safe.
That leaves: iconv, stdio64, posix, and signal subdirs. The chapters
have a little disclaimer saying they are system-dependent, but even
then, imo having stable manuals regardless of the target is preferable,
and we can add more disclaimer language to these chapters if we want.
This doesn't touch the man page codepaths, just the info/pdf.
Let automake manage whether the objects are included in lib.a. This
fixes failures after to commit 71086e8b2d
("newlib: delete (most) redundant lib_a_CCASFLAGS=$(AM_CCASFLAGS)") due
to automake generating different set of implicit rules, and the code in
here assuming the names of the generated objects.
When we had configure scripts in subdirs, the newlib_basedir value
was computed relative to that, and it'd be the same when used in the
Makefile in the same dir. With many subdir configure scripts removed,
the top-level configure & Makefile can't use the same relative path.
So switch the subdir Makefiles over to abs_newlib_basedir when they
use -I to find source headers.
Do this for all subdirs, even ones with configure scripts and where
newlib_basedir works. This makes the code consistent, and avoids
surprises if the configure script is ever removed in the future as
part of merging to the higher level.
Some of the subdirs were using -I$(newlib_basedir)/../newlib/ for
some reason. Collapse those too since newlib_basedir points to the
newlib source tree already.
When using the top-level configure script but subdir Makefiles, the
newlib_basedir value gets a bit out of sync: it's relative to where
configure lives, not where the Makefile lives. Move the abs setting
from the top-level configure script into acinclude.m4 so we can rely
on it being available everywhere. Although this commit doesn't use
it anywhere, just lays the groundwork.
The machine configure scripts are all effectively stub scripts that
pass the higher level options to its own makefile. There were only
three doing custom tests. The rest were all effectively the same as
the libc/ configure script.
So instead of recursively running configure in all of these subdirs,
generate their makefiles from the top-level configure. For the few
unique ones, deploy a pattern of including subdir logic via m4:
m4_include([machine/nds32/acinclude.m4])
Some of the generated machine makefiles have a bunch of extra stuff
added to them, but that's because they were inconsistent in their
configure libtool calls. The top-level has it, so it exports some
new vars to the ones that weren't already.
The sys configure scripts are almost all effectively stub scripts that
pass the higher level options to its own makefile. The phoenix & linux
ones are a bit more complicated with nested subdirs, so those have been
left alone for now. Plus, I don't really have a way of testing them.
There's no need to have a sys/ subdir just to copy the sys/$arch/crt0.o
up to sys/crt0.o, and then have libc/ copy sys/crt0.o up again. Just
have libc/ refer to sys/$arch/crt0.o directly and drop the intermediate
makefile entirely.
The sys/{configure,Makefile} files exist to fan out to the specific
sys/$arch/ subdir, and to possibly generate a crt0. We already have
all that same info in the libc/ dir itself, so by moving the recursive
configure and make calls into it, we can cut off some of this logic
entirely and save the overhead.
For arches that don't have a sys subdir, it means they can skip the
logic entirely.
The sys subdir itself is kept for the crt0 logic, for now. We'll try
and clean that up next.
It's unclear why this was added originally, but assuming it was needed
20 years ago, it shouldn't be explicitly required nowadays. Current
versions of autotools already take care of exporting LDFLAGS to the
Makefile as needed (things are actually getting linked). That's why
the configure diffs show LDFLAGS still here, but shifted to a diff
place in the output list. A few dirs stop exporting LDFLAGS, but
that's because they don't do any linking, only compiling, so it's
correct.
As for the use of $ldflags instead of the standard $LDFLAGS, I can't
really explain that at all. Just use the right name so users don't
have to dig into why their setting isn't respected, and then use a
non-standard name instead. Adjust the testsuite to match.
This define is only used by newlib internally, so stop exporting it
as HAVE_INITFINI_ARRAY since this can conflict with defines packages
use themselves.
We don't really need to add _ to HAVE_INIT_FINI too since it isn't
exported in newlib.h, but might as well be consistent here.
We can't (easily) add this to newlib_cflags like HAVE_INIT_FINI is
because this is based on a compile-time test in the top configure,
not on plain shell code in configure.host. We'd have to replicate
the test in every subdir in order to have it passed down.
Currently this is only enabled in the top-level as that's the only
place where it seemed to be used. But the libc/sys/phoenix/ dir
also uses this functionality, but fails to explicitly enable it.
Automake workedaround it, but generated warnings. Move the option
to NEWLIB_CONFIGURE so all dirs get it automatically iff they end
up using the option. If they don't use the option, there's no
difference to the generated code.
Since AM_INIT_AUTOMAKE calls AC_PROG_AWK, and some configure.ac
scripts call it too, we end up testing for awk multiple times. If
we change NEWLIB_CONFIGURE to require the macro instead, then it
makes sure it's always expanded, but only once.
While we're here, do the same thing with AC_PROG_INSTALL since it
is also called by AM_INIT_AUTOMAKE, although it doesn't currently
result in duplicate configure checks.
The AC_LIBTOOL_WIN32_DLL macro has been deprecated for a while and code
should call LT_INIT with win32-dll instead. Update the calls to match.
The generated code is noisy not because of substantial differences, but
because the order of some macros change (i.e. instead of calling AS and
then CC, CC is called first and then AS).
Since automake already sets per-library CCASFLAGS to $(AM_CCASFLAGS)
by default, there's no need to explicitly set it here.
Many of these dirs don't have .S files in the first place, so the rule
doesn't even do anything. That can easily be seen when Makefile.in has
no changes as a result.
For the dirs with .S files, the custom rules are the same as the pattern
.S.o rules, so this is a nice cleanup.
The only dir that was adding extra flags (newlib/libc/machine/mn10300/)
to the per-library setting can have it moved to the global AM_CCASFLAGS
since the subdir only has one target. Although the setting just adds
extra debugging flags, so maybe it should be deleted in general.
There are a few dirs that we leave the redundant setting in place. This
is to workaround an automake limitation in subdirs that support building
with & w/out libtool:
https://www.gnu.org/software/automake/manual/html_node/Objects-created-both-with-libtool-and-without.html
This matches what the other GNU toolchain projects have done already.
The generated diff in practice isn't terribly large. This will allow
more use of subdir local.mk includes due to fixes & improvements that
came after the 1.11 release series.
The newlib & libgloss dirs are already generated using autoconf-2.69.
To avoid merging new code and/or accidental regeneration using diff
versions, leverage config/override.m4 to pin to 2.69 exactly. This
matches what gcc/binutils/gdb are already doing.
The README file already says to use autoconf-2.69.
To accomplish this, it's just as simple as adding -I flags to the
top-level config/ dir when running aclocal. This is because the
override.m4 file overrides AC_INIT to first require the specific
autoconf version before calling the real AC_INIT.
Libtool needs to get BSD-format (or MS-format) output out of the system
nm, so that it can scan generated object files for symbol names for
-export-symbols-regex support. Some nms need specific flags to turn on
BSD-formatted output, so libtool checks for this in its AC_PATH_NM.
Unfortunately the code to do this has a pair of interlocking flaws:
- it runs the test by doing an nm of /dev/null. Some platforms
reasonably refuse to do an nm on a device file, but before now this
has only been worked around by assuming that the error message has a
specific textual form emitted by Tru64 nm, and that getting this
error means this is Tru64 nm and that nm -B would work to produce
BSD-format output, even though the test never actually got anything
but an error message out of nm -B. This is fixable by nm'ing *nm
itself* (since we necessarily have a path to it).
- the test is entirely skipped if NM is set in the environment, on the
grounds that the user has overridden the test: but the user cannot
reasonably be expected to know that libtool wants not only nm but
also flags forcing BSD-format output. Worse yet, one such "user" is
the top-level Cygnus configure script, which neither tests for
nor specifies any BSD-format flags. So platforms needing BSD-format
flags always fail to set them when run in a Cygnus tree, breaking
-export-symbols-regex on such platforms. Libtool also needs to
augment $LD on some platforms, but this is done unconditionally,
augmenting whatever the user specified: the nm check should do the
same.
One wrinkle: if the user has overridden $NM, a path might have been
provided: so we use the user-specified path if there was one, and
otherwise do the path search as usual. (If the nm specified doesn't
work, this might lead to a few extra pointless path searches -- but
the test is going to fail anyway, so that's not a problem.)
(Tested with NM unset, and set to nm, /usr/bin/nm, my-nm where my-nm is a
symlink to /usr/bin/nm on the PATH, and /not-on-the-path/my-nm where
*that* is a symlink to /usr/bin/nm.)
ChangeLog
2021-09-27 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27967
* libtool.m4 (LT_PATH_NM): Try BSDization flags with a user-provided
NM, if there is one. Run nm on itself, not on /dev/null, to avoid
errors from nms that refuse to work on non-regular files. Remove
other workarounds for this problem. Strip out blank lines from the
nm output.
This reports common symbols like GNU nm, via a type code of 'C'.
ChangeLog
2021-09-27 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27967
* libtool.m4 (lt_cv_sys_global_symbol_pipe): Augment symcode for
Solaris 11.
AR from older binutils doesn't work with --plugin and rc:
[hjl@gnu-cfl-2 bin]$ touch foo.c
[hjl@gnu-cfl-2 bin]$ ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/10/liblto_plugin.so rc libfoo.a foo.c
[hjl@gnu-cfl-2 bin]$ ./ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/10/liblto_plugin.so rc libfoo.a foo.c
./ar: no operation specified
[hjl@gnu-cfl-2 bin]$ ./ar --version
GNU ar (Linux/GNU Binutils) 2.29.51.0.1.20180112
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
[hjl@gnu-cfl-2 bin]$
Check if AR works with --plugin and rc before passing --plugin to AR and
RANLIB.
PR ld/27173
* configure: Regenerated.
* libtool.m4 (_LT_CMD_OLD_ARCHIVE): Check if AR works with
--plugin and rc before enabling --plugin.
config/
PR ld/27173
* gcc-plugin.m4 (GCC_PLUGIN_OPTION): Check if AR works with
--plugin and rc before enabling --plugin.
libiberty/
PR ld/27173
* configure: Regenerated.
zlib/
PR ld/27173
* configure: Regenerated.
The configure scripts were regenerated with 2.69 for the newlib-4.2.0
release in 484d2ebf8d, but the aclocal
files were not. Do that now to avoid confusion between the two as to
which version of autoconf was used.
Since automake deprecated the INCLUDES name in favor of AM_CPPFLAGS,
change all existing users over. The generated code is the same since
the two variables have been used in the same exact places by design.
There are other cleanups to be done, but lets focus on just renaming
here so we can upgrade to a newer automake version w/out triggering
new warnings.
The 'cygnus' option was removed from automake 1.13 in 2012, so the
presence of this option prevents that or a later version of automake
being used.
A check-list of the effects of '--cygnus' from the automake 1.12
documentation, and steps taken (where possible) to preserve those
effects (See also this thread [1] for discussion on that):
[1] https://lists.gnu.org/archive/html/bug-automake/2012-03/msg00048.html
1. The foreign strictness is implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
2. The options no-installinfo, no-dependencies and no-dist are implied.
Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4
Future work: Remove no-dependencies and any explicit header dependencies,
and use automatic dependency tracking instead. Are there explicit rules
which are now redundant to removing no-installinfo and no-dist?
3. The macro AM_MAINTAINER_MODE is required.
Already present in newlib/acinclude.m4
Note that maintainer-mode is still disabled by default.
4. Info files are always created in the build directory, and not in the
source directory.
This appears to be an error in the automake documentation describing
'--cygnus' [2]. newlib's info files are generated in the source
directory, and no special steps are needed to keep doing that.
[2] https://lists.gnu.org/archive/html/bug-automake/2012-04/msg00028.html
5. texinfo.tex is not required if a Texinfo source file is specified.
(The assumption is that the file will be supplied, but in a place that
automake cannot find.)
This effect is overriden by an explicit setting of the TEXINFO_TEX
variable (the directory part of which is fed into texi2X via the
TEXINPUTS environment variable).
6. Certain tools will be searched for in the build tree as well as in the
user's PATH. These tools are runtest, expect, makeinfo and texi2dvi.
For obscure automake reasons, this effect of '--cygnus' is not active
for makeinfo in newlib's configury.
However, there appears to be top-level configury which selects in-tree
runtest, expect and makeinfo, if present. So, if that works as it
appears, this effect is preserved. If not, this may cause problem if
anyone is building those tools in-tree.
This effect is not preserved for texi2dvi. This may cause problems if
anyone is building texinfo in-tree.
If needed, explicit checks for those tools looking in places relative to
$(top_srcdir)/../ as well as in PATH could be added.
7. The check target doesn't depend on all.
This effect is not preseved. The check target now depends on the all
target.
This concern seems somewhat academic given the current state of the
testsuite.
Also note that this doesn't touch libgloss.
Add the POSIX header file <poll.h> which is used by the GCC 11 Ada
runtime support.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
As discussed in GCC bug 97088
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97088), parameters in
prototypes of library functions should use reserved names, or no name
at all.
This patch moves the internal struct __tzrule_struct to its own
internal header sys/_tz_structs.h. This is included from newlib's
time code as well as from Cygwin's localtime wrapper.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@st.com>
Signed-off-by: Corinna Vinschen <corinna@vinschen.de>
The current gamma, gamma_r, gammaf and gammaf_r functions return
|gamma(x)| instead of ln(|gamma(x)|) due to a change made back in 2002
to the __ieee754_gamma_r implementation. This patch fixes that, making
all of these functions map too their lgamma equivalents.
To fix the underlying bug, the __ieee754_gamma functions have been
changed to return gamma(x), removing the _r variants as those are no
longer necessary. Their names have been changed to __ieee754_tgamma to
avoid potential confusion from users.
Now that the __ieee754_tgamma functions return the correctly signed
value, the tgamma functions have been modified to use them.
libm.a now exposes the following gamma functions:
ln(|gamma(x)|):
__ieee754_lgamma_r
__ieee754_lgammaf_r
lgamma
lgamma_r
gamma
gamma_r
lgammaf
lgammaf_r
gammaf
gammaf_r
lgammal (on machines where long double is double)
gamma(x):
__ieee754_tgamma
__ieee754_tgammaf
tgamma
tgammaf
tgammal (on machines where long double is double)
Additional aliases for any of the above functions can be added if
necessary; in particular, I'm not sure if we need to include
__ieee754_gamma*_r functions (which would return ln(|(gamma(x)|).
Signed-off-by: Keith Packard <keithp@keithp.com>
----
v2:
Switch commit message to ASCII
This edits licenses held by Berkeley and NetBSD, both of which
have removed the advertising requirement from their licenses.
Signed-off-by: Keith Packard <keithp@keithp.com>
The ioctl(2) is intended to provide more details about the cause of
the down for the link.
Eventually we might define a comprehensive list of codes for the
situations. But interface also allows the driver to provide free-form
null-terminated ASCII string to provide arbitrary non-formalized
information. Sample implementation exists for mlx5(4), where the
string is fetched from firmware controlling the port.
Reviewed by: hselasky, rrs
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21527
KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports
offload of TLS for transmitted data. Key negotation must still be
performed in userland. Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option. All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.
Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type. Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.
At present, rekeying is not supported though the in-kernel framework
should support rekeying.
KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer. Each TLS frame is described by a single
ext_pgs mbuf. The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.
KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.
Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then
called to schedule TLS frames for encryption. In the case of
sendfile_iodone() calls ktls_enqueue() instead of pru_ready() leaving
the mbufs marked M_NOTREADY until encryption is completed. For other
writes (vn_sendfile when pages are available, write(2), etc.), the
PRUS_NOTREADY is set when invoking pru_send() along with invoking
ktls_enqueue().
A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue(). Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.
(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct
map.)
KTLS supports pluggable software encryption backends. Internally,
Netflix uses proprietary pure-software backends. This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames. As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.
Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready(). At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.
ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session. TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted. The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface. If so, the packet is tagged
with the TLS send tag and sent to the interface. The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation. If the the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped. In addition, a task is scheduled to refresh the TLS send
tag for the TLS session. If a new TLS send tag cannot be allocated,
the connection is dropped. If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag. (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another. As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being
dropped.)
ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported
across both vlan devices and lagg interfaces using failover, lacp with
flowid enabled, or lacp with flowid enabled.
Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option. They can also use this socket
option to toggle between software and ifnet TLS modes.
In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax. However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.
Various sysctls and counters are available under the kern.ipc.tls
sysctl node. The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default). The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.
KTLS is enabled via the KERN_TLS kernel option.
This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.
Reviewed by: gallatin, hselasky, rrs
Obtained from: Netflix
Sponsored by: Netflix, Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21277
IPPROTO 33 is DCCP in the IANA Registry:
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
IPPROTO_SEP was added about 20 years ago in r33804. The entries were added
straight from RFC1700, without regard to whether they were used.
The reference in RFC1700 for SEP is '[JC120] <mystery contact>', this is an
indication that the protocol number was probably in use in a private network.
As RFC1700 is no longer the authoritative list of internet numbers and that
IANA assinged 33 to DCCP in RFC4340, change the header to the actual
authoritative source.
Reviewed by: Richard Scheffenegger, bz
Approved by: bz (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21178
being used at NF as well as sets in some of the groundwork for
committing BBR. The hpts system is updated as well as some other needed
utilities for the entrance of BBR. This is actually part 1 of 3 more
needed commits which will finally complete with BBRv1 being added as a
new tcp stack.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D20834
multiple unmapped pages.
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.
For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer. This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers). It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused. To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.
Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.
NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.
If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.
Submitted by: gallatin (earlier version)
Reviewed by: gallatin, hselasky, rrs
Discussed with: ae, kp (firewalls)
Relnotes: yes
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20616
into using a STAILQ instead of a linear array.
The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.
To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi
All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.
This patch has been tested by pho@ .
Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
protections.
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions. If
present, these flags specify the maximum permissions.
While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.
This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete. This complements W^X protections allowing more
precise control by the programmer.
This change alters mprotect argument checking and returns an error when
unhandled protection flags are set. This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.
In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1. PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation. A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.
Reviewed by: emaste, kib (prior version), markj
Additional suggestions from: alc
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18880
DTR are asserted. Some development boards for example will reset on DTR,
and some radio interfaces will transmit on RTS.
This patch allows "stty -f /dev/ttyu9.init -rtsdtr" to prevent
RTS and DTR from being asserted on open(), allowing these devices
to be used without problems.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D20031
Error messages in gai_strerror(3) vary largely among OSs.
For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.
Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.
MFC after: 1 month
Differentical Revision: D18630
Applied changes from commit 8d98f95:
* arm/crt0.S: Initialise __heap_limit when ARM_RDI_MONITOR is defined.
* arm/syscalls.c: define __heap_limit global symbol.
* arm/syscalls.c (_sbrk): Honour __heap_limit.
Applied changes from commit 8d98f95:
Fixed semihosting for ARM when heapinfo not provided by debugger
Applied changes from the commit 9b11672:
When simulating arm code, the target program startup code (crt0) uses
semihosting invocations to get the command line from the simulator. The
simulator returns the command line and its size into the area passed in
parameter. (ARM 32-bit specifications :
http://infocenter.arm.com/help/topic/com.arm.doc.dui0058d/DUI0058.pdf
chapter "5.4.19 SYS_GET_CMDLINE").
The memory area pointed by the semihosting register argument is located
in .text section (usually not writtable (RX)).
If we run this code on a simulator that respects this rights properties
(qemu user-mode for instance), the command line will not be written to
the .text program memory, in particular the length of the string. The
program runs with an empty command line. This problem hasn't been seen
earlier probably because qemu user-mode is not so much used, but this can
happen with another simulator that refuse to write in a read-only segment.
With this modification, the command line can be correctly passed to the
target program.
Changes:
- newlib/libc/sys/arm/crt0.S : Arguments passed to the
AngelSWI_Reason_GetCmdLine semihosting invocation are placed into .data
section instead of .text
The Arm sys/param.h does not define anything differently to the
generic sys/param.h, but fails to define some things that that file
provides. There does not appear to be any reason to keep this version
and we should revert to using the common version.
SP initialization changes:
1. set default value in semihosting case as well
2. moved existing SP & SL init code for processor modes in separate routine and made it as "hook"
3. init SP for processor modes in Thumb mode as well
Add new macro FN_RETURN, FN_EH_START and FN_EH_END.