Commit Graph

255 Commits

Author SHA1 Message Date
Jeff Johnston 484d2ebf8d Update newlib to 4.2.0 2021-12-31 12:46:13 -05:00
Jon Turney bfcabeb876
newlib: Regenerate autotools files 2021-12-29 22:45:06 +00:00
Jon Turney a4e734fcdb
newlib: Remove automake option 'cygnus'
The 'cygnus' option was removed from automake 1.13 in 2012, so the
presence of this option prevents that or a later version of automake
being used.

A check-list of the effects of '--cygnus' from the automake 1.12
documentation, and steps taken (where possible) to preserve those
effects (See also this thread [1] for discussion on that):

[1] https://lists.gnu.org/archive/html/bug-automake/2012-03/msg00048.html

1. The foreign strictness is implied.

Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4

2. The options no-installinfo, no-dependencies and no-dist are implied.

Already present in AM_INIT_AUTOMAKE in newlib/acinclude.m4

Future work: Remove no-dependencies and any explicit header dependencies,
and use automatic dependency tracking instead.  Are there explicit rules
which are now redundant to removing no-installinfo and no-dist?

3. The macro AM_MAINTAINER_MODE is required.

Already present in newlib/acinclude.m4

Note that maintainer-mode is still disabled by default.

4. Info files are always created in the build directory, and not in the
source directory.

This appears to be an error in the automake documentation describing
'--cygnus' [2]. newlib's info files are generated in the source
directory, and no special steps are needed to keep doing that.

[2] https://lists.gnu.org/archive/html/bug-automake/2012-04/msg00028.html

5. texinfo.tex is not required if a Texinfo source file is specified.
(The assumption is that the file will be supplied, but in a place that
automake cannot find.)

This effect is overriden by an explicit setting of the TEXINFO_TEX
variable (the directory part of which is fed into texi2X via the
TEXINPUTS environment variable).

6. Certain tools will be searched for in the build tree as well as in the
user's PATH. These tools are runtest, expect, makeinfo and texi2dvi.

For obscure automake reasons, this effect of '--cygnus' is not active
for makeinfo in newlib's configury.

However, there appears to be top-level configury which selects in-tree
runtest, expect and makeinfo, if present. So, if that works as it
appears, this effect is preserved. If not, this may cause problem if
anyone is building those tools in-tree.

This effect is not preserved for texi2dvi. This may cause problems if
anyone is building texinfo in-tree.

If needed, explicit checks for those tools looking in places relative to
$(top_srcdir)/../ as well as in PATH could be added.

7. The check target doesn't depend on all.

This effect is not preseved. The check target now depends on the all
target.

This concern seems somewhat academic given the current state of the
testsuite.

Also note that this doesn't touch libgloss.
2021-12-29 22:45:04 +00:00
Jon Turney 8e166351b3
newlib: Regenerate autotools files 2021-12-29 22:45:03 +00:00
Jon Turney 639cb7ec1a
newlib: Regenerate all autotools files
Regenerate all aclocal.m4, configure and Makefile.in files.
2021-12-09 21:41:35 +00:00
Mike Frysinger 59e83de0b1 libgloss/newlib: update configure.ac in Makefile.in files
The maintainer rules refer to configure.in directly, so update that
after renaming all the configure.ac files.
2021-11-06 14:14:49 -04:00
Mike Frysinger 920617998e libgloss/newlib: rename configure.in to configure.ac
The .in name has been deprecated for a long time in favor of .ac.
2021-09-13 10:14:37 -04:00
Joel Sherrill 59584ff16b libc/sys/rtems/crt0.c: Fix two warnings.
__assert_func() is marked as noreturn and stub should not.
	__tls_get_addr() needed to return a value..
2021-06-17 12:58:36 -05:00
Sebastian Huber a485393aea RTEMS: Add <poll.h> and <sys/poll.h>
Add the POSIX header file <poll.h> which is used by the GCC 11 Ada
runtime support.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2021-01-05 13:41:34 -05:00
Jeff Johnston 415fdd4279 Bump up newlib version to 4.1.0 2020-12-18 18:50:49 -05:00
Jeff Johnston 14123c991b Bump newlib release to 4.0.0 2020-12-11 14:37:12 -05:00
Joel Sherrill fcaaf40c9d libc/sys/rtems/include/machine/_types.h: Define daddr_t to be 64 bits for RTEMS
This type needs to be able to represent a position on a disk or
file system.
2020-10-28 09:45:21 -05:00
Sebastian Huber b37a3388cc RTEMS: Include missing header and fix stub
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2020-03-13 13:51:20 -05:00
Keith Packard 9042d0ce65 Use remove-advertising-clause script to edit BSD licenses
This edits licenses held by Berkeley and NetBSD, both of which
have removed the advertising requirement from their licenses.

Signed-off-by: Keith Packard <keithp@keithp.com>
2020-01-29 19:03:31 +01:00
Jeff Johnston 4e78f8ea16 Bump up newlib release to 3.3.0 2020-01-21 15:17:43 -05:00
Jeff Johnston 1afb22a120 Bump up release to 3.2.0 for yearly snapshot 2020-01-02 14:56:24 -05:00
kib 7e9b1550fd Add SIOCGIFDOWNREASON.
The ioctl(2) is intended to provide more details about the cause of
the down for the link.

Eventually we might define a comprehensive list of codes for the
situations.  But interface also allows the driver to provide free-form
null-terminated ASCII string to provide arbitrary non-formalized
information.  Sample implementation exists for mlx5(4), where the
string is fetched from firmware controlling the port.

Reviewed by:	hselasky, rrs
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21527
2019-09-25 09:01:28 +02:00
jhb 1b35636119 Add kernel-side support for in-kernel TLS.
KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets.  KTLS only supports
offload of TLS for transmitted data.  Key negotation must still be
performed in userland.  Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option.  All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.

Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type.  Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.

At present, rekeying is not supported though the in-kernel framework
should support rekeying.

KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer.  Each TLS frame is described by a single
ext_pgs mbuf.  The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.

KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.

Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame().  ktls_enqueue() is then
called to schedule TLS frames for encryption.  In the case of
sendfile_iodone() calls ktls_enqueue() instead of pru_ready() leaving
the mbufs marked M_NOTREADY until encryption is completed.  For other
writes (vn_sendfile when pages are available, write(2), etc.), the
PRUS_NOTREADY is set when invoking pru_send() along with invoking
ktls_enqueue().

A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue().  Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.

(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct
map.)

KTLS supports pluggable software encryption backends.  Internally,
Netflix uses proprietary pure-software backends.  This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames.  As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.

Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready().  At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.

ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation.  In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session.  TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted.  The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface.  If so, the packet is tagged
with the TLS send tag and sent to the interface.  The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation.  If the the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped.  In addition, a task is scheduled to refresh the TLS send
tag for the TLS session.  If a new TLS send tag cannot be allocated,
the connection is dropped.  If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag.  (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another.  As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being
dropped.)

ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8).  ifnet TLS is supported
across both vlan devices and lagg interfaces using failover, lacp with
flowid enabled, or lacp with flowid enabled.

Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option.  They can also use this socket
option to toggle between software and ifnet TLS modes.

In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax.  However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.

Various sysctls and counters are available under the kern.ipc.tls
sysctl node.  The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default).  The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.

KTLS is enabled via the KERN_TLS kernel option.

This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.

Reviewed by:	gallatin, hselasky, rrs
Obtained from:	Netflix
Sponsored by:	Netflix, Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21277
2019-09-25 09:01:23 +02:00
thj 28a44b1ecd Rename IPPROTO 33 from SEP to DCCP
IPPROTO 33 is DCCP in the IANA Registry:
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml

IPPROTO_SEP was added about 20 years ago in r33804. The entries were added
straight from RFC1700, without regard to whether they were used.

The reference in RFC1700 for SEP is '[JC120] <mystery contact>', this is an
indication that the protocol number was probably in use in a private network.

As RFC1700 is no longer the authoritative list of internet numbers and that
IANA assinged 33 to DCCP in RFC4340, change the header to the actual
authoritative source.

Reviewed by:	Richard Scheffenegger, bz
Approved by:	bz (mentor)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21178
2019-09-25 09:01:19 +02:00
rrs 693ba4025f This commit updates rack to what is basically
being used at NF as well as sets in some of the groundwork for
committing BBR. The hpts system is updated as well as some other needed
utilities for the entrance of BBR. This is actually part 1 of 3 more
needed commits which will finally complete with BBRv1 being added as a
new tcp stack.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D20834
2019-09-25 09:01:19 +02:00
jhb 2f55e1fa06 Add an external mbuf buffer type that holds
multiple unmapped pages.

Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages.  It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.

For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer.  This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers).  It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused.  To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.

Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.

NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability.  This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands.  For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.

If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output.  If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.

Submitted by:	gallatin (earlier version)
Reviewed by:	gallatin, hselasky, rrs
Discussed with:	ae, kp (firewalls)
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-09-25 09:01:19 +02:00
hselasky d41e144869 Convert all IPv4 and IPv6 multicast memberships
into using a STAILQ instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@ .

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by:	markj @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-09-25 09:01:15 +02:00
brooks e94d2a0f8b Extend mmap/mprotect API to specify the max page
protections.

A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions.  If
present, these flags specify the maximum permissions.

While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.

This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete.  This complements W^X protections allowing more
precise control by the programmer.

This change alters mprotect argument checking and returns an error when
unhandled protection flags are set.  This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.

In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1.  PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation.  A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.

Reviewed by:	emaste, kib (prior version), markj
Additional suggestions from:	alc
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D18880
2019-09-25 09:01:10 +02:00
shurd 17baf5e390 Some devices take undesired actions when RTS and
DTR are asserted. Some development boards for example will reset on DTR,
and some radio interfaces will transmit on RTS.

This patch allows "stty -f /dev/ttyu9.init -rtsdtr" to prevent
RTS and DTR from being asserted on open(), allowing these devices
to be used without problems.

Reviewed by:    imp
Differential Revision:  https://reviews.freebsd.org/D20031
2019-09-25 09:01:07 +02:00
pfg 6bd0b9ed27 Fix mismatch from r342379. 2019-09-25 09:01:04 +02:00
pfg 84ba60e6eb gai_strerror() - Update string error messages according to RFC 3493.
Error messages in gai_strerror(3) vary largely among OSs.

For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.

Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.

MFC after:	1 month
Differentical Revision:	D18630
2019-09-25 08:38:27 +02:00
Sebastian Huber 9d4a6534fb Move RTEMS and XMK specific type definitions
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2019-02-19 09:06:22 +01:00
Jeff Johnston 5726873100 Bump release to 3.1.0 for yearly snapshot 2018-12-31 23:40:11 -05:00
Sebastian Huber dc6e94551f RTEMS: Use __uint64_t for __ino_t
FreeBSD uses a 64-bit ino_t since 2017-05-23.  We need this for the
pipe() support in libbsd.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-12-20 12:12:38 +01:00
markj 44756a36ab Plug routing sysctl leaks.
Various structures exported by sysctl_rtsock() contain padding fields
which were not being zeroed.

Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	ae
MFC after:	3 days
Security:	kernel memory disclosure
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18333
2018-12-20 12:12:37 +01:00
Sebastian Huber 1471e7cd74 RTEMS: Avoid <machine/param.h> in <sys/_cpuset.h>
The <machine/param.h> header file exposes some unrelated stuff not
covered by C or POSIX.  Avoid its use in <sys/_cpuset.h> since it is
included in <rtems.h>.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-11-08 09:38:12 +01:00
Joel Sherrill 037428fae3 newlib/libc/sys/rtems/include/machine/param.h: Add _KERNEL to stop method leakage
The following FreeBSD kernel methods are not in any standard and
prototypes/definitions were leaking into application space:

  + round_page()
  + trunc_page()
  + atop()
  + ptoa()
  + pgtok()
2018-10-18 17:19:50 -05:00
Sebastian Huber 738fdc6a42 RTEMS: Add struct dirent::d_type member
This is used by the file system support of libstdc++ for example.  Use
content from latest FreeBSD <sys/dirent.h>

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-10-11 08:29:16 +02:00
Sebastian Huber da418955f5 Move common <sys/dirent.h> content to <dirent.h>
Move common content of the various <sys/dirent.h> and the latest FreeBSD
<dirent.h> to <dirent.h>.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-10-11 08:29:16 +02:00
Jon Beniston a9cfb33b6c Add --disable-newlib-fno-builtin to allow compilation without -fno-builtin for smaller and faster code. 2018-08-31 15:40:42 -04:00
Sebastian Huber d13c84eb07 RTEMS: Add kvaddr_t and ksize_t
These types were introduced by FreeBSD commit:

"Make struct xinpcb and friends word-size independent.

Replace size_t members with ksize_t (uint64_t) and pointer members
(never used as pointers in userspace, but instead as unique
idenitifiers) with kvaddr_t (uint64_t). This makes the structs
identical between 32-bit and 64-bit ABIs.

On 64-bit bit systems, the ABI is maintained. On 32-bit systems,
this is an ABI breaking change. The ABI of most of these structs
was previously broken in r315662.  This also imposes a small API
change on userspace consumers who must handle kernel pointers
becoming virtual addresses.

PR:		228301 (exp-run by antoine)
Reviewed by:	jtl, kib, rwatson (various versions)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15386"

In RTEMS, there is no user/kernel space separation.  So, use the types
size_t and uintptr_t.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:07:29 +02:00
Sebastian Huber d35971f392 RTEMS: Introduce <machine/_kernel_mman.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber a2a8600f7d RTEMS: Introduce <machine/_kernel_socket.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber 764d748c9c RTEMS: Introduce <machine/_kernel_if.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber 0c0dd28596 RTEMS: Introduce <machine/_kernel_in.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber 9ce55ee716 RTEMS: Introduce <machine/_kernel_in6.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber c07fa084e0 RTEMS: Introduce <machine/_kernel_uio.h>
This helps to avoid Newlib updates due to FreeBSD kernel space changes.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:43 +02:00
Sebastian Huber 9bbf89dd11 RTEMS: Add __BSD_VISIBLE in <sys/_termios.h>
The __XSI_VISIBLE is not enabled by default in Newlib.  This is an
incompatiblity between FreeBSD and glibc.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:42 +02:00
Sebastian Huber 890c86d633 RTEMS: Update FreeBSD version tags
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
2018-08-24 15:04:39 +02:00
tuexen fe3e8b90dc Add SOL_SOCKET level socket option
with name SO_DOMAIN to get the domain of a socket.

This is helpful when testing and Solaris and Linux have the same
socket option using the same name.

Reviewed by:		bcr@, rrs@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D16791
2018-08-24 15:00:04 +02:00
jtl 823b096471 Implement a limit on on the number of IPv6 reassembly
queues per bucket.

There is a hashing algorithm which should distribute IPv6 reassembly
queues across the available buckets in a relatively even way. However,
if there is a flaw in the hashing algorithm which allows a large number
of IPv6 fragment reassembly queues to end up in a single bucket, a per-
bucket limit could help mitigate the performance impact of this flaw.

Implement such a limit, with a default of twice the maximum number of
reassembly queues divided by the number of buckets. Recalculate the
limit any time the maximum number of reassembly queues changes.
However, allow the user to override the value using a sysctl
(net.inet6.ip6.maxfragbucketsize).

Reviewed by:	jhb
Security:	FreeBSD-SA-18:10.ip
Security:	CVE-2018-6923
2018-08-24 15:00:04 +02:00
jtl 0e5c59050d Add a limit of the number of fragments per IPv6 packet.
The IPv4 fragment reassembly code supports a limit on the number of
fragments per packet. The default limit is currently 17 fragments.
Among other things, this limit serves to limit the number of fragments
the code must parse when trying to reassembly a packet.

Add a limit to the IPv6 reassembly code. By default, limit a packet
to 65 fragments (64 on the queue, plus one final fragment to complete
the packet). This allows an average fragment size of 1,008 bytes, which
should be sufficient to hold a fragment. (Recall that the IPv6 minimum
MTU is 1280 bytes. Therefore, this configuration allows a full-size
IPv6 packet to be fragmented on a link with the minimum MTU and still
carry approximately 272 bytes of headers before the fragmented portion
of the packet.)

Users can adjust this limit using the net.inet6.ip6.maxfragsperpacket
sysctl.

Reviewed by:	jhb
Security:	FreeBSD-SA-18:10.ip
Security:	CVE-2018-6923
2018-08-24 15:00:04 +02:00
rrs 215e33310b This commit brings in a new refactored TCP stack called Rack.
Rack includes the following features: - A different SACK processing
scheme (the old sack structures are not used). - RACK (Recent
acknowledgment) where counting dup-acks is no longer done instead time
is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss
Probe) where we will probe for tail-losses to attempt to try not to take
a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS -
PRR (partial rate reduction) see the RFC.

Once built into your kernel, you can select this stack by either
socket option with the name of the stack is "rack" or by setting
the global sysctl so the default is rack.

Note that any connection that does not support SACK will be kicked
back to the "default" base  FreeBSD stack (currently known as "default").

To build this into your kernel you will need to enable in your
kernel:
   makeoptions WITH_EXTRA_TCP_STACKS=1
   options TCPHPTS

Sponsored by:	Netflix Inc.
Differential Revision:		https://reviews.freebsd.org/D15525
2018-08-24 15:00:04 +02:00
sbruno b40c48e057 Load balance sockets with new SO_REUSEPORT_LB option.
This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple
programs or threads to bind to the same port and incoming connections will be
load balanced using a hash function.

Most of the code was copied from a similar patch for DragonflyBSD.

However, in DragonflyBSD, load balancing is a global on/off setting and can not
be set per socket. This patch allows for simultaneous use of both the current
SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system.

Required changes to structures:
Globally change so_options from 16 to 32 bit value to allow for more options.
Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets.

Limitations:
As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or
threads sharing the same socket).

This is a substantially different contribution as compared to its original
incarnation at svn r332894 and reverted at svn r332967.  Thanks to rwatson@
for the substantive feedback that is included in this commit.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Obtained from:	DragonflyBSD
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11003
2018-08-24 15:00:04 +02:00
mmacy 44e0190a8c iflib(9): Add support for cloning pseudo interfaces
Part 3 of many ...
The VPC framework relies heavily on cloning pseudo interfaces
(vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc).

This pulls in that piece. Some ancillary changes get pulled
in as a side effect.

Reviewed by:	shurd@
Approved by:	sbruno@
Sponsored by:	Joyent, Inc.
Differential Revision:	https://reviews.freebsd.org/D15347
2018-08-24 15:00:04 +02:00