newlib-cygwin

Commit Graph

Author	SHA1	Message	Date
Stefan Eßer	a1071cb178	sys/bitset.h: reduce visibility of BIT_* macros Add two underscore characters "__" to names of BIT_* and BITSET_* macros to move them to the implementation name space and to prevent a name space pollution due to BIT_* macros in 3rd party programs with conflicting parameter signatures. These prefixed macro names are used in kernel header files to define macros in e.g. sched.h, sys/cpuset.h and sys/domainset.h. If C programs are built with either -D_KERNEL (automatically passed when building a kernel or kernel modules) or -D_WANT_FREENBSD_BITSET (or this macros is defined in the source code before including the bitset macros), then all macros are made visible with their previous names, too. E.g., both __BIT_SET() and BIT_SET() are visible with either of _KERNEL or _WANT_FREEBSD_BITSET defined. The main reason for this change is that some 3rd party sources including sched.h have been found to contain conflicting BIT_* macros. As a work-around, parts of shed.h have been made conditional and depend on _WITH_CPU_SET_T being set when sched.h is included. Ports that expect the full functionality provided by sched.h need to be built with -D_WITH_CPU_SET_T. But this leads to conflicts if BIT_* macros are defined in that program, too. This patch set makes all of sched.h visible again without this parameter being passed and without any name space pollution due to BIT_* macros becoming visible when sched.h is included. This patch set will be backported to the STABLE branches, but ports will need to use -D_WITH_CPU_SET_T as long as there are supported releases that do not contain these patches. Reviewed by: kib, markj MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D33235	2022-06-22 10:15:26 +02:00
Konstantin Belousov	4ac3ee88c7	sched.h: add CPU_EQUAL() for better compatibility with Linux Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32901	2022-06-22 10:15:26 +02:00
Mark Johnston	6070714e0f	cpuset(9): Add CPU_FOREACH_IS(SET\|CLR) and modify consumers to use it This implementation is faster and doesn't modify the cpuset, so it lets us avoid some unnecessary copying as well. No functional change intended. This is a re-application of commit 9068f6ea697b1b28ad1326a4c7a9ba86f08b985e. Reviewed by: cem, kib, jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32029	2022-06-22 10:15:26 +02:00
Mark Johnston	27e8401c46	bitset: Reimplement BIT_FOREACH_IS(SET\|CLR) Eliminate the nested loops and re-implement following a suggestion from rlibby. Add some simple regression tests. Reviewed by: rlibby, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32472	2022-06-22 10:15:26 +02:00
Mark Johnston	96ddb4055e	Revert "cpuset(9): Add CPU_FOREACH_IS(SET\|CLR) and modify consumers to use it" This reverts commit 9068f6ea697b1b28ad1326a4c7a9ba86f08b985e. The underlying macro needs to be reworked to avoid problems with control flow statements. Reported by: rlibby	2022-06-22 10:15:26 +02:00
Mark Johnston	112245b78f	cpuset(9): Add CPU_FOREACH_IS(SET\|CLR) and modify consumers to use it This implementation is faster and doesn't modify the cpuset, so it lets us avoid some unnecessary copying as well. No functional change intended. Reviewed by: cem, kib, jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32029	2022-06-22 10:15:26 +02:00
Mark Johnston	37a3e59636	bitset(9): Introduce BIT_FOREACH_ISSET and BIT_FOREACH_ISCLR These allow one to non-destructively iterate over the set or clear bits in a bitset. The motivation is that we have several code fragments which iterate over a CPU set like this: while ((cpu = CPU_FFS(&cpus)) != 0) { cpu--; CPU_CLR(cpu, &cpus); <do something>; } This is slow since CPU_FFS begins the search at the beginning of the bitset each time. On amd64 and arm64, CPU sets have size 256, so there are four limbs in the bitset and we do a lot of unnecessary scanning. A second problem is that this is destructive, so code which needs to preserve the original set has to make a copy. In particular, we have quite a few functions which take a cpuset_t parameter by value, meaning that each call has to copy the 32 byte cpuset_t. The new macros address both problems. Reviewed by: cem, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32028	2022-06-22 10:15:26 +02:00
Patrick Kelsey	7c03cdf47e	iflib: Improve mapping of TX/RX queues to CPUs iflib now supports mapping each (TX,RX) queue pair to the same CPU (default), to separate CPUs, or to a pair of physical and logical CPUs that share the same L2 cache. The mapping mechanism supports unequal numbers of TX and RX queues, with the excess queues always being mapped to consecutive physical CPUs. When the platform cannot distinguish between physical and logical CPUs, all are treated as physical CPUs. See the comment on get_cpuid_for_queue() for the entire matrix. The following device-specific tunables influence the mapping process: dev.<device>.<unit>.iflib.core_offset (existing) dev.<device>.<unit>.iflib.separate_txrx (existing) dev.<device>.<unit>.iflib.use_logical_cores (new) The following new, read-only sysctls provide visibility of the mapping results: dev.<device>.<unit>.iflib.{t,r}xq<n>.cpu When an iflib driver allocates TX softirqs without providing reference RX IRQs, iflib now binds those TX softirqs to CPUs using the above mapping mechanism (that is, treats them as if they were TX IRQs). Previously, such bindings were left up to the grouptaskqueue code and thus fell outside of the iflib CPU mapping strategy. Reviewed by: kbowling Tested by: olivier, pkelsey MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D24094	2022-06-22 10:15:26 +02:00
Ryan Libby	fb0a5865e4	bitset: implement BIT_TEST_CLR_ATOMIC & BIT_TEST_SET_ATOMIC That is, provide wrappers around the atomic_testandclear and atomic_testandset primitives. Submitted by: jeff Reviewed by: cem, kib, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22702	2022-06-22 10:15:26 +02:00
D Scott Phillips	9d50b44689	bitset: expand bit index type to `long` An upcoming patch to use the bitset macros for tracking vm page dump information could conceivably need more than INT_MAX bits. Expand the bit type to long so that the extra range is available on 64-bit platforms where it would most likely be needed. CPUSET_COUNT and DOMAINSET_COUNT are also modified to remain of type `int`. Reviewed by: kib, markj Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26190	2022-06-22 10:15:26 +02:00
D Scott Phillips	4c5b7bec97	bitset: add BIT_FFS_AT() for finding the first bit set greater than a start bit Reviewed by: kib Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26128	2022-06-22 10:15:26 +02:00
Konstantin Belousov	5e7a2b174a	Fix undefined behavior: left-shifting into the sign bit. Reviewed by: dim, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22898	2022-06-22 10:15:26 +02:00
Ryan Libby	de1380c36b	bitset: rename confusing macro NAND to ANDNOT s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual implementation is "and not" (or "but not"), i.e. A but not B. Fortunately this does appear to be what all existing callers want. Don't supply a NAND (not (A and B)) operation at this time. Discussed with: jeff Reviewed by: cem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22791	2022-06-22 10:15:26 +02:00
Ryan Libby	96c645a0b1	bitset: avoid pessimized code when bitset size is not constant We have a couple optimizations for when the bitset is known to be just one word. But with dynamically sized bitsets, it was actually more work to determine the size than just to do the necessary computation. Now, only use the optimization when the size is known to be constant. Reviewed by: markj Discussed with: jeff Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22639	2022-06-22 10:15:26 +02:00
Jeff Roberson	a6bd733db9	Use a precise bit count for the slab free items in UMA. This significantly shrinks embedded slab structures. Reviewed by: markj, rlibby (prior version) Differential Revision: https://reviews.freebsd.org/D22584	2022-06-22 10:15:26 +02:00
Sebastian Huber	a13a044c17	RTEMS: Remove FreeBSD version tags	2022-06-22 10:14:38 +02:00
Sebastian Huber	a485393aea	RTEMS: Add <poll.h> and <sys/poll.h> Add the POSIX header file <poll.h> which is used by the GCC 11 Ada runtime support. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2021-01-05 13:41:34 -05:00
Joel Sherrill	fcaaf40c9d	libc/sys/rtems/include/machine/_types.h: Define daddr_t to be 64 bits for RTEMS This type needs to be able to represent a position on a disk or file system.	2020-10-28 09:45:21 -05:00
Keith Packard	9042d0ce65	Use remove-advertising-clause script to edit BSD licenses This edits licenses held by Berkeley and NetBSD, both of which have removed the advertising requirement from their licenses. Signed-off-by: Keith Packard <keithp@keithp.com>	2020-01-29 19:03:31 +01:00
kib	7e9b1550fd	Add SIOCGIFDOWNREASON. The ioctl(2) is intended to provide more details about the cause of the down for the link. Eventually we might define a comprehensive list of codes for the situations. But interface also allows the driver to provide free-form null-terminated ASCII string to provide arbitrary non-formalized information. Sample implementation exists for mlx5(4), where the string is fetched from firmware controlling the port. Reviewed by: hselasky, rrs Sponsored by: Mellanox Technologies MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21527	2019-09-25 09:01:28 +02:00
jhb	1b35636119	Add kernel-side support for in-kernel TLS. KTLS adds support for in-kernel framing and encryption of Transport Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports offload of TLS for transmitted data. Key negotation must still be performed in userland. Once completed, transmit session keys for a connection are provided to the kernel via a new TCP_TXTLS_ENABLE socket option. All subsequent data transmitted on the socket is placed into TLS frames and encrypted using the supplied keys. Any data written to a KTLS-enabled socket via write(2), aio_write(2), or sendfile(2) is assumed to be application data and is encoded in TLS frames with an application data type. Individual records can be sent with a custom type (e.g. handshake messages) via sendmsg(2) with a new control message (TLS_SET_RECORD_TYPE) specifying the record type. At present, rekeying is not supported though the in-kernel framework should support rekeying. KTLS makes use of the recently added unmapped mbufs to store TLS frames in the socket buffer. Each TLS frame is described by a single ext_pgs mbuf. The ext_pgs structure contains the header of the TLS record (and trailer for encrypted records) as well as references to the associated TLS session. KTLS supports two primary methods of encrypting TLS frames: software TLS and ifnet TLS. Software TLS marks mbufs holding socket data as not ready via M_NOTREADY similar to sendfile(2) when TLS framing information is added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then called to schedule TLS frames for encryption. In the case of sendfile_iodone() calls ktls_enqueue() instead of pru_ready() leaving the mbufs marked M_NOTREADY until encryption is completed. For other writes (vn_sendfile when pages are available, write(2), etc.), the PRUS_NOTREADY is set when invoking pru_send() along with invoking ktls_enqueue(). A pool of worker threads (the "KTLS" kernel process) encrypts TLS frames queued via ktls_enqueue(). Each TLS frame is temporarily mapped using the direct map and passed to a software encryption backend to perform the actual encryption. (Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if someone wished to make this work on architectures without a direct map.) KTLS supports pluggable software encryption backends. Internally, Netflix uses proprietary pure-software backends. This commit includes a simple backend in a new ktls_ocf.ko module that uses the kernel's OpenCrypto framework to provide AES-GCM encryption of TLS frames. As a result, software TLS is now a bit of a misnomer as it can make use of hardware crypto accelerators. Once software encryption has finished, the TLS frame mbufs are marked ready via pru_ready(). At this point, the encrypted data appears as regular payload to the TCP stack stored in unmapped mbufs. ifnet TLS permits a NIC to offload the TLS encryption and TCP segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS) is allocated on the interface a socket is routed over and associated with a TLS session. TLS records for a TLS session using ifnet TLS are not marked M_NOTREADY but are passed down the stack unencrypted. The ip_output_send() and ip6_output_send() helper functions that apply send tags to outbound IP packets verify that the send tag of the TLS record matches the outbound interface. If so, the packet is tagged with the TLS send tag and sent to the interface. The NIC device driver must recognize packets with the TLS send tag and schedule them for TLS encryption and TCP segmentation. If the the outbound interface does not match the interface in the TLS send tag, the packet is dropped. In addition, a task is scheduled to refresh the TLS send tag for the TLS session. If a new TLS send tag cannot be allocated, the connection is dropped. If a new TLS send tag is allocated, however, subsequent packets will be tagged with the correct TLS send tag. (This latter case has been tested by configuring both ports of a Chelsio T6 in a lagg and failing over from one port to another. As the connections migrated to the new port, new TLS send tags were allocated for the new port and connections resumed without being dropped.) ifnet TLS can be enabled and disabled on supported network interfaces via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported across both vlan devices and lagg interfaces using failover, lacp with flowid enabled, or lacp with flowid enabled. Applications may request the current KTLS mode of a connection via a new TCP_TXTLS_MODE socket option. They can also use this socket option to toggle between software and ifnet TLS modes. In addition, a testing tool is available in tools/tools/switch_tls. This is modeled on tcpdrop and uses similar syntax. However, instead of dropping connections, -s is used to force KTLS connections to switch to software TLS and -i is used to switch to ifnet TLS. Various sysctls and counters are available under the kern.ipc.tls sysctl node. The kern.ipc.tls.enable node must be set to true to enable KTLS (it is off by default). The use of unmapped mbufs must also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS. KTLS is enabled via the KERN_TLS kernel option. This patch is the culmination of years of work by several folks including Scott Long and Randall Stewart for the original design and implementation; Drew Gallatin for several optimizations including the use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records awaiting software encryption, and pluggable software crypto backends; and John Baldwin for modifications to support hardware TLS offload. Reviewed by: gallatin, hselasky, rrs Obtained from: Netflix Sponsored by: Netflix, Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21277	2019-09-25 09:01:23 +02:00
thj	28a44b1ecd	Rename IPPROTO 33 from SEP to DCCP IPPROTO 33 is DCCP in the IANA Registry: https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml IPPROTO_SEP was added about 20 years ago in r33804. The entries were added straight from RFC1700, without regard to whether they were used. The reference in RFC1700 for SEP is '[JC120] <mystery contact>', this is an indication that the protocol number was probably in use in a private network. As RFC1700 is no longer the authoritative list of internet numbers and that IANA assinged 33 to DCCP in RFC4340, change the header to the actual authoritative source. Reviewed by: Richard Scheffenegger, bz Approved by: bz (mentor) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21178	2019-09-25 09:01:19 +02:00
rrs	693ba4025f	This commit updates rack to what is basically being used at NF as well as sets in some of the groundwork for committing BBR. The hpts system is updated as well as some other needed utilities for the entrance of BBR. This is actually part 1 of 3 more needed commits which will finally complete with BBRv1 being added as a new tcp stack. Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D20834	2019-09-25 09:01:19 +02:00
jhb	2f55e1fa06	Add an external mbuf buffer type that holds multiple unmapped pages. Unmapped mbufs allow sendfile to carry multiple pages of data in a single mbuf, without mapping those pages. It is a requirement for Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web serving workloads when used by sendfile, due to effectively compressing socket buffers by an order of magnitude, and hence reducing cache misses. For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer now points to a struct mbuf_ext_pgs structure instead of a data buffer. This structure contains an array of physical addresses (this reduces cache misses compared to an earlier version that stored an array of vm_page_t pointers). It also stores additional fields needed for in-kernel TLS such as the TLS header and trailer data that are currently unused. To more easily detect these mbufs, the M_NOMAP flag is set in m_flags in addition to M_EXT. Various functions like m_copydata() have been updated to safely access packet contents (using uiomove_fromphys()), to make things like BPF safe. NIC drivers advertise support for unmapped mbufs on transmit via a new IFCAP_NOMAP capability. This capability can be toggled via the new 'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only transmit packet contents via DMA and use bus_dma, adding the capability to if_capabilities and if_capenable should be all that is required. If a NIC does not support unmapped mbufs, they are converted to a chain of mapped mbufs (using sf_bufs to provide the mapping) in ip_output or ip6_output. If an unmapped mbuf requires software checksums, it is also converted to a chain of mapped mbufs before computing the checksum. Submitted by: gallatin (earlier version) Reviewed by: gallatin, hselasky, rrs Discussed with: ae, kp (firewalls) Relnotes: yes Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20616	2019-09-25 09:01:19 +02:00
hselasky	d41e144869	Convert all IPv4 and IPv6 multicast memberships into using a STAILQ instead of a linear array. The multicast memberships for the inpcb structure are protected by a non-sleepable lock, INP_WLOCK(), which needs to be dropped when calling the underlying possibly sleeping if_ioctl() method. When using a linear array to keep track of multicast memberships, the computed memory location of the multicast filter may suddenly change, due to concurrent insertion or removal of elements in the linear array. This in turn leads to various invalid memory access issues and kernel panics. To avoid this problem, put all multicast memberships on a STAILQ based list. Then the memory location of the IPv4 and IPv6 multicast filters become fixed during their lifetime and use after free and memory leak issues are easier to track, for example by: vmstat -m \| grep multi All list manipulation has been factored into inline functions including some macros, to easily allow for a future hash-list implementation, if needed. This patch has been tested by pho@ . Differential Revision: https://reviews.freebsd.org/D20080 Reviewed by: markj @ MFC after: 1 week Sponsored by: Mellanox Technologies	2019-09-25 09:01:15 +02:00
brooks	e94d2a0f8b	Extend mmap/mprotect API to specify the max page protections. A new macro PROT_MAX() alters a protection value so it can be OR'd with a regular protection value to specify the maximum permissions. If present, these flags specify the maximum permissions. While these flags are non-portable, they can be used in portable code with simple ifdefs to expand PROT_MAX() to 0. This change allows (e.g.) a region that must be writable during run-time linking or JIT code generation to be made permanently read+execute after writes are complete. This complements W^X protections allowing more precise control by the programmer. This change alters mprotect argument checking and returns an error when unhandled protection flags are set. This differs from POSIX (in that POSIX only specifies an error), but is the documented behavior on Linux and more closely matches historical mmap behavior. In addition to explicit setting of the maximum permissions, an experimental sysctl vm.imply_prot_max causes mmap to assume that the initial permissions requested should be the maximum when the sysctl is set to 1. PROT_NONE mappings are excluded from this for compatibility with rtld and other consumers that use such mappings to reserve address space before mapping contents into part of the reservation. A final version this is expected to provide per-binary and per-process opt-in/out options and this sysctl will go away in its current form. As such it is undocumented. Reviewed by: emaste, kib (prior version), markj Additional suggestions from: alc Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18880	2019-09-25 09:01:10 +02:00
shurd	17baf5e390	Some devices take undesired actions when RTS and DTR are asserted. Some development boards for example will reset on DTR, and some radio interfaces will transmit on RTS. This patch allows "stty -f /dev/ttyu9.init -rtsdtr" to prevent RTS and DTR from being asserted on open(), allowing these devices to be used without problems. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D20031	2019-09-25 09:01:07 +02:00
pfg	6bd0b9ed27	Fix mismatch from r342379.	2019-09-25 09:01:04 +02:00
pfg	84ba60e6eb	gai_strerror() - Update string error messages according to RFC 3493. Error messages in gai_strerror(3) vary largely among OSs. For new software we largely replaced the obsoleted EAI_NONAME and with EAI_NODATA but we never updated the corresponding message to better match the intended use. We also have references to ai_flags and ai_family which are not very descriptive for non-developer end users. Bring new new error messages based on informational RFC 3493, which has obsoleted RFC 2553, and make them consistent among the header adn manpage. MFC after: 1 month Differentical Revision: D18630	2019-09-25 08:38:27 +02:00
Sebastian Huber	9d4a6534fb	Move RTEMS and XMK specific type definitions Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2019-02-19 09:06:22 +01:00
Sebastian Huber	dc6e94551f	RTEMS: Use __uint64_t for __ino_t FreeBSD uses a 64-bit ino_t since 2017-05-23. We need this for the pipe() support in libbsd. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-12-20 12:12:38 +01:00
markj	44756a36ab	Plug routing sysctl leaks. Various structures exported by sysctl_rtsock() contain padding fields which were not being zeroed. Reported by: Thomas Barabosch, Fraunhofer FKIE Reviewed by: ae MFC after: 3 days Security: kernel memory disclosure Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18333	2018-12-20 12:12:37 +01:00
Sebastian Huber	1471e7cd74	RTEMS: Avoid <machine/param.h> in <sys/_cpuset.h> The <machine/param.h> header file exposes some unrelated stuff not covered by C or POSIX. Avoid its use in <sys/_cpuset.h> since it is included in <rtems.h>. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-11-08 09:38:12 +01:00
Joel Sherrill	037428fae3	newlib/libc/sys/rtems/include/machine/param.h: Add _KERNEL to stop method leakage The following FreeBSD kernel methods are not in any standard and prototypes/definitions were leaking into application space: + round_page() + trunc_page() + atop() + ptoa() + pgtok()	2018-10-18 17:19:50 -05:00
Sebastian Huber	738fdc6a42	RTEMS: Add struct dirent::d_type member This is used by the file system support of libstdc++ for example. Use content from latest FreeBSD <sys/dirent.h> Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-10-11 08:29:16 +02:00
Sebastian Huber	da418955f5	Move common <sys/dirent.h> content to <dirent.h> Move common content of the various <sys/dirent.h> and the latest FreeBSD <dirent.h> to <dirent.h>. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-10-11 08:29:16 +02:00
Sebastian Huber	d13c84eb07	RTEMS: Add kvaddr_t and ksize_t These types were introduced by FreeBSD commit: "Make struct xinpcb and friends word-size independent. Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique idenitifiers) with kvaddr_t (uint64_t). This makes the structs identical between 32-bit and 64-bit ABIs. On 64-bit bit systems, the ABI is maintained. On 32-bit systems, this is an ABI breaking change. The ABI of most of these structs was previously broken in r315662. This also imposes a small API change on userspace consumers who must handle kernel pointers becoming virtual addresses. PR: 228301 (exp-run by antoine) Reviewed by: jtl, kib, rwatson (various versions) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15386" In RTEMS, there is no user/kernel space separation. So, use the types size_t and uintptr_t. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:07:29 +02:00
Sebastian Huber	d35971f392	RTEMS: Introduce <machine/_kernel_mman.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	a2a8600f7d	RTEMS: Introduce <machine/_kernel_socket.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	764d748c9c	RTEMS: Introduce <machine/_kernel_if.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	0c0dd28596	RTEMS: Introduce <machine/_kernel_in.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	9ce55ee716	RTEMS: Introduce <machine/_kernel_in6.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	c07fa084e0	RTEMS: Introduce <machine/_kernel_uio.h> This helps to avoid Newlib updates due to FreeBSD kernel space changes. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:43 +02:00
Sebastian Huber	9bbf89dd11	RTEMS: Add __BSD_VISIBLE in <sys/_termios.h> The __XSI_VISIBLE is not enabled by default in Newlib. This is an incompatiblity between FreeBSD and glibc. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:42 +02:00
Sebastian Huber	890c86d633	RTEMS: Update FreeBSD version tags Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-08-24 15:04:39 +02:00
tuexen	fe3e8b90dc	Add SOL_SOCKET level socket option with name SO_DOMAIN to get the domain of a socket. This is helpful when testing and Solaris and Linux have the same socket option using the same name. Reviewed by: bcr@, rrs@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16791	2018-08-24 15:00:04 +02:00
jtl	823b096471	Implement a limit on on the number of IPv6 reassembly queues per bucket. There is a hashing algorithm which should distribute IPv6 reassembly queues across the available buckets in a relatively even way. However, if there is a flaw in the hashing algorithm which allows a large number of IPv6 fragment reassembly queues to end up in a single bucket, a per- bucket limit could help mitigate the performance impact of this flaw. Implement such a limit, with a default of twice the maximum number of reassembly queues divided by the number of buckets. Recalculate the limit any time the maximum number of reassembly queues changes. However, allow the user to override the value using a sysctl (net.inet6.ip6.maxfragbucketsize). Reviewed by: jhb Security: FreeBSD-SA-18:10.ip Security: CVE-2018-6923	2018-08-24 15:00:04 +02:00
jtl	0e5c59050d	Add a limit of the number of fragments per IPv6 packet. The IPv4 fragment reassembly code supports a limit on the number of fragments per packet. The default limit is currently 17 fragments. Among other things, this limit serves to limit the number of fragments the code must parse when trying to reassembly a packet. Add a limit to the IPv6 reassembly code. By default, limit a packet to 65 fragments (64 on the queue, plus one final fragment to complete the packet). This allows an average fragment size of 1,008 bytes, which should be sufficient to hold a fragment. (Recall that the IPv6 minimum MTU is 1280 bytes. Therefore, this configuration allows a full-size IPv6 packet to be fragmented on a link with the minimum MTU and still carry approximately 272 bytes of headers before the fragmented portion of the packet.) Users can adjust this limit using the net.inet6.ip6.maxfragsperpacket sysctl. Reviewed by: jhb Security: FreeBSD-SA-18:10.ip Security: CVE-2018-6923	2018-08-24 15:00:04 +02:00
rrs	215e33310b	This commit brings in a new refactored TCP stack called Rack. Rack includes the following features: - A different SACK processing scheme (the old sack structures are not used). - RACK (Recent acknowledgment) where counting dup-acks is no longer done instead time is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt to try not to take a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS - PRR (partial rate reduction) see the RFC. Once built into your kernel, you can select this stack by either socket option with the name of the stack is "rack" or by setting the global sysctl so the default is rack. Note that any connection that does not support SACK will be kicked back to the "default" base FreeBSD stack (currently known as "default"). To build this into your kernel you will need to enable in your kernel: makeoptions WITH_EXTRA_TCP_STACKS=1 options TCPHPTS Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15525	2018-08-24 15:00:04 +02:00
sbruno	b40c48e057	Load balance sockets with new SO_REUSEPORT_LB option. This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple programs or threads to bind to the same port and incoming connections will be load balanced using a hash function. Most of the code was copied from a similar patch for DragonflyBSD. However, in DragonflyBSD, load balancing is a global on/off setting and can not be set per socket. This patch allows for simultaneous use of both the current SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system. Required changes to structures: Globally change so_options from 16 to 32 bit value to allow for more options. Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets. Limitations: As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or threads sharing the same socket). This is a substantially different contribution as compared to its original incarnation at svn r332894 and reverted at svn r332967. Thanks to rwatson@ for the substantive feedback that is included in this commit. Submitted by: Johannes Lundberg <johalun0@gmail.com> Obtained from: DragonflyBSD Relnotes: Yes Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D11003	2018-08-24 15:00:04 +02:00
mmacy	44e0190a8c	iflib(9): Add support for cloning pseudo interfaces Part 3 of many ... The VPC framework relies heavily on cloning pseudo interfaces (vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc). This pulls in that piece. Some ancillary changes get pulled in as a side effect. Reviewed by: shurd@ Approved by: sbruno@ Sponsored by: Joyent, Inc. Differential Revision: https://reviews.freebsd.org/D15347	2018-08-24 15:00:04 +02:00
sbruno	6a98562b52	Revert r332894 at the request of the submitter. Submitted by: Johannes Lundberg <johalun0_gmail.com> Sponsored by: Limelight Networks	2018-08-24 15:00:04 +02:00
sbruno	5c636abe89	Load balance sockets with new SO_REUSEPORT_LB option This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple programs or threads to bind to the same port and incoming connections will be load balanced using a hash function. Most of the code was copied from a similar patch for DragonflyBSD. However, in DragonflyBSD, load balancing is a global on/off setting and can not be set per socket. This patch allows for simultaneous use of both the current SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system. Required changes to structures Globally change so_options from 16 to 32 bit value to allow for more options. Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets. Limitations As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or threads sharing the same socket). Submitted by: Johannes Lundberg <johanlun0@gmail.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D11003	2018-08-24 15:00:04 +02:00
brooks	341e131f7f	Add 32-bit compat for ioctls that take struct ifgroupreq. Use an accessor to access ifgr_group and ifgr_groups. Use an macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such as "case SIOCAIFGROUP:". This avoids poluting the switch statements with large numbers of #ifdefs. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14960	2018-08-24 15:00:03 +02:00
brooks	79291d6123	Use an accessor function to access ifr_data. This fixes 32-bit compat (no ioctl command defintions are required as struct ifreq is the same size). This is believed to be sufficent to fully support ifconfig on 32-bit systems. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Relnotes: yes Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14900	2018-08-24 15:00:03 +02:00
jeff	c0f64943e7	Implement several enhancements to NUMA policies. Add a new "interleave" allocation policy which stripes pages across domains with a stride or width keeping contiguity within a multi-page region. Move the kernel to the dedicated numbered cpuset #2 making it possible to assign kernel threads and memory policy separately from user. This also eliminates the need for the complicated interrupt binding code. Add a sysctl API for viewing and manipulating domainsets. Refactor some of the cpuset_t manipulation code using the generic bitset type so that it can be used for both. This probably belongs in a dedicated subr file. Attempt to improve the include situation. Reviewed by: kib Discussed with: jhb (cpuset parts) Tested by: pho (before review feedback) Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D14839	2018-08-24 15:00:03 +02:00
brooks	f967e60cab	Fix access to ifru_buffer on freebsd32. Make all kernel accesses to ifru_buffer go via access functions which take the process ABI into account and use an appropriate union to access members in the correct place in struct ifreq. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14846	2018-08-24 15:00:03 +02:00
kib	b0250c7356	Allow to specify PCP on packets not belonging to any VLAN. According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be considered as untagged, and only PCP and DEI values from the VLAN tag are meaningful. See for instance https://www.cisco.com/c/en/us/td/docs/switches/connectedgrid/cg-switch-sw-master/software/configuration/guide/vlan0/b_vlan_0.html. Make it possible to specify PCP value for outgoing packets on an ethernet interface. When PCP is supplied, the tag is appended, VLAN id set to 0, and PCP is filled by the supplied value. The code to do VLAN tag encapsulation is refactored from the if_vlan.c and moved into if_ethersubr.c. Drivers might have issues with filtering VID 0 packets on receive. This bug should be fixed for each driver. Reviewed by: ae (previous version), hselasky, melifaro Sponsored by: Mellanox Technologies MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D14702	2018-08-24 15:00:03 +02:00
brooks	9cea1c4489	Move uio enums to sys/_uio.h. Include _uio.h instead of uio.h in several headers to reduce header polution. Fix a few places that relied on header polution to get the uio.h header. I have not moved struct uio as many more things that use it rely on header polution to get other definitions from uio.h. Reviewed by: cem, kib, markj Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14811	2018-08-24 15:00:03 +02:00
jtl	61d5f8adfa	Add the "TCP Blackbox Recorder" which we discussed at the developer summits at BSDCan and BSDCam in 2017. The TCP Blackbox Recorder allows you to capture events on a TCP connection in a ring buffer. It stores metadata with the event. It optionally stores the TCP header associated with an event (if the event is associated with a packet) and also optionally stores information on the sockets. It supports setting a log ID on a TCP connection and using this to correlate multiple connections that share a common log ID. You can log connections in different modes. If you are doing a coordinated test with a particular connection, you may tell the system to put it in mode 4 (continuous dump). Or, if you just want to monitor for errors, you can put it in mode 1 (ring buffer) and dump all the ring buffers associated with the connection ID when we receive an error signal for that connection ID. You can set a default mode that will be applied to a particular ratio of incoming connections. You can also manually set a mode using a socket option. This commit includes only basic probes. rrs@ has added quite an abundance of probes in his TCP development work. He plans to commit those soon. There are user-space programs which we plan to commit as ports. These read the data from the log device and output pcapng files, and then let you analyze the data (and metadata) in the pcapng files. Reviewed by: gnn (previous version) Obtained from: Netflix, Inc. Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11085	2018-08-24 15:00:03 +02:00
brooks	4d144963ea	Add _IOC_NEWLEN() and _IOC_NEWTYPE() macros. These macros take an existing ioctl(2) command and replace the length with the specified length or length of the specified type respectively. These can be used to define commands for 32-bit compatibility with fewer opportunities for cut-and-paste errors then a whole new definition. Reviewed by: cem, kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14706	2018-08-24 15:00:03 +02:00
pkelsey	b4d6660d85	This is an implementation of the client side of TCP Fast Open (TFO) [RFC7413]. It also includes a pre-shared key mode of operation in which the server requires the client to be in possession of a shared secret in order to successfully open TFO connections with that server. The names of some existing fastopen sysctls have changed (e.g., net.inet.tcp.fastopen.enabled -> net.inet.tcp.fastopen.server_enable). Reviewed by: tuexen MFC after: 1 month Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14047	2018-08-24 15:00:03 +02:00
ae@FreeBSD.org	b43341334e	Follow the RFC6980 and silently ignore following IPv6 NDP messages that had the IPv6 fragmentation header: o Neighbor Solicitation o Neighbor Advertisement o Router Solicitation o Router Advertisement o Redirect Introduce M_FRAGMENTED mbuf flag, and set it after IPv6 fragment reassembly is completed. Then check the presence of this flag in correspondig ND6 handling routines. PR: 224247 MFC after: 2 weeks	2018-08-24 15:00:03 +02:00
pfg	ba2eaf10ad	SPDX: license IDs for some ISC-related files.	2018-08-24 15:00:03 +02:00
glebius	d937538075	Garbage collect IFCAP_POLLING_NOCOUNT. It wasn't used since very beginning of polling(4). The module always ignored return value from driver polling handler.	2018-08-24 15:00:03 +02:00
pfg	fba31eac2e	sys/sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2018-08-24 15:00:03 +02:00
pfg	1329e846c7	include: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2018-08-24 15:00:03 +02:00
pfg	9f0f4785e8	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2018-08-24 15:00:03 +02:00
kib	91e828be4f	Use hardware timestamps to report packet timestamps for SO_TIMESTAMP and other similar socket options. Provide new control message SCM_TIME_INFO to supply information about timestamp. Currently it indicates that the timestamp was hardware-assisted and high-precision, for software timestamps the message is not returned. Reserved fields are added to ABI to report additional info about it, it is expected that raw hardware clock value might be useful for some applications. Reviewed by: gallatin (previous version), hselasky Sponsored by: Mellanox Technologies MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D12638	2018-08-24 15:00:02 +02:00
kib	7ff81234c4	Add a place for a driver to report rx timestamps in nanoseconds from boot for the received packets. The rcv_tstmp field overlaps the place of Ln header length indicators, not used by received packets. The basic pkthdr rearrangement change in sys/mbuf.h was provided by gallatin. There are two accompanying M_ flags: M_TSTMP means that there is the timestamp (and it was generated by hardware). Another flag M_TSTMP_HPREC indicates that the timestamp is high-precision. Practically M_TSTMP_HPREC means that hardware provided additional precision comparing with the stamps when the flag is not set. E.g., for ConnectX all packets are stamped by hardware when PCIe transaction to write out the completion descriptor is performed, but PTP packet are stamped on port. For Intel cards, when PTP assist is enabled, only PTP packets are stamped in the limited number of registers, so if Intel cards ever start support this mechanism, they would always set M_TSTMP \| M_TSTMP_HPREC if hardware timestamp is present for the given packet. Add IFCAP_HWRXTSTMP interface capability to indicate the support for hardware rx timestamping, and ifconfig(8) command to toggle it. Based on the patch by: gallatin Reviewed by: gallatin (previous version), hselasky Sponsored by: Mellanox Technologies MFC after: 2 weeks (? mbuf KBI issue) X-Differential revision: https://reviews.freebsd.org/D12638	2018-08-24 15:00:02 +02:00
sephe	1182b9fe17	if: Add ioctls to get RSS key and hash type/function. It will be needed by hn(4) to configure its RSS key and hash type/function in the transparent VF mode in order to match VF's RSS settings. The description of the transparent VF mode and the RSS hash value issue are here: https://svnweb.freebsd.org/base?view=revision&revision=322299 https://svnweb.freebsd.org/base?view=revision&revision=322485 These are generic enough to promise two independent IOCs instead of abusing SIOCGDRVSPEC. Setting RSS key and hash type/function is a different story, which probably requires more discussion. Comment about UDP_{IPV4,IPV6,IPV6_EX} were only in the patch in the review request; these hash types are standardized now. Reviewed by: gallatin MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12174	2018-08-24 15:00:02 +02:00
des	b329ee9386	Correct sysctl names.	2018-08-24 15:00:02 +02:00
kib	471f29861e	Relax visibility for some termios symbols. They are defined by XSI or newer SUS. This is a follow-up to r318780. Reported by: jbeich Obtained from: DragonflyBSD commit e08b3836c962 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-08-24 15:00:02 +02:00
kib	eb82d7086c	Implement address space guards. Guard, requested by the MAP_GUARD mmap(2) flag, prevents the reuse of the allocated address space, but does not allow instantiation of the pages in the range. It is useful for more explicit support for usual two-stage reserve then commit allocators, since it prevents accidental instantiation of the mapping, e.g. by mprotect(2). Use guards to reimplement stack grow code. Explicitely track stack grow area with the guard, including the stack guard page. On stack grow, trivial shift of the guard map entry and stack map entry limits makes the stack expansion. Move the code to detect stack grow and call vm_map_growstack(), from vm_fault() into vm_map_lookup(). As result, it is impossible to get random mapping to occur in the stack grow area, or to overlap the stack guard page. Enable stack guard page by default. Reviewed by: alc, markj Man page update reviewed by: alc, bjk, emaste, markj, pho Tested by: pho, Qualys Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D11306 (man pages)	2018-08-24 15:00:02 +02:00
glebius	99b9b925fe	Listening sockets improvements. o Separate fields of struct socket that belong to listening from fields that belong to normal dataflow, and unionize them. This shrinks the structure a bit. - Take out selinfo's from the socket buffers into the socket. The first reason is to support braindamaged scenario when a socket is added to kevent(2) and then listen(2) is cast on it. The second reason is that there is future plan to make socket buffers pluggable, so that for a dataflow socket a socket buffer can be changed, and in this case we also want to keep same selinfos through the lifetime of a socket. - Remove struct struct so_accf. Since now listening stuff no longer affects struct socket size, just move its fields into listening part of the union. - Provide sol_upcall field and enforce that so_upcall_set() may be called only on a dataflow socket, which has buffers, and for listening sockets provide solisten_upcall_set(). o Remove ACCEPT_LOCK() global. - Add a mutex to socket, to be used instead of socket buffer lock to lock fields of struct socket that don't belong to a socket buffer. - Allow to acquire two socket locks, but the first one must belong to a listening socket. - Make soref()/sorele() to use atomic(9). This allows in some situations to do soref() without owning socket lock. There is place for improvement here, it is possible to make sorele() also to lock optionally. - Most protocols aren't touched by this change, except UNIX local sockets. See below for more information. o Reduce copy-and-paste in kernel modules that accept connections from listening sockets: provide function solisten_dequeue(), and use it in the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4), infiniband, rpc. o UNIX local sockets. - Removal of ACCEPT_LOCK() global uncovered several races in the UNIX local sockets. Most races exist around spawning a new socket, when we are connecting to a local listening socket. To cover them, we need to hold locks on both PCBs when spawning a third one. This means holding them across sonewconn(). This creates a LOR between pcb locks and unp_list_lock. - To fix the new LOR, abandon the global unp_list_lock in favor of global unp_link_lock. Indeed, separating these two locks didn't provide us any extra parralelism in the UNIX sockets. - Now call into uipc_attach() may happen with unp_link_lock hold if, we are accepting, or without unp_link_lock in case if we are just creating a socket. - Another problem in UNIX sockets is that uipc_close() basicly did nothing for a listening socket. The vnode remained opened for connections. This is fixed by removing vnode in uipc_close(). Maybe the right way would be to do it for all sockets (not only listening), simply move the vnode teardown from uipc_detach() to uipc_close()? Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D9770	2018-08-24 15:00:02 +02:00
delphij	ca3b7a988a	Implement INHERIT_ZERO for minherit(2). INHERIT_ZERO is an OpenBSD feature. When a page is marked as such, it would be zeroed upon fork(). This would be used in new arc4random(3) functions. PR: 182610 Reviewed by: kib (earlier version) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D427	2018-08-24 15:00:02 +02:00
imp	16636ede3c	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2018-08-24 15:00:02 +02:00
ed@FreeBSD.org	08139e557b	mprotect(): Change prototype to comply to POSIX. Our mprotect() function seems to take a "const void " address to the pages whose permissions need to be adjusted. POSIX uses "void ". Simply stick to the POSIX one to prevent us from writing unportable code. PR: 211423 (exp-run) Tested by: antoine@ (Thanks!)	2018-08-24 15:00:02 +02:00
kib	c3df6d5155	Implement process-shared locks support for libthr.so.3, without breaking the ABI. Special value is stored in the lock pointer to indicate shared lock, and offline page in the shared memory is allocated to store the actual lock. Reviewed by: vangyzen (previous version) Discussed with: deischen, emaste, jhb, rwatson, Martin Simmons <martin@lispworks.com> Tested by: pho Sponsored by: The FreeBSD Foundation	2018-08-24 15:00:02 +02:00
jhb	7cfc736e89	Add a new file operations hook for mmap operations. File type-specific logic is now placed in the mmap hook implementation rather than requiring it to be placed in sys/vm/vm_mmap.c. This hook allows new file types to support mmap() as well as potentially allowing mmap() for existing file types that do not currently support any mapping. The vm_mmap() function is now split up into two functions. A new vm_mmap_object() function handles the "back half" of vm_mmap() and accepts a referenced VM object to map rather than a (handle, handle_type) tuple. vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a a VM object and then calling vm_mmap_object() to handle the actual mapping. The vm_mmap() function remains for use by other parts of the kernel (e.g. device drivers and exec) but now only supports mapping vnodes, character devices, and anonymous memory. The mmap() system call invokes vm_mmap_object() directly with a NULL object for anonymous mappings. For mappings using a file descriptor, the descriptors fo_mmap() hook is invoked instead. The fo_mmap() hook is responsible for performing type-specific checks and adjustments to arguments as well as possibly modifying mapping parameters such as flags or the object offset. The fo_mmap() hook routines then call vm_mmap_object() to handle the actual mapping. The fo_mmap() hook is optional. If it is not set, then fo_mmap() will fail with ENODEV. A fo_mmap() hook is implemented for regular files, character devices, and shared memory objects (created via shm_open()). While here, consistently use the VM_PROT_* constants for the vm_prot_t type for the 'prot' variable passed to vm_mmap() and vm_mmap_object() as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines. Previously some places were using the mmap()-specific PROT_* constants instead. While this happens to work because PROT_xx == VM_PROT_xx, using VM_PROT_* is more correct. Differential Revision: https://reviews.freebsd.org/D2658 Reviewed by: alc (glanced over), kib MFC after: 1 month Sponsored by: Chelsio	2018-08-24 15:00:02 +02:00
jhb	60b466fbc2	Retire the unimplemented MAP_RENAME and MAP_NORESERVE flags to mmap(2). Older binaries are still permitted to use these flags. PR: 193961 (exp-run in ports) Differential Revision: https://reviews.freebsd.org/D848 Reviewed by: kib	2018-08-24 15:00:02 +02:00
jhb	3d5043e2cb	Add a new fo_fill_kinfo fileops method to add type-specific information to struct kinfo_file. - Move the various fill_*_info() methods out of kern_descrip.c and into the various file type implementations. - Rework the support for kinfo_ofile to generate a suitable kinfo_file object for each file and then convert that to a kinfo_ofile structure rather than keeping a second, different set of code that directly manipulates type-specific file information. - Remove the shm_path() and ksem_info() layering violations. Differential Revision: https://reviews.freebsd.org/D775 Reviewed by: kib, glebius (earlier version)	2018-08-24 15:00:02 +02:00
kib	de24ef326d	Add MAP_EXCL flag for mmap(2). It should be combined with MAP_FIXED, and prevents the request from deleting existing mappings in the region, failing instead. Reviewed by: alc Discussed with: jhb Tested by: markj, pho (previous version, as part of the bigger patch) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-08-24 15:00:02 +02:00
jhb	472476a5a7	Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)	2018-08-24 15:00:02 +02:00
kib	e6a85661ce	Implement read(2)/write(2) and neccessary lseek(2) for posix shmfd. Add MAC framework entries for posix shm read and write. Do not allow implicit extension of the underlying memory segment past the limit set by ftruncate(2) by either of the syscalls. Read and write returns short i/o, lseek(2) fails with EINVAL when resulting offset does not fit into the limit. Discussed with: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2018-08-24 15:00:02 +02:00
Sebastian Huber	916ef5fb88	RTEMS: Unconditionally define _off_t to int64_t Exotic RTEMS targets can define this back to int32_t as an exception if there are good reasons. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-07-20 06:53:21 +02:00
Joel Sherrill	5b97e36239	rtems/.../dirent.h: Add alphasort() prototype	2018-03-13 09:11:47 -05:00
Sebastian Huber	f641474cb2	RTEMS: Use int for _CLOCKID_T_ Linux and FreeBSD use int as well. In addition, this fixes an Ada incompatiblity problem on 64-bit targets. See also GCC: gcc/ada/libgnarl/s-osinte__rtems.ads Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2018-03-06 11:40:16 +01:00
Sebastian Huber	dadc9e7e4a	RTEMS: Add semaphore <sys/lock.h> functions Declare semaphore try wait and post binary functions. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-11-30 07:00:45 +01:00
Sebastian Huber	5a2ab9d55e	RTEMS: Timed wait by ticks <sys/lock.h> functions Declare timed wait by ticks functions. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-11-30 07:00:45 +01:00
Sebastian Huber	186166f67a	RTEMS: Add set/get name <sys/lock.h> functions Add inline functions to set/get the name. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-11-30 07:00:45 +01:00
Sebastian Huber	ce189d8afe	RTEMS: Remove internal timecounter API Change copyright. Original BSD content moved to <machine/_kernel_time.h>. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-26 08:47:21 +02:00
Sebastian Huber	c165a27c01	RTEMS: Fix _PTHREAD_MUTEX_INITIALIZER Add missing braces around initializer. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-13 08:07:13 +02:00
Sebastian Huber	3a79700c2d	RTEMS: Make pthread_mutex_t self-contained Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:13 +02:00
Sebastian Huber	55c5dda9b5	RTEMS: Make pthread_cond_t self-contained Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:13 +02:00
Sebastian Huber	d902eef093	RTEMS: Make pthread_rwlock_t self-contained Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:13 +02:00
Sebastian Huber	9187bb23a0	RTEMS: Make pthread_barrier_t self-contained Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:13 +02:00
Sebastian Huber	8253c240cb	RTEMS: Make sem_t self-contained Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:12 +02:00
Sebastian Huber	4fef7312b3	RTEMS: Optimize pthread_once_t Reduce size of pthread_once_t and make it zero-initialized. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-10-05 14:56:12 +02:00
Sebastian Huber	524eb4dc29	RTEMS: Use __uint64_t for _CLOCK_T_ This addresses: https://devel.rtems.org/ticket/2135 Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>	2017-08-25 14:25:42 +02:00

1 2 3 4 5

208 Commits