newlib-cygwin

Commit Graph

Author	SHA1	Message	Date
Alexander V. Chernikov	3be97ff62c	Revert "SO_RERROR indicates that receive buffer overflows" Wrong version of the change was pushed inadvertenly. This reverts commit 4a01b854ca5c2e5124958363b3326708b913af71.	2022-07-11 13:19:29 +02:00
Alexander V. Chernikov	2ba2e1e052	SO_RERROR indicates that receive buffer overflows should be handled as errors. Historically receive buffer overflows have been ignored and programs could not tell if they missed messages or messages had been truncated because of overflows. Since programs historically do not expect to get receive overflow errors, this behavior is not the default. This is really really important for programs that use route(4) to keep in sync with the system. If we loose a message then we need to reload the full system state, otherwise the behaviour from that point is undefined and can lead to chasing bogus bug reports.	2022-07-11 13:19:29 +02:00
Alex Richardson	8054ce555f	Expose clang's alignment builtins and use them for roundup2/rounddown2 This makes roundup2/rounddown2 type- and const-preserving and allows using it on pointer types without casting to uintptr_t first. Not performing pointer-to-integer conversions also helps the compiler's optimization passes and can therefore result in better code generation. When using it with integer values there should be no change other than the compiler checking that the alignment value is a valid power-of-two. I originally implemented these builtins for CHERI a few years ago and they have been very useful for CheriBSD. However, they are also useful for non-CHERI code so I was able to upstream them for Clang 10.0. Rationale from the clang documentation: Clang provides builtins to support checking and adjusting alignment of pointers and integers. These builtins can be used to avoid relying on implementation-defined behavior of arithmetic on integers derived from pointers. Additionally, these builtins retain type information and, unlike bitwise arithmetic, they can perform semantic checking on the alignment value. There is also a feature request for GCC, so GCC may also support it in the future: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98641 Reviewed By: brooks, jhb, imp Differential Revision: https://reviews.freebsd.org/D28332	2022-07-11 13:19:29 +02:00
Gleb Smirnoff	5bc5689a6a	Catch up with 6edfd179c86: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG. Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was extended to array of such pointers. Then, such mbufs were augmented with header/trailer. Basically, extended mbufs are extended, and set of features is subject to change. The new name should be generic enough to avoid further renaming.	2022-07-11 11:52:46 +02:00
Konstantin Belousov	581bde91a5	Add tcgetwinsize(3) and tcsetwinsize(3) to termios These functions get/set tty winsize respectively, and are trivial wrappers around corresponding termio ioctls. The functions are expected to be a part of POSIX.1 issue 8: https://www.austingroupbugs.net/view.php?id=1151#c3856. They are currently available in NetBSD and in musl libc. PR: 251868 Submitted by: Soumendra Ganguly <soumendraganguly@gmail.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27650	2022-07-11 11:52:46 +02:00
Andrew Gallatin	c76896074b	Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain In order to efficiently serve web traffic on a NUMA machine, one must avoid as many NUMA domain crossings as possible. With SO_REUSEPORT_LB, a number of workers can share a listen socket. However, even if a worker sets affinity to a core or set of cores on a NUMA domain, it will receive connections associated with all NUMA domains in the system. This will lead to cross-domain traffic when the server writes to the socket or calls sendfile(), and memory is allocated on the server's local NUMA node, but transmitted on the NUMA node associated with the TCP connection. Similarly, when the server reads from the socket, he will likely be reading memory allocated on the NUMA domain associated with the TCP connection. This change provides a new socket ioctl, TCP_REUSPORT_LB_NUMA. A server can now tell the kernel to filter traffic so that only incoming connections associated with the desired NUMA domain are given to the server. (Of course, in the case where there are no servers sharing the listen socket on some domain, then as a fallback, traffic will be hashed as normal to all servers sharing the listen socket regardless of domain). This allows a server to deal only with traffic that is local to its NUMA domain, and avoids cross-domain traffic in most cases. This patch, and a corresponding small patch to nginx to use TCP_REUSPORT_LB_NUMA allows us to serve 190Gb/s of kTLS encrypted https media content from dual-socket Xeons with only 13% (as measured by pcm.x) cross domain traffic on the memory controller. Reviewed by: jhb, bz (earlier version), bcr (man page) Tested by: gonzo Sponsored by: Netfix Differential Revision: https://reviews.freebsd.org/D21636	2022-07-11 11:52:46 +02:00
Brooks Davis	70b6efc47d	style(9): Correct whitespace in struct definitions struct ifconf and struct ifreq use the odd style "struct<tab>foo". struct ifdrv seems to have tried to follow this but was committed with spaces in place of most tabs resulting in "struct<space><space>ifdrv". MFC after: 3 days	2022-07-11 11:52:46 +02:00
Conrad Meyer	3f7425e8bb	unix(4): Enhance LOCAL_CREDS_PERSISTENT ABI As this ABI is still fresh (r367287), let's correct some mistakes now: - Version the structure to allow for future changes - Include sender's pid in control message structure - Use a distinct control message type from the cmsgcred / sockcred mess Discussed with: kib, markj, trasz Differential Revision: https://reviews.freebsd.org/D27084	2022-07-11 11:52:46 +02:00
Conrad Meyer	55dec604f8	unix(4): Add SOL_LOCAL:LOCAL_CREDS_PERSISTENT This option is intended to be semantically identical to Linux's SOL_SOCKET:SO_PASSCRED. For now, it is mutually exclusive with the pre-existing sockopt SOL_LOCAL:LOCAL_CREDS. Reviewed by: markj (penultimate version) Differential Revision: https://reviews.freebsd.org/D27011	2022-07-11 11:52:46 +02:00
Warner Losh	1cb590ab48	Integrate 4.4BSD-Lite2 changes to IOC_* definitions Bring in the long-overdue 4.4BSD-Lite2 rev 8.3 by cgd of sys/ioccom.h. This uses UL suffix for the IOC_* constants so they don't sign extend. Also bring in the handy diagram from NetBSD's version of this file. This alters the 4.4BSD-Lite2 code slightly in a way that's semantically the same but more compact. This should stop the warnings from Chrome for bogus sign extension. Reviewed by: kib@, jhb@ Differential Revision: https://reviews.freebsd.org/D26423	2022-07-11 11:52:46 +02:00
John Baldwin	5ea36d92e6	Support hardware rate limiting (pacing) with TLS offload. - Add a new send tag type for a send tag that supports both rate limiting (packet pacing) and TLS offload (mostly similar to D22669 but adds a separate structure when allocating the new tag type). - When allocating a send tag for TLS offload, check to see if the connection already has a pacing rate. If so, allocate a tag that supports both rate limiting and TLS offload rather than a plain TLS offload tag. - When setting an initial rate on an existing ifnet KTLS connection, set the rate in the TCP control block inp and then reset the TLS send tag (via ktls_output_eagain) to reallocate a TLS + ratelimit send tag. This allocates the TLS send tag asynchronously from a task queue, so the TLS rate limit tag alloc is always sleepable. - When modifying a rate on a connection using KTLS, look for a TLS send tag. If the send tag is only a plain TLS send tag, assume we failed to allocate a TLS ratelimit tag (either during the TCP_TXTLS_ENABLE socket option, or during the send tag reset triggered by ktls_output_eagain) and ignore the new rate. If the send tag is a ratelimit TLS send tag, change the rate on the TLS tag and leave the inp tag alone. - Lock the inp lock when setting sb_tls_info for a socket send buffer so that the routines in tcp_ratelimit can safely dereference the pointer without needing to grab the socket buffer lock. - Add an IFCAP_TXTLS_RTLMT capability flag and associated administrative controls in ifconfig(8). TLS rate limit tags are only allocated if this capability is enabled. Note that TLS offload (whether unlimited or rate limited) always requires IFCAP_TXTLS[46]. Reviewed by: gallatin, hselasky Relnotes: yes Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D26691	2022-07-11 11:52:46 +02:00
Andrey V. Elsukov	b8e36b9251	Implement SIOCGIFALIAS. It is lightweight way to check if an IPv4 address exists. Submitted by: Roy Marples Reviewed by: gnn, melifaro MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26636	2022-07-11 11:52:46 +02:00
Richard Scheffenegger	3f0cc70c13	Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow. This adds a new IP_PROTO / IPV6_PROTO setsockopt (getsockopt) option IP(V6)_VLAN_PCP, which can be set to -1 (interface default), or explicitly to any priority between 0 and 7. Note that for untagged traffic, explicitly adding a priority will insert a special 801.1Q vlan header with vlan ID = 0 to carry the priority setting Reviewed by: gallatin, rrs MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D26409	2022-07-11 11:52:46 +02:00
Konstantin Belousov	ec997fae0e	Fix typo. Sponsored by: Mellanox Technologies/NVIDIA Networking MFC after: 3 days	2022-07-11 11:52:46 +02:00
Alexander V. Chernikov	48ba673ce9	Introduce scalable route multipath. This change is based on the nexthop objects landed in D24232. The change introduces the concept of nexthop groups. Each group contains the collection of nexthops with their relative weights and a dataplane-optimized structure to enable efficient nexthop selection. Simular to the nexthops, nexthop groups are immutable. Dataplane part gets compiled during group creation and is basically an array of nexthop pointers, compiled w.r.t their weights. With this change, `rt_nhop` field of `struct rtentry` contains either nexthop or nexthop group. They are distinguished by the presense of NHF_MULTIPATH flag. All dataplane lookup functions returns pointer to the nexthop object, leaving nexhop groups details inside routing subsystem. User-visible changes: The change is intended to be backward-compatible: all non-mpath operations should work as before with ROUTE_MPATH and net.route.multipath=1. All routes now comes with weight, default weight is 1, maximum is 2^24-1. Current maximum multipath group width is statically set to 64. This will become sysctl-tunable in the followup changes. Using functionality: * Recompile kernel with ROUTE_MPATH * set net.route.multipath to 1 route add -6 2001:db8::/32 2001:db8::2 -weight 10 route add -6 2001:db8::/32 2001:db8::3 -weight 20 netstat -6On Nexthop groups data Internet6: GrpIdx NhIdx Weight Slots Gateway Netif Refcnt 1 ------- ------- ------- --------------------------------------- --------- 1 13 10 1 2001:db8::2 vlan2 14 20 2 2001:db8::3 vlan2 Next steps: * Land outbound hashing for locally-originated routes ( D26523 ). * Fix net/bird multipath (net/frr seems to work fine) * Add ROUTE_MPATH to GENERIC * Set net.route.multipath=1 by default Tested by: olivier Reviewed by: glebius Relnotes: yes Differential Revision: https://reviews.freebsd.org/D26449	2022-07-11 11:52:46 +02:00
Ed Maste	9dd91a8330	add SIOCGIFDATA ioctl For interfaces that do not support SIOCGIFMEDIA (for which there are quite a few) the only fallback is to query the interface for if_data->ifi_link_state. While it's possible to get at if_data for an interface via getifaddrs(3) or sysctl, both are heavy weight mechanisms. SIOCGIFDATA is a simple ioctl to retrieve this fast with very little resource use in comparison. This implementation mirrors that of other similar ioctls in FreeBSD. Submitted by: Roy Marples <roy@marples.name> Reviewed by: markj MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D26538	2022-07-11 11:52:46 +02:00
Richard Scheffenegger	7b30b9f648	TCP: send full initial window when timestamps are in use The fastpath in tcp_output tries to send out full segments, and avoid sending partial segments by comparing against the static t_maxseg variable. That value does not consider tcp options like timestamps, while the initial window calculation is using the correct dynamic tcp_maxseg() function. Due to this interaction, the last, full size segment is considered too short and not sent out immediately. Reviewed by: tuexen MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D26478	2022-07-11 11:52:46 +02:00
Navdeep Parhar	43e76bafcd	Add two new ifnet capabilities for hw checksumming and TSO for VXLAN traffic. These are similar to the existing VLAN capabilities. Reviewed by: kib@ Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25873	2022-07-11 11:52:46 +02:00
Konstantin Belousov	1306ff4c92	Support for userspace non-transparent superpages (largepages). Created with shm_open2(SHM_LARGEPAGE) and then configured with FIOSSHMLPGCNF ioctl, largepages posix shared memory objects guarantee that all userspace mappings of it are served by superpage non-managed mappings. Only amd64 for now, both 2M and 1G superpages can be requested, the later requires CPU feature. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652	2022-07-11 11:52:46 +02:00
Mark Johnston	1a5f14a0c5	Include the psind in data returned by mincore(2). Currently we use a single bit to indicate whether the virtual page is part of a superpage. To support a forthcoming implementation of non-transparent 1GB superpages, it is useful to provide more detailed information about large page sizes. The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind) values, indicating a mapping of size psind, where psind is an index into the pagesizes array returned by getpagesizes(3), which in turn comes from the hw.pagesizes sysctl. MINCORE_PSIND(1) is equal to the old value of MINCORE_SUPER. For now, two bits are used to record the page size, permitting values of MAXPAGESIZES up to 4. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26238	2022-07-11 11:52:46 +02:00
Mateusz Guzik	d066d123f1	sys: clean up empty lines in .c and .h files	2022-07-11 11:52:46 +02:00
Mateusz Guzik	27fc846731	net: clean up empty lines in .c and .h files	2022-07-11 11:52:46 +02:00
Konstantin Belousov	941cda2c16	Add SOL_LOCAL symbolic constant for unix socket option level. The constant seems to exists on MacOS X >= 10.8. Requested by: swills Reviewed by: allanjude, kevans Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25933	2022-07-11 11:52:46 +02:00
Kyle Evans	c95c267a46	shm_open2: Implement SHM_GROW_ON_WRITE Lack of SHM_GROW_ON_WRITE is actively breaking Python's memfd_create tests, so go ahead and implement it. A future change will make memfd_create always set SHM_GROW_ON_WRITE, to match Linux behavior and unbreak Python's tests on -CURRENT. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D25502	2022-07-11 11:52:46 +02:00
Wei Hu	1a840361e8	HyperV socket implementation for FreeBSD This change adds Hyper-V socket feature in FreeBSD. New socket address family AF_HYPERV and its kernel support are added. Submitted by: Wei Hu <weh@microsoft.com> Reviewed by: Dexuan Cui <decui@microsoft.com> Relnotes: yes Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D24061	2022-07-11 11:52:46 +02:00
John Baldwin	7293d1e7b6	Initial support for kernel offload of TLS receive. - Add a new TCP_RXTLS_ENABLE socket option to set the encryption and authentication algorithms and keys as well as the initial sequence number. - When reading from a socket using KTLS receive, applications must use recvmsg(). Each successful call to recvmsg() will return a single TLS record. A new TCP control message, TLS_GET_RECORD, will contain the TLS record header of the decrypted record. The regular message buffer passed to recvmsg() will receive the decrypted payload. This is similar to the interface used by Linux's KTLS RX except that Linux does not return the full TLS header in the control message. - Add plumbing to the TOE KTLS interface to request either transmit or receive KTLS sessions. - When a socket is using receive KTLS, redirect reads from soreceive_stream() into soreceive_generic(). - Note that this interface is currently only defined for TLS 1.1 and 1.2, though I believe we will be able to reuse the same interface and structures for 1.3.	2022-07-11 11:52:46 +02:00
Randall Stewart	1da65b8919	This change does a small prepratory step in getting the latest rack and bbr in from the NF repo. When those come in the OOB data handling will be fixed where Skyzaller crashes. Differential Revision: https://reviews.freebsd.org/D24575	2022-07-11 11:52:46 +02:00
Alexander V. Chernikov	b948693357	Convert route caching to nexthop caching. This change is build on top of nexthop objects introduced in r359823. Nexthops are separate datastructures, containing all necessary information to perform packet forwarding such as gateway interface and mtu. Nexthops are shared among the routes, providing more pre-computed cache-efficient data while requiring less memory. Splitting the LPM code and the attached data solves multiple long-standing problems in the routing layer, drastically reduces the coupling with outher parts of the stack and allows to transparently introduce faster lookup algorithms. Route caching was (re)introduced to minimise (slow) routing lookups, allowing for notably better performance for large TCP senders. Caching works by acquiring rtentry reference, which is protected by per-rtentry mutex. If the routing table is changed (checked by comparing the rtable generation id) or link goes down, cache record gets withdrawn. Nexthops have the same reference counting interface, backed by refcount(9). This change merely replaces rtentry with the actual forwarding nextop as a cached object, which is mostly mechanical. Other moving parts like cache cleanup on rtable change remains the same. Differential Revision: https://reviews.freebsd.org/D24340	2022-07-11 11:52:46 +02:00
Jonathan T. Looney	09e5cb57a0	Make the path length of UNIX domain sockets specified by a #define. Also, add a comment describing the historical context for this length. Reviewed by: bz, jhb, kbowling (previous version) MFC after: 2 weeks Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D24272	2022-07-11 11:52:46 +02:00
Alexander V. Chernikov	86484e84d7	Introduce nexthop objects and new routing KPI. This is the foundational change for the routing subsytem rearchitecture. More details and goals are available in https://reviews.freebsd.org/D24141 . This patch introduces concept of nexthop objects and new nexthop-based routing KPI. Nexthops are objects, containing all necessary information for performing the packet output decision. Output interface, mtu, flags, gw address goes there. For most of the cases, these objects will serve the same role as the struct rtentry is currently serving. Typically there will be low tens of such objects for the router even with multiple BGP full-views, as these objects will be shared between routing entries. This allows to store more information in the nexthop. New KPI: struct nhop_object fib4_lookup(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, uint32_t flowid); struct nhop_object fib6_lookup(uint32_t fibnum, const struct in6_addr dst6, uint32_t scopeid, uint32_t flags, uint32_t flowid); These 2 function are intended to replace all all flavours of <in_\|in6_>rtalloc[1]<_ign><_fib>, mpath functions and the previous fib[46]-generation functions. Upon successful lookup, they return nexthop object which is guaranteed to exist within current NET_EPOCH. If longer lifetime is desired, one can specify NHR_REF as a flag and get a referenced version of the nexthop. Reference semantic closely resembles rtentry one, allowing sed-style conversion. Additionally, another 2 functions are introduced to support uRPF functionality inside variety of our firewalls. Their primary goal is to hide the multipath implementation details inside the routing subsystem, greatly simplifying firewalls implementation: int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, const struct ifnet src_if); int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr dst6, uint32_t scopeid, uint32_t flags, const struct ifnet src_if); All functions have a separate scopeid argument, paving way to eliminating IPv6 scope embedding and allowing to support IPv4 link-locals in the future. Structure changes: * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size. * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz. Old KPI: During the transition state old and new KPI will coexists. As there are another 4-5 decent-sized conversion patches, it will probably take a couple of weeks. To support both KPIs, fields not required by the new KPI (most of rtentry) has to be kept, resulting in the temporary size increase. Once conversion is finished, rtentry will notably shrink. More details: * architectural overview: https://reviews.freebsd.org/D24141 * list of the next changes: https://reviews.freebsd.org/D24232 Reviewed by: ae,glebius(initial version) Differential Revision: https://reviews.freebsd.org/D24232	2022-07-11 11:52:46 +02:00
Gleb Smirnoff	f3303cf1d5	Although most of the NIC drivers are epoch ready, due to peer pressure switch over to opt-in instead of opt-out for epoch. Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch when processing its packets. Now this will create recursive entrance in epoch in >90% network drivers, but will guarantee safeness of the transition. Mark several tested drivers as IFF_KNOWSEPOCH. Reviewed by: hselasky, jeff, bz, gallatin Differential Revision: https://reviews.freebsd.org/D23674	2022-07-11 11:52:46 +02:00
Randall Stewart	0dfcaa0211	White space cleanup -- remove trailing tab's or spaces from any line. Sponsored by: Netflix Inc.	2022-07-11 11:52:46 +02:00
Gleb Smirnoff	301991542a	Introduce flag IFF_NEEDSEPOCH that marks Ethernet interfaces that supposedly may call into ether_input() without network epoch. They all need to be reviewed before 13.0-RELEASE. Some may need be fixed. The flag is not planned to be used in the kernel for a long time.	2022-07-11 11:52:46 +02:00
Michael Tuexen	ebbb6536b7	Add flags for upcoming patches related to improved ECN handling. No functional change. Submitted by: Richard Scheffenegger Reviewed by: rgrimes@, tuexen@ Differential Revision: https://reviews.freebsd.org/D22429	2022-07-11 11:52:46 +02:00
Edward Tomasz Napierala	0c4d87ca5f	Make use of the stats(3) framework in the TCP stack. This makes it possible to retrieve per-connection statistical information such as the receive window size, RTT, or goodput, using a newly added TCP_STATS getsockopt(3) option, and extract them using the stats_voistat_fetch(3) API. See the net/tcprtt port for an example consumer of this API. Compared to the existing TCP_INFO system, the main differences are that this mechanism is easy to extend without breaking ABI, and provides statistical information instead of raw "snapshots" of values at a given point in time. stats(3) is more generic and can be used in both userland and the kernel. Reviewed by: thj Tested by: thj Obtained from: Netflix Relnotes: yes Sponsored by: Klara Inc, Netflix Differential Revision: https://reviews.freebsd.org/D20655	2022-07-11 11:52:46 +02:00
David Bright	0c854dd6d1	Jail and capability mode for shm_rename; add audit support for shm_rename Co-mingling two things here: * Addressing some feedback from Konstantin and Kyle re: jail, capability mode, and a few other things * Adding audit support as promised. The audit support change includes a partial refresh of OpenBSM from upstream, where the change to add shm_rename has already been accepted. Matthew doesn't plan to work on refreshing anything else to support audit for those new event types. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: kib Relnotes: Yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22083	2022-07-11 11:52:46 +02:00
John Baldwin	12fb531a70	Add a TOE KTLS mode and a TOE hook for allocating TLS sessions. This adds the glue to allocate TLS sessions and invokes it from the TLS enable socket option handler. This also adds some counters for active TOE sessions. The TOE KTLS mode is returned by getsockopt(TLSTX_TLS_MODE) when TOE KTLS is in use on a socket, but cannot be set via setsockopt(). To simplify various checks, a TLS session now includes an explicit 'mode' member set to the value returned by TLSTX_TLS_MODE. Various places that used to check 'sw_encrypt' against NULL to determine software vs ifnet (NIC) TLS now check 'mode' instead. Reviewed by: np, gallatin Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21891	2022-07-11 11:52:46 +02:00
Kyle Evans	1ef7e3904d	MFD_*: swap ordering This API is still young enough that I would expect no one to be dependant on this yet... Swap the ordering while it's young to match Linux values to potentially ease implementation of linuxolator syscall, being able to reuse existing constants.	2022-07-11 11:52:46 +02:00
David Bright	53648039c4	Add an shm_rename syscall Add an atomic shm rename operation, similar in spirit to a file rename. Atomically unlink an shm from a source path and link it to a destination path. If an existing shm is linked at the destination path, unlink it as part of the same atomic operation. The caller needs the same permissions as shm_unlink to the shm being renamed, and the same permissions for the shm at the destination which is being unlinked, if it exists. If those fail, EACCES is returned, as with the other shm_* syscalls. truss support is included; audit support will come later. This commit includes only the implementation; the sysent-generated bits will come in a follow-on commit. Submitted by: Matthew Bryan <matthew.bryan@isilon.com> Reviewed by: jilles (earlier revision) Reviewed by: brueffer (manpages, earlier revision) Relnotes: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21423	2022-07-11 11:52:46 +02:00
Kyle Evans	9243caa8d3	Add linux-compatible memfd_create memfd_create is effectively a SHM_ANON shm_open(2) mapping with optional CLOEXEC and file sealing support. This is used by some mesa parts, some linux libs, and qemu can also take advantage of it and uses the sealing to prevent resizing the region. This reimplements shm_open in terms of shm_open2(2) at the same time. shm_open(2) will be moved to COMPAT12 shortly. Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D21393	2022-07-11 11:52:46 +02:00
Kyle Evans	99b66f5315	Add a shm_open2 syscall to support upcoming memfd_create shm_open2 allows a little more flexibility than the original shm_open. shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate shmflag argument that can be expanded later. Currently the only shmflag is to allow file sealing on the returned fd. shm_open and memfd_create will both be implemented in libc to use this new syscall. __FreeBSD_version is bumped to indicate the presence. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21393	2022-07-11 11:52:46 +02:00
Randall Stewart	878b65b3b6	This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This is a completely separate TCP stack (tcp_bbr.ko) that will be built only if you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option TCPHPTS. You can also include the RATELIMIT option if you have a NIC interface that supports hardware pacing, BBR understands how to use such a feature. Note that this commit also adds in a general purpose time-filter which allows you to have a min-filter or max-filter. A filter allows you to have a low (or high) value for some period of time and degrade slowly to another value has time passes. You can find out the details of BBR by looking at the original paper at: https://queue.acm.org/detail.cfm?id=3022184 or consult many other web resources you can find on the web referenced by "BBR congestion control". It should be noted that BBRv1 (which this is) does tend to unfairness in cases of small buffered paths, and it will usually get less bandwidth in the case of large BDP paths(when competing with new-reno or cubic flows). BBR is still an active research area and we do plan on implementing V2 of BBR to see if it is an improvement over V1. Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D21582	2022-07-11 11:52:46 +02:00
Alan Somers	ce921ffca8	Reduce namespace pollution from r349233 Define __daddr_t in _types.h and use it in filio.h Reported by: ian, bde Reviewed by: ian, imp, cem MFC after: 2 weeks MFC-With: 349233 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20715	2022-07-11 11:52:46 +02:00
Alan Somers	5a6ad7c5bc	#include <sys/types.h> from sys/filio.h This fixes world build after r349231 Reported by: Jenkins MFC after: 2 weeks MFC-With: 349231 Sponsored by: The FreeBSD Foundation	2022-07-11 11:52:46 +02:00
Alan Somers	8fe49db783	Add FIOBMAP2 ioctl This ioctl exposes VOP_BMAP information to userland. It can be used by programs like fragmentation analyzers and optimized cp implementations. But I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD and Linux. FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block number instead of 32-bit, and it also returns runp and runb. Reviewed by: mckusick MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20705	2022-07-11 11:52:46 +02:00
Brooks Davis	c42aaaea4f	Move 32-bit compat support for FIODGNAME to the right place. ioctl(2) commands only have meaning in the context of a file descriptor so translating them in the syscall layer is incorrect. The new handler users an accessor to retrieve/construct a pointer from the last member of the passed structure and relies on type punning to access the other member which requires no translation. Unlike r339174 this change supports both places FIODGNAME is handled. Reviewed by: kib Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D17475	2022-07-11 11:52:46 +02:00
Pedro F. Giffuni	eb4cbf4fd3	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2022-07-11 11:52:46 +02:00
Sebastian Huber	5c0c0e5c77	RTEMS: Remove FreeBSD version tags	2022-07-11 11:52:46 +02:00
Warner Losh	9331071f02	cdefs.h: Remove redundant #ifdefs Remove redunant #ifdef __GNUC__ inside an #if defined(__GNUC__) block. They are nops. Sponsored by: Netflix	2022-07-11 11:52:46 +02:00
Mark Johnston	f537ff8ee5	cdefs: Add a default definition for __nosanitizememory MFC after: 1 week Sponsored by: The FreeBSD Foundation	2022-07-11 11:52:46 +02:00

... 4 5 6 7 8 ...

20930 Commits All Branches Search

20930 Commits

All Branches