There is only one function in init/initramfs.c that is in the .text
section, and it is marked __weak. When building with clang-12 and the
integrated assembler, this leads to a bug with recordmcount:
./scripts/recordmcount "init/initramfs.o"
Cannot find symbol for section 2: .text.
init/initramfs.o: failed
I'm not quite sure what exactly goes wrong, but I notice that this
function is only ever called from an __init function, and normally
inlined. Marking it __init as well is clearly correct and it leads to
recordmcount no longer complaining.
Link: https://lkml.kernel.org/r/20201204165742.3815221-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Barret Rhoden <brho@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull Kbuild fixes from Masahiro Yamada:
- Move -Wcast-align to W=3, which tends to be false-positive and there
is no tree-wide solution.
- Pass -fmacro-prefix-map to KBUILD_CPPFLAGS because it is a
preprocessor option and makes sense for .S files as well.
- Disable -gdwarf-2 for Clang's integrated assembler to avoid warnings.
- Disable --orphan-handling=warn for LLD 10.0.1 to avoid warnings.
- Fix undesirable line breaks in *.mod files.
* tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: avoid split lines in .mod files
kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
kbuild: Hoist '--orphan-handling' into Kconfig
Kbuild: do not emit debug info for assembly with LLVM_IAS=1
kbuild: use -fmacro-prefix-map for .S sources
Makefile.extrawarn: move -Wcast-align to W=3
Steps on the way to 5.10-rc7
Resolves a merge issue in:
arch/arm64/kernel/process.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: If22f5ca1f09e08cdb95f841f3381eda5cd31ee00
Pull bootconfig fixes from Steven Rostedt:
"Have bootconfig size and checksum be little endian
In case the bootconfig is created on one kind of endian machine, and
then read on the other kind of endian kernel, the size and checksum
will be incorrect. Instead, have both the size and checksum always be
little endian and have the tool and the kernel convert it from little
endian to or from the host endian"
* tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
docs: bootconfig: Add the endianness of fields
tools/bootconfig: Store size and checksum in footer as le32
bootconfig: Load size and checksum in the footer as le32
audio driver us edma to instore data.
fsl-edma used api in virt-dma, like vchan_init to dma opt.
There is not hardware involved. build-in DMA_VIRTUAL_CHANNELS
Bug: 160627323
Bug: 174628645
Signed-off-by: zhang sanshan <pete.zhang@nxp.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Change-Id: I5427cb6ef3725163f396836bfd3ebe23037c06f2
Currently, '--orphan-handling=warn' is spread out across four different
architectures in their respective Makefiles, which makes it a little
unruly to deal with in case it needs to be disabled for a specific
linker version (in this case, ld.lld 10.0.1).
To make it easier to control this, hoist this warning into Kconfig and
the main Makefile so that disabling it is simpler, as the warning will
only be enabled in a couple places (main Makefile and a couple of
compressed boot folders that blow away LDFLAGS_vmlinx) and making it
conditional is easier due to Kconfig syntax. One small additional
benefit of this is saving a call to ld-option on incremental builds
because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.
To keep the list of supported architectures the same, introduce
CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
gain this automatically after all of the sections are specified and size
asserted. A special thanks to Kees Cook for the help text on this
config.
Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull printk fixes from Petr Mladek:
- do not lose trailing newline in pr_cont() calls
- two trivial fixes for a dead store and a config description
* tag 'printk-for-5.10-rc6-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: finalize records with trailing newlines
printk: remove unneeded dead-store assignment
init/Kconfig: Fix CPU number in LOG_CPU_MAX_BUF_SHIFT description
In certain audio use cases, scheduling RT threads on cores that are
handling softirqs can lead to glitches. Prevent this behavior.
Bug: 31501544
Bug: 168521633
Change-Id: I99dd7aaa12c11270b28dbabea484bcc8fb8ba0c1
Signed-off-by: John Dias <joaodias@google.com>
[elavila: Port to mainline, amend commit text]
Signed-off-by: J. Avila <elavila@google.com>
Currently, LOG_BUF_SHIFT defaults to 17, which is 2 ^ 17 bytes = 128 KB,
and LOG_CPU_MAX_BUF_SHIFT defaults to 12, which is 2 ^ 12 bytes = 4 KB.
Half of 128 KB is 64 KB, so more than 16 CPUs are required for the value
to be used, as then the sum of contributions is greater than 64 KB for
the first time. My guess is, that the description was written with the
configuration values used in the SUSE in mind.
Fixes: 23b2899f7f ("printk: allow increasing the ring buffer depending on the number of CPUs")
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200811092924.6256-1-pmenzel@molgen.mpg.de
Steps on the way to 5.10-rc1
Resolves merge issues in:
drivers/net/virtio_net.c
net/xfrm/xfrm_state.c
net/xfrm/xfrm_user.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3132e7802f25cb775eb02d0b3a03068da39a6fe2
Pull more Kunit updates from Shuah Khan:
- add Kunit to kernel_init() and remove KUnit from init calls entirely.
This addresses the concern that Kunit would not work correctly during
late init phase.
- add a linker section where KUnit can put references to its test
suites.
This is the first step in transitioning to dispatching all KUnit
tests from a centralized executor rather than having each as its own
separate late_initcall.
- add a centralized executor to dispatch tests rather than relying on
late_initcall to schedule each test suite separately. Centralized
execution is for built-in tests only; modules will execute tests when
loaded.
- convert bitfield test to use KUnit framework
- Documentation updates for naming guidelines and how
kunit_test_suite() works.
- add test plan to KUnit TAP format
* tag 'linux-kselftest-kunit-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
lib: kunit: Fix compilation test when using TEST_BIT_FIELD_COMPILE
lib: kunit: add bitfield test conversion to KUnit
Documentation: kunit: add a brief blurb about kunit_test_suite
kunit: test: add test plan to KUnit TAP format
init: main: add KUnit to kernel init
kunit: test: create a single centralized executor for all tests
vmlinux.lds.h: add linker section for KUnit test suites
Documentation: kunit: Add naming guidelines
Pull networking updates from Jakub Kicinski:
- Add redirect_neigh() BPF packet redirect helper, allowing to limit
stack traversal in common container configs and improving TCP
back-pressure.
Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
- Expand netlink policy support and improve policy export to user
space. (Ge)netlink core performs request validation according to
declared policies. Expand the expressiveness of those policies
(min/max length and bitmasks). Allow dumping policies for particular
commands. This is used for feature discovery by user space (instead
of kernel version parsing or trial and error).
- Support IGMPv3/MLDv2 multicast listener discovery protocols in
bridge.
- Allow more than 255 IPv4 multicast interfaces.
- Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
packets of TCPv6.
- In Multi-patch TCP (MPTCP) support concurrent transmission of data on
multiple subflows in a load balancing scenario. Enhance advertising
addresses via the RM_ADDR/ADD_ADDR options.
- Support SMC-Dv2 version of SMC, which enables multi-subnet
deployments.
- Allow more calls to same peer in RxRPC.
- Support two new Controller Area Network (CAN) protocols - CAN-FD and
ISO 15765-2:2016.
- Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
kernel problem.
- Add TC actions for implementing MPLS L2 VPNs.
- Improve nexthop code - e.g. handle various corner cases when nexthop
objects are removed from groups better, skip unnecessary
notifications and make it easier to offload nexthops into HW by
converting to a blocking notifier.
- Support adding and consuming TCP header options by BPF programs,
opening the doors for easy experimental and deployment-specific TCP
option use.
- Reorganize TCP congestion control (CC) initialization to simplify
life of TCP CC implemented in BPF.
- Add support for shipping BPF programs with the kernel and loading
them early on boot via the User Mode Driver mechanism, hence reusing
all the user space infra we have.
- Support sleepable BPF programs, initially targeting LSM and tracing.
- Add bpf_d_path() helper for returning full path for given 'struct
path'.
- Make bpf_tail_call compatible with bpf-to-bpf calls.
- Allow BPF programs to call map_update_elem on sockmaps.
- Add BPF Type Format (BTF) support for type and enum discovery, as
well as support for using BTF within the kernel itself (current use
is for pretty printing structures).
- Support listing and getting information about bpf_links via the bpf
syscall.
- Enhance kernel interfaces around NIC firmware update. Allow
specifying overwrite mask to control if settings etc. are reset
during update; report expected max time operation may take to users;
support firmware activation without machine reboot incl. limits of
how much impact reset may have (e.g. dropping link or not).
- Extend ethtool configuration interface to report IEEE-standard
counters, to limit the need for per-vendor logic in user space.
- Adopt or extend devlink use for debug, monitoring, fw update in many
drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
dpaa2-eth).
- In mlxsw expose critical and emergency SFP module temperature alarms.
Refactor port buffer handling to make the defaults more suitable and
support setting these values explicitly via the DCBNL interface.
- Add XDP support for Intel's igb driver.
- Support offloading TC flower classification and filtering rules to
mscc_ocelot switches.
- Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
fixed interval period pulse generator and one-step timestamping in
dpaa-eth.
- Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
offload.
- Add Lynx PHY/PCS MDIO module, and convert various drivers which have
this HW to use it. Convert mvpp2 to split PCS.
- Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
7-port Mediatek MT7531 IP.
- Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
and wcn3680 support in wcn36xx.
- Improve performance for packets which don't require much offloads on
recent Mellanox NICs by 20% by making multiple packets share a
descriptor entry.
- Move chelsio inline crypto drivers (for TLS and IPsec) from the
crypto subtree to drivers/net. Move MDIO drivers out of the phy
directory.
- Clean up a lot of W=1 warnings, reportedly the actively developed
subsections of networking drivers should now build W=1 warning free.
- Make sure drivers don't use in_interrupt() to dynamically adapt their
code. Convert tasklets to use new tasklet_setup API (sadly this
conversion is not yet complete).
* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
net, sockmap: Don't call bpf_prog_put() on NULL pointer
bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
bpf, sockmap: Add locking annotations to iterator
netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
net: fix pos incrementment in ipv6_route_seq_next
net/smc: fix invalid return code in smcd_new_buf_create()
net/smc: fix valid DMBE buffer sizes
net/smc: fix use-after-free of delayed events
bpfilter: Fix build error with CONFIG_BPFILTER_UMH
cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
bpf: Fix register equivalence tracking.
rxrpc: Fix loss of final ack on shutdown
rxrpc: Fix bundle counting for exclusive connections
netfilter: restore NF_INET_NUMHOOKS
ibmveth: Identify ingress large send packets.
ibmveth: Switch order of ibmveth_helper calls.
cxgb4: handle 4-tuple PEDIT to NAT mode translation
selftests: Add VRF route leaking tests
...
Pull trivial updates from Jiri Kosina:
"The latest advances in computer science from the trivial queue"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
xtensa: fix Kconfig typo
spelling.txt: Remove some duplicate entries
mtd: rawnand: oxnas: cleanup/simplify code
selftests: vm: add fragment CONFIG_GUP_BENCHMARK
perf: Fix opt help text for --no-bpf-event
HID: logitech-dj: Fix spelling in comment
bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
scif: Fix spelling of EACCES
printk: fix global comment
lib/bitmap.c: fix spello
fs: Fix missing 'bit' in comment
Pull printk updates from Petr Mladek:
"The big new thing is the fully lockless ringbuffer implementation,
including the support for continuous lines. It will allow to store and
read messages in any situation wihtout the risk of deadlocks and
without the need of temporary per-CPU buffers.
The access is still serialized by logbuf_lock. It synchronizes few
more operations, for example, temporary buffer for formatting the
message, syslog and kmsg_dump operations. The lock removal is being
discussed and should be ready for the next release.
The continuous lines are handled exactly the same way as before to
avoid regressions in user space. It means that they are appended to
the last message when the caller is the same. Only the last message
can be extended.
The data ring includes plain text of the messages. Except for an
integer at the beginning of each message that points back to the
descriptor ring with other metadata.
The dictionary has to stay. journalctl uses it to filter the log. It
allows to show messages related to a given device. The dictionary
values are stored in the descriptor ring with the other metadata.
This is the first part of the printk rework as discussed at Plumbers
2019, see https://lore.kernel.org/r/87k1acz5rx.fsf@linutronix.de. The
next big step will be handling consoles by kthreads during the normal
system operation. It will require special handling of situations when
the kthreads could not get scheduled, for example, early boot,
suspend, panic.
Other changes:
- Add John Ogness as a reviewer for printk subsystem. He is author of
the rework and is familiar with the code and history.
- Fix locking in serial8250_do_startup() to prevent lockdep report.
- Few code cleanups"
* tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (27 commits)
printk: Use fallthrough pseudo-keyword
printk: reduce setup_text_buf size to LOG_LINE_MAX
printk: avoid and/or handle record truncation
printk: remove dict ring
printk: move dictionary keys to dev_printk_info
printk: move printk_info into separate array
printk: reimplement log_cont using record extension
printk: ringbuffer: add finalization/extension support
printk: ringbuffer: change representation of states
printk: ringbuffer: clear initial reserved fields
printk: ringbuffer: add BLK_DATALESS() macro
printk: ringbuffer: relocate get_data()
printk: ringbuffer: avoid memcpy() on state_var
printk: ringbuffer: fix setting state in desc_read()
kernel.h: Move oops_in_progress to printk.h
scripts/gdb: update for lockless printk ringbuffer
scripts/gdb: add utils.read_ulong()
docs: vmcoreinfo: add lockless printk ringbuffer vmcoreinfo
printk: reduce LOG_BUF_SHIFT range for H8300
printk: ringbuffer: support dataless records
...
Pull io_uring updates from Jens Axboe:
- Add blkcg accounting for io-wq offload (Dennis)
- A use-after-free fix for io-wq (Hillf)
- Cancelation fixes and improvements
- Use proper files_struct references for offload
- Cleanup of io_uring_get_socket() since that can now go into our own
header
- SQPOLL fixes and cleanups, and support for sharing the thread
- Improvement to how page accounting is done for registered buffers and
huge pages, accounting the real pinned state
- Series cleaning up the xarray code (Willy)
- Various cleanups, refactoring, and improvements (Pavel)
- Use raw spinlock for io-wq (Sebastian)
- Add support for ring restrictions (Stefano)
* tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (62 commits)
io_uring: keep a pointer ref_node in file_data
io_uring: refactor *files_register()'s error paths
io_uring: clean file_data access in files_register
io_uring: don't delay io_init_req() error check
io_uring: clean leftovers after splitting issue
io_uring: remove timeout.list after hrtimer cancel
io_uring: use a separate struct for timeout_remove
io_uring: improve submit_state.ios_left accounting
io_uring: simplify io_file_get()
io_uring: kill extra check in fixed io_file_get()
io_uring: clean up ->files grabbing
io_uring: don't io_prep_async_work() linked reqs
io_uring: Convert advanced XArray uses to the normal API
io_uring: Fix XArray usage in io_uring_add_task_file
io_uring: Fix use of XArray in __io_uring_files_cancel
io_uring: fix break condition for __io_uring_register() waiting
io_uring: no need to call xa_destroy() on empty xarray
io_uring: batch account ->req_issue and task struct references
io_uring: kill callback_head argument for io_req_task_work_add()
io_uring: move req preps out of io_issue_sqe()
...
Pull documentation updates from Jonathan Corbet:
"As hoped, things calmed down for docs this cycle; fewer changes and
almost no conflicts at all. This includes:
- A reworked and expanded user-mode Linux document
- Some simplifications and improvements for submitting-patches.rst
- An emergency fix for (some) problems with Sphinx 3.x
- Some welcome automarkup improvements to automatically generate
cross-references to struct definitions and other documents
- The usual collection of translation updates, typo fixes, etc"
* tag 'docs-5.10' of git://git.lwn.net/linux: (81 commits)
gpiolib: Update indentation in driver.rst for code excerpts
Documentation/admin-guide: tainted-kernels: Fix typo occured
Documentation: better locations for sysfs-pci, sysfs-tagging
docs: programming-languages: refresh blurb on clang support
Documentation: kvm: fix a typo
Documentation: Chinese translation of Documentation/arm64/amu.rst
doc: zh_CN: index files in arm64 subdirectory
mailmap: add entry for <mstarovoitov@marvell.com>
doc: seq_file: clarify role of *pos in ->next()
docs: trace: ring-buffer-design.rst: use the new SPDX tag
Documentation: kernel-parameters: clarify "module." parameters
Fix references to nommu-mmap.rst
docs: rewrite admin-guide/sysctl/abi.rst
docs: fb: Remove vesafb scrollback boot option
docs: fb: Remove sstfb scrollback boot option
docs: fb: Remove matroxfb scrollback boot option
docs: fb: Remove framebuffer scrollback boot option
docs: replace the old User Mode Linux HowTo with a new one
Documentation/admin-guide: blockdev/ramdisk: remove use of "rdev"
Documentation/admin-guide: README & svga: remove use of "rdev"
...
Although we have not seen any actual examples where KUnit doesn't work
because it runs in the late init phase of the kernel, it has been a
concern for some time that this could potentially be an issue in the
future. So, remove KUnit from init calls entirely, instead call directly
from kernel_init() so that KUnit runs after late init.
Co-developed-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Grab actual references to the files_struct. To avoid circular references
issues due to this, we add a per-task note that keeps track of what
io_uring contexts a task has used. When the tasks execs or exits its
assigned files, we cancel requests based on this tracking.
With that, we can grab proper references to the files table, and no
longer need to rely on stashing away ring_fd and ring_file to check
if the ring_fd may have been closed.
Cc: stable@vger.kernel.org # v5.5+
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes up a merge issue in:
net/ipv6/route.c
on the way to a 5.9-rc7 release.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4eb508eb3761b95ad8f39dd79f03b3352481ceaf
Two minor conflicts:
1) net/ipv4/route.c, adding a new local variable while
moving another local variable and removing it's
initial assignment.
2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
One pretty prints the port mode differently, whilst another
changes the driver to try and obtain the port mode from
the port node rather than the switch node.
Signed-off-by: David S. Miller <davem@davemloft.net>
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.
Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
init_stat() returns 0 on success, same as vfs_lstat(). When it replaced
vfs_lstat(), the '!' was dropped.
Fixes: 716308a533 ("init: add an init_stat helper")
Signed-off-by: Barret Rhoden <brho@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce sleepable BPF programs that can request such property for themselves
via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
to use helpers like bpf_copy_from_user() that might sleep. At present only
fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
when they are attached to kernel functions that are known to allow sleeping.
The non-sleepable programs are relying on implicit rcu_read_lock() and
migrate_disable() to protect life time of programs, maps that they use and
per-cpu kernel structures used to pass info between bpf programs and the
kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
should not be enclosed in migrate_disable() as well. Therefore
rcu_read_lock_trace is used to protect the life time of sleepable progs.
There are many networking and tracing program types. In many cases the
'struct bpf_prog *' pointer itself is rcu protected within some other kernel
data structure and the kernel code is using rcu_dereference() to load that
program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
Instead sleepable bpf programs are allowed with bpf trampoline only. The
program pointers are hard-coded into generated assembly of bpf trampoline and
synchronize_rcu_tasks_trace() is used to protect the life time of the program.
The same trampoline can hold both sleepable and non-sleepable progs.
When rcu_read_lock_trace is held it means that some sleepable bpf program is
running from bpf trampoline. Those programs can use bpf arrays and preallocated
hash/lru maps. These map types are waiting on programs to complete via
synchronize_rcu_tasks_trace();
Updates to trampoline now has to do synchronize_rcu_tasks_trace() and
synchronize_rcu_tasks() to wait for sleepable progs to finish and for
trampoline assembly to finish.
This is the first step of introducing sleepable progs. Eventually dynamically
allocated hash maps can be allowed and networking program types can become
sleepable too.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com
Add kernel module with user mode driver that populates bpffs with
BPF iterators.
$ mount bpffs /my/bpffs/ -t bpf
$ ls -la /my/bpffs/
total 4
drwxrwxrwt 2 root root 0 Jul 2 00:27 .
drwxr-xr-x 19 root root 4096 Jul 2 00:09 ..
-rw------- 1 root root 0 Jul 2 00:27 maps.debug
-rw------- 1 root root 0 Jul 2 00:27 progs.debug
The user mode driver will load BPF Type Formats, create BPF maps, populate BPF
maps, load two BPF programs, attach them to BPF iterators, and finally send two
bpf_link IDs back to the kernel.
The kernel will pin two bpf_links into newly mounted bpffs instance under
names "progs.debug" and "maps.debug". These two files become human readable.
$ cat /my/bpffs/progs.debug
id name attached
11 dump_bpf_map bpf_iter_bpf_map
12 dump_bpf_prog bpf_iter_bpf_prog
27 test_pkt_access
32 test_main test_pkt_access test_pkt_access
33 test_subprog1 test_pkt_access_subprog1 test_pkt_access
34 test_subprog2 test_pkt_access_subprog2 test_pkt_access
35 test_subprog3 test_pkt_access_subprog3 test_pkt_access
36 new_get_skb_len get_skb_len test_pkt_access
37 new_get_skb_ifindex get_skb_ifindex test_pkt_access
38 new_get_constant get_constant test_pkt_access
The BPF program dump_bpf_prog() in iterators.bpf.c is printing this data about
all BPF programs currently loaded in the system. This information is unstable
and will change from kernel to kernel as ".debug" suffix conveys.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200819042759.51280-4-alexei.starovoitov@gmail.com
CEC_{CORE,NOTIFIER,PIN} are hidden options which are selected by
drivers. Select them here so the ABI surface is available to
modules.
Bug: 162507365
Bug: 164180803
Change-Id: Idf8d072388b387442d967bf1c6d3be1f9a734ab6
Signed-off-by: Alistair Delva <adelva@google.com>
Pull OpenRISC updates from Stafford Horne:
"A few patches all over the place during this cycle, mostly bug and
sparse warning fixes for OpenRISC, but a few enhancements too. Note,
there are 2 non OpenRISC specific fixups.
Non OpenRISC fixes:
- In init we need to align the init_task correctly to fix an issue
with MUTEX_FLAGS, reviewed by Peter Z. No one picked this up so I
kept it on my tree.
- In asm-generic/io.h I fixed up some sparse warnings, OK'd by Arnd.
Arnd asked to merge it via my tree.
OpenRISC fixes:
- Many fixes for OpenRISC sprase warnings.
- Add support OpenRISC SMP tlb flushing rather than always flushing
the entire TLB on every CPU.
- Fix bug when dumping stack via /proc/xxx/stack of user threads"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: uaccess: Add user address space check to access_ok
openrisc: signal: Fix sparse address space warnings
openrisc: uaccess: Remove unused macro __addr_ok
openrisc: uaccess: Use static inline function in access_ok
openrisc: uaccess: Fix sparse address space warnings
openrisc: io: Fixup defines and move include to the end
asm-generic/io.h: Fix sparse warnings on big-endian architectures
openrisc: Implement proper SMP tlb flushing
openrisc: Fix oops caused when dumping stack
openrisc: Add support for external initrd images
init: Align init_task to avoid conflict with MUTEX_FLAGS
openrisc: fix __user in raw_copy_to_user()'s prototype
Tiny steps on the way to 5.9-rc1.
Fixes conflicts in:
fs/f2fs/inline.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I16d863ae44a51156499458e8c3486587cbe2babe
Pull locking updates from Thomas Gleixner:
"A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in
various situations caused by the lockdep additions to seqcount to
validate that the write side critical sections are non-preemptible.
- The seqcount associated lock debug addons which were blocked by the
above fallout.
seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict
per CPU seqcounts. As the lock is not part of the seqcount, lockdep
cannot validate that the lock is held.
This new debug mechanism adds the concept of associated locks.
sequence count has now lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored
and write_seqcount_begin() has a lockdep assertion to validate that
the lock is held.
Aside of the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API
is unchanged and determines the type at compile time with the help
of _Generic which is possible now that the minimal GCC version has
been moved up.
Adding this lockdep coverage unearthed a handful of seqcount bugs
which have been addressed already independent of this.
While generally useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemtible if
the writers are serialized by an associated lock, which leads to
the well known reader preempts writer livelock. RT prevents this by
storing the associated lock pointer independent of lockdep in the
seqcount and changing the reader side to block on the lock when a
reader detects that a writer is in the write side critical section.
- Conversion of seqcount usage sites to associated types and
initializers"
* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
locking/seqlock, headers: Untangle the spaghetti monster
locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
x86/headers: Remove APIC headers from <asm/smp.h>
seqcount: More consistent seqprop names
seqcount: Compress SEQCNT_LOCKNAME_ZERO()
seqlock: Fold seqcount_LOCKNAME_init() definition
seqlock: Fold seqcount_LOCKNAME_t definition
seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
hrtimer: Use sequence counter with associated raw spinlock
kvm/eventfd: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
vfs: Use sequence counter with associated spinlock
timekeeping: Use sequence counter with associated raw spinlock
xfrm: policy: Use sequence counters with associated lock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
netfilter: conntrack: Use sequence counter with associated spinlock
sched: tasks: Use sequence counter with associated spinlock
...