Commit Graph

1947 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
009596c521 Merge 5.15.13 into android13-5.15
Changes in 5.15.13
	Input: i8042 - add deferred probe support
	Input: i8042 - enable deferred probe quirk for ASUS UM325UA
	tomoyo: Check exceeded quota early in tomoyo_domain_quota_is_ok().
	tomoyo: use hwight16() in tomoyo_domain_quota_is_ok()
	net/sched: Extend qdisc control block with tc control block
	parisc: Clear stale IIR value on instruction access rights trap
	platform/mellanox: mlxbf-pmc: Fix an IS_ERR() vs NULL bug in mlxbf_pmc_map_counters
	platform/x86: apple-gmux: use resource_size() with res
	memblock: fix memblock_phys_alloc() section mismatch error
	ALSA: hda: intel-sdw-acpi: harden detection of controller
	ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2
	recordmcount.pl: fix typo in s390 mcount regex
	powerpc/ptdump: Fix DEBUG_WX since generic ptdump conversion
	efi: Move efifb_setup_from_dmi() prototype from arch headers
	selinux: initialize proto variable in selinux_ip_postroute_compat()
	scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()
	net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
	net/mlx5: Fix error print in case of IRQ request failed
	net/mlx5: Fix SF health recovery flow
	net/mlx5: Fix tc max supported prio for nic mode
	net/mlx5e: Wrap the tx reporter dump callback to extract the sq
	net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
	net/mlx5e: Fix ICOSQ recovery flow for XSK
	net/mlx5e: Use tc sample stubs instead of ifdefs in source file
	net/mlx5e: Delete forward rule for ct or sample action
	udp: using datalen to cap ipv6 udp max gso segments
	selftests: Calculate udpgso segment count without header adjustment
	sctp: use call_rcu to free endpoint
	net/smc: fix using of uninitialized completions
	net: usb: pegasus: Do not drop long Ethernet frames
	net: ag71xx: Fix a potential double free in error handling paths
	net: lantiq_xrx200: fix statistics of received bytes
	NFC: st21nfca: Fix memory leak in device probe and remove
	net/smc: don't send CDC/LLC message if link not ready
	net/smc: fix kernel panic caused by race of smc_sock
	igc: Do not enable crosstimestamping for i225-V models
	igc: Fix TX timestamp support for non-MSI-X platforms
	drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization
	drm/amd/display: Set optimize_pwr_state for DCN31
	ionic: Initialize the 'lif->dbid_inuse' bitmap
	net/mlx5e: Fix wrong features assignment in case of error
	net: bridge: mcast: add and enforce query interval minimum
	net: bridge: mcast: add and enforce startup query interval minimum
	selftests/net: udpgso_bench_tx: fix dst ip argument
	selftests: net: Fix a typo in udpgro_fwd.sh
	net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper
	net/ncsi: check for error return from call to nla_put_u32
	selftests: net: using ping6 for IPv6 in udpgro_fwd.sh
	fsl/fman: Fix missing put_device() call in fman_port_probe
	i2c: validate user data in compat ioctl
	nfc: uapi: use kernel size_t to fix user-space builds
	uapi: fix linux/nfc.h userspace compilation errors
	drm/nouveau: wait for the exclusive fence after the shared ones v2
	drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
	drm/amdgpu: add support for IP discovery gc_info table v2
	drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
	xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set.
	usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear.
	usb: mtu3: add memory barrier before set GPD's HWO
	usb: mtu3: fix list_head check warning
	usb: mtu3: set interval of FS intr and isoc endpoint
	nitro_enclaves: Use get_user_pages_unlocked() call to handle mmap assert
	binder: fix async_free_space accounting for empty parcels
	scsi: vmw_pvscsi: Set residual data length conditionally
	Input: appletouch - initialize work before device registration
	Input: spaceball - fix parsing of movement data packets
	mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'
	net: fix use-after-free in tw_timer_handler
	fs/mount_setattr: always cleanup mount_kattr
	perf intel-pt: Fix parsing of VM time correlation arguments
	perf script: Fix CPU filtering of a script's switch events
	perf scripts python: intel-pt-events.py: Fix printing of switch events
	Linux 5.15.13

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie0d5130643ce211d5e21ea9d11bc5577efeddc90
2022-01-05 15:36:44 +01:00
Paul Blakey
0d76daf201 net/sched: Extend qdisc control block with tc control block
[ Upstream commit ec624fe740b416fb68d536b37fb8eef46f90b5c2 ]

BPF layer extends the qdisc control block via struct bpf_skb_data_end
and because of that there is no more room to add variables to the
qdisc layer control block without going over the skb->cb size.

Extend the qdisc control block with a tc control block,
and move all tc related variables to there as a pre-step for
extending the tc control block with additional members.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-05 12:42:33 +01:00
Greg Kroah-Hartman
e5f6d1dffb Merge 5.15.7 into android13-5.15
Changes in 5.15.7
	ALSA: usb-audio: Restrict rates for the shared clocks
	ALSA: usb-audio: Rename early_playback_start flag with lowlatency_playback
	ALSA: usb-audio: Disable low-latency playback for free-wheel mode
	ALSA: usb-audio: Disable low-latency mode for implicit feedback sync
	ALSA: usb-audio: Check available frames for the next packet size
	ALSA: usb-audio: Add spinlock to stop_urbs()
	ALSA: usb-audio: Improved lowlatency playback support
	ALSA: usb-audio: Avoid killing in-flight URBs during draining
	ALSA: usb-audio: Fix packet size calculation regression
	ALSA: usb-audio: Less restriction for low-latency playback mode
	ALSA: usb-audio: Switch back to non-latency mode at a later point
	ALSA: usb-audio: Don't start stream for capture at prepare
	gfs2: release iopen glock early in evict
	gfs2: Fix length of holes reported at end-of-file
	powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
	powerpc/pseries/ddw: Do not try direct mapping with persistent memory and one window
	drm/sun4i: fix unmet dependency on RESET_CONTROLLER for PHY_SUN6I_MIPI_DPHY
	mac80211: do not access the IV when it was stripped
	mac80211: fix throughput LED trigger
	x86/hyperv: Move required MSRs check to initial platform probing
	net/smc: Transfer remaining wait queue entries during fallback
	atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait
	net: return correct error code
	pinctrl: qcom: fix unmet dependencies on GPIOLIB for GPIOLIB_IRQCHIP
	platform/x86: dell-wmi-descriptor: disable by default
	platform/x86: thinkpad_acpi: Add support for dual fan control
	platform/x86: thinkpad_acpi: Fix WWAN device disabled issue after S3 deep
	s390/setup: avoid using memblock_enforce_memory_limit
	btrfs: silence lockdep when reading chunk tree during mount
	btrfs: check-integrity: fix a warning on write caching disabled disk
	thermal: core: Reset previous low and high trip during thermal zone init
	scsi: iscsi: Unblock session then wake up error handler
	net: usb: r8152: Add MAC passthrough support for more Lenovo Docks
	drm/amd/pm: Remove artificial freq level on Navi1x
	drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered again
	drm/amd/amdgpu: fix potential memleak
	ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile
	ata: libahci: Adjust behavior when StorageD3Enable _DSD is set
	ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
	ipv6: check return value of ipv6_skip_exthdr
	net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
	net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
	perf sort: Fix the 'weight' sort key behavior
	perf sort: Fix the 'ins_lat' sort key behavior
	perf sort: Fix the 'p_stage_cyc' sort key behavior
	perf inject: Fix ARM SPE handling
	perf hist: Fix memory leak of a perf_hpp_fmt
	perf report: Fix memory leaks around perf_tip()
	tracing: Don't use out-of-sync va_list in event printing
	net/smc: Avoid warning of possible recursive locking
	ACPI: Add stubs for wakeup handler functions
	net/tls: Fix authentication failure in CCM mode
	vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
	kprobes: Limit max data_size of the kretprobe instances
	ALSA: hda/cs8409: Set PMSG_ON earlier inside cs8409 driver
	rt2x00: do not mark device gone on EPROTO errors during start
	ipmi: Move remove_work to dedicated workqueue
	cpufreq: Fix get_cpu_device() failure in add_cpu_dev_symlink()
	iwlwifi: mvm: retry init flow if failed
	dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow
	s390/pci: move pseudo-MMIO to prevent MIO overlap
	fget: check that the fd still exists after getting a ref to it
	sata_fsl: fix UAF in sata_fsl_port_stop when rmmod sata_fsl
	sata_fsl: fix warning in remove_proc_entry when rmmod sata_fsl
	scsi: lpfc: Fix non-recovery of remote ports following an unsolicited LOGO
	scsi: ufs: ufs-pci: Add support for Intel ADL
	ipv6: fix memory leak in fib6_rule_suppress
	drm/amd/display: Allow DSC on supported MST branch devices
	drm/i915/dp: Perform 30ms delay after source OUI write
	KVM: fix avic_set_running for preemptable kernels
	KVM: Disallow user memslot with size that exceeds "unsigned long"
	KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
	KVM: Ensure local memslot copies operate on up-to-date arch-specific data
	KVM: x86: ignore APICv if LAPIC is not enabled
	KVM: nVMX: Emulate guest TLB flush on nested VM-Enter with new vpid12
	KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST
	KVM: nVMX: Abide to KVM_REQ_TLB_FLUSH_GUEST request on nested vmentry/vmexit
	KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
	KVM: x86: Use a stable condition around all VT-d PI paths
	KVM: MMU: shadow nested paging does not have PKU
	KVM: arm64: Avoid setting the upper 32 bits of TCR_EL2 and CPTR_EL2 to 1
	KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
	KVM: x86: check PIR even for vCPUs with disabled APICv
	tracing/histograms: String compares should not care about signed values
	net: dsa: mv88e6xxx: Fix application of erratum 4.8 for 88E6393X
	net: dsa: mv88e6xxx: Drop unnecessary check in mv88e6393x_serdes_erratum_4_6()
	net: dsa: mv88e6xxx: Save power by disabling SerDes trasmitter and receiver
	net: dsa: mv88e6xxx: Add fix for erratum 5.2 of 88E6393X family
	net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
	net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
	wireguard: selftests: increase default dmesg log size
	wireguard: allowedips: add missing __rcu annotation to satisfy sparse
	wireguard: selftests: actually test for routing loops
	wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLIST
	wireguard: device: reset peer src endpoint when netns exits
	wireguard: receive: use ring buffer for incoming handshakes
	wireguard: receive: drop handshakes if queue lock is contended
	wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()
	i2c: stm32f7: flush TX FIFO upon transfer errors
	i2c: stm32f7: recover the bus on access timeout
	i2c: stm32f7: stop dma transfer in case of NACK
	i2c: cbus-gpio: set atomic transfer callback
	natsemi: xtensa: fix section mismatch warnings
	tcp: fix page frag corruption on page fault
	net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
	net: mpls: Fix notifications when deleting a device
	siphash: use _unaligned version by default
	arm64: ftrace: add missing BTIs
	iwlwifi: fix warnings produced by kernel debug options
	net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
	net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
	selftests: net: Correct case name
	net: dsa: b53: Add SPI ID table
	mt76: mt7915: fix NULL pointer dereference in mt7915_get_phy_mode
	ASoC: tegra: Fix wrong value type in ADMAIF
	ASoC: tegra: Fix wrong value type in I2S
	ASoC: tegra: Fix wrong value type in DMIC
	ASoC: tegra: Fix wrong value type in DSPK
	ASoC: tegra: Fix kcontrol put callback in ADMAIF
	ASoC: tegra: Fix kcontrol put callback in I2S
	ASoC: tegra: Fix kcontrol put callback in DMIC
	ASoC: tegra: Fix kcontrol put callback in DSPK
	ASoC: tegra: Fix kcontrol put callback in AHUB
	rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle()
	rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()
	ALSA: intel-dsp-config: add quirk for CML devices based on ES8336 codec
	net: stmmac: Avoid DMA_CHAN_CONTROL write if no Split Header support
	net: usb: lan78xx: lan78xx_phy_init(): use PHY_POLL instead of "0" if no IRQ is available
	net: marvell: mvpp2: Fix the computation of shared CPUs
	dpaa2-eth: destroy workqueue at the end of remove function
	octeontx2-af: Fix a memleak bug in rvu_mbox_init()
	net: annotate data-races on txq->xmit_lock_owner
	ipv4: convert fib_num_tclassid_users to atomic_t
	net/smc: fix wrong list_del in smc_lgr_cleanup_early
	net/rds: correct socket tunable error in rds_tcp_tune()
	net/smc: Keep smc_close_final rc during active close
	drm/msm/a6xx: Allocate enough space for GMU registers
	drm/msm: Do hw_init() before capturing GPU state
	drm/vc4: kms: Wait for the commit before increasing our clock rate
	drm/vc4: kms: Fix return code check
	drm/vc4: kms: Add missing drm_crtc_commit_put
	drm/vc4: kms: Clear the HVS FIFO commit pointer once done
	drm/vc4: kms: Don't duplicate pending commit
	drm/vc4: kms: Fix previous HVS commit wait
	atlantic: Increase delay for fw transactions
	atlatnic: enable Nbase-t speeds with base-t
	atlantic: Fix to display FW bundle version instead of FW mac version.
	atlantic: Add missing DIDs and fix 115c.
	Remove Half duplex mode speed capabilities.
	atlantic: Fix statistics logic for production hardware
	atlantic: Remove warn trace message.
	KVM: x86/mmu: Skip tlb flush if it has been done in zap_gfn_range()
	KVM: x86/mmu: Pass parameter flush as false in kvm_tdp_mmu_zap_collapsible_sptes()
	drm/msm/devfreq: Fix OPP refcnt leak
	drm/msm: Fix mmap to include VM_IO and VM_DONTDUMP
	drm/msm: Fix wait_fence submitqueue leak
	drm/msm: Restore error return on invalid fence
	ASoC: rk817: Add module alias for rk817-codec
	iwlwifi: Fix memory leaks in error handling path
	KVM: X86: Fix when shadow_root_level=5 && guest root_level<4
	KVM: SEV: initialize regions_list of a mirror VM
	net/mlx5e: Fix missing IPsec statistics on uplink representor
	net/mlx5: Move MODIFY_RQT command to ignore list in internal error state
	net/mlx5: E-switch, Respect BW share of the new group
	net/mlx5: E-Switch, fix single FDB creation on BlueField
	net/mlx5: E-Switch, Check group pointer before reading bw_share value
	KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln register
	KVM: VMX: Set failure code in prepare_vmcs02()
	mctp: Don't let RTM_DELROUTE delete local routes
	Revert "drm/i915: Implement Wa_1508744258"
	io-wq: don't retry task_work creation failure on fatal conditions
	x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
	x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
	x86/entry: Use the correct fence macro after swapgs in kernel CR3
	x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
	preempt/dynamic: Fix setup_preempt_mode() return value
	sched/uclamp: Fix rq->uclamp_max not set on first enqueue
	KVM: SEV: Return appropriate error codes if SEV-ES scratch setup fails
	KVM: x86/mmu: Rename slot_handle_leaf to slot_handle_level_4k
	KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
	net/mlx5e: Rename lro_timeout to packet_merge_timeout
	net/mlx5e: Rename TIR lro functions to TIR packet merge functions
	net/mlx5e: Sync TIR params updates against concurrent create/modify
	serial: 8250_bcm7271: UART errors after resuming from S2
	parisc: Fix KBUILD_IMAGE for self-extracting kernel
	parisc: Fix "make install" on newer debian releases
	parisc: Mark cr16 CPU clocksource unstable on all SMP machines
	vgacon: Propagate console boot parameters before calling `vc_resize'
	xhci: Fix commad ring abort, write all 64 bits to CRCR register.
	USB: NO_LPM quirk Lenovo Powered USB-C Travel Hub
	usb: typec: tcpm: Wait in SNK_DEBOUNCED until disconnect
	usb: cdns3: gadget: fix new urb never complete if ep cancel previous requests
	usb: cdnsp: Fix a NULL pointer dereference in cdnsp_endpoint_init()
	x86/tsc: Add a timer to make sure TSC_adjust is always checked
	x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
	x86/64/mm: Map all kernel memory into trampoline_pgd
	tty: serial: msm_serial: Deactivate RX DMA for polling support
	serial: pl011: Add ACPI SBSA UART match id
	serial: tegra: Change lower tolerance baud rate limit for tegra20 and tegra30
	serial: core: fix transmit-buffer reset and memleak
	serial: 8250_pci: Fix ACCES entries in pci_serial_quirks array
	serial: 8250_pci: rewrite pericom_do_set_divisor()
	serial: 8250: Fix RTS modem control while in rs485 mode
	serial: liteuart: Fix NULL pointer dereference in ->remove()
	serial: liteuart: fix use-after-free and memleak on unbind
	serial: liteuart: fix minor-number leak on probe errors
	ipmi: msghandler: Make symbol 'remove_work_wq' static
	Linux 5.15.7

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9300a10911f6205d2fb76f18255b017d34d68d1d
2021-12-08 13:46:21 +01:00
Eric Dumazet
94782c8ffd net: annotate data-races on txq->xmit_lock_owner
commit 7a10d8c810cfad3e79372d7d1c77899d86cd6662 upstream.

syzbot found that __dev_queue_xmit() is reading txq->xmit_lock_owner
without annotations.

No serious issue there, let's document what is happening there.

BUG: KCSAN: data-race in __dev_queue_xmit / __dev_queue_xmit

write to 0xffff888139d09484 of 4 bytes by interrupt on cpu 0:
 __netif_tx_unlock include/linux/netdevice.h:4437 [inline]
 __dev_queue_xmit+0x948/0xf70 net/core/dev.c:4229
 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265
 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline]
 macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567
 __netdev_start_xmit include/linux/netdevice.h:4987 [inline]
 netdev_start_xmit include/linux/netdevice.h:5001 [inline]
 xmit_one+0x105/0x2f0 net/core/dev.c:3590
 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606
 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342
 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817
 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194
 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259
 neigh_hh_output include/net/neighbour.h:511 [inline]
 neigh_output include/net/neighbour.h:525 [inline]
 ip6_finish_output2+0x995/0xbb0 net/ipv6/ip6_output.c:126
 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
 ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224
 dst_output include/net/dst.h:450 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508
 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702
 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898
 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421
 expire_timers+0x116/0x240 kernel/time/timer.c:1466
 __run_timers+0x368/0x410 kernel/time/timer.c:1734
 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747
 __do_softirq+0x158/0x2de kernel/softirq.c:558
 __irq_exit_rcu kernel/softirq.c:636 [inline]
 irq_exit_rcu+0x37/0x70 kernel/softirq.c:648
 sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1097
 asm_sysvec_apic_timer_interrupt+0x12/0x20

read to 0xffff888139d09484 of 4 bytes by interrupt on cpu 1:
 __dev_queue_xmit+0x5e3/0xf70 net/core/dev.c:4213
 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265
 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline]
 macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567
 __netdev_start_xmit include/linux/netdevice.h:4987 [inline]
 netdev_start_xmit include/linux/netdevice.h:5001 [inline]
 xmit_one+0x105/0x2f0 net/core/dev.c:3590
 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606
 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342
 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817
 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194
 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259
 neigh_resolve_output+0x3db/0x410 net/core/neighbour.c:1523
 neigh_output include/net/neighbour.h:527 [inline]
 ip6_finish_output2+0x9be/0xbb0 net/ipv6/ip6_output.c:126
 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
 ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224
 dst_output include/net/dst.h:450 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline]
 ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508
 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702
 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898
 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421
 expire_timers+0x116/0x240 kernel/time/timer.c:1466
 __run_timers+0x368/0x410 kernel/time/timer.c:1734
 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747
 __do_softirq+0x158/0x2de kernel/softirq.c:558
 __irq_exit_rcu kernel/softirq.c:636 [inline]
 irq_exit_rcu+0x37/0x70 kernel/softirq.c:648
 sysvec_apic_timer_interrupt+0x8d/0xb0 arch/x86/kernel/apic/apic.c:1097
 asm_sysvec_apic_timer_interrupt+0x12/0x20
 kcsan_setup_watchpoint+0x94/0x420 kernel/kcsan/core.c:443
 folio_test_anon include/linux/page-flags.h:581 [inline]
 PageAnon include/linux/page-flags.h:586 [inline]
 zap_pte_range+0x5ac/0x10e0 mm/memory.c:1347
 zap_pmd_range mm/memory.c:1467 [inline]
 zap_pud_range mm/memory.c:1496 [inline]
 zap_p4d_range mm/memory.c:1517 [inline]
 unmap_page_range+0x2dc/0x3d0 mm/memory.c:1538
 unmap_single_vma+0x157/0x210 mm/memory.c:1583
 unmap_vmas+0xd0/0x180 mm/memory.c:1615
 exit_mmap+0x23d/0x470 mm/mmap.c:3170
 __mmput+0x27/0x1b0 kernel/fork.c:1113
 mmput+0x3d/0x50 kernel/fork.c:1134
 exit_mm+0xdb/0x170 kernel/exit.c:507
 do_exit+0x608/0x17a0 kernel/exit.c:819
 do_group_exit+0xce/0x180 kernel/exit.c:929
 get_signal+0xfc3/0x1550 kernel/signal.c:2852
 arch_do_signal_or_restart+0x8c/0x2e0 arch/x86/kernel/signal.c:868
 handle_signal_work kernel/entry/common.c:148 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
 exit_to_user_mode_prepare+0x113/0x190 kernel/entry/common.c:207
 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300
 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0xffffffff

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G        W         5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211130170155.2331929-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08 09:04:49 +01:00
Greg Kroah-Hartman
36de88a855 Merge 5.15.3 into android13-5.15
Changes in 5.15.3
	xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay
	usb: xhci: Enable runtime-pm by default on AMD Yellow Carp platform
	Input: iforce - fix control-message timeout
	Input: elantench - fix misreporting trackpoint coordinates
	Input: i8042 - Add quirk for Fujitsu Lifebook T725
	libata: fix read log timeout value
	ocfs2: fix data corruption on truncate
	scsi: scsi_ioctl: Validate command size
	scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run
	scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
	scsi: lpfc: Don't release final kref on Fport node while ABTS outstanding
	scsi: lpfc: Fix FCP I/O flush functionality for TMF routines
	scsi: qla2xxx: Fix crash in NVMe abort path
	scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file
	scsi: qla2xxx: Fix use after free in eh_abort path
	ce/gf100: fix incorrect CE0 address calculation on some GPUs
	char: xillybus: fix msg_ep UAF in xillyusb_probe()
	mmc: mtk-sd: Add wait dma stop done flow
	mmc: dw_mmc: Dont wait for DRTO on Write RSP error
	exfat: fix incorrect loading of i_blocks for large files
	io-wq: remove worker to owner tw dependency
	parisc: Fix set_fixmap() on PA1.x CPUs
	parisc: Fix ptrace check on syscall return
	tpm: Check for integer overflow in tpm2_map_response_body()
	firmware/psci: fix application of sizeof to pointer
	crypto: s5p-sss - Add error handling in s5p_aes_probe()
	media: rkvdec: Do not override sizeimage for output format
	media: ite-cir: IR receiver stop working after receive overflow
	media: rkvdec: Support dynamic resolution changes
	media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
	media: v4l2-ioctl: Fix check_ext_ctrls
	ALSA: hda/realtek: Fix mic mute LED for the HP Spectre x360 14
	ALSA: hda/realtek: Add a quirk for HP OMEN 15 mute LED
	ALSA: hda/realtek: Add quirk for Clevo PC70HS
	ALSA: hda/realtek: Headset fixup for Clevo NH77HJQ
	ALSA: hda/realtek: Add a quirk for Acer Spin SP513-54N
	ALSA: hda/realtek: Add quirk for ASUS UX550VE
	ALSA: hda/realtek: Add quirk for HP EliteBook 840 G7 mute LED
	ALSA: ua101: fix division by zero at probe
	ALSA: 6fire: fix control and bulk message timeouts
	ALSA: line6: fix control and interrupt message timeouts
	ALSA: mixer: oss: Fix racy access to slots
	ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
	ALSA: usb-audio: Line6 HX-Stomp XL USB_ID for 48k-fixed quirk
	ALSA: usb-audio: Add registration quirk for JBL Quantum 400
	ALSA: hda: Free card instance properly at probe errors
	ALSA: synth: missing check for possible NULL after the call to kstrdup
	ALSA: pci: rme: Fix unaligned buffer addresses
	ALSA: PCM: Fix NULL dereference at mmap checks
	ALSA: timer: Fix use-after-free problem
	ALSA: timer: Unconditionally unlink slave instances, too
	Revert "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
	ext4: fix lazy initialization next schedule time computation in more granular unit
	ext4: ensure enough credits in ext4_ext_shift_path_extents
	ext4: refresh the ext4_ext_path struct after dropping i_data_sem.
	fuse: fix page stealing
	x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
	x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
	x86/irq: Ensure PI wakeup handler is unregistered before module unload
	x86/iopl: Fake iopl(3) CLI/STI usage
	btrfs: clear MISSING device status bit in btrfs_close_one_device
	btrfs: fix lost error handling when replaying directory deletes
	btrfs: call btrfs_check_rw_degradable only if there is a missing device
	KVM: x86/mmu: Drop a redundant, broken remote TLB flush
	KVM: VMX: Unregister posted interrupt wakeup handler on hardware unsetup
	KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
	ia64: kprobes: Fix to pass correct trampoline address to the handler
	selinux: fix race condition when computing ocontext SIDs
	ipmi:watchdog: Set panic count to proper value on a panic
	md/raid1: only allocate write behind bio for WriteMostly device
	hwmon: (pmbus/lm25066) Add offset coefficients
	regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled
	regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property
	EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
	mwifiex: fix division by zero in fw download path
	ath6kl: fix division by zero in send path
	ath6kl: fix control-message timeout
	ath10k: fix control-message timeout
	ath10k: fix division by zero in send path
	PCI: Mark Atheros QCA6174 to avoid bus reset
	rtl8187: fix control-message timeouts
	evm: mark evm_fixmode as __ro_after_init
	ifb: Depend on netfilter alternatively to tc
	platform/surface: aggregator_registry: Add support for Surface Laptop Studio
	mt76: mt7615: fix skb use-after-free on mac reset
	HID: surface-hid: Use correct event registry for managing HID events
	HID: surface-hid: Allow driver matching for target ID 1 devices
	wcn36xx: Fix HT40 capability for 2Ghz band
	wcn36xx: Fix tx_status mechanism
	wcn36xx: Fix (QoS) null data frame bitrate/modulation
	PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions
	mwifiex: Read a PCI register after writing the TX ring write pointer
	mwifiex: Try waking the firmware until we get an interrupt
	libata: fix checking of DMA state
	dma-buf: fix and rework dma_buf_poll v7
	wcn36xx: handle connection loss indication
	rsi: fix occasional initialisation failure with BT coex
	rsi: fix key enabled check causing unwanted encryption for vap_id > 0
	rsi: fix rate mask set leading to P2P failure
	rsi: Fix module dev_oper_mode parameter description
	perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
	perf/x86/intel/uncore: Fix invalid unit check
	perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
	RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
	ASoC: tegra: Set default card name for Trimslice
	ASoC: tegra: Restore AC97 support
	signal: Remove the bogus sigkill_pending in ptrace_stop
	memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode
	signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
	signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed
	soc: samsung: exynos-pmu: Fix compilation when nothing selects CONFIG_MFD_CORE
	soc: fsl: dpio: replace smp_processor_id with raw_smp_processor_id
	soc: fsl: dpio: use the combined functions to protect critical zone
	mtd: rawnand: socrates: Keep the driver compatible with on-die ECC engines
	mctp: handle the struct sockaddr_mctp padding fields
	power: supply: max17042_battery: Prevent int underflow in set_soc_threshold
	power: supply: max17042_battery: use VFSOC for capacity when no rsns
	iio: core: fix double free in iio_device_unregister_sysfs()
	iio: core: check return value when calling dev_set_name()
	KVM: arm64: Extract ESR_ELx.EC only
	KVM: x86: Fix recording of guest steal time / preempted status
	KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
	KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
	KVM: nVMX: Handle dynamic MSR intercept toggling
	can: peak_usb: always ask for BERR reporting for PCAN-USB devices
	can: mcp251xfd: mcp251xfd_irq(): add missing can_rx_offload_threaded_irq_finish() in case of bus off
	can: j1939: j1939_tp_cmd_recv(): ignore abort message in the BAM transport
	can: j1939: j1939_can_recv(): ignore messages with invalid source address
	can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM
	iio: adc: tsc2046: fix scan interval warning
	powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
	io_uring: honour zeroes as io-wq worker limits
	ring-buffer: Protect ring_buffer_reset() from reentrancy
	serial: core: Fix initializing and restoring termios speed
	ifb: fix building without CONFIG_NET_CLS_ACT
	xen/balloon: add late_initcall_sync() for initial ballooning done
	ovl: fix use after free in struct ovl_aio_req
	ovl: fix filattr copy-up failure
	PCI: pci-bridge-emul: Fix emulation of W1C bits
	PCI: cadence: Add cdns_plat_pcie_probe() missing return
	cxl/pci: Fix NULL vs ERR_PTR confusion
	PCI: aardvark: Do not clear status bits of masked interrupts
	PCI: aardvark: Fix checking for link up via LTSSM state
	PCI: aardvark: Do not unmask unused interrupts
	PCI: aardvark: Fix reporting Data Link Layer Link Active
	PCI: aardvark: Fix configuring Reference clock
	PCI: aardvark: Fix return value of MSI domain .alloc() method
	PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
	PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
	PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
	PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
	PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge
	quota: check block number when reading the block in quota file
	quota: correct error number in free_dqentry()
	cifs: To match file servers, make sure the server hostname matches
	cifs: set a minimum of 120s for next dns resolution
	mfd: simple-mfd-i2c: Select MFD_CORE to fix build error
	pinctrl: core: fix possible memory leak in pinctrl_enable()
	coresight: cti: Correct the parameter for pm_runtime_put
	coresight: trbe: Fix incorrect access of the sink specific data
	coresight: trbe: Defer the probe on offline CPUs
	iio: buffer: check return value of kstrdup_const()
	iio: buffer: Fix memory leak in iio_buffers_alloc_sysfs_and_mask()
	iio: buffer: Fix memory leak in __iio_buffer_alloc_sysfs_and_mask()
	iio: buffer: Fix memory leak in iio_buffer_register_legacy_sysfs_groups()
	drivers: iio: dac: ad5766: Fix dt property name
	iio: dac: ad5446: Fix ad5622_write() return value
	iio: ad5770r: make devicetree property reading consistent
	Documentation:devicetree:bindings:iio:dac: Fix val
	USB: serial: keyspan: fix memleak on probe errors
	serial: 8250: fix racy uartclk update
	ksmbd: set unique value to volume serial field in FS_VOLUME_INFORMATION
	io-wq: serialize hash clear with wakeup
	serial: 8250: Fix reporting real baudrate value in c_ospeed field
	Revert "serial: 8250: Fix reporting real baudrate value in c_ospeed field"
	most: fix control-message timeouts
	USB: iowarrior: fix control-message timeouts
	USB: chipidea: fix interrupt deadlock
	power: supply: max17042_battery: Clear status bits in interrupt handler
	component: do not leave master devres group open after bind
	dma-buf: WARN on dmabuf release with pending attachments
	drm: panel-orientation-quirks: Update the Lenovo Ideapad D330 quirk (v2)
	drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200 2-in-1
	drm: panel-orientation-quirks: Add quirk for the Samsung Galaxy Book 10.6
	Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
	Bluetooth: fix use-after-free error in lock_sock_nested()
	Bluetooth: call sock_hold earlier in sco_conn_del
	drm/panel-orientation-quirks: add Valve Steam Deck
	rcutorture: Avoid problematic critical section nesting on PREEMPT_RT
	platform/x86: wmi: do not fail if disabling fails
	drm/amdgpu: move iommu_resume before ip init/resume
	MIPS: lantiq: dma: add small delay after reset
	MIPS: lantiq: dma: reset correct number of channel
	locking/lockdep: Avoid RCU-induced noinstr fail
	net: sched: update default qdisc visibility after Tx queue cnt changes
	ACPI: resources: Add DMI-based legacy IRQ override quirk
	rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
	smackfs: Fix use-after-free in netlbl_catmap_walk()
	ath11k: Align bss_chan_info structure with firmware
	crypto: aesni - check walk.nbytes instead of err
	x86/mm/64: Improve stack overflow warnings
	x86: Increase exception stack sizes
	mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
	mwifiex: Properly initialize private structure on interface type changes
	spi: Check we have a spi_device_id for each DT compatible
	fscrypt: allow 256-bit master keys with AES-256-XTS
	drm/amdgpu: Fix MMIO access page fault
	drm/amd/display: Fix null pointer dereference for encoders
	selftests: net: fib_nexthops: Wait before checking reported idle time
	ath11k: Avoid reg rules update during firmware recovery
	ath11k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
	ath11k: Change DMA_FROM_DEVICE to DMA_TO_DEVICE when map reinjected packets
	ath10k: high latency fixes for beacon buffer
	octeontx2-pf: Enable promisc/allmulti match MCAM entries.
	media: mt9p031: Fix corrupted frame after restarting stream
	media: netup_unidvb: handle interrupt properly according to the firmware
	media: atomisp: Fix error handling in probe
	media: stm32: Potential NULL pointer dereference in dcmi_irq_thread()
	media: uvcvideo: Set capability in s_param
	media: uvcvideo: Return -EIO for control errors
	media: uvcvideo: Set unique vdev name based in type
	media: vidtv: Fix memory leak in remove
	media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe()
	media: s5p-mfc: Add checking to s5p_mfc_probe().
	media: videobuf2: rework vb2_mem_ops API
	media: imx: set a media_device bus_info string
	media: rcar-vin: Use user provided buffers when starting
	media: mceusb: return without resubmitting URB in case of -EPROTO error.
	ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
	rtw88: fix RX clock gate setting while fifo dump
	brcmfmac: Add DMI nvram filename quirk for Cyberbook T116 tablet
	media: rcar-csi2: Add checking to rcsi2_start_receiver()
	ipmi: Disable some operations during a panic
	fs/proc/uptime.c: Fix idle time reporting in /proc/uptime
	kselftests/sched: cleanup the child processes
	ACPICA: Avoid evaluating methods too early during system resume
	cpufreq: Make policy min/max hard requirements
	ice: Move devlink port to PF/VF struct
	media: imx-jpeg: Fix possible null pointer dereference
	media: ipu3-imgu: imgu_fmt: Handle properly try
	media: ipu3-imgu: VIDIOC_QUERYCAP: Fix bus_info
	media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
	net-sysfs: try not to restart the syscall if it will fail eventually
	drm/amdkfd: rm BO resv on validation to avoid deadlock
	tracefs: Have tracefs directories not set OTH permission bits by default
	tracing: Disable "other" permission bits in the tracefs files
	ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create()
	KVM: arm64: Propagate errors from __pkvm_prot_finalize hypercall
	mmc: moxart: Fix reference count leaks in moxart_probe
	iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
	ACPI: battery: Accept charges over the design capacity as full
	ACPI: scan: Release PM resources blocked by unused objects
	drm/amd/display: fix null pointer deref when plugging in display
	drm/amdkfd: fix resume error when iommu disabled in Picasso
	net: phy: micrel: make *-skew-ps check more lenient
	leaking_addresses: Always print a trailing newline
	thermal/core: Fix null pointer dereference in thermal_release()
	drm/msm: prevent NULL dereference in msm_gpu_crashstate_capture()
	thermal/drivers/tsens: Add timeout to get_temp_tsens_valid
	block: bump max plugged deferred size from 16 to 32
	floppy: fix calling platform_device_unregister() on invalid drives
	md: update superblock after changing rdev flags in state_store
	memstick: r592: Fix a UAF bug when removing the driver
	locking/rwsem: Disable preemption for spinning region
	lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
	lib/xz: Validate the value before assigning it to an enum variable
	workqueue: make sysfs of unbound kworker cpumask more clever
	tracing/cfi: Fix cmp_entries_* functions signature mismatch
	mt76: mt7915: fix an off-by-one bound check
	mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
	iwlwifi: change all JnP to NO-160 configuration
	block: remove inaccurate requeue check
	media: allegro: ignore interrupt if mailbox is not initialized
	drm/amdgpu/pm: properly handle sclk for profiling modes on vangogh
	nvmet: fix use-after-free when a port is removed
	nvmet-rdma: fix use-after-free when a port is removed
	nvmet-tcp: fix use-after-free when a port is removed
	nvme: drop scan_lock and always kick requeue list when removing namespaces
	samples/bpf: Fix application of sizeof to pointer
	arm64: vdso32: suppress error message for 'make mrproper'
	PM: hibernate: Get block device exclusively in swsusp_check()
	selftests: kvm: fix mismatched fclose() after popen()
	selftests/bpf: Fix perf_buffer test on system with offline cpus
	iwlwifi: mvm: disable RX-diversity in powersave
	smackfs: use __GFP_NOFAIL for smk_cipso_doi()
	ARM: clang: Do not rely on lr register for stacktrace
	gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE
	can: bittiming: can_fixup_bittiming(): change type of tseg1 and alltseg to unsigned int
	gfs2: Cancel remote delete work asynchronously
	gfs2: Fix glock_hash_walk bugs
	ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
	tools/latency-collector: Use correct size when writing queue_full_warning
	vrf: run conntrack only in context of lower/physdev for locally generated packets
	net: annotate data-race in neigh_output()
	ACPI: AC: Quirk GK45 to skip reading _PSR
	ACPI: resources: Add one more Medion model in IRQ override quirk
	btrfs: reflink: initialize return value to 0 in btrfs_extent_same()
	btrfs: do not take the uuid_mutex in btrfs_rm_device
	spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()
	wcn36xx: Correct band/freq reporting on RX
	wcn36xx: Fix packet drop on resume
	Revert "wcn36xx: Enable firmware link monitoring"
	ftrace: do CPU checking after preemption disabled
	inet: remove races in inet{6}_getname()
	x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
	drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled
	selftests/core: fix conflicting types compile error for close_range()
	perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
	parisc: fix warning in flush_tlb_all
	task_stack: Fix end_of_stack() for architectures with upwards-growing stack
	erofs: don't trigger WARN() when decompression fails
	parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
	parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
	netfilter: conntrack: set on IPS_ASSURED if flows enters internal stream state
	selftests/bpf: Fix strobemeta selftest regression
	fbdev/efifb: Release PCI device's runtime PM ref during FB destroy
	drm/bridge: anx7625: Propagate errors from sp_tx_rst_aux()
	perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
	perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
	perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
	perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
	drm/bridge: it66121: Initialize {device,vendor}_ids
	drm/bridge: it66121: Wait for next bridge to be probed
	Bluetooth: fix init and cleanup of sco_conn.timeout_work
	libbpf: Don't crash on object files with no symbol tables
	Bluetooth: hci_uart: fix GPF in h5_recv
	rcu: Fix existing exp request check in sync_sched_exp_online_cleanup()
	MIPS: lantiq: dma: fix burst length for DEU
	x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
	objtool: Handle __sanitize_cov*() tail calls
	net/mlx5: Publish and unpublish all devlink parameters at once
	drm/v3d: fix wait for TMU write combiner flush
	crypto: sm4 - Do not change section of ck and sbox
	virtio-gpu: fix possible memory allocation failure
	lockdep: Let lock_is_held_type() detect recursive read as read
	net: net_namespace: Fix undefined member in key_remove_domain()
	net: phylink: don't call netif_carrier_off() with NULL netdev
	drm: bridge: it66121: Fix return value it66121_probe
	spi: Fixed division by zero warning
	cgroup: Make rebind_subsystems() disable v2 controllers all at once
	wcn36xx: Fix Antenna Diversity Switching
	wilc1000: fix possible memory leak in cfg_scan_result()
	Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync
	drm/amdgpu: Fix crash on device remove/driver unload
	drm/amd/display: Pass display_pipe_params_st as const in DML
	drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage
	crypto: caam - disable pkc for non-E SoCs
	crypto: qat - power up 4xxx device
	Bluetooth: hci_h5: Fix (runtime)suspend issues on RTL8723BS HCIs
	bnxt_en: Check devlink allocation and registration status
	qed: Don't ignore devlink allocation failures
	rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()
	mptcp: do not shrink snd_nxt when recovering
	fortify: Fix dropped strcpy() compile-time write overflow check
	mac80211: twt: don't use potentially unaligned pointer
	cfg80211: always free wiphy specific regdomain
	net/mlx5: Accept devlink user input after driver initialization complete
	net: dsa: rtl8366rb: Fix off-by-one bug
	net: dsa: rtl8366: Fix a bug in deleting VLANs
	bpf/tests: Fix error in tail call limit tests
	ath11k: fix some sleeping in atomic bugs
	ath11k: Avoid race during regd updates
	ath11k: fix packet drops due to incorrect 6 GHz freq value in rx status
	ath11k: Fix memory leak in ath11k_qmi_driver_event_work
	gve: DQO: avoid unused variable warnings
	ath10k: Fix missing frame timestamp for beacon/probe-resp
	ath10k: sdio: Add missing BH locking around napi_schdule()
	drm/ttm: stop calling tt_swapin in vm_access
	arm64: mm: update max_pfn after memory hotplug
	drm/amdgpu: fix warning for overflow check
	libbpf: Fix skel_internal.h to set errno on loader retval < 0
	media: em28xx: add missing em28xx_close_extension
	media: meson-ge2d: Fix rotation parameter changes detection in 'ge2d_s_ctrl()'
	media: cxd2880-spi: Fix a null pointer dereference on error handling path
	media: ttusb-dec: avoid release of non-acquired mutex
	media: dvb-usb: fix ununit-value in az6027_rc_query
	media: imx258: Fix getting clock frequency
	media: v4l2-ioctl: S_CTRL output the right value
	media: mtk-vcodec: venc: fix return value when start_streaming fails
	media: TDA1997x: handle short reads of hdmi info frame.
	media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'
	media: imx-jpeg: Fix the error handling path of 'mxc_jpeg_probe()'
	media: i2c: ths8200 needs V4L2_ASYNC
	media: sun6i-csi: Allow the video device to be open multiple times
	media: radio-wl1273: Avoid card name truncation
	media: si470x: Avoid card name truncation
	media: tm6000: Avoid card name truncation
	media: cx23885: Fix snd_card_free call on null card pointer
	media: atmel: fix the ispck initialization
	scs: Release kasan vmalloc poison in scs_free process
	kprobes: Do not use local variable when creating debugfs file
	crypto: ecc - fix CRYPTO_DEFAULT_RNG dependency
	drm: fb_helper: fix CONFIG_FB dependency
	cpuidle: Fix kobject memory leaks in error paths
	media: em28xx: Don't use ops->suspend if it is NULL
	ath10k: Don't always treat modem stop events as crashes
	ath9k: Fix potential interrupt storm on queue reset
	PM: EM: Fix inefficient states detection
	x86/insn: Use get_unaligned() instead of memcpy()
	EDAC/amd64: Handle three rank interleaving mode
	rcu: Always inline rcu_dynticks_task*_{enter,exit}()
	rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
	netfilter: nft_dynset: relax superfluous check on set updates
	media: venus: fix vpp frequency calculation for decoder
	media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
	crypto: ccree - avoid out-of-range warnings from clang
	crypto: qat - detect PFVF collision after ACK
	crypto: qat - disregard spurious PFVF interrupts
	hwrng: mtk - Force runtime pm ops for sleep ops
	ima: fix deadlock when traversing "ima_default_rules".
	b43legacy: fix a lower bounds test
	b43: fix a lower bounds test
	gve: Recover from queue stall due to missed IRQ
	gve: Track RX buffer allocation failures
	mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured
	mmc: sdhci-omap: Fix context restore
	memstick: avoid out-of-range warning
	memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()
	net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE
	hwmon: Fix possible memleak in __hwmon_device_register()
	hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff
	ath10k: fix max antenna gain unit
	kernel/sched: Fix sched_fork() access an invalid sched_task_group
	net: fealnx: fix build for UML
	net: intel: igc_ptp: fix build for UML
	net: tulip: winbond-840: fix build for UML
	tcp: switch orphan_count to bare per-cpu counters
	crypto: octeontx2 - set assoclen in aead_do_fallback()
	thermal/core: fix a UAF bug in __thermal_cooling_device_register()
	drm/msm/dsi: do not enable irq handler before powering up the host
	drm/msm: Fix potential Oops in a6xx_gmu_rpmh_init()
	drm/msm: potential error pointer dereference in init()
	drm/msm: unlock on error in get_sched_entity()
	drm/msm: fix potential NULL dereference in cleanup
	drm/msm: uninitialized variable in msm_gem_import()
	net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
	thermal/drivers/qcom/lmh: make QCOM_LMH depends on QCOM_SCM
	mailbox: Remove WARN_ON for async_cb.cb in cmdq_exec_done
	media: ivtv: fix build for UML
	media: ir_toy: assignment to be16 should be of correct type
	mmc: mxs-mmc: disable regulator on error and in the remove function
	io-wq: Remove duplicate code in io_workqueue_create()
	block: ataflop: fix breakage introduced at blk-mq refactoring
	blk-wbt: prevent NULL pointer dereference in wb_timer_fn
	platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
	mailbox: mtk-cmdq: Validate alias_id on probe
	mailbox: mtk-cmdq: Fix local clock ID usage
	ACPI: PM: Turn off unused wakeup power resources
	ACPI: PM: Fix sharing of wakeup power resources
	drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu
	mt76: mt7921: fix endianness in mt7921_mcu_tx_done_event
	mt76: mt7915: fix endianness warning in mt7915_mac_add_txs_skb
	mt76: mt7921: fix endianness warning in mt7921_update_txs
	mt76: mt7615: fix endianness warning in mt7615_mac_write_txwi
	mt76: mt7915: fix info leak in mt7915_mcu_set_pre_cal()
	mt76: connac: fix mt76_connac_gtk_rekey_tlv usage
	mt76: fix build error implicit enumeration conversion
	mt76: mt7921: fix survey-dump reporting
	mt76: mt76x02: fix endianness warnings in mt76x02_mac.c
	mt76: mt7921: Fix out of order process by invalid event pkt
	mt76: mt7915: fix potential overflow of eeprom page index
	mt76: mt7915: fix bit fields for HT rate idx
	mt76: mt7921: fix dma hang in rmmod
	mt76: connac: fix GTK rekey offload failure on WPA mixed mode
	mt76: overwrite default reg_ops if necessary
	mt76: mt7921: report HE MU radiotap
	mt76: mt7921: fix firmware usage of RA info using legacy rates
	mt76: mt7921: fix kernel warning from cfg80211_calculate_bitrate
	mt76: mt7921: always wake device if necessary in debugfs
	mt76: mt7915: fix hwmon temp sensor mem use-after-free
	mt76: mt7615: fix hwmon temp sensor mem use-after-free
	mt76: mt7915: fix possible infinite loop release semaphore
	mt76: mt7921: fix retrying release semaphore without end
	mt76: mt7615: fix monitor mode tear down crash
	mt76: connac: fix possible NULL pointer dereference in mt76_connac_get_phy_mode_v2
	mt76: mt7915: fix sta_rec_wtbl tag len
	mt76: mt7915: fix muar_idx in mt7915_mcu_alloc_sta_req()
	rsi: stop thread firstly in rsi_91x_init() error handling
	mwifiex: Send DELBA requests according to spec
	iwlwifi: mvm: reset PM state on unsuccessful resume
	iwlwifi: pnvm: don't kmemdup() more than we have
	iwlwifi: pnvm: read EFI data only if long enough
	net: enetc: unmap DMA in enetc_send_cmd()
	phy: micrel: ksz8041nl: do not use power down mode
	nbd: Fix use-after-free in pid_show
	nvme-rdma: fix error code in nvme_rdma_setup_ctrl
	PM: hibernate: fix sparse warnings
	clocksource/drivers/timer-ti-dm: Select TIMER_OF
	x86/sev: Fix stack type check in vc_switch_off_ist()
	drm/msm: Fix potential NULL dereference in DPU SSPP
	drm/msm/dsi: fix wrong type in msm_dsi_host
	crypto: tcrypt - fix skcipher multi-buffer tests for 1420B blocks
	smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
	KVM: selftests: Fix nested SVM tests when built with clang
	libbpf: Fix memory leak in btf__dedup()
	bpftool: Avoid leaking the JSON writer prepared for program metadata
	libbpf: Fix overflow in BTF sanity checks
	libbpf: Fix BTF header parsing checks
	mt76: mt7615: mt7622: fix ibss and meshpoint
	s390/gmap: validate VMA in __gmap_zap()
	s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap()
	s390/mm: validate VMA in PGSTE manipulation functions
	s390/mm: fix VMA and page table handling code in storage key handling functions
	s390/uv: fully validate the VMA before calling follow_page()
	KVM: s390: pv: avoid double free of sida page
	KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
	irq: mips: avoid nested irq_enter()
	net: dsa: avoid refcount warnings when ->port_{fdb,mdb}_del returns error
	ARM: 9142/1: kasan: work around LPAE build warning
	ath10k: fix module load regression with iram-recovery feature
	block: ataflop: more blk-mq refactoring fixes
	blk-cgroup: synchronize blkg creation against policy deactivation
	libbpf: Fix off-by-one bug in bpf_core_apply_relo()
	tpm: fix Atmel TPM crash caused by too frequent queries
	tpm_tis_spi: Add missing SPI ID
	libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
	tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
	tracing: Fix missing trace_boot_init_histograms kstrdup NULL checks
	cpufreq: intel_pstate: Fix cpu->pstate.turbo_freq initialization
	spi: spi-rpc-if: Check return value of rpcif_sw_init()
	samples/kretprobes: Fix return value if register_kretprobe() failed
	KVM: s390: Fix handle_sske page fault handling
	libertas_tf: Fix possible memory leak in probe and disconnect
	libertas: Fix possible memory leak in probe and disconnect
	wcn36xx: add proper DMA memory barriers in rx path
	wcn36xx: Fix discarded frames due to wrong sequence number
	bpf: Avoid races in __bpf_prog_run() for 32bit arches
	bpf: Fixes possible race in update_prog_stats() for 32bit arches
	wcn36xx: Channel list update before hardware scan
	drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()
	drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
	selftests/bpf: Fix fd cleanup in sk_lookup test
	selftests/bpf: Fix memory leak in test_ima
	sctp: allow IP fragmentation when PLPMTUD enters Error state
	sctp: reset probe_timer in sctp_transport_pl_update
	sctp: subtract sctphdr len in sctp_transport_pl_hlen
	sctp: return true only for pathmtu update in sctp_transport_pl_toobig
	net: amd-xgbe: Toggle PLL settings during rate change
	ipmi: kcs_bmc: Fix a memory leak in the error handling path of 'kcs_bmc_serio_add_device()'
	nfp: fix NULL pointer access when scheduling dim work
	nfp: fix potential deadlock when canceling dim work
	net: phylink: avoid mvneta warning when setting pause parameters
	net: bridge: fix uninitialized variables when BRIDGE_CFM is disabled
	selftests: net: bridge: update IGMP/MLD membership interval value
	crypto: pcrypt - Delay write to padata->info
	selftests/bpf: Fix fclose/pclose mismatch in test_progs
	udp6: allow SO_MARK ctrl msg to affect routing
	ibmvnic: don't stop queue in xmit
	ibmvnic: Process crqs after enabling interrupts
	ibmvnic: delay complete()
	selftests: mptcp: fix proto type in link_failure tests
	skmsg: Lose offset info in sk_psock_skb_ingress
	cgroup: Fix rootcg cpu.stat guest double counting
	bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
	bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
	of: unittest: fix EXPECT text for gpio hog errors
	cpufreq: Fix parameter in parse_perf_domain()
	staging: r8188eu: fix memory leak in rtw_set_key
	arm64: dts: meson: sm1: add Ethernet PHY reset line for ODROID-C4/HC4
	iio: st_sensors: disable regulators after device unregistration
	RDMA/rxe: Fix wrong port_cap_flags
	ARM: dts: BCM5301X: Fix memory nodes names
	arm64: dts: broadcom: bcm4908: Fix UART clock name
	clk: mvebu: ap-cpu-clk: Fix a memory leak in error handling paths
	scsi: pm80xx: Fix lockup in outbound queue management
	scsi: qla2xxx: edif: Use link event to wake up app
	scsi: lpfc: Fix NVMe I/O failover to non-optimized path
	ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
	arm64: dts: rockchip: Fix GPU register width for RK3328
	ARM: dts: qcom: msm8974: Add xo_board reference clock to DSI0 PHY
	RDMA/bnxt_re: Fix query SRQ failure
	arm64: dts: ti: k3-j721e-main: Fix "max-virtual-functions" in PCIe EP nodes
	arm64: dts: ti: k3-j721e-main: Fix "bus-range" upto 256 bus number for PCIe
	arm64: dts: ti: j7200-main: Fix "vendor-id"/"device-id" properties of pcie node
	arm64: dts: ti: j7200-main: Fix "bus-range" upto 256 bus number for PCIe
	arm64: dts: meson-g12a: Fix the pwm regulator supply properties
	arm64: dts: meson-g12b: Fix the pwm regulator supply properties
	arm64: dts: meson-sm1: Fix the pwm regulator supply properties
	bus: ti-sysc: Fix timekeeping_suspended warning on resume
	ARM: dts: at91: tse850: the emac<->phy interface is rmii
	arm64: dts: qcom: sc7180: Base dynamic CPU power coefficients in reality
	soc: qcom: llcc: Disable MMUHWT retention
	arm64: dts: qcom: sc7280: fix display port phy reg property
	scsi: dc395: Fix error case unwinding
	MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
	JFS: fix memleak in jfs_mount
	pinctrl: renesas: rzg2l: Fix missing port register 21h
	ASoC: wcd9335: Use correct version to initialize Class H
	arm64: dts: qcom: msm8916: Fix Secondary MI2S bit clock
	arm64: dts: renesas: beacon: Fix Ethernet PHY mode
	iommu/mediatek: Fix out-of-range warning with clang
	arm64: dts: qcom: pm8916: Remove wrong reg-names for rtc@6000
	iommu/dma: Fix sync_sg with swiotlb
	iommu/dma: Fix arch_sync_dma for map
	ALSA: hda: Reduce udelay() at SKL+ position reporting
	ALSA: hda: Use position buffer for SKL+ again
	ALSA: usb-audio: Fix possible race at sync of urb completions
	soundwire: debugfs: use controller id and link_id for debugfs
	power: reset: at91-reset: check properly the return value of devm_of_iomap
	scsi: ufs: core: Fix ufshcd_probe_hba() prototype to match the definition
	scsi: ufs: core: Stop clearing UNIT ATTENTIONS
	scsi: megaraid_sas: Fix concurrent access to ISR between IRQ polling and real interrupt
	scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp()
	driver core: Fix possible memory leak in device_link_add()
	arm: dts: omap3-gta04a4: accelerometer irq fix
	ASoC: SOF: topology: do not power down primary core during topology removal
	iio: st_pressure_spi: Add missing entries SPI to device ID table
	soc/tegra: Fix an error handling path in tegra_powergate_power_up()
	memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
	clk: at91: check pmc node status before registering syscore ops
	powerpc/mem: Fix arch/powerpc/mm/mem.c:53:12: error: no previous prototype for 'create_section_mapping'
	video: fbdev: chipsfb: use memset_io() instead of memset()
	powerpc: fix unbalanced node refcount in check_kvm_guest()
	powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
	serial: 8250_dw: Drop wrong use of ACPI_PTR()
	usb: gadget: hid: fix error code in do_config()
	power: supply: rt5033_battery: Change voltage values to µV
	power: supply: max17040: fix null-ptr-deref in max17040_probe()
	scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
	RDMA/mlx4: Return missed an error if device doesn't support steering
	usb: musb: select GENERIC_PHY instead of depending on it
	staging: most: dim2: do not double-register the same device
	staging: ks7010: select CRYPTO_HASH/CRYPTO_MICHAEL_MIC
	RDMA/core: Set sgtable nents when using ib_dma_virt_map_sg()
	dyndbg: make dyndbg a known cli param
	powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10
	pinctrl: renesas: checker: Fix off-by-one bug in drive register check
	ARM: dts: stm32: Reduce DHCOR SPI NOR frequency to 50 MHz
	ARM: dts: stm32: fix STUSB1600 Type-C irq level on stm32mp15xx-dkx
	ARM: dts: stm32: fix SAI sub nodes register range
	ARM: dts: stm32: fix AV96 board SAI2 pin muxing on stm32mp15
	ASoC: cs42l42: Always configure both ASP TX channels
	ASoC: cs42l42: Correct some register default values
	ASoC: cs42l42: Defer probe if request_threaded_irq() returns EPROBE_DEFER
	soc: qcom: rpmhpd: Make power_on actually enable the domain
	soc: qcom: socinfo: add two missing PMIC IDs
	iio: buffer: Fix double-free in iio_buffers_alloc_sysfs_and_mask()
	usb: typec: STUSB160X should select REGMAP_I2C
	iio: adis: do not disabe IRQs in 'adis_init()'
	soundwire: bus: stop dereferencing invalid slave pointer
	scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
	scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
	serial: imx: fix detach/attach of serial console
	usb: dwc2: drd: fix dwc2_force_mode call in dwc2_ovr_init
	usb: dwc2: drd: fix dwc2_drd_role_sw_set when clock could be disabled
	usb: dwc2: drd: reset current session before setting the new one
	powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE
	usb: dwc3: gadget: Skip resizing EP's TX FIFO if already resized
	firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available()
	soc: qcom: rpmhpd: fix sm8350_mxc's peer domain
	soc: qcom: apr: Add of_node_put() before return
	arm64: dts: qcom: pmi8994: Fix "eternal"->"external" typo in WLED node
	arm64: dts: qcom: sdm845: Use RPMH_CE_CLK macro directly
	arm64: dts: qcom: sdm845: Fix Qualcomm crypto engine bus clock
	pinctrl: equilibrium: Fix function addition in multiple groups
	ASoC: topology: Fix stub for snd_soc_tplg_component_remove()
	phy: qcom-qusb2: Fix a memory leak on probe
	phy: ti: gmii-sel: check of_get_address() for failure
	phy: qcom-qmp: another fix for the sc8180x PCIe definition
	phy: qcom-snps: Correct the FSEL_MASK
	phy: Sparx5 Eth SerDes: Fix return value check in sparx5_serdes_probe()
	serial: xilinx_uartps: Fix race condition causing stuck TX
	clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL
	clk: at91: clk-master: check if div or pres is zero
	clk: at91: clk-master: fix prescaler logic
	HID: u2fzero: clarify error check and length calculations
	HID: u2fzero: properly handle timeouts in usb_submit_urb
	powerpc/nohash: Fix __ptep_set_access_flags() and ptep_set_wrprotect()
	powerpc/book3e: Fix set_memory_x() and set_memory_nx()
	powerpc/44x/fsp2: add missing of_node_put
	powerpc/xmon: fix task state output
	ALSA: oxfw: fix functional regression for Mackie Onyx 1640i in v5.14 or later
	iommu/dma: Fix incorrect error return on iommu deferred attach
	powerpc: Don't provide __kernel_map_pages() without ARCH_SUPPORTS_DEBUG_PAGEALLOC
	ASoC: cs42l42: Correct configuring of switch inversion from ts-inv
	RDMA/hns: Fix initial arm_st of CQ
	RDMA/hns: Modify the value of MAX_LP_MSG_LEN to meet hardware compatibility
	ASoC: rsnd: Fix an error handling path in 'rsnd_node_count()'
	serial: cpm_uart: Protect udbg definitions by CONFIG_SERIAL_CPM_CONSOLE
	virtio_ring: check desc == NULL when using indirect with packed
	vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
	mips: cm: Convert to bitfield API to fix out-of-bounds access
	power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
	RDMA/core: Require the driver to set the IOVA correctly during rereg_mr
	apparmor: fix error check
	rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
	mtd: rawnand: intel: Fix potential buffer overflow in probe
	nfsd: don't alloc under spinlock in rpc_parse_scope_id
	rtc: ds1302: Add SPI ID table
	rtc: ds1390: Add SPI ID table
	rtc: pcf2123: Add SPI ID table
	remoteproc: imx_rproc: Fix TCM io memory type
	i2c: i801: Use PCI bus rescan mutex to protect P2SB access
	dmaengine: idxd: move out percpu_ref_exit() to ensure it's outside submission
	rtc: mcp795: Add SPI ID table
	Input: ariel-pwrbutton - add SPI device ID table
	i2c: mediatek: fixing the incorrect register offset
	NFS: Default change_attr_type to NFS4_CHANGE_TYPE_IS_UNDEFINED
	NFS: Don't set NFS_INO_DATA_INVAL_DEFER and NFS_INO_INVALID_DATA
	NFS: Ignore the directory size when marking for revalidation
	NFS: Fix dentry verifier races
	pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
	drm/bridge/lontium-lt9611uxc: fix provided connector suport
	drm/plane-helper: fix uninitialized variable reference
	PCI: aardvark: Don't spam about PIO Response Status
	PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge
	opp: Fix return in _opp_add_static_v2()
	NFS: Fix deadlocks in nfs_scan_commit_list()
	sparc: Add missing "FORCE" target when using if_changed
	fs: orangefs: fix error return code of orangefs_revalidate_lookup()
	Input: st1232 - increase "wait ready" timeout
	drm/bridge: nwl-dsi: Add atomic_get_input_bus_fmts
	mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
	PCI: uniphier: Serialize INTx masking/unmasking and fix the bit operation
	mtd: rawnand: arasan: Prevent an unsupported configuration
	mtd: core: don't remove debugfs directory if device is in use
	remoteproc: Fix a memory leak in an error handling path in 'rproc_handle_vdev()'
	rtc: rv3032: fix error handling in rv3032_clkout_set_rate()
	dmaengine: at_xdmac: call at_xdmac_axi_config() on resume path
	dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
	dmaengine: stm32-dma: fix stm32_dma_get_max_width
	NFS: Fix up commit deadlocks
	NFS: Fix an Oops in pnfs_mark_request_commit()
	Fix user namespace leak
	auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
	auxdisplay: ht16k33: Connect backlight to fbdev
	auxdisplay: ht16k33: Fix frame buffer device blanking
	soc: fsl: dpaa2-console: free buffer before returning from dpaa2_console_read
	netfilter: nfnetlink_queue: fix OOB when mac header was cleared
	dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result`
	dmaengine: tegra210-adma: fix pm runtime unbalance
	dmanegine: idxd: fix resource free ordering on driver removal
	dmaengine: idxd: reconfig device after device reset command
	signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
	m68k: set a default value for MEMORY_RESERVE
	watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
	ar7: fix kernel builds for compiler test
	scsi: target: core: Remove from tmr_list during LUN unlink
	scsi: qla2xxx: Relogin during fabric disturbance
	scsi: qla2xxx: Fix gnl list corruption
	scsi: qla2xxx: Turn off target reset during issue_lip
	scsi: qla2xxx: edif: Fix app start fail
	scsi: qla2xxx: edif: Fix app start delay
	scsi: qla2xxx: edif: Flush stale events and msgs on session down
	scsi: qla2xxx: edif: Increase ELS payload
	scsi: qla2xxx: edif: Fix EDIF bsg
	NFSv4: Fix a regression in nfs_set_open_stateid_locked()
	dmaengine: idxd: fix resource leak on dmaengine driver disable
	i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
	gpio: realtek-otto: fix GPIO line IRQ offset
	xen-pciback: Fix return in pm_ctrl_init()
	nbd: fix max value for 'first_minor'
	nbd: fix possible overflow for 'first_minor' in nbd_dev_add()
	io-wq: fix max-workers not correctly set on multi-node system
	net: davinci_emac: Fix interrupt pacing disable
	kselftests/net: add missed icmp.sh test to Makefile
	kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile
	kselftests/net: add missed SRv6 tests
	kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile
	kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile
	ethtool: fix ethtool msg len calculation for pause stats
	openrisc: fix SMP tlb flush NULL pointer dereference
	net: vlan: fix a UAF in vlan_dev_real_dev()
	net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
	ice: Fix replacing VF hardware MAC to existing MAC filter
	ice: Fix not stopping Tx queues for VFs
	kdb: Adopt scheduler's task classification
	ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
	PCI: j721e: Fix j721e_pcie_probe() error path
	nvdimm/btt: do not call del_gendisk() if not needed
	scsi: bsg: Fix errno when scsi_bsg_register_queue() fails
	scsi: ufs: ufshpb: Use proper power management API
	scsi: ufs: core: Fix NULL pointer dereference
	scsi: ufs: ufshpb: Properly handle max-single-cmd
	selftests: net: properly support IPv6 in GSO GRE test
	drm/nouveau/svm: Fix refcount leak bug and missing check against null bug
	nvdimm/pmem: cleanup the disk if pmem_release_disk() is yet assigned
	block/ataflop: use the blk_cleanup_disk() helper
	block/ataflop: add registration bool before calling del_gendisk()
	block/ataflop: provide a helper for cleanup up an atari disk
	ataflop: remove ataflop_probe_lock mutex
	PCI: Do not enable AtomicOps on VFs
	cpufreq: intel_pstate: Clear HWP desired on suspend/shutdown and offline
	net: phy: fix duplex out of sync problem while changing settings
	block: fix device_add_disk() kobject_create_and_add() error handling
	drm/ttm: remove ttm_bo_vm_insert_huge()
	bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
	octeontx2-pf: select CONFIG_NET_DEVLINK
	ALSA: memalloc: Catch call with NULL snd_dma_buffer pointer
	mfd: core: Add missing of_node_put for loop iteration
	mfd: cpcap: Add SPI device ID table
	mfd: sprd: Add SPI device ID table
	mfd: altera-sysmgr: Fix a mistake caused by resource_size conversion
	ACPI: PM: Fix device wakeup power reference counting error
	libbpf: Fix lookup_and_delete_elem_flags error reporting
	selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
	selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
	selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
	selftests/bpf/xdp_redirect_multi: Limit the tests in netns
	drm: fb_helper: improve CONFIG_FB dependency
	Revert "drm/imx: Annotate dma-fence critical section in commit path"
	drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
	can: etas_es58x: es58x_rx_err_msg(): fix memory leak in error path
	can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for mcp251xfd_chip_rx_int_enable()
	mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
	zram: off by one in read_block_state()
	perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
	llc: fix out-of-bound array index in llc_sk_dev_hash()
	nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
	litex_liteeth: Fix a double free in the remove function
	arm64: arm64_ftr_reg->name may not be a human-readable string
	arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
	bpf, sockmap: Remove unhash handler for BPF sockmap usage
	bpf, sockmap: Fix race in ingress receive verdict with redirect to self
	bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
	bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
	dmaengine: stm32-dma: fix burst in case of unaligned memory address
	dmaengine: stm32-dma: avoid 64-bit division in stm32_dma_get_max_width
	gve: Fix off by one in gve_tx_timeout()
	drm/i915/fb: Fix rounding error in subsampled plane size calculation
	init: make unknown command line param message clearer
	seq_file: fix passing wrong private data
	drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
	net: dsa: mv88e6xxx: Don't support >1G speeds on 6191X on ports other than 10
	net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any
	net: hns3: fix ROCE base interrupt vector initialization bug
	net: hns3: fix pfc packet number incorrect after querying pfc parameters
	net: hns3: fix kernel crash when unload VF while it is being reset
	net: hns3: allow configure ETS bandwidth of all TCs
	net: stmmac: allow a tc-taprio base-time of zero
	net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
	net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
	vsock: prevent unnecessary refcnt inc for nonblocking connect
	net/smc: fix sk_refcnt underflow on linkdown and fallback
	cxgb4: fix eeprom len when diagnostics not implemented
	selftests/net: udpgso_bench_rx: fix port argument
	thermal: int340x: fix build on 32-bit targets
	smb3: do not error on fsync when readonly
	ARM: 9155/1: fix early early_iounmap()
	ARM: 9156/1: drop cc-option fallbacks for architecture selection
	parisc: Fix backtrace to always include init funtion names
	parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
	MIPS: fix duplicated slashes for Platform file path
	MIPS: fix *-pkg builds for loongson2ef platform
	MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL
	x86/mce: Add errata workaround for Skylake SKX37
	PCI/MSI: Move non-mask check back into low level accessors
	PCI/MSI: Destroy sysfs before freeing entries
	KVM: x86: move guest_pv_has out of user_access section
	posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
	irqchip/sifive-plic: Fixup EOI failed when masked
	f2fs: should use GFP_NOFS for directory inodes
	f2fs: include non-compressed blocks in compr_written_block
	f2fs: fix UAF in f2fs_available_free_memory
	ceph: fix mdsmap decode when there are MDS's beyond max_mds
	erofs: fix unsafe pagevec reuse of hooked pclusters
	drm/i915/guc: Fix blocked context accounting
	block: Hold invalidate_lock in BLKDISCARD ioctl
	block: Hold invalidate_lock in BLKZEROOUT ioctl
	block: Hold invalidate_lock in BLKRESETZONE ioctl
	ksmbd: Fix buffer length check in fsctl_validate_negotiate_info()
	ksmbd: don't need 8byte alignment for request length in ksmbd_check_message
	dmaengine: ti: k3-udma: Set bchan to NULL if a channel request fail
	dmaengine: ti: k3-udma: Set r/tchan or rflow to NULL if request fail
	dmaengine: bestcomm: fix system boot lockups
	net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE
	9p/net: fix missing error check in p9_check_errors
	mm/filemap.c: remove bogus VM_BUG_ON
	memcg: prohibit unconditional exceeding the limit of dying tasks
	mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
	mm, oom: do not trigger out_of_memory from the #PF
	mm, thp: lock filemap when truncating page cache
	mm, thp: fix incorrect unmap behavior for private pages
	mfd: dln2: Add cell for initializing DLN2 ADC
	video: backlight: Drop maximum brightness override for brightness zero
	bcache: fix use-after-free problem in bcache_device_free()
	bcache: Revert "bcache: use bvec_virt"
	PM: sleep: Avoid calling put_device() under dpm_list_mtx
	s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove
	s390/cio: check the subchannel validity for dev_busid
	s390/tape: fix timer initialization in tape_std_assign()
	s390/ap: Fix hanging ioctl caused by orphaned replies
	s390/cio: make ccw_device_dma_* more robust
	remoteproc: elf_loader: Fix loading segment when is_iomem true
	remoteproc: Fix the wrong default value of is_iomem
	remoteproc: imx_rproc: Fix ignoring mapping vdev regions
	remoteproc: imx_rproc: Fix rsc-table name
	mtd: rawnand: fsmc: Fix use of SM ORDER
	mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: pasemi: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: orion: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: plat_nand: Keep the driver compatible with on-die ECC engines
	mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC engines
	powerpc/vas: Fix potential NULL pointer dereference
	powerpc/bpf: Fix write protecting JIT code
	powerpc/32e: Ignore ESR in instruction storage interrupt handler
	powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
	powerpc/security: Use a mutex for interrupt exit code patching
	powerpc/64s/interrupt: Fix check_return_regs_valid() false positive
	powerpc/pseries/mobility: ignore ibm, platform-facilities updates
	powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
	drm/sun4i: Fix macros in sun8i_csc.h
	PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
	PCI: aardvark: Fix PCIe Max Payload Size setting
	SUNRPC: Partial revert of commit 6f9f17287e
	drm/amd/display: Look at firmware version to determine using dmub on dcn21
	media: vidtv: move kfree(dvb) to vidtv_bridge_dev_release()
	cifs: fix memory leak of smb3_fs_context_dup::server_hostname
	ath10k: fix invalid dma_addr_t token assignment
	mmc: moxart: Fix null pointer dereference on pointer host
	selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
	selftests/bpf: Fix also no-alu32 strobemeta selftest
	arch/cc: Introduce a function to check for confidential computing features
	x86/sev: Add an x86 version of cc_platform_has()
	x86/sev: Make the #VC exception stacks part of the default stacks storage
	media: videobuf2: always set buffer vb2 pointer
	media: videobuf2-dma-sg: Fix buf->vb NULL pointer dereference
	Linux 5.15.3

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I09574eb6b4fbe930bd13f932cc618846972fcc27
2021-11-19 15:38:07 +01:00
Jakub Kicinski
ae11b215ae net: sched: update default qdisc visibility after Tx queue cnt changes
[ Upstream commit 1e080f17750d1083e8a32f7b350584ae1cd7ff20 ]

mq / mqprio make the default child qdiscs visible. They only do
so for the qdiscs which are within real_num_tx_queues when the
device is registered. Depending on order of calls in the driver,
or if user space changes config via ethtool -L the number of
qdiscs visible under tc qdisc show will differ from the number
of queues. This is confusing to users and potentially to system
configuration scripts which try to make sure qdiscs have the
right parameters.

Add a new Qdisc_ops callback and make relevant qdiscs TTRT.

Note that this uncovers the "shortcut" created by
commit 1f27cde313 ("net: sched: use pfifo_fast for non real queues")
The default child qdiscs beyond initial real_num_tx are always
pfifo_fast, no matter what the sysfs setting is. Fixing this
gets a little tricky because we'd need to keep a reference
on whatever the default qdisc was at the time of creation.
In practice this is likely an non-issue the qdiscs likely have
to be configured to non-default settings, so whatever user space
is doing such configuration can replace the pfifos... now that
it will see them.

Reported-by: Matthew Massey <matthewmassey@fb.com>
Reviewed-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:10 +01:00
Greg Kroah-Hartman
d1a66e7942 Merge tag 'v5.15' into android-mainline
Linux 5.15

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I81d96ada6b66bcf73d192ddde9018f1804e7d90b
2021-11-01 07:40:54 +01:00
Cyril Strejc
9122a70a63 net: multicast: calculate csum of looped-back and forwarded packets
During a testing of an user-space application which transmits UDP
multicast datagrams and utilizes multicast routing to send the UDP
datagrams out of defined network interfaces, I've found a multicast
router does not fill-in UDP checksum into locally produced, looped-back
and forwarded UDP datagrams, if an original output NIC the datagrams
are sent to has UDP TX checksum offload enabled.

The datagrams are sent malformed out of the NIC the datagrams have been
forwarded to.

It is because:

1. If TX checksum offload is enabled on the output NIC, UDP checksum
   is not calculated by kernel and is not filled into skb data.

2. dev_loopback_xmit(), which is called solely by
   ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY
   unconditionally.

3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except
   CHECKSUM_COMPLETE"), the ip_summed value is preserved during
   forwarding.

4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during
   a packet egress.

The minimum fix in dev_loopback_xmit():

1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the
   case when the original output NIC has TX checksum offload enabled.
   The effects are:

     a) If the forwarding destination interface supports TX checksum
        offloading, the NIC driver is responsible to fill-in the
        checksum.

     b) If the forwarding destination interface does NOT support TX
        checksum offloading, checksums are filled-in by kernel before
        skb is submitted to the NIC driver.

     c) For local delivery, checksum validation is skipped as in the
        case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary().

2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It
   means, for CHECKSUM_NONE, the behavior is unmodified and is there
   to skip a looped-back packet local delivery checksum validation.

Signed-off-by: Cyril Strejc <cyril.strejc@skoda.cz>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:09:22 +01:00
Michael Chan
0c57eeecc5 net: Prevent infinite while loop in skb_tx_hash()
Drivers call netdev_set_num_tc() and then netdev_set_tc_queue()
to set the queue count and offset for each TC.  So the queue count
and offset for the TCs may be zero for a short period after dev->num_tc
has been set.  If a TX packet is being transmitted at this time in the
code path netdev_pick_tx() -> skb_tx_hash(), skb_tx_hash() may see
nonzero dev->num_tc but zero qcount for the TC.  The while loop that
keeps looping while hash >= qcount will not end.

Fix it by checking the TC's qcount to be nonzero before using it.

Fixes: eadec877ce ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 15:58:01 +01:00
Greg Kroah-Hartman
f30aef53ca Merge f9e36107ec ("Merge tag 'for-5.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux") into android-mainline
Steps on the way to 5.15-rc3

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2c129754b55c04be6f83c9d551d024f72b64cb04
2021-09-24 10:36:36 +02:00
Xuan Zhuo
3765996e4f napi: fix race inside napi_enable
The process will cause napi.state to contain NAPI_STATE_SCHED and
not in the poll_list, which will cause napi_disable() to get stuck.

The prefix "NAPI_STATE_" is removed in the figure below, and
NAPI_STATE_HASHED is ignored in napi.state.

                      CPU0       |                   CPU1       | napi.state
===============================================================================
napi_disable()                   |                              | SCHED | NPSVC
napi_enable()                    |                              |
{                                |                              |
    smp_mb__before_atomic();     |                              |
    clear_bit(SCHED, &n->state); |                              | NPSVC
                                 | napi_schedule_prep()         | SCHED | NPSVC
                                 | napi_poll()                  |
                                 |   napi_complete_done()       |
                                 |   {                          |
                                 |      if (n->state & (NPSVC | | (1)
                                 |               _BUSY_POLL)))  |
                                 |           return false;      |
                                 |     ................         |
                                 |   }                          | SCHED | NPSVC
                                 |                              |
    clear_bit(NPSVC, &n->state); |                              | SCHED
}                                |                              |
                                 |                              |
napi_schedule_prep()             |                              | SCHED | MISSED (2)

(1) Here return direct. Because of NAPI_STATE_NPSVC exists.
(2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list

Since NAPI_STATE_SCHED already exists and napi is not in the
sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always
exist.

1. This will cause this queue to no longer receive packets.
2. If you encounter napi_disable under the protection of rtnl_lock, it
   will cause the entire rtnl_lock to be locked, affecting the overall
   system.

This patch uses cmpxchg to implement napi_enable(), which ensures that
there will be no race due to the separation of clear two bits.

Fixes: 2d8bff1269 ("netpoll: Close race condition between poll_one_napi and napi_disable")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 09:41:29 +01:00
Greg Kroah-Hartman
bc2f6edebd Merge 9e9fb7655e ("Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next") into android-mainline
Steps on the way to 5.15-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I49577d606b2710975407eae3fee60bc331397810
2021-09-07 14:40:30 +02:00
Changbin Du
afa79d08c6 net: in_irq() cleanup
Replace the obsolete and ambiguos macro in_irq() with new
macro in_hardirq().

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Link: https://lore.kernel.org/r/20210813145749.86512-1-changbin.du@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 14:09:19 -07:00
Jakub Kicinski
d1a4e0a957 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
bpf-next 2021-08-10

We've added 31 non-merge commits during the last 8 day(s) which contain
a total of 28 files changed, 3644 insertions(+), 519 deletions(-).

1) Native XDP support for bonding driver & related BPF selftests, from Jussi Maki.

2) Large batch of new BPF JIT tests for test_bpf.ko that came out as a result from
   32-bit MIPS JIT development, from Johan Almbladh.

3) Rewrite of netcnt BPF selftest and merge into test_progs, from Stanislav Fomichev.

4) Fix XDP bpf_prog_test_run infra after net to net-next merge, from Andrii Nakryiko.

5) Follow-up fix in unix_bpf_update_proto() to enforce socket type, from Cong Wang.

6) Fix bpf-iter-tcp4 selftest to print the correct dest IP, from Jose Blanquicet.

7) Various misc BPF XDP sample improvements, from Niklas Söderlund, Matthew Cover,
   and Muhammad Falak R Wani.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (31 commits)
  bpf, tests: Add tail call test suite
  bpf, tests: Add tests for BPF_CMPXCHG
  bpf, tests: Add tests for atomic operations
  bpf, tests: Add test for 32-bit context pointer argument passing
  bpf, tests: Add branch conversion JIT test
  bpf, tests: Add word-order tests for load/store of double words
  bpf, tests: Add tests for ALU operations implemented with function calls
  bpf, tests: Add more ALU64 BPF_MUL tests
  bpf, tests: Add more BPF_LSH/RSH/ARSH tests for ALU64
  bpf, tests: Add more ALU32 tests for BPF_LSH/RSH/ARSH
  bpf, tests: Add more tests of ALU32 and ALU64 bitwise operations
  bpf, tests: Fix typos in test case descriptions
  bpf, tests: Add BPF_MOV tests for zero and sign extension
  bpf, tests: Add BPF_JMP32 test cases
  samples, bpf: Add an explict comment to handle nested vlan tagging.
  selftests/bpf: Add tests for XDP bonding
  selftests/bpf: Fix xdp_tx.c prog section name
  net, core: Allow netdev_lower_get_next_private_rcu in bh context
  bpf, devmap: Exclude XDP broadcast to master device
  net, bonding: Add XDP support to the bonding driver
  ...
====================

Link: https://lore.kernel.org/r/20210810130038.16927-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10 07:53:22 -07:00
Jussi Maki
6891866999 net, core: Allow netdev_lower_get_next_private_rcu in bh context
For the XDP bonding slave lookup to work in the NAPI poll context in which
the redudant rcu_read_lock() has been removed we have to follow the same
approach as in 694cea395f ("bpf: Allow RCU-protected lookups to happen
from bh context") and modify the WARN_ON to also check rcu_read_lock_bh_held().

Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210731055738.16820-6-joamaki@gmail.com
2021-08-09 23:25:15 +02:00
Jussi Maki
879af96ffd net, core: Add support for XDP redirection to slave device
This adds the ndo_xdp_get_xmit_slave hook for transforming XDP_TX
into XDP_REDIRECT after BPF program run when the ingress device
is a bond slave.

The dev_xdp_prog_count is exposed so that slave devices can be checked
for loaded XDP programs in order to avoid the situation where both
bond master and slave have programs loaded according to xdp_state.

Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Link: https://lore.kernel.org/bpf/20210731055738.16820-3-joamaki@gmail.com
2021-08-09 23:15:35 +02:00
Yajun Deng
1160dfa178 net: Remove redundant if statements
The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05 13:27:50 +01:00
Sebastian Andrzej Siewior
372bbdd5bb net: Replace deprecated CPU-hotplug functions.
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-04 13:47:50 -07:00
Jakub Kicinski
271e5b7d00 net: add netif_set_real_num_queues() for device reconfig
netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues()
can fail which breaks drivers trying to implement reconfiguration
in a way that can't leave the device half-broken. In other words
those functions are incompatible with prepare/commit approach.

Luckily setting real number of queues can fail only if the number
is increased, meaning that if we order operations correctly we
can guarantee ending up with either new config (success), or
the old one (on error).

Provide a helper implementing such logic so that drivers don't
have to duplicate it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:05:13 +01:00
Arnd Bergmann
5ea2f5ffde move netdev_boot_setup into Space.c
This is now only used by a handful of old ISA drivers,
and can be moved into the file they already all depend on.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03 13:05:26 +01:00
Paolo Abeni
a432934a30 sk_buff: avoid potentially clearing 'slow_gro' field
If skb_dst_set_noref() is invoked with a NULL dst, the 'slow_gro'
field is cleared, too. That could lead to wrong behavior if
the skb later enters the GRO stage.

Fix the potential issue replacing preserving a non-zero value of
the 'slow_gro' field.

Additionally, fix a comment typo.

Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 8a886b142b ("sk_buff: track dst status in slow_gro")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/aa42529252dc8bb02bd42e8629427040d1058537.1627662501.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-30 20:48:06 +02:00
Davide Caratti
3aa2605594 net/sched: store the last executed chain also for clsact egress
currently, only 'ingress' and 'clsact ingress' qdiscs store the tc 'chain
id' in the skb extension. However, userspace programs (like ovs) are able
to setup egress rules, and datapath gets confused in case it doesn't find
the 'chain id' for a packet that's "recirculated" by tc.
Change tcf_classify() to have the same semantic as tcf_classify_ingress()
so that a single function can be called in ingress / egress, using the tc
ingress / egress block respectively.

Suggested-by: Alaa Hleilel <alaa@nvidia.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29 22:17:37 +01:00
Paolo Abeni
5e10da5385 skbuff: allow 'slow_gro' for skb carring sock reference
This change leverages the infrastructure introduced by the previous
patches to allow soft devices passing to the GRO engine owned skbs
without impacting the fast-path.

It's up to the GRO caller ensuring the slow_gro bit validity before
invoking the GRO engine. The new helper skb_prepare_for_gro() is
introduced for that goal.

On slow_gro, skbs are aggregated only with equal sk.
Additionally, skb truesize on GRO recycle and free is correctly
updated so that sk wmem is not changed by the GRO processing.

rfc-> v1:
 - fixed bad truesize on dev_gro_receive NAPI_FREE
 - use the existing state bit

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29 12:18:12 +01:00
Paolo Abeni
9efb4b5baf net: optimize GRO for the common case.
After the previous patches, at GRO time, skb->slow_gro is
usually 0, unless the packets comes from some H/W offload
slowpath or tunnel.

We can optimize the GRO code assuming !skb->slow_gro is likely.
This remove multiple conditionals in the most common path, at the
price of an additional one when we hit the above "slow-paths".

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29 12:18:12 +01:00
Greg Kroah-Hartman
e9975a8f2e Merge tag 'v5.14-rc3' into android-mainline
Linux 5.14-rc3

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5db742d6b8faf7b40efe1cb3b5beae486010f3fd
2021-07-28 14:45:26 +02:00
David S. Miller
5af84df962 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts are simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:13:06 +01:00
Lee Jones
946e465c81 Merge tag 'v5.14-rc2' into android-mainline
Linux 5.14-rc2

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: Ia2131de59daa96610741f5a0ff267b0d08697023
2021-07-22 14:14:38 +01:00
Vasily Averin
c948f51c16 memcg: enable accounting for net_device and Tx/Rx queues
Container netadmin can create a lot of fake net devices,
then create a new net namespace and repeat it again and again.
Net device can request the creation of up to 4096 tx and rx queues,
and force kernel to allocate up to several tens of megabytes memory
per net device.

It makes sense to account for them to restrict the host's memory
consumption from inside the memcg-limited container.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20 06:00:38 -07:00
David S. Miller
82a1ffe57e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-07-15

The following pull-request contains BPF updates for your *net-next* tree.

We've added 45 non-merge commits during the last 15 day(s) which contain
a total of 52 files changed, 3122 insertions(+), 384 deletions(-).

The main changes are:

1) Introduce bpf timers, from Alexei.

2) Add sockmap support for unix datagram socket, from Cong.

3) Fix potential memleak and UAF in the verifier, from He.

4) Add bpf_get_func_ip helper, from Jiri.

5) Improvements to generic XDP mode, from Kumar.

6) Support for passing xdp_md to XDP programs in bpf_prog_run, from Zvi.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15 22:40:10 -07:00
David S. Miller
20192d9c9f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Andrii Nakryiko says:

====================
pull-request: bpf 2021-07-15

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 5 day(s) which contain
a total of 9 files changed, 37 insertions(+), 15 deletions(-).

The main changes are:

1) Fix NULL pointer dereference in BPF_TEST_RUN for BPF_XDP_DEVMAP and
   BPF_XDP_CPUMAP programs, from Xuan Zhuo.

2) Fix use-after-free of net_device in XDP bpf_link, from Xuan Zhuo.

3) Follow-up fix to subprog poke descriptor use-after-free problem, from
   Daniel Borkmann and John Fastabend.

4) Fix out-of-range array access in s390 BPF JIT backend, from Colin Ian King.

5) Fix memory leak in BPF sockmap, from John Fastabend.

6) Fix for sockmap to prevent proc stats reporting bug, from John Fastabend
   and Jakub Sitnicki.

7) Fix NULL pointer dereference in bpftool, from Tobias Klauser.

8) AF_XDP documentation fixes, from Baruch Siach.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15 14:39:45 -07:00
Qitao Xu
70713dddf3 net_sched: introduce tracepoint trace_qdisc_enqueue()
Tracepoint trace_qdisc_enqueue() is introduced to trace skb at
the entrance of TC layer on TX side. This is similar to
trace_qdisc_dequeue():

1. For both we only trace successful cases. The failure cases
   can be traced via trace_kfree_skb().

2. They are called at entrance or exit of TC layer, not for each
   ->enqueue() or ->dequeue(). This is intentional, because
   we want to make trace_qdisc_enqueue() symmetric to
   trace_qdisc_dequeue(), which is easier to use.

The return value of qdisc_enqueue() is not interesting here,
we have Qdisc's drop packets in ->dequeue(), it is impossible to
trace them even if we have the return value, the only way to trace
them is tracing kfree_skb().

We only add information we need to trace ring buffer. If any other
information is needed, it is easy to extend it without breaking ABI,
see commit 3dd344ea84 ("net: tracepoint: exposing sk_family in all
tcp:tracepoints").

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Qitao Xu <qitao.xu@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15 10:32:38 -07:00
Xuan Zhuo
5acc7d3e8d xdp, net: Fix use-after-free in bpf_xdp_link_release
The problem occurs between dev_get_by_index() and dev_xdp_attach_link().
At this point, dev_xdp_uninstall() is called. Then xdp link will not be
detached automatically when dev is released. But link->dev already
points to dev, when xdp link is released, dev will still be accessed,
but dev has been released.

dev_get_by_index()        |
link->dev = dev           |
                          |      rtnl_lock()
                          |      unregister_netdevice_many()
                          |          dev_xdp_uninstall()
                          |      rtnl_unlock()
rtnl_lock();              |
dev_xdp_attach_link()     |
rtnl_unlock();            |
                          |      netdev_run_todo() // dev released
bpf_xdp_link_release()    |
    /* access dev.        |
       use-after-free */  |

[   45.966867] BUG: KASAN: use-after-free in bpf_xdp_link_release+0x3b8/0x3d0
[   45.967619] Read of size 8 at addr ffff00000f9980c8 by task a.out/732
[   45.968297]
[   45.968502] CPU: 1 PID: 732 Comm: a.out Not tainted 5.13.0+ #22
[   45.969222] Hardware name: linux,dummy-virt (DT)
[   45.969795] Call trace:
[   45.970106]  dump_backtrace+0x0/0x4c8
[   45.970564]  show_stack+0x30/0x40
[   45.970981]  dump_stack_lvl+0x120/0x18c
[   45.971470]  print_address_description.constprop.0+0x74/0x30c
[   45.972182]  kasan_report+0x1e8/0x200
[   45.972659]  __asan_report_load8_noabort+0x2c/0x50
[   45.973273]  bpf_xdp_link_release+0x3b8/0x3d0
[   45.973834]  bpf_link_free+0xd0/0x188
[   45.974315]  bpf_link_put+0x1d0/0x218
[   45.974790]  bpf_link_release+0x3c/0x58
[   45.975291]  __fput+0x20c/0x7e8
[   45.975706]  ____fput+0x24/0x30
[   45.976117]  task_work_run+0x104/0x258
[   45.976609]  do_notify_resume+0x894/0xaf8
[   45.977121]  work_pending+0xc/0x328
[   45.977575]
[   45.977775] The buggy address belongs to the page:
[   45.978369] page:fffffc00003e6600 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4f998
[   45.979522] flags: 0x7fffe0000000000(node=0|zone=0|lastcpupid=0x3ffff)
[   45.980349] raw: 07fffe0000000000 fffffc00003e6708 ffff0000dac3c010 0000000000000000
[   45.981309] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[   45.982259] page dumped because: kasan: bad access detected
[   45.982948]
[   45.983153] Memory state around the buggy address:
[   45.983753]  ffff00000f997f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   45.984645]  ffff00000f998000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   45.985533] >ffff00000f998080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   45.986419]                                               ^
[   45.987112]  ffff00000f998100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   45.988006]  ffff00000f998180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   45.988895] ==================================================================
[   45.989773] Disabling lock debugging due to kernel taint
[   45.990552] Kernel panic - not syncing: panic_on_warn set ...
[   45.991166] CPU: 1 PID: 732 Comm: a.out Tainted: G    B             5.13.0+ #22
[   45.991929] Hardware name: linux,dummy-virt (DT)
[   45.992448] Call trace:
[   45.992753]  dump_backtrace+0x0/0x4c8
[   45.993208]  show_stack+0x30/0x40
[   45.993627]  dump_stack_lvl+0x120/0x18c
[   45.994113]  dump_stack+0x1c/0x34
[   45.994530]  panic+0x3a4/0x7d8
[   45.994930]  end_report+0x194/0x198
[   45.995380]  kasan_report+0x134/0x200
[   45.995850]  __asan_report_load8_noabort+0x2c/0x50
[   45.996453]  bpf_xdp_link_release+0x3b8/0x3d0
[   45.997007]  bpf_link_free+0xd0/0x188
[   45.997474]  bpf_link_put+0x1d0/0x218
[   45.997942]  bpf_link_release+0x3c/0x58
[   45.998429]  __fput+0x20c/0x7e8
[   45.998833]  ____fput+0x24/0x30
[   45.999247]  task_work_run+0x104/0x258
[   45.999731]  do_notify_resume+0x894/0xaf8
[   46.000236]  work_pending+0xc/0x328
[   46.000697] SMP: stopping secondary CPUs
[   46.001226] Dumping ftrace buffer:
[   46.001663]    (ftrace buffer empty)
[   46.002110] Kernel Offset: disabled
[   46.002545] CPU features: 0x00000001,23202c00
[   46.003080] Memory Limit: none

Fixes: aa8d3a716b ("bpf, xdp: Add bpf_link-based XDP attachment API")
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210710031635.41649-1-xuanzhuo@linux.alibaba.com
2021-07-13 08:22:31 -07:00
Lee Jones
8e658623d4 Merge commit c288d9cd71 ("Merge tag 'for-5.14/io_uring-2021-06-30' of git://git.kernel.dk/linux-block") into android-mainline
Another small step en route to v5.14-rc1

Change-Id: I24899ab78da7d367574ed69ceaa82ab0837d9556
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2021-07-12 10:02:27 +01:00
Antoine Tenart
28b34f01a7 net: do not reuse skbuff allocated from skbuff_fclone_cache in the skb cache
Some socket buffers allocated in the fclone cache (in __alloc_skb) can
end-up in the following path[1]:

napi_skb_finish
  __kfree_skb_defer
    napi_skb_cache_put

The issue is napi_skb_cache_put is not fclone friendly and will put
those skbuff in the skb cache to be reused later, although this cache
only expects skbuff allocated from skbuff_head_cache. When this happens
the skbuff is eventually freed using the wrong origin cache, and we can
see traces similar to:

[ 1223.947534] cache_from_obj: Wrong slab cache. skbuff_head_cache but object is from skbuff_fclone_cache
[ 1223.948895] WARNING: CPU: 3 PID: 0 at mm/slab.h:442 kmem_cache_free+0x251/0x3e0
[ 1223.950211] Modules linked in:
[ 1223.950680] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.13.0+ #474
[ 1223.951587] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-3.fc34 04/01/2014
[ 1223.953060] RIP: 0010:kmem_cache_free+0x251/0x3e0

Leading sometimes to other memory related issues.

Fix this by using __kfree_skb for fclone skbuff, similar to what is done
the other place __kfree_skb_defer is called.

[1] At least in setups using veth pairs and tunnels. Building a kernel
    with KASAN we can for example see packets allocated in
    sk_stream_alloc_skb hit the above path and later the issue arises
    when the skbuff is reused.

Fixes: 9243adfc31 ("skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing")
Cc: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09 11:26:27 -07:00
Lee Jones
7889eed917 Merge 54a728dc5e ("Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") into android-mainline
A little step towards 5.14-rc1

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Change-Id: I2573a6df9f4e7b67194327ac6db6082a574d2809
2021-07-09 10:55:21 +01:00
Bae Soukjin
1da7910c1d ANDROID: vendor_hooks: Add vendor hook to the net
android_vh_ptype_head:
    To add a debugging chain to ptype list

  android_vh_kfree_skb
    To sniff the dropped packet at kernel network

Bug: 163716381

Signed-off-by: Bae Soukjin <soukjin.bae@samsung.com>
Change-Id: Ide80bf0a129da31a1824d4a33026ac42be327361
(cherry picked from commit d88b2969cf)
(cherry picked from commit a8021ba684)
2021-07-09 04:37:28 +00:00
Florian Fainelli
9615fe36b3 skbuff: Fix build with SKB extensions disabled
We will fail to build with CONFIG_SKB_EXTENSIONS disabled after
8550ff8d8c ("skbuff: Release nfct refcount on napi stolen or re-used
skbs") since there is an unconditionally use of skb_ext_find() without
an appropriate stub. Simply build the code conditionally and properly
guard against both COFNIG_SKB_EXTENSIONS as well as
CONFIG_NET_TC_SKB_EXT being disabled.

Fixes: Fixes: 8550ff8d8c ("skbuff: Release nfct refcount on napi stolen or re-used skbs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-08 00:07:14 -07:00
Kumar Kartikeya Dwivedi
2ea5eabaf0 bpf: devmap: Implement devmap prog execution for generic XDP
This lifts the restriction on running devmap BPF progs in generic
redirect mode. To match native XDP behavior, it is invoked right before
generic_xdp_tx is called, and only supports XDP_PASS/XDP_ABORTED/
XDP_DROP actions.

We also return 0 even if devmap program drops the packet, as
semantically redirect has already succeeded and the devmap prog is the
last point before TX of the packet to device where it can deliver a
verdict on the packet.

This also means it must take care of freeing the skb, as
xdp_do_generic_redirect callers only do that in case an error is
returned.

Since devmap entry prog is supported, remove the check in
generic_xdp_install entirely.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210702111825.491065-5-memxor@gmail.com
2021-07-07 20:01:45 -07:00
Kumar Kartikeya Dwivedi
11941f8a85 bpf: cpumap: Implement generic cpumap
This change implements CPUMAP redirect support for generic XDP programs.
The idea is to reuse the cpu map entry's queue that is used to push
native xdp frames for redirecting skb to a different CPU. This will
match native XDP behavior (in that RPS is invoked again for packet
reinjected into networking stack).

To be able to determine whether the incoming skb is from the driver or
cpumap, we reuse skb->redirected bit that skips generic XDP processing
when it is set. To always make use of this, CONFIG_NET_REDIRECT guard on
it has been lifted and it is always available.

>From the redirect side, we add the skb to ptr_ring with its lowest bit
set to 1.  This should be safe as skb is not 1-byte aligned. This allows
kthread to discern between xdp_frames and sk_buff. On consumption of the
ptr_ring item, the lowest bit is unset.

In the end, the skb is simply added to the list that kthread is anyway
going to maintain for xdp_frames converted to skb, and then received
again by using netif_receive_skb_list.

Bulking optimization for generic cpumap is left as an exercise for a
future patch for now.

Since cpumap entry progs are now supported, also remove check in
generic_xdp_install for the cpumap.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20210702111825.491065-4-memxor@gmail.com
2021-07-07 20:01:45 -07:00
Kumar Kartikeya Dwivedi
fe21cb91ae net: core: Split out code to run generic XDP prog
This helper can later be utilized in code that runs cpumap and devmap
programs in generic redirect mode and adjust skb based on changes made
to xdp_buff.

When returning XDP_REDIRECT/XDP_TX, it invokes __skb_push, so whenever a
generic redirect path invokes devmap/cpumap prog if set, it must
__skb_pull again as we expect mac header to be pulled.

It also drops the skb_reset_mac_len call after do_xdp_generic, as the
mac_header and network_header are advanced by the same offset, so the
difference (mac_len) remains constant.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210702111825.491065-2-memxor@gmail.com
2021-07-07 20:01:45 -07:00
Paul Blakey
8550ff8d8c skbuff: Release nfct refcount on napi stolen or re-used skbs
When multiple SKBs are merged to a new skb under napi GRO,
or SKB is re-used by napi, if nfct was set for them in the
driver, it will not be released while freeing their stolen
head state or on re-use.

Release nfct on napi's stolen or re-used SKBs, and
in gro_list_prepare, check conntrack metadata diff.

Fixes: 5c6b946047 ("net/mlx5e: CT: Handle misses after executing CT action")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-06 10:26:29 -07:00
Linus Torvalds
dbe69e4337 Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improving performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsocketopt to retrieve netns cookie

   - ip: treat lowest address of a IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2021-06-30 15:51:09 -07:00
Jakub Kicinski
b6df00789e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c.

Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.

skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29 15:45:27 -07:00
Tanner Love
127d7355ab net: update netdev_rx_csum_fault() print dump only once
Printing this stack dump multiple times does not provide additional
useful information, and consumes time in the data path. Printing once
is sufficient.

Changes
  v2: Format indentation properly

Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:54:57 -07:00
Yunsheng Lin
c4fef01ba4 net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc
Currently pfifo_fast has both TCQ_F_CAN_BYPASS and TCQ_F_NOLOCK
flag set, but queue discipline by-pass does not work for lockless
qdisc because skb is always enqueued to qdisc even when the qdisc
is empty, see __dev_xmit_skb().

This patch calls sch_direct_xmit() to transmit the skb directly
to the driver for empty lockless qdisc, which aviod enqueuing
and dequeuing operation.

As qdisc->empty is not reliable to indicate a empty qdisc because
there is a time window between enqueuing and setting qdisc->empty.
So we use the MISSED state added in commit a90c57f2ce ("net:
sched: fix packet stuck problem for lockless qdisc"), which
indicate there is lock contention, suggesting that it is better
not to do the qdisc bypass in order to avoid packet out of order
problem.

In order to make MISSED state reliable to indicate a empty qdisc,
we need to ensure that testing and clearing of MISSED state is
within the protection of qdisc->seqlock, only setting MISSED state
can be done without the protection of qdisc->seqlock. A MISSED
state testing is added without the protection of qdisc->seqlock to
aviod doing unnecessary spin_trylock() for contention case.

As the enqueuing is not within the protection of qdisc->seqlock,
there is still a potential data race as mentioned by Jakub [1]:

      thread1               thread2             thread3
qdisc_run_begin() # true
                        qdisc_run_begin(q)
                             set(MISSED)
pfifo_fast_dequeue
  clear(MISSED)
  # recheck the queue
qdisc_run_end()
                            enqueue skb1
                                             qdisc empty # true
                                          qdisc_run_begin() # true
                                          sch_direct_xmit() # skb2
                         qdisc_run_begin()
                            set(MISSED)

When above happens, skb1 enqueued by thread2 is transmited after
skb2 is transmited by thread3 because MISSED state setting and
enqueuing is not under the qdisc->seqlock. If qdisc bypass is
disabled, skb1 has better chance to be transmited quicker than
skb2.

This patch does not take care of the above data race, because we
view this as similar as below:
Even at the same time CPU1 and CPU2 write the skb to two socket
which both heading to the same qdisc, there is no guarantee that
which skb will hit the qdisc first, because there is a lot of
factor like interrupt/softirq/cache miss/scheduling afffecting
that.

There are below cases that need special handling:
1. When MISSED state is cleared before another round of dequeuing
   in pfifo_fast_dequeue(), and __qdisc_run() might not be able to
   dequeue all skb in one round and call __netif_schedule(), which
   might result in a non-empty qdisc without MISSED set. In order
   to avoid this, the MISSED state is set for lockless qdisc and
   __netif_schedule() will be called at the end of qdisc_run_end.

2. The MISSED state also need to be set for lockless qdisc instead
   of calling __netif_schedule() directly when requeuing a skb for
   a similar reason.

3. For netdev queue stopped case, the MISSED case need clearing
   while the netdev queue is stopped, otherwise there may be
   unnecessary __netif_schedule() calling. So a new DRAINING state
   is added to indicate this case, which also indicate a non-empty
   qdisc.

4. As there is already netif_xmit_frozen_or_stopped() checking in
   dequeue_skb() and sch_direct_xmit(), which are both within the
   protection of qdisc->seqlock, but the same checking in
   __dev_xmit_skb() is without the protection, which might cause
   empty indication of a lockless qdisc to be not reliable. So
   remove the checking in __dev_xmit_skb(), and the checking in
   the protection of qdisc->seqlock seems enough to avoid the cpu
   consumption problem for netdev queue stopped case.

1. https://lkml.org/lkml/2021/5/29/215

Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # flexcan
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23 12:17:35 -07:00
Sebastian Andrzej Siewior
2b4cd14fd9 net/netif_receive_skb_core: Use migrate_disable()
The preempt disable around do_xdp_generic() has been introduced in
commit
   bbbe211c29 ("net: rcu lock and preempt disable missing around generic xdp")

For BPF it is enough to use migrate_disable() and the code was updated
as it can be seen in commit
   3c58482a38 ("bpf: Provide bpf_prog_run_pin_on_cpu() helper")

This is a leftover which was not converted.

Use migrate_disable() before invoking do_xdp_generic().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21 12:08:02 -07:00
Peter Zijlstra
2f064a59a1 sched: Change task_struct::state
Change the type and name of task_struct::state. Drop the volatile and
shrink it to an 'unsigned int'. Rename it in order to find all uses
such that we can use READ_ONCE/WRITE_ONCE as appropriate.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20210611082838.550736351@infradead.org
2021-06-18 11:43:09 +02:00
Jakub Kicinski
5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Yunsheng Lin
dcad9ee9e0 net: sched: fix tx action reschedule issue with stopped queue
The netdev qeueue might be stopped when byte queue limit has
reached or tx hw ring is full, net_tx_action() may still be
rescheduled if STATE_MISSED is set, which consumes unnecessary
cpu without dequeuing and transmiting any skb because the
netdev queue is stopped, see qdisc_run_end().

This patch fixes it by checking the netdev queue state before
calling qdisc_run() and clearing STATE_MISSED if netdev queue is
stopped during qdisc_run(), the net_tx_action() is rescheduled
again when netdev qeueue is restarted, see netif_tx_wake_queue().

As there is time window between netif_xmit_frozen_or_stopped()
checking and STATE_MISSED clearing, between which STATE_MISSED
may set by net_tx_action() scheduled by netif_tx_wake_queue(),
so set the STATE_MISSED again if netdev queue is restarted.

Fixes: 6b3ba9146f ("net: sched: allow qdiscs to handle locking")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:05:46 -07:00
Yunsheng Lin
102b55ee92 net: sched: fix tx action rescheduling issue during deactivation
Currently qdisc_run() checks the STATE_DEACTIVATED of lockless
qdisc before calling __qdisc_run(), which ultimately clear the
STATE_MISSED when all the skb is dequeued. If STATE_DEACTIVATED
is set before clearing STATE_MISSED, there may be rescheduling
of net_tx_action() at the end of qdisc_run_end(), see below:

CPU0(net_tx_atcion)  CPU1(__dev_xmit_skb)  CPU2(dev_deactivate)
          .                   .                     .
          .            set STATE_MISSED             .
          .           __netif_schedule()            .
          .                   .           set STATE_DEACTIVATED
          .                   .                qdisc_reset()
          .                   .                     .
          .<---------------   .              synchronize_net()
clear __QDISC_STATE_SCHED  |  .                     .
          .                |  .                     .
          .                |  .            some_qdisc_is_busy()
          .                |  .               return *false*
          .                |  .                     .
  test STATE_DEACTIVATED   |  .                     .
__qdisc_run() *not* called |  .                     .
          .                |  .                     .
   test STATE_MISS         |  .                     .
 __netif_schedule()--------|  .                     .
          .                   .                     .
          .                   .                     .

__qdisc_run() is not called by net_tx_atcion() in CPU0 because
CPU2 has set STATE_DEACTIVATED flag during dev_deactivate(), and
STATE_MISSED is only cleared in __qdisc_run(), __netif_schedule
is called at the end of qdisc_run_end(), causing tx action
rescheduling problem.

qdisc_run() called by net_tx_action() runs in the softirq context,
which should has the same semantic as the qdisc_run() called by
__dev_xmit_skb() protected by rcu_read_lock_bh(). And there is a
synchronize_net() between STATE_DEACTIVATED flag being set and
qdisc_reset()/some_qdisc_is_busy in dev_deactivate(), we can safely
bail out for the deactived lockless qdisc in net_tx_action(), and
qdisc_reset() will reset all skb not dequeued yet.

So add the rcu_read_lock() explicitly to protect the qdisc_run()
and do the STATE_DEACTIVATED checking in net_tx_action() before
calling qdisc_run_begin(). Another option is to do the checking in
the qdisc_run_end(), but it will add unnecessary overhead for
non-tx_action case, because __dev_queue_xmit() will not see qdisc
with STATE_DEACTIVATED after synchronize_net(), the qdisc with
STATE_DEACTIVATED can only be seen by net_tx_action() because of
__netif_schedule().

The STATE_DEACTIVATED checking in qdisc_run() is to avoid race
between net_tx_action() and qdisc_reset(), see:
commit d518d2ed86 ("net/sched: fix race between deactivation
and dequeue for NOLOCK qdisc"). As the bailout added above for
deactived lockless qdisc in net_tx_action() provides better
protection for the race without calling qdisc_run() at all, so
remove the STATE_DEACTIVATED checking in qdisc_run().

After qdisc_reset(), there is no skb in qdisc to be dequeued, so
clear the STATE_MISSED in dev_reset_queue() too.

Fixes: 6b3ba9146f ("net: sched: allow qdiscs to handle locking")
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
V8: Clearing STATE_MISSED before calling __netif_schedule() has
    avoid the endless rescheduling problem, but there may still
    be a unnecessary rescheduling, so adjust the commit log.
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14 15:05:46 -07:00