Commit Graph

852 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
962d1816e1 Merge 5.15.86 into android14-5.15
Changes in 5.15.86
	drm/amd/display: Manually adjust strobe for DCN303
	usb: musb: remove extra check in musb_gadget_vbus_draw
	arm64: dts: qcom: ipq6018-cp01-c1: use BLSPI1 pins
	arm64: dts: qcom: sm8250-sony-xperia-edo: fix touchscreen bias-disable
	arm64: dts: qcom: msm8996: Add MSM8996 Pro support
	arm64: dts: qcom: msm8996: fix supported-hw in cpufreq OPP tables
	arm64: dts: qcom: msm8996: fix GPU OPP table
	ARM: dts: qcom: apq8064: fix coresight compatible
	arm64: dts: qcom: sdm630: fix UART1 pin bias
	arm64: dts: qcom: sdm845-cheza: fix AP suspend pin bias
	arm64: dts: qcom: msm8916: Drop MSS fallback compatible
	objtool, kcsan: Add volatile read/write instrumentation to whitelist
	ARM: dts: stm32: Drop stm32mp15xc.dtsi from Avenger96
	ARM: dts: stm32: Fix AV96 WLAN regulator gpio property
	drivers: soc: ti: knav_qmss_queue: Mark knav_acc_firmwares as static
	arm64: dts: qcom: pm660: Use unique ADC5_VCOIN address in node name
	arm64: dts: qcom: sm8250: correct LPASS pin pull down
	soc: qcom: llcc: make irq truly optional
	arm64: dts: qcom: Correct QMP PHY child node name
	arm64: dts: qcom: sm8150: fix UFS PHY registers
	arm64: dts: qcom: sm8250: fix UFS PHY registers
	arm64: dts: qcom: sm8350: fix UFS PHY registers
	arm64: dts: qcom: sm8250: drop bogus DP PHY clock
	soc: qcom: apr: make code more reuseable
	soc: qcom: apr: Add check for idr_alloc and of_property_read_string_index
	arm64: dts: qcom: sm6125: fix SDHCI CQE reg names
	arm: dts: spear600: Fix clcd interrupt
	soc: ti: knav_qmss_queue: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync
	soc: ti: knav_qmss_queue: Fix PM disable depth imbalance in knav_queue_probe
	soc: ti: smartreflex: Fix PM disable depth imbalance in omap_sr_probe
	arm64: Treat ESR_ELx as a 64-bit register
	arm64: mm: kfence: only handle translation faults
	perf: arm_dsu: Fix hotplug callback leak in dsu_pmu_init()
	perf/arm_dmc620: Fix hotplug callback leak in dmc620_pmu_init()
	perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init()
	arm64: dts: ti: k3-am65-main: Drop dma-coherent in crypto node
	arm64: dts: ti: k3-j721e-main: Drop dma-coherent in crypto node
	ARM: dts: nuvoton: Remove bogus unit addresses from fixed-partition nodes
	arm64: dts: mt6779: Fix devicetree build warnings
	arm64: dts: mt2712e: Fix unit_address_vs_reg warning for oscillators
	arm64: dts: mt2712e: Fix unit address for pinctrl node
	arm64: dts: mt2712-evb: Fix vproc fixed regulators unit names
	arm64: dts: mt2712-evb: Fix usb vbus regulators unit names
	arm64: dts: mediatek: pumpkin-common: Fix devicetree warnings
	arm64: dts: mediatek: mt6797: Fix 26M oscillator unit name
	ARM: dts: dove: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: armada-370: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: armada-xp: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: armada-375: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: armada-38x: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: armada-39x: Fix assigned-addresses for every PCIe Root Port
	ARM: dts: turris-omnia: Add ethernet aliases
	ARM: dts: turris-omnia: Add switch port 6 node
	arm64: dts: armada-3720-turris-mox: Add missing interrupt for RTC
	seccomp: Move copy_seccomp() to no failure path.
	pstore/ram: Fix error return code in ramoops_probe()
	ARM: mmp: fix timer_read delay
	pstore: Avoid kcore oops by vmap()ing with VM_IOREMAP
	tpm/tpm_ftpm_tee: Fix error handling in ftpm_mod_init()
	tpm/tpm_crb: Fix error message in __crb_relinquish_locality()
	ovl: store lower path in ovl_inode
	ovl: use ovl_copy_{real,upper}attr() wrappers
	ovl: remove privs in ovl_copyfile()
	ovl: remove privs in ovl_fallocate()
	sched/fair: Cleanup task_util and capacity type
	sched/uclamp: Fix relationship between uclamp and migration margin
	sched/uclamp: Make task_fits_capacity() use util_fits_cpu()
	sched/uclamp: Make select_idle_capacity() use util_fits_cpu()
	sched/fair: Removed useless update of p->recent_used_cpu
	sched/core: Introduce sched_asym_cpucap_active()
	sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()
	cpuidle: dt: Return the correct numbers of parsed idle states
	alpha: fix TIF_NOTIFY_SIGNAL handling
	alpha: fix syscall entry in !AUDUT_SYSCALL case
	x86/sgx: Reduce delay and interference of enclave release
	PM: hibernate: Fix mistake in kerneldoc comment
	fs: don't audit the capability check in simple_xattr_list()
	cpufreq: qcom-hw: Fix memory leak in qcom_cpufreq_hw_read_lut()
	selftests/ftrace: event_triggers: wait longer for test_event_enable
	perf: Fix possible memleak in pmu_dev_alloc()
	lib/debugobjects: fix stat count and optimize debug_objects_mem_init
	platform/x86: huawei-wmi: fix return value calculation
	timerqueue: Use rb_entry_safe() in timerqueue_getnext()
	proc: fixup uptime selftest
	lib/fonts: fix undefined behavior in bit shift for get_default_font
	ocfs2: fix memory leak in ocfs2_stack_glue_init()
	MIPS: vpe-mt: fix possible memory leak while module exiting
	MIPS: vpe-cmp: fix possible memory leak while module exiting
	selftests/efivarfs: Add checking of the test return value
	PNP: fix name memory leak in pnp_alloc_dev()
	perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()
	perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox()
	perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map()
	perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box()
	platform/chrome: cros_usbpd_notify: Fix error handling in cros_usbpd_notify_init()
	thermal: core: fix some possible name leaks in error paths
	irqchip: gic-pm: Use pm_runtime_resume_and_get() in gic_probe()
	irqchip/wpcm450: Fix memory leak in wpcm450_aic_of_init()
	EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
	SUNRPC: Return true/false (not 1/0) from bool functions
	NFSD: Finish converting the NFSv2 GETACL result encoder
	nfsd: don't call nfsd_file_put from client states seqfile display
	genirq/irqdesc: Don't try to remove non-existing sysfs files
	cpufreq: amd_freq_sensitivity: Add missing pci_dev_put()
	libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value
	lib/notifier-error-inject: fix error when writing -errno to debugfs file
	debugfs: fix error when writing negative value to atomic_t debugfs file
	rapidio: fix possible name leaks when rio_add_device() fails
	rapidio: rio: fix possible name leak in rio_register_mport()
	clocksource/drivers/sh_cmt: Access registers according to spec
	mips: ralink: mt7621: define MT7621_SYSC_BASE with __iomem
	mips: ralink: mt7621: soc queries and tests as functions
	mips: ralink: mt7621: do not use kzalloc too early
	futex: Move to kernel/futex/
	futex: Resend potentially swallowed owner death notification
	cpu/hotplug: Make target_store() a nop when target == state
	cpu/hotplug: Do not bail-out in DYING/STARTING sections
	clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock()
	ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage()
	uprobes/x86: Allow to probe a NOP instruction with 0x66 prefix
	x86/xen: Fix memory leak in xen_smp_intr_init{_pv}()
	x86/xen: Fix memory leak in xen_init_lock_cpu()
	xen/privcmd: Fix a possible warning in privcmd_ioctl_mmap_resource()
	PM: runtime: Do not call __rpm_callback() from rpm_idle()
	platform/chrome: cros_ec_typec: Cleanup switch handle return paths
	platform/chrome: cros_ec_typec: zero out stale pointers
	platform/x86: mxm-wmi: fix memleak in mxm_wmi_call_mx[ds|mx]()
	platform/x86: intel_scu_ipc: fix possible name leak in __intel_scu_ipc_register()
	MIPS: BCM63xx: Add check for NULL for clk in clk_enable
	MIPS: OCTEON: warn only once if deprecated link status is being used
	lockd: set other missing fields when unlocking files
	fs: sysv: Fix sysv_nblocks() returns wrong value
	rapidio: fix possible UAF when kfifo_alloc() fails
	eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
	relay: fix type mismatch when allocating memory in relay_create_buf()
	hfs: Fix OOB Write in hfs_asc2mac
	rapidio: devices: fix missing put_device in mport_cdev_open
	platform/mellanox: mlxbf-pmc: Fix event typo
	wifi: ath9k: hif_usb: fix memory leak of urbs in ath9k_hif_usb_dealloc_tx_urbs()
	wifi: ath9k: hif_usb: Fix use-after-free in ath9k_hif_usb_reg_in_cb()
	wifi: rtl8xxxu: Fix reading the vendor of combo chips
	drm/bridge: adv7533: remove dynamic lane switching from adv7533 bridge
	libbpf: Fix use-after-free in btf_dump_name_dups
	libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
	ata: libata: move ata_{port,link,dev}_dbg to standard pr_XXX() macros
	ata: add/use ata_taskfile::{error|status} fields
	ata: libata: fix NCQ autosense logic
	ipmi: kcs: Poll OBF briefly to reduce OBE latency
	drm/amdgpu/powerplay/psm: Fix memory leak in power state init
	media: v4l2-ctrls: Fix off-by-one error in integer menu control check
	media: coda: jpeg: Add check for kmalloc
	media: adv748x: afe: Select input port when initializing AFE
	media: i2c: ad5820: Fix error path
	venus: pm_helpers: Fix error check in vcodec_domains_get()
	soreuseport: Fix socket selection for SO_INCOMING_CPU.
	media: exynos4-is: don't rely on the v4l2_async_subdev internals
	libbpf: Btf dedup identical struct test needs check for nested structs/arrays
	can: kvaser_usb: do not increase tx statistics when sending error message frames
	can: kvaser_usb: kvaser_usb_leaf: Get capabilities from device
	can: kvaser_usb: kvaser_usb_leaf: Rename {leaf,usbcan}_cmd_error_event to {leaf,usbcan}_cmd_can_error_event
	can: kvaser_usb: kvaser_usb_leaf: Handle CMD_ERROR_EVENT
	can: kvaser_usb_leaf: Set Warning state even without bus errors
	can: kvaser_usb: make use of units.h in assignment of frequency
	can: kvaser_usb_leaf: Fix improved state not being reported
	can: kvaser_usb_leaf: Fix wrong CAN state after stopping
	can: kvaser_usb_leaf: Fix bogus restart events
	can: kvaser_usb: Add struct kvaser_usb_busparams
	can: kvaser_usb: Compare requested bittiming parameters with actual parameters in do_set_{,data}_bittiming
	drm/rockchip: lvds: fix PM usage counter unbalance in poweron
	clk: renesas: r9a06g032: Repair grave increment error
	spi: Update reference to struct spi_controller
	drm/panel/panel-sitronix-st7701: Remove panel on DSI attach failure
	ima: Handle -ESTALE returned by ima_filter_rule_match()
	drm/msm/hdmi: drop unused GPIO support
	drm/msm/hdmi: use devres helper for runtime PM management
	bpf: Fix slot type check in check_stack_write_var_off
	media: vivid: fix compose size exceed boundary
	media: platform: exynos4-is: fix return value check in fimc_md_probe()
	bpf: propagate precision in ALU/ALU64 operations
	bpf: Check the other end of slot_type for STACK_SPILL
	bpf: propagate precision across all frames, not just the last one
	clk: qcom: gcc-sm8250: Use retention mode for USB GDSCs
	mtd: Fix device name leak when register device failed in add_mtd_device()
	Input: joystick - fix Kconfig warning for JOYSTICK_ADC
	wifi: rsi: Fix handling of 802.3 EAPOL frames sent via control port
	media: camss: Clean up received buffers on failed start of streaming
	net, proc: Provide PROC_FS=n fallback for proc_create_net_single_write()
	rxrpc: Fix ack.bufferSize to be 0 when generating an ack
	bfq: fix waker_bfqq inconsistency crash
	drm/radeon: Add the missed acpi_put_table() to fix memory leak
	drm/mediatek: Modify dpi power on/off sequence.
	ASoC: pxa: fix null-pointer dereference in filter()
	libbpf: Fix uninitialized warning in btf_dump_dump_type_data
	nvmet: only allocate a single slab for bvecs
	regulator: core: fix unbalanced of node refcount in regulator_dev_lookup()
	amdgpu/pm: prevent array underflow in vega20_odn_edit_dpm_table()
	nvme: return err on nvme_init_non_mdts_limits fail
	regulator: qcom-rpmh: Fix PMR735a S3 regulator spec
	drm/fourcc: Add packed 10bit YUV 4:2:0 format
	drm/fourcc: Fix vsub/hsub for Q410 and Q401
	integrity: Fix memory leakage in keyring allocation error path
	ima: Fix misuse of dereference of pointer in template_desc_init_fields()
	block: clear ->slave_dir when dropping the main slave_dir reference
	wifi: ath10k: Fix return value in ath10k_pci_init()
	drm/msm/a6xx: Fix speed-bin detection vs probe-defer
	mtd: lpddr2_nvm: Fix possible null-ptr-deref
	Input: elants_i2c - properly handle the reset GPIO when power is off
	media: vidtv: Fix use-after-free in vidtv_bridge_dvb_init()
	media: solo6x10: fix possible memory leak in solo_sysfs_init()
	media: platform: exynos4-is: Fix error handling in fimc_md_init()
	media: videobuf-dma-contig: use dma_mmap_coherent
	inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict()
	mtd: spi-nor: hide jedec_id sysfs attribute if not present
	mtd: spi-nor: Fix the number of bytes for the dummy cycles
	bpf: Move skb->len == 0 checks into __bpf_redirect
	HID: hid-sensor-custom: set fixed size for custom attributes
	pinctrl: k210: call of_node_put()
	ALSA: pcm: fix undefined behavior in bit shift for SNDRV_PCM_RATE_KNOT
	ALSA: seq: fix undefined behavior in bit shift for SNDRV_SEQ_FILTER_USE_EVENT
	regulator: core: use kfree_const() to free space conditionally
	clk: rockchip: Fix memory leak in rockchip_clk_register_pll()
	drm/amdgpu: fix pci device refcount leak
	bonding: fix link recovery in mode 2 when updelay is nonzero
	mtd: maps: pxa2xx-flash: fix memory leak in probe
	drbd: remove call to memset before free device/resource/connection
	drbd: destroy workqueue when drbd device was freed
	ASoC: qcom: Add checks for devm_kcalloc
	media: vimc: Fix wrong function called when vimc_init() fails
	media: imon: fix a race condition in send_packet()
	clk: imx8mn: rename vpu_pll to m7_alt_pll
	clk: imx: replace osc_hdmi with dummy
	clk: imx8mn: fix imx8mn_sai2_sels clocks list
	clk: imx8mn: fix imx8mn_enet_phy_sels clocks list
	pinctrl: pinconf-generic: add missing of_node_put()
	media: dvb-core: Fix ignored return value in dvb_register_frontend()
	media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()
	media: s5p-mfc: Add variant data for MFC v7 hardware for Exynos 3250 SoC
	drm/tegra: Add missing clk_disable_unprepare() in tegra_dc_probe()
	ASoC: dt-bindings: wcd9335: fix reset line polarity in example
	ASoC: mediatek: mtk-btcvsd: Add checks for write and read of mtk_btcvsd_snd
	NFSv4.2: Clear FATTR4_WORD2_SECURITY_LABEL when done decoding
	NFSv4.2: Fix a memory stomp in decode_attr_security_label
	NFSv4.2: Fix initialisation of struct nfs4_label
	NFSv4: Fix a credential leak in _nfs4_discover_trunking()
	NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturn
	NFS: Fix an Oops in nfs_d_automount()
	ALSA: asihpi: fix missing pci_disable_device()
	wifi: iwlwifi: mvm: fix double free on tx path.
	ASoC: mediatek: mt8173: Fix debugfs registration for components
	ASoC: mediatek: mt8173: Enable IRQ when pdata is ready
	drm/amd/pm/smu11: BACO is supported when it's in BACO state
	drm/radeon: Fix PCI device refcount leak in radeon_atrm_get_bios()
	drm/amdgpu: Fix PCI device refcount leak in amdgpu_atrm_get_bios()
	drm/amdkfd: Fix memory leakage
	ASoC: pcm512x: Fix PM disable depth imbalance in pcm512x_probe
	netfilter: conntrack: set icmpv6 redirects as RELATED
	Input: wistron_btns - disable on UML
	bpf, sockmap: Fix repeated calls to sock_put() when msg has more_data
	bpf, sockmap: Fix missing BPF_F_INGRESS flag when using apply_bytes
	bpf, sockmap: Fix data loss caused by using apply_bytes on ingress redirect
	bonding: uninitialized variable in bond_miimon_inspect()
	spi: spidev: mask SPI_CS_HIGH in SPI_IOC_RD_MODE
	wifi: mac80211: fix memory leak in ieee80211_if_add()
	wifi: cfg80211: Fix not unregister reg_pdev when load_builtin_regdb_keys() fails
	mt76: stop the radar detector after leaving dfs channel
	wifi: mt76: mt7921: fix reporting of TX AGGR histogram
	wifi: mt76: fix coverity overrun-call in mt76_get_txpower()
	regulator: core: fix module refcount leak in set_supply()
	clk: qcom: lpass-sc7180: Fix pm_runtime usage
	clk: qcom: clk-krait: fix wrong div2 functions
	hsr: Add a rcu-read lock to hsr_forward_skb().
	hsr: Avoid double remove of a node.
	hsr: Disable netpoll.
	hsr: Synchronize sending frames to have always incremented outgoing seq nr.
	hsr: Synchronize sequence number updates.
	configfs: fix possible memory leak in configfs_create_dir()
	regulator: core: fix resource leak in regulator_register()
	hwmon: (jc42) Convert register access and caching to regmap/regcache
	hwmon: (jc42) Restore the min/max/critical temperatures on resume
	bpf, sockmap: fix race in sock_map_free()
	ALSA: pcm: Set missing stop_operating flag at undoing trigger start
	media: saa7164: fix missing pci_disable_device()
	ALSA: mts64: fix possible null-ptr-defer in snd_mts64_interrupt
	xprtrdma: Fix regbuf data not freed in rpcrdma_req_create()
	SUNRPC: Fix missing release socket in rpc_sockname()
	NFSv4.x: Fail client initialisation if state manager thread can't run
	riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNC
	mmc: alcor: fix return value check of mmc_add_host()
	mmc: moxart: fix return value check of mmc_add_host()
	mmc: mxcmmc: fix return value check of mmc_add_host()
	mmc: pxamci: fix return value check of mmc_add_host()
	mmc: rtsx_pci: fix return value check of mmc_add_host()
	mmc: rtsx_usb_sdmmc: fix return value check of mmc_add_host()
	mmc: toshsd: fix return value check of mmc_add_host()
	mmc: vub300: fix return value check of mmc_add_host()
	mmc: wmt-sdmmc: fix return value check of mmc_add_host()
	mmc: atmel-mci: fix return value check of mmc_add_host()
	mmc: omap_hsmmc: fix return value check of mmc_add_host()
	mmc: meson-gx: fix return value check of mmc_add_host()
	mmc: via-sdmmc: fix return value check of mmc_add_host()
	mmc: wbsd: fix return value check of mmc_add_host()
	mmc: mmci: fix return value check of mmc_add_host()
	mmc: renesas_sdhi: alway populate SCC pointer
	memstick: ms_block: Add error handling support for add_disk()
	memstick/ms_block: Add check for alloc_ordered_workqueue
	mmc: core: Normalize the error handling branch in sd_read_ext_regs()
	regulator: qcom-labibb: Fix missing of_node_put() in qcom_labibb_regulator_probe()
	media: c8sectpfe: Add of_node_put() when breaking out of loop
	media: coda: Add check for dcoda_iram_alloc
	media: coda: Add check for kmalloc
	clk: samsung: Fix memory leak in _samsung_clk_register_pll()
	spi: spi-gpio: Don't set MOSI as an input if not 3WIRE mode
	wifi: rtl8xxxu: Add __packed to struct rtl8723bu_c2h
	wifi: rtl8xxxu: Fix the channel width reporting
	wifi: brcmfmac: Fix error return code in brcmf_sdio_download_firmware()
	blktrace: Fix output non-blktrace event when blk_classic option enabled
	bpf: Do not zero-extend kfunc return values
	clk: socfpga: Fix memory leak in socfpga_gate_init()
	net: vmw_vsock: vmci: Check memcpy_from_msg()
	net: defxx: Fix missing err handling in dfx_init()
	net: stmmac: selftests: fix potential memleak in stmmac_test_arpoffload()
	net: stmmac: fix possible memory leak in stmmac_dvr_probe()
	drivers: net: qlcnic: Fix potential memory leak in qlcnic_sriov_init()
	of: overlay: fix null pointer dereferencing in find_dup_cset_node_entry() and find_dup_cset_prop()
	ethernet: s2io: don't call dev_kfree_skb() under spin_lock_irqsave()
	net: farsync: Fix kmemleak when rmmods farsync
	net/tunnel: wait until all sk_user_data reader finish before releasing the sock
	net: apple: mace: don't call dev_kfree_skb() under spin_lock_irqsave()
	net: apple: bmac: don't call dev_kfree_skb() under spin_lock_irqsave()
	net: emaclite: don't call dev_kfree_skb() under spin_lock_irqsave()
	net: ethernet: dnet: don't call dev_kfree_skb() under spin_lock_irqsave()
	hamradio: don't call dev_kfree_skb() under spin_lock_irqsave()
	net: amd: lance: don't call dev_kfree_skb() under spin_lock_irqsave()
	af_unix: call proto_unregister() in the error path in af_unix_init()
	net: amd-xgbe: Fix logic around active and passive cables
	net: amd-xgbe: Check only the minimum speed for active/passive cables
	can: tcan4x5x: Remove invalid write in clear_interrupts
	can: m_can: Call the RAM init directly from m_can_chip_config
	can: tcan4x5x: Fix use of register error status mask
	net: lan9303: Fix read error execution path
	ntb_netdev: Use dev_kfree_skb_any() in interrupt context
	sctp: sysctl: make extra pointers netns aware
	Bluetooth: MGMT: Fix error report for ADD_EXT_ADV_PARAMS
	Bluetooth: btintel: Fix missing free skb in btintel_setup_combined()
	Bluetooth: btusb: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: hci_qca: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: hci_ll: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: hci_h5: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: hci_bcsp: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave()
	Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave()
	stmmac: fix potential division by 0
	i40e: Fix the inability to attach XDP program on downed interface
	net: dsa: tag_8021q: avoid leaking ctx on dsa_tag_8021q_register() error path
	apparmor: fix a memleak in multi_transaction_new()
	apparmor: fix lockdep warning when removing a namespace
	apparmor: Fix abi check to include v8 abi
	crypto: hisilicon/qm - fix missing destroy qp_idr
	crypto: sun8i-ss - use dma_addr instead u32
	crypto: nitrox - avoid double free on error path in nitrox_sriov_init()
	scsi: core: Fix a race between scsi_done() and scsi_timeout()
	apparmor: Use pointer to struct aa_label for lbs_cred
	PCI: dwc: Fix n_fts[] array overrun
	RDMA/core: Fix order of nldev_exit call
	PCI: pci-epf-test: Register notifier if only core_init_notifier is enabled
	f2fs: Fix the race condition of resize flag between resizefs
	crypto: rockchip - do not do custom power management
	crypto: rockchip - do not store mode globally
	crypto: rockchip - add fallback for cipher
	crypto: rockchip - add fallback for ahash
	crypto: rockchip - better handle cipher key
	crypto: rockchip - remove non-aligned handling
	crypto: rockchip - rework by using crypto_engine
	apparmor: Fix memleak in alloc_ns()
	f2fs: fix to invalidate dcc->f2fs_issue_discard in error path
	f2fs: fix normal discard process
	f2fs: fix to destroy sbi->post_read_wq in error path of f2fs_fill_super()
	RDMA/irdma: Report the correct link speed
	scsi: qla2xxx: Fix set-but-not-used variable warnings
	RDMA/siw: Fix immediate work request flush to completion queue
	IB/mad: Don't call to function that might sleep while in atomic context
	PCI: vmd: Disable MSI remapping after suspend
	RDMA/restrack: Release MR restrack when delete
	RDMA/core: Make sure "ib_port" is valid when access sysfs node
	RDMA/nldev: Return "-EAGAIN" if the cm_id isn't from expected port
	RDMA/siw: Set defined status for work completion with undefined status
	scsi: scsi_debug: Fix a warning in resp_write_scat()
	crypto: ccree - Remove debugfs when platform_driver_register failed
	crypto: cryptd - Use request context instead of stack for sub-request
	crypto: hisilicon/qm - add missing pci_dev_put() in q_num_set()
	RDMA/hns: Repacing 'dseg_len' by macros in fill_ext_sge_inl_data()
	RDMA/hns: Fix ext_sge num error when post send
	PCI: Check for alloc failure in pci_request_irq()
	RDMA/hfi: Decrease PCI device reference count in error path
	crypto: ccree - Make cc_debugfs_global_fini() available for module init function
	RDMA/hns: fix memory leak in hns_roce_alloc_mr()
	RDMA/rxe: Fix NULL-ptr-deref in rxe_qp_do_cleanup() when socket create failed
	dt-bindings: imx6q-pcie: Fix clock names for imx6sx and imx8mq
	dt-bindings: visconti-pcie: Fix interrupts array max constraints
	scsi: hpsa: Fix possible memory leak in hpsa_init_one()
	crypto: tcrypt - Fix multibuffer skcipher speed test mem leak
	padata: Always leave BHs disabled when running ->parallel()
	padata: Fix list iterator in padata_do_serial()
	scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add()
	scsi: hpsa: Fix error handling in hpsa_add_sas_host()
	scsi: hpsa: Fix possible memory leak in hpsa_add_sas_device()
	scsi: efct: Fix possible memleak in efct_device_init()
	scsi: scsi_debug: Fix a warning in resp_verify()
	scsi: scsi_debug: Fix a warning in resp_report_zones()
	scsi: fcoe: Fix possible name leak when device_register() fails
	scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper()
	scsi: ipr: Fix WARNING in ipr_init()
	scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails
	scsi: snic: Fix possible UAF in snic_tgt_create()
	RDMA/nldev: Add checks for nla_nest_start() in fill_stat_counter_qps()
	f2fs: avoid victim selection from previous victim section
	RDMA/nldev: Fix failure to send large messages
	crypto: amlogic - Remove kcalloc without check
	crypto: omap-sham - Use pm_runtime_resume_and_get() in omap_sham_probe()
	riscv/mm: add arch hook arch_clear_hugepage_flags
	RDMA/hfi1: Fix error return code in parse_platform_config()
	RDMA/srp: Fix error return code in srp_parse_options()
	PCI: mt7621: Rename mt7621_pci_ to mt7621_pcie_
	PCI: mt7621: Add sentinel to quirks table
	orangefs: Fix sysfs not cleanup when dev init failed
	RDMA/hns: Fix AH attr queried by query_qp
	RDMA/hns: Fix PBL page MTR find
	RDMA/hns: Fix page size cap from firmware
	RDMA/hns: Fix error code of CMD
	crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
	hwrng: amd - Fix PCI device refcount leak
	hwrng: geode - Fix PCI device refcount leak
	IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces
	RISC-V: Align the shadow stack
	drivers: dio: fix possible memory leak in dio_init()
	serial: tegra: Read DMA status before terminating
	serial: 8250_bcm7271: Fix error handling in brcmuart_init()
	class: fix possible memory leak in __class_register()
	vfio: platform: Do not pass return buffer to ACPI _RST method
	uio: uio_dmem_genirq: Fix missing unlock in irq configuration
	uio: uio_dmem_genirq: Fix deadlock between irq config and handling
	usb: fotg210-udc: Fix ages old endianness issues
	staging: vme_user: Fix possible UAF in tsi148_dma_list_add
	usb: typec: Check for ops->exit instead of ops->enter in altmode_exit
	usb: typec: tcpci: fix of node refcount leak in tcpci_register_port()
	usb: typec: tipd: Cleanup resources if devm_tps6598_psy_register fails
	usb: typec: tipd: Fix spurious fwnode_handle_put in error path
	extcon: usbc-tusb320: Add support for mode setting and reset
	extcon: usbc-tusb320: Add support for TUSB320L
	usb: typec: Factor out non-PD fwnode properties
	extcon: usbc-tusb320: Factor out extcon into dedicated functions
	extcon: usbc-tusb320: Add USB TYPE-C support
	extcon: usbc-tusb320: Update state on probe even if no IRQ pending
	serial: amba-pl011: avoid SBSA UART accessing DMACR register
	serial: pl011: Do not clear RX FIFO & RX interrupt in unthrottle.
	serial: stm32: move dma_request_chan() before clk_prepare_enable()
	serial: pch: Fix PCI device refcount leak in pch_request_dma()
	tty: serial: clean up stop-tx part in altera_uart_tx_chars()
	tty: serial: altera_uart_{r,t}x_chars() need only uart_port
	serial: altera_uart: fix locking in polling mode
	serial: sunsab: Fix error handling in sunsab_init()
	test_firmware: fix memory leak in test_firmware_init()
	misc: ocxl: fix possible name leak in ocxl_file_register_afu()
	ocxl: fix pci device refcount leak when calling get_function_0()
	misc: tifm: fix possible memory leak in tifm_7xx1_switch_media()
	misc: sgi-gru: fix use-after-free error in gru_set_context_option, gru_fault and gru_handle_user_call_os
	firmware: raspberrypi: fix possible memory leak in rpi_firmware_probe()
	cxl: fix possible null-ptr-deref in cxl_guest_init_afu|adapter()
	cxl: fix possible null-ptr-deref in cxl_pci_init_afu|adapter()
	iio: temperature: ltc2983: make bulk write buffer DMA-safe
	iio: adis: handle devices that cannot unmask the drdy pin
	iio: adis: stylistic changes
	iio:imu:adis: Move exports into IIO_ADISLIB namespace
	iio: adis: add '__adis_enable_irq()' implementation
	counter: stm32-lptimer-cnt: fix the check on arr and cmp registers update
	coresight: trbe: remove cpuhp instance node before remove cpuhp state
	usb: roles: fix of node refcount leak in usb_role_switch_is_parent()
	usb: gadget: f_hid: fix f_hidg lifetime vs cdev
	usb: gadget: f_hid: fix refcount leak on error path
	drivers: mcb: fix resource leak in mcb_probe()
	mcb: mcb-parse: fix error handing in chameleon_parse_gdd()
	chardev: fix error handling in cdev_device_add()
	i2c: pxa-pci: fix missing pci_disable_device() on error in ce4100_i2c_probe
	staging: rtl8192u: Fix use after free in ieee80211_rx()
	staging: rtl8192e: Fix potential use-after-free in rtllib_rx_Monitor()
	vme: Fix error not catched in fake_init()
	gpiolib: Get rid of redundant 'else'
	gpiolib: cdev: fix NULL-pointer dereferences
	gpiolib: make struct comments into real kernel docs
	gpiolib: protect the GPIO device against being dropped while in use by user-space
	i2c: mux: reg: check return value after calling platform_get_resource()
	i2c: ismt: Fix an out-of-bounds bug in ismt_access()
	usb: storage: Add check for kcalloc
	tracing/hist: Fix issue of losting command info in error_log
	ksmbd: Fix resource leak in ksmbd_session_rpc_open()
	samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe()
	thermal/drivers/imx8mm_thermal: Validate temperature range
	thermal/drivers/qcom/temp-alarm: Fix inaccurate warning for gen2
	thermal/drivers/qcom/lmh: Fix irq handler return value
	fbdev: ssd1307fb: Drop optional dependency
	fbdev: pm2fb: fix missing pci_disable_device()
	fbdev: via: Fix error in via_core_init()
	fbdev: vermilion: decrease reference count in error path
	fbdev: ep93xx-fb: Add missing clk_disable_unprepare in ep93xxfb_probe()
	fbdev: geode: don't build on UML
	fbdev: uvesafb: don't build on UML
	fbdev: uvesafb: Fixes an error handling path in uvesafb_probe()
	HSI: omap_ssi_core: fix unbalanced pm_runtime_disable()
	HSI: omap_ssi_core: fix possible memory leak in ssi_probe()
	power: supply: fix residue sysfs file in error handle route of __power_supply_register()
	perf trace: Return error if a system call doesn't exist
	perf trace: Use macro RAW_SYSCALL_ARGS_NUM to replace number
	perf trace: Handle failure when trace point folder is missed
	perf symbol: correction while adjusting symbol
	power: supply: z2_battery: Fix possible memleak in z2_batt_probe()
	HSI: omap_ssi_core: Fix error handling in ssi_init()
	power: supply: ab8500: Fix error handling in ab8500_charger_init()
	power: supply: fix null pointer dereferencing in power_supply_get_battery_info
	perf stat: Refactor __run_perf_stat() common code
	perf stat: Do not delay the workload with --delay
	RDMA/siw: Fix pointer cast warning
	fs/ntfs3: Avoid UBSAN error on true_sectors_per_clst()
	overflow: Implement size_t saturating arithmetic helpers
	fs/ntfs3: Harden against integer overflows
	iommu/sun50i: Fix reset release
	iommu/sun50i: Consider all fault sources for reset
	iommu/sun50i: Fix R/W permission check
	iommu/sun50i: Fix flush size
	iommu/rockchip: fix permission bits in page table entries v2
	phy: usb: s2 WoL wakeup_count not incremented for USB->Eth devices
	include/uapi/linux/swab: Fix potentially missing __always_inline
	pwm: tegra: Improve required rate calculation
	fs/ntfs3: Fix slab-out-of-bounds read in ntfs_trim_fs
	dmaengine: idxd: Fix crc_val field for completion record
	rtc: rtc-cmos: Do not check ACPI_FADT_LOW_POWER_S0
	rtc: cmos: Fix event handler registration ordering issue
	rtc: cmos: Fix wake alarm breakage
	rtc: cmos: fix build on non-ACPI platforms
	rtc: cmos: Call cmos_wake_setup() from cmos_do_probe()
	rtc: cmos: Call rtc_wake_setup() from cmos_do_probe()
	rtc: cmos: Eliminate forward declarations of some functions
	rtc: cmos: Rename ACPI-related functions
	rtc: cmos: Disable ACPI RTC event on removal
	rtc: snvs: Allow a time difference on clock register read
	rtc: pcf85063: Fix reading alarm
	iommu/amd: Fix pci device refcount leak in ppr_notifier()
	iommu/fsl_pamu: Fix resource leak in fsl_pamu_probe()
	macintosh: fix possible memory leak in macio_add_one_device()
	macintosh/macio-adb: check the return value of ioremap()
	powerpc/52xx: Fix a resource leak in an error handling path
	cxl: Fix refcount leak in cxl_calc_capp_routing
	powerpc/xmon: Fix -Wswitch-unreachable warning in bpt_cmds
	powerpc/xive: add missing iounmap() in error path in xive_spapr_populate_irq_data()
	powerpc/perf: callchain validate kernel stack pointer bounds
	powerpc/83xx/mpc832x_rdb: call platform_device_put() in error case in of_fsl_spi_probe()
	powerpc/hv-gpci: Fix hv_gpci event list
	selftests/powerpc: Fix resource leaks
	iommu/sun50i: Remove IOMMU_DOMAIN_IDENTITY
	pwm: sifive: Call pwm_sifive_update_clock() while mutex is held
	pwm: mtk-disp: Fix the parameters calculated by the enabled flag of disp_pwm
	pwm: mediatek: always use bus clock for PWM on MT7622
	remoteproc: sysmon: fix memory leak in qcom_add_sysmon_subdev()
	remoteproc: qcom: q6v5: Fix potential null-ptr-deref in q6v5_wcss_init_mmio()
	remoteproc: qcom_q6v5_pas: disable wakeup on probe fail or remove
	remoteproc: qcom_q6v5_pas: detach power domains on remove
	remoteproc: qcom_q6v5_pas: Fix missing of_node_put() in adsp_alloc_memory_region()
	remoteproc: qcom: q6v5: Fix missing clk_disable_unprepare() in q6v5_wcss_qcs404_power_on()
	powerpc/eeh: Drop redundant spinlock initialization
	powerpc/pseries/eeh: use correct API for error log size
	mfd: bd957x: Fix Kconfig dependency on REGMAP_IRQ
	mfd: qcom_rpm: Fix an error handling path in qcom_rpm_probe()
	mfd: pm8008: Remove driver data structure pm8008_data
	mfd: pm8008: Fix return value check in pm8008_probe()
	netfilter: flowtable: really fix NAT IPv6 offload
	rtc: st-lpc: Add missing clk_disable_unprepare in st_rtc_probe()
	rtc: pic32: Move devm_rtc_allocate_device earlier in pic32_rtc_probe()
	rtc: pcf85063: fix pcf85063_clkout_control
	nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
	net: macsec: fix net device access prior to holding a lock
	mISDN: hfcsusb: don't call dev_kfree_skb/kfree_skb() under spin_lock_irqsave()
	mISDN: hfcpci: don't call dev_kfree_skb/kfree_skb() under spin_lock_irqsave()
	mISDN: hfcmulti: don't call dev_kfree_skb/kfree_skb() under spin_lock_irqsave()
	block, bfq: fix possible uaf for 'bfqq->bic'
	selftests/bpf: Add test for unstable CT lookup API
	net: enetc: avoid buffer leaks on xdp_do_redirect() failure
	nfc: pn533: Clear nfc_target before being used
	unix: Fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
	r6040: Fix kmemleak in probe and remove
	igc: Enhance Qbv scheduling by using first flag bit
	igc: Use strict cycles for Qbv scheduling
	igc: Add checking for basetime less than zero
	igc: allow BaseTime 0 enrollment for Qbv
	igc: recalculate Qbv end_time by considering cycle time
	igc: Lift TAPRIO schedule restriction
	igc: Set Qbv start_time and end_time to end_time if not being configured in GCL
	rtc: mxc_v2: Add missing clk_disable_unprepare()
	selftests: devlink: fix the fd redirect in dummy_reporter_test
	openvswitch: Fix flow lookup to use unmasked key
	soc: mediatek: pm-domains: Fix the power glitch issue
	arm64: dts: mt8183: Fix Mali GPU clock
	skbuff: Account for tail adjustment during pull operations
	mailbox: mpfs: read the system controller's status
	mailbox: arm_mhuv2: Fix return value check in mhuv2_probe()
	mailbox: zynq-ipi: fix error handling while device_register() fails
	net_sched: reject TCF_EM_SIMPLE case for complex ematch module
	rxrpc: Fix missing unlock in rxrpc_do_sendmsg()
	myri10ge: Fix an error handling path in myri10ge_probe()
	net: stream: purge sk_error_queue in sk_stream_kill_queues()
	HID: amd_sfh: Add missing check for dma_alloc_coherent
	rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state()
	arm64: make is_ttbrX_addr() noinstr-safe
	video: hyperv_fb: Avoid taking busy spinlock on panic path
	x86/hyperv: Remove unregister syscore call from Hyper-V cleanup
	binfmt_misc: fix shift-out-of-bounds in check_special_flags
	fs: jfs: fix shift-out-of-bounds in dbAllocAG
	udf: Avoid double brelse() in udf_rename()
	jfs: Fix fortify moan in symlink
	fs: jfs: fix shift-out-of-bounds in dbDiscardAG
	ACPICA: Fix error code path in acpi_ds_call_control_method()
	nilfs2: fix shift-out-of-bounds/overflow in nilfs_sb2_bad_offset()
	nilfs2: fix shift-out-of-bounds due to too large exponent of block size
	acct: fix potential integer overflow in encode_comp_t()
	hfs: fix OOB Read in __hfs_brec_find
	drm/etnaviv: add missing quirks for GC300
	media: imx-jpeg: Disable useless interrupt to avoid kernel panic
	brcmfmac: return error when getting invalid max_flowrings from dongle
	wifi: ath9k: verify the expected usb_endpoints are present
	wifi: ar5523: Fix use-after-free on ar5523_cmd() timed out
	ASoC: codecs: rt298: Add quirk for KBL-R RVP platform
	ipmi: fix memleak when unload ipmi driver
	drm/amd/display: prevent memory leak
	Revert "drm/amd/display: Limit max DSC target bpp for specific monitors"
	qed (gcc13): use u16 for fid to be big enough
	bpf: make sure skb->len != 0 when redirecting to a tunneling device
	net: ethernet: ti: Fix return type of netcp_ndo_start_xmit()
	hamradio: baycom_epp: Fix return type of baycom_send_packet()
	wifi: brcmfmac: Fix potential shift-out-of-bounds in brcmf_fw_alloc_request()
	igb: Do not free q_vector unless new one was allocated
	drm/amdgpu: Fix type of second parameter in trans_msg() callback
	drm/amdgpu: Fix type of second parameter in odn_edit_dpm_table() callback
	s390/ctcm: Fix return type of ctc{mp,}m_tx()
	s390/netiucv: Fix return type of netiucv_tx()
	s390/lcs: Fix return type of lcs_start_xmit()
	drm/msm: Use drm_mode_copy()
	drm/rockchip: Use drm_mode_copy()
	drm/sti: Use drm_mode_copy()
	drm/mediatek: Fix return type of mtk_hdmi_bridge_mode_valid()
	drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()
	md/raid1: stop mdx_raid1 thread when raid1 array run failed
	drm/amd/display: fix array index out of bound error in bios parser
	net: add atomic_long_t to net_device_stats fields
	ipv6/sit: use DEV_STATS_INC() to avoid data-races
	mrp: introduce active flags to prevent UAF when applicant uninit
	ppp: associate skb with a device at tx
	bpf: Prevent decl_tag from being referenced in func_proto arg
	ethtool: avoiding integer overflow in ethtool_phys_id()
	media: dvb-frontends: fix leak of memory fw
	media: dvbdev: adopts refcnt to avoid UAF
	media: dvb-usb: fix memory leak in dvb_usb_adapter_init()
	blk-mq: fix possible memleak when register 'hctx' failed
	drm/amd/display: Use the largest vready_offset in pipe group
	libbpf: Avoid enum forward-declarations in public API in C++ mode
	regulator: core: fix use_count leakage when handling boot-on
	wifi: mt76: do not run mt76u_status_worker if the device is not running
	mmc: f-sdh30: Add quirks for broken timeout clock capability
	mmc: renesas_sdhi: better reset from HS400 mode
	media: si470x: Fix use-after-free in si470x_int_in_callback()
	clk: st: Fix memory leak in st_of_quadfs_setup()
	crypto: hisilicon/hpre - fix resource leak in remove process
	scsi: lpfc: Fix hard lockup when reading the rx_monitor from debugfs
	scsi: ufs: Reduce the START STOP UNIT timeout
	scsi: elx: libefc: Fix second parameter type in state callbacks
	hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param()
	drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
	drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
	orangefs: Fix kmemleak in orangefs_prepare_debugfs_help_string()
	orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init()
	tools/include: Add _RET_IP_ and math definitions to kernel.h
	KVM: selftests: Fix build regression by using accessor function
	hwmon: (jc42) Fix missing unlock on error in jc42_write()
	ALSA/ASoC: hda: move/rename snd_hdac_ext_stop_streams to hdac_stream.c
	ALSA: hda: add snd_hdac_stop_streams() helper
	ASoC: Intel: Skylake: Fix driver hang during shutdown
	ASoC: mediatek: mt8173-rt5650-rt5514: fix refcount leak in mt8173_rt5650_rt5514_dev_probe()
	ASoC: audio-graph-card: fix refcount leak of cpu_ep in __graph_for_each_link()
	ASoC: rockchip: pdm: Add missing clk_disable_unprepare() in rockchip_pdm_runtime_resume()
	ASoC: mediatek: mt8183: fix refcount leak in mt8183_mt6358_ts3a227_max98357_dev_probe()
	ASoC: wm8994: Fix potential deadlock
	ASoC: rockchip: spdif: Add missing clk_disable_unprepare() in rk_spdif_runtime_resume()
	ASoC: rt5670: Remove unbalanced pm_runtime_put()
	drm/i915/display: Don't disable DDI/Transcoder when setting phy test pattern
	LoadPin: Ignore the "contents" argument of the LSM hooks
	pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion
	perf debug: Set debug_peo_args and redirect_to_stderr variable to correct values in perf_quiet_option()
	afs: Fix lost servers_outstanding count
	pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES
	ima: Simplify ima_lsm_copy_rule
	ALSA: usb-audio: add the quirk for KT0206 device
	ALSA: hda/realtek: Add quirk for Lenovo TianYi510Pro-14IOB
	ALSA: hda/hdmi: Add HP Device 0x8711 to force connect list
	usb: cdnsp: fix lack of ZLP for ep0
	usb: xhci-mtk: fix leakage of shared hcd when fail to set wakeup irq
	arm64: dts: qcom: sm8250: fix USB-DP PHY registers
	usb: dwc3: Fix race between dwc3_set_mode and __dwc3_set_mode
	usb: dwc3: core: defer probe on ulpi_read_id timeout
	xhci: Prevent infinite loop in transaction errors recovery for streams
	HID: wacom: Ensure bootloader PID is usable in hidraw mode
	HID: mcp2221: don't connect hidraw
	loop: Fix the max_loop commandline argument treatment when it is set to 0
	9p: set req refcount to zero to avoid uninitialized usage
	security: Restrict CONFIG_ZERO_CALL_USED_REGS to gcc or clang > 15.0.6
	reiserfs: Add missing calls to reiserfs_security_free()
	iio: fix memory leak in iio_device_register_eventset()
	iio: adc: ad_sigma_delta: do not use internal iio_dev lock
	iio: adc128s052: add proper .data members in adc128_of_match table
	regulator: core: fix deadlock on regulator enable
	floppy: Fix memory leak in do_floppy_init()
	gcov: add support for checksum field
	fbdev: fbcon: release buffer when fbcon_do_set_font() failed
	ovl: fix use inode directly in rcu-walk mode
	btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
	scsi: qla2xxx: Fix crash when I/O abort times out
	net: stmmac: fix errno when create_singlethread_workqueue() fails
	media: dvbdev: fix build warning due to comments
	media: dvbdev: fix refcnt bug
	extcon: usbc-tusb320: Call the Type-C IRQ handler only if a port is registered
	mfd: qcom_rpm: Use devm_of_platform_populate() to simplify code
	pwm: tegra: Fix 32 bit build
	Linux 5.15.86

Change-Id: Ic157edd6a65abf4a3167b5d227edeb0564f1be4e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-01-18 12:52:16 +00:00
Zqiang
98a5b1265a rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state()
[ Upstream commit ceb1c8c9b8aa9199da46a0f29d2d5f08d9b44c15 ]

Running rcutorture with non-zero fqs_duration module parameter in a
kernel built with CONFIG_PREEMPTION=y results in the following splat:

BUG: using __this_cpu_read() in preemptible [00000000]
code: rcu_torture_fqs/398
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 398 Comm: rcu_torture_fqs Not tainted 6.0.0-rc1-yoctodev-standard+
Call Trace:
<TASK>
dump_stack_lvl+0x5b/0x86
dump_stack+0x10/0x16
check_preemption_disabled+0xe5/0xf0
__this_cpu_preempt_check+0x13/0x20
rcu_force_quiescent_state.part.0+0x1c/0x170
rcu_force_quiescent_state+0x1e/0x30
rcu_torture_fqs+0xca/0x160
? rcu_torture_boost+0x430/0x430
kthread+0x192/0x1d0
? kthread_complete_and_exit+0x30/0x30
ret_from_fork+0x22/0x30
</TASK>

The problem is that rcu_force_quiescent_state() uses __this_cpu_read()
in preemptible code instead of the proper raw_cpu_read().  This commit
therefore changes __this_cpu_read() to raw_cpu_read().

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31 13:14:39 +01:00
Greg Kroah-Hartman
335a32d033 Merge 5.15.75 into android14-5.15
Changes in 5.15.75
	Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
	ALSA: oss: Fix potential deadlock at unregistration
	ALSA: rawmidi: Drop register_mutex in snd_rawmidi_free()
	ALSA: usb-audio: Fix potential memory leaks
	ALSA: usb-audio: Fix NULL dererence at error path
	ALSA: hda/realtek: remove ALC289_FIXUP_DUAL_SPK for Dell 5530
	ALSA: hda/realtek: Correct pin configs for ASUS G533Z
	ALSA: hda/realtek: Add quirk for ASUS GV601R laptop
	ALSA: hda/realtek: Add Intel Reference SSID to support headset keys
	mtd: rawnand: atmel: Unmap streaming DMA mappings
	io_uring/net: don't update msg_name if not provided
	hv_netvsc: Fix race between VF offering and VF association message from host
	cifs: destage dirty pages before re-reading them for cache=none
	cifs: Fix the error length of VALIDATE_NEGOTIATE_INFO message
	iio: dac: ad5593r: Fix i2c read protocol requirements
	iio: ltc2497: Fix reading conversion results
	iio: adc: ad7923: fix channel readings for some variants
	iio: pressure: dps310: Refactor startup procedure
	iio: pressure: dps310: Reset chip after timeout
	xhci: dbc: Fix memory leak in xhci_alloc_dbc()
	usb: add quirks for Lenovo OneLink+ Dock
	can: kvaser_usb: Fix use of uninitialized completion
	can: kvaser_usb_leaf: Fix overread with an invalid command
	can: kvaser_usb_leaf: Fix TX queue out of sync after restart
	can: kvaser_usb_leaf: Fix CAN state after restart
	mmc: sdhci-sprd: Fix minimum clock limit
	i2c: designware: Fix handling of real but unexpected device interrupts
	fs: dlm: fix race between test_bit() and queue_work()
	fs: dlm: handle -EBUSY first in lock arg validation
	HID: multitouch: Add memory barriers
	quota: Check next/prev free block number after reading from quota file
	platform/chrome: cros_ec_proto: Update version on GET_NEXT_EVENT failure
	ASoC: wcd9335: fix order of Slimbus unprepare/disable
	ASoC: wcd934x: fix order of Slimbus unprepare/disable
	hwmon: (gsc-hwmon) Call of_node_get() before of_find_xxx API
	net: thunderbolt: Enable DMA paths only after rings are enabled
	regulator: qcom_rpm: Fix circular deferral regression
	arm64: topology: move store_cpu_topology() to shared code
	riscv: topology: fix default topology reporting
	RISC-V: Make port I/O string accessors actually work
	parisc: fbdev/stifb: Align graphics memory size to 4MB
	riscv: Allow PROT_WRITE-only mmap()
	riscv: Make VM_WRITE imply VM_READ
	riscv: always honor the CONFIG_CMDLINE_FORCE when parsing dtb
	riscv: Pass -mno-relax only on lld < 15.0.0
	UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
	nvmem: core: Fix memleak in nvmem_register()
	nvme-multipath: fix possible hang in live ns resize with ANA access
	nvme-pci: set min_align_mask before calculating max_hw_sectors
	Revert "drm/amdgpu: use dirty framebuffer helper"
	dmaengine: mxs: use platform_driver_register
	drm/virtio: Check whether transferred 2D BO is shmem
	drm/virtio: Unlock reservations on virtio_gpu_object_shmem_init() error
	drm/virtio: Use appropriate atomic state in virtio_gpu_plane_cleanup_fb()
	drm/udl: Restore display mode on resume
	arm64: errata: Add Cortex-A55 to the repeat tlbi list
	mm/damon: validate if the pmd entry is present before accessing
	mm/mmap: undo ->mmap() when arch_validate_flags() fails
	xen/gntdev: Prevent leaking grants
	xen/gntdev: Accommodate VMA splitting
	PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge
	serial: 8250: Let drivers request full 16550A feature probing
	serial: 8250: Request full 16550A feature probing for OxSemi PCIe devices
	NFSD: Protect against send buffer overflow in NFSv3 READDIR
	NFSD: Protect against send buffer overflow in NFSv2 READ
	NFSD: Protect against send buffer overflow in NFSv3 READ
	powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
	powerpc/boot: Explicitly disable usage of SPE instructions
	slimbus: qcom-ngd: use correct error in message of pdr_add_lookup() failure
	slimbus: qcom-ngd: cleanup in probe error path
	scsi: qedf: Populate sysfs attributes for vport
	gpio: rockchip: request GPIO mux to pinctrl when setting direction
	pinctrl: rockchip: add pinmux_ops.gpio_set_direction callback
	fbdev: smscufx: Fix use-after-free in ufx_ops_open()
	ksmbd: fix endless loop when encryption for response fails
	ksmbd: Fix wrong return value and message length check in smb2_ioctl()
	ksmbd: Fix user namespace mapping
	fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE
	btrfs: fix race between quota enable and quota rescan ioctl
	btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
	f2fs: complete checkpoints during remount
	f2fs: flush pending checkpoints when freezing super
	f2fs: increase the limit for reserve_root
	f2fs: fix to do sanity check on destination blkaddr during recovery
	f2fs: fix to do sanity check on summary info
	hardening: Avoid harmless Clang option under CONFIG_INIT_STACK_ALL_ZERO
	hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
	jbd2: wake up journal waiters in FIFO order, not LIFO
	jbd2: fix potential buffer head reference count leak
	jbd2: fix potential use-after-free in jbd2_fc_wait_bufs
	jbd2: add miss release buffer head in fc_do_one_pass()
	ext4: avoid crash when inline data creation follows DIO write
	ext4: fix null-ptr-deref in ext4_write_info
	ext4: make ext4_lazyinit_thread freezable
	ext4: fix check for block being out of directory size
	ext4: don't increase iversion counter for ea_inodes
	ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
	ext4: place buffer head allocation before handle start
	ext4: fix dir corruption when ext4_dx_add_entry() fails
	ext4: fix miss release buffer head in ext4_fc_write_inode
	ext4: fix potential memory leak in ext4_fc_record_modified_inode()
	ext4: fix potential memory leak in ext4_fc_record_regions()
	ext4: update 'state->fc_regions_size' after successful memory allocation
	livepatch: fix race between fork and KLP transition
	ftrace: Properly unset FTRACE_HASH_FL_MOD
	ring-buffer: Allow splice to read previous partially read pages
	ring-buffer: Have the shortest_full queue be the shortest not longest
	ring-buffer: Check pending waiters when doing wake ups as well
	ring-buffer: Add ring_buffer_wake_waiters()
	ring-buffer: Fix race between reset page and reading page
	tracing: Disable interrupt or preemption before acquiring arch_spinlock_t
	tracing: Wake up ring buffer waiters on closing of the file
	tracing: Wake up waiters when tracing is disabled
	tracing: Add ioctl() to force ring buffer waiters to wake up
	tracing: Move duplicate code of trace_kprobe/eprobe.c into header
	tracing: Add "(fault)" name injection to kernel probes
	tracing: Fix reading strings from synthetic events
	thunderbolt: Explicitly enable lane adapter hotplug events at startup
	efi: libstub: drop pointless get_memory_map() call
	media: cedrus: Set the platform driver data earlier
	media: cedrus: Fix endless loop in cedrus_h265_skip_bits()
	blk-wbt: call rq_qos_add() after wb_normal is initialized
	KVM: x86/emulator: Fix handing of POP SS to correctly set interruptibility
	KVM: nVMX: Unconditionally purge queued/injected events on nested "exit"
	KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02
	KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS
	staging: greybus: audio_helper: remove unused and wrong debugfs usage
	drm/nouveau/kms/nv140-: Disable interlacing
	drm/nouveau: fix a use-after-free in nouveau_gem_prime_import_sg_table()
	drm/i915: Fix watermark calculations for gen12+ RC CCS modifier
	drm/i915: Fix watermark calculations for gen12+ MC CCS modifier
	drm/i915: Fix watermark calculations for gen12+ CCS+CC modifier
	drm/amd/display: Fix vblank refcount in vrr transition
	smb3: must initialize two ACL struct fields to zero
	selinux: use "grep -E" instead of "egrep"
	ima: fix blocking of security.ima xattrs of unsupported algorithms
	userfaultfd: open userfaultfds with O_RDONLY
	ntfs3: rework xattr handlers and switch to POSIX ACL VFS helpers
	thermal: cpufreq_cooling: Check the policy first in cpufreq_cooling_register()
	sh: machvec: Use char[] for section boundaries
	MIPS: SGI-IP27: Free some unused memory
	MIPS: SGI-IP27: Fix platform-device leak in bridge_platform_create()
	ARM: 9244/1: dump: Fix wrong pg_level in walk_pmd()
	ARM: 9247/1: mm: set readonly for MT_MEMORY_RO with ARM_LPAE
	objtool: Preserve special st_shndx indexes in elf_update_symbol
	nfsd: Fix a memory leak in an error handling path
	SUNRPC: Fix svcxdr_init_decode's end-of-buffer calculation
	SUNRPC: Fix svcxdr_init_encode's buflen calculation
	NFSD: Protect against send buffer overflow in NFSv2 READDIR
	NFSD: Fix handling of oversized NFSv4 COMPOUND requests
	wifi: rtlwifi: 8192de: correct checking of IQK reload
	wifi: ath10k: add peer map clean up for peer delete in ath10k_sta_state()
	leds: lm3601x: Don't use mutex after it was destroyed
	bpf: Fix reference state management for synchronous callbacks
	wifi: mac80211: allow bw change during channel switch in mesh
	bpftool: Fix a wrong type cast in btf_dumper_int
	spi: mt7621: Fix an error message in mt7621_spi_probe()
	x86/resctrl: Fix to restore to original value when re-enabling hardware prefetch register
	xsk: Fix backpressure mechanism on Tx
	bpf: Disable preemption when increasing per-cpu map_locked
	bpf: Propagate error from htab_lock_bucket() to userspace
	bpf: Use this_cpu_{inc|dec|inc_return} for bpf_task_storage_busy
	Bluetooth: btusb: mediatek: fix WMT failure during runtime suspend
	wifi: rtl8xxxu: tighten bounds checking in rtl8xxxu_read_efuse()
	wifi: rtw88: add missing destroy_workqueue() on error path in rtw_core_init()
	selftests/xsk: Avoid use-after-free on ctx
	spi: qup: add missing clk_disable_unprepare on error in spi_qup_resume()
	spi: qup: add missing clk_disable_unprepare on error in spi_qup_pm_resume_runtime()
	wifi: rtl8xxxu: Fix skb misuse in TX queue selection
	spi: meson-spicc: do not rely on busy flag in pow2 clk ops
	bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
	wifi: rtl8xxxu: gen2: Fix mistake in path B IQ calibration
	wifi: rtl8xxxu: Remove copy-paste leftover in gen2_update_rate_mask
	wifi: mt76: sdio: fix transmitting packet hangs
	wifi: mt76: mt7615: add mt7615_mutex_acquire/release in mt7615_sta_set_decap_offload
	wifi: mt76: mt7915: do not check state before configuring implicit beamform
	Bluetooth: RFCOMM: Fix possible deadlock on socket shutdown/release
	net: fs_enet: Fix wrong check in do_pd_setup
	bpf: Ensure correct locking around vulnerable function find_vpid()
	Bluetooth: hci_{ldisc,serdev}: check percpu_init_rwsem() failure
	netfilter: conntrack: fix the gc rescheduling delay
	netfilter: conntrack: revisit the gc initial rescheduling bias
	wifi: ath11k: fix number of VHT beamformee spatial streams
	x86/microcode/AMD: Track patch allocation size explicitly
	x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
	spi: dw: Fix PM disable depth imbalance in dw_spi_bt1_probe
	spi/omap100k:Fix PM disable depth imbalance in omap1_spi100k_probe
	skmsg: Schedule psock work if the cached skb exists on the psock
	i2c: mlxbf: support lock mechanism
	Bluetooth: hci_core: Fix not handling link timeouts propertly
	xfrm: Reinject transport-mode packets through workqueue
	netfilter: nft_fib: Fix for rpath check with VRF devices
	spi: s3c64xx: Fix large transfers with DMA
	wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
	vhost/vsock: Use kvmalloc/kvfree for larger packets.
	eth: alx: take rtnl_lock on resume
	mISDN: fix use-after-free bugs in l1oip timer handlers
	sctp: handle the error returned from sctp_auth_asoc_init_active_key
	tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
	spi: Ensure that sg_table won't be used after being freed
	hwmon: (pmbus/mp2888) Fix sensors readouts for MPS Multi-phase mp2888 controller
	net: rds: don't hold sock lock when cancelling work from rds_tcp_reset_callbacks()
	bnx2x: fix potential memory leak in bnx2x_tpa_stop()
	net: wwan: iosm: Call mutex_init before locking it
	net/ieee802154: reject zero-sized raw_sendmsg()
	once: add DO_ONCE_SLOW() for sleepable contexts
	net: mvpp2: fix mvpp2 debugfs leak
	drm: bridge: adv7511: fix CEC power down control register offset
	drm: bridge: adv7511: unregister cec i2c device after cec adapter
	drm/bridge: Avoid uninitialized variable warning
	drm/mipi-dsi: Detach devices when removing the host
	drm/virtio: Correct drm_gem_shmem_get_sg_table() error handling
	drm/bridge: parade-ps8640: Fix regulator supply order
	drm/dp_mst: fix drm_dp_dpcd_read return value checks
	drm:pl111: Add of_node_put() when breaking out of for_each_available_child_of_node()
	ASoC: mt6359: fix tests for platform_get_irq() failure
	platform/chrome: fix double-free in chromeos_laptop_prepare()
	platform/chrome: fix memory corruption in ioctl
	ASoC: tas2764: Allow mono streams
	ASoC: tas2764: Drop conflicting set_bias_level power setting
	ASoC: tas2764: Fix mute/unmute
	platform/x86: msi-laptop: Fix old-ec check for backlight registering
	platform/x86: msi-laptop: Fix resource cleanup
	platform/chrome: cros_ec_typec: Correct alt mode index
	drm/amdgpu: add missing pci_disable_device() in amdgpu_pmops_runtime_resume()
	drm/bridge: megachips: Fix a null pointer dereference bug
	ASoC: rsnd: Add check for rsnd_mod_power_on
	ALSA: hda: beep: Simplify keep-power-at-enable behavior
	drm/bochs: fix blanking
	drm/omap: dss: Fix refcount leak bugs
	drm/amdgpu: Fix memory leak in hpd_rx_irq_create_workqueue()
	mmc: au1xmmc: Fix an error handling path in au1xmmc_probe()
	ASoC: eureka-tlv320: Hold reference returned from of_find_xxx API
	drm/msm/dpu: index dpu_kms->hw_vbif using vbif_idx
	drm/msm/dp: correct 1.62G link rate at dp_catalog_ctrl_config_msa()
	drm/vmwgfx: Fix memory leak in vmw_mksstat_add_ioctl()
	ASoC: codecs: tx-macro: fix kcontrol put
	ASoC: da7219: Fix an error handling path in da7219_register_dai_clks()
	ALSA: dmaengine: increment buffer pointer atomically
	mmc: wmt-sdmmc: Fix an error handling path in wmt_mci_probe()
	ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe
	ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe
	ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe
	ASoC: mt6660: Fix PM disable depth imbalance in mt6660_i2c_probe
	ALSA: hda/hdmi: Don't skip notification handling during PM operation
	memory: pl353-smc: Fix refcount leak bug in pl353_smc_probe()
	memory: of: Fix refcount leak bug in of_get_ddr_timings()
	memory: of: Fix refcount leak bug in of_lpddr3_get_ddr_timings()
	locks: fix TOCTOU race when granting write lease
	soc: qcom: smsm: Fix refcount leak bugs in qcom_smsm_probe()
	soc: qcom: smem_state: Add refcounting for the 'state->of_node'
	ARM: dts: imx6qdl-kontron-samx6i: hook up DDC i2c bus
	ARM: dts: turris-omnia: Fix mpp26 pin name and comment
	ARM: dts: kirkwood: lsxl: fix serial line
	ARM: dts: kirkwood: lsxl: remove first ethernet port
	ia64: export memory_add_physaddr_to_nid to fix cxl build error
	soc/tegra: fuse: Drop Kconfig dependency on TEGRA20_APB_DMA
	arm64: dts: ti: k3-j7200: fix main pinmux range
	ARM: dts: exynos: correct s5k6a3 reset polarity on Midas family
	ARM: Drop CMDLINE_* dependency on ATAGS
	ext4: don't run ext4lazyinit for read-only filesystems
	arm64: ftrace: fix module PLTs with mcount
	ARM: dts: exynos: fix polarity of VBUS GPIO of Origen
	iio: adc: at91-sama5d2_adc: fix AT91_SAMA5D2_MR_TRACKTIM_MAX
	iio: adc: at91-sama5d2_adc: check return status for pressure and touch
	iio: adc: at91-sama5d2_adc: lock around oversampling and sample freq
	iio: adc: at91-sama5d2_adc: disable/prepare buffer on suspend/resume
	iio: inkern: only release the device node when done with it
	iio: inkern: fix return value in devm_of_iio_channel_get_by_name()
	iio: ABI: Fix wrong format of differential capacitance channel ABI.
	iio: magnetometer: yas530: Change data type of hard_offsets to signed
	RDMA/mlx5: Don't compare mkey tags in DEVX indirect mkey
	usb: common: debug: Check non-standard control requests
	clk: meson: Hold reference returned by of_get_parent()
	clk: oxnas: Hold reference returned by of_get_parent()
	clk: qoriq: Hold reference returned by of_get_parent()
	clk: berlin: Add of_node_put() for of_get_parent()
	clk: sprd: Hold reference returned by of_get_parent()
	clk: tegra: Fix refcount leak in tegra210_clock_init
	clk: tegra: Fix refcount leak in tegra114_clock_init
	clk: tegra20: Fix refcount leak in tegra20_clock_init
	HSI: omap_ssi: Fix refcount leak in ssi_probe
	HSI: omap_ssi_port: Fix dma_map_sg error check
	media: exynos4-is: fimc-is: Add of_node_put() when breaking out of loop
	tty: xilinx_uartps: Fix the ignore_status
	media: meson: vdec: add missing clk_disable_unprepare on error in vdec_hevc_start()
	media: uvcvideo: Fix memory leak in uvc_gpio_parse
	media: uvcvideo: Use entity get_cur in uvc_ctrl_set
	media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init
	RDMA/rxe: Fix "kernel NULL pointer dereference" error
	RDMA/rxe: Fix the error caused by qp->sk
	misc: ocxl: fix possible refcount leak in afu_ioctl()
	fpga: prevent integer overflow in dfl_feature_ioctl_set_irq()
	dmaengine: hisilicon: Disable channels when unregister hisi_dma
	dmaengine: hisilicon: Fix CQ head update
	dmaengine: hisilicon: Add multi-thread support for a DMA channel
	dyndbg: fix static_branch manipulation
	dyndbg: fix module.dyndbg handling
	dyndbg: let query-modname override actual module name
	dyndbg: drop EXPORTed dynamic_debug_exec_queries
	clk: qcom: sm6115: Select QCOM_GDSC
	mtd: devices: docg3: check the return value of devm_ioremap() in the probe
	phy: amlogic: phy-meson-axg-mipi-pcie-analog: Hold reference returned by of_get_parent()
	phy: phy-mtk-tphy: fix the phy type setting issue
	mtd: rawnand: intel: Read the chip-select line from the correct OF node
	mtd: rawnand: intel: Remove undocumented compatible string
	mtd: rawnand: fsl_elbc: Fix none ECC mode
	RDMA/irdma: Align AE id codes to correct flush code and event
	RDMA/srp: Fix srp_abort()
	RDMA/siw: Always consume all skbuf data in sk_data_ready() upcall.
	RDMA/siw: Fix QP destroy to wait for all references dropped.
	ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting()
	ata: fix ata_id_has_devslp()
	ata: fix ata_id_has_ncq_autosense()
	ata: fix ata_id_has_dipm()
	mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct()
	md: Replace snprintf with scnprintf
	md/raid5: Ensure stripe_fill happens on non-read IO with journal
	md/raid5: Remove unnecessary bio_put() in raid5_read_one_chunk()
	RDMA/cm: Use SLID in the work completion as the DLID in responder side
	IB: Set IOVA/LENGTH on IB_MR in core/uverbs layers
	xhci: Don't show warning for reinit on known broken suspend
	usb: gadget: function: fix dangling pnp_string in f_printer.c
	drivers: serial: jsm: fix some leaks in probe
	serial: 8250: Toggle IER bits on only after irq has been set up
	tty: serial: fsl_lpuart: disable dma rx/tx use flags in lpuart_dma_shutdown
	phy: qualcomm: call clk_disable_unprepare in the error handling
	staging: vt6655: fix some erroneous memory clean-up loops
	slimbus: qcom-ngd-ctrl: allow compile testing without QCOM_RPROC_COMMON
	firmware: google: Test spinlock on panic path to avoid lockups
	serial: 8250: Fix restoring termios speed after suspend
	scsi: libsas: Fix use-after-free bug in smp_execute_task_sg()
	scsi: iscsi: Rename iscsi_conn_queue_work()
	scsi: iscsi: Add recv workqueue helpers
	scsi: iscsi: Run recv path from workqueue
	scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()
	clk: qcom: apss-ipq6018: mark apcs_alias0_core_clk as critical
	clk: qcom: gcc-sm6115: Override default Alpha PLL regs
	RDMA/rxe: Fix resize_finish() in rxe_queue.c
	fsi: core: Check error number after calling ida_simple_get
	mfd: intel_soc_pmic: Fix an error handling path in intel_soc_pmic_i2c_probe()
	mfd: fsl-imx25: Fix an error handling path in mx25_tsadc_setup_irq()
	mfd: lp8788: Fix an error handling path in lp8788_probe()
	mfd: lp8788: Fix an error handling path in lp8788_irq_init() and lp8788_irq_init()
	mfd: fsl-imx25: Fix check for platform_get_irq() errors
	mfd: sm501: Add check for platform_driver_register()
	clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
	dmaengine: ioat: stop mod_timer from resurrecting deleted timer in __cleanup()
	usb: mtu3: fix failed runtime suspend in host only mode
	spmi: pmic-arb: correct duplicate APID to PPID mapping logic
	clk: vc5: Fix 5P49V6901 outputs disabling when enabling FOD
	clk: baikal-t1: Fix invalid xGMAC PTP clock divider
	clk: baikal-t1: Add shared xGMAC ref/ptp clocks internal parent
	clk: baikal-t1: Add SATA internal ref clock buffer
	clk: bcm2835: fix bcm2835_clock_rate_from_divisor declaration
	clk: imx: scu: fix memleak on platform_device_add() fails
	clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
	clk: ast2600: BCLK comes from EPLL
	mailbox: mpfs: fix handling of the reg property
	mailbox: mpfs: account for mbox offsets while sending
	mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg
	powerpc/configs: Properly enable PAPR_SCM in pseries_defconfig
	powerpc/math_emu/efp: Include module.h
	powerpc/sysdev/fsl_msi: Add missing of_node_put()
	powerpc/pci_dn: Add missing of_node_put()
	powerpc/powernv: add missing of_node_put() in opal_export_attrs()
	powerpc: Fix fallocate and fadvise64_64 compat parameter combination
	x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
	powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5
	powerpc: Fix SPE Power ISA properties for e500v1 platforms
	powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()
	powerpc/pseries/vas: Pass hw_cpu_id to node associativity HCALL
	crypto: sahara - don't sleep when in softirq
	crypto: hisilicon/zip - fix mismatch in get/set sgl_sge_nr
	hwrng: arm-smccc-trng - fix NO_ENTROPY handling
	cgroup: Honor caller's cgroup NS when resolving path
	hwrng: imx-rngc - Moving IRQ handler registering after imx_rngc_irq_mask_clear()
	crypto: qat - fix default value of WDT timer
	crypto: hisilicon/qm - fix missing put dfx access
	cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
	iommu/omap: Fix buffer overflow in debugfs
	crypto: akcipher - default implementation for setting a private key
	crypto: ccp - Release dma channels before dmaengine unrgister
	crypto: inside-secure - Change swab to swab32
	crypto: qat - fix DMA transfer direction
	cifs: return correct error in ->calc_signature()
	iommu/iova: Fix module config properly
	tracing: kprobe: Fix kprobe event gen test module on exit
	tracing: kprobe: Make gen test module work in arm and riscv
	tracing/osnoise: Fix possible recursive locking in stop_per_cpu_kthreads
	kbuild: remove the target in signal traps when interrupted
	kbuild: rpm-pkg: fix breakage when V=1 is used
	crypto: marvell/octeontx - prevent integer overflows
	crypto: cavium - prevent integer overflow loading firmware
	thermal/drivers/qcom/tsens-v0_1: Fix MSM8939 fourth sensor hw_id
	ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
	f2fs: fix race condition on setting FI_NO_EXTENT flag
	f2fs: fix to account FS_CP_DATA_IO correctly
	selftest: tpm2: Add Client.__del__() to close /dev/tpm* handle
	fs: dlm: fix race in lowcomms
	rcu: Avoid triggering strict-GP irq-work when RCU is idle
	rcu: Back off upon fill_page_cache_func() allocation failure
	rcu-tasks: Convert RCU_LOCKDEP_WARN() to WARN_ONCE()
	ACPI: video: Add Toshiba Satellite/Portege Z830 quirk
	ACPI: tables: FPDT: Don't call acpi_os_map_memory() on invalid phys address
	cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
	MIPS: BCM47XX: Cast memcmp() of function to (void *)
	powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
	thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash
	ARM: decompressor: Include .data.rel.ro.local
	ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable
	x86/entry: Work around Clang __bdos() bug
	NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data
	NFSD: fix use-after-free on source server when doing inter-server copy
	wifi: brcmfmac: fix invalid address access when enabling SCAN log level
	bpftool: Clear errno after libcap's checks
	ice: set tx_tstamps when creating new Tx rings via ethtool
	net: ethernet: ti: davinci_mdio: Add workaround for errata i2329
	openvswitch: Fix double reporting of drops in dropwatch
	openvswitch: Fix overreporting of drops in dropwatch
	tcp: annotate data-race around tcp_md5sig_pool_populated
	x86/mce: Retrieve poison range from hardware
	wifi: ath9k: avoid uninit memory read in ath9k_htc_rx_msg()
	thunderbolt: Add back Intel Falcon Ridge end-to-end flow control workaround
	xfrm: Update ipcomp_scratches with NULL when freed
	iavf: Fix race between iavf_close and iavf_reset_task
	wifi: brcmfmac: fix use-after-free bug in brcmf_netdev_start_xmit()
	Bluetooth: btintel: Mark Intel controller to support LE_STATES quirk
	regulator: core: Prevent integer underflow
	wifi: mt76: mt7921: reset msta->airtime_ac while clearing up hw value
	Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()
	Bluetooth: hci_sysfs: Fix attempting to call device_add multiple times
	can: bcm: check the result of can_send() in bcm_can_tx()
	wifi: rt2x00: don't run Rt5592 IQ calibration on MT7620
	wifi: rt2x00: set correct TX_SW_CFG1 MAC register for MT7620
	wifi: rt2x00: set VGC gain for both chains of MT7620
	wifi: rt2x00: set SoC wmac clock register
	wifi: rt2x00: correctly set BBP register 86 for MT7620
	hwmon: (sht4x) do not overflow clamping operation on 32-bit platforms
	net: If sock is dead don't access sock's sk_wq in sk_stream_wait_memory
	Bluetooth: L2CAP: Fix user-after-free
	r8152: Rate limit overflow messages
	drm/nouveau/nouveau_bo: fix potential memory leak in nouveau_bo_alloc()
	drm: Use size_t type for len variable in drm_copy_field()
	drm: Prevent drm_copy_field() to attempt copying a NULL pointer
	drm/komeda: Fix handling of atomic commits in the atomic_commit_tail hook
	gpu: lontium-lt9611: Fix NULL pointer dereference in lt9611_connector_init()
	drm/amd/display: fix overflow on MIN_I64 definition
	udmabuf: Set ubuf->sg = NULL if the creation of sg table fails
	drm: bridge: dw_hdmi: only trigger hotplug event on link change
	ALSA: usb-audio: Register card at the last interface
	drm/vc4: vec: Fix timings for VEC modes
	drm: panel-orientation-quirks: Add quirk for Anbernic Win600
	platform/chrome: cros_ec: Notify the PM of wake events during resume
	platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading
	ASoC: SOF: pci: Change DMI match info to support all Chrome platforms
	drm/amdgpu: fix initial connector audio value
	drm/meson: reorder driver deinit sequence to fix use-after-free bug
	drm/meson: explicitly remove aggregate driver at module unload time
	mmc: sdhci-msm: add compatible string check for sdm670
	drm/dp: Don't rewrite link config when setting phy test pattern
	drm/amd/display: Remove interface for periodic interrupt 1
	ARM: dts: imx7d-sdb: config the max pressure for tsc2046
	ARM: dts: imx6q: add missing properties for sram
	ARM: dts: imx6dl: add missing properties for sram
	ARM: dts: imx6qp: add missing properties for sram
	ARM: dts: imx6sl: add missing properties for sram
	ARM: dts: imx6sll: add missing properties for sram
	ARM: dts: imx6sx: add missing properties for sram
	kselftest/arm64: Fix validatation termination record after EXTRA_CONTEXT
	arm64: dts: imx8mq-librem5: Add bq25895 as max17055's power supply
	btrfs: dump extra info if one free space cache has more bitmaps than it should
	btrfs: scrub: try to fix super block errors
	btrfs: don't print information about space cache or tree every remount
	ARM: 9242/1: kasan: Only map modules if CONFIG_KASAN_VMALLOC=n
	clk: zynqmp: Fix stack-out-of-bounds in strncpy`
	media: cx88: Fix a null-ptr-deref bug in buffer_prepare()
	media: platform: fix some double free in meson-ge2d and mtk-jpeg and s5p-mfc
	clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
	usb: host: xhci-plat: suspend and resume clocks
	usb: host: xhci-plat: suspend/resume clks for brcm
	dmaengine: ti: k3-udma: Reset UDMA_CHAN_RT byte counters to prevent overflow
	scsi: 3w-9xxx: Avoid disabling device if failing to enable it
	nbd: Fix hung when signal interrupts nbd_start_device_ioctl()
	iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
	power: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()
	staging: vt6655: fix potential memory leak
	blk-throttle: prevent overflow while calculating wait time
	ata: libahci_platform: Sanity check the DT child nodes number
	bcache: fix set_at_max_writeback_rate() for multiple attached devices
	soundwire: cadence: Don't overwrite msg->buf during write commands
	soundwire: intel: fix error handling on dai registration issues
	HID: roccat: Fix use-after-free in roccat_read()
	eventfd: guard wake_up in eventfd fs calls as well
	md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d
	usb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()
	usb: musb: Fix musb_gadget.c rxstate overflow bug
	arm64: dts: imx8mp: Add snps,gfladj-refclk-lpm-sel quirk to USB nodes
	usb: dwc3: core: Enable GUCTL1 bit 10 for fixing termination error after resume bug
	Revert "usb: storage: Add quirk for Samsung Fit flash"
	staging: rtl8723bs: fix potential memory leak in rtw_init_drv_sw()
	staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
	scsi: tracing: Fix compile error in trace_array calls when TRACING is disabled
	ext2: Use kvmalloc() for group descriptor array
	nvme: copy firmware_rev on each init
	nvmet-tcp: add bounds check on Transfer Tag
	usb: idmouse: fix an uninit-value in idmouse_open
	clk: bcm2835: Make peripheral PLLC critical
	clk: bcm2835: Round UART input clock up
	perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
	io_uring/af_unix: defer registered files gc to io_uring release
	io_uring: correct pinned_vm accounting
	io_uring/rw: fix short rw error handling
	io_uring/rw: fix error'ed retry return values
	io_uring/rw: fix unexpected link breakage
	mm: hugetlb: fix UAF in hugetlb_handle_userfault
	net: ieee802154: return -EINVAL for unknown addr type
	ALSA: usb-audio: Fix last interface check for registration
	blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
	net: ethernet: ti: davinci_mdio: fix build for mdio bitbang uses
	Revert "net/ieee802154: reject zero-sized raw_sendmsg()"
	net/ieee802154: don't warn zero-sized raw_sendmsg()
	drm/amd/display: Fix build breakage with CONFIG_DEBUG_FS=n
	Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
	Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
	lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
	ext4: continue to expand file system when the target size doesn't reach
	thermal: intel_powerclamp: Use first online CPU as control_cpu
	gcov: support GCC 12.1 and newer compilers
	io-wq: Fix memory leak in worker creation
	Linux 5.15.75

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id3dfe8285e830d01df7adb33729e78a75c1cac1a
2022-11-02 08:51:19 +01:00
Michal Hocko
cf38a05eb1 rcu: Back off upon fill_page_cache_func() allocation failure
[ Upstream commit 093590c16b447f53e66771c8579ae66c96f6ef61 ]

The fill_page_cache_func() function allocates couple of pages to store
kvfree_rcu_bulk_data structures. This is a lightweight (GFP_NORETRY)
allocation which can fail under memory pressure. The function will,
however keep retrying even when the previous attempt has failed.

This retrying is in theory correct, but in practice the allocation is
invoked from workqueue context, which means that if the memory reclaim
gets stuck, these retries can hog the worker for quite some time.
Although the workqueues subsystem automatically adjusts concurrency, such
adjustment is not guaranteed to happen until the worker context sleeps.
And the fill_page_cache_func() function's retry loop is not guaranteed
to sleep (see the should_reclaim_retry() function).

And we have seen this function cause workqueue lockups:

kernel: BUG: workqueue lockup - pool cpus=93 node=1 flags=0x1 nice=0 stuck for 32s!
[...]
kernel: pool 74: cpus=37 node=0 flags=0x1 nice=0 hung=32s workers=2 manager: 2146
kernel:   pwq 498: cpus=249 node=1 flags=0x1 nice=0 active=4/256 refcnt=5
kernel:     in-flight: 1917:fill_page_cache_func
kernel:     pending: dbs_work_handler, free_work, kfree_rcu_monitor

Originally, we thought that the root cause of this lockup was several
retries with direct reclaim, but this is not yet confirmed.  Furthermore,
we have seen similar lockups without any heavy memory pressure.  This
suggests that there are other factors contributing to these lockups.
However, it is not really clear that endless retries are desireable.

So let's make the fill_page_cache_func() function back off after
allocation failure.

Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26 12:35:29 +02:00
Greg Kroah-Hartman
df06ef206f Merge 5.15.39 into android14-5.15
Changes in 5.15.39
	MIPS: Fix CP0 counter erratum detection for R4k CPUs
	parisc: Merge model and model name into one line in /proc/cpuinfo
	ALSA: hda/realtek: Add quirk for Yoga Duet 7 13ITL6 speakers
	ALSA: fireworks: fix wrong return count shorter than expected by 4 bytes
	mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC
	mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits
	mmc: core: Set HS clock speed before sending HS CMD13
	gpiolib: of: fix bounds check for 'gpio-reserved-ranges'
	x86/fpu: Prevent FPU state corruption
	KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id
	iommu/vt-d: Calculate mask for non-aligned flushes
	iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
	drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
	drm/amdgpu: do not use passthrough mode in Xen dom0
	RISC-V: relocate DTB if it's outside memory region
	Revert "SUNRPC: attempt AF_LOCAL connect on setup"
	timekeeping: Mark NMI safe time accessors as notrace
	firewire: fix potential uaf in outbound_phy_packet_callback()
	firewire: remove check of list iterator against head past the loop body
	firewire: core: extend card->lock in fw_core_handle_bus_reset
	net: stmmac: disable Split Header (SPH) for Intel platforms
	genirq: Synchronize interrupt thread startup
	ASoC: da7219: Fix change notifications for tone generator frequency
	ASoC: wm8958: Fix change notifications for DSP controls
	ASoC: meson: Fix event generation for AUI ACODEC mux
	ASoC: meson: Fix event generation for G12A tohdmi mux
	ASoC: meson: Fix event generation for AUI CODEC mux
	s390/dasd: fix data corruption for ESE devices
	s390/dasd: prevent double format of tracks for ESE devices
	s390/dasd: Fix read for ESE with blksize < 4k
	s390/dasd: Fix read inconsistency for ESE DASD devices
	can: grcan: grcan_close(): fix deadlock
	can: isotp: remove re-binding of bound socket
	can: grcan: use ofdev->dev when allocating DMA memory
	can: grcan: grcan_probe(): fix broken system id check for errata workaround needs
	can: grcan: only use the NAPI poll budget for RX
	nfc: replace improper check device_is_registered() in netlink related functions
	nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs
	NFC: netlink: fix sleep in atomic bug when firmware download timeout
	gpio: visconti: Fix fwnode of GPIO IRQ
	gpio: pca953x: fix irq_stat not updated when irq is disabled (irq_mask not set)
	hwmon: (adt7470) Fix warning on module removal
	hwmon: (pmbus) disable PEC if not enabled
	ASoC: dmaengine: Restore NULL prepare_slave_config() callback
	ASoC: soc-ops: fix error handling
	iommu/vt-d: Drop stop marker messages
	iommu/dart: check return value after calling platform_get_resource()
	net/mlx5e: Fix trust state reset in reload
	net/mlx5e: Don't match double-vlan packets if cvlan is not set
	net/mlx5e: CT: Fix queued up restore put() executing after relevant ft release
	net/mlx5e: Fix the calling of update_buffer_lossy() API
	net/mlx5: Avoid double clear or set of sync reset requested
	net/mlx5: Fix deadlock in sync reset flow
	selftests/seccomp: Don't call read() on TTY from background pgrp
	SUNRPC release the transport of a relocated task with an assigned transport
	RDMA/siw: Fix a condition race issue in MPA request processing
	RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
	RDMA/irdma: Reduce iWARP QP destroy time
	RDMA/irdma: Fix possible crash due to NULL netdev in notifier
	NFSv4: Don't invalidate inode attributes on delegation return
	net: ethernet: mediatek: add missing of_node_put() in mtk_sgmii_init()
	net: dsa: mt7530: add missing of_node_put() in mt7530_setup()
	net: stmmac: dwmac-sun8i: add missing of_node_put() in sun8i_dwmac_register_mdio_mux()
	net: mdio: Fix ENOMEM return value in BCM6368 mux bus controller
	net: cpsw: add missing of_node_put() in cpsw_probe_dt()
	net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()
	net: emaclite: Add error handling for of_address_to_resource()
	selftests/net: so_txtime: fix parsing of start time stamp on 32 bit systems
	selftests/net: so_txtime: usage(): fix documentation of default clock
	drm/msm/dp: remove fail safe mode related code
	btrfs: do not BUG_ON() on failure to update inode when setting xattr
	hinic: fix bug of wq out of bound access
	mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()
	rxrpc: Enable IPv6 checksums on transport socket
	selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operational
	bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flag
	bnxt_en: Fix unnecessary dropping of RX packets
	selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer
	smsc911x: allow using IRQ0
	btrfs: force v2 space cache usage for subpage mount
	btrfs: always log symlinks in full mode
	drm/amdgpu: unify BO evicting method in amdgpu_ttm
	drm/amdgpu: explicitly check for s0ix when evicting resources
	drm/amdgpu: don't set s3 and s0ix at the same time
	drm/amdgpu: Ensure HDA function is suspended before ASIC reset
	gpio: mvebu: drop pwm base assignment
	kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU
	fbdev: Make fb_release() return -ENODEV if fbdev was unregistered
	net/mlx5: Fix slab-out-of-bounds while reading resource dump menu
	net/mlx5e: Lag, Fix use-after-free in fib event handler
	net/mlx5e: Lag, Fix fib_info pointer assignment
	net/mlx5e: Lag, Don't skip fib events on current dst
	iommu/dart: Add missing module owner to ops structure
	kvm: selftests: do not use bitfields larger than 32-bits for PTEs
	KVM: selftests: Silence compiler warning in the kvm_page_table_test
	x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
	KVM: x86: Do not change ICR on write to APIC_SELF_IPI
	KVM: x86/mmu: avoid NULL-pointer dereference on page freeing bugs
	KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertised
	selftest/vm: verify mmap addr in mremap_test
	selftest/vm: verify remap destination address in mremap_test
	mmc: rtsx: add 74 Clocks in power on flow
	Revert "parisc: Mark sched_clock unstable only if clocks are not syncronized"
	rcu: Fix callbacks processing time limit retaining cond_resched()
	rcu: Apply callbacks processing time limit only on softirq
	PCI: pci-bridge-emul: Add description for class_revision field
	PCI: pci-bridge-emul: Add definitions for missing capabilities registers
	PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers on emulated bridge
	PCI: aardvark: Clear all MSIs at setup
	PCI: aardvark: Comment actions in driver remove method
	PCI: aardvark: Disable bus mastering when unbinding driver
	PCI: aardvark: Mask all interrupts when unbinding driver
	PCI: aardvark: Fix memory leak in driver unbind
	PCI: aardvark: Assert PERST# when unbinding driver
	PCI: aardvark: Disable link training when unbinding driver
	PCI: aardvark: Disable common PHY when unbinding driver
	PCI: aardvark: Replace custom PCIE_CORE_INT_* macros with PCI_INTERRUPT_*
	PCI: aardvark: Rewrite IRQ code to chained IRQ handler
	PCI: aardvark: Check return value of generic_handle_domain_irq() when processing INTx IRQ
	PCI: aardvark: Make MSI irq_chip structures static driver structures
	PCI: aardvark: Make msi_domain_info structure a static driver structure
	PCI: aardvark: Use dev_fwnode() instead of of_node_to_fwnode(dev->of_node)
	PCI: aardvark: Refactor unmasking summary MSI interrupt
	PCI: aardvark: Add support for masking MSI interrupts
	PCI: aardvark: Fix setting MSI address
	PCI: aardvark: Enable MSI-X support
	PCI: aardvark: Add support for ERR interrupt on emulated bridge
	PCI: aardvark: Optimize writing PCI_EXP_RTCTL_PMEIE and PCI_EXP_RTSTA_PME on emulated bridge
	PCI: aardvark: Add support for PME interrupts
	PCI: aardvark: Fix support for PME requester on emulated bridge
	PCI: aardvark: Use separate INTA interrupt for emulated root bridge
	PCI: aardvark: Remove irq_mask_ack() callback for INTx interrupts
	PCI: aardvark: Don't mask irq when mapping
	PCI: aardvark: Drop __maybe_unused from advk_pcie_disable_phy()
	PCI: aardvark: Update comment about link going down after link-up
	Linux 5.15.39

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ibd053f0980e7d7eb5659ec6cfa7689c6d9bc416e
2022-06-09 10:28:43 +02:00
Frederic Weisbecker
0060c7bd9e rcu: Apply callbacks processing time limit only on softirq
commit a554ba288845fd3f6f12311fd76a51694233458a upstream.

Time limit only makes sense when callbacks are serviced in softirq mode
because:

_ In case we need to get back to the scheduler,
  cond_resched_tasks_rcu_qs() is called after each callback.

_ In case some other softirq vector needs the CPU, the call to
  local_bh_enable() before cond_resched_tasks_rcu_qs() takes care about
  them via a call to do_softirq().

Therefore, make sure the time limit only applies to softirq mode.

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[UR: backport to 5.15-stable]
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-12 12:30:26 +02:00
Frederic Weisbecker
2c5029d652 rcu: Fix callbacks processing time limit retaining cond_resched()
commit 3e61e95e2d095e308616cba4ffb640f95a480e01 upstream.

The callbacks processing time limit makes sure we are not exceeding a
given amount of time executing the queue.

However its "continue" clause bypasses the cond_resched() call on
rcuc and NOCB kthreads, delaying it until we reach the limit, which can
be very long...

Make sure the scheduler has a higher priority than the time limit.

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[UR: backport to 5.15-stable + commit update]
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-12 12:30:26 +02:00
Kalesh Singh
46a5d62344 FROMGIT: EXP rcu: Move expedited grace period (GP) work to RT kthread_worker
Enabling CONFIG_RCU_BOOST did not reduce RCU expedited grace-period
latency because its workqueues run at SCHED_OTHER, and thus can be
delayed by normal processes.  This commit avoids these delays by moving
the expedited GP work items to a real-time-priority kthread_worker.

This option is controlled by CONFIG_RCU_EXP_KTHREAD and disabled by
default on PREEMPT_RT=y kernels which disable expedited grace periods
after boot by unconditionally setting rcupdate.rcu_normal_after_boot=1.

The results were evaluated on arm64 Android devices (6GB ram) running
5.10 kernel, and capturing trace data in critical user-level code.

The table below shows the resulting order-of-magnitude improvements
in synchronize_rcu_expedited() latency:

------------------------------------------------------------------------
|                          |   workqueues  |  kthread_worker |  Diff   |
------------------------------------------------------------------------
| Count                    |          725  |            688  |         |
------------------------------------------------------------------------
| Min Duration       (ns)  |          326  |            447  |  37.12% |
------------------------------------------------------------------------
| Q1                 (ns)  |       39,428  |         38,971  |  -1.16% |
------------------------------------------------------------------------
| Q2 - Median        (ns)  |       98,225  |         69,743  | -29.00% |
------------------------------------------------------------------------
| Q3                 (ns)  |      342,122  |        126,638  | -62.98% |
------------------------------------------------------------------------
| Max Duration       (ns)  |  372,766,967  |      2,329,671  | -99.38% |
------------------------------------------------------------------------
| Avg Duration       (ns)  |    2,746,353  |        151,242  | -94.49% |
------------------------------------------------------------------------
| Standard Deviation (ns)  |   19,327,765  |        294,408  |         |
------------------------------------------------------------------------

The below table show the range of maximums/minimums for
synchronize_rcu_expedited() latency from all experiments:

------------------------------------------------------------------------
|                          |   workqueues  |  kthread_worker |  Diff   |
------------------------------------------------------------------------
| Total No. of Experiments |           25  |             23  |         |
------------------------------------------------------------------------
| Largest  Maximum   (ns)  |  372,766,967  |      2,329,671  | -99.38% |
------------------------------------------------------------------------
| Smallest Maximum   (ns)  |       38,819  |         86,954  | 124.00% |
------------------------------------------------------------------------
| Range of Maximums  (ns)  |  372,728,148  |      2,242,717  |         |
------------------------------------------------------------------------
| Largest  Minimum   (ns)  |       88,623  |         27,588  | -68.87% |
------------------------------------------------------------------------
| Smallest Minimum   (ns)  |          326  |            447  |  37.12% |
------------------------------------------------------------------------
| Range of Minimums  (ns)  |       88,297  |         27,141  |         |
------------------------------------------------------------------------

Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Reported-by: Tim Murray <timmurray@google.com>
Reported-by: Wei Wang <wvw@google.com>
Tested-by: Kyle Lin <kylelin@google.com>
Tested-by: Chunwei Lu <chunweilu@google.com>
Tested-by: Lulu Wang <luluw@google.com>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220409003527.1587028-1-kaleshsingh@google.com/
(cherry picked from commit 3902dd17a29bd3ed1ead364a331a1761edb7162b
git: //git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git fastexp.2022.04.11a)
Bug: 224791892
Change-Id: I4cc5d28f9ae99e44e92afc43ff4db4b71c415d6c
2022-04-19 23:20:40 +00:00
Jun Miao
2f8e463885 UPSTREAM: rcu: Avoid alloc_pages() when recording stack
The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
which in turn can then call alloc_pages(GFP_NOWAIT, ...).  In general, however,
it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c).
Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
in case it runs out by using kasan_record_aux_stack_noalloc().

Jianwei Hu reported:
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
INFO: lockdep is turned off.
irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
  softirqs last  enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  CPU: 6 PID: 15319 Comm: python3 Tainted: G        W  O 5.15-rc7-preempt-rt #1
  Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
  Call Trace:
    show_stack+0x52/0x58
    dump_stack+0xa1/0xd6
    ___might_sleep.cold+0x11c/0x12d
    rt_spin_lock+0x3f/0xc0
    rmqueue+0x100/0x1460
    rmqueue+0x100/0x1460
    mark_usage+0x1a0/0x1a0
    ftrace_graph_ret_addr+0x2a/0xb0
    rmqueue_pcplist.constprop.0+0x6a0/0x6a0
     __kasan_check_read+0x11/0x20
     __zone_watermark_ok+0x114/0x270
     get_page_from_freelist+0x148/0x630
     is_module_text_address+0x32/0xa0
     __alloc_pages_nodemask+0x2f6/0x790
     __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
     create_prof_cpu_mask+0x30/0x30
     alloc_pages_current+0xb1/0x150
     stack_depot_save+0x39f/0x490
     kasan_save_stack+0x42/0x50
     kasan_save_stack+0x23/0x50
     kasan_record_aux_stack+0xa9/0xc0
     __call_rcu+0xff/0x9c0
     call_rcu+0xe/0x10
     put_object+0x53/0x70
     __delete_object+0x7b/0x90
     kmemleak_free+0x46/0x70
     slab_free_freelist_hook+0xb4/0x160
     kfree+0xe5/0x420
     kfree_const+0x17/0x30
     kobject_cleanup+0xaa/0x230
     kobject_put+0x76/0x90
     netdev_queue_update_kobjects+0x17d/0x1f0
     ... ...
     ksys_write+0xd9/0x180
     __x64_sys_write+0x42/0x50
     do_syscall_64+0x38/0x50
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

Links: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/kasan.h?id=7cb3007ce2da27ec02a1a3211941e7fe6875b642
Fixes: 84109ab585 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
Fixes: 26e760c9a7 ("rcu: kasan: record and print call_rcu() call stack")
Reported-by: Jianwei Hu <jianwei.hu@windriver.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Jun Miao <jun.miao@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 300c0c5e721834f484b03fa3062602dd8ff48413)
Bug: 217222520
Change-Id: I5b7d4f9dfe5da290627599d8d6de3278debb2a13
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
2022-02-14 15:50:54 +01:00
Paul E. McKenney
c3156dbd50 rcu: Tighten rcu_advance_cbs_nowake() checks
commit 614ddad17f22a22e035e2ea37a04815f50362017 upstream.

Currently, rcu_advance_cbs_nowake() checks that a grace period is in
progress, however, that grace period could end just after the check.
This commit rechecks that a grace period is still in progress while
holding the rcu_node structure's lock.  The grace period cannot end while
the current CPU's rcu_node structure's ->lock is held, thus avoiding
false positives from the WARN_ON_ONCE().

As Daniel Vacek noted, it is not necessary for the rcu_node structure
to have a CPU that has not yet passed through its quiescent state.

Tested-by: Guillaume Morin <guillaume@morinfr.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-29 10:58:25 +01:00
Paul E. McKenney
a96ac0688a rcu: Mark accesses to rcu_state.n_force_qs
commit 2431774f04d1050292054c763070021bade7b151 upstream.

This commit marks accesses to the rcu_state.n_force_qs.  These data
races are hard to make happen, but syzkaller was equal to the task.

Reported-by: syzbot+e08a83a1940ec3846cd5@syzkaller.appspotmail.com
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22 09:32:51 +01:00
Peter Zijlstra
d846b69dc7 rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
[ Upstream commit 74aece72f95f399dd29363669dc32a1344c8fab4 ]

  vmlinux.o: warning: objtool: rcu_nmi_enter()+0x36: call to __kasan_check_read() leaves .noinstr.text section

noinstr cannot have atomic_*() functions in because they're explicitly
annotated, use arch_atomic_*().

Fixes: 2be57f7328 ("rcu: Weaken ->dynticks accesses and updates")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:30 +01:00
Paul E. McKenney
b770efc460 Merge branches 'doc.2021.07.20c', 'fixes.2021.08.06a', 'nocb.2021.07.20c', 'nolibc.2021.07.20c', 'tasks.2021.07.20c', 'torture.2021.07.27a' and 'torturescript.2021.07.27a' into HEAD
doc.2021.07.20c: Documentation updates.
fixes.2021.08.06a: Miscellaneous fixes.
nocb.2021.07.20c: Callback-offloading (NOCB CPU) updates.
nolibc.2021.07.20c: Tiny userspace library updates.
tasks.2021.07.20c: Tasks RCU updates.
torture.2021.07.27a: In-kernel torture-test updates.
torturescript.2021.07.27a: Torture-test scripting updates.
2021-08-10 11:00:53 -07:00
Sebastian Andrzej Siewior
d3dd95a885 rcu: Replace deprecated CPU-hotplug functions
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: rcu@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-10 10:47:32 -07:00
Liu Song
8211e922de rcu: Use per_cpu_ptr to get the pointer of per_cpu variable
There are a few remaining locations in kernel/rcu that still use
"&per_cpu()".  This commit replaces them with "per_cpu_ptr(&)", and does
not introduce any functional change.

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:49 -07:00
Liu Song
eb880949ef rcu: Remove useless "ret" update in rcu_gp_fqs_loop()
Within rcu_gp_fqs_loop(), the "ret" local variable is set to the
return value from swait_event_idle_timeout_exclusive(), but "ret" is
unconditionally overwritten later in the code.  This commit therefore
removes this useless assignment.

Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:48 -07:00
Paul E. McKenney
f74126dcbc rcu: Make rcu_gp_init() and rcu_gp_fqs_loop noinline to conserve stack
The kbuild test project found an oversized stack frame in rcu_gp_kthread()
for some kernel configurations.  This oversizing was due to a very large
amount of inlining, which is unnecessary due to the fact that this code
executes infrequently.  This commit therefore marks rcu_gp_init() and
rcu_gp_fqs_loop noinline_for_stack to conserve stack space.

Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Rong Chen <rong.a.chen@intel.com>
[ paulmck: noinline_for_stack per Nathan Chancellor. ]
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:48 -07:00
Paul E. McKenney
2be57f7328 rcu: Weaken ->dynticks accesses and updates
Accesses to the rcu_data structure's ->dynticks field have always been
fully ordered because it was not possible to prove that weaker ordering
was safe.  However, with the removal of the rcu_eqs_special_set() function
and the advent of the Linux-kernel memory model, it is now easy to show
that two of the four original full memory barriers can be weakened to
acquire and release operations.  The remaining pair must remain full
memory barriers.  This change makes the memory ordering requirements
more evident, and it might well also speed up the to-idle and from-idle
fastpaths on some architectures.

The following litmus test, adapted from one supplied off-list by Frederic
Weisbecker, models the RCU grace-period kthread detecting an idle CPU
that is concurrently transitioning to non-idle:

	C dynticks-from-idle

	{
		DYNTICKS=0; (* Initially idle. *)
	}

	P0(int *X, int *DYNTICKS)
	{
		int dynticks;
		int x;

		// Idle.
		dynticks = READ_ONCE(*DYNTICKS);
		smp_store_release(DYNTICKS, dynticks + 1);
		smp_mb();
		// Now non-idle
		x = READ_ONCE(*X);
	}

	P1(int *X, int *DYNTICKS)
	{
		int dynticks;

		WRITE_ONCE(*X, 1);
		smp_mb();
		dynticks = smp_load_acquire(DYNTICKS);
	}

	exists (1:dynticks=0 /\ 0:x=1)

Running "herd7 -conf linux-kernel.cfg dynticks-from-idle.litmus" verifies
this transition, namely, showing that if the RCU grace-period kthread (P1)
sees another CPU as idle (P0), then any memory access prior to the start
of the grace period (P1's write to X) will be seen by any RCU read-side
critical section following the to-non-idle transition (P0's read from X).
This is a straightforward use of full memory barriers to force ordering
in a store-buffering (SB) litmus test.

The following litmus test, also adapted from the one supplied off-list
by Frederic Weisbecker, models the RCU grace-period kthread detecting
a non-idle CPU that is concurrently transitioning to idle:

	C dynticks-into-idle

	{
		DYNTICKS=1; (* Initially non-idle. *)
	}

	P0(int *X, int *DYNTICKS)
	{
		int dynticks;

		// Non-idle.
		WRITE_ONCE(*X, 1);
		dynticks = READ_ONCE(*DYNTICKS);
		smp_store_release(DYNTICKS, dynticks + 1);
		smp_mb();
		// Now idle.
	}

	P1(int *X, int *DYNTICKS)
	{
		int x;
		int dynticks;

		smp_mb();
		dynticks = smp_load_acquire(DYNTICKS);
		x = READ_ONCE(*X);
	}

	exists (1:dynticks=2 /\ 1:x=0)

Running "herd7 -conf linux-kernel.cfg dynticks-into-idle.litmus" verifies
this transition, namely, showing that if the RCU grace-period kthread
(P1) sees another CPU as newly idle (P0), then any pre-idle memory access
(P0's write to X) will be seen by any code following the grace period
(P1's read from X).  This is a simple release-acquire pair forcing
ordering in a message-passing (MP) litmus test.

Of course, if the grace-period kthread detects the CPU as non-idle,
it will refrain from reporting a quiescent state on behalf of that CPU,
so there are no ordering requirements from the grace-period kthread in
that case.  However, other subsystems call rcu_is_idle_cpu() to check
for CPUs being non-idle from an RCU perspective.  That case is also
verified by the above litmus tests with the proviso that the sense of
the low-order bit of the DYNTICKS counter be inverted.

Unfortunately, on x86 smp_mb() is as expensive as a cache-local atomic
increment.  This commit therefore weakens only the read from ->dynticks.
However, the updates are abstracted into a rcu_dynticks_inc() function
to ease any future changes that might be needed.

[ paulmck: Apply Linus Torvalds feedback. ]

Link: https://lore.kernel.org/lkml/20210721202127.2129660-4-paulmck@kernel.org/
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:48 -07:00
Joel Fernandes (Google)
a86baa69c2 rcu: Remove special bit at the bottom of the ->dynticks counter
Commit b8c17e6664 ("rcu: Maintain special bits at bottom of ->dynticks
counter") reserved a bit at the bottom of the ->dynticks counter to defer
flushing of TLBs, but this facility never has been used.  This commit
therefore removes this capability along with the rcu_eqs_special_set()
function used to trigger it.

Link: https://lore.kernel.org/linux-doc/CALCETrWNPOOdTrFabTDd=H7+wc6xJ9rJceg6OL1S0rTV5pfSsA@mail.gmail.com/
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: "Joel Fernandes (Google)" <joel@joelfernandes.org>
[ paulmck: Forward-port to v5.13-rc1. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:48 -07:00
Frederic Weisbecker
cba712beeb rcu/nocb: Remove NOCB deferred wakeup from rcutree_dead_cpu()
At CPU offline time, we must handle any pending wakeup for the nocb_gp
kthread linked to the outgoing CPU.

Now we are making sure of that twice:

1) From rcu_report_dead() when the outgoing CPU makes the very last
   local cleanups by itself before switching offline.

2) From rcutree_dead_cpu(). Here the offlining CPU has gone and is truly
   now offline. Another CPU takes care of post-portem cleaning up and
   check if the offline CPU had pending wakeup.

Both ways are fine but we have to choose one or the other because we
don't need to repeat that action. Simply benefit from cache locality
and keep only the first solution.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-07-20 13:41:51 -07:00
Frederic Weisbecker
dfcb275402 rcu/nocb: Start moving nocb code to its own plugin file
The kernel/rcu/tree_plugin.h file contains not only the plugins for
preemptible RCU, but also many other features including rcu_nocbs
callback offloading.  This offloading has become large and complex,
so it is time to put it in its own file.

This commit starts that process.

Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[ paulmck: Rename to tree_nocb.h, add Frederic as author. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-07-20 13:41:51 -07:00
Linus Torvalds
28e92f9903 Merge branch 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:

 - Bitmap parsing support for "all" as an alias for all bits

 - Documentation updates

 - Miscellaneous fixes, including some that overlap into mm and lockdep

 - kvfree_rcu() updates

 - mem_dump_obj() updates, with acks from one of the slab-allocator
   maintainers

 - RCU NOCB CPU updates, including limited deoffloading

 - SRCU updates

 - Tasks-RCU updates

 - Torture-test updates

* 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (78 commits)
  tasks-rcu: Make show_rcu_tasks_gp_kthreads() be static inline
  rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states
  rcu: Add missing __releases() annotation
  rcu: Remove obsolete rcu_read_unlock() deadlock commentary
  rcu: Improve comments describing RCU read-side critical sections
  rcu: Create an unrcu_pointer() to remove __rcu from a pointer
  srcu: Early test SRCU polling start
  rcu: Fix various typos in comments
  rcu/nocb: Unify timers
  rcu/nocb: Prepare for fine-grained deferred wakeup
  rcu/nocb: Only cancel nocb timer if not polling
  rcu/nocb: Delete bypass_timer upon nocb_gp wakeup
  rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup
  rcu/nocb: Allow de-offloading rdp leader
  rcu/nocb: Directly call __wake_nocb_gp() from bypass timer
  rcu: Don't penalize priority boosting when there is nothing to boost
  rcu: Point to documentation of ordering guarantees
  rcu: Make rcu_gp_cleanup() be noinline for tracing
  rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs
  rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP
  ...
2021-07-04 12:58:33 -07:00
Andy Shevchenko
f39650de68 kernel.h: split out panic and oops helpers
kernel.h is being used as a dump for all kinds of stuff for a long time.
Here is the attempt to start cleaning it up by splitting out panic and
oops helpers.

There are several purposes of doing this:
- dropping dependency in bug.h
- dropping a loop by moving out panic_notifier.h
- unload kernel.h from something which has its own domain

At the same time convert users tree-wide to use new headers, although for
the time being include new header back to kernel.h to avoid twisted
indirected includes for existing users.

[akpm@linux-foundation.org: thread_info.h needs limits.h]
[andriy.shevchenko@linux.intel.com: ia64 fix]
  Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com

Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:04 -07:00
Paul E. McKenney
641faf1b90 Merge branches 'bitmaprange.2021.05.10c', 'doc.2021.05.10c', 'fixes.2021.05.13a', 'kvfree_rcu.2021.05.10c', 'mmdumpobj.2021.05.10c', 'nocb.2021.05.12a', 'srcu.2021.05.12a', 'tasks.2021.05.18a' and 'torture.2021.05.10c' into HEAD
bitmaprange.2021.05.10c: Allow "all" for bitmap ranges.
doc.2021.05.10c: Documentation updates.
fixes.2021.05.13a: Miscellaneous fixes.
kvfree_rcu.2021.05.10c: kvfree_rcu() updates.
mmdumpobj.2021.05.10c: mem_dump_obj() updates.
nocb.2021.05.12a: RCU NOCB CPU updates, including limited deoffloading.
srcu.2021.05.12a: SRCU updates.
tasks.2021.05.18a: Tasks-RCU updates.
torture.2021.05.10c: Torture-test updates.
2021-05-18 10:56:19 -07:00
Paul E. McKenney
cf868c2af2 rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states
Heavy networking load can cause a CPU to execute continuously and
indefinitely within ksoftirqd, in which case there will be no voluntary
task switches and thus no RCU-tasks quiescent states.  This commit
therefore causes the exiting rcu_softirq_qs() to provide an RCU-tasks
quiescent state.

This of course means that __do_softirq() and its callers cannot be
invoked from within a tracing trampoline.

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
2021-05-18 10:54:51 -07:00
Paul E. McKenney
1893afd634 rcu: Improve comments describing RCU read-side critical sections
There are a number of places that call out the fact that preempt-disable
regions of code now act as RCU read-side critical sections, where
preempt-disable regions of code include irq-disable regions of code,
bh-disable regions of code, hardirq handlers, and NMI handlers.  However,
someone relying solely on (for example) the call_rcu() header comment
might well have no idea that preempt-disable regions of code have RCU
semantics.

This commit therefore updates the header comments for
call_rcu(), synchronize_rcu(), rcu_dereference_bh_check(), and
rcu_dereference_sched_check() to call out these new(ish) forms of RCU
readers.

Reported-by: Michel Lespinasse <michel@lespinasse.org>
[ paulmck: Apply Matthew Wilcox and Michel Lespinasse feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-13 09:13:23 -07:00
Ingo Molnar
a616aec9aa rcu: Fix various typos in comments
Fix ~12 single-word typos in RCU code comments.

[ paulmck: Apply feedback from Randy Dunlap. ]
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-12 12:11:05 -07:00
Frederic Weisbecker
870905169d rcu/nocb: Prepare for fine-grained deferred wakeup
Tuning the deferred wakeup level must be done from a safe wakeup
point. Currently those sites are:

* ->nocb_timer
* user/idle/guest entry
* CPU down
* softirq/rcuc

All of these sites perform the wake up for both RCU_NOCB_WAKE and
RCU_NOCB_WAKE_FORCE.

In order to merge ->nocb_timer and ->nocb_bypass_timer together, we plan
to add a new RCU_NOCB_WAKE_BYPASS that really should be deferred until
a timer fires so that we don't wake up the NOCB-gp kthread too early.

To prepare for that, this commit specifies the per-callsite wakeup
level/limit.

Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[ paulmck: Fix non-NOCB rcu_nocb_need_deferred_wakeup() definition. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-12 12:10:23 -07:00
Paul E. McKenney
3d3a0d1b50 rcu: Point to documentation of ordering guarantees
Add comments to synchronize_rcu() and friends that point to
Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:54 -07:00
Paul E. McKenney
2f20de99a6 rcu: Make rcu_gp_cleanup() be noinline for tracing
Although there are trace events for RCU grace periods, these are only
enabled in CONFIG_RCU_TRACE=y kernels.  This commit therefore marks
rcu_gp_cleanup() noinline in order to provide a function that can be
traced that is invoked near the end of each grace period.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:54 -07:00
Paul E. McKenney
3ef5a1c382 rcu: Make RCU priority boosting work on single-CPU rcu_node structures
When any CPU comes online, it checks to see if an RCU-boost kthread has
already been created for that CPU's leaf rcu_node structure, and if
not, it creates one.  Unfortunately, it also verifies that this leaf
rcu_node structure actually has at least one online CPU, and if not,
it declines to create the kthread.  Although this behavior makes sense
during early boot, especially on systems that claim far more CPUs than
they actually have, it makes no sense for the first CPU to come online
for a given rcu_node structure.  There is no point in checking because
we know there is a CPU on its way in.

The problem is that timing differences can cause this incoming CPU to not
yet be reflected in the various bit masks even at rcutree_online_cpu()
time, and there is no chance at rcutree_prepare_cpu() time.  Plus it
would be better to create the RCU-boost kthread at rcutree_prepare_cpu()
to handle the case where the CPU is involved in an RCU priority inversion
very shortly after it comes online.

This commit therefore moves the checking to rcu_prepare_kthreads(), which
is called only at early boot, when the check is appropriate.  In addition,
it makes rcutree_prepare_cpu() invoke rcu_spawn_one_boost_kthread(), which
no longer does any checking for online CPUs.

With this change, RCU priority boosting tests now pass for short rcutorture
runs, even with single-CPU leaf rcu_node structures.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Scott Wood <swood@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:54 -07:00
Paul E. McKenney
8e4b1d2bc1 rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread()
Currently, rcu_spawn_core_kthreads() is invoked via an early_initcall(),
which works, except that rcu_spawn_gp_kthread() is also invoked via an
early_initcall() and rcu_spawn_core_kthreads() relies on adjustments to
kthread_prio that are carried out by rcu_spawn_gp_kthread().  There is
no guaranttee of ordering among early_initcall() handlers, and thus no
guarantee that kthread_prio will be properly checked and range-limited
at the time that rcu_spawn_core_kthreads() needs it.

In most cases, this bug is harmless.  After all, the only reason that
rcu_spawn_gp_kthread() adjusts the value of kthread_prio is if the user
specified a nonsensical value for this boot parameter, which experience
indicates is rare.

Nevertheless, a bug is a bug.  This commit therefore causes the
rcu_spawn_core_kthreads() function to be invoked directly from
rcu_spawn_gp_kthread() after any needed adjustments to kthread_prio have
been carried out.

Fixes: 48d07c04b4 ("rcu: Enable elimination of Tree-RCU softirq processing")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:54 -07:00
Zhouyi Zhou
277ffe1b70 rcu: Improve tree.c comments and add code cleanups
This commit cleans up some comments and code in kernel/rcu/tree.c.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:53 -07:00
Paul E. McKenney
ce7c169dee rcu: Remove the unused rcu_irq_exit_preempt() function
Commit 9ee01e0f69 ("x86/entry: Clean up idtentry_enter/exit()
leftovers") left the rcu_irq_exit_preempt() in place in order to avoid
conflicts with the -rcu tree.  Now that this change has long since hit
mainline, this commit removes the no-longer-used rcu_irq_exit_preempt()
function.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:22:53 -07:00
Frederic Weisbecker
b5befe842e srcu: Fix broken node geometry after early ssp init
An srcu_struct structure that is initialized before rcu_init_geometry()
will have its srcu_node hierarchy based on CONFIG_NR_CPUS.  Once
rcu_init_geometry() is called, this hierarchy is compressed as needed
for the actual maximum number of CPUs for this system.

Later on, that srcu_struct structure is confused, sometimes referring
to its initial CONFIG_NR_CPUS-based hierarchy, and sometimes instead
to the new num_possible_cpus() hierarchy.  For example, each of its
->mynode fields continues to reference the original leaf rcu_node
structures, some of which might no longer exist.  On the other hand,
srcu_for_each_node_breadth_first() traverses to the new node hierarchy.

There are at least two bad possible outcomes to this:

1) a) A callback enqueued early on an srcu_data structure (call it
      *sdp) is recorded pending on sdp->mynode->srcu_data_have_cbs in
      srcu_funnel_gp_start() with sdp->mynode pointing to a deep leaf
      (say 3 levels).

   b) The grace period ends after rcu_init_geometry() shrinks the
      nodes level to a single one.  srcu_gp_end() walks through the new
      srcu_node hierarchy without ever reaching the old leaves so the
      callback is never executed.

   This is easily reproduced on an 8 CPUs machine with CONFIG_NR_CPUS >= 32
   and "rcupdate.rcu_self_test=1". The srcu_barrier() after early tests
   verification never completes and the boot hangs:

	[ 5413.141029] INFO: task swapper/0:1 blocked for more than 4915 seconds.
	[ 5413.147564]       Not tainted 5.12.0-rc4+ #28
	[ 5413.151927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
	[ 5413.159753] task:swapper/0       state:D stack:    0 pid:    1 ppid:     0 flags:0x00004000
	[ 5413.168099] Call Trace:
	[ 5413.170555]  __schedule+0x36c/0x930
	[ 5413.174057]  ? wait_for_completion+0x88/0x110
	[ 5413.178423]  schedule+0x46/0xf0
	[ 5413.181575]  schedule_timeout+0x284/0x380
	[ 5413.185591]  ? wait_for_completion+0x88/0x110
	[ 5413.189957]  ? mark_held_locks+0x61/0x80
	[ 5413.193882]  ? mark_held_locks+0x61/0x80
	[ 5413.197809]  ? _raw_spin_unlock_irq+0x24/0x50
	[ 5413.202173]  ? wait_for_completion+0x88/0x110
	[ 5413.206535]  wait_for_completion+0xb4/0x110
	[ 5413.210724]  ? srcu_torture_stats_print+0x110/0x110
	[ 5413.215610]  srcu_barrier+0x187/0x200
	[ 5413.219277]  ? rcu_tasks_verify_self_tests+0x50/0x50
	[ 5413.224244]  ? rdinit_setup+0x2b/0x2b
	[ 5413.227907]  rcu_verify_early_boot_tests+0x2d/0x40
	[ 5413.232700]  do_one_initcall+0x63/0x310
	[ 5413.236541]  ? rdinit_setup+0x2b/0x2b
	[ 5413.240207]  ? rcu_read_lock_sched_held+0x52/0x80
	[ 5413.244912]  kernel_init_freeable+0x253/0x28f
	[ 5413.249273]  ? rest_init+0x250/0x250
	[ 5413.252846]  kernel_init+0xa/0x110
	[ 5413.256257]  ret_from_fork+0x22/0x30

2) An srcu_struct structure that is initialized before rcu_init_geometry()
   and used afterward will always have stale rdp->mynode references,
   resulting in callbacks to be missed in srcu_gp_end(), just like in
   the previous scenario.

This commit therefore causes init_srcu_struct_nodes to initialize the
geometry, if needed.  This ensures that the srcu_node hierarchy is
properly built and distributed from the get-go.

Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:03:35 -07:00
Frederic Weisbecker
8e9c01c717 srcu: Initialize SRCU after timers
Once srcu_init() is called, the SRCU core will make use of delayed
workqueues, which rely on timers.  However init_timers() is called
several steps after rcu_init().  This means that a call_srcu() after
rcu_init() but before init_timers() would find itself within a dangerously
uninitialized timer core.

This commit therefore creates a separate call to srcu_init() after
init_timer() completes, which ensures that we stay in early SRCU mode
until timers are safe(r).

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:03:35 -07:00
Uladzislau Rezki (Sony)
a78d4a2a10 kvfree_rcu: Refactor kfree_rcu_monitor()
Currently we have three functions which depend on each other.
Two of them are quite tiny and the last one where the most
work is done. All of them are related to queuing RCU batches
to reclaim objects after a GP.

1. kfree_rcu_monitor(). It consist of few lines. It acquires a spin-lock
   and calls kfree_rcu_drain_unlock().

2. kfree_rcu_drain_unlock(). It also consists of few lines of code. It
   calls queue_kfree_rcu_work() to queue the batch.  If this fails,
   it rearms the monitor work to try again later.

3. queue_kfree_rcu_work(). This provides the bulk of the functionality,
   attempting to start a new batch to free objects after a GP.

Since there are no external users of functions [2] and [3], both
can eliminated by moving all logic directly into [1], which both
shrinks and simplifies the code.

Also replace comments which start with "/*" to "//" format to make it
unified across the file.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Uladzislau Rezki (Sony)
d8628f35ba kvfree_rcu: Fix comments according to current code
The kvfree_rcu() function now defers allocations in the common
case due to the fact that there is no lockless access to the
memory-allocator caches/pools.  In addition, in CONFIG_PREEMPT_NONE=y
and in CONFIG_PREEMPT_VOLUNTARY=y kernels, there is no reliable way to
determine if spinlocks are held.  As a result, allocation is deferred in
the common case, and the two-argument form of kvfree_rcu() thus uses the
"channel 3" queue through all the rcu_head structures.  This channel
is called referred to as the emergency case in comments, and these
comments are now obsolete.

This commit therefore updates these comments to reflect the new
common-case nature of such emergencies.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Uladzislau Rezki (Sony)
7fe1da33f6 kvfree_rcu: Use kfree_rcu_monitor() instead of open-coded variant
Replace an open-coded version of the kfree_rcu_monitor() function body
with a call to that function.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Uladzislau Rezki (Sony)
dd28c9f057 kvfree_rcu: Update "monitor_todo" once a batch is started
Before attempting to start a new batch the "monitor_todo" variable is
set to "false" and set back to "true" when a previous RCU batch is still
in progress.  This is at best confusing.

Thus change this variable to "false" only when a new batch has been
successfully queued, otherwise, just leave it be.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Uladzislau Rezki (Sony)
d434c00fa3 kvfree_rcu: Add a bulk-list check when a scheduler is run
The rcu_scheduler_active flag is set to RCU_SCHEDULER_RUNNING once the
scheduler is up and running.  That signal is used in order to check
and queue a "monitor work" to reclaim freed objects (if there are any)
during early boot.  This flag is used by kvfree_rcu() to determine when
work can safely be queued, at which point memory passed to earlier
invocations of kvfree_rcu() can be processed.

However, only "krcp->head" is checked for objects that need to be
released, and there are now two more, namely, "krcp->bkvhead[0]" and
"krcp->bkvhead[1]".  Therefore, check these two additional channels.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Uladzislau Rezki (Sony)
ac7625ebd5 kvfree_rcu: Use [READ/WRITE]_ONCE() macros to access to nr_bkv_objs
nr_bkv_objs is a count of the objects in the kvfree_rcu page cache.
Accessing it requires holding the ->lock.  Switch to READ_ONCE() and
WRITE_ONCE() macros to provide lockless access to this counter.
This lockless access is used for the shrinker.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Zhang Qiang
d0bfa8b3c4 kvfree_rcu: Release a page cache under memory pressure
Add a drain_page_cache() function to drain a per-cpu page cache.
The reason behind of it is a system can run into a low memory
condition, in that case a page shrinker can ask for its users
to free their caches in order to get extra memory available for
other needs in a system.

When a system hits such condition, a page cache is drained for
all CPUs in a system. By default a page cache work is delayed
with 5 seconds interval until a memory pressure disappears, if
needed it can be changed. See a rcu_delay_page_cache_fill_msec
module parameter.

Co-developed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-05-10 16:00:48 -07:00
Paul E. McKenney
ab6ad3dbdd Merge branches 'bitmaprange.2021.03.08a', 'fixes.2021.03.15a', 'kvfree_rcu.2021.03.08a', 'mmdumpobj.2021.03.08a', 'nocb.2021.03.15a', 'poll.2021.03.24a', 'rt.2021.03.08a', 'tasks.2021.03.08a', 'torture.2021.03.08a' and 'torturescript.2021.03.22a' into HEAD
bitmaprange.2021.03.08a:  Allow 3-N for bitmap ranges.
fixes.2021.03.15a:  Miscellaneous fixes.
kvfree_rcu.2021.03.08a:  kvfree_rcu() updates.
mmdumpobj.2021.03.08a:  mem_dump_obj() updates.
nocb.2021.03.15a:  RCU NOCB CPU updates, including limited deoffloading.
poll.2021.03.24a:  Polling grace-period interfaces for RCU.
rt.2021.03.08a:  Realtime-related RCU changes.
tasks.2021.03.08a:  Tasks-RCU updates.
torture.2021.03.08a:  Torture-test updates.
torturescript.2021.03.22a:  Torture-test scripting updates.
2021-03-24 17:20:18 -07:00
Paul E. McKenney
7abb18bd75 rcu: Provide polling interfaces for Tree RCU grace periods
There is a need for a non-blocking polling interface for RCU grace
periods, so this commit supplies start_poll_synchronize_rcu() and
poll_state_synchronize_rcu() for this purpose.  Note that the existing
get_state_synchronize_rcu() may be used if future grace periods are
inevitable (perhaps due to a later call_rcu() invocation).  The new
start_poll_synchronize_rcu() is to be used if future grace periods
might not otherwise happen.  Finally, poll_state_synchronize_rcu()
provides a lockless check for a grace period having elapsed since
the corresponding call to either of the get_state_synchronize_rcu()
or start_poll_synchronize_rcu().

As with get_state_synchronize_rcu(), the return value from either
get_state_synchronize_rcu() or start_poll_synchronize_rcu() is passed in
to a later call to either poll_state_synchronize_rcu() or the existing
(might_sleep) cond_synchronize_rcu().

[ paulmck: Remove redundant smp_mb() per Frederic Weisbecker feedback. ]
[ Update poll_state_synchronize_rcu() docbook per Frederic Weisbecker feedback. ]
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-22 08:23:48 -07:00
Frederic Weisbecker
ec711bc12c rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
At the start of a CPU-hotplug operation, the incoming CPU's callback
list can be in a number of states:

1.	Disabled and empty.  This is the case when the boot CPU has
	not invoked call_rcu(), when a non-boot CPU first comes online,
	and when a non-offloaded CPU comes back online.  In this case,
	it is both necessary and permissible to initialize ->cblist.
	Because either the CPU is currently running with interrupts
	disabled (boot CPU) or is not yet running at all (other CPUs),
	it is not necessary to acquire ->nocb_lock.

	In this case, initialization is required.

2.	Disabled and non-empty.  This cannot occur, because early boot
	call_rcu() invocations enable the callback list before enqueuing
	their callback.

3.	Enabled, whether empty or not.	In this case, the callback
	list has already been initialized.  This case occurs when the
	boot CPU has executed an early boot call_rcu() and also when
	an offloaded CPU comes back online.  In both cases, there is
	no need to initialize the callback list: In the boot-CPU case,
	the CPU has not (yet) gone offline, and in the offloaded case,
	the rcuo kthreads are taking care of business.

	Because it is not necessary to initialize the callback list,
	it is also not necessary to acquire ->nocb_lock.

Therefore, checking if the segcblist is enabled suffices.  This commit
therefore initializes the callback list at rcutree_prepare_cpu() time
only if that list is disabled.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:20:22 -08:00
Frederic Weisbecker
64305db285 rcu/nocb: Forbid NOCB toggling on offline CPUs
It makes no sense to de-offload an offline CPU because that CPU will never
invoke any remaining callbacks.  It also makes little sense to offload an
offline CPU because any pending RCU callbacks were migrated when that CPU
went offline.  Yes, it is in theory possible to use a number of tricks
to permit offloading and deoffloading offline CPUs in certain cases, but
in practice it is far better to have the simple and deterministic rule
"Toggling the offload state of an offline CPU is forbidden".

For but one example, consider that an offloaded offline CPU might have
millions of callbacks queued.  Best to just say "no".

This commit therefore forbids toggling of the offloaded state of
offline CPUs.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:20:21 -08:00
Frederic Weisbecker
3820b513a2 rcu/nocb: Detect unsafe checks for offloaded rdp
Provide CONFIG_PROVE_RCU sanity checks to ensure we are always reading
the offloaded state of an rdp in a safe and stable way and prevent from
its value to be changed under us. We must either hold the barrier mutex,
the cpu-hotplug lock (read or write) or the nocb lock.
Local non-preemptible reads are also safe. NOCB kthreads and timers have
their own means of synchronization against the offloaded state updaters.

Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:20:20 -08:00
Uladzislau Rezki (Sony)
ee6ddf5847 kvfree_rcu: Use same set of GFP flags as does single-argument
Running an rcuscale stress-suite can lead to "Out of memory" of a
system. This can happen under high memory pressure with a small amount
of physical memory.

For example, a KVM test configuration with 64 CPUs and 512 megabytes
can result in OOM when running rcuscale with below parameters:

../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \
  rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make

<snip>
[   12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0
[   12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[   12.056485] Workqueue: events_highpri fill_page_cache_func
[   12.056485] Call Trace:
[   12.056485]  dump_stack+0x57/0x6a
[   12.056485]  dump_header+0x4c/0x30a
[   12.056485]  ? del_timer_sync+0x20/0x30
[   12.056485]  out_of_memory.cold.47+0xa/0x7e
[   12.056485]  __alloc_pages_slowpath.constprop.123+0x82f/0xc00
[   12.056485]  __alloc_pages_nodemask+0x289/0x2c0
[   12.056485]  __get_free_pages+0x8/0x30
[   12.056485]  fill_page_cache_func+0x39/0xb0
[   12.056485]  process_one_work+0x1ed/0x3b0
[   12.056485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  worker_thread+0x28/0x3c0
[   12.060485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  kthread+0x138/0x160
[   12.060485]  ? kthread_park+0x80/0x80
[   12.060485]  ret_from_fork+0x22/0x30
[   12.062156] Mem-Info:
[   12.062350] active_anon:0 inactive_anon:0 isolated_anon:0
[   12.062350]  active_file:0 inactive_file:0 isolated_file:0
[   12.062350]  unevictable:0 dirty:0 writeback:0
[   12.062350]  slab_reclaimable:2797 slab_unreclaimable:80920
[   12.062350]  mapped:1 shmem:2 pagetables:8 bounce:0
[   12.062350]  free:10488 free_pcp:1227 free_cma:0
...
[   12.101610] Out of memory and no killable processes...
[   12.102042] Kernel panic - not syncing: System is deadlocked on memory
[   12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
<snip>

Because kvfree_rcu() has a fallback path, memory allocation failure is
not the end of the world.  Furthermore, the added overhead of aggressive
GFP settings must be balanced against the overhead of the fallback path,
which is a cache miss for double-argument kvfree_rcu() and a call to
synchronize_rcu() for single-argument kvfree_rcu().  The current choice
of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call
to synchronize_rcu(), so less-tenacious GFP flags would be helpful.

Here is the tradeoff that must be balanced:
    a) Minimize use of the fallback path,
    b) Avoid pushing the system into OOM,
    c) Bound allocation latency to that of synchronize_rcu(), and
    d) Leave the emergency reserves to use cases lacking fallbacks.

This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to
GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN.  This combination
leaves the emergency reserves alone and can initiate reclaim, but will
not invoke the OOM killer.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:18:07 -08:00
Uladzislau Rezki (Sony)
3e7ce7a187 kvfree_rcu: Replace __GFP_RETRY_MAYFAIL by __GFP_NORETRY
__GFP_RETRY_MAYFAIL can spend quite a bit of time reclaiming, and this
can be wasted effort given that there is a fallback code path in case
memory allocation fails.

__GFP_NORETRY does perform some light-weight reclaim, but it will fail
under OOM conditions, allowing the fallback to be taken as an alternative
to hard-OOMing the system.

There is a four-way tradeoff that must be balanced:
    1) Minimize use of the fallback path;
    2) Avoid full-up OOM;
    3) Do a light-wait allocation request;
    4) Avoid dipping into the emergency reserves.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:18:07 -08:00