arpi-5.15
17853 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
670c6da635 | Merge branch 'android14-5.15' into arpi-5.15.92 | ||
|
|
f34aed1750 |
ANDROID: kasan: fix slab page check in complete_report_info
The backport commit |
||
|
|
e9e421392e | Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y | ||
|
|
e3d8fe0993 |
Merge 5.15.91 into android14-5.15
Changes in 5.15.91 memory: tegra: Remove clients SID override programming memory: atmel-sdramc: Fix missing clk_disable_unprepare in atmel_ramc_probe() memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe() dmaengine: ti: k3-udma: Do conditional decrement of UDMA_CHAN_RT_PEER_BCNT_REG arm64: dts: imx8mp-phycore-som: Remove invalid PMIC property ARM: dts: imx6ul-pico-dwarf: Use 'clock-frequency' ARM: dts: imx7d-pico: Use 'clock-frequency' ARM: dts: imx6qdl-gw560x: Remove incorrect 'uart-has-rtscts' arm64: dts: imx8mm-beacon: Fix ecspi2 pinmux ARM: imx: add missing of_node_put() HID: intel_ish-hid: Add check for ishtp_dma_tx_map arm64: dts: imx8mm-venice-gw7901: fix USB2 controller OC polarity soc: imx8m: Fix incorrect check for of_clk_get_by_name() reset: uniphier-glue: Use reset_control_bulk API reset: uniphier-glue: Fix possible null-ptr-deref EDAC/highbank: Fix memory leak in highbank_mc_probe() firmware: arm_scmi: Harden shared memory access in fetch_response firmware: arm_scmi: Harden shared memory access in fetch_notification tomoyo: fix broken dependency on *.conf.default RDMA/core: Fix ib block iterator counter overflow IB/hfi1: Reject a zero-length user expected buffer IB/hfi1: Reserve user expected TIDs IB/hfi1: Fix expected receive setup error exit issues IB/hfi1: Immediately remove invalid memory from hardware IB/hfi1: Remove user expected buffer invalidate race affs: initialize fsdata in affs_truncate() PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe() arm64: dts: qcom: msm8992: Don't use sfpb mutex arm64: dts: qcom: msm8992-libra: Add CPU regulators arm64: dts: qcom: msm8992-libra: Fix the memory map phy: ti: fix Kconfig warning and operator precedence NFSD: fix use-after-free in nfsd4_ssc_setup_dul() ARM: dts: at91: sam9x60: fix the ddr clock for sam9x60 amd-xgbe: TX Flow Ctrl Registers are h/w ver dependent amd-xgbe: Delay AN timeout during KR training bpf: Fix pointer-leak due to insufficient speculative store 
bypass mitigation phy: rockchip-inno-usb2: Fix missing clk_disable_unprepare() in rockchip_usb2phy_power_on() net: nfc: Fix use-after-free in local_cleanup() net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs net: enetc: avoid deadlock in enetc_tx_onestep_tstamp() sch_htb: Avoid grafting on htb_destroy_class_offload when destroying htb gpio: use raw spinlock for gpio chip shadowed data gpio: mxc: Protect GPIO irqchip RMW with bgpio spinlock gpio: mxc: Always set GPIOs used as interrupt source to INPUT mode wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid pinctrl/rockchip: Use temporary variable for struct device pinctrl/rockchip: add error handling for pull/drive register getters pinctrl: rockchip: fix reading pull type on rk3568 net: stmmac: Fix queue statistics reading net/sched: sch_taprio: fix possible use-after-free l2tp: Serialize access to sk_user_data with sk_callback_lock l2tp: Don't sleep and disable BH under writer-side sk_callback_lock l2tp: convert l2tp_tunnel_list to idr l2tp: close all race conditions in l2tp_tunnel_register() octeontx2-pf: Avoid use of GFP_KERNEL in atomic context net: usb: sr9700: Handle negative len net: mdio: validate parameter addr in mdiobus_get_phy() HID: check empty report_list in hid_validate_values() HID: check empty report_list in bigben_probe() net: stmmac: fix invalid call to mdiobus_get_phy() pinctrl: rockchip: fix mux route data for rk3568 HID: revert CHERRY_MOUSE_000C quirk usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait usb: gadget: f_fs: Ensure ep0req is dequeued before free_request Bluetooth: Fix possible deadlock in rfcomm_sk_state_change net: ipa: disable ipa interrupt during suspend net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT net: mlx5: eliminate anonymous module_init & module_exit drm/panfrost: fix GENERIC_ATOMIC64 dependency dmaengine: Fix double increment of client_count in dma_chan_get() net: macb: fix PTP TX timestamp failure 
due to packet padding virtio-net: correctly enable callback during start_xmit l2tp: prevent lockdep issue in l2tp_tunnel_register() HID: betop: check shape of output reports cifs: fix potential deadlock in cache_refresh_path() dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node() phy: phy-can-transceiver: Skip warning if no "max-bitrate" drm/amd/display: fix issues with driver unload nvme-pci: fix timeout request state check tcp: avoid the lookup process failing to get sk in ehash table octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt ptdma: pt_core_execute_cmd() should use spinlock device property: fix of node refcount leak in fwnode_graph_get_next_endpoint() w1: fix deadloop in __w1_remove_master_device() w1: fix WARNING after calling w1_process() driver core: Fix test_async_probe_init saves device in wrong array selftests/net: toeplitz: fix race on tpacket_v3 block close net: dsa: microchip: ksz9477: port map correction in ALU table entry register thermal/core: Remove duplicate information when an error occurs thermal/core: Rename 'trips' to 'num_trips' thermal: Validate new state in cur_state_store() thermal/core: fix error code in __thermal_cooling_device_register() thermal: core: call put_device() only after device_register() fails net: stmmac: enable all safety features by default tcp: fix rate_app_limited to default to 1 scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist kcsan: test: don't put the expect array on the stack cpufreq: Add SM6375 to cpufreq-dt-platdev blocklist ASoC: fsl_micfil: Correct the number of steps on SX controls net: usb: cdc_ether: add support for Thales Cinterion PLS62-W modem drm: Add orientation quirk for Lenovo ideapad D330-10IGL s390/debug: add _ASM_S390_ prefix to header guard s390: expicitly align _edata and _end symbols on page boundary perf/x86/msr: Add Emerald Rapids perf/x86/intel/uncore: Add Emerald 
Rapids cpufreq: armada-37xx: stop using 0 as NULL pointer ASoC: fsl_ssi: Rename AC'97 streams to avoid collisions with AC'97 CODEC ASoC: fsl-asoc-card: Fix naming of AC'97 CODEC widgets spi: spidev: remove debug messages that access spidev->spi without locking KVM: s390: interrupt: use READ_ONCE() before cmpxchg() scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id r8152: add vendor/device ID pair for Microsoft Devkit platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK lockref: stop doing cpu_relax in the cmpxchg loop firmware: coreboot: Check size of table entry and use flex-array drm/i915: Allow switching away via vga-switcheroo if uninitialized Revert "selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID" drm/i915: Remove unused variable x86: ACPI: cstate: Optimize C3 entry on AMD CPUs fs: reiserfs: remove useless new_opts in reiserfs_remount sysctl: add a new register_sysctl_init() interface kernel/panic: move panic sysctls to its own file panic: unset panic_on_warn inside panic() ubsan: no need to unset panic_on_warn in ubsan_epilogue() kasan: no need to unset panic_on_warn in end_report() exit: Add and use make_task_dead. 
objtool: Add a missing comma to avoid string concatenation hexagon: Fix function name in die() h8300: Fix build errors from do_exit() to make_task_dead() transition csky: Fix function name in csky_alignment() and die() ia64: make IA64_MCA_RECOVERY bool instead of tristate panic: Separate sysctl logic from CONFIG_SMP exit: Put an upper limit on how often we can oops exit: Expose "oops_count" to sysfs exit: Allow oops_limit to be disabled panic: Consolidate open-coded panic_on_warn checks panic: Introduce warn_limit panic: Expose "warn_count" to sysfs docs: Fix path paste-o for /sys/kernel/warn_count exit: Use READ_ONCE() for all oops/warn limit reads Bluetooth: hci_sync: cancel cmd_timer if hci_open failed drm/amdgpu: complete gfxoff allow signal during suspend without delay scsi: hpsa: Fix allocation size for scsi_host_alloc() KVM: SVM: fix tsc scaling cache logic module: Don't wait for GOING modules tracing: Make sure trace_printk() can output as soon as it can be used trace_events_hist: add check for return value of 'create_hist_field' ftrace/scripts: Update the instructions for ftrace-bisect.sh cifs: Fix oops due to uncleared server->smbd_conn in reconnect i2c: mv64xxx: Remove shutdown method from driver i2c: mv64xxx: Add atomic_xfer method to driver ksmbd: add smbd max io size parameter ksmbd: add max connections parameter ksmbd: do not sign response to session request for guest login ksmbd: downgrade ndr version error message to debug ksmbd: limit pdu length size according to connection status ovl: fail on invalid uid/gid mapping at copy up KVM: x86/vmx: Do not skip segment attributes if unusable bit is set KVM: arm64: GICv4.1: Fix race with doorbell on VPE activation/deactivation thermal: intel: int340x: Protect trip temperature from concurrent updates ipv6: fix reachability confirmation with proxy_ndp ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment EDAC/device: Respect any driver-supplied workqueue polling value EDAC/qcom: Do not pass 
llcc_driv_data as edac_device_ctl_info's pvt_info net: mana: Fix IRQ name - add PCI and queue number scsi: ufs: core: Fix devfreq deadlocks i2c: designware: use casting of u64 in clock multiplication to avoid overflow netlink: prevent potential spectre v1 gadgets net: fix UaF in netns ops registration error path drm/i915/selftest: fix intel_selftest_modify_policy argument types netfilter: nft_set_rbtree: Switch to node list walk for overlap detection netfilter: nft_set_rbtree: skip elements in transaction from garbage collection netlink: annotate data races around nlk->portid netlink: annotate data races around dst_portid and dst_group netlink: annotate data races around sk_state ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() ipv4: prevent potential spectre v1 gadget in fib_metrics_match() netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE netrom: Fix use-after-free of a listening socket. net/sched: sch_taprio: do not schedule in taprio_reset() sctp: fail if no bound addresses can be used for a given scope riscv/kprobe: Fix instruction simulation of JALR nvme: fix passthrough csi check gpio: mxc: Unlock on error path in mxc_flip_edge() ravb: Rename "no_ptp_cfg_active" and "ptp_cfg_active" variables net: ravb: Fix lack of register setting after system resumed for Gen3 net: ravb: Fix possible hang if RIS2_QFF1 happen net: mctp: mark socks as dead on unhash, prevent re-add thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type() net/tg3: resolve deadlock in tg3_reset_task() during EEH net: mdio-mux-meson-g12a: force internal PHY off on mux switch treewide: fix up files incorrectly marked executable tools: gpio: fix -c option of gpio-event-mon Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode" cpufreq: Move to_gov_attr_set() to cpufreq.h cpufreq: governor: Use kobject release() method to free dbs_data kbuild: Allow kernel installation packaging to override pkg-config block: fix and cleanup 
bio_check_ro x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL netfilter: conntrack: unify established states for SCTP paths perf/x86/amd: fix potential integer overflow on shift of a int Linux 5.15.91 Change-Id: I3349d802533097ac86e5c680fbd40c00c9719ec7 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
7b98914a6c |
panic: Consolidate open-coded panic_on_warn checks
commit 79cc1ba7badf9e7a12af99695a557e9ce27ee967 upstream.
Several run-time checkers (KASAN, UBSAN, KFENCE, KCSAN, sched) roll
their own warnings, and each check "panic_on_warn". Consolidate this
into a single function so that future instrumentation can be added in
a single location.
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Gow <davidgow@google.com>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20221117234328.594699-4-keescook@chromium.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
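The consolidation described in the commit above can be pictured with a small userspace sketch. It is not the kernel's code: `panic_on_warn_flag`, `fake_panic()`, `check_panic_on_warn_sketch()` and the two reporter functions are hypothetical stand-ins, and the stub returns to the caller whereas the kernel's panic() does not. The point is only that every checker calls one shared helper instead of testing the flag itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not the kernel's definitions. */
static bool panic_on_warn_flag;        /* models the panic_on_warn setting */
static int panic_calls;                /* how many times the stub panic ran */
static const char *last_origin = "";   /* which checker triggered it last */

/* Stub: the real panic() never returns; this one only records the call. */
static void fake_panic(const char *origin)
{
	panic_calls++;
	last_origin = origin;
	printf("%s: panic_on_warn set\n", origin);
}

/* The single shared check. Future instrumentation would be added here. */
static void check_panic_on_warn_sketch(const char *origin)
{
	if (panic_on_warn_flag)
		fake_panic(origin);
}

/* Two hypothetical checkers that no longer open-code the flag test. */
static void kasan_report_sketch(void)
{
	puts("KASAN: bad access reported");
	check_panic_on_warn_sketch("KASAN");
}

static void ubsan_report_sketch(void)
{
	puts("UBSAN: undefined behaviour reported");
	check_panic_on_warn_sketch("UBSAN");
}
```

With the flag clear, a report only prints its warning. With the flag set, the same report also reaches the panic path through the shared helper, and the origin string identifies the checker.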
|
|
b5c1acaa43 |
kasan: no need to unset panic_on_warn in end_report()
commit e7ce7500375a63348e1d3a703c8d5003cbe3fea6 upstream.
panic_on_warn is unset inside panic(), so no need to unset it before
calling panic() in end_report().
Link: https://lkml.kernel.org/r/1644324666-15947-6-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
bb12d18e60 | Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y | ||
|
|
17126d43d4 |
Merge 5.15.90 into android14-5.15
Changes in 5.15.90
btrfs: fix trace event name typo for FLUSH_DELAYED_REFS
pNFS/filelayout: Fix coalescing test for single DS
selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID
tools/virtio: initialize spinlocks in vring_test.c
virtio_pci: modify ENOENT to EINVAL
vduse: Validate vq_num in vduse_validate_config()
net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats
r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
RDMA/srp: Move large values to a new enum for gcc13
btrfs: always report error in run_one_delayed_ref()
x86/asm: Fix an assembler warning with current binutils
f2fs: let's avoid panic if extent_tree is not created
perf/x86/rapl: Treat Tigerlake like Icelake
fbdev: omapfb: avoid stack overflow warning
Bluetooth: hci_qca: Fix driver shutdown on closed serdev
wifi: brcmfmac: fix regression for Broadcom PCIe wifi devices
wifi: mac80211: sdata can be NULL during AMPDU start
Add exception protection processing for vd in axi_chan_handle_err function
zonefs: Detect append writes at invalid locations
nilfs2: fix general protection fault in nilfs_btree_insert()
efi: fix userspace infinite retry read efivars after EFI runtime services page fault
ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform
drm/amdgpu: disable runtime pm on several sienna cichlid cards(v2)
drm/amd: Delay removal of the firmware framebuffer
hugetlb: unshare some PMDs when splitting VMAs
io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL
eventpoll: add EPOLL_URING_WAKE poll wakeup flag
eventfd: provide a eventfd_signal_mask() helper
io_uring: pass in EPOLL_URING_WAKE for eventfd signaling and wakeups
io_uring: improve send/recv error handling
io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
io_uring: add flag for disabling provided buffer recycling
io_uring: support MSG_WAITALL for IORING_OP_SEND(MSG)
io_uring: allow re-poll if we made progress
io_uring: fix async accept on O_NONBLOCK sockets
io_uring: ensure that cached task references are always put on exit
io_uring: remove duplicated calls to io_kiocb_ppos
io_uring: update kiocb->ki_pos at execution time
io_uring: do not recalculate ppos unnecessarily
io_uring/rw: defer fsnotify calls to task context
xhci-pci: set the dma max_seg_size
usb: xhci: Check endpoint is valid before dereferencing it
xhci: Fix null pointer dereference when host dies
xhci: Add update_hub_device override for PCI xHCI hosts
xhci: Add a flag to disable USB3 lpm on a xhci root port level.
usb: acpi: add helper to check port lpm capability using acpi _DSM
xhci: Detect lpm incapable xHC USB3 roothub ports from ACPI tables
prlimit: do_prlimit needs to have a speculation check
USB: serial: option: add Quectel EM05-G (GR) modem
USB: serial: option: add Quectel EM05-G (CS) modem
USB: serial: option: add Quectel EM05-G (RS) modem
USB: serial: option: add Quectel EC200U modem
USB: serial: option: add Quectel EM05CN (SG) modem
USB: serial: option: add Quectel EM05CN modem
staging: vchiq_arm: fix enum vchiq_status return types
USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100
misc: fastrpc: Don't remove map on creater_process and device_release
misc: fastrpc: Fix use-after-free race condition for maps
usb: core: hub: disable autosuspend for TI TUSB8041
comedi: adv_pci1760: Fix PWM instruction handling
ACPI: PRM: Check whether EFI runtime is available
mmc: sunxi-mmc: Fix clock refcount imbalance during unbind
mmc: sdhci-esdhc-imx: correct the tuning start tap and step setting
btrfs: do not abort transaction on failure to write log tree when syncing log
btrfs: fix race between quota rescan and disable leading to NULL pointer deref
cifs: do not include page data when checking signature
thunderbolt: Use correct function to calculate maximum USB3 link rate
riscv: dts: sifive: fu740: fix size of pcie 32bit memory
bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD
staging: mt7621-dts: change some node hex addresses to lower case
tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO buffer
tty: fix possible null-ptr-defer in spk_ttyio_release
USB: gadgetfs: Fix race between mounting and unmounting
USB: serial: cp210x: add SCALANCE LPE-9000 device id
usb: cdns3: remove fetched trb from cache before dequeuing
usb: host: ehci-fsl: Fix module alias
usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail
usb: typec: altmodes/displayport: Add pin assignment helper
usb: typec: altmodes/displayport: Fix pin assignment calculation
usb: gadget: g_webcam: Send color matching descriptor per frame
usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate()
usb-storage: apply IGNORE_UAS only for HIKSEMI MD202 on RTL9210
dt-bindings: phy: g12a-usb2-phy: fix compatible string documentation
dt-bindings: phy: g12a-usb3-pcie-phy: fix compatible string documentation
serial: pch_uart: Pass correct sg to dma_unmap_sg()
dmaengine: lgm: Move DT parsing after initialization
dmaengine: tegra210-adma: fix global intr clear
dmaengine: idxd: Let probe fail when workqueue cannot be enabled
serial: amba-pl011: fix high priority character transmission in rs486 mode
serial: atmel: fix incorrect baudrate setup
gsmi: fix null-deref in gsmi_get_variable
mei: me: add meteor lake point M DID
drm/i915: re-disable RC6p on Sandy Bridge
drm/i915/display: Check source height is > 0
drm/amd/display: Fix set scaling doesn's work
drm/amd/display: Calculate output_color_space after pixel encoding adjustment
drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrix
drm/amdgpu: drop experimental flag on aldebaran
fs/ntfs3: Fix attr_punch_hole() null pointer derenference
arm64: efi: Execute runtime services from a dedicated stack
efi: rt-wrapper: Add missing include
Revert "drm/amdgpu: make display pinning more flexible (v2)"
x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN
tracing: Use alignof__(struct {type b;}) instead of offsetof()
io_uring: io_kiocb_update_pos() should not touch file for non -1 offset
io_uring/net: fix fast_iov assignment in io_setup_async_msg()
net/ulp: use consistent error code when blocking ULP
net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work()
block: mq-deadline: Rename deadline_is_seq_writes()
Revert "wifi: mac80211: fix memory leak in ieee80211_if_add()"
soc: qcom: apr: Make qcom,protection-domain optional again
mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
io_uring: Clean up a false-positive warning from GCC 9.3.0
io_uring: fix double poll leak on repolling
io_uring/rw: ensure kiocb_end_write() is always called
io_uring/rw: remove leftover debug statement
Linux 5.15.90
Change-Id: I5abc2e695f7183a1d3be9d8f62633bb1df8e8a48
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
6d926a9aae |
FROMLIST: kasan: reset page tags properly with sampling
[The patch is in the mm-unstable tree.]
The implementation of page_alloc poisoning sampling assumed that
tag_clear_highpage resets page tags for __GFP_ZEROTAGS allocations.
However, this is no longer the case since commit 70c248aca9e7
("mm: kasan: Skip unpoisoning of user pages").
This leads to kernel crashes when MTE-enabled userspace mappings are
used with Hardware Tag-Based KASAN enabled.
Reset page tags for __GFP_ZEROTAGS allocations in post_alloc_hook().
Also clarify and fix related comments.
Fixes: 44383cef54c0 ("kasan: allow sampling page_alloc allocations for HW_TAGS")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Peter Collingbourne <pcc@google.com>
Tested-by: Peter Collingbourne <pcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 238286329
Bug: 264310057
Link: https://lore.kernel.org/all/5dbd866714b4839069e2d8469ac45b60953db290.1674592780.git.andreyknvl@google.com/
Change-Id: Iea4234bcf7e35337c8063827b07039583bca9c66
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
|
||
|
|
25e257e5d5 |
FROMGIT: kasan: allow sampling page_alloc allocations for HW_TAGS
[The patch is in mm-stable tree.]
As Hardware Tag-Based KASAN is intended to be used in production, its
performance impact is crucial. As page_alloc allocations tend to be big,
tagging and checking all such allocations can introduce a significant
slowdown.
Add two new boot parameters that allow to alleviate that slowdown:
- kasan.page_alloc.sample, which makes Hardware Tag-Based KASAN tag only
  every Nth page_alloc allocation with the order configured by the second
  added parameter (default: tag every such allocation).
- kasan.page_alloc.sample.order, which makes sampling enabled by the first
  parameter only affect page_alloc allocations with the order equal or
  greater than the specified value (default: 3, see below).
The exact performance improvement caused by using the new parameters
depends on their values and the applied workload.
The chosen default value for kasan.page_alloc.sample.order is 3, which
matches both PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER. This is done
for two reasons:
1. PAGE_ALLOC_COSTLY_ORDER is "the order at which allocations are deemed
   costly to service", which corresponds to the idea that only large and
   thus costly allocations are supposed to be sampled.
2. One of the workloads targeted by this patch is a benchmark that sends
   a large amount of data over a local loopback connection. Most multi-page
   data allocations in the networking subsystem have the order of
   SKB_FRAG_PAGE_ORDER (or PAGE_ALLOC_COSTLY_ORDER).
When running a local loopback test on a testing MTE-enabled device in sync
mode, enabling Hardware Tag-Based KASAN introduces a ~50% slowdown.
Applying this patch and setting kasan.page_alloc.sample to a value higher
than 1 allows to lower the slowdown. The performance improvement saturates
around the sampling interval value of 10 with the default sampling page
order of 3. This lowers the slowdown to ~20%. The slowdown in real
scenarios involving the network will likely be better.
Enabling page_alloc sampling has a downside: KASAN misses bad accesses to
a page_alloc allocation that has not been tagged. This lowers the value of
KASAN as a security mitigation.
However, based on measuring the number of page_alloc allocations of
different orders during boot in a test build, sampling with the default
kasan.page_alloc.sample.order value affects only ~7% of allocations. The
rest ~93% of allocations are still checked deterministically.
Link: https://lkml.kernel.org/r/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Brand <markbrand@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 238286329
Bug: 264310057
(cherry picked from commit 44383cef54c0ce1201f884d83cc2b367bc5aa4f7
 git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-stable)
Change-Id: I85f9eb4e93eeddff8f8d06238f433226affca177
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
|
||
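The sampling rule described in the commit above (tag every allocation below the configured order, and only every Nth allocation at or above it) might look roughly like the following userspace sketch. The names `sample_interval`, `sample_order`, `skip_counter` and `should_tag_page_alloc()` are hypothetical, and a single global counter is an assumption made for brevity; this is an illustration of the rule, not the patch's implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two boot parameters. */
static unsigned long sample_interval = 1; /* models kasan.page_alloc.sample */
static unsigned int sample_order = 3;     /* models kasan.page_alloc.sample.order */

/* Allocations still to skip before the next tagged one (assumed global). */
static long skip_counter;

/* Returns true when an allocation of the given order should be tagged. */
static bool should_tag_page_alloc(unsigned int order)
{
	/* Interval 1 means sampling is off: tag every allocation. */
	if (sample_interval == 1)
		return true;

	/* Small allocations are always tagged and do not touch the counter. */
	if (order < sample_order)
		return true;

	/* Tag one allocation, then skip the next (interval - 1) of them. */
	if (--skip_counter < 0) {
		skip_counter = (long)sample_interval - 1;
		return true;
	}
	return false;
}
```

Under these assumptions, an interval of 10 tags one in ten allocations of order 3 or higher, while allocations of order 2 or lower are still all tagged, which matches the commit's remark that most allocations stay deterministically checked.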
|
|
940e8922c1 |
mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
commit ab0c3f1251b4670978fde0bd54161795a139b060 upstream.
uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd,
when removing a breakpoint from hugepage text: vma->anon_vma is always set
in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able
to collapse some page tables in a vma which happens to have anon_vma set
from CoWing elsewhere.
Is anon_vma lock required? Almost not: if any page other than expected
subpage of the non-anon huge page is found in the page table, collapse is
aborted without making any change. However, it is possible that an anon
page was CoWed from this extent in another mm or vma, in which case a
concurrent lookup might look here: so keep it away while clearing pmd (but
perhaps we shall go back to using pmd_lock() there in future).
Note that collapse_pte_mapped_thp() is exceptional in freeing a page table
without having cleared its ptes: I'm uneasy about that, and had thought
pte_clear()ing appropriate; but exclusive i_mmap lock does fix the problem,
and we would have to move the mmu_notification if clearing those ptes.
What this fixes is not a dangerous instability. But I suggest Cc stable
because uprobes "healing" has regressed in that way, so this should follow
8d3c106e19e8 into those stable releases where it was backported (and may
want adjustment there - I'll supply backports as needed).
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com
Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
bd9a23a4bb |
hugetlb: unshare some PMDs when splitting VMAs
[ Upstream commit b30c14cd61025eeea2f2e8569606cd167ba9ad2d ]
PMD sharing can only be done in PUD_SIZE-aligned pieces of VMAs; however,
it is possible that HugeTLB VMAs are split without unsharing the PMDs
first.
Without this fix, it is possible to hit the uffd-wp-related WARN_ON_ONCE
in hugetlb_change_protection [1]. The key there is that
hugetlb_unshare_all_pmds will not attempt to unshare PMDs in
non-PUD_SIZE-aligned sections of the VMA.
It might seem ideal to unshare in hugetlb_vm_op_open, but we need to
unshare in both the new and old VMAs, so unsharing in hugetlb_vm_op_split
seems natural.
[1]: https://lore.kernel.org/linux-mm/CADrL8HVeOkj0QH5VZZbRzybNE8CG-tEGFshnA+bG9nMgcWtBSg@mail.gmail.com/
Link: https://lkml.kernel.org/r/20230104231910.1464197-1-jthoughton@google.com
Fixes:
|
||
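The alignment constraint in the commit above can be made concrete with a little address arithmetic. The sketch below is an illustration only: it assumes a 1 GiB PUD_SIZE (as on x86-64 with 4 KiB pages), and `PUD_SIZE_SKETCH`, `pud_aligned_part()` and `split_needs_unshare()` are hypothetical names, not kernel functions. It shows which part of a VMA could hold shared PMDs, and when a split address cuts through such a part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 1 GiB PUD size, as on x86-64 with 4 KiB pages. */
#define PUD_SIZE_SKETCH (UINT64_C(1) << 30)
#define PUD_MASK_SKETCH (~(PUD_SIZE_SKETCH - 1))

struct range {
	uint64_t start; /* inclusive */
	uint64_t end;   /* exclusive; the range is empty when start >= end */
};

/* Largest PUD_SIZE-aligned piece of [start, end): round start up, end down. */
static struct range pud_aligned_part(uint64_t start, uint64_t end)
{
	struct range r;

	r.start = (start + PUD_SIZE_SKETCH - 1) & PUD_MASK_SKETCH;
	r.end = end & PUD_MASK_SKETCH;
	return r;
}

/*
 * True when splitting the VMA [vma_start, vma_end) at addr would cut
 * through a PUD_SIZE-aligned piece, i.e. one where PMDs may be shared.
 */
static bool split_needs_unshare(uint64_t vma_start, uint64_t vma_end,
				uint64_t addr)
{
	uint64_t floor, ceil;

	/* A split exactly on a PUD boundary leaves both halves aligned. */
	if ((addr & ~PUD_MASK_SKETCH) == 0)
		return false;

	floor = addr & PUD_MASK_SKETCH;
	ceil = floor + PUD_SIZE_SKETCH;

	/* Only a piece lying fully inside the VMA could have been shared. */
	return floor >= vma_start && ceil <= vma_end;
}
```

For a VMA spanning 0 to 4 GiB, a split at 1 GiB plus one page falls inside the aligned piece from 1 GiB to 2 GiB, so that piece would need unsharing in this model; a split at exactly 2 GiB would not.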
|
|
f95873c0e5 |
Revert "FROMLIST: kasan: allow sampling page_alloc allocations for HW_TAGS"
This reverts commit
|
||
|
|
0fd420005c |
Merge 5.15.89 into android14-5.15
Changes in 5.15.89 netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits ALSA: control-led: use strscpy in set_led_id() ALSA: hda/realtek - Turn on power early ALSA: hda/realtek: Enable mute/micmute LEDs on HP Spectre x360 13-aw0xxx KVM: arm64: Fix S1PTW handling on RO memslots KVM: arm64: nvhe: Fix build with profile optimization selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c efi: tpm: Avoid READ_ONCE() for accessing the event log docs: Fix the docs build with Sphinx 6.0 net: stmmac: add aux timestamps fifo clearance wait perf auxtrace: Fix address filter duplicate symbol selection s390/kexec: fix ipl report address for kdump ASoC: qcom: lpass-cpu: Fix fallback SD line index handling s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple() drm/virtio: Fix GEM handle creation UAF drm/i915/gt: Reset twice net/mlx5e: Set action fwd flag when parsing tc action goto cifs: Fix uninitialized memory read for smb311 posix symlink create platform/x86: dell-privacy: Only register SW_CAMERA_LENS_COVER if present platform/surface: aggregator: Ignore command messages not intended for us platform/x86: dell-privacy: Fix SW_CAMERA_LENS_COVER reporting dt-bindings: msm: dsi-controller-main: Fix operating-points-v2 constraint drm/msm/adreno: Make adreno quirks not overwrite each other dt-bindings: msm: dsi-controller-main: Fix power-domain constraint dt-bindings: msm: dsi-controller-main: Fix description of core clock dt-bindings: msm: dsi-phy-28nm: Add missing qcom, dsi-phy-regulator-ldo-mode platform/x86: ideapad-laptop: Add Legion 5 15ARH05 DMI id to set_fn_lock_led_list[] drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer dt-bindings: msm/dsi: Don't require vdds-supply on 10nm PHY dt-bindings: msm/dsi: Don't require vcca-supply on 14nm PHY platform/x86: sony-laptop: Don't turn off 0x153 keyboard backlight during probe ixgbe: fix pci device 
refcount leak ipv6: raw: Deduct extension header length in rawv6_push_pending_frames bus: mhi: host: Fix race between channel preparation and M0 event usb: ulpi: defer ulpi_register on ulpi_read_id timeout iommu/iova: Fix alloc iova overflows issue iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() sched/core: Fix use-after-free bug in dup_user_cpus_ptr() netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function. powerpc/imc-pmu: Fix use of mutex in IRQs disabled section x86/boot: Avoid using Intel mnemonics in AT&T syntax asm EDAC/device: Fix period calculation in edac_device_reset_delay_period() x86/resctrl: Fix task CLOSID/RMID update race regulator: da9211: Use irq handler when ready scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile scsi: ufs: Stop using the clock scaling lock in the error handler scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery ASoC: wm8904: fix wrong outputs volume after power reactivation ALSA: usb-audio: Make sure to stop endpoints before closing EPs ALSA: usb-audio: Relax hw constraints for implicit fb sync tipc: fix unexpected link reset due to discovery messages octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable hvc/xen: lock console list traversal nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame() af_unix: selftest: Fix the size of the parameter to connect() tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` tools/nolibc: use pselect6 on RISCV tools/nolibc/std: move the standard type definitions to std.h tools/nolibc/types: split syscall-specific definitions into their own files tools/nolibc/arch: split arch-specific code into individual files tools/nolibc/arch: mark the _start symbol as weak tools/nolibc: Remove .global _start from the entry point code tools/nolibc: restore mips branch ordering in the _start block tools/nolibc: fix the O_* fcntl/open macro definitions for riscv 
net/sched: act_mpls: Fix warning during failed attribute validation net/mlx5: Fix ptp max frequency adjustment range net/mlx5e: Don't support encap rules with gbp option perf build: Properly guard libbpf includes igc: Fix PPS delta between two synchronized end-points platform/surface: aggregator: Add missing call to ssam_request_sync_free() mm: Always release pages to the buddy allocator in memblock_free_late(). Documentation: KVM: add API issues section KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID io_uring: lock overflowing for IOPOLL arm64: atomics: format whitespace consistently arm64: atomics: remove LL/SC trampolines arm64: cmpxchg_double*: hazard against entire exchange variable efi: fix NULL-deref in init error path scsi: mpt3sas: Remove scsi_dma_map() error messages io_uring/io-wq: free worker if task_work creation is canceled io_uring/io-wq: only free worker if it was allocated for creation block: handle bio_split_to_limits() NULL return Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout" pinctrl: amd: Add dynamic debugging for active GPIOs Linux 5.15.89 Change-Id: I66c4f269aa7751b2e4aac919f822dfdcb844a69d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
efae08b654 |
Merge 5.15.87 into android14-5.15
Changes in 5.15.87
usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init
cifs: fix oops during encryption
Revert "selftests/bpf: Add test for unstable CT lookup API"
nvme-pci: fix doorbell buffer value endianness
nvme-pci: fix mempool alloc size
nvme-pci: fix page size checks
ACPI: resource: Skip IRQ override on Asus Vivobook K3402ZA/K3502ZA
ACPI: resource: do IRQ override on LENOVO IdeaPad
ACPI: resource: do IRQ override on XMG Core 15
ACPI: resource: do IRQ override on Lenovo 14ALC7
block, bfq: fix uaf for bfqq in bfq_exit_icq_bfqq
ata: ahci: Fix PCS quirk application for suspend
nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
nvmet: don't defer passthrough commands with trivial effects to the workqueue
fs/ntfs3: Validate BOOT record_size
fs/ntfs3: Add overflow check for attribute size
fs/ntfs3: Validate data run offset
fs/ntfs3: Add null pointer check to attr_load_runs_vcn
fs/ntfs3: Fix memory leak on ntfs_fill_super() error path
fs/ntfs3: Add null pointer check for inode operations
fs/ntfs3: Validate attribute name offset
fs/ntfs3: Validate buffer length while parsing index
fs/ntfs3: Validate resident attribute name
fs/ntfs3: Fix slab-out-of-bounds read in run_unpack
soundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15
fs/ntfs3: Validate index root when initialize NTFS security
fs/ntfs3: Use __GFP_NOWARN allocation at wnd_init()
fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_fill_super()
fs/ntfs3: Delete duplicate condition in ntfs_read_mft()
fs/ntfs3: Fix slab-out-of-bounds in r_page
objtool: Fix SEGFAULT
powerpc/rtas: avoid device tree lookups in rtas_os_term()
powerpc/rtas: avoid scheduling in rtas_os_term()
HID: multitouch: fix Asus ExpertBook P2 P2451FA trackpoint
HID: plantronics: Additional PIDs for double volume key presses quirk
pstore: Properly assign mem_type property
pstore/zone: Use GFP_ATOMIC to allocate zone buffer
hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
binfmt: Fix error return code in load_elf_fdpic_binary()
ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
ALSA: line6: correct midi status byte when receiving data from podxt
ALSA: line6: fix stack overflow in line6_midi_transmit
pnode: terminate at peers of source
mfd: mt6360: Add bounds checking in Regmap read/write call-backs
md: fix a crash in mempool_free
mm, compaction: fix fast_isolate_around() to stay within boundaries
f2fs: should put a page when checking the summary info
f2fs: allow to read node block after shutdown
mmc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING
tpm: acpi: Call acpi_put_table() to fix memory leak
tpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak
tpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak
SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
kcsan: Instrument memcpy/memset/memmove with newer Clang
ASoC: Intel/SOF: use set_stream() instead of set_tdm_slots() for HDAudio
ASoC/SoundWire: dai: expand 'stream' concept beyond SoundWire
rcu-tasks: Simplify trc_read_check_handler() atomic operations
net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO
net/af_packet: make sure to pull mac header
media: stv0288: use explicitly signed char
soc: qcom: Select REMAP_MMIO for LLCC driver
kest.pl: Fix grub2 menu handling for rebooting
ktest.pl minconfig: Unset configs instead of just removing them
jbd2: use the correct print format
perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D
perf/x86/intel/uncore: Clear attr_update properly
arm64: dts: qcom: sdm845-db845c: correct SPI2 pins drive strength
mmc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K
btrfs: fix resolving backrefs for inline extent followed by prealloc
ARM: ux500: do not directly dereference __iomem
arm64: dts: qcom: sdm850-lenovo-yoga-c630: correct I2C12 pins drive strength
selftests: Use optional USERCFLAGS and USERLDFLAGS
PM/devfreq: governor: Add a private governor_data for governor
cpufreq: Init completion before kobject_init_and_add()
ALSA: patch_realtek: Fix Dell Inspiron Plus 16
ALSA: hda/realtek: Apply dual codec fixup for Dell Latitude laptops
fs: dlm: fix sock release if listen fails
fs: dlm: retry accept() until -EAGAIN or error returns
mptcp: mark ops structures as ro_after_init
mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort
dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata
dm thin: Use last transaction's pmd->root when commit failed
dm thin: resume even if in FAIL mode
dm thin: Fix UAF in run_timer_softirq()
dm integrity: Fix UAF in dm_integrity_dtr()
dm clone: Fix UAF in clone_dtr()
dm cache: Fix UAF in destroy()
dm cache: set needs_check flag after aborting metadata
tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx'
perf/core: Call LSM hook after copying perf_event_attr
of/kexec: Fix reading 32-bit "linux,initrd-{start,end}" values
KVM: VMX: Resume guest immediately when injecting #GP on ECREATE
KVM: nVMX: Inject #GP, not #UD, if "generic" VMXON CR0/CR4 check fails
KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
x86/microcode/intel: Do not retry microcode reloading on the APs
ftrace/x86: Add back ftrace_expected for ftrace bug reports
x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK
x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK
tracing: Fix race where eprobes can be called before the event
tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE
tracing/hist: Fix wrong return value in parse_action_params()
tracing/probes: Handle system names with hyphens
tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line
staging: media: tegra-video: fix chan->mipi value on error
staging: media: tegra-video: fix device_node use after free
ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod
media: dvb-core: Fix double free in dvb_register_device()
media: dvb-core: Fix UAF due to refcount races at releasing
cifs: fix confusing debug message
cifs: fix missing display of three mount options
rtc: ds1347: fix value written to century register
block: mq-deadline: Do not break sequential write streams to zoned HDDs
md/bitmap: Fix bitmap chunk size overflow issues
efi: Add iMac Pro 2017 to uefi skip cert quirk
wifi: wilc1000: sdio: fix module autoloading
ASoC: jz4740-i2s: Handle independent FIFO flush bits
ipu3-imgu: Fix NULL pointer dereference in imgu_subdev_set_selection()
ipmi: fix long wait in unload when IPMI disconnect
mtd: spi-nor: Check for zero erase size in spi_nor_find_best_erase_type()
ima: Fix a potential NULL pointer access in ima_restore_measurement_list
ipmi: fix use after free in _ipmi_destroy_user()
PCI: Fix pci_device_is_present() for VFs by checking PF
PCI/sysfs: Fix double free in error path
riscv: stacktrace: Fixup ftrace_graph_ret_addr retp argument
riscv: mm: notify remote harts about mmu cache updates
crypto: n2 - add missing hash statesize
crypto: ccp - Add support for TEE for PCI ID 0x14CA
driver core: Fix bus_type.match() error handling in __driver_attach()
phy: qcom-qmp-combo: fix sc8180x reset
iommu/amd: Fix ivrs_acpihid cmdline parsing code
remoteproc: core: Do pm_relax when in RPROC_OFFLINE state
parisc: led: Fix potential null-ptr-deref in start_task()
device_cgroup: Roll back to original exceptions after copy failure
drm/connector: send hotplug uevent on connector cleanup
drm/vmwgfx: Validate the box size for the snooped cursor
drm/i915/dsi: fix VBT send packet port selection for dual link DSI
drm/ingenic: Fix missing platform_driver_unregister() call in ingenic_drm_init()
ext4: silence the warning when evicting inode with dioread_nolock
ext4: add inode table check in __ext4_get_inode_loc to aovid possible infinite loop
ext4: remove trailing newline from ext4_msg() message
fs: ext4: initialize fsdata in pagecache_write()
ext4: fix use-after-free in ext4_orphan_cleanup
ext4: fix undefined behavior in bit shift for ext4_check_flag_values
ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode
ext4: add helper to check quota inums
ext4: fix bug_on in __es_tree_search caused by bad quota inode
ext4: fix reserved cluster accounting in __es_remove_extent()
ext4: check and assert if marking an no_delete evicting inode dirty
ext4: fix bug_on in __es_tree_search caused by bad boot loader inode
ext4: fix leaking uninitialized memory in fast-commit journal
ext4: fix uninititialized value in 'ext4_evict_inode'
ext4: init quota for 'old.inode' in 'ext4_rename'
ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline
ext4: fix corruption when online resizing a 1K bigalloc fs
ext4: fix error code return to user-space in ext4_get_branch()
ext4: avoid BUG_ON when creating xattrs
ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
ext4: fix inode leak in ext4_xattr_inode_create() on an error path
ext4: initialize quota before expanding inode in setproject ioctl
ext4: avoid unaccounted block allocation when expanding inode
ext4: allocate extended attribute value in vmalloc area
drm/amdgpu: handle polaris10/11 overlap asics (v2)
drm/amdgpu: make display pinning more flexible (v2)
block: mq-deadline: Fix dd_finish_request() for zoned devices
tracing: Fix issue of missing one synthetic field
ext4: remove unused enum EXT4_FC_COMMIT_FAILED
ext4: use ext4_debug() instead of jbd_debug()
ext4: introduce EXT4_FC_TAG_BASE_LEN helper
ext4: factor out ext4_fc_get_tl()
ext4: fix potential out of bound read in ext4_fc_replay_scan()
ext4: disable fast-commit of encrypted dir operations
ext4: don't set up encryption key during jbd2 transaction
ext4: add missing validation of fast-commit record lengths
ext4: fix unaligned memory access in ext4_fc_reserve_space()
ext4: fix off-by-one errors in fast-commit block filling
ARM: renumber bits related to _TIF_WORK_MASK
phy: qcom-qmp-combo: fix out-of-bounds clock access
btrfs: replace strncpy() with strscpy()
btrfs: move missing device handling in a dedicate function
btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
x86/mce: Get rid of msr_ops
x86/MCE/AMD: Clear DFR errors found in THR handler
media: s5p-mfc: Fix to handle reference queue during finishing
media: s5p-mfc: Clear workbit to handle error condition
media: s5p-mfc: Fix in register read and write for H264
perf probe: Use dwarf_attr_integrate as generic DWARF attr accessor
perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data
ravb: Fix "failed to switch device to config mode" message during unbind
ext4: goto right label 'failed_mount3a'
ext4: correct inconsistent error msg in nojournal mode
mbcache: automatically delete entries from cache on freeing
ext4: fix deadlock due to mbcache entry corruption
drm/i915/migrate: don't check the scratch page
drm/i915/migrate: fix offset calculation
drm/i915/migrate: fix length calculation
SUNRPC: ensure the matching upcall is in-flight upon downcall
btrfs: fix an error handling path in btrfs_defrag_leaves()
bpf: pull before calling skb_postpull_rcsum()
drm/panfrost: Fix GEM handle creation ref-counting
netfilter: nf_tables: consolidate set description
netfilter: nf_tables: add function to create set stateful expressions
netfilter: nf_tables: perform type checking for existing sets
vmxnet3: correctly report csum_level for encapsulated packet
netfilter: nf_tables: honor set timeout and garbage collection updates
veth: Fix race with AF_XDP exposing old or uninitialized descriptors
nfsd: shut down the NFSv4 state objects before the filecache
net: hns3: add interrupts re-initialization while doing VF FLR
net: hns3: refactor hns3_nic_reuse_page()
net: hns3: extract macro to simplify ring stats update code
net: hns3: fix miss L3E checking for rx packet
net: hns3: fix VF promisc mode not update when mac table full
net: sched: fix memory leak in tcindex_set_parms
qlcnic: prevent ->dcb use-after-free on qlcnic_dcb_enable() failure
net: dsa: mv88e6xxx: depend on PTP conditionally
nfc: Fix potential resource leaks
vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init()
vhost/vsock: Fix error handling in vhost_vsock_init()
vringh: fix range used in iotlb_translate()
vhost: fix range used in translate_desc()
vdpa_sim: fix vringh initialization in vdpasim_queue_ready()
net/mlx5: E-Switch, properly handle ingress tagged packets on VST
net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path
net/mlx5: Avoid recovery in probe flows
net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default
net/mlx5e: TC, Refactor mlx5e_tc_add_flow_mod_hdr() to get flow attr
net/mlx5e: Always clear dest encap in neigh-update-del
net/mlx5e: Fix hw mtu initializing at XDP SQ allocation
net: amd-xgbe: add missed tasklet_kill
net: ena: Fix toeplitz initial hash value
net: ena: Don't register memory info on XDP exchange
net: ena: Account for the number of processed bytes in XDP
net: ena: Use bitmask to indicate packet redirection
net: ena: Fix rx_copybreak value update
net: ena: Set default value for RX interrupt moderation
net: ena: Update NUMA TPH hint register upon NUMA node update
net: phy: xgmiitorgmii: Fix refcount leak in xgmiitorgmii_probe
RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device
RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
drm/meson: Reduce the FIFO lines held when AFBC is not used
filelock: new helper: vfs_inode_has_locks
ceph: switch to vfs_inode_has_locks() to fix file lock bug
gpio: sifive: Fix refcount leak in sifive_gpio_probe
net: sched: atm: dont intepret cls results when asked to drop
net: sched: cbq: dont intepret cls results when asked to drop
net: sparx5: Fix reading of the MAC address
netfilter: ipset: fix hash:net,port,net hang with /0 subnet
netfilter: ipset: Rework long task execution when adding/deleting entries
perf tools: Fix resources leak in perf_data__open_dir()
drm/imx: ipuv3-plane: Fix overlay plane width
fs/ntfs3: don't hold ni_lock when calling truncate_setsize()
drivers/net/bonding/bond_3ad: return when there's no aggregator
octeontx2-pf: Fix lmtst ID used in aura free
usb: rndis_host: Secure rndis_query check against int overflow
perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match non BPF mode
drm/i915: unpin on error in intel_vgpu_shadow_mm_pin()
caif: fix memory leak in cfctrl_linkup_request()
udf: Fix extension of the last extent in the file
ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet
nvme: fix multipath crash caused by flush request when blktrace is enabled
io_uring: check for valid register opcode earlier
nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it
nvme: also return I/O command effects from nvme_command_effects
btrfs: check superblock to ensure the fs was not modified at thaw time
x86/kexec: Fix double-free of elf header buffer
x86/bugs: Flush IBP in ib_prctl_set()
nfsd: fix handling of readdir in v4root vs. mount upcall timeout
fbdev: matroxfb: G200eW: Increase max memory from 1 MB to 16 MB
block: don't allow splitting of a REQ_NOWAIT bio
io_uring: fix CQ waiting timeout handling
thermal: int340x: Add missing attribute for data rate base
riscv: uaccess: fix type of 0 variable on error in get_user()
riscv, kprobes: Stricter c.jr/c.jalr decoding
drm/i915/gvt: fix gvt debugfs destroy
drm/i915/gvt: fix vgpu debugfs clean in remove
hfs/hfsplus: use WARN_ON for sanity check
hfs/hfsplus: avoid WARN_ON() for sanity check, use proper error handling
ksmbd: fix infinite loop in ksmbd_conn_handler_loop()
ksmbd: check nt_len to be at least CIFS_ENCPWD_SIZE in ksmbd_decode_ntlmssp_auth_blob
Revert "ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007"
mptcp: dedicated request sock for subflow in v6
mptcp: use proper req destructor for IPv6
ext4: don't allow journal inode to have encrypt flag
selftests: set the BUILD variable to absolute path
btrfs: make thaw time super block check to also verify checksum
net: hns3: fix return value check bug of rx copybreak
mbcache: Avoid nesting of cache->c_list_lock under bit locks
efi: random: combine bootloader provided RNG seed with RNG protocol output
io_uring: Fix unsigned 'res' comparison with zero in io_fixup_rw_res()
drm/mgag200: Fix PLL setup for G200_SE_A rev >=4
Linux 5.15.87
Change-Id: I06fb376627506652ed60c04d56074956e6e075a0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
8ad43539c6 | Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y | ||
|
|
b8f3b3cffb |
mm: Always release pages to the buddy allocator in memblock_free_late().
[ Upstream commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 ]
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.
In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().
For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.
One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happens to fall within the
deferred init range, the pages will not be released and that memory will
be unavailable.
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
v6.2-rc2:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 178867
v6.2-rc2 + patch:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 222816 # +43,949 pages
Fixes:
|
||
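As a rough check of the figures quoted in the memblock_free_late() commit above, the following Python sketch parses the two /proc/zoneinfo excerpts and computes how many pages per zone the patch returns to the buddy allocator. The helper names and the 4 KiB page size are assumptions made for illustration; they are not part of the commit.

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB base pages (typical for x86_64)

# Excerpts copied from the commit message (v6.2-rc2 without and with the patch).
BEFORE = """\
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 178867
"""

AFTER = """\
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 222816
"""


def managed_pages(zoneinfo: str) -> dict:
    """Map each zone name to its 'managed' page count."""
    zones = {}
    zone = None
    for raw in zoneinfo.splitlines():
        line = raw.strip()
        header = re.match(r"Node \d+, zone\s+(\S+)", line)
        if header:
            zone = header.group(1)
            continue
        field = re.match(r"managed\s+(\d+)", line)
        if field and zone is not None:
            zones[zone] = int(field.group(1))
    return zones


def recovered(before: str, after: str) -> dict:
    """Per-zone increase in managed pages between two excerpts."""
    old, new = managed_pages(before), managed_pages(after)
    return {zone: new[zone] - old[zone] for zone in old}


if __name__ == "__main__":
    for zone, pages in recovered(BEFORE, AFTER).items():
        mib = pages * PAGE_SIZE_KIB / 1024
        print(f"{zone}: +{pages} pages ({mib:.1f} MiB)")
```

Under the 4 KiB assumption, the DMA32 difference corresponds to roughly 170 MiB of memory that was previously stuck in the reserved state on a 1 GB instance.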
|
|
5cecdaebbf |
FROMLIST: BACKPORT: mm: fix is_pinnable_page against on cma page
Pages on CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA
so current is_pinnable_page could miss CMA pages which have
MIGRATE_ISOLATE. It ends up pinning CMA pages as longterm at pin_user_pages
APIs so CMA allocation keeps failing until the pin is released.
CPU 0                                   CPU 1 - Task B
cma_alloc
alloc_contig_range
                                        pin_user_pages_fast(FOLL_LONGTERM)
change pageblock as MIGRATE_ISOLATE
                                        internal_get_user_pages_fast
                                        lockless_pages_from_mm
                                        gup_pte_range
                                        try_grab_folio
                                        is_pinnable_page
                                          return true;
                                        So, pinned the page successfully.
page migration failure with pinned page
                                        ..
                                        .. After 30 sec
                                        unpin_user_page(page)
CMA allocation succeeded after 30 sec.
The CMA allocation path protects the migration type change race
using zone->lock but what GUP path need to know is just whether the
page is on CMA area or not rather than exact migration type.
Thus, we don't need zone->lock but just checks migration type in
either of (MIGRATE_ISOLATE and MIGRATE_CMA).
Adding the MIGRATE_ISOLATE check in is_pinnable_page could cause
rejecting of pinning pages on MIGRATE_ISOLATE pageblocks even
though it's neither CMA nor movable zone if the page is temporarily
unmovable. However, such a migration failure by unexpected temporal
refcount holding is general issue, not only come from MIGRATE_ISOLATE
and the MIGRATE_ISOLATE is also transient state like other temporal
elevated refcount problem.
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Conflicts:
include/linux/mm.h
1. There is no is_pinnable_page in 5.10
Link: https://lore.kernel.org/all/20220524171525.976723-1-minchan@kernel.org/
Bug: 231227007
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I5cdd2b8eefdd7e89658abd21c32aa84876ad7782
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit e9dd78ebe1c8e9fcc4067e0795326495a16a9c9b)
|
||
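The is_pinnable_page change described above can be modelled in a few lines. This is a simplified user-space sketch, not the kernel code: the enum and function names are hypothetical, and the real helper also considers other conditions (for example whether the page sits in the movable zone), which are omitted here.

```python
from enum import Enum, auto


class MigrateType(Enum):
    """Simplified stand-ins for pageblock migrate types (illustrative only)."""
    UNMOVABLE = auto()
    MOVABLE = auto()
    CMA = auto()
    ISOLATE = auto()


def is_pinnable_before(migratetype: MigrateType) -> bool:
    # Old behaviour: only MIGRATE_CMA pageblocks are rejected, so a CMA
    # pageblock that is temporarily MIGRATE_ISOLATE slips through.
    return migratetype is not MigrateType.CMA


def is_pinnable_after(migratetype: MigrateType) -> bool:
    # New behaviour: treat MIGRATE_ISOLATE like MIGRATE_CMA, without
    # taking zone->lock to learn the exact migrate type.
    return migratetype not in (MigrateType.CMA, MigrateType.ISOLATE)


if __name__ == "__main__":
    for mt in MigrateType:
        print(f"{mt.name:10} before={is_pinnable_before(mt)} "
              f"after={is_pinnable_after(mt)}")
```

The only row that changes is ISOLATE, which matches the trade-off the message acknowledges: non-CMA pageblocks that happen to be isolated are also refused for long-term pinning while they remain in that transient state.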
|
|
fe19dff7e6 |
FROMLIST: kasan: allow sampling page_alloc allocations for HW_TAGS
[The patch is in mm-unstable tree.]
As Hardware Tag-Based KASAN is intended to be used in production, its
performance impact is crucial. As page_alloc allocations tend to be big,
tagging and checking all such allocations can introduce a significant
slowdown.
Add two new boot parameters that allow to alleviate that slowdown:
- kasan.page_alloc.sample, which makes Hardware Tag-Based KASAN tag only
every Nth page_alloc allocation with the order configured by the second
added parameter (default: tag every such allocation).
- kasan.page_alloc.sample.order, which makes sampling enabled by the first
parameter only affect page_alloc allocations with the order equal or
greater than the specified value (default: 3, see below).
The exact performance improvement caused by using the new parameters
depends on their values and the applied workload.
The chosen default value for kasan.page_alloc.sample.order is 3, which
matches both PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER. This is done
for two reasons:
1. PAGE_ALLOC_COSTLY_ORDER is "the order at which allocations are deemed
costly to service", which corresponds to the idea that only large and
thus costly allocations are supposed to be sampled.
2. One of the workloads targeted by this patch is a benchmark that sends
a large amount of data over a local loopback connection. Most multi-page
data allocations in the networking subsystem have the order of
SKB_FRAG_PAGE_ORDER (or PAGE_ALLOC_COSTLY_ORDER).
When running a local loopback test on a testing MTE-enabled device in sync
mode, enabling Hardware Tag-Based KASAN introduces a ~50% slowdown.
Applying this patch and setting kasan.page_alloc.sample to a value higher
than 1 allows to lower the slowdown. The performance improvement saturates
around the sampling interval value of 10 with the default sampling page
order of 3. This lowers the slowdown to ~20%. The slowdown in real
scenarios involving the network will likely be better.
Enabling page_alloc sampling has a downside: KASAN misses bad accesses to
a page_alloc allocation that has not been tagged. This lowers the value of
KASAN as a security mitigation.
However, based on measuring the number of page_alloc allocations of
different orders during boot in a test build, sampling with the default
kasan.page_alloc.sample.order value affects only ~7% of allocations. The
rest ~93% of allocations are still checked deterministically.
Link: https://lkml.kernel.org/r/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Brand <markbrand@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 238286329
Bug: 264310057
Link: https://lore.kernel.org/all/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com
Change-Id: Icc7befe61848021c68a12034f426f1c300181ad6
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
|
||
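To make the interaction of the two KASAN boot parameters above concrete, here is a small Python model of the sampling decision. It is a sketch under stated assumptions: the class name is hypothetical, and the choice of which allocation in each group of N gets tagged (here the Nth) is arbitrary. The kernel's actual bookkeeping may differ.

```python
class PageAllocSampler:
    """Toy model of kasan.page_alloc.sample / kasan.page_alloc.sample.order."""

    def __init__(self, sample: int = 1, sample_order: int = 3):
        if sample < 1:
            raise ValueError("sample interval must be at least 1")
        self.sample = sample              # tag every Nth eligible allocation
        self.sample_order = sample_order  # sampling applies at this order and up
        self._seen = 0                    # eligible allocations observed so far

    def should_tag(self, order: int) -> bool:
        # Allocations below the configured order are never sampled:
        # they are always tagged and checked.
        if order < self.sample_order:
            return True
        # At or above the configured order, tag only every Nth allocation.
        self._seen += 1
        return self._seen % self.sample == 0


if __name__ == "__main__":
    sampler = PageAllocSampler(sample=10, sample_order=3)
    tagged = sum(sampler.should_tag(order=3) for _ in range(20))
    print(f"order-3 allocations tagged: {tagged} of 20")
    print(f"order-0 allocation tagged: {sampler.should_tag(order=0)}")
```

With the defaults (sample=1), every allocation is tagged, which corresponds to the "default: tag every such allocation" behaviour the message describes.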
|
|
a8962f626f |
ANDROID: vendor hook to control blk_plug for shrink_lruvec
Add vendor hook to control blk plugging for shrink_lruvec. Merged CL bcf1e503f5ed774dc28126a0f1a8c839717eafac: ANDROID: adjust vendor hook to control blk_plug Bug: 255471591 Bug: 238728493 Change-Id: Iba2603ff2e1b62cf2ee8fd6969d8ccd71416a288 Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 89fed37332fd48e0cd13b256cd85d6929d5da319) |
||
|
|
7df45e50a5 |
ANDROID: vendor hook to control blk_plug for memory reclaim
Add vendor hook to control blk plugging. Bug: 255471591 Bug: 238728493 Change-Id: I96b73cec14f0d2fea46a4828526e6ae5aa5c71b7 Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit a17e132ec4f290621666311e73f43202706d2743) |
||
|
|
ba005d6032 |
ANDROID: vendor hook to control bh_lru and lru_cache_disable
Add vendor hook for bh_lru and lru_cache_disable Bug: 238728493 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I81bfad317cf6e8633186ebb3238644306d7a102d Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 74e2ea264cd1895c493b9008b62bfea98dacf3f6) |
||
|
|
243f54dd3a |
ANDROID: vendor hook for TLB batching control
Add vendor hook for flushing TLB batching in zap_pte_range. Merged CL 232bdcbd660b1129b7d8d0de25a563b476eeb522: ANDROID: pass argument in zap_pte_range vendor hooks Bug: 238728493 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: If2de5f070dd7b76624961f5a91440bf69a99ca2d Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit d257ef6764f228145d0fca24998162809bb5b9f7) |
||
|
|
bb6ab2be93 |
ANDROID: vendor hook to control pagevec flush
The pagevec batching causes lru_add_drain_all which is too expensive sometimes. This patch adds a new vendor hook to drain the pagevec immediately depending on the page's type. Bug: 251881967 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Id17e14e69197993ddad511a40c96e51674c02834 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 2f8253b7e6e563cc19cffa120c72f6f528664103) |
||
|
|
1eeadb47e0 |
ANDROID: mm: vh for compaction begin/end
Add vendor hook for compaction begin/end. The first use would be to measure compaction durations. Bug: 229927848 Test: local kernel build test Signed-off-by: Robin Hsu <robinhsu@google.com> Change-Id: I3d95434bf49b37199056dc9ddfc36a59a7de17b7 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 13b6bd38bb1f43bfffdb08c8f3a4a20d36ccd670) |
||
|
|
e3f396c2d4 |
ANDROID: add vendor_hook to control CMA allocation ratio
The CMA-first allocation policy for movable allocations (which upstream does not use) keeps the CMA area always full. That is good for memory efficiency since it uses up the available CMA memory most of the time. However, it can make cma_alloc slow since it causes a lot of page migration all the time. Let's add a vendor hook for those who want to restore the upstream CMA allocation policy so they will see less page migration in cma_alloc. If the vendor_hook returns false, rmqueue_bulk returns 0 without filling pcp->lists, so get_populated_pcp_list will return NULL. Once get_populated_pcp_list returns NULL, __rmqueue_pcplist will retry the page allocation with the original migratetype (currently, the original migratetype cannot be MIGRATE_CMA), so the retry will find available pages from the !MIGRATE_CMA free list. Bug: 231978523 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ia031d9bc6f34085b892a8d9923bf5b9b1794f94a Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 0ca85e35bf5b4a7ff08f00a060c83e4a82380b64) |
||
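The fallback path described in the CMA allocation ratio commit above can be illustrated with a toy model. Everything here is hypothetical: the function name, the dictionary of free lists, and the two boolean flags only mirror the control flow the message describes (hook vetoes CMA, the bulk refill returns nothing, the allocation is retried with the original migratetype). Other fallback behaviour in the real allocator is deliberately left out.

```python
from typing import Optional


def allocate_movable(free_pages: dict, cma_first: bool,
                     hook_allows_cma: bool) -> Optional[str]:
    """Return the name of the free list that served the request, or None.

    free_pages maps a free-list name ("CMA", "MOVABLE") to its page count
    and is updated in place.
    """
    # CMA-first policy: try the CMA free list unless the vendor hook vetoes it.
    if cma_first and hook_allows_cma and free_pages.get("CMA", 0) > 0:
        free_pages["CMA"] -= 1
        return "CMA"
    # Hook returned false (or CMA was empty): the refill produced nothing,
    # so retry with the original, non-CMA migratetype.
    if free_pages.get("MOVABLE", 0) > 0:
        free_pages["MOVABLE"] -= 1
        return "MOVABLE"
    return None


if __name__ == "__main__":
    lists = {"CMA": 4, "MOVABLE": 4}
    print(allocate_movable(lists, cma_first=True, hook_allows_cma=True))
    print(allocate_movable(lists, cma_first=True, hook_allows_cma=False))
    print(lists)
```

In this model a vendor hook that always returns false leaves the CMA list untouched by movable allocations, which is the "less page migration in cma_alloc" effect the commit is after.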
|
|
4f7af52a4b | Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y | ||
|
|
35d8a89862 |
mm, compaction: fix fast_isolate_around() to stay within boundaries
commit be21b32afe470c5ae98e27e49201158a47032942 upstream. Depending on the memory configuration, isolate_freepages_block() may scan pages out of the target range and cause a panic. The panic can occur on systems with multiple zones in a single pageblock. The reason it is rare is that it only happens in special configurations. Depending on how many similar systems there are, it may be a good idea to fix this problem for older kernels as well. The problem is that pfn as argument of fast_isolate_around() could be out of the target range. Therefore we should consider the case where pfn < start_pfn, and also the case where end_pfn < pfn. This problem should have been addressed by the commit |
||
|
|
6986b4c889 |
ANDROID: mm: Export find_vm_area
Export find_vm_area for obtaining pages of vmalloc'ed memory, which is required for both GXP and TPU modules. Bug: 263839332 Change-Id: I1d6c37a5abb6012c3ff295120dd2d3cb2871c820 Signed-off-by: davidchiang <davidchiang@google.com> |
||
|
|
c41503e313 |
ANDROID: page_pinner: prevent pp_buffer uninitialized access
There is a race window between setting page_pinner_inited and
initializing the pp_buffer, which can lead to an access to the
uninitialized pp_buffer->lock. Avoid this by moving the pp_buffer
initialization to page_ext_ops->init(), which sets page_pinner_inited
only after the pp_buffer is initialized.
Race scenario:
1) init_page_pinner is called --> page_pinner_inited is set.
2) __alloc_contig_migrate_range --> __page_pinner_failure_detect()
accesses the pp_buffer->lock (yet to be initialized).
3) Then the pp_buffer is allocated and initialized.
Below is the issue call stack:
spin_bug+0x0
_raw_spin_lock_irqsave+0x3c
__page_pinner_failure_detect+0x110
__alloc_contig_migrate_range+0x1c4
alloc_contig_range+0x130
cma_alloc+0x170
dma_alloc_contiguous+0xa0
__dma_direct_alloc_pages+0x16c
dma_direct_alloc+0x88
Bug: 259024332
Change-Id: I6849ac4d944498b9a431b47cad7adc7903c9bbaa
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
|
||
|
|
d68a75b1dc | Merge "Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.15.y' into android14-5.15" into android14-5.15 | ||
|
|
d705ab99ab |
ANDROID: vendor_hooks: Export direct reclaim trace points
Get direct reclaim info.
Bug: 190795589
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: Ie66a3c87484a364a918c19b8e044c82f1afd6749
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
c20204c67d |
ANDROID: cma: allow to use CMA in swap-in path
Now, we allow to use CMA pages for certain user space allocations. One of them is anonymous page fault case. To align the use case, we should also allow to use CMA pages in swap-in cases. This could help mitigate OOM on swap-in cases showing plenty of free CMA left.
logd.klogd invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
CPU: 0 PID: 433 Comm: logd.klogd Tainted: G W OE 5.10.100-android13-0 #1
Call trace:
dump_backtrace.cfi_jt+0x0/0x8
show_stack+0x1c/0x2c
dump_stack_lvl+0xc0/0x13c
dump_header+0x54/0x238
oom_kill_process+0xb0/0x158
out_of_memory+0x17c/0x328
__alloc_pages_slowpath+0x5c4/0x8d0
__alloc_pages_nodemask+0x1bc/0x2e0
__read_swap_cache_async+0xdc/0x370
swap_vma_readahead+0x3b4/0x488
swapin_readahead+0x3c/0x54
do_swap_page+0x1e0/0xaa0
handle_pte_fault+0x128/0x1e0
handle_mm_fault+0x308/0x590
do_page_fault+0x33c/0x478
do_translation_fault+0x58/0x11c
do_mem_abort+0x68/0x144
el0_da+0x24/0x34
el0_sync_handler+0xc4/0xec
el0_sync+0x1c0/0x200
Mem-Info:
active_anon:0 inactive_anon:3222 isolated_anon:62
active_file:232 inactive_file:428 isolated_file:0
unevictable:37232 dirty:3 writeback:40
slab_reclaimable:19943 slab_unreclaimable:281193
mapped:37126 shmem:2815 pagetables:8981 bounce:0
free:126007 free_pcp:223 free_cma:123062
Node 0 active_anon:16kB inactive_anon:13160kB active_file:292kB inactive_file:2000kB unevictable:148928kB isolated(anon):0kB isolated(file):0kB mapped:148308kB dirty:12kB writeback:164kB shmem:11260kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:20528kB shadow_call_stack:5200kB all_unreclaimable? no
DMA32 free:14128kB min:7572kB low:22636kB high:37700kB reserved_highatomic:4096KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1913856kB managed:1553276kB mlocked:0kB pagetables:1292kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:2520kB
lowmem_reserve[]: 0 0 0
Normal free:489888kB min:19808kB low:59220kB high:98632kB reserved_highatomic:36864KB active_anon:20kB inactive_anon:12168kB active_file:0kB inactive_file:1640kB unevictable:148928kB writepending:180kB present:4194304kB managed:4063392kB mlocked:148928kB pagetables:34632kB bounce:0kB free_pcp:1928kB local_pcp:0kB free_cma:489752kB
lowmem_reserve[]: 0 0 0
DMA32: 166*4kB (UME) 163*8kB (UMECH) 592*16kB (UMCH) 5*32kB (UC) 2*64kB (C) 2*128kB (C) 2*256kB (C) 1*512kB (C) 1*1024kB (C) 0*2048kB 0*4096kB = 14032kB
Normal: 969*4kB (C) 77*8kB (C) 40*16kB (C) 17*32kB (C) 5*64kB (C) 1*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 118*4096kB (C) = 490476kB
40220 total pagecache pages
30 pages in swap cache
Swap cache stats: add 2634625, delete 2635304, find 160621/2963954
Free swap = 1473788kB
Total swap = 2097148kB
Bug: 229822798
Signed-off-by: Martin Liu <liumartin@google.com>
Change-Id: Ia0bb6f72e52f77f26062e1769bfd92e831f07cab
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit 3e591c63b13772a1de0ffed995b482166d27ed71) |
||
|
|
dd887dbfaa |
ANDROID: mm: do not count cma_alloc_fail on __GFP_NORETRY
Do not account a __GFP_NORETRY allocation failure as cma_alloc_fail, since it is not a critical failure (i.e., a caller using __GFP_NORETRY should always carry on with its fallback plan). It is also good for compatibility with upstream: upstream cma_alloc_fail only counts failures with !__GFP_NORETRY, since upstream doesn't support __GFP_NORETRY yet.
Bug: 220669548
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I377e6b033c3786e10b6b1c814037a4fc40e20a73
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit 8ffc7ff817fe552592daa2b0de1760e3539663f3) |
||
|
|
bb7b81497d |
ANDROID: GKI: export cma_get_size
Export cma_get_size to report a CMA instance's size, which is needed to allocate all of the pages in that CMA instance.
Bug: 218731671
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ifb2769f60250ce605236342b950907218e1c28a5
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit 7a44906686048bdcecb7dfa4fac02c4ad7f6cd06) |
||
|
|
1b38e981db |
ANDROID: mm: cma do not sleep for __GFP_NORETRY
Do not sleep before retrying for __GFP_NORETRY, since it is a failfast
mode. Users can retry the allocation without the flag
themselves if they see the failure.
Bug: 192475091
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ic6a857978fda8e353b9ed770d1e0ba1808fd201e
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
60d2dad38e |
ANDROID: mm: cma: skip problematic pageblock
alloc_contig_range is supposed to work on a range aligned to
max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity. If it fails
at a page and returns an error to the user, the user doesn't know which
page caused the allocation failure and keeps retrying the allocation
with a new range that still includes the failed page, hitting the error
again and again until the range moves out of that granularity
block. Instead, let's make CMA aware of which pfn was troublesome in
the previous attempt and then continue with a new pageblock beyond the
failed page so it doesn't hit the same error repeatedly.
Currently, this option works only for the __GFP_NORETRY case, to be
safe for existing CMA users.
Bug: 192475091
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I0959c9df3d4b36408a68920abbb4d52d31026079
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
c63e78a29a |
ANDROID: mm: lru_cache_disable skips lru cache drainnig
lru_cache_disable has a non-trivial cost since it has to run work
on every core in the system. Thus, calling the function repeatedly,
each time alloc_contig_range is called in the cma allocation loop,
is expensive.
This patch makes lru_cache_disable smarter: it will
not run __lru_add_drain_all when it knows the cache was already
disabled by someone else.
With that, a user of alloc_contig_range can disable the lru cache
in advance in their own context so that subsequent alloc_contig_range
calls for the user's operation avoid the costly function call.
This patch moves the lru_cache APIs from swap.h to swap.c and exports
them for vendor users.
Bug: 192475091
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I23da8599c55db49dc80226285972e4cd80dedcff
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
b74eeef409 |
ANDROID: mm: do not try test_page_isoalte if migration fails
Currently, even when a page fails with -EBUSY in
__alloc_contig_migrate_range, alloc_contig_range still checks
those failed pages again in test_pages_isolated, in the hope that
the pages will be freed soon so the cma allocation succeeds.
However, that depends on luck, and I found that test_pages_isolated
sometimes fails at the same page repeatedly whenever cma_alloc retries.
Rather than burning CPU checking the page's status in
test_pages_isolated for a __GFP_NORETRY allocation, just bail out and
let the user decide what to do.
Currently, this option works only for the __GFP_NORETRY case, to be
safe for other existing users.
Bug: 192475091
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I9211452be06960dc7d8f854537e53b3fc5262c8e
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
a7f55c5c73 |
ANDROID: mm: add cma allocation statistics
alloc_contig_range is the core worker function for CMA allocation,
so it has all the information needed to understand allocation
latency: for example, how many pages were migrated, how many times
unmap was needed to migrate pages, and how many times it encountered
errors for various reasons.
This patch adds such statistics to alloc_contig_range and
returns them to the user so the user can use the information to analyze
latency. cma_alloc is the first user of the statistics, which it
exports as a new trace event (i.e., cma_alloc_info).
It was really useful for optimizing cma allocation work.
Bug: 192475091
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I7be43cc89d11078e2a324d2d06aada6d8e9e1cc9
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
1b2de5aa2d |
ANDROID: mm: cma: add vendor hoook in cma_alloc()
Add vendor hook for cma_alloc latency measuring.
Bug: 177231781
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ia2dbb26454bd8f03489389b29b9a6c939d3c2bbb
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
1b6d55eb48 |
ANDROID: mm: build alloc_contig_dump_pages in page_alloc.o
GKI has CONFIG_DYNAMIC_DEBUG_CORE. Thus, the way to enable only the
specific alloc_contig_dump_pages, without enabling all pr_debug
in every source file, is to use -DCONFIG_DYNAMIC_MODULE
when page_alloc.o is compiled.
Bug: 182195592
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I93266eb4161b3653389c737d4c767fd5d1083339
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
9cb760bc6a |
FROMLIST: mm: failfast mode with __GFP_NORETRY in alloc_contig_range
Contiguous memory allocation can be stalled due to waiting
on page writeback and/or page lock, which causes unpredictable
delay. It's an unavoidable cost for the requestor to get *big*
contiguous memory, but it's expensive for *small* contiguous
memory (e.g., order-4) because the caller could retry the request
in a different range that has easily migratable pages
without stalling.
This patch introduces __GFP_NORETRY as the compaction gfp_mask in
alloc_contig_range so it will fail fast without blocking
when it encounters pages that need waiting.
Bug: 170340257
Bug: 120293424
Link: https://lore.kernel.org/linux-mm/YAnM5PbNJZlk%2F%2FiX@google.com/T/#m1362218ebb69e6e10c20d9361008b079745c4e6f
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I42ba8dd5aeb065d936978ab205e4baf84bf9a321
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
eebff8eab2 |
FROMLIST: mm: cma: introduce gfp flag in cma_alloc instead of no_warn
The upcoming patch will introduce __GFP_NORETRY semantics
in alloc_contig_range, which is a failfast mode of the API.
Instead of adding an additional parameter for gfp, replace
no_warn with a gfp flag.
To keep old behaviors, it follows the rule below.
no_warn gfp_flags
false GFP_KERNEL
true GFP_KERNEL|__GFP_NOWARN
gfp & __GFP_NOWARN GFP_KERNEL | (gfp & __GFP_NOWARN)
Bug: 170340257
Bug: 120293424
Link: https://lore.kernel.org/linux-mm/YAnM5PbNJZlk%2F%2FiX@google.com/T/#m36b144ff81fe0a8f0ecaf6813de4819ecc41f8fe
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I1ce020ab5d5fff34eb6464be4632ddef72fb43eb
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit
|
||
|
|
b205c093f2 |
Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.15.y' into android14-5.15
* aosp/upstream-f2fs-stable-linux-5.15.y:
f2fs: let's avoid panic if extent_tree is not created
f2fs: should use a temp extent_info for lookup
f2fs: don't mix to use union values in extent_info
f2fs: initialize extent_cache parameter
f2fs: fix to avoid NULL pointer dereference in f2fs_issue_flush()
fs: account for group membership
fs: fix acl translation
fs: support mapped mounts of mapped filesystems
fs: add i_user_ns() helper
fs: port higher-level mapping helpers
fs: remove unused low-level mapping helpers
fs: use low-level mapping helpers
docs: update mapping documentation
fs: account for filesystem mappings
fs: tweak fsuidgid_has_mapping()
fs: move mapping helpers
fs: add is_idmapped_mnt() helper
Revert "fs: add is_idmapped_mnt() helper"
Revert "fs: move mapping helpers"
Revert "fs: tweak fsuidgid_has_mapping()"
Revert "fs: account for filesystem mappings"
Revert "docs: update mapping documentation"
Revert "fs: use low-level mapping helpers"
Revert "fs: remove unused low-level mapping helpers"
Revert "fs: add i_user_ns() helper"
Revert "fs: account for group membership"
fsverity: simplify fsverity_get_digest()
fsverity: stop using PG_error to track error status
fs-verity: use kmap_local_page() instead of kmap()
highmem: Make __kunmap_{local,atomic}() take const void pointer
fs-verity: use memcpy_from_page()
fs-verity: Use struct_size() helper in enable_verity()
fs-verity: remove unused parameter desc_size in fsverity_create_info()
fs-verity: define a function to return the integrity protected file digest
fscrypt: add additional documentation for SM4 support
fscrypt: remove unused Speck definitions
fscrypt: Add SM4 XTS/CTS symmetric algorithm support
blk-crypto: Add support for SM4-XTS blk crypto mode
fscrypt: add comment for fscrypt_valid_enc_modes_v1()
blk-crypto: Add a missing include directive
blk-crypto: move internal only declarations to blk-crypto-internal.h
blk-crypto: add a blk_crypto_config_supported_natively helper
blk-crypto: don't use struct request_queue for public interfaces
fscrypt: pass super_block to fscrypt_put_master_key_activeref()
fscrypt: add fscrypt_context_for_new_inode
fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
fscrypt: Add HCTR2 support for filename encryption
fs: account for group membership
fs: add i_user_ns() helper
fs: remove unused low-level mapping helpers
fs: use low-level mapping helpers
docs: update mapping documentation
fs: account for filesystem mappings
fs: tweak fsuidgid_has_mapping()
fs: move mapping helpers
fs: add is_idmapped_mnt() helper
Bug: 256243893
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Ia7b419ef516dee40be32968cd7c60ce0adabca99
|
||
|
|
f74aca771c |
BACKPORT: mm: don't be stuck to rmap lock on reclaim path
The rmap locks (i_mmap_rwsem and anon_vma->root->rwsem) can be contended
under memory pressure if processes keep working on their vmas (e.g., fork,
mmap, munmap). That makes the reclaim path get stuck. In our real workload
traces, we see kswapd waiting on the lock for 300ms+ (worst case, a second),
which makes other processes enter direct reclaim, where they also get stuck
on the lock.
This patch makes the lru aging path use try_lock mode, like shrink_page_list,
so the reclaim context keeps working on the next lru pages without getting stuck.
If it finds the rmap lock contended, it rotates the page back to the head of
the lru in both the active and inactive lrus to make their behavior consistent,
which is a basic starting point rather than adding more heuristics.
Since this patch introduces a new "contended" field as an out-param along
with the try_lock in-param in rmap_walk_control, it's not immutable any longer
if try_lock is set, so remove the const keywords on rmap-related functions.
Since rmap walking is already an expensive operation, I doubt the const
would bring a sizable benefit (and we didn't have it until 5.17).
In a heavy app workload in Android, trace shows following statistics. It
almost removes rmap lock contention from reclaim path.
Martin Liu reported:
Before:
max_dur(ms) min_dur(ms) max-min(dur)ms avg_dur(ms) sum_dur(ms) count blocked_function
1632 0 1631 151.542173 31672 209 page_lock_anon_vma_read
601 0 601 145.544681 28817 198 rmap_walk_file
After:
max_dur(ms) min_dur(ms) max-min(dur)ms avg_dur(ms) sum_dur(ms) count blocked_function
NaN NaN NaN NaN NaN 0.0 NaN
0 0 0 0.127645 1 12 rmap_walk_file
[minchan@kernel.org: add comment, per Matthew]
Link: https://lkml.kernel.org/r/YnNqeB5tUf6LZ57b@google.com
Link: https://lkml.kernel.org/r/20220510215423.164547-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: John Dias <joaodias@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
folio->page
Conflicts:
mm/huge_memory.c was refactored by
commit
|
||
|
|
8ca606e98b |
FROMLIST: mm: fix use-after free of page_ext after race with memory-offline
The below is one path where race between page_ext and offline of the respective memory blocks will cause use-after-free on the access of page_ext structure.
process1: a) doing /proc/page_owner
process2: doing memory offline through offline_pages.
process1: b) PageBuddy check is failed thus proceed to get the page_owner information through page_ext access.
process1: page_ext = lookup_page_ext(page);
process2: migrate_pages();
process2: .................
process2: Since all pages are successfully migrated as part of the offline operation, send MEM_OFFLINE notification where for page_ext it calls:
process2: offline_page_ext()--> __free_page_ext()--> free_page_ext()--> vfree(ms->page_ext)
process2: mem_section->page_ext = NULL
process1: c) Check for the PAGE_EXT flags in the page_ext->flags access results into the use-after-free(leading to the translation faults).
As mentioned above, there is really no synchronization between page_ext access and its freeing in the memory_offline.
The memory offline steps(roughly) on a memory block is as below:
1) Isolate all the pages
2) while(1) try free the pages to buddy.(->free_list[MIGRATE_ISOLATE])
3) delete the pages from this buddy list.
4) Then free page_ext.(Note: The struct page is still alive as it is freed only during hot remove of the memory which frees the memmap, which steps the user might not perform).
This design leads to the state where struct page is alive but the struct page_ext is freed, where the later is ideally part of the former which just representing the page_flags (check [3] for why this design is chosen).
The above mentioned race is just one example __but the problem persists in the other paths too involving page_ext->flags access(eg: page_is_idle())__.
Fix all the paths where offline races with page_ext access by maintaining synchronization with rcu lock and is achieved in 3 steps:
1) Invalidate all the page_ext's of the sections of a memory block by storing a flag in the LSB of mem_section->page_ext.
2) Wait till all the existing readers to finish working with the ->page_ext's with synchronize_rcu(). Any parallel process that starts after this call will not get page_ext, through lookup_page_ext(), for the block parallel offline operation is being performed.
3) Now safely free all sections ->page_ext's of the block on which offline operation is being performed.
Note: If synchronize_rcu() takes time then optimizations can be done in this path through call_rcu()[2].
Thanks to David Hildenbrand for his views/suggestions on the initial discussion[1] and Pavan kondeti for various inputs on this patch.
[1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@quicinc.com/
[2] https://lore.kernel.org/all/a26ce299-aed1-b8ad-711e-a49e82bdd180@quicinc.com/T/#u
[3] https://lore.kernel.org/all/6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com/
Bug: 236222283
Bug: 240196534
Link: https://lore.kernel.org/all/1661496993-11473-1-git-send-email-quic_charante@quicinc.com/
Change-Id: Ib439ae19c61a557a5c70ea90e3c4b35a5583ba0d
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Signed-off-by: Minchan Kim <minchan@google.com>
(fixed merge conflicts and still exported lookup_page_ext)
(minchan: fixed page_pinner with new page_ext scheme) |
||
|
|
e12acd3eef |
ANDROID: mm: introduce page_pinner
For CMA allocation, it's really critical to migrate a page, but sometimes migration fails. One of the reasons is that some driver holds a page refcount for a long time, so the VM cannot migrate the page at that time. The concern here is that there is no effective way to find out who holds the refcount of the page.
This patch introduces a feature to keep track of a page's pinner. All get_page sites can potentially pin a page for a long time, but the cost of tracking them all would be significant since get_page is the most frequent kernel operation. Furthermore, the page might not be a user page but a kernel page, which is not related to the page migration failure. Thus, this patch keeps track of only migration-failed pages to reduce runtime cost.
Once page migration fails in the CMA allocation path, those pages are marked as "migration failure", and for every put_page operation against those pages the callstack of the put is recorded into the page_pinner buffer. Later, an admin can see which pages failed and who released the refcount since the failure. It really helps to find the long-time refcount holder that prevents the page migration.
note: page_pinner doesn't guarantee that attributing/unattributing are atomic if they happen at the same time. It's just best effort, so false positives could happen.
Bug: 183414571
Bug: 240196534
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I603d0c0122734c377db6b1eb95848a6f734173a0
(cherry picked from commit 898cfbf094a2fc13c67fab5b5d3c916f0139833a) |
||
|
|
a21b3ffd99 |
ANDROID: remove unnecessary SPECULATIVE_PAGE_FAULT config dependency
After recent fixes [1], speculative page fault walks are performed with
disabled interrupts, therefore do not depend on ALLOC_SPLIT_PTLOCKS
which would affect them if performed under RCU protection. Remove
unnecessary config dependency.
[1] 5fcb50b0559a ("ANDROID: mm: fix speculative walk which is unsafe under RCU")
Bug: 253557903
Change-Id: Ia1c835c7b08419f8fce61fa4f7e6842fbf786229
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
|
||
|
|
98f3cc7ecd |
ANDROID: mm: freeing MIGRATE_ISOLATE page instantly
Since Android has a pcp list for MIGRATE_CMA[1], a MIGRATE_ISOLATE
page may not be freed immediately, which can add CMA allocation
latency.
Originally, a MIGRATE_ISOLATE page is supposed to go to the buddy list,
skipping the pcp list. Otherwise, the page could be reallocated
from the pcp list or stay on the pcp list until the pcp is drained,
so CMA keeps retrying because it cannot find the freed page
on the buddy list. That worked before because the CMA pfnblocks changed
only from MIGRATE_CMA to MIGRATE_ISOLATE and the free function logic
in the page allocator checked for MIGRATE_ISOLATE on every CMA
page using the code below.
free_unref_page_commit
if (migratetype >= MIGRATE_PCPTYPES)
if(is_migrate_isolate(migratetype))
free_one_page(page);
It worked since enum MIGRATE_CMA was bigger than enum
MIGRATE_PCPTYPES but since [1], the enum MIGRATE_CMA is less than
MIGRATE_PCPTYPES so the logic above doesn't work any more.
It could cause the following race
CPU 0: free_unref_page
CPU 0: migratetype = get_pfnblock_migratetype()
CPU 0: set_pcppage_migratetype(MIGRATE_CMA)
CPU 1: cma_alloc
CPU 1: alloc_contig_range
CPU 1: set_migrate_isolate(MIGRATE_ISOLATE)
CPU 0: add the page into pcp list
CPU 0: the page could be reallocated
This patch couldn't fix the race completely due to missing zone->lock
in order-0 page free(for performance reason). However, it's not a new
problem so we need to deal with the issue separately.
[1] ANDROID: mm: add cma pcp list
Bug: 218731671
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: Ibea20085ce5bfb4b74b83b041f9bda9a380120f9
Signed-off-by: Richard Chang <richardycc@google.com>
(cherry picked from commit d9e4b67784866047e8cfb5598cdf1ebc0c71f3d9)
|