kernel_arpi

Author	SHA1	Message	Date
Peter Yoon	670c6da635	Merge branch 'android14-5.15' into arpi-5.15.92	2023-02-28 17:56:07 +09:00
Andrey Konovalov	f34aed1750	ANDROID: kasan: fix slab page check in complete_report_info The backport commit `77a0deb5d5` ("BACKPORT: kasan: fill in cache and object in complete_report_info") did not resolve the conflict due to the folio patchset missing in 5.15 correctly: complete_report_info needs to check PageSlab to make sure that the page is a slab page. Add a PageSlab check to complete_report_info. Bug: 254721825 Reported-by: Peter Collingbourne <pcc@google.com> Fixes: `77a0deb5d5` ("BACKPORT: kasan: fill in cache and object in complete_report_info") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Change-Id: I307ddbd9315134f825b37a0c7254a033453a46ef	2023-02-10 00:01:43 +01:00
Dom Cobley	e9e421392e	Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y	2023-02-01 17:30:22 +00:00
Greg Kroah-Hartman	e3d8fe0993	Merge 5.15.91 into android14-5.15 Changes in 5.15.91 memory: tegra: Remove clients SID override programming memory: atmel-sdramc: Fix missing clk_disable_unprepare in atmel_ramc_probe() memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe() dmaengine: ti: k3-udma: Do conditional decrement of UDMA_CHAN_RT_PEER_BCNT_REG arm64: dts: imx8mp-phycore-som: Remove invalid PMIC property ARM: dts: imx6ul-pico-dwarf: Use 'clock-frequency' ARM: dts: imx7d-pico: Use 'clock-frequency' ARM: dts: imx6qdl-gw560x: Remove incorrect 'uart-has-rtscts' arm64: dts: imx8mm-beacon: Fix ecspi2 pinmux ARM: imx: add missing of_node_put() HID: intel_ish-hid: Add check for ishtp_dma_tx_map arm64: dts: imx8mm-venice-gw7901: fix USB2 controller OC polarity soc: imx8m: Fix incorrect check for of_clk_get_by_name() reset: uniphier-glue: Use reset_control_bulk API reset: uniphier-glue: Fix possible null-ptr-deref EDAC/highbank: Fix memory leak in highbank_mc_probe() firmware: arm_scmi: Harden shared memory access in fetch_response firmware: arm_scmi: Harden shared memory access in fetch_notification tomoyo: fix broken dependency on *.conf.default RDMA/core: Fix ib block iterator counter overflow IB/hfi1: Reject a zero-length user expected buffer IB/hfi1: Reserve user expected TIDs IB/hfi1: Fix expected receive setup error exit issues IB/hfi1: Immediately remove invalid memory from hardware IB/hfi1: Remove user expected buffer invalidate race affs: initialize fsdata in affs_truncate() PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe() arm64: dts: qcom: msm8992: Don't use sfpb mutex arm64: dts: qcom: msm8992-libra: Add CPU regulators arm64: dts: qcom: msm8992-libra: Fix the memory map phy: ti: fix Kconfig warning and operator precedence NFSD: fix use-after-free in nfsd4_ssc_setup_dul() ARM: dts: at91: sam9x60: fix the ddr clock for sam9x60 amd-xgbe: TX Flow Ctrl Registers are h/w ver dependent amd-xgbe: Delay AN timeout during KR training 
bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation phy: rockchip-inno-usb2: Fix missing clk_disable_unprepare() in rockchip_usb2phy_power_on() net: nfc: Fix use-after-free in local_cleanup() net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs net: enetc: avoid deadlock in enetc_tx_onestep_tstamp() sch_htb: Avoid grafting on htb_destroy_class_offload when destroying htb gpio: use raw spinlock for gpio chip shadowed data gpio: mxc: Protect GPIO irqchip RMW with bgpio spinlock gpio: mxc: Always set GPIOs used as interrupt source to INPUT mode wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid pinctrl/rockchip: Use temporary variable for struct device pinctrl/rockchip: add error handling for pull/drive register getters pinctrl: rockchip: fix reading pull type on rk3568 net: stmmac: Fix queue statistics reading net/sched: sch_taprio: fix possible use-after-free l2tp: Serialize access to sk_user_data with sk_callback_lock l2tp: Don't sleep and disable BH under writer-side sk_callback_lock l2tp: convert l2tp_tunnel_list to idr l2tp: close all race conditions in l2tp_tunnel_register() octeontx2-pf: Avoid use of GFP_KERNEL in atomic context net: usb: sr9700: Handle negative len net: mdio: validate parameter addr in mdiobus_get_phy() HID: check empty report_list in hid_validate_values() HID: check empty report_list in bigben_probe() net: stmmac: fix invalid call to mdiobus_get_phy() pinctrl: rockchip: fix mux route data for rk3568 HID: revert CHERRY_MOUSE_000C quirk usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait usb: gadget: f_fs: Ensure ep0req is dequeued before free_request Bluetooth: Fix possible deadlock in rfcomm_sk_state_change net: ipa: disable ipa interrupt during suspend net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT net: mlx5: eliminate anonymous module_init & module_exit drm/panfrost: fix GENERIC_ATOMIC64 dependency dmaengine: Fix double increment of 
client_count in dma_chan_get() net: macb: fix PTP TX timestamp failure due to packet padding virtio-net: correctly enable callback during start_xmit l2tp: prevent lockdep issue in l2tp_tunnel_register() HID: betop: check shape of output reports cifs: fix potential deadlock in cache_refresh_path() dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node() phy: phy-can-transceiver: Skip warning if no "max-bitrate" drm/amd/display: fix issues with driver unload nvme-pci: fix timeout request state check tcp: avoid the lookup process failing to get sk in ehash table octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt ptdma: pt_core_execute_cmd() should use spinlock device property: fix of node refcount leak in fwnode_graph_get_next_endpoint() w1: fix deadloop in __w1_remove_master_device() w1: fix WARNING after calling w1_process() driver core: Fix test_async_probe_init saves device in wrong array selftests/net: toeplitz: fix race on tpacket_v3 block close net: dsa: microchip: ksz9477: port map correction in ALU table entry register thermal/core: Remove duplicate information when an error occurs thermal/core: Rename 'trips' to 'num_trips' thermal: Validate new state in cur_state_store() thermal/core: fix error code in __thermal_cooling_device_register() thermal: core: call put_device() only after device_register() fails net: stmmac: enable all safety features by default tcp: fix rate_app_limited to default to 1 scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist kcsan: test: don't put the expect array on the stack cpufreq: Add SM6375 to cpufreq-dt-platdev blocklist ASoC: fsl_micfil: Correct the number of steps on SX controls net: usb: cdc_ether: add support for Thales Cinterion PLS62-W modem drm: Add orientation quirk for Lenovo ideapad D330-10IGL s390/debug: add _ASM_S390_ prefix to header guard s390: expicitly align _edata and _end symbols on page 
boundary perf/x86/msr: Add Emerald Rapids perf/x86/intel/uncore: Add Emerald Rapids cpufreq: armada-37xx: stop using 0 as NULL pointer ASoC: fsl_ssi: Rename AC'97 streams to avoid collisions with AC'97 CODEC ASoC: fsl-asoc-card: Fix naming of AC'97 CODEC widgets spi: spidev: remove debug messages that access spidev->spi without locking KVM: s390: interrupt: use READ_ONCE() before cmpxchg() scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id r8152: add vendor/device ID pair for Microsoft Devkit platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK lockref: stop doing cpu_relax in the cmpxchg loop firmware: coreboot: Check size of table entry and use flex-array drm/i915: Allow switching away via vga-switcheroo if uninitialized Revert "selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID" drm/i915: Remove unused variable x86: ACPI: cstate: Optimize C3 entry on AMD CPUs fs: reiserfs: remove useless new_opts in reiserfs_remount sysctl: add a new register_sysctl_init() interface kernel/panic: move panic sysctls to its own file panic: unset panic_on_warn inside panic() ubsan: no need to unset panic_on_warn in ubsan_epilogue() kasan: no need to unset panic_on_warn in end_report() exit: Add and use make_task_dead. 
objtool: Add a missing comma to avoid string concatenation hexagon: Fix function name in die() h8300: Fix build errors from do_exit() to make_task_dead() transition csky: Fix function name in csky_alignment() and die() ia64: make IA64_MCA_RECOVERY bool instead of tristate panic: Separate sysctl logic from CONFIG_SMP exit: Put an upper limit on how often we can oops exit: Expose "oops_count" to sysfs exit: Allow oops_limit to be disabled panic: Consolidate open-coded panic_on_warn checks panic: Introduce warn_limit panic: Expose "warn_count" to sysfs docs: Fix path paste-o for /sys/kernel/warn_count exit: Use READ_ONCE() for all oops/warn limit reads Bluetooth: hci_sync: cancel cmd_timer if hci_open failed drm/amdgpu: complete gfxoff allow signal during suspend without delay scsi: hpsa: Fix allocation size for scsi_host_alloc() KVM: SVM: fix tsc scaling cache logic module: Don't wait for GOING modules tracing: Make sure trace_printk() can output as soon as it can be used trace_events_hist: add check for return value of 'create_hist_field' ftrace/scripts: Update the instructions for ftrace-bisect.sh cifs: Fix oops due to uncleared server->smbd_conn in reconnect i2c: mv64xxx: Remove shutdown method from driver i2c: mv64xxx: Add atomic_xfer method to driver ksmbd: add smbd max io size parameter ksmbd: add max connections parameter ksmbd: do not sign response to session request for guest login ksmbd: downgrade ndr version error message to debug ksmbd: limit pdu length size according to connection status ovl: fail on invalid uid/gid mapping at copy up KVM: x86/vmx: Do not skip segment attributes if unusable bit is set KVM: arm64: GICv4.1: Fix race with doorbell on VPE activation/deactivation thermal: intel: int340x: Protect trip temperature from concurrent updates ipv6: fix reachability confirmation with proxy_ndp ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment EDAC/device: Respect any driver-supplied workqueue polling value EDAC/qcom: Do not pass 
llcc_driv_data as edac_device_ctl_info's pvt_info net: mana: Fix IRQ name - add PCI and queue number scsi: ufs: core: Fix devfreq deadlocks i2c: designware: use casting of u64 in clock multiplication to avoid overflow netlink: prevent potential spectre v1 gadgets net: fix UaF in netns ops registration error path drm/i915/selftest: fix intel_selftest_modify_policy argument types netfilter: nft_set_rbtree: Switch to node list walk for overlap detection netfilter: nft_set_rbtree: skip elements in transaction from garbage collection netlink: annotate data races around nlk->portid netlink: annotate data races around dst_portid and dst_group netlink: annotate data races around sk_state ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() ipv4: prevent potential spectre v1 gadget in fib_metrics_match() netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE netrom: Fix use-after-free of a listening socket. net/sched: sch_taprio: do not schedule in taprio_reset() sctp: fail if no bound addresses can be used for a given scope riscv/kprobe: Fix instruction simulation of JALR nvme: fix passthrough csi check gpio: mxc: Unlock on error path in mxc_flip_edge() ravb: Rename "no_ptp_cfg_active" and "ptp_cfg_active" variables net: ravb: Fix lack of register setting after system resumed for Gen3 net: ravb: Fix possible hang if RIS2_QFF1 happen net: mctp: mark socks as dead on unhash, prevent re-add thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type() net/tg3: resolve deadlock in tg3_reset_task() during EEH net: mdio-mux-meson-g12a: force internal PHY off on mux switch treewide: fix up files incorrectly marked executable tools: gpio: fix -c option of gpio-event-mon Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode" cpufreq: Move to_gov_attr_set() to cpufreq.h cpufreq: governor: Use kobject release() method to free dbs_data kbuild: Allow kernel installation packaging to override pkg-config block: fix and cleanup 
bio_check_ro x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL netfilter: conntrack: unify established states for SCTP paths perf/x86/amd: fix potential integer overflow on shift of a int Linux 5.15.91 Change-Id: I3349d802533097ac86e5c680fbd40c00c9719ec7 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-02-01 09:38:19 +00:00
Kees Cook	7b98914a6c	panic: Consolidate open-coded panic_on_warn checks commit 79cc1ba7badf9e7a12af99695a557e9ce27ee967 upstream. Several run-time checkers (KASAN, UBSAN, KFENCE, KCSAN, sched) roll their own warnings, and each check "panic_on_warn". Consolidate this into a single function so that future instrumentation can be added in a single location. Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Gow <davidgow@google.com> Cc: tangmeng <tangmeng@uniontech.com> Cc: Jann Horn <jannh@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: kasan-dev@googlegroups.com Cc: linux-mm@kvack.org Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20221117234328.594699-4-keescook@chromium.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:22 +01:00
Tiezhu Yang	b5c1acaa43	kasan: no need to unset panic_on_warn in end_report() commit e7ce7500375a63348e1d3a703c8d5003cbe3fea6 upstream. panic_on_warn is unset inside panic(), so no need to unset it before calling panic() in end_report(). Link: https://lkml.kernel.org/r/1644324666-15947-6-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reviewed-by: Marco Elver <elver@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Xuefeng Li <lixuefeng@loongson.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:20 +01:00
Dom Cobley	bb12d18e60	Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y	2023-01-30 14:49:00 +00:00
Greg Kroah-Hartman	17126d43d4	Merge 5.15.90 into android14-5.15 Changes in 5.15.90 btrfs: fix trace event name typo for FLUSH_DELAYED_REFS pNFS/filelayout: Fix coalescing test for single DS selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID tools/virtio: initialize spinlocks in vring_test.c virtio_pci: modify ENOENT to EINVAL vduse: Validate vq_num in vduse_validate_config() net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down() RDMA/srp: Move large values to a new enum for gcc13 btrfs: always report error in run_one_delayed_ref() x86/asm: Fix an assembler warning with current binutils f2fs: let's avoid panic if extent_tree is not created perf/x86/rapl: Treat Tigerlake like Icelake fbdev: omapfb: avoid stack overflow warning Bluetooth: hci_qca: Fix driver shutdown on closed serdev wifi: brcmfmac: fix regression for Broadcom PCIe wifi devices wifi: mac80211: sdata can be NULL during AMPDU start Add exception protection processing for vd in axi_chan_handle_err function zonefs: Detect append writes at invalid locations nilfs2: fix general protection fault in nilfs_btree_insert() efi: fix userspace infinite retry read efivars after EFI runtime services page fault ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform drm/amdgpu: disable runtime pm on several sienna cichlid cards(v2) drm/amd: Delay removal of the firmware framebuffer hugetlb: unshare some PMDs when splitting VMAs io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL eventpoll: add EPOLL_URING_WAKE poll wakeup flag eventfd: provide a eventfd_signal_mask() helper io_uring: pass in EPOLL_URING_WAKE for eventfd signaling and wakeups io_uring: improve send/recv error handling io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly io_uring: add flag for disabling provided buffer recycling io_uring: support MSG_WAITALL for 
IORING_OP_SEND(MSG) io_uring: allow re-poll if we made progress io_uring: fix async accept on O_NONBLOCK sockets io_uring: ensure that cached task references are always put on exit io_uring: remove duplicated calls to io_kiocb_ppos io_uring: update kiocb->ki_pos at execution time io_uring: do not recalculate ppos unnecessarily io_uring/rw: defer fsnotify calls to task context xhci-pci: set the dma max_seg_size usb: xhci: Check endpoint is valid before dereferencing it xhci: Fix null pointer dereference when host dies xhci: Add update_hub_device override for PCI xHCI hosts xhci: Add a flag to disable USB3 lpm on a xhci root port level. usb: acpi: add helper to check port lpm capability using acpi _DSM xhci: Detect lpm incapable xHC USB3 roothub ports from ACPI tables prlimit: do_prlimit needs to have a speculation check USB: serial: option: add Quectel EM05-G (GR) modem USB: serial: option: add Quectel EM05-G (CS) modem USB: serial: option: add Quectel EM05-G (RS) modem USB: serial: option: add Quectel EC200U modem USB: serial: option: add Quectel EM05CN (SG) modem USB: serial: option: add Quectel EM05CN modem staging: vchiq_arm: fix enum vchiq_status return types USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100 misc: fastrpc: Don't remove map on creater_process and device_release misc: fastrpc: Fix use-after-free race condition for maps usb: core: hub: disable autosuspend for TI TUSB8041 comedi: adv_pci1760: Fix PWM instruction handling ACPI: PRM: Check whether EFI runtime is available mmc: sunxi-mmc: Fix clock refcount imbalance during unbind mmc: sdhci-esdhc-imx: correct the tuning start tap and step setting btrfs: do not abort transaction on failure to write log tree when syncing log btrfs: fix race between quota rescan and disable leading to NULL pointer deref cifs: do not include page data when checking signature thunderbolt: Use correct function to calculate maximum USB3 link rate riscv: dts: sifive: fu740: fix size of pcie 32bit 
memory bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD staging: mt7621-dts: change some node hex addresses to lower case tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO buffer tty: fix possible null-ptr-defer in spk_ttyio_release USB: gadgetfs: Fix race between mounting and unmounting USB: serial: cp210x: add SCALANCE LPE-9000 device id usb: cdns3: remove fetched trb from cache before dequeuing usb: host: ehci-fsl: Fix module alias usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail usb: typec: altmodes/displayport: Add pin assignment helper usb: typec: altmodes/displayport: Fix pin assignment calculation usb: gadget: g_webcam: Send color matching descriptor per frame usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate() usb-storage: apply IGNORE_UAS only for HIKSEMI MD202 on RTL9210 dt-bindings: phy: g12a-usb2-phy: fix compatible string documentation dt-bindings: phy: g12a-usb3-pcie-phy: fix compatible string documentation serial: pch_uart: Pass correct sg to dma_unmap_sg() dmaengine: lgm: Move DT parsing after initialization dmaengine: tegra210-adma: fix global intr clear dmaengine: idxd: Let probe fail when workqueue cannot be enabled serial: amba-pl011: fix high priority character transmission in rs486 mode serial: atmel: fix incorrect baudrate setup gsmi: fix null-deref in gsmi_get_variable mei: me: add meteor lake point M DID drm/i915: re-disable RC6p on Sandy Bridge drm/i915/display: Check source height is > 0 drm/amd/display: Fix set scaling doesn's work drm/amd/display: Calculate output_color_space after pixel encoding adjustment drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrix drm/amdgpu: drop experimental flag on aldebaran fs/ntfs3: Fix attr_punch_hole() null pointer derenference arm64: efi: Execute runtime services from a dedicated stack efi: rt-wrapper: Add missing include Revert "drm/amdgpu: make display pinning more flexible (v2)" x86/fpu: Use _Alignof to avoid 
undefined behavior in TYPE_ALIGN tracing: Use alignof__(struct {type b;}) instead of offsetof() io_uring: io_kiocb_update_pos() should not touch file for non -1 offset io_uring/net: fix fast_iov assignment in io_setup_async_msg() net/ulp: use consistent error code when blocking ULP net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work() block: mq-deadline: Rename deadline_is_seq_writes() Revert "wifi: mac80211: fix memory leak in ieee80211_if_add()" soc: qcom: apr: Make qcom,protection-domain optional again mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma io_uring: Clean up a false-positive warning from GCC 9.3.0 io_uring: fix double poll leak on repolling io_uring/rw: ensure kiocb_end_write() is always called io_uring/rw: remove leftover debug statement Linux 5.15.90 Change-Id: I5abc2e695f7183a1d3be9d8f62633bb1df8e8a48 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-01-25 16:13:42 +00:00
Andrey Konovalov	6d926a9aae	FROMLIST: kasan: reset page tags properly with sampling [The patch is in the mm-unstable tree.] The implementation of page_alloc poisoning sampling assumed that tag_clear_highpage resets page tags for __GFP_ZEROTAGS allocations. However, this is no longer the case since commit 70c248aca9e7 ("mm: kasan: Skip unpoisoning of user pages"). This leads to kernel crashes when MTE-enabled userspace mappings are used with Hardware Tag-Based KASAN enabled. Reset page tags for __GFP_ZEROTAGS allocations in post_alloc_hook(). Also clarify and fix related comments. Fixes: 44383cef54c0 ("kasan: allow sampling page_alloc allocations for HW_TAGS") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Peter Collingbourne <pcc@google.com> Tested-by: Peter Collingbourne <pcc@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Bug: 238286329 Bug: 264310057 Link: https://lore.kernel.org/all/5dbd866714b4839069e2d8469ac45b60953db290.1674592780.git.andreyknvl@google.com/ Change-Id: Iea4234bcf7e35337c8063827b07039583bca9c66 Signed-off-by: Andrey Konovalov <andreyknvl@google.com>	2023-01-25 00:02:18 +01:00
Andrey Konovalov	25e257e5d5	FROMGIT: kasan: allow sampling page_alloc allocations for HW_TAGS [The patch is in mm-stable tree.] As Hardware Tag-Based KASAN is intended to be used in production, its performance impact is crucial. As page_alloc allocations tend to be big, tagging and checking all such allocations can introduce a significant slowdown. Add two new boot parameters that allow to alleviate that slowdown: - kasan.page_alloc.sample, which makes Hardware Tag-Based KASAN tag only every Nth page_alloc allocation with the order configured by the second added parameter (default: tag every such allocation). - kasan.page_alloc.sample.order, which makes sampling enabled by the first parameter only affect page_alloc allocations with the order equal or greater than the specified value (default: 3, see below). The exact performance improvement caused by using the new parameters depends on their values and the applied workload. The chosen default value for kasan.page_alloc.sample.order is 3, which matches both PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER. This is done for two reasons: 1. PAGE_ALLOC_COSTLY_ORDER is "the order at which allocations are deemed costly to service", which corresponds to the idea that only large and thus costly allocations are supposed to sampled. 2. One of the workloads targeted by this patch is a benchmark that sends a large amount of data over a local loopback connection. Most multi-page data allocations in the networking subsystem have the order of SKB_FRAG_PAGE_ORDER (or PAGE_ALLOC_COSTLY_ORDER). When running a local loopback test on a testing MTE-enabled device in sync mode, enabling Hardware Tag-Based KASAN introduces a ~50% slowdown. Applying this patch and setting kasan.page_alloc.sampling to a value higher than 1 allows to lower the slowdown. The performance improvement saturates around the sampling interval value of 10 with the default sampling page order of 3. This lowers the slowdown to ~20%. 
The slowdown in real scenarios involving the network will likely be better. Enabling page_alloc sampling has a downside: KASAN misses bad accesses to a page_alloc allocation that has not been tagged. This lowers the value of KASAN as a security mitigation. However, based on measuring the number of page_alloc allocations of different orders during boot in a test build, sampling with the default kasan.page_alloc.sample.order value affects only ~7% of allocations. The rest ~93% of allocations are still checked deterministically. Link: https://lkml.kernel.org/r/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Jann Horn <jannh@google.com> Cc: Mark Brand <markbrand@google.com> Cc: Peter Collingbourne <pcc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Bug: 238286329 Bug: 264310057 (cherry picked from commit 44383cef54c0ce1201f884d83cc2b367bc5aa4f7 git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-stable) Change-Id: I85f9eb4e93eeddff8f8d06238f433226affca177 Signed-off-by: Andrey Konovalov <andreyknvl@google.com>	2023-01-25 00:02:08 +01:00
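The sampling rule described in the commit above (tag every allocation below the configured order, and only every Nth allocation at or above it) can be sketched in userspace C as follows. This is a minimal model under stated assumptions, not the kernel implementation: `struct sampler`, `should_tag`, and the single shared `skip` counter are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not kernel symbols. */
struct sampler {
	unsigned long sample; /* models kasan.page_alloc.sample (N) */
	unsigned int order;   /* models kasan.page_alloc.sample.order */
	long skip;            /* sampled allocations left to skip */
};

/* Decide whether a page_alloc allocation of the given order gets tagged. */
static bool should_tag(struct sampler *s, unsigned int alloc_order)
{
	assert(s != NULL);

	/* Sampling disabled: tag every allocation. */
	if (s->sample <= 1)
		return true;

	/* Allocations below the order threshold are always tagged. */
	if (alloc_order < s->order)
		return true;

	/* Tag one allocation, then skip the next N - 1. */
	if (--s->skip < 0) {
		s->skip = (long)s->sample - 1;
		return true;
	}
	return false;
}
```

With `sample` set to 10 and `order` set to 3, twenty consecutive order-3 allocations yield two tagged ones, while every order-2 allocation is still tagged.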
Hugh Dickins	940e8922c1	mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma commit ab0c3f1251b4670978fde0bd54161795a139b060 upstream. uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd, when removing a breakpoint from hugepage text: vma->anon_vma is always set in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able to collapse some page tables in a vma which happens to have anon_vma set from CoWing elsewhere. Is anon_vma lock required? Almost not: if any page other than expected subpage of the non-anon huge page is found in the page table, collapse is aborted without making any change. However, it is possible that an anon page was CoWed from this extent in another mm or vma, in which case a concurrent lookup might look here: so keep it away while clearing pmd (but perhaps we shall go back to using pmd_lock() there in future). Note that collapse_pte_mapped_thp() is exceptional in freeing a page table without having cleared its ptes: I'm uneasy about that, and had thought pte_clear()ing appropriate; but exclusive i_mmap lock does fix the problem, and we would have to move the mmu_notification if clearing those ptes. What this fixes is not a dangerous instability. But I suggest Cc stable because uprobes "healing" has regressed in that way, so this should follow 8d3c106e19e8 into those stable releases where it was backported (and may want adjustment there - I'll supply backports as needed). 
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zach O'Keefe <zokeefe@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: <stable@vger.kernel.org> [5.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-24 07:22:49 +01:00
James Houghton	bd9a23a4bb	hugetlb: unshare some PMDs when splitting VMAs [ Upstream commit b30c14cd61025eeea2f2e8569606cd167ba9ad2d ] PMD sharing can only be done in PUD_SIZE-aligned pieces of VMAs; however, it is possible that HugeTLB VMAs are split without unsharing the PMDs first. Without this fix, it is possible to hit the uffd-wp-related WARN_ON_ONCE in hugetlb_change_protection [1]. The key there is that hugetlb_unshare_all_pmds will not attempt to unshare PMDs in non-PUD_SIZE-aligned sections of the VMA. It might seem ideal to unshare in hugetlb_vm_op_open, but we need to unshare in both the new and old VMAs, so unsharing in hugetlb_vm_op_split seems natural. [1]: https://lore.kernel.org/linux-mm/CADrL8HVeOkj0QH5VZZbRzybNE8CG-tEGFshnA+bG9nMgcWtBSg@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230104231910.1464197-1-jthoughton@google.com Fixes: `6dfeaff93b` ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when register wp") Signed-off-by: James Houghton <jthoughton@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-24 07:22:43 +01:00
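The alignment constraint in the commit above (PMD sharing applies only to PUD_SIZE-aligned pieces of a VMA) comes down to rounding the start up and the end down to a PUD_SIZE boundary. The sketch below illustrates that arithmetic in userspace C. It assumes a 1 GiB PUD_SIZE, which depends on architecture and page size, and `PUD_SIZE_SKETCH` and `pud_aligned_subrange` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 1 GiB PUD_SIZE; the real value is architecture-dependent. */
#define PUD_SIZE_SKETCH (UINT64_C(1) << 30)

/*
 * Compute the PUD_SIZE-aligned part of [start, end).
 * Returns false when no aligned piece fits inside the range.
 */
static bool pud_aligned_subrange(uint64_t start, uint64_t end,
				 uint64_t *out_start, uint64_t *out_end)
{
	assert(out_start != NULL && out_end != NULL);

	uint64_t mask = PUD_SIZE_SKETCH - 1;

	*out_start = (start + mask) & ~mask; /* round start up */
	*out_end = end & ~mask;              /* round end down */

	return *out_start < *out_end;
}
```

A VMA that is split at an address that is not PUD_SIZE-aligned shrinks or eliminates this sub-range in each half, which is why the commit unshares the PMDs at split time.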
Peter Collingbourne	f95873c0e5	Revert "FROMLIST: kasan: allow sampling page_alloc allocations for HW_TAGS" This reverts commit `fe19dff7e6`. Reason for revert: Observed frequent boot crashes on a device with sampling KASAN enabled. Bug: 265863271 Change-Id: Ib7860295065ed7aaa36d9e47d8aaa97918c7bc57 Signed-off-by: Peter Collingbourne <pcc@google.com>	2023-01-20 15:11:44 -08:00
Greg Kroah-Hartman	0fd420005c	Merge 5.15.89 into android14-5.15 Changes in 5.15.89 netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits ALSA: control-led: use strscpy in set_led_id() ALSA: hda/realtek - Turn on power early ALSA: hda/realtek: Enable mute/micmute LEDs on HP Spectre x360 13-aw0xxx KVM: arm64: Fix S1PTW handling on RO memslots KVM: arm64: nvhe: Fix build with profile optimization selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c efi: tpm: Avoid READ_ONCE() for accessing the event log docs: Fix the docs build with Sphinx 6.0 net: stmmac: add aux timestamps fifo clearance wait perf auxtrace: Fix address filter duplicate symbol selection s390/kexec: fix ipl report address for kdump ASoC: qcom: lpass-cpu: Fix fallback SD line index handling s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple() drm/virtio: Fix GEM handle creation UAF drm/i915/gt: Reset twice net/mlx5e: Set action fwd flag when parsing tc action goto cifs: Fix uninitialized memory read for smb311 posix symlink create platform/x86: dell-privacy: Only register SW_CAMERA_LENS_COVER if present platform/surface: aggregator: Ignore command messages not intended for us platform/x86: dell-privacy: Fix SW_CAMERA_LENS_COVER reporting dt-bindings: msm: dsi-controller-main: Fix operating-points-v2 constraint drm/msm/adreno: Make adreno quirks not overwrite each other dt-bindings: msm: dsi-controller-main: Fix power-domain constraint dt-bindings: msm: dsi-controller-main: Fix description of core clock dt-bindings: msm: dsi-phy-28nm: Add missing qcom, dsi-phy-regulator-ldo-mode platform/x86: ideapad-laptop: Add Legion 5 15ARH05 DMI id to set_fn_lock_led_list[] drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer dt-bindings: msm/dsi: Don't require vdds-supply on 10nm PHY dt-bindings: msm/dsi: Don't require vcca-supply on 14nm PHY platform/x86: sony-laptop: Don't turn 
off 0x153 keyboard backlight during probe ixgbe: fix pci device refcount leak ipv6: raw: Deduct extension header length in rawv6_push_pending_frames bus: mhi: host: Fix race between channel preparation and M0 event usb: ulpi: defer ulpi_register on ulpi_read_id timeout iommu/iova: Fix alloc iova overflows issue iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() sched/core: Fix use-after-free bug in dup_user_cpus_ptr() netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function. powerpc/imc-pmu: Fix use of mutex in IRQs disabled section x86/boot: Avoid using Intel mnemonics in AT&T syntax asm EDAC/device: Fix period calculation in edac_device_reset_delay_period() x86/resctrl: Fix task CLOSID/RMID update race regulator: da9211: Use irq handler when ready scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile scsi: ufs: Stop using the clock scaling lock in the error handler scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery ASoC: wm8904: fix wrong outputs volume after power reactivation ALSA: usb-audio: Make sure to stop endpoints before closing EPs ALSA: usb-audio: Relax hw constraints for implicit fb sync tipc: fix unexpected link reset due to discovery messages octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable hvc/xen: lock console list traversal nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame() af_unix: selftest: Fix the size of the parameter to connect() tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` tools/nolibc: use pselect6 on RISCV tools/nolibc/std: move the standard type definitions to std.h tools/nolibc/types: split syscall-specific definitions into their own files tools/nolibc/arch: split arch-specific code into individual files tools/nolibc/arch: mark the _start symbol as weak tools/nolibc: Remove .global _start from the entry point code tools/nolibc: restore mips branch ordering in the _start block 
tools/nolibc: fix the O_* fcntl/open macro definitions for riscv net/sched: act_mpls: Fix warning during failed attribute validation net/mlx5: Fix ptp max frequency adjustment range net/mlx5e: Don't support encap rules with gbp option perf build: Properly guard libbpf includes igc: Fix PPS delta between two synchronized end-points platform/surface: aggregator: Add missing call to ssam_request_sync_free() mm: Always release pages to the buddy allocator in memblock_free_late(). Documentation: KVM: add API issues section KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID io_uring: lock overflowing for IOPOLL arm64: atomics: format whitespace consistently arm64: atomics: remove LL/SC trampolines arm64: cmpxchg_double*: hazard against entire exchange variable efi: fix NULL-deref in init error path scsi: mpt3sas: Remove scsi_dma_map() error messages io_uring/io-wq: free worker if task_work creation is canceled io_uring/io-wq: only free worker if it was allocated for creation block: handle bio_split_to_limits() NULL return Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout" pinctrl: amd: Add dynamic debugging for active GPIOs Linux 5.15.89 Change-Id: I66c4f269aa7751b2e4aac919f822dfdcb844a69d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-01-18 13:45:44 +00:00
Greg Kroah-Hartman	efae08b654	Merge 5.15.87 into android14-5.15 Changes in 5.15.87 usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init cifs: fix oops during encryption Revert "selftests/bpf: Add test for unstable CT lookup API" nvme-pci: fix doorbell buffer value endianness nvme-pci: fix mempool alloc size nvme-pci: fix page size checks ACPI: resource: Skip IRQ override on Asus Vivobook K3402ZA/K3502ZA ACPI: resource: do IRQ override on LENOVO IdeaPad ACPI: resource: do IRQ override on XMG Core 15 ACPI: resource: do IRQ override on Lenovo 14ALC7 block, bfq: fix uaf for bfqq in bfq_exit_icq_bfqq ata: ahci: Fix PCS quirk application for suspend nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition nvmet: don't defer passthrough commands with trivial effects to the workqueue fs/ntfs3: Validate BOOT record_size fs/ntfs3: Add overflow check for attribute size fs/ntfs3: Validate data run offset fs/ntfs3: Add null pointer check to attr_load_runs_vcn fs/ntfs3: Fix memory leak on ntfs_fill_super() error path fs/ntfs3: Add null pointer check for inode operations fs/ntfs3: Validate attribute name offset fs/ntfs3: Validate buffer length while parsing index fs/ntfs3: Validate resident attribute name fs/ntfs3: Fix slab-out-of-bounds read in run_unpack soundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15 fs/ntfs3: Validate index root when initialize NTFS security fs/ntfs3: Use __GFP_NOWARN allocation at wnd_init() fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_fill_super() fs/ntfs3: Delete duplicate condition in ntfs_read_mft() fs/ntfs3: Fix slab-out-of-bounds in r_page objtool: Fix SEGFAULT powerpc/rtas: avoid device tree lookups in rtas_os_term() powerpc/rtas: avoid scheduling in rtas_os_term() HID: multitouch: fix Asus ExpertBook P2 P2451FA trackpoint HID: plantronics: Additional PIDs for double volume key presses quirk pstore: Properly assign mem_type property pstore/zone: Use GFP_ATOMIC to allocate zone buffer hfsplus: fix bug causing custom uid and gid being 
unable to be assigned with mount binfmt: Fix error return code in load_elf_fdpic_binary() ovl: Use ovl mounter's fsuid and fsgid in ovl_link() ALSA: line6: correct midi status byte when receiving data from podxt ALSA: line6: fix stack overflow in line6_midi_transmit pnode: terminate at peers of source mfd: mt6360: Add bounds checking in Regmap read/write call-backs md: fix a crash in mempool_free mm, compaction: fix fast_isolate_around() to stay within boundaries f2fs: should put a page when checking the summary info f2fs: allow to read node block after shutdown mmc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING tpm: acpi: Call acpi_put_table() to fix memory leak tpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak tpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails kcsan: Instrument memcpy/memset/memmove with newer Clang ASoC: Intel/SOF: use set_stream() instead of set_tdm_slots() for HDAudio ASoC/SoundWire: dai: expand 'stream' concept beyond SoundWire rcu-tasks: Simplify trc_read_check_handler() atomic operations net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO net/af_packet: make sure to pull mac header media: stv0288: use explicitly signed char soc: qcom: Select REMAP_MMIO for LLCC driver kest.pl: Fix grub2 menu handling for rebooting ktest.pl minconfig: Unset configs instead of just removing them jbd2: use the correct print format perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D perf/x86/intel/uncore: Clear attr_update properly arm64: dts: qcom: sdm845-db845c: correct SPI2 pins drive strength mmc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K btrfs: fix resolving backrefs for inline extent followed by prealloc ARM: ux500: do not directly dereference __iomem arm64: dts: qcom: sdm850-lenovo-yoga-c630: correct I2C12 pins drive strength selftests: Use optional USERCFLAGS and USERLDFLAGS PM/devfreq: governor: Add 
a private governor_data for governor cpufreq: Init completion before kobject_init_and_add() ALSA: patch_realtek: Fix Dell Inspiron Plus 16 ALSA: hda/realtek: Apply dual codec fixup for Dell Latitude laptops fs: dlm: fix sock release if listen fails fs: dlm: retry accept() until -EAGAIN or error returns mptcp: mark ops structures as ro_after_init mptcp: remove MPTCP 'ifdef' in TCP SYN cookies dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata dm thin: Use last transaction's pmd->root when commit failed dm thin: resume even if in FAIL mode dm thin: Fix UAF in run_timer_softirq() dm integrity: Fix UAF in dm_integrity_dtr() dm clone: Fix UAF in clone_dtr() dm cache: Fix UAF in destroy() dm cache: set needs_check flag after aborting metadata tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx' perf/core: Call LSM hook after copying perf_event_attr of/kexec: Fix reading 32-bit "linux,initrd-{start,end}" values KVM: VMX: Resume guest immediately when injecting #GP on ECREATE KVM: nVMX: Inject #GP, not #UD, if "generic" VMXON CR0/CR4 check fails KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 x86/microcode/intel: Do not retry microcode reloading on the APs ftrace/x86: Add back ftrace_expected for ftrace bug reports x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK tracing: Fix race where eprobes can be called before the event tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE tracing/hist: Fix wrong return value in parse_action_params() tracing/probes: Handle system names with hyphens tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line staging: media: tegra-video: fix chan->mipi value on error staging: media: tegra-video: fix device_node use after free ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod media: 
dvb-core: Fix double free in dvb_register_device() media: dvb-core: Fix UAF due to refcount races at releasing cifs: fix confusing debug message cifs: fix missing display of three mount options rtc: ds1347: fix value written to century register block: mq-deadline: Do not break sequential write streams to zoned HDDs md/bitmap: Fix bitmap chunk size overflow issues efi: Add iMac Pro 2017 to uefi skip cert quirk wifi: wilc1000: sdio: fix module autoloading ASoC: jz4740-i2s: Handle independent FIFO flush bits ipu3-imgu: Fix NULL pointer dereference in imgu_subdev_set_selection() ipmi: fix long wait in unload when IPMI disconnect mtd: spi-nor: Check for zero erase size in spi_nor_find_best_erase_type() ima: Fix a potential NULL pointer access in ima_restore_measurement_list ipmi: fix use after free in _ipmi_destroy_user() PCI: Fix pci_device_is_present() for VFs by checking PF PCI/sysfs: Fix double free in error path riscv: stacktrace: Fixup ftrace_graph_ret_addr retp argument riscv: mm: notify remote harts about mmu cache updates crypto: n2 - add missing hash statesize crypto: ccp - Add support for TEE for PCI ID 0x14CA driver core: Fix bus_type.match() error handling in __driver_attach() phy: qcom-qmp-combo: fix sc8180x reset iommu/amd: Fix ivrs_acpihid cmdline parsing code remoteproc: core: Do pm_relax when in RPROC_OFFLINE state parisc: led: Fix potential null-ptr-deref in start_task() device_cgroup: Roll back to original exceptions after copy failure drm/connector: send hotplug uevent on connector cleanup drm/vmwgfx: Validate the box size for the snooped cursor drm/i915/dsi: fix VBT send packet port selection for dual link DSI drm/ingenic: Fix missing platform_driver_unregister() call in ingenic_drm_init() ext4: silence the warning when evicting inode with dioread_nolock ext4: add inode table check in __ext4_get_inode_loc to aovid possible infinite loop ext4: remove trailing newline from ext4_msg() message fs: ext4: initialize fsdata in pagecache_write() ext4: fix 
use-after-free in ext4_orphan_cleanup ext4: fix undefined behavior in bit shift for ext4_check_flag_values ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode ext4: add helper to check quota inums ext4: fix bug_on in __es_tree_search caused by bad quota inode ext4: fix reserved cluster accounting in __es_remove_extent() ext4: check and assert if marking an no_delete evicting inode dirty ext4: fix bug_on in __es_tree_search caused by bad boot loader inode ext4: fix leaking uninitialized memory in fast-commit journal ext4: fix uninititialized value in 'ext4_evict_inode' ext4: init quota for 'old.inode' in 'ext4_rename' ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline ext4: fix corruption when online resizing a 1K bigalloc fs ext4: fix error code return to user-space in ext4_get_branch() ext4: avoid BUG_ON when creating xattrs ext4: fix kernel BUG in 'ext4_write_inline_data_end()' ext4: fix inode leak in ext4_xattr_inode_create() on an error path ext4: initialize quota before expanding inode in setproject ioctl ext4: avoid unaccounted block allocation when expanding inode ext4: allocate extended attribute value in vmalloc area drm/amdgpu: handle polaris10/11 overlap asics (v2) drm/amdgpu: make display pinning more flexible (v2) block: mq-deadline: Fix dd_finish_request() for zoned devices tracing: Fix issue of missing one synthetic field ext4: remove unused enum EXT4_FC_COMMIT_FAILED ext4: use ext4_debug() instead of jbd_debug() ext4: introduce EXT4_FC_TAG_BASE_LEN helper ext4: factor out ext4_fc_get_tl() ext4: fix potential out of bound read in ext4_fc_replay_scan() ext4: disable fast-commit of encrypted dir operations ext4: don't set up encryption key during jbd2 transaction ext4: add missing validation of fast-commit record lengths ext4: fix unaligned memory access in ext4_fc_reserve_space() ext4: fix off-by-one errors in fast-commit block filling ARM: renumber bits related to _TIF_WORK_MASK phy: qcom-qmp-combo: fix out-of-bounds 
clock access btrfs: replace strncpy() with strscpy() btrfs: move missing device handling in a dedicate function btrfs: fix extent map use-after-free when handling missing device in read_one_chunk x86/mce: Get rid of msr_ops x86/MCE/AMD: Clear DFR errors found in THR handler media: s5p-mfc: Fix to handle reference queue during finishing media: s5p-mfc: Clear workbit to handle error condition media: s5p-mfc: Fix in register read and write for H264 perf probe: Use dwarf_attr_integrate as generic DWARF attr accessor perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data ravb: Fix "failed to switch device to config mode" message during unbind ext4: goto right label 'failed_mount3a' ext4: correct inconsistent error msg in nojournal mode mbcache: automatically delete entries from cache on freeing ext4: fix deadlock due to mbcache entry corruption drm/i915/migrate: don't check the scratch page drm/i915/migrate: fix offset calculation drm/i915/migrate: fix length calculation SUNRPC: ensure the matching upcall is in-flight upon downcall btrfs: fix an error handling path in btrfs_defrag_leaves() bpf: pull before calling skb_postpull_rcsum() drm/panfrost: Fix GEM handle creation ref-counting netfilter: nf_tables: consolidate set description netfilter: nf_tables: add function to create set stateful expressions netfilter: nf_tables: perform type checking for existing sets vmxnet3: correctly report csum_level for encapsulated packet netfilter: nf_tables: honor set timeout and garbage collection updates veth: Fix race with AF_XDP exposing old or uninitialized descriptors nfsd: shut down the NFSv4 state objects before the filecache net: hns3: add interrupts re-initialization while doing VF FLR net: hns3: refactor hns3_nic_reuse_page() net: hns3: extract macro to simplify ring stats update code net: hns3: fix miss L3E checking for rx packet net: hns3: fix VF promisc mode not update when mac table full net: sched: fix memory leak in tcindex_set_parms qlcnic: 
prevent ->dcb use-after-free on qlcnic_dcb_enable() failure net: dsa: mv88e6xxx: depend on PTP conditionally nfc: Fix potential resource leaks vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init() vhost/vsock: Fix error handling in vhost_vsock_init() vringh: fix range used in iotlb_translate() vhost: fix range used in translate_desc() vdpa_sim: fix vringh initialization in vdpasim_queue_ready() net/mlx5: E-Switch, properly handle ingress tagged packets on VST net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path net/mlx5: Avoid recovery in probe flows net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default net/mlx5e: TC, Refactor mlx5e_tc_add_flow_mod_hdr() to get flow attr net/mlx5e: Always clear dest encap in neigh-update-del net/mlx5e: Fix hw mtu initializing at XDP SQ allocation net: amd-xgbe: add missed tasklet_kill net: ena: Fix toeplitz initial hash value net: ena: Don't register memory info on XDP exchange net: ena: Account for the number of processed bytes in XDP net: ena: Use bitmask to indicate packet redirection net: ena: Fix rx_copybreak value update net: ena: Set default value for RX interrupt moderation net: ena: Update NUMA TPH hint register upon NUMA node update net: phy: xgmiitorgmii: Fix refcount leak in xgmiitorgmii_probe RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device RDMA/mlx5: Fix validation of max_rd_atomic caps for DC drm/meson: Reduce the FIFO lines held when AFBC is not used filelock: new helper: vfs_inode_has_locks ceph: switch to vfs_inode_has_locks() to fix file lock bug gpio: sifive: Fix refcount leak in sifive_gpio_probe net: sched: atm: dont intepret cls results when asked to drop net: sched: cbq: dont intepret cls results when asked to drop net: sparx5: Fix reading of the MAC address netfilter: ipset: fix hash:net,port,net hang with /0 subnet netfilter: ipset: Rework long task execution when adding/deleting entries perf tools: Fix resources leak in 
perf_data__open_dir() drm/imx: ipuv3-plane: Fix overlay plane width fs/ntfs3: don't hold ni_lock when calling truncate_setsize() drivers/net/bonding/bond_3ad: return when there's no aggregator octeontx2-pf: Fix lmtst ID used in aura free usb: rndis_host: Secure rndis_query check against int overflow perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match non BPF mode drm/i915: unpin on error in intel_vgpu_shadow_mm_pin() caif: fix memory leak in cfctrl_linkup_request() udf: Fix extension of the last extent in the file ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet nvme: fix multipath crash caused by flush request when blktrace is enabled io_uring: check for valid register opcode earlier nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it nvme: also return I/O command effects from nvme_command_effects btrfs: check superblock to ensure the fs was not modified at thaw time x86/kexec: Fix double-free of elf header buffer x86/bugs: Flush IBP in ib_prctl_set() nfsd: fix handling of readdir in v4root vs. 
mount upcall timeout fbdev: matroxfb: G200eW: Increase max memory from 1 MB to 16 MB block: don't allow splitting of a REQ_NOWAIT bio io_uring: fix CQ waiting timeout handling thermal: int340x: Add missing attribute for data rate base riscv: uaccess: fix type of 0 variable on error in get_user() riscv, kprobes: Stricter c.jr/c.jalr decoding drm/i915/gvt: fix gvt debugfs destroy drm/i915/gvt: fix vgpu debugfs clean in remove hfs/hfsplus: use WARN_ON for sanity check hfs/hfsplus: avoid WARN_ON() for sanity check, use proper error handling ksmbd: fix infinite loop in ksmbd_conn_handler_loop() ksmbd: check nt_len to be at least CIFS_ENCPWD_SIZE in ksmbd_decode_ntlmssp_auth_blob Revert "ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007" mptcp: dedicated request sock for subflow in v6 mptcp: use proper req destructor for IPv6 ext4: don't allow journal inode to have encrypt flag selftests: set the BUILD variable to absolute path btrfs: make thaw time super block check to also verify checksum net: hns3: fix return value check bug of rx copybreak mbcache: Avoid nesting of cache->c_list_lock under bit locks efi: random: combine bootloader provided RNG seed with RNG protocol output io_uring: Fix unsigned 'res' comparison with zero in io_fixup_rw_res() drm/mgag200: Fix PLL setup for G200_SE_A rev >=4 Linux 5.15.87 Change-Id: I06fb376627506652ed60c04d56074956e6e075a0 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2023-01-18 13:13:15 +00:00
Dom Cobley	8ad43539c6	Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y	2023-01-18 11:53:49 +00:00
Aaron Thompson	b8f3b3cffb	mm: Always release pages to the buddy allocator in memblock_free_late(). [ Upstream commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 ] If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() only releases pages to the buddy allocator if they are not in the deferred range. This is correct for free pages (as defined by for_each_free_mem_pfn_range_in_zone()) because free pages in the deferred range will be initialized and released as part of the deferred init process. memblock_free_pages() is called by memblock_free_late(), which is used to free reserved ranges after memblock_free_all() has run. All pages in reserved ranges have been initialized at that point, and accordingly, those pages are not touched by the deferred init process. This means that currently, if the pages that memblock_free_late() intends to release are in the deferred range, they will never be released to the buddy allocator. They will forever be reserved. In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), which is also correct for free pages but is not correct for reserved pages. KMSAN metadata for reserved pages is initialized by kmsan_init_shadow(), which runs shortly before memblock_free_all(). For both of these reasons, memblock_free_pages() should only be called for free pages, and memblock_free_late() should call __free_pages_core() directly instead. One case where this issue can occur in the wild is EFI boot on x86_64. The x86 EFI code reserves all EFI boot services memory ranges via memblock_reserve() and frees them later via memblock_free_late() (efi_reserve_boot_services() and efi_free_boot_services(), respectively). If any of those ranges happens to fall within the deferred init range, the pages will not be released and that memory will be unavailable. 
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI: v6.2-rc2: # grep -E 'Node\|spanned\|present\|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 178867 v6.2-rc2 + patch: # grep -E 'Node\|spanned\|present\|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 222816 # +43,949 pages Fixes: `3a80a7fa79` ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Signed-off-by: Aaron Thompson <dev@aaront.org> Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-18 11:48:57 +01:00
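The `managed` counts quoted for zone DMA32 can be turned into a memory figure with a few lines of arithmetic. The sketch below assumes 4 KiB base pages (the usual x86_64 configuration; the commit message does not state the page size), and the function names are illustrative, not kernel APIs.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages, not stated in the commit message. */
#define EXAMPLE_PAGE_SIZE_KIB 4u

/* Pages gained by the patch, from the "managed" counts of one zone. */
static uint64_t example_pages_recovered(uint64_t managed_before,
					uint64_t managed_after)
{
	return managed_after - managed_before;
}

/* Convert a page count to KiB under the page-size assumption above. */
static uint64_t example_pages_to_kib(uint64_t pages)
{
	return pages * EXAMPLE_PAGE_SIZE_KIB;
}
```

With the numbers from the message, `example_pages_recovered(178867, 222816)` gives the 43,949 pages quoted, which under the 4 KiB assumption is 175,796 KiB, or a little under 172 MiB on a 1 GB VM.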
Minchan Kim	5cecdaebbf	FROMLIST: BACKPORT: mm: fix is_pinnable_page against on cma page Pages in a CMA area can have MIGRATE_ISOLATE as well as MIGRATE_CMA, so the current is_pinnable_page can miss CMA pages that have MIGRATE_ISOLATE. It ends up pinning CMA pages long-term in the pin_user_pages APIs, so CMA allocation keeps failing until the pin is released. CPU 0 CPU 1 - Task B cma_alloc alloc_contig_range pin_user_pages_fast(FOLL_LONGTERM) change pageblock as MIGRATE_ISOLATE internal_get_user_pages_fast lockless_pages_from_mm gup_pte_range try_grab_folio is_pinnable_page return true; So, pinned the page successfully. page migration failure with pinned page .. .. After 30 sec unpin_user_page(page) CMA allocation succeeded after 30 sec. The CMA allocation path protects against the migration type change race using zone->lock, but what the GUP path needs to know is just whether the page is in a CMA area or not, rather than the exact migration type. Thus, we don't need zone->lock; we just check whether the migration type is either MIGRATE_ISOLATE or MIGRATE_CMA. Adding the MIGRATE_ISOLATE check to is_pinnable_page could reject pinning of pages on MIGRATE_ISOLATE pageblocks even though they are in neither a CMA area nor the movable zone, if the page is temporarily unmovable. However, such a migration failure caused by an unexpected temporary refcount hold is a general issue, not specific to MIGRATE_ISOLATE, and MIGRATE_ISOLATE is also a transient state like other temporarily elevated refcount problems. Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Conflicts: include/linux/mm.h 1. 
There is no is_pinnable_page in 5.10 Link: https://lore.kernel.org/all/20220524171525.976723-1-minchan@kernel.org/ Bug: 231227007 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I5cdd2b8eefdd7e89658abd21c32aa84876ad7782 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit e9dd78ebe1c8e9fcc4067e0795326495a16a9c9b)	2023-01-16 01:52:43 +00:00
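The behavioural change described in this commit can be modelled in userspace as follows. This is a simplified sketch, not the kernel code: the enum, struct, and function names are hypothetical stand-ins, and details of the real helper (such as its zero-page handling) are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's pageblock migrate types. */
enum example_migratetype {
	EX_MIGRATE_UNMOVABLE,
	EX_MIGRATE_MOVABLE,
	EX_MIGRATE_CMA,
	EX_MIGRATE_ISOLATE,
};

/* Hypothetical, minimal view of a page for this illustration. */
struct example_page {
	enum example_migratetype pageblock_mt;
	bool in_movable_zone;
};

/* Before the fix: only MIGRATE_CMA pageblocks are treated as CMA. */
static bool example_is_pinnable_old(const struct example_page *p)
{
	return !(p->pageblock_mt == EX_MIGRATE_CMA || p->in_movable_zone);
}

/* After the fix: an isolated pageblock is rejected as well, because a
 * CMA pageblock may be in MIGRATE_ISOLATE while cma_alloc is running. */
static bool example_is_pinnable_new(const struct example_page *p)
{
	if (p->pageblock_mt == EX_MIGRATE_CMA ||
	    p->pageblock_mt == EX_MIGRATE_ISOLATE)
		return false;
	return !p->in_movable_zone;
}
```

For a page whose pageblock is currently `EX_MIGRATE_ISOLATE`, the old check reports it as pinnable and the new one does not, which is the race shown in the CPU 0 / CPU 1 timeline.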
Andrey Konovalov	fe19dff7e6	FROMLIST: kasan: allow sampling page_alloc allocations for HW_TAGS [The patch is in mm-unstable tree.] As Hardware Tag-Based KASAN is intended to be used in production, its performance impact is crucial. As page_alloc allocations tend to be big, tagging and checking all such allocations can introduce a significant slowdown. Add two new boot parameters that make it possible to alleviate that slowdown: - kasan.page_alloc.sample, which makes Hardware Tag-Based KASAN tag only every Nth page_alloc allocation with the order configured by the second added parameter (default: tag every such allocation). - kasan.page_alloc.sample.order, which makes sampling enabled by the first parameter only affect page_alloc allocations with an order equal to or greater than the specified value (default: 3, see below). The exact performance improvement caused by using the new parameters depends on their values and the applied workload. The chosen default value for kasan.page_alloc.sample.order is 3, which matches both PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER. This is done for two reasons: 1. PAGE_ALLOC_COSTLY_ORDER is "the order at which allocations are deemed costly to service", which corresponds to the idea that only large and thus costly allocations are supposed to be sampled. 2. One of the workloads targeted by this patch is a benchmark that sends a large amount of data over a local loopback connection. Most multi-page data allocations in the networking subsystem have the order of SKB_FRAG_PAGE_ORDER (or PAGE_ALLOC_COSTLY_ORDER). When running a local loopback test on a testing MTE-enabled device in sync mode, enabling Hardware Tag-Based KASAN introduces a ~50% slowdown. Applying this patch and setting kasan.page_alloc.sample to a value higher than 1 lowers the slowdown. The performance improvement saturates around the sampling interval value of 10 with the default sampling page order of 3. This lowers the slowdown to ~20%. 
The slowdown in real scenarios involving the network will likely be better. Enabling page_alloc sampling has a downside: KASAN misses bad accesses to a page_alloc allocation that has not been tagged. This lowers the value of KASAN as a security mitigation. However, based on measuring the number of page_alloc allocations of different orders during boot in a test build, sampling with the default kasan.page_alloc.sample.order value affects only ~7% of allocations. The rest ~93% of allocations are still checked deterministically. Link: https://lkml.kernel.org/r/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Jann Horn <jannh@google.com> Cc: Mark Brand <markbrand@google.com> Cc: Peter Collingbourne <pcc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Bug: 238286329 Bug: 264310057 Link: https://lore.kernel.org/all/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com Change-Id: Icc7befe61848021c68a12034f426f1c300181ad6 Signed-off-by: Andrey Konovalov <andreyknvl@google.com>	2023-01-13 20:27:50 +00:00
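The sampling rule described by the two boot parameters can be sketched as a small userspace model. This is an illustration under assumptions, not the patch itself: the struct and function names are hypothetical, and a single counter stands in for whatever per-CPU state the kernel implementation keeps.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two boot parameters. */
struct example_kasan_sampling {
	unsigned long sample;      /* models kasan.page_alloc.sample */
	unsigned int sample_order; /* models kasan.page_alloc.sample.order */
	long skip;                 /* allocations to skip before the next tagged one */
};

/* Returns true if this page_alloc allocation should be tagged. */
static bool example_sample_page_alloc(struct example_kasan_sampling *s,
				      unsigned int order)
{
	/* Sampling disabled: tag every allocation. */
	if (s->sample <= 1)
		return true;

	/* Small allocations are always tagged. */
	if (order < s->sample_order)
		return true;

	/* Large allocations: tag one, then skip the next (sample - 1). */
	if (--s->skip < 0) {
		s->skip = (long)s->sample - 1;
		return true;
	}
	return false;
}
```

With `sample = 10` and `sample_order = 3`, twenty consecutive order-3 allocations yield two tagged ones in this model, while every order-0 allocation is still tagged. Note that this tree later reverted the change (commit `f95873c0e5` in this log).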
Minchan Kim	a8962f626f	ANDROID: vendor hook to control blk_plug for shrink_lruvec Add vendor hook to control blk plugging for shrink_lruvec. Merged CL bcf1e503f5ed774dc28126a0f1a8c839717eafac: ANDROID: adjust vendor hook to control blk_plug Bug: 255471591 Bug: 238728493 Change-Id: Iba2603ff2e1b62cf2ee8fd6969d8ccd71416a288 Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 89fed37332fd48e0cd13b256cd85d6929d5da319)	2023-01-13 18:43:29 +00:00
Minchan Kim	7df45e50a5	ANDROID: vendor hook to control blk_plug for memory reclaim Add vendor hook to control blk plugging. Bug: 255471591 Bug: 238728493 Change-Id: I96b73cec14f0d2fea46a4828526e6ae5aa5c71b7 Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit a17e132ec4f290621666311e73f43202706d2743)	2023-01-13 18:43:29 +00:00
Minchan Kim	ba005d6032	ANDROID: vendor hook to control bh_lru and lru_cache_disable Add vendor hook for bh_lru and lru_cache_disable Bug: 238728493 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I81bfad317cf6e8633186ebb3238644306d7a102d Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 74e2ea264cd1895c493b9008b62bfea98dacf3f6)	2023-01-13 18:43:29 +00:00
Minchan Kim	243f54dd3a	ANDROID: vendor hook for TLB batching control Add vendor hook for flushing TLB batching in zap_pte_range. Merged CL 232bdcbd660b1129b7d8d0de25a563b476eeb522: ANDROID: pass argument in zap_pte_range vendor hooks Bug: 238728493 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: If2de5f070dd7b76624961f5a91440bf69a99ca2d Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit d257ef6764f228145d0fca24998162809bb5b9f7)	2023-01-13 18:43:29 +00:00
Minchan Kim	bb6ab2be93	ANDROID: vendor hook to control pagevec flush The pagevec batching causes lru_add_drain_all which is too expensive sometimes. This patch adds a new vendor hook to drain the pagevec immediately depending on the page's type. Bug: 251881967 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Id17e14e69197993ddad511a40c96e51674c02834 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 2f8253b7e6e563cc19cffa120c72f6f528664103)	2023-01-13 18:43:29 +00:00
Robin Hsu	1eeadb47e0	ANDROID: mm: vh for compaction begin/end Add vendor hook for compaction begin/end. The first use would be to measure compaction durations. Bug: 229927848 Test: local kernel build test Signed-off-by: Robin Hsu <robinhsu@google.com> Change-Id: I3d95434bf49b37199056dc9ddfc36a59a7de17b7 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 13b6bd38bb1f43bfffdb08c8f3a4a20d36ccd670)	2023-01-13 18:43:29 +00:00
Minchan Kim	e3f396c2d4	ANDROID: add vendor_hook to control CMA allocation ratio The CMA-first allocation policy for movable allocations (which upstream does not use) keeps the CMA area full nearly all the time. That is good for memory efficiency, since available CMA memory is used most of the time. However, it can make cma_alloc slow, because it triggers a lot of page migration. Add a vendor hook for those who want to restore the upstream CMA allocation policy, so that they see less page migration in cma_alloc. If the vendor hook returns false, rmqueue_bulk returns 0 without filling pcp->lists, so get_populated_pcp_list returns NULL. Once get_populated_pcp_list returns NULL, __rmqueue_pcplist retries the page allocation with the original migratetype (currently, the original migratetype cannot be MIGRATE_CMA), so the retry finds available pages on a !MIGRATE_CMA free list. Bug: 231978523 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ia031d9bc6f34085b892a8d9923bf5b9b1794f94a Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 0ca85e35bf5b4a7ff08f00a060c83e4a82380b64)	2023-01-13 18:43:29 +00:00
Dom Cobley	4f7af52a4b	Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y	2023-01-13 13:14:02 +00:00
NARIBAYASHI Akira	35d8a89862	mm, compaction: fix fast_isolate_around() to stay within boundaries commit be21b32afe470c5ae98e27e49201158a47032942 upstream. Depending on the memory configuration, isolate_freepages_block() may scan pages outside the target range and cause a panic. The panic can occur on systems with multiple zones in a single pageblock. It is rare because it only happens in special configurations. Depending on how many similar systems there are, it may be a good idea to fix this problem for older kernels as well. The problem is that the pfn passed to fast_isolate_around() could be outside the target range. Therefore we should consider the case where pfn < start_pfn, and also the case where end_pfn < pfn. This problem should have been addressed by the commit `6e2b7044c1` ("mm, compaction: make fast_isolate_freepages() stay within zone") but there was an oversight. Case 1: pfn < start_pfn, at memory compaction for node Y. [ASCII diagram: node X's zone and node Y's zone meet inside one pageblock; pfn lies below start_pfn = cc->zone->zone_start_pfn, with end_pfn above it; the range scanned by "Scan After" is marked.] Case 2: end_pfn < pfn, at memory compaction for node X. [ASCII diagram: same zone layout; start_pfn and end_pfn both lie below pfn; the range scanned by "Scan Before" is marked.] It seems that there is no good reason to skip nr_isolated pages just after the given pfn. So let's perform a simple scan from start to end instead of dividing the scan into "Before" and "After". Link: https://lkml.kernel.org/r/20221026112438.236336-1-a.naribayashi@fujitsu.com Fixes: `6e2b7044c1` ("mm, compaction: make fast_isolate_freepages() stay within zone"). 
Signed-off-by: NARIBAYASHI Akira <a.naribayashi@fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-12 11:58:47 +01:00
davidchiang	6986b4c889	ANDROID: mm: Export find_vm_area Export find_vm_area for obtaining pages of vmalloc'ed memory, which is required for both GXP and TPU modules. Bug: 263839332 Change-Id: I1d6c37a5abb6012c3ff295120dd2d3cb2871c820 Signed-off-by: davidchiang <davidchiang@google.com>	2023-01-11 04:31:52 +00:00
Charan Teja Kalla	c41503e313	ANDROID: page_pinner: prevent pp_buffer uninitialized access There is a race window between page_pinner_inited being set and the pp_buffer initialization, which can lead to accessing the uninitialized pp_buffer->lock. Avoid this by moving the pp_buffer initialization to page_ext_ops->init(), which sets page_pinner_inited only after the pp_buffer is initialized. Race scenario: 1) init_page_pinner is called --> page_pinner_inited is set. 2) __alloc_contig_migrate_range --> __page_pinner_failure_detect() accesses the pp_buffer->lock (yet to be initialized). 3) Then the pp_buffer is allocated and initialized. Below is the issue call stack: spin_bug+0x0 _raw_spin_lock_irqsave+0x3c __page_pinner_failure_detect+0x110 __alloc_contig_migrate_range+0x1c4 alloc_contig_range+0x130 cma_alloc+0x170 dma_alloc_contiguous+0xa0 __dma_direct_alloc_pages+0x16c dma_direct_alloc+0x88 Bug: 259024332 Change-Id: I6849ac4d944498b9a431b47cad7adc7903c9bbaa Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>	2023-01-05 19:14:46 +00:00
Jaegeuk Kim	d68a75b1dc	Merge "Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.15.y' into android14-5.15" into android14-5.15	2023-01-05 17:17:32 +00:00
Martin Liu	d705ab99ab	ANDROID: vendor_hooks: Export direct reclaim trace points Get direct reclaim info. Bug: 190795589 Signed-off-by: Martin Liu <liumartin@google.com> Change-Id: Ie66a3c87484a364a918c19b8e044c82f1afd6749 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `12902c9996`)	2023-01-05 04:05:51 +00:00
Martin Liu	c20204c67d	ANDROID: cma: allow to use CMA in swap-in path Now, we allow to use CMA pages for certain user space allocations. One of them is anonymous page fault case. To align the use case, we should also allow to use CMA pages in swap-in cases. This could help mitigate OOM on swap-in cases showing plenty of free CMA left. logd.klogd invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000 CPU: 0 PID: 433 Comm: logd.klogd Tainted: G W OE 5.10.100-android13-0 #1 Call trace: dump_backtrace.cfi_jt+0x0/0x8 show_stack+0x1c/0x2c dump_stack_lvl+0xc0/0x13c dump_header+0x54/0x238 oom_kill_process+0xb0/0x158 out_of_memory+0x17c/0x328 __alloc_pages_slowpath+0x5c4/0x8d0 __alloc_pages_nodemask+0x1bc/0x2e0 __read_swap_cache_async+0xdc/0x370 swap_vma_readahead+0x3b4/0x488 swapin_readahead+0x3c/0x54 do_swap_page+0x1e0/0xaa0 handle_pte_fault+0x128/0x1e0 handle_mm_fault+0x308/0x590 do_page_fault+0x33c/0x478 do_translation_fault+0x58/0x11c do_mem_abort+0x68/0x144 el0_da+0x24/0x34 el0_sync_handler+0xc4/0xec el0_sync+0x1c0/0x200 Mem-Info: active_anon:0 inactive_anon:3222 isolated_anon:62 active_file:232 inactive_file:428 isolated_file:0 unevictable:37232 dirty:3 writeback:40 slab_reclaimable:19943 slab_unreclaimable:281193 mapped:37126 shmem:2815 pagetables:8981 bounce:0 free:126007 free_pcp:223 free_cma:123062 Node 0 active_anon:16kB inactive_anon:13160kB active_file:292kB inactive_file:2000kB unevictable:148928kB isolated(anon):0kB isolated(file):0kB mapped:148308kB dirty:12kB writeback:164kB shmem:11260kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:20528kB shadow_call_stack:5200kB all_unreclaimable? 
no DMA32 free:14128kB min:7572kB low:22636kB high:37700kB reserved_highatomic:4096KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1913856kB managed:1553276kB mlocked:0kB pagetables:1292kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:2520kB lowmem_reserve[]: 0 0 0 Normal free:489888kB min:19808kB low:59220kB high:98632kB reserved_highatomic:36864KB active_anon:20kB inactive_anon:12168kB active_file:0kB inactive_file:1640kB unevictable:148928kB writepending:180kB present:4194304kB managed:4063392kB mlocked:148928kB pagetables:34632kB bounce:0kB free_pcp:1928kB local_pcp:0kB free_cma:489752kB lowmem_reserve[]: 0 0 0 DMA32: 166*4kB (UME) 163*8kB (UMECH) 592*16kB (UMCH) 5*32kB (UC) 2*64kB (C) 2*128kB (C) 2*256kB (C) 1*512kB (C) 1*1024kB (C) 0*2048kB 0*4096kB = 14032kB Normal: 969*4kB (C) 77*8kB (C) 40*16kB (C) 17*32kB (C) 5*64kB (C) 1*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 118*4096kB (C) = 490476kB 40220 total pagecache pages 30 pages in swap cache Swap cache stats: add 2634625, delete 2635304, find 160621/2963954 Free swap = 1473788kB Total swap = 2097148kB Bug: 229822798 Signed-off-by: Martin Liu <liumartin@google.com> Change-Id: Ia0bb6f72e52f77f26062e1769bfd92e831f07cab Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 3e591c63b13772a1de0ffed995b482166d27ed71)	2023-01-05 02:16:50 +00:00
Minchan Kim	dd887dbfaa	ANDROID: mm: do not count cma_alloc_fail on __GFP_NORETRY Do not account __GFP_NORETRY allocation failures as cma_alloc_fail since they are not critical failures (i.e., a caller using __GFP_NORETRY should always have a fallback plan). It is also good for compatibility with upstream: upstream cma_alloc_fail only counts failures with !__GFP_NORETRY, because upstream doesn't support __GFP_NORETRY yet. Bug: 220669548 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I377e6b033c3786e10b6b1c814037a4fc40e20a73 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 8ffc7ff817fe552592daa2b0de1760e3539663f3)	2023-01-05 02:16:50 +00:00
Minchan Kim	bb7b81497d	ANDROID: GKI: export cma_get_size Export cma_get_size to tell cma instance's size, which is needed to allocate entire pages of the cma. Bug: 218731671 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ifb2769f60250ce605236342b950907218e1c28a5 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit 7a44906686048bdcecb7dfa4fac02c4ad7f6cd06)	2023-01-05 02:16:50 +00:00
Minchan Kim	1b38e981db	ANDROID: mm: cma do not sleep for __GFP_NORETRY Do not sleep before retrying for __GFP_NORETRY since it is a fail-fast mode. Users can retry the allocation without the flag themselves if they see the failure. Bug: 192475091 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ic6a857978fda8e353b9ed770d1e0ba1808fd201e Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `12f48605e8`)	2023-01-05 02:16:50 +00:00
Minchan Kim	60d2dad38e	ANDROID: mm: cma: skip problematic pageblock alloc_contig_range is supposed to work on a range aligned to max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity. If it fails at a page and returns an error, the user doesn't know which page caused the failure, so it keeps retrying with a new range that still includes the failed page and hits the same error again and again until the range moves out of that granularity block. Instead, make CMA aware of which pfn was troublesome in the previous attempt and continue with a new pageblock beyond the failed page so that it doesn't hit the same error repeatedly. Currently, this option works only for the __GFP_NORETRY case, to be safe for existing CMA users. Bug: 192475091 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I0959c9df3d4b36408a68920abbb4d52d31026079 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `0e688e972d`)	2023-01-05 02:16:50 +00:00
Minchan Kim	c63e78a29a	ANDROID: mm: lru_cache_disable skips lru cache drainnig lru_cache_disable has a non-trivial cost since it has to run work on every core in the system. Thus, calling it repeatedly, whenever alloc_contig_range is called in the cma allocation loop, is expensive. This patch makes lru_cache_disable smarter: it does not run __lru_add_drain_all when it knows the cache was already disabled by someone else. With that, a user of alloc_contig_range can disable the lru cache in advance in their own context so that subsequent alloc_contig_range calls for the user's operation avoid the costly function call. This patch moves the lru_cache APIs from swap.h to swap.c and exports them for vendor users. Bug: 192475091 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I23da8599c55db49dc80226285972e4cd80dedcff Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `c8578a3e90`)	2023-01-05 02:16:50 +00:00
Minchan Kim	b74eeef409	ANDROID: mm: do not try test_page_isoalte if migration fails Currently, even when a page fails with -EBUSY in __alloc_contig_migrate_range, alloc_contig_range still checks those failed pages again in test_pages_isolated, hoping that they will be freed soon so that the cma allocation succeeds. However, that depends on luck, and I found that test_pages_isolated sometimes fails at the same page every time cma_alloc retries. Rather than burning CPU to check the page's status in test_pages_isolated for a __GFP_NORETRY allocation, just bail out and let the user decide what to do. Currently, this option works only for the __GFP_NORETRY case, to be safe for other existing users. Bug: 192475091 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I9211452be06960dc7d8f854537e53b3fc5262c8e Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `c01ce3b5ef`)	2023-01-05 02:16:50 +00:00
Minchan Kim	a7f55c5c73	ANDROID: mm: add cma allocation statistics alloc_contig_range is the core worker function for CMA allocation, so it has all the information needed to understand allocation latency: for example, how many pages were migrated, how many times unmap was needed to migrate pages, and how many times it encountered errors for various reasons. This patch adds such statistics to alloc_contig_range and returns them to the caller so that the caller can use the information to analyze latency. cma_alloc is the first user of the statistics and exports them as a new trace event (i.e., cma_alloc_info). It was really useful for optimizing cma allocation work. Bug: 192475091 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I7be43cc89d11078e2a324d2d06aada6d8e9e1cc9 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `675e504598`)	2023-01-05 02:16:50 +00:00
Minchan Kim	1b2de5aa2d	ANDROID: mm: cma: add vendor hoook in cma_alloc() Add vendor hook for cma_alloc latency measuring. Bug: 177231781 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ia2dbb26454bd8f03489389b29b9a6c939d3c2bbb Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `c6e85ea56b`)	2023-01-05 02:16:50 +00:00
Minchan Kim	1b6d55eb48	ANDROID: mm: build alloc_contig_dump_pages in page_alloc.o GKI has CONFIG_DYNAMIC_DEBUG_CORE. Thus, the way to enable only the specific alloc_contig_dump_pages, without needing all the pr_debug calls in every source file, is to use -DCONFIG_DYNAMIC_MODULE when page_alloc.o is compiled. Bug: 182195592 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I93266eb4161b3653389c737d4c767fd5d1083339 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `8d03e49505`)	2023-01-05 02:16:50 +00:00
Minchan Kim	9cb760bc6a	FROMLIST: mm: failfast mode with __GFP_NORETRY in alloc_contig_range Contiguous memory allocation can be stalled due to waiting on page writeback and/or page lock, which causes unpredictable delay. It's an unavoidable cost for the requestor to get big contiguous memory, but it's expensive for small contiguous memory (e.g., order-4) because the caller could retry the request in a different range that has easily migratable pages, without stalling. This patch introduces __GFP_NORETRY as the compaction gfp_mask in alloc_contig_range so that it fails fast, without blocking, when it encounters pages that need waiting. Bug: 170340257 Bug: 120293424 Link: https://lore.kernel.org/linux-mm/YAnM5PbNJZlk%2F%2FiX@google.com/T/#m1362218ebb69e6e10c20d9361008b079745c4e6f Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I42ba8dd5aeb065d936978ab205e4baf84bf9a321 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `20512940b8`)	2023-01-05 02:16:50 +00:00
Minchan Kim	eebff8eab2	FROMLIST: mm: cma: introduce gfp flag in cma_alloc instead of no_warn The upcoming patch will introduce __GFP_NORETRY semantics in alloc_contig_range, which is a failfast mode of the API. Instead of adding an additional parameter for gfp, replace no_warn with a gfp flag. To keep the old behavior, it follows the rule below (no_warn -> gfp_flags): false -> GFP_KERNEL; true -> GFP_KERNEL\|__GFP_NOWARN; gfp & __GFP_NOWARN -> GFP_KERNEL \| (gfp & __GFP_NOWARN). Bug: 170340257 Bug: 120293424 Link: https://lore.kernel.org/linux-mm/YAnM5PbNJZlk%2F%2FiX@google.com/T/#m36b144ff81fe0a8f0ecaf6813de4819ecc41f8fe Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I1ce020ab5d5fff34eb6464be4632ddef72fb43eb Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit `23ba990a3e`)	2023-01-05 02:16:50 +00:00
Jaegeuk Kim	b205c093f2	Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.15.y' into android14-5.15 * aosp/upstream-f2fs-stable-linux-5.15.y: f2fs: let's avoid panic if extent_tree is not created f2fs: should use a temp extent_info for lookup f2fs: don't mix to use union values in extent_info f2fs: initialize extent_cache parameter f2fs: fix to avoid NULL pointer dereference in f2fs_issue_flush() fs: account for group membership fs: fix acl translation fs: support mapped mounts of mapped filesystems fs: add i_user_ns() helper fs: port higher-level mapping helpers fs: remove unused low-level mapping helpers fs: use low-level mapping helpers docs: update mapping documentation fs: account for filesystem mappings fs: tweak fsuidgid_has_mapping() fs: move mapping helpers fs: add is_idmapped_mnt() helper Revert "fs: add is_idmapped_mnt() helper" Revert "fs: move mapping helpers" Revert "fs: tweak fsuidgid_has_mapping()" Revert "fs: account for filesystem mappings" Revert "docs: update mapping documentation" Revert "fs: use low-level mapping helpers" Revert "fs: remove unused low-level mapping helpers" Revert "fs: add i_user_ns() helper" Revert "fs: account for group membership" fsverity: simplify fsverity_get_digest() fsverity: stop using PG_error to track error status fs-verity: use kmap_local_page() instead of kmap() highmem: Make __kunmap_{local,atomic}() take const void pointer fs-verity: use memcpy_from_page() fs-verity: Use struct_size() helper in enable_verity() fs-verity: remove unused parameter desc_size in fsverity_create_info() fs-verity: define a function to return the integrity protected file digest fscrypt: add additional documentation for SM4 support fscrypt: remove unused Speck definitions fscrypt: Add SM4 XTS/CTS symmetric algorithm support blk-crypto: Add support for SM4-XTS blk crypto mode fscrypt: add comment for fscrypt_valid_enc_modes_v1() blk-crypto: Add a missing include directive blk-crypto: move internal only declarations to 
blk-crypto-internal.h blk-crypto: add a blk_crypto_config_supported_natively helper blk-crypto: don't use struct request_queue for public interfaces fscrypt: pass super_block to fscrypt_put_master_key_activeref() fscrypt: add fscrypt_context_for_new_inode fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size fscrypt: Add HCTR2 support for filename encryption fs: account for group membership fs: add i_user_ns() helper fs: remove unused low-level mapping helpers fs: use low-level mapping helpers docs: update mapping documentation fs: account for filesystem mappings fs: tweak fsuidgid_has_mapping() fs: move mapping helpers fs: add is_idmapped_mnt() helper Bug: 256243893 Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Change-Id: Ia7b419ef516dee40be32968cd7c60ce0adabca99	2023-01-04 17:13:29 -08:00
Minchan Kim	f74aca771c	BACKPORT: mm: don't be stuck to rmap lock on reclaim path The rmap locks (i_mmap_rwsem and anon_vma->root->rwsem) can be contended under memory pressure if processes keep working on their vmas (e.g., fork, mmap, munmap). That makes the reclaim path stuck. In our real workload traces, we see kswapd waiting on the lock for 300ms+ (worst case, a second), which makes other processes enter direct reclaim, where they also get stuck on the lock. This patch makes the lru aging path use try_lock mode, like shrink_page_list, so the reclaim context keeps working on the next lru pages without being stuck. If it finds the rmap lock contended, it rotates the page back to the head of the lru in both the active and inactive lrus to keep their behavior consistent, which is a basic starting point rather than adding more heuristics. Since this patch introduces a new "contended" field as an out-param along with the try_lock in-param in rmap_walk_control, the structure is no longer immutable when try_lock is set, so remove the const keywords on rmap related functions. Since rmap walking is already an expensive operation, I doubt the const would bring a sizable benefit (and we didn't have it until 5.17). In a heavy app workload on Android, the trace shows the following statistics. It almost removes rmap lock contention from the reclaim path. 
Martin Liu reported: Before (columns: max_dur(ms) min_dur(ms) max-min(dur)ms avg_dur(ms) sum_dur(ms) count blocked_function): 1632 0 1631 151.542173 31672 209 page_lock_anon_vma_read; 601 0 601 145.544681 28817 198 rmap_walk_file. After (same columns): NaN NaN NaN NaN NaN 0.0 NaN; 0 0 0 0.127645 1 12 rmap_walk_file. [minchan@kernel.org: add comment, per Matthew] Link: https://lkml.kernel.org/r/YnNqeB5tUf6LZ57b@google.com Link: https://lkml.kernel.org/r/20220510215423.164547-1-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: John Dias <joaodias@google.com> Cc: Tim Murray <timmurray@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Martin Liu <liumartin@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Conflicts: folio->page Conflicts: mm/huge_memory.c was refactored by commit `c5b5a3dd2c` mm: thp: refactor NUMA fault handling (cherry picked from commit 6d4675e601357834dadd2ba1d803f6484596015c) Bug: 239681156 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I0c63e0291120c8a1b5f2d83b8a7b210cb56c27a2 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit b8762fa2655469fb3f2c40a38f67e6f05afe4136)	2023-01-04 02:28:34 +00:00
Charan Teja Kalla	8ca606e98b	FROMLIST: mm: fix use-after free of page_ext after race with memory-offline The below is one path where race between page_ext and offline of the respective memory blocks will cause use-after-free on the access of page_ext structure. process1 process2 --------- --------- a)doing /proc/page_owner doing memory offline through offline_pages. b)PageBuddy check is failed thus proceed to get the page_owner information through page_ext access. page_ext = lookup_page_ext(page); migrate_pages(); ................. Since all pages are successfully migrated as part of the offline operation,send MEM_OFFLINE notification where for page_ext it calls: offline_page_ext()--> __free_page_ext()--> free_page_ext()--> vfree(ms->page_ext) mem_section->page_ext = NULL c) Check for the PAGE_EXT flags in the page_ext->flags access results into the use-after-free(leading to the translation faults). As mentioned above, there is really no synchronization between page_ext access and its freeing in the memory_offline. The memory offline steps(roughly) on a memory block is as below: 1) Isolate all the pages 2) while(1) try free the pages to buddy.(->free_list[MIGRATE_ISOLATE]) 3) delete the pages from this buddy list. 4) Then free page_ext.(Note: The struct page is still alive as it is freed only during hot remove of the memory which frees the memmap, which steps the user might not perform). This design leads to the state where struct page is alive but the struct page_ext is freed, where the later is ideally part of the former which just representing the page_flags (check [3] for why this design is chosen). The above mentioned race is just one example __but the problem persists in the other paths too involving page_ext->flags access(eg: page_is_idle())__. 
Fix all the paths where offline races with page_ext access by maintaining synchronization with rcu lock and is achieved in 3 steps: 1) Invalidate all the page_ext's of the sections of a memory block by storing a flag in the LSB of mem_section->page_ext. 2) Wait till all the existing readers to finish working with the ->page_ext's with synchronize_rcu(). Any parallel process that starts after this call will not get page_ext, through lookup_page_ext(), for the block parallel offline operation is being performed. 3) Now safely free all sections ->page_ext's of the block on which offline operation is being performed. Note: If synchronize_rcu() takes time then optimizations can be done in this path through call_rcu()[2]. Thanks to David Hildenbrand for his views/suggestions on the initial discussion[1] and Pavan kondeti for various inputs on this patch. [1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@quicinc.com/ [2] https://lore.kernel.org/all/a26ce299-aed1-b8ad-711e-a49e82bdd180@quicinc.com/T/#u [3] https://lore.kernel.org/all/6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com/ Bug: 236222283 Bug: 240196534 Link: https://lore.kernel.org/all/1661496993-11473-1-git-send-email-quic_charante@quicinc.com/ Change-Id: Ib439ae19c61a557a5c70ea90e3c4b35a5583ba0d Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com> Signed-off-by: Minchan Kim <minchan@google.com> (fixed merge conflicts and still exported lookup_page_ext) (minchan: fixed page_pinner with new page_ext scheme)	2023-01-04 02:18:52 +00:00
Minchan Kim	e12acd3eef	ANDROID: mm: introduce page_pinner For CMA allocation, it's really critical to migrate a page, but sometimes migration fails. One of the reasons is that some driver holds a page refcount for a long time, so the VM cannot migrate the page at that time. The concern here is that there is no effective way to find out who holds the refcount of the page. This patch introduces a feature to keep track of a page's pinner. Any get_page site can pin a page for a long time, but the cost of tracking all of them would be significant since get_page is one of the most frequent kernel operations. Furthermore, the page could be a kernel page rather than a user page, which is not related to the page migration failure. Thus, this patch keeps track of only the pages that failed migration, to reduce runtime cost. Once page migration fails in the CMA allocation path, those pages are marked as "migration failure", and for every put_page operation against those pages, the callstack of the put is recorded into the page_pinner buffer. Later, the admin can see which pages failed and who released the refcount since the failure. It really helps to find the long-time refcount holder that prevents the page migration. note: page_pinner doesn't guarantee that attributing/unattributing are atomic if they happen at the same time. It's just best effort, so false positives could happen. Bug: 183414571 Bug: 240196534 Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I603d0c0122734c377db6b1eb95848a6f734173a0 (cherry picked from commit 898cfbf094a2fc13c67fab5b5d3c916f0139833a)	2023-01-04 02:18:52 +00:00
Suren Baghdasaryan	a21b3ffd99	ANDROID: remove unnecessary SPECULATIVE_PAGE_FAULT config dependency After recent fixes [1], speculative page fault walks are performed with disabled interrupts, therefore do not depend on ALLOC_SPLIT_PTLOCKS which would affect them if performed under RCU protection. Remove unnecessary config dependency. [1] 5fcb50b0559a ("ANDROID: mm: fix speculative walk which is unsafe under RCU") Bug: 253557903 Change-Id: Ia1c835c7b08419f8fce61fa4f7e6842fbf786229 Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2022-12-29 20:48:14 +00:00
Minchan Kim	98f3cc7ecd	ANDROID: mm: freeing MIGRATE_ISOLATE page instantly Since Android has pcp list for MIGRATE_CMA[1], it could cause CMA allocation latency due to not freeing the MIGRATE_ISOLATE page immediately. Originally, MIGRATE_ISOLATED page is supposed to go buddy list with skipping pcp list. Otherwise, the page could be reallocated from pcp list or staying on the pcp list until the pcp is drained so that CMA keeps retrying since it couldn't find the freed page from buddy list. That worked before since the CMA pfnblocks changed only from MIGRATE_CMA to MIGRATE_ISOLATE and free function logic in page allocator has checked MIGRATE_ISOLATEness on every CMA pages using below. free_unref_page_commit if (migratetype >= MIGRATE_PCPTYPES) if(is_migrate_isolate(migratetype)) free_one_page(page); It worked since enum MIGRATE_CMA was bigger than enum MIGRATE_PCPTYPES but since [1], the enum MIGRATE_CMA is less than MIGRATE_PCPTYPES so the logic above doesn't work any more. It could cause following race CPU 0 CPU 1 free_unref_page migratetype = get_pfnblock_migratetype() set_pcppage_migratetype(MIGRATE_CMA) cma_alloc alloc_contig_range set_migrate_isolate(MIGRATE_ISOLATE) add the page into pcp list the page could be reallocated This patch couldn't fix the race completely due to missing zone->lock in order-0 page free(for performance reason). However, it's not a new problem so we need to deal with the issue separately. [1] ANDROID: mm: add cma pcp list Bug: 218731671 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ibea20085ce5bfb4b74b83b041f9bda9a380120f9 Signed-off-by: Richard Chang <richardycc@google.com> (cherry picked from commit d9e4b67784866047e8cfb5598cdf1ebc0c71f3d9)	2022-12-29 09:09:24 +00:00
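
Commit 8ca606e98b above describes invalidating page_ext by storing a flag in the LSB of mem_section->page_ext, so that lookups started afterwards get nothing back. The following is only a minimal userspace sketch of that pointer-tagging idea, not the kernel implementation: every name in it (sketch_section, sketch_publish, sketch_invalidate, sketch_lookup, sketch_detach) is hypothetical, and the synchronize_rcu() wait between invalidating and freeing (step 2 in the commit message) is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: a userspace sketch, not kernel code. */
#define SKETCH_EXT_INVALID ((uintptr_t)1)

struct sketch_section {
	uintptr_t ext;	/* pointer bits, with bit 0 reused as an "invalid" flag */
};

/* Publish a metadata pointer; it must be at least 2-byte aligned. */
void sketch_publish(struct sketch_section *ms, void *ext)
{
	ms->ext = (uintptr_t)ext;
}

/* Step 1 in the commit message: mark the pointer invalid via its LSB. */
void sketch_invalidate(struct sketch_section *ms)
{
	ms->ext |= SKETCH_EXT_INVALID;
}

/* Readers that start after invalidation get NULL instead of the pointer. */
void *sketch_lookup(const struct sketch_section *ms)
{
	if (ms->ext & SKETCH_EXT_INVALID)
		return NULL;
	return (void *)ms->ext;
}

/* Step 3: strip the flag to recover the pointer so the owner can free it. */
void *sketch_detach(struct sketch_section *ms)
{
	void *ext = (void *)(ms->ext & ~SKETCH_EXT_INVALID);

	ms->ext = 0;
	return ext;
}
```

Reusing the low bit works only because the stored pointer is aligned, so bit 0 is otherwise always zero. In the scheme the commit message describes, the free in step 3 is safe only after existing readers have finished, which this sketch does not model.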


17853 Commits