75a03dbfc836e86f6e56caf1b8e36a2735c5562f
355 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0fd37220d8 |
UPSTREAM: mm: refactor vm_area_struct::anon_vma_name usage code
Avoid mixing strings and their anon_vma_name referenced pointers by using struct anon_vma_name whenever possible. This simplifies the code and allows easier sharing of anon_vma_name structures when they represent the same name. [surenb@google.com: fix comment] Link: https://lkml.kernel.org/r/20220223153613.835563-1-surenb@google.com Link: https://lkml.kernel.org/r/20220224231834.1481408-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Colin Cross <ccross@google.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Alexey Gladkov <legion@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Chris Hyser <chris.hyser@oracle.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Collingbourne <pcc@google.com> Cc: Xiaofeng Cao <caoxiaofeng@yulong.com> Cc: David Hildenbrand <david@redhat.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 5c26f6ac9416b63d093e29c30e79b3297e425472) Bug: 218352794 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I4a6b5602ce7151d1a4b88fac489f86d68089bd4d |
||
|
|
16f06ae351 |
Merge 5.15.27 into android-5.15
Changes in 5.15.27
mac80211_hwsim: report NOACK frames in tx_status
mac80211_hwsim: initialize ieee80211_tx_info at hw_scan_work
i2c: bcm2835: Avoid clock stretching timeouts
ASoC: rt5668: do not block workqueue if card is unbound
ASoC: rt5682: do not block workqueue if card is unbound
regulator: core: fix false positive in regulator_late_cleanup()
Input: clear BTN_RIGHT/MIDDLE on buttonpads
btrfs: get rid of warning on transaction commit when using flushoncommit
KVM: arm64: vgic: Read HW interrupt pending state from the HW
block: loop:use kstatfs.f_bsize of backing file to set discard granularity
tipc: fix a bit overflow in tipc_crypto_key_rcv()
cifs: do not use uninitialized data in the owner/group sid
cifs: fix double free race when mount fails in cifs_get_root()
HID: amd_sfh: Handle amd_sfh work buffer in PM ops
HID: amd_sfh: Add functionality to clear interrupts
HID: amd_sfh: Add interrupt handler to process interrupts
cifs: modefromsids must add an ACE for authenticated users
selftests/seccomp: Fix seccomp failure by adding missing headers
drm/amd/pm: correct UMD pstate clocks for Dimgrey Cavefish and Beige Goby
selftests/ftrace: Do not trace do_softirq because of PREEMPT_RT
dmaengine: shdma: Fix runtime PM imbalance on error
i2c: cadence: allow COMPILE_TEST
i2c: imx: allow COMPILE_TEST
i2c: qup: allow COMPILE_TEST
net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990
block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern
usb: gadget: don't release an existing dev->buf
usb: gadget: clear related members when goto fail
exfat: reuse exfat_inode_info variable instead of calling EXFAT_I()
exfat: fix i_blocks for files truncated over 4 GiB
tracing: Add test for user space strings when filtering on string pointers
arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL
serial: stm32: prevent TDR register overwrite when sending x_char
ext4: drop ineligible txn start stop APIs
ext4: simplify updating of fast commit stats
ext4: fast commit may not fallback for ineligible commit
ext4: fast commit may miss file actions
sched/fair: Fix fault in reweight_entity
ata: pata_hpt37x: fix PCI clock detection
drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
tracing: Add ustring operation to filtering string pointers
ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()
NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()
NFSD: Fix zero-length NFSv3 WRITEs
io_uring: fix no lock protection for ctx->cq_extra
tools/resolve_btf_ids: Close ELF file on error
mtd: spi-nor: Fix mtd size for s3an flashes
MIPS: fix local_{add,sub}_return on MIPS64
signal: In get_signal test for signal_group_exit every time through the loop
PCI: mediatek-gen3: Disable DVFSRC voltage request
PCI: rcar: Check if device is runtime suspended instead of __clk_is_enabled()
PCI: dwc: Do not remap invalid res
PCI: aardvark: Fix checking for MEM resource type
KVM: VMX: Don't unblock vCPU w/ Posted IRQ if IRQs are disabled in guest
KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU
KVM: VMX: Read Posted Interrupt "control" exactly once per loop iteration
KVM: X86: Ensure that dirty PDPTRs are loaded
KVM: x86: Handle 32-bit wrap of EIP for EMULTYPE_SKIP with flat code seg
KVM: x86: Exit to userspace if emulation prepared a completion callback
i3c: fix incorrect address slot lookup on 64-bit
i3c/master/mipi-i3c-hci: Fix a potentially infinite loop in 'hci_dat_v1_get_index()'
tracing: Do not let synth_events block other dyn_event systems during create
Input: ti_am335x_tsc - set ADCREFM for X configuration
Input: ti_am335x_tsc - fix STEPCONFIG setup for Z2
PCI: mvebu: Check for errors from pci_bridge_emul_init() call
PCI: mvebu: Do not modify PCI IO type bits in conf_write
PCI: mvebu: Fix support for bus mastering and PCI_COMMAND on emulated bridge
PCI: mvebu: Fix configuring secondary bus of PCIe Root Port via emulated bridge
PCI: mvebu: Setup PCIe controller to Root Complex mode
PCI: mvebu: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
PCI: mvebu: Fix support for PCI_EXP_DEVCTL on emulated bridge
PCI: mvebu: Fix support for PCI_EXP_RTSTA on emulated bridge
PCI: mvebu: Fix support for DEVCAP2, DEVCTL2 and LNKCTL2 registers on emulated bridge
NFSD: Fix verifier returned in stable WRITEs
Revert "nfsd: skip some unnecessary stats in the v4 case"
nfsd: fix crash on COPY_NOTIFY with special stateid
x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi()
drm/i915: don't call free_mmap_offset when purging
SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace point
SUNRPC: Fix sockaddr handling in svcsock_accept_class trace points
drm/sun4i: dw-hdmi: Fix missing put_device() call in sun8i_hdmi_phy_get
drm/atomic: Check new_crtc_state->active to determine if CRTC needs disable in self refresh mode
ntb_hw_switchtec: Fix pff ioread to read into mmio_part_cfg_all
ntb_hw_switchtec: Fix bug with more than 32 partitions
drm/amdkfd: Check for null pointer after calling kmemdup
drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt
i3c: master: dw: check return of dw_i3c_master_get_free_pos()
dma-buf: cma_heap: Fix mutex locking section
tracing/uprobes: Check the return value of kstrdup() for tu->filename
tracing/probes: check the return value of kstrndup() for pbuf
mm: defer kmemleak object creation of module_alloc()
kasan: fix quarantine conflicting with init_on_free
selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting
hugetlbfs: fix off-by-one error in hugetlb_vmdelete_list()
drm/amdgpu/display: Only set vblank_disable_immediate when PSR is not enabled
drm/amdgpu: filter out radeon PCI device IDs
drm/amdgpu: filter out radeon secondary ids as well
drm/amd/display: Use adjusted DCN301 watermarks
drm/amd/display: move FPU associated DSC code to DML folder
ethtool: Fix link extended state for big endian
octeontx2-af: Optimize KPU1 processing for variable-length headers
octeontx2-af: Reset PTP config in FLR handler
octeontx2-af: cn10k: RPM hardware timestamp configuration
octeontx2-af: cn10k: Use appropriate register for LMAC enable
octeontx2-af: Adjust LA pointer for cpt parse header
octeontx2-af: Add KPU changes to parse NGIO as separate layer
net/mlx5e: IPsec: Refactor checksum code in tx data path
net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
bpf: Use u64_stats_t in struct bpf_prog_stats
bpf: Fix possible race in inc_misses_counter
drm/amd/display: Update watermark values for DCN301
drm: mxsfb: Set fallback bus format when the bridge doesn't provide one
drm: mxsfb: Fix NULL pointer dereference
riscv/mm: Add XIP_FIXUP for phys_ram_base
drm/i915/display: split out dpt out of intel_display.c
drm/i915/display: Move DRRS code its own file
drm/i915: Disable DRRS on IVB/HSW port != A
gve: Recording rx queue before sending to napi
net: dsa: ocelot: seville: utilize of_mdiobus_register
net: dsa: seville: register the mdiobus under devres
ibmvnic: don't release napi in __ibmvnic_open()
of: net: move of_net under net/
net: ethernet: litex: Add the dependency on HAS_IOMEM
drm/mediatek: mtk_dsi: Reset the dsi0 hardware
cifs: protect session channel fields with chan_lock
cifs: fix confusing unneeded warning message on smb2.1 and earlier
drm/amd/display: Fix stream->link_enc unassigned during stream removal
bnxt_en: Fix occasional ethtool -t loopback test failures
drm/amd/display: For vblank_disable_immediate, check PSR is really used
PCI: mvebu: Fix device enumeration regression
net: of: fix stub of_net helpers for CONFIG_NET=n
ALSA: intel_hdmi: Fix reference to PCM buffer address
ucounts: Fix systemd LimitNPROC with private users regression
riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value
riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
riscv: Fix config KASAN && DEBUG_VIRTUAL
iwlwifi: mvm: check debugfs_dir ptr before use
ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min
iommu/vt-d: Fix double list_add when enabling VMD in scalable mode
iommu/amd: Recover from event log overflow
drm/i915: s/JSP2/ICP2/ PCH
drm/amd/display: Reduce dmesg error to a debug print
xen/netfront: destroy queues before real_num_tx_queues is zeroed
thermal: core: Fix TZ_GET_TRIP NULL pointer dereference
mac80211: fix EAPoL rekey fail in 802.3 rx path
blktrace: fix use after free for struct blk_trace
ntb: intel: fix port config status offset for SPR
mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls
xfrm: fix MTU regression
netfilter: fix use-after-free in __nf_register_net_hook()
bpf, sockmap: Do not ignore orig_len parameter
xfrm: fix the if_id check in changelink
xfrm: enforce validity of offload input flags
e1000e: Correct NVM checksum verification flow
net: fix up skbs delta_truesize in UDP GRO frag_list
netfilter: nf_queue: don't assume sk is full socket
netfilter: nf_queue: fix possible use-after-free
netfilter: nf_queue: handle socket prefetch
batman-adv: Request iflink once in batadv-on-batadv check
batman-adv: Request iflink once in batadv_get_real_netdevice
batman-adv: Don't expect inter-netns unique iflink indices
net: ipv6: ensure we call ipv6_mc_down() at most once
net: dcb: flush lingering app table entries for unregistered devices
net: ipa: add an interconnect dependency
net/smc: fix connection leak
net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range
mac80211: fix forwarded mesh frames AC & queue selection
net: stmmac: fix return value of __setup handler
mac80211: treat some SAE auth steps as final
iavf: Fix missing check for running netdev
net: sxgbe: fix return value of __setup handler
ibmvnic: register netdev after init of adapter
net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
iavf: Fix deadlock in iavf_reset_task
efivars: Respect "block" flag in efivar_entry_set_safe()
auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature
firmware: arm_scmi: Remove space in MODULE_ALIAS name
ASoC: cs4265: Fix the duplicated control name
auxdisplay: lcd2s: Fix memory leak in ->remove()
auxdisplay: lcd2s: Use proper API to free the instance of charlcd object
can: gs_usb: change active_channels's type from atomic_t to u8
iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find
arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
igc: igc_read_phy_reg_gpy: drop premature return
ARM: Fix kgdb breakpoint for Thumb2
mips: setup: fix setnocoherentio() boolean setting
ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
mptcp: Correctly set DATA_FIN timeout when number of retransmits is large
selftests: mlxsw: tc_police_scale: Make test more robust
pinctrl: sunxi: Use unique lockdep classes for IRQs
igc: igc_write_phy_reg_gpy: drop premature return
ibmvnic: free reset-work-item when flushing
memfd: fix F_SEAL_WRITE after shmem huge page allocated
s390/extable: fix exception table sorting
sched: Fix yet more sched_fork() races
arm64: dts: juno: Remove GICv2m dma-range
iommu/amd: Fix I/O page table memory leak
MIPS: ralink: mt7621: do memory detection on KSEG1
ARM: dts: switch timer config to common devkit8000 devicetree
ARM: dts: Use 32KiHz oscillator on devkit8000
soc: fsl: guts: Revert commit
|
||
|
|
416e3a0e42 |
proc: fix documentation and description of pagemap
commit dd21bfa425c098b95ca86845f8e7d1ec1ddf6e4a upstream. Since bit 57 was exported for uffd-wp write-protected (commit |
||
|
|
2ded03fd7c |
Merge 5.15.25 into android13-5.15
Changes in 5.15.25 drm/nouveau/pmu/gm200-: use alternate falcon reset sequence fs/proc: task_mmu.c: don't read mapcount for migration entry btrfs: zoned: cache reported zone during mount scsi: lpfc: Fix mailbox command failure during driver initialization HID:Add support for UGTABLET WP5540 Revert "svm: Add warning message for AVIC IPI invalid target" parisc: Show error if wrong 32/64-bit compiler is being used serial: parisc: GSC: fix build when IOSAPIC is not set parisc: Drop __init from map_pages declaration parisc: Fix data TLB miss in sba_unmap_sg parisc: Fix sglist access in ccio-dma.c mmc: block: fix read single on recovery logic mm: don't try to NUMA-migrate COW pages that have other uses HID: amd_sfh: Add illuminance mask to limit ALS max value HID: i2c-hid: goodix: Fix a lockdep splat HID: amd_sfh: Increase sensor command timeout HID: amd_sfh: Correct the structure field name PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology parisc: Add ioread64_lo_hi() and iowrite64_lo_hi() btrfs: send: in case of IO error log it platform/x86: touchscreen_dmi: Add info for the RWC NANOTE P8 AY07J 2-in-1 platform/x86: ISST: Fix possible circular locking dependency detected kunit: tool: Import missing importlib.abc selftests: rtc: Increase test timeout so that all tests run kselftest: signal all child processes net: ieee802154: at86rf230: Stop leaking skb's selftests/zram: Skip max_comp_streams interface on newer kernel selftests/zram01.sh: Fix compression ratio calculation selftests/zram: Adapt the situation that /dev/zram0 is being used selftests: openat2: Print also errno in failure messages selftests: openat2: Add missing dependency in Makefile selftests: openat2: Skip testcases that fail with EOPNOTSUPP selftests: skip mincore.check_file_mmap when fs lacks needed support ax25: improve the incomplete fix to avoid UAF and NPD bugs pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP vfs: make freeze_super abort when sync_filesystem returns error quota: make dquot_quota_sync return errors from ->sync_fs scsi: pm80xx: Fix double completion for SATA devices kselftest: Fix vdso_test_abi return status scsi: core: Reallocate device's budget map on queue depth change scsi: pm8001: Fix use-after-free for aborted TMF sas_task scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task drm/amd: Warn users about potential s0ix problems nvme: fix a possible use-after-free in controller reset during load nvme-tcp: fix possible use-after-free in transport error_recovery work nvme-rdma: fix possible use-after-free in transport error_recovery work net: sparx5: do not refer to skb after passing it on drm/amd: add support to check whether the system is set to s3 drm/amd: Only run s3 or s0ix if system is configured properly drm/amdgpu: fix logic inversion in check x86/Xen: streamline (and fix) PV CPU enumeration Revert "module, async: async_synchronize_full() on module init iff async is used" gcc-plugins/stackleak: Use noinstr in favor of notrace random: wake up /dev/random writers after zap KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case KVM: x86: nSVM: fix potential NULL derefernce on nested migration KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state iwlwifi: fix use-after-free drm/radeon: Fix backlight control on iMac 12,1 drm/atomic: Don't pollute crtc_state->mode_blob with error pointers drm/amd/pm: correct the sequence of sending gpu reset msg drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. drm/i915/opregion: check port number bounds for SWSCI display power state drm/i915: Fix dbuf slice config lookup drm/i915: Fix mbus join config lookup vsock: remove vsock from connected table when connect is interrupted by a signal drm/cma-helper: Set VM_DONTEXPAND for mmap drm/i915/gvt: Make DRM_I915_GVT depend on X86 drm/i915/ttm: tweak priority hint selection iwlwifi: pcie: fix locking when "HW not ready" iwlwifi: pcie: gen2: fix locking when "HW not ready" iwlwifi: mvm: don't send SAR GEO command for 3160 devices selftests: netfilter: fix exit value for nft_concat_range netfilter: nft_synproxy: unregister hooks on init error path selftests: netfilter: disable rp_filter on router ipv4: fix data races in fib_alias_hw_flags_set ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt ipv6: mcast: use rcu-safe version of ipv6_get_lladdr() ipv6: per-netns exclusive flowlabel checks Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" mac80211: mlme: check for null after calling kmemdup brcmfmac: firmware: Fix crash in brcm_alt_fw_path cfg80211: fix race in netlink owner interface destruction net: dsa: lan9303: fix reset on probe net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN net: dsa: lantiq_gswip: fix use after free in gswip_remove() net: dsa: lan9303: handle hwaccel VLAN tags net: dsa: lan9303: add VLAN IDs to master device net: ieee802154: ca8210: Fix lifs/sifs periods ping: fix the dif and sdif check in ping_lookup bonding: force carrier update when releasing slave drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit net_sched: add __rcu annotation to netdev->qdisc bonding: fix data-races around agg_select_timer libsubcmd: Fix use-after-free for realloc(..., 0) net/smc: Avoid overwriting the copies of clcsock callback functions net: phy: mediatek: remove PHY mode check on MT7531 atl1c: fix tx timeout after link flap on Mikrotik 10/25G NIC tipc: fix wrong publisher node address in link publications dpaa2-switch: fix default return of dpaa2_switch_flower_parse_mirror_key dpaa2-eth: Initialize mutex used in one step timestamping path net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled perf bpf: Defer freeing string after possible strlen() on it selftests/exec: Add non-regular to TEST_GEN_PROGS arm64: Correct wrong label in macro __init_el2_gicv3 ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra ALSA: hda/realtek: Add quirk for Legion Y9000X 2019 ALSA: hda/realtek: Fix deadlock by COEF mutex ALSA: hda: Fix regression on forced probe mask option ALSA: hda: Fix missing codec probe on Shenker Dock 15 ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw() ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_range() ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_sx() ASoC: ops: Fix stereo change notifications in snd_soc_put_xr_sx() cifs: fix set of group SID via NTSD xattrs powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE powerpc/lib/sstep: fix 'ptesync' build error mtd: rawnand: gpmi: don't leak PM reference in error path smb3: fix snapshot mount option tipc: fix wrong notification node addresses scsi: ufs: Remove dead code scsi: ufs: Fix a deadlock in the error handler ASoC: tas2770: Insert post reset delay ASoC: qcom: Actually clear DMA interrupt register for HDMI block/wbt: fix negative inflight counter when remove scsi device NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked() NFS: LOOKUP_DIRECTORY is also ok with symlinks NFS: Do not report writeback errors in nfs_getattr() tty: n_tty: do not look ahead for EOL character past the end of the buffer block: fix surprise removal for drivers calling blk_set_queue_dying mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe() mtd: parsers: qcom: Fix kernel panic on skipped partition mtd: parsers: qcom: Fix missing free for pparts in cleanup mtd: phram: Prevent divide by zero bug in phram_setup() mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status HID: elo: fix memory leak in elo_probe mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get Drivers: hv: vmbus: Fix memory leak in vmbus_add_channel_kobj KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id() KVM: x86/pmu: Don't truncate the PerfEvtSeln MSR when creating a perf event KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW ARM: OMAP2+: hwmod: Add of_node_put() before break ARM: OMAP2+: adjust the location of put_device() call in omapdss_init_of phy: usb: Leave some clocks running during suspend staging: vc04_services: Fix RCU dereference check phy: phy-mtk-tphy: Fix duplicated argument in phy-mtk-tphy irqchip/sifive-plic: Add missing thead,c900-plic match string x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm netfilter: conntrack: don't refresh sctp entries in closed state ksmbd: fix same UniqueId for dot and dotdot entries ksmbd: don't align last entry offset in smb2 query directory arm64: dts: meson-gx: add ATF BL32 reserved-memory region arm64: dts: meson-g12: add ATF BL32 reserved-memory region arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610 pidfd: fix test failure due to stack overflow on some arches selftests: fixup build warnings in pidfd / clone3 tests mm: io_uring: allow oom-killer from io_uring_setup ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems" kconfig: let 'shell' return enough output for deep path names ata: libata-core: Disable TRIM on M88V29 soc: aspeed: lpc-ctrl: Block error printing on probe defer cases xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create drm/rockchip: dw_hdmi: Do not leave clock enabled in error case tracing: Fix tp_printk option related with tp_printk_stop_on_boot display/amd: decrease message verbosity about watermarks table failure drm/amd/display: Cap pflip irqs per max otg number drm/amd/display: fix yellow carp wm clamping net: usb: qmi_wwan: Add support for Dell DW5829e net: macb: Align the dma and coherent dma masks kconfig: fix failing to generate auto.conf scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop EDAC: Fix calculation of returned address and next offset in edac_align_ptr() ucounts: Handle wrapping in is_ucounts_overlimit ucounts: In set_cred_ucounts assume new->ucounts is non-NULL ucounts: Base set_cred_ucounts changes on the real user ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1 lib/iov_iter: initialize "flags" in new pipe_buffer rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user ucounts: Move RLIMIT_NPROC handling after set_user net: sched: limit TC_ACT_REPEAT loops dmaengine: sh: rcar-dmac: Check for error num after setting mask dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size tests: fix idmapped mount_setattr test i2c: qcom-cci: don't delete an unregistered adapter i2c: qcom-cci: don't put a device tree node before i2c_add_adapter() dmaengine: ptdma: Fix the error handling path in pt_core_init() copy_process(): Move fd_install() out of sighand->siglock critical section scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp() ice: enable parsing IPSEC SPI headers for RSS i2c: brcmstb: fix support for DSL and CM variants lockdep: Correct lock_classes index mapping Linux 5.15.25 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ib129a0e11f5e82d67563329a5de1b0aef1d87928 |
||
|
|
a8dd0cfa37 |
fs/proc: task_mmu.c: don't read mapcount for migration entry
commit 24d7275ce2791829953ed4e72f68277ceb2571c6 upstream.
The syzbot reported the below BUG:
kernel BUG at include/linux/page-flags.h:785!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Call Trace:
page_mapcount include/linux/mm.h:837 [inline]
smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
walk_pmd_range mm/pagewalk.c:128 [inline]
walk_pud_range mm/pagewalk.c:205 [inline]
walk_p4d_range mm/pagewalk.c:240 [inline]
walk_pgd_range mm/pagewalk.c:277 [inline]
__walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
walk_page_vma+0x277/0x350 mm/pagewalk.c:530
smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
smap_gather_stats fs/proc/task_mmu.c:741 [inline]
show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
seq_read+0x3e0/0x5b0 fs/seq_file.c:162
vfs_read+0x1b5/0x600 fs/read_write.c:479
ksys_read+0x12d/0x250 fs/read_write.c:619
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The reproducer was trying to read /proc/$PID/smaps when calling
MADV_FREE at the mean time. MADV_FREE may split THPs if it is called
for partial THP. It may trigger the below race:
CPU A CPU B
----- -----
smaps walk: MADV_FREE:
page_mapcount()
PageCompound()
split_huge_page()
page = compound_head(page)
PageDoubleMap(page)
When calling PageDoubleMap() this page is not a tail page of THP anymore
so the BUG is triggered.
This could be fixed by elevated refcount of the page before calling
mapcount, but that would prevent it from counting migration entries, and
it seems overkilling because the race just could happen when PMD is
split so all PTE entries of tail pages are actually migration entries,
and smaps_account() does treat migration entries as mapcount == 1 as
Kirill pointed out.
Add a new parameter for smaps_account() to tell this entry is migration
entry then skip calling page_mapcount(). Don't skip getting mapcount
for device private entries since they do track references with mapcount.
Pagemap also has the similar issue although it was not reported. Fixed
it as well.
[shy828301@gmail.com: v4]
Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
[nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()]
Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org
Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com
Fixes:
|
||
|
|
049413278d |
UPSTREAM: mm: move anon_vma declarations to linux/mm_inline.h
The patch to add anonymous vma names causes a build failure in some
configurations:
include/linux/mm_types.h: In function 'is_same_vma_anon_name':
include/linux/mm_types.h:924:37: error: implicit declaration of function 'strcmp' [-Werror=implicit-function-declaration]
924 | return name && vma_name && !strcmp(name, vma_name);
| ^~~~~~
include/linux/mm_types.h:22:1: note: 'strcmp' is defined in header '<string.h>'; did you forget to '#include <string.h>'?
This should not really be part of linux/mm_types.h in the first place,
as that header is meant to only contain structure defintions and need a
minimum set of indirect includes itself.
While the header clearly includes more than it should at this point,
let's not make it worse by including string.h as well, which would pull
in the expensive (compile-speed wise) fortify-string logic.
Move the new functions into a separate header that only needs to be
included in a couple of locations.
Link: https://lkml.kernel.org/r/20211207125710.2503446-1-arnd@kernel.org
Fixes: "mm: add a field to store names for private anonymous memory"
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 17fca131cee21724ee953a17c185c14e9533af5b)
Bug: 120441514
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I54719d7ea27d3cf53ef7245b2af88d2a2bc9bafe
|
||
|
|
301c56064d |
UPSTREAM: mm: add a field to store names for private anonymous memory
In many userspace applications, and especially in VM based applications like Android uses heavily, there are multiple different allocators in use. At a minimum there is libc malloc and the stack, and in many cases there are libc malloc, the stack, direct syscalls to mmap anonymous memory, and multiple VM heaps (one for small objects, one for big objects, etc.). Each of these layers usually has its own tools to inspect its usage; malloc by compiling a debug version, the VM through heap inspection tools, and for direct syscalls there is usually no way to track them. On Android we heavily use a set of tools that use an extended version of the logic covered in Documentation/vm/pagemap.txt to walk all pages mapped in userspace and slice their usage by process, shared (COW) vs. unique mappings, backing, etc. This can account for real physical memory usage even in cases like fork without exec (which Android uses heavily to share as many private COW pages as possible between processes), Kernel SamePage Merging, and clean zero pages. It produces a measurement of the pages that only exist in that process (USS, for unique), and a measurement of the physical memory usage of that process with the cost of shared pages being evenly split between processes that share them (PSS). If all anonymous memory is indistinguishable then figuring out the real physical memory usage (PSS) of each heap requires either a pagemap walking tool that can understand the heap debugging of every layer, or for every layer's heap debugging tools to implement the pagemap walking logic, in which case it is hard to get a consistent view of memory across the whole system. Tracking the information in userspace leads to all sorts of problems. It either needs to be stored inside the process, which means every process has to have an API to export its current heap information upon request, or it has to be stored externally in a filesystem that somebody needs to clean up on crashes. It needs to be readable while the process is still running, so it has to have some sort of synchronization with every layer of userspace. Efficiently tracking the ranges requires reimplementing something like the kernel vma trees, and linking to it from every layer of userspace. It requires more memory, more syscalls, more runtime cost, and more complexity to separately track regions that the kernel is already tracking. This patch adds a field to /proc/pid/maps and /proc/pid/smaps to show a userspace-provided name for anonymous vmas. The names of named anonymous vmas are shown in /proc/pid/maps and /proc/pid/smaps as [anon:<name>]. Userspace can set the name for a region of memory by calling prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name) Setting the name to NULL clears it. The name length limit is 80 bytes including NUL-terminator and is checked to contain only printable ascii characters (including space), except '[',']','\','$' and '`'. Ascii strings are being used to have a descriptive identifiers for vmas, which can be understood by the users reading /proc/pid/maps or /proc/pid/smaps. Names can be standardized for a given system and they can include some variable parts such as the name of the allocator or a library, tid of the thread using it, etc. The name is stored in a pointer in the shared union in vm_area_struct that points to a null terminated string. Anonymous vmas with the same name (equivalent strings) and are otherwise mergeable will be merged. The name pointers are not shared between vmas even if they contain the same name. The name pointer is stored in a union with fields that are only used on file-backed mappings, so it does not increase memory usage. CONFIG_ANON_VMA_NAME kernel configuration is introduced to enable this feature. It keeps the feature disabled by default to prevent any additional memory overhead and to avoid confusing procfs parsers on systems which are not ready to support named anonymous vmas. The patch is based on the original patch developed by Colin Cross, more specifically on its latest version [1] posted upstream by Sumit Semwal. It used a userspace pointer to store vma names. In that design, name pointers could be shared between vmas. However during the last upstreaming attempt, Kees Cook raised concerns [2] about this approach and suggested to copy the name into kernel memory space, perform validity checks [3] and store as a string referenced from vm_area_struct. One big concern is about fork() performance which would need to strdup anonymous vma names. Dave Hansen suggested experimenting with worst-case scenario of forking a process with 64k vmas having longest possible names [4]. I ran this experiment on an ARM64 Android device and recorded a worst-case regression of almost 40% when forking such a process. This regression is addressed in the followup patch which replaces the pointer to a name with a refcounted structure that allows sharing the name pointer between vmas of the same name. Instead of duplicating the string during fork() or when splitting a vma it increments the refcount. [1] https://lore.kernel.org/linux-mm/20200901161459.11772-4-sumit.semwal@linaro.org/ [2] https://lore.kernel.org/linux-mm/202009031031.D32EF57ED@keescook/ [3] https://lore.kernel.org/linux-mm/202009031022.3834F692@keescook/ [4] https://lore.kernel.org/linux-mm/5d0358ab-8c47-2f5f-8e43-23b89d6a8e95@intel.com/ Changes for prctl(2) manual page (in the options section): PR_SET_VMA Sets an attribute specified in arg2 for virtual memory areas starting from the address specified in arg3 and spanning the size specified in arg4. arg5 specifies the value of the attribute to be set. Note that assigning an attribute to a virtual memory area might prevent it from being merged with adjacent virtual memory areas due to the difference in that attribute's value. Currently, arg2 must be one of: PR_SET_VMA_ANON_NAME Set a name for anonymous virtual memory areas. arg5 should be a pointer to a null-terminated string containing the name. The name length including null byte cannot exceed 80 bytes. If arg5 is NULL, the name of the appropriate anonymous virtual memory areas will be reset. The name can contain only printable ascii characters (including space), except '[',']','\','$' and '`'. This feature is available only if the kernel is built with the CONFIG_ANON_VMA_NAME option enabled. [surenb@google.com: docs: proc.rst: /proc/PID/maps: fix malformed table] Link: https://lkml.kernel.org/r/20211123185928.2513763-1-surenb@google.com [surenb: rebased over v5.15-rc6, replaced userpointer with a kernel copy, added input sanitization and CONFIG_ANON_VMA_NAME config. The bulk of the work here was done by Colin Cross, therefore, with his permission, keeping him as the author] Link: https://lkml.kernel.org/r/20211019215511.3771969-2-surenb@google.com Signed-off-by: Colin Cross <ccross@google.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Glauber <jan.glauber@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Landley <rob@landley.net> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Shaohua Li <shli@fusionio.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 9a10064f5625d5572c3626c1516e0bebc6c9fe9b) Bug: 120441514 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I53d56d551a7d62f75341304751814294b447c04e |
||
|
|
f355f9635d |
Revert "ANDROID: mm: add a field to store names for private anonymous memory"
This reverts commit
|
||
|
|
65d0fb3715 |
Revert "ANDROID: fix up 60500a4228 ("ANDROID: mm: add a field to store names for private anonymous memory")"
This reverts commit
|
||
|
|
c5cd945b24 |
Merge fd47ff55c9 ("Merge tag 'usb-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb") into android-mainline
Steps on the way to 5.15-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I42ffa8818bbb2072f043923553c4d8f91d9647a5 |
||
|
|
8d0920bde5 |
mm: remove VM_DENYWRITE
All in-tree users of MAP_DENYWRITE are gone. MAP_DENYWRITE cannot be set from user space, so all users are gone; let's remove it. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: David Hildenbrand <david@redhat.com> |
||
|
|
293f275f4d |
Merge commit df8ba5f160 ("Merge tag 'kgdb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux") into android-mainline
A large step en route to v5.14-rc1 Change-Id: I52bb71dc737044a593d1a9dfd7fe02b31e273ff9 Signed-off-by: Lee Jones <lee.jones@linaro.org> |
||
|
|
8e658623d4 |
Merge commit c288d9cd71 ("Merge tag 'for-5.14/io_uring-2021-06-30' of git://git.kernel.dk/linux-block") into android-mainline
Another small step en route to v5.14-rc1 Change-Id: I24899ab78da7d367574ed69ceaa82ab0837d9556 Signed-off-by: Lee Jones <lee.jones@linaro.org> |
||
|
|
af5cdaf822 |
mm: remove special swap entry functions
Patch series "Add support for SVM atomics in Nouveau", v11. Introduction ============ Some devices have features such as atomic PTE bits that can be used to implement atomic access to system memory. To support atomic operations to a shared virtual memory page such a device needs access to that page which is exclusive of the CPU. This series introduces a mechanism to temporarily unmap pages granting exclusive access to a device. These changes are required to support OpenCL atomic operations in Nouveau to shared virtual memory (SVM) regions allocated with the CL_MEM_SVM_ATOMICS clSVMAlloc flag. A more complete description of the OpenCL SVM feature is available at https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/ OpenCL_API.html#_shared_virtual_memory . Implementation ============== Exclusive device access is implemented by adding a new swap entry type (SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main difference is that on fault the original entry is immediately restored by the fault handler instead of waiting. Restoring the entry triggers calls to MMU notifers which allows a device driver to revoke the atomic access permission from the GPU prior to the CPU finalising the entry. Patches ======= Patches 1 & 2 refactor existing migration and device private entry functions. Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated functionality into separate functions - try_to_migrate_one() and try_to_munlock_one(). Patch 5 renames some existing code but does not introduce functionality. Patch 6 is a small clean-up to swap entry handling in copy_pte_range(). Patch 7 contains the bulk of the implementation for device exclusive memory. Patch 8 contains some additions to the HMM selftests to ensure everything works as expected. Patch 9 is a cleanup for the Nouveau SVM implementation. Patch 10 contains the implementation of atomic access for the Nouveau driver. Testing ======= This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program which checks that GPU atomic accesses to system memory are atomic. Without this series the test fails as there is no way of write-protecting the page mapping which results in the device clobbering CPU writes. For reference the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/ Further testing has been performed by adding support for testing exclusive access to the hmm-tests kselftests. This patch (of 10): Remove multiple similar inline functions for dealing with different types of special swap entries. Both migration and device private swap entries use the swap offset to store a pfn. Instead of multiple inline functions to obtain a struct page for each swap entry type use a common function pfn_swap_entry_to_page(). Also open-code the various entry_to_pfn() functions as this results is shorter code that is easier to understand. Link: https://lkml.kernel.org/r/20210616105937.23201-1-apopple@nvidia.com Link: https://lkml.kernel.org/r/20210616105937.23201-2-apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
fb8e37f35a |
mm/pagemap: export uffd-wp protection information
Export the PTE/PMD status of uffd-wp to pagemap too. Link: https://lkml.kernel.org/r/20210428225030.9708-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Shaohua Li <shli@fb.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Wang Qing <wangqing@vivo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
e6be37b2e7 |
mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()
Since commit |
||
|
|
a458b76a41 |
mm: gup: pack has_pinned in MMF_HAS_PINNED
has_pinned 32bit can be packed in the MMF_HAS_PINNED bit as a noop cleanup. Any atomic_inc/dec to the mm cacheline shared by all threads in pin-fast would reintroduce a loss of SMP scalability to pin-fast, so there's no future potential usefulness to keep an atomic in the mm for this. set_bit(MMF_HAS_PINNED) will be theoretically a bit slower than WRITE_ONCE (atomic_set is equivalent to WRITE_ONCE), but the set_bit (just like atomic_set after this commit) has to be still issued only once per "mm", so the difference between the two will be lost in the noise. will-it-scale "mmap2" shows no change in performance with enterprise config as expected. will-it-scale "pin_fast" retains the > 4000% SMP scalability performance improvement against upstream as expected. This is a noop as far as overall performance and SMP scalability are concerned. [peterx@redhat.com: pack has_pinned in MMF_HAS_PINNED] Link: https://lkml.kernel.org/r/YJqWESqyxa8OZA+2@t490s [akpm@linux-foundation.org: coding style fixes] [peterx@redhat.com: fix build for task_mmu.c, introduce mm_set_has_pinned_flag, fix comments] Link: https://lkml.kernel.org/r/20210507150553.208763-4-peterx@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
85adc860fd |
Merge 6efb943b86 Linux 5.13-rc1 into android-mainline
One giant leap, all the way up to 5.13-rc1 Also take the opportunity to re-align (a.k.a. fix a couple of previous merge conflict fix-up issues) which occurred during this merge-window. Fixes: |
||
|
|
7677f7fd8b |
userfaultfd: add minor fault registration mode
Patch series "userfaultfd: add minor fault handling", v9. Overview ======== This series adds a new userfaultfd feature, UFFD_FEATURE_MINOR_HUGETLBFS. When enabled (via the UFFDIO_API ioctl), this feature means that any hugetlbfs VMAs registered with UFFDIO_REGISTER_MODE_MISSING will *also* get events for "minor" faults. By "minor" fault, I mean the following situation: Let there exist two mappings (i.e., VMAs) to the same page(s) (shared memory). One of the mappings is registered with userfaultfd (in minor mode), and the other is not. Via the non-UFFD mapping, the underlying pages have already been allocated & filled with some contents. The UFFD mapping has not yet been faulted in; when it is touched for the first time, this results in what I'm calling a "minor" fault. As a concrete example, when working with hugetlbfs, we have huge_pte_none(), but find_lock_page() finds an existing page. We also add a new ioctl to resolve such faults: UFFDIO_CONTINUE. The idea is, userspace resolves the fault by either a) doing nothing if the contents are already correct, or b) updating the underlying contents using the second, non-UFFD mapping (via memcpy/memset or similar, or something fancier like RDMA, or etc...). In either case, userspace issues UFFDIO_CONTINUE to tell the kernel "I have ensured the page contents are correct, carry on setting up the mapping". Use Case ======== Consider the use case of VM live migration (e.g. under QEMU/KVM): 1. While a VM is still running, we copy the contents of its memory to a target machine. The pages are populated on the target by writing to the non-UFFD mapping, using the setup described above. The VM is still running (and therefore its memory is likely changing), so this may be repeated several times, until we decide the target is "up to date enough". 2. We pause the VM on the source, and start executing on the target machine. During this gap, the VM's user(s) will *see* a pause, so it is desirable to minimize this window. 3. Between the last time any page was copied from the source to the target, and when the VM was paused, the contents of that page may have changed - and therefore the copy we have on the target machine is out of date. Although we can keep track of which pages are out of date, for VMs with large amounts of memory, it is "slow" to transfer this information to the target machine. We want to resume execution before such a transfer would complete. 4. So, the guest begins executing on the target machine. The first time it touches its memory (via the UFFD-registered mapping), userspace wants to intercept this fault. Userspace checks whether or not the page is up to date, and if not, copies the updated page from the source machine, via the non-UFFD mapping. Finally, whether a copy was performed or not, userspace issues a UFFDIO_CONTINUE ioctl to tell the kernel "I have ensured the page contents are correct, carry on setting up the mapping". We don't have to do all of the final updates on-demand. The userfaultfd manager can, in the background, also copy over updated pages once it receives the map of which pages are up-to-date or not. Interaction with Existing APIs ============================== Because this is a feature, a registered VMA could potentially receive both missing and minor faults. I spent some time thinking through how the existing API interacts with the new feature: UFFDIO_CONTINUE cannot be used to resolve non-minor faults, as it does not allocate a new page. If UFFDIO_CONTINUE is used on a non-minor fault: - For non-shared memory or shmem, -EINVAL is returned. - For hugetlb, -EFAULT is returned. UFFDIO_COPY and UFFDIO_ZEROPAGE cannot be used to resolve minor faults. Without modifications, the existing codepath assumes a new page needs to be allocated. This is okay, since userspace must have a second non-UFFD-registered mapping anyway, thus there isn't much reason to want to use these in any case (just memcpy or memset or similar). - If UFFDIO_COPY is used on a minor fault, -EEXIST is returned. - If UFFDIO_ZEROPAGE is used on a minor fault, -EEXIST is returned (or -EINVAL in the case of hugetlb, as UFFDIO_ZEROPAGE is unsupported in any case). - UFFDIO_WRITEPROTECT simply doesn't work with shared memory, and returns -ENOENT in that case (regardless of the kind of fault). Future Work =========== This series only supports hugetlbfs. I have a second series in flight to support shmem as well, extending the functionality. This series is more mature than the shmem support at this point, and the functionality works fully on hugetlbfs, so this series can be merged first and then shmem support will follow. This patch (of 6): This feature allows userspace to intercept "minor" faults. By "minor" faults, I mean the following situation: Let there exist two mappings (i.e., VMAs) to the same page(s). One of the mappings is registered with userfaultfd (in minor mode), and the other is not. Via the non-UFFD mapping, the underlying pages have already been allocated & filled with some contents. The UFFD mapping has not yet been faulted in; when it is touched for the first time, this results in what I'm calling a "minor" fault. As a concrete example, when working with hugetlbfs, we have huge_pte_none(), but find_lock_page() finds an existing page. This commit adds the new registration mode, and sets the relevant flag on the VMAs being registered. In the hugetlb fault path, if we find that we have huge_pte_none(), but find_lock_page() does indeed find an existing page, then we have a "minor" fault, and if the VMA has the userfaultfd registration flag, we call into userfaultfd to handle it. This is implemented as a new registration mode, instead of an API feature. This is because the alternative implementation has significant drawbacks [1]. However, doing it this was requires we allocate a VM_* flag for the new registration mode. On 32-bit systems, there are no unused bits, so this feature is only supported on architectures with CONFIG_ARCH_USES_HIGH_VMA_FLAGS. When attempting to register a VMA in MINOR mode on 32-bit architectures, we return -EINVAL. [1] https://lore.kernel.org/patchwork/patch/1380226/ [peterx@redhat.com: fix minor fault page leak] Link: https://lkml.kernel.org/r/20210322175132.36659-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210301222728.176417-1-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20210301222728.176417-2-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michal Koutn" <mkoutny@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shaohua Li <shli@fb.com> Cc: Shawn Anastasio <shawn@anastas.io> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Price <steven.price@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Adam Ruprecht <ruprecht@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Cannon Matthews <cannonmatthews@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
6bb5c8b522 |
Merge 5.12-rc3 into android-mainline
Linux 5.12-rc3 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ibc683be6a28885f39d477e3f86ec2ecd1a9e7e21 |
||
|
|
ca6eb14d64 |
mm: use is_cow_mapping() across tree where proper
After is_cow_mapping() is exported in mm.h, replace some manual checks elsewhere throughout the tree but start to use the new helper. Link: https://lkml.kernel.org/r/20210217233547.93892-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@ziepe.ca> Cc: VMware Graphics <linux-graphics-maintainer@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Gal Pressman <galpress@amazon.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Wei Zhang <wzam@amazon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
1e5781383b |
Merge 657bd90c93 ("Merge tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") into android-mainline
Steps on the way to 5.12-rc1
Resolves conflicts in:
kernel/sched/cpufreq_schedutil.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I06d90f919467f3e7e8970aaedbb872a10eb699ff
|
||
|
|
912efa17e5 |
mm: proc: Invalidate TLB after clearing soft-dirty page state
Since commit |
||
|
|
902943e032 |
Merge v5.11-rc4 into android-mainline
Linux 5.11-rc4 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id18fa2761612efa7e1e8d03f4225bfac40064004 |
||
|
|
9348b73c2e |
mm: don't play games with pinned pages in clear_page_refs
Turning a pinned page read-only breaks the pinning after COW. Don't do it. The whole "track page soft dirty" state doesn't work with pinned pages anyway, since the page might be dirtied by the pinning entity without ever being noticed in the page tables. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
29a951dfb3 |
mm: fix clear_refs_write locking
Turning page table entries read-only requires the mmap_sem held for writing. So stop doing the odd games with turning things from read locks to write locks and back. Just get the write lock. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
35dcb7ff81 |
Merge 7f376f1917 ("Merge tag 'mtd/fixes-for-5.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux") into android-mainline
Steps on the way to 5.10-final Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iaf246e292babee342e89c090dad69e12e2cb4d75 |
||
|
|
40d6366e9d |
proc: use untagged_addr() for pagemap_read addresses
When we try to visit the pagemap of a tagged userspace pointer, we find
that the start_vaddr is not correct because of the tag.
To fix it, we should untag the userspace pointers in pagemap_read().
I tested with 5.10-rc4 and the issue remains.
Explanation from Catalin in [1]:
"Arguably, that's a user-space bug since tagged file offsets were never
supported. In this case it's not even a tag at bit 56 as per the arm64
tagged address ABI but rather down to bit 47. You could say that the
problem is caused by the C library (malloc()) or whoever created the
tagged vaddr and passed it to this function. It's not a kernel
regression as we've never supported it.
Now, pagemap is a special case where the offset is usually not
generated as a classic file offset but rather derived by shifting a
user virtual address. I guess we can make a concession for pagemap
(only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)"
My test code is based on [2]:
A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8
userspace program:
uint64 OsLayer::VirtualToPhysical(void *vaddr) {
uint64 frame, paddr, pfnmask, pagemask;
int pagesize = sysconf(_SC_PAGESIZE);
off64_t off = ((uintptr_t)vaddr) / pagesize * 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0
int fd = open(kPagemapPath, O_RDONLY);
...
if (lseek64(fd, off, SEEK_SET) != off || read(fd, &frame, 8) != 8) {
int err = errno;
string errtxt = ErrorString(err);
if (fd >= 0)
close(fd);
return 0;
}
...
}
kernel fs/proc/task_mmu.c:
static ssize_t pagemap_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
...
src = *ppos;
svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54
start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000
end_vaddr = mm->task_size;
/* watch out for wraparound */
// svpfn == 0xb400007662f54
// (mm->task_size >> PAGE) == 0x8000000
if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4
start_vaddr = end_vaddr;
ret = 0;
while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr
int len;
unsigned long end;
...
}
...
}
[1] https://lore.kernel.org/patchwork/patch/1343258/
[2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158
Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
Cc: <stable@vger.kernel.org> [5.4-]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
05d2a661fd |
Merge 54a4c789ca ("Merge tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media") into android-mainline
Steps on the way to 5.10-rc1 Resolves conflicts in: fs/userfaultfd.c Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ie3fe3c818f1f6565cfd4fa551de72d2b72ef60af |
||
|
|
75c90a8c3a |
Merge d5660df4a5 ("Merge branch 'akpm' (patches from Andrew)") into android-mainline
steps on the way to 5.10-rc1 Change-Id: Iddc84c25b6a9d71fa8542b927d6f69c364131c3d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
1c84293163 |
Merge 6734e20e39 ("Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux") into android-mainline
Tiny steps on the way to 5.10-rc1. Change-Id: I8ff6cb398ac1c0623bf2cefd29860616d05be107 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
4d45e75a99 |
mm: remove the now-unnecessary mmget_still_valid() hack
The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
ff9f47f6f0 |
mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
smaps_rollup will try to grab mmap_lock and go through the whole vma list until it finishes the iterating. When encountering large processes, the mmap_lock will be held for a longer time, which may block other write requests like mmap and munmap from progressing smoothly. There are upcoming mmap_lock optimizations like range-based locks, but the lock applied to smaps_rollup would be the coarse type, which doesn't avoid the occurrence of unpleasant contention. To solve aforementioned issue, we add a check which detects whether anyone wants to grab mmap_lock for write attempts. Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Price <steven.price@arm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Song Liu <songliubraving@fb.com> Cc: Jimmy Assarsson <jimmyassarsson@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Link: http://lkml.kernel.org/r/1597715898-3854-4-git-send-email-chinwen.chang@mediatek.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
03b4b11493 |
mm: smaps*: extend smap_gather_stats to support specified beginning
Extend smap_gather_stats to support indicated beginning address at which it should start gathering. To achieve the goal, we add a new parameter @start assigned by the caller and try to refactor it for simplicity. If @start is 0, it will use the range of @vma for gathering. Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Steven Price <steven.price@arm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Huang Ying <ying.huang@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jimmy Assarsson <jimmyassarsson@gmail.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/1597715898-3854-3-git-send-email-chinwen.chang@mediatek.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
8cf886463e |
proc: optimise smaps for shmem entries
Avoid bumping the refcount on pages when we're only interested in the swap entries. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Huang Ying <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: William Kucharski <william.kucharski@oracle.com> Link: https://lkml.kernel.org/r/20200910183318.20139-5-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
9f3419315f |
arm64: mte: Add PROT_MTE support to mmap() and mprotect()
To enable tagging on a memory range, the user must explicitly opt in via a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new memory type in the AttrIndx field of a pte, simplify the or'ing of these bits over the protection_map[] attributes by making MT_NORMAL index 0. There are two conditions for arch_vm_get_page_prot() to return the MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE, registered as VM_MTE in the vm_flags, and (2) the vma supports MTE, decided during the mmap() call (only) and registered as VM_MTE_ALLOWED. arch_calc_vm_prot_bits() is responsible for registering the user request as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its mmap() file ops call. In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on stack or brk area. The Linux mmap() syscall currently ignores unknown PROT_* flags. In the presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE will not report an error and the memory will not be mapped as Normal Tagged. For consistency, mprotect(PROT_MTE) will not report an error either if the memory range does not support MTE. Two subsequent patches in the series will propose tightening of this behaviour. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> |
||
|
|
b5c8a97d50 |
ANDROID: fix up 60500a4228 ("ANDROID: mm: add a field to store names for private anonymous memory")
In the mainline commit
|
||
|
|
418b4bd4a0 |
Merge dc06fe51d2 ("Merge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux") into android-mainline
Steps on the way to 5.9-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iceded779988ff472863b7e1c54e22a9fa6383a30 |
||
|
|
471e78cc76 |
/proc/PID/smaps: consistent whitespace output format
The keys in smaps output are padded to fixed width with spaces. All
except for THPeligible that uses tabs (only since commit
|
||
|
|
a253db8915 |
Merge ad57a1022f ("Merge tag 'exfat-for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat") into android-mainline
Steps on the way to 5.8-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4bc42f572167ea2f815688b4d1eb6124b6d260d4 |
||
|
|
a64320f5b8 |
Merge 94709049fb ("Merge branch 'akpm' (patches from Andrew)") into android-mainline
Tiny merge resolutions along the way to 5.8-rc1. Change-Id: I24b3cca28ed36f32c92b6374dae5d7f006d3bced Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
c1e8d7c6a7 |
mmap locking API: convert mmap_sem comments
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
d8ed45c5dc |
mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
48c3dc5979 |
Merge commit '533b220f7be4' ("Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux") into android-mainline
arm64 baby steps along the way to 5.8-rc1. Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: I245a58d34012df3a7c5702bc05f6f4803e53856e |
||
|
|
6103983f46 |
Merge 3ee3723b40 ("Merge tag 'm68k-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k") into android-mainline
Steps along the way to 5.8-rc1 Change-Id: I9b3945d9f149835b7db64d8eba015d8de4160013 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
94709049fb |
Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ... |
||
|
|
c94b6923fa |
/proc/PID/smaps: Add PMD migration entry parsing
Now, when reading /proc/PID/smaps, the PMD migration entry in page table is simply ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing is added. To test the patch, we run pmbench to eat 400 MB memory in background, then run /usr/bin/migratepages and `cat /proc/PID/smaps` every second. The issue as follows can be reproduced within 60 seconds. Before the patch, for the fully populated 400 MB anonymous VMA, some THP pages under migration may be lost as below. 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0 Size: 409600 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Rss: 407552 kB Pss: 407552 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 407552 kB Referenced: 301056 kB Anonymous: 407552 kB LazyFree: 0 kB AnonHugePages: 405504 kB ShmemPmdMapped: 0 kB FilePmdMapped: 0 kB Shared_Hugetlb: 0 kB Private_Hugetlb: 0 kB Swap: 0 kB SwapPss: 0 kB Locked: 0 kB THPeligible: 1 VmFlags: rd wr mr mw me ac After the patch, it will be always, 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0 Size: 409600 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Rss: 409600 kB Pss: 409600 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 409600 kB Referenced: 294912 kB Anonymous: 409600 kB LazyFree: 0 kB AnonHugePages: 407552 kB ShmemPmdMapped: 0 kB FilePmdMapped: 0 kB Shared_Hugetlb: 0 kB Private_Hugetlb: 0 kB Swap: 0 kB SwapPss: 0 kB Locked: 0 kB THPeligible: 1 VmFlags: rd wr mr mw me ac Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Link: http://lkml.kernel.org/r/20200403123059.1846960-1-ying.huang@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
533b220f7b |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"A sizeable pile of arm64 updates for 5.8.
Summary below, but the big two features are support for Branch Target
Identification and Clang's Shadow Call stack. The latter is currently
arm64-only, but the high-level parts are all in core code so it could
easily be adopted by other architectures pending toolchain support
Branch Target Identification (BTI):
- Support for ARMv8.5-BTI in both user- and kernel-space. This allows
branch targets to limit the types of branch from which they can be
called and additionally prevents branching to arbitrary code,
although kernel support requires a very recent toolchain.
- Function annotation via SYM_FUNC_START() so that assembly functions
are wrapped with the relevant "landing pad" instructions.
- BPF and vDSO updates to use the new instructions.
- Addition of a new HWCAP and exposure of BTI capability to userspace
via ID register emulation, along with ELF loader support for the
BTI feature in .note.gnu.property.
- Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
Shadow Call Stack (SCS):
- Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each task
that holds only return addresses. This protects function return
control flow from buffer overruns on the main stack.
- Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
- Core support for SCS, should other architectures want to use it
too.
- SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
CPU feature detection:
- Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a concern
for KVM, which disabled support for 32-bit guests on such a system.
- Addition of new ID registers and fields as the architecture has
been extended.
Perf and PMU drivers:
- Minor fixes and cleanups to system PMU drivers.
Hardware errata:
- Unify KVM workarounds for VHE and nVHE configurations.
- Sort vendor errata entries in Kconfig.
Secure Monitor Call Calling Convention (SMCCC):
- Update to the latest specification from Arm (v1.2).
- Allow PSCI code to query the SMCCC version.
Software Delegated Exception Interface (SDEI):
- Unexport a bunch of unused symbols.
- Minor fixes to handling of firmware data.
Pointer authentication:
- Add support for dumping the kernel PAC mask in vmcoreinfo so that
the stack can be unwound by tools such as kdump.
- Simplification of key initialisation during CPU bringup.
BPF backend:
- Improve immediate generation for logical and add/sub instructions.
vDSO:
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
ACPI:
- Work around for an ambiguity in the IORT specification relating to
the "num_ids" field.
- Support _DMA method for all named components rather than only PCIe
root complexes.
- Minor other IORT-related fixes.
Miscellaneous:
- Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
- Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
- Refactoring and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
KVM: arm64: Check advertised Stage-2 page size capability
arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
ACPI/IORT: Remove the unused __get_pci_rid()
arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
arm64/cpufeature: Introduce ID_MMFR5 CPU register
arm64/cpufeature: Introduce ID_DFR1 CPU register
arm64/cpufeature: Introduce ID_PFR2 CPU register
arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
arm64: mm: Add asid_gen_match() helper
firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
arm64: vdso: Fix CFI directives in sigreturn trampoline
arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
...
|
||
|
|
80e4e56132 |
Merge branch 'for-next/bti-user' into for-next/bti
Merge in user support for Branch Target Identification, which narrowly missed the cut for 5.7 after a late ABI concern. * for-next/bti-user: arm64: bti: Document behaviour for dynamically linked binaries arm64: elf: Fix allnoconfig kernel build with !ARCH_USE_GNU_PROPERTY arm64: BTI: Add Kconfig entry for userspace BTI mm: smaps: Report arm64 guarded pages in smaps arm64: mm: Display guarded pages in ptdump KVM: arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: traps: Shuffle code to eliminate forward declarations arm64: unify native/compat instruction skipping arm64: BTI: Decode BYTPE bits when printing PSTATE arm64: elf: Enable BTI at exec based on ELF program properties elf: Allow arch to tweak initial mmap prot flags arm64: Basic Branch Target Identification support ELF: Add ELF program property parsing support ELF: UAPI and Kconfig additions for ELF program properties |
||
|
|
66648766ef |
mm: Remove MPX leftovers
Remove MPX leftovers in generic code.
Fixes:
|