dd887dbfaaf816cdecdd28ff8578e8c290848b69
1193 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
7f53b0e704 |
BACKPORT: mm: multi-gen LRU: support page table walks
To further exploit spatial locality, the aging prefers to walk page tables
to search for young PTEs and promote hot pages. A kill switch will be
added in the next patch to disable this behavior. When disabled, the
aging relies on the rmap only.
NB: this behavior has nothing in common with the page table scanning in the
2.4 kernel [1], which searches page tables for old PTEs, adds cold pages
to the swapcache and unmaps them.
To avoid confusion, the term "iteration" specifically means the traversal
of an entire mm_struct list; the term "walk" will be applied to page
tables and the rmap, as usual.
An mm_struct list is maintained for each memcg, and an mm_struct follows
its owner task to the new memcg when this task is migrated. Given an
lruvec, the aging iterates lruvec_memcg()->mm_list and calls
walk_page_range() with each mm_struct on this list to promote hot pages
before it increments max_seq.
When multiple page table walkers iterate the same list, each of them gets
a unique mm_struct; therefore they can run concurrently. Page table
walkers ignore any misplaced pages, e.g., if an mm_struct was migrated,
pages it left in the previous memcg will not be promoted when its current
memcg is under reclaim. Similarly, page table walkers will not promote
pages from nodes other than the one under reclaim.
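The claim that each walker gets a unique mm_struct can be illustrated with a small userspace sketch. It is not the kernel code: `struct mm_list`, `mm_list_claim_next()` and the integer IDs are hypothetical stand-ins, and a C11 atomic cursor is assumed here where the kernel may well use a lock-protected list position instead.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg's list of mm_structs. */
struct mm_list {
	const int *mm_ids;      /* integers standing in for mm_struct pointers */
	size_t nr;              /* number of entries on the list */
	atomic_size_t cursor;   /* first entry not yet claimed by any walker */
};

/*
 * Claim the next unvisited entry. Because the cursor is advanced
 * atomically, two concurrent walkers never receive the same entry.
 * Returns -1 once every entry has been handed out, i.e. the
 * iteration is complete.
 */
static int mm_list_claim_next(struct mm_list *list)
{
	size_t i = atomic_fetch_add(&list->cursor, 1);

	return i < list->nr ? list->mm_ids[i] : -1;
}
```

Each walker would call `mm_list_claim_next()` in a loop and walk the page tables of whatever it receives; since no entry is handed out twice, walkers do not need to coordinate further on the same mm_struct.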
This patch uses the following optimizations when walking page tables:
1. It tracks the usage of mm_struct's between context switches so that
page table walkers can skip processes that have been sleeping since
the last iteration.
2. It uses generational Bloom filters to record populated branches so
that page table walkers can reduce their search space based on the
query results, e.g., to skip page tables containing mostly holes or
misplaced pages.
3. It takes advantage of the accessed bit in non-leaf PMD entries when
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y.
4. It does not zigzag between a PGD table and the same PMD table
spanning multiple VMAs. IOW, it finishes all the VMAs within the
range of the same PMD table before it returns to a PGD table. This
improves the cache performance for workloads that have large
numbers of tiny VMAs [2], especially when CONFIG_PGTABLE_LEVELS=5.
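Optimization 2 can be sketched as follows. This is a minimal userspace illustration of a two-generation Bloom filter, not the patch's implementation: the filter size, the hash mixing constant and all names (`gen_bloom`, `bloom_add()`, `bloom_test()`, `bloom_reset()`) are assumptions made for the example. The idea assumed here is that walkers query the filter of the current generation to decide whether a branch is worth descending into, record populated branches in the filter of the next generation, and clear a filter before it is reused.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_FILTERS  2      /* current generation and next generation */
#define FILTER_BITS 1024   /* illustrative size, not taken from the patch */

struct gen_bloom {
	uint8_t bits[NR_FILTERS][FILTER_BITS / 8];
};

/* Derive two bit positions from one item (hypothetical mixing constant). */
static void bloom_keys(uintptr_t item, uint32_t key[2])
{
	uint64_t h = (uint64_t)item * 0x9E3779B97F4A7C15ull;

	key[0] = (uint32_t)(h >> 32) % FILTER_BITS;
	key[1] = (uint32_t)h % FILTER_BITS;
}

/* Record a populated branch in the filter that belongs to generation seq. */
static void bloom_add(struct gen_bloom *b, unsigned long seq, uintptr_t item)
{
	uint32_t key[2];
	uint8_t *f = b->bits[seq % NR_FILTERS];

	bloom_keys(item, key);
	f[key[0] / 8] |= (uint8_t)(1u << (key[0] % 8));
	f[key[1] / 8] |= (uint8_t)(1u << (key[1] % 8));
}

/* False means "definitely not recorded"; true means "possibly recorded". */
static bool bloom_test(const struct gen_bloom *b, unsigned long seq,
		       uintptr_t item)
{
	uint32_t key[2];
	const uint8_t *f = b->bits[seq % NR_FILTERS];

	bloom_keys(item, key);
	return ((f[key[0] / 8] >> (key[0] % 8)) & 1) &&
	       ((f[key[1] / 8] >> (key[1] % 8)) & 1);
}

/* Clear the filter of generation seq before it is populated again. */
static void bloom_reset(struct gen_bloom *b, unsigned long seq)
{
	memset(b->bits[seq % NR_FILTERS], 0, sizeof(b->bits[0]));
}
```

A false positive only costs an unnecessary descent into a sparse branch, whereas a cleared or empty filter never reports a match, so the structure can shrink the search space without tracking branches exactly.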
Server benchmark results:
Single workload:
fio (buffered I/O): no change
Single workload:
memcached (anon): +[8, 10]%
            Ops/sec      KB/sec
patch1-7:   1147696.57   44640.29
patch1-8:   1245274.91   48435.66
Configurations:
no change
Client benchmark results:
kswapd profiles:
patch1-7
48.16% lzo1x_1_do_compress (real work)
8.20% page_vma_mapped_walk (overhead)
7.06% _raw_spin_unlock_irq
2.92% ptep_clear_flush
2.53% __zram_bvec_write
2.11% do_raw_spin_lock
2.02% memmove
1.93% lru_gen_look_around
1.56% free_unref_page_list
1.40% memset
patch1-8
49.44% lzo1x_1_do_compress (real work)
6.19% page_vma_mapped_walk (overhead)
5.97% _raw_spin_unlock_irq
3.13% get_pfn_page
2.85% ptep_clear_flush
2.42% __zram_bvec_write
2.08% do_raw_spin_lock
1.92% memmove
1.44% alloc_zspage
1.36% memset
Configurations:
no change
Thanks to the following developers for their efforts [3].
kernel test robot <lkp@intel.com>
[1] https://lwn.net/Articles/23732/
[2] https://llvm.org/docs/ScudoHardenedAllocator.html
[3] https://lore.kernel.org/r/202204160827.ekEARWQo-lkp@intel.com/
Link: https://lkml.kernel.org/r/20220918080010.2920238-9-yuzhao@google.com
Change-Id: I7ed3daf288e664e15bfd34991a77467a19a4e39a
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Larabel <Michael@MichaelLarabel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit bd74fdaea146029e4fa12c6de89adbe0779348a9)
[ Resolve conflicts in include/linux/memcontrol.h,
include/linux/mm_types.h ]
Bug: 249601646
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
|
||
|
|
02dc0d1dda |
Revert "FROMLIST: mm: multi-gen LRU: support page table walks"
This reverts commit
|
||
|
|
3f311327f9 |
ANDROID: mm: introduce vma refcounting to protect vma during SPF
The current mechanism for stabilizing a vma during speculative page fault
handling makes a copy of the faulting vma under RCU protection. This makes
it hard to protect elements which do not belong to the vma but are used by
the page fault handler, like vma->vm_file. The problem is that a copy of
the vma can't be used to safely protect the file attached to the original
vma unless the file is also released after an RCU grace period (which is
how SPF was designed originally, but that caused a performance regression
and had to be changed).
To avoid these complications, introduce vma refcounting to stabilize and
operate on the original vma during page fault handling. The page fault
handler finds the vma and increases its refcount under RCU protection; the
vma is freed after an RCU grace period, and vma->vm_file is released only
after the refcount indicates no users. This mechanism guarantees that once
get_vma returns a vma, both the vma itself and vma->vm_file are stable.
Additional benefits of this patch are that we don't need to copy the vma
and that no additional logic is needed to stabilize vma->vm_file.
Bug: 257443051
Change-Id: I59d373926d687fcbd56847a8c3500c43bf1844c8
Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
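The refcounting scheme described in this message can be sketched in userspace C. This is a simplified model under stated assumptions, not the patch itself: `vma_stub`, `file_stub`, `vma_tryget()` and `vma_put()` are hypothetical names, RCU is not modeled at all, and releasing the file is reduced to setting a flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct file_stub {
	int released;           /* set once the last vma user is gone */
};

struct vma_stub {
	atomic_int refcount;    /* starts at 1 while the vma is mapped */
	struct file_stub *file; /* stand-in for vma->vm_file */
};

/*
 * Take a reference unless the count has already dropped to zero.
 * A fault handler that finds the vma during lookup would call this
 * before touching vma->file.
 */
static bool vma_tryget(struct vma_stub *vma)
{
	int old = atomic_load(&vma->refcount);

	while (old > 0) {
		/* On failure, old is reloaded and the check is repeated. */
		if (atomic_compare_exchange_weak(&vma->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one releases the file. */
static void vma_put(struct vma_stub *vma)
{
	if (atomic_fetch_sub(&vma->refcount, 1) == 1 && vma->file)
		vma->file->released = 1;
}
```

In this model an unmap drops the initial reference, but the file stays alive until a concurrent fault handler that succeeded in `vma_tryget()` also calls `vma_put()`, which is the stability guarantee the message describes.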
||
|
|
50567620db |
ANDROID: reimplement vm_file protection during speculative page fault
Use vma->vm_file refcounting to protect the file during speculative page
fault handling.
Bug: 258731892
Change-Id: I222c23785391bea7d95c4506d70d6f68029ec45f
Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
|
|
c11ef6356b |
Revert "ANDROID: add vma->file_ref_count to synchronize vma->vm_file destruction"
This reverts commit
|
||
|
|
5a1075de9c |
Merge 5.15.68 into android14-5.15
Changes in 5.15.68 net: wwan: iosm: remove pointless null check efi: libstub: Disable struct randomization efi: capsule-loader: Fix use-after-free in efi_capsule_write wifi: iwlegacy: 4965: corrected fix for potential off-by-one overflow in il4965_rs_fill_link_cmd() fs: only do a memory barrier for the first set_buffer_uptodate() Revert "mm: kmemleak: take a full lowmem check in kmemleak_*_phys()" scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX scsi: megaraid_sas: Fix double kfree() drm/gem: Fix GEM handle release errors drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup. drm/radeon: add a force flush to delay work when radeon scsi: ufs: core: Reduce the power mode change timeout Revert "parisc: Show error if wrong 32/64-bit compiler is being used" parisc: ccio-dma: Handle kmalloc failure in ccio_init_resources() parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level netfilter: conntrack: work around exceeded receive window cpufreq: check only freq_table in __resolve_freq() net/core/skbuff: Check the return value of skb_copy_bits() md: Flush workqueue md_rdev_misc_wq in md_alloc() fbdev: fbcon: Destroy mutex on freeing struct fb_info fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init() drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly ALSA: pcm: oss: Fix race at SNDCTL_DSP_SYNC ALSA: emu10k1: Fix out of bounds access in snd_emu10k1_pcm_channel_alloc() ALSA: aloop: Fix random zeros in capture data when using jiffies timer ALSA: usb-audio: Split endpoint setups for hw_params and prepare ALSA: usb-audio: Fix an out-of-bounds bug in __snd_usb_parse_audio_interface() tracing: Fix to check event_mutex is held while accessing trigger list btrfs: zoned: set pseudo max append zone limit in zone emulation mode vfio/type1: Unpin zero pages 
kprobes: Prohibit probes in gate area debugfs: add debugfs_lookup_and_remove() sched/debug: fix dentry leak in update_sched_domain_debugfs drm/amd/display: fix memory leak when using debugfs_lookup() nvmet: fix a use-after-free drm/i915: Implement WaEdpLinkRateDataReload scsi: mpt3sas: Fix use-after-free warning scsi: lpfc: Add missing destroy_workqueue() in error path NFS: Further optimisations for 'ls -l' NFS: Save some space in the inode NFS: Fix another fsync() issue after a server reboot cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock ASoC: qcom: sm8250: add missing module owner RDMA/rtrs-clt: Use the right sg_cnt after ib_dma_map_sg RDMA/rtrs-srv: Pass the correct number of entries for dma mapped SGL ARM: dts: imx6qdl-kontron-samx6i: remove duplicated node soc: imx: gpcv2: Assert reset before ungating clock regulator: core: Clean up on enable failure tee: fix compiler warning in tee_shm_register() RDMA/cma: Fix arguments order in net device validation soc: brcmstb: pm-arm: Fix refcount leak and __iomem leak bugs RDMA/hns: Fix supported page size RDMA/hns: Fix wrong fixed value of qp->rq.wqe_shift wifi: wilc1000: fix DMA on stack objects ARM: at91: pm: fix self-refresh for sama7g5 ARM: at91: pm: fix DDR recalibration when resuming from backup and self-refresh ARM: dts: at91: sama5d27_wlsom1: specify proper regulator output ranges ARM: dts: at91: sama5d2_icp: specify proper regulator output ranges ARM: dts: at91: sama5d27_wlsom1: don't keep ldo2 enabled all the time ARM: dts: at91: sama5d2_icp: don't keep vdd_other enabled all the time netfilter: br_netfilter: Drop dst references before setting. 
netfilter: nf_tables: clean up hook list when offload flags check fails netfilter: nf_conntrack_irc: Fix forged IP logic RDMA/srp: Set scmnd->result only when scmnd is not NULL ALSA: usb-audio: Inform the delayed registration more properly ALSA: usb-audio: Register card again for iface over delayed_register option rxrpc: Fix ICMP/ICMP6 error handling rxrpc: Fix an insufficiently large sglist in rxkad_verify_packet_2() afs: Use the operation issue time instead of the reply time for callbacks Revert "net: phy: meson-gxl: improve link-up behavior" sch_sfb: Don't assume the skb is still around after enqueueing to child tipc: fix shift wrapping bug in map_get() net: introduce __skb_fill_page_desc_noacc tcp: TX zerocopy should not sense pfmemalloc status ice: use bitmap_free instead of devm_kfree i40e: Fix kernel crash during module removal iavf: Detach device during reset task xen-netback: only remove 'hotplug-status' when the vif is actually destroyed RDMA/siw: Pass a pointer to virt_to_page() ipv6: sr: fix out-of-bounds read when setting HMAC data. 
IB/core: Fix a nested dead lock as part of ODP flow RDMA/mlx5: Set local port to one when accessing counters erofs: fix pcluster use-after-free on UP platforms nvme-tcp: fix UAF when detecting digest errors nvme-tcp: fix regression that causes sporadic requests to time out tcp: fix early ETIMEDOUT after spurious non-SACK RTO nvmet: fix mar and mor off-by-one errors RDMA/irdma: Report the correct max cqes from query device RDMA/irdma: Return correct WC error for bind operation failure RDMA/irdma: Report RNR NAK generation in device caps sch_sfb: Also store skb len before calling child enqueue perf script: Fix Cannot print 'iregs' field for hybrid systems hwmon: (tps23861) fix byte order in resistance register ASoC: mchp-spdiftx: remove references to mchp_i2s_caps ASoC: mchp-spdiftx: Fix clang -Wbitfield-constant-conversion MIPS: loongson32: ls1c: Fix hang during startup kbuild: disable header exports for UML in a straightforward way i40e: Refactor tc mqprio checks i40e: Fix ADQ rate limiting for PF swiotlb: avoid potential left shift overflow iommu/amd: use full 64-bit value in build_completion_wait() s390/boot: fix absolute zero lowcore corruption on boot hwmon: (mr75203) fix VM sensor allocation when "intel,vm-map" not defined hwmon: (mr75203) update pvt->v_num and vm_num to the actual number of used sensors hwmon: (mr75203) fix voltage equation for negative source input hwmon: (mr75203) fix multi-channel voltage reading hwmon: (mr75203) enable polling for all VM channels Revert "arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"" arm64/bti: Disable in kernel BTI when cross section thunks are broken iommu/vt-d: Correctly calculate sagaw value of IOMMU arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly drm/bridge: display-connector: implement bus fmts callbacks perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename) ARM: at91: ddr: remove CONFIG_SOC_SAMA7 dependency Linux 5.15.68 Signed-off-by: 
Greg Kroah-Hartman <gregkh@google.com> Change-Id: I3e23c18230fda5af55fc5b73db9ac288835c8c23 |
||
|
|
819110054b |
IB/core: Fix a nested dead lock as part of ODP flow
[ Upstream commit 85eaeb5058f0f04dffb124c97c86b4f18db0b833 ]
Fix a nested deadlock in the ODP flow by using mmput_async().
From the call trace below [1], we can see that calling mmput() while
holding umem_odp->umem_mutex, as required by
ib_umem_odp_map_dma_and_lock(), might trigger
exit_mmap()->__mmu_notifier_release()->mlx5_ib_invalidate_range() in the
same task, which may deadlock when trying to lock the same mutex.
Switching to mmput_async() solves the problem because the above
exit_mmap() flow is then called from another task and is executed once
the lock becomes available.
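The difference between the two release paths can be modeled with a toy state machine. This is only an illustration of the reasoning above, with hypothetical names throughout (`struct ctx`, `put_sync()`, `put_async()`, `run_deferred()`); it does not use or imitate the kernel's mmput() or mmput_async() implementations.

```c
#include <stdbool.h>

struct ctx {
	bool mutex_held;       /* models the mutex held by the caller */
	int users;             /* models the remaining references */
	bool teardown_queued;  /* models work handed to another task */
	bool torn_down;
};

/*
 * The teardown path needs the mutex. Returning false marks the point
 * where a real task would block forever on a mutex it already holds.
 */
static bool teardown(struct ctx *c)
{
	if (c->mutex_held)
		return false;
	c->torn_down = true;
	return true;
}

/* Synchronous put: the final reference runs the teardown in place. */
static bool put_sync(struct ctx *c)
{
	return --c->users > 0 || teardown(c);
}

/* Asynchronous put: the final reference only queues the teardown. */
static void put_async(struct ctx *c)
{
	if (--c->users == 0)
		c->teardown_queued = true;
}

/* Runs later in another context, after the caller dropped the mutex. */
static void run_deferred(struct ctx *c)
{
	if (c->teardown_queued && teardown(c))
		c->teardown_queued = false;
}
```

With the mutex held and one reference left, `put_sync()` reports the self-deadlock, while `put_async()` returns immediately and the teardown completes once `run_deferred()` runs with the mutex free.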
[1]
[64843.077665] task:kworker/u133:2 state:D stack: 0 pid:80906 ppid: 2 flags:0x00004000
[64843.077672] Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib]
[64843.077719] Call Trace:
[64843.077722] <TASK>
[64843.077724] __schedule+0x23d/0x590
[64843.077729] schedule+0x4e/0xb0
[64843.077735] schedule_preempt_disabled+0xe/0x10
[64843.077740] __mutex_lock.constprop.0+0x263/0x490
[64843.077747] __mutex_lock_slowpath+0x13/0x20
[64843.077752] mutex_lock+0x34/0x40
[64843.077758] mlx5_ib_invalidate_range+0x48/0x270 [mlx5_ib]
[64843.077808] __mmu_notifier_release+0x1a4/0x200
[64843.077816] exit_mmap+0x1bc/0x200
[64843.077822] ? walk_page_range+0x9c/0x120
[64843.077828] ? __cond_resched+0x1a/0x50
[64843.077833] ? mutex_lock+0x13/0x40
[64843.077839] ? uprobe_clear_state+0xac/0x120
[64843.077860] mmput+0x5f/0x140
[64843.077867] ib_umem_odp_map_dma_and_lock+0x21b/0x580 [ib_core]
[64843.077931] pagefault_real_mr+0x9a/0x140 [mlx5_ib]
[64843.077962] pagefault_mr+0xb4/0x550 [mlx5_ib]
[64843.077992] pagefault_single_data_segment.constprop.0+0x2ac/0x560 [mlx5_ib]
[64843.078022] mlx5_ib_eqe_pf_action+0x528/0x780 [mlx5_ib]
[64843.078051] process_one_work+0x22b/0x3d0
[64843.078059] worker_thread+0x53/0x410
[64843.078065] ? process_one_work+0x3d0/0x3d0
[64843.078073] kthread+0x12a/0x150
[64843.078079] ? set_kthread_struct+0x50/0x50
[64843.078085] ret_from_fork+0x22/0x30
[64843.078093] </TASK>
Fixes:
|
||
|
|
4daa3c254e |
ANDROID: add vma->file_ref_count to synchronize vma->vm_file destruction
In order to prevent destruction of vma->vm_file while it's being used
during speculative page fault handling, introduce an atomic refcounter.
Bug: 234527424
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0e971156f3e76feb45136bac1582a7eaab8c75df |
||
|
|
0864756fb0 |
Revert "ANDROID: Use the notifier lock to perform file-backed vma teardown"
This reverts commit
|
||
|
|
bafafe0ec4 |
ANDROID: vendor_hooks: Add hooks to dup_task_struct
Add hook to dup_task_struct for vendor data fields initialisation.
Bug: 188004638
Change-Id: I4b58604ee822fb8d1e0cc37bec72e820e7318427
Signed-off-by: Liangliang Li <liliangliang@vivo.com>
(cherry picked from commit
|
||
|
|
38c4790b88 |
ANDROID: Fix the CONFIG_ANDROID_VENDOR_OEM_DATA=n build
Scripts like
https://github.com/bvanassche/build-scsi-drivers/blob/main/build-scsi-drivers
do not set CONFIG_ANDROID_VENDOR_OEM_DATA. Hence this patch that
unbreaks the CONFIG_ANDROID_VENDOR_OEM_DATA=n build.
Fixes:
|
||
|
|
34c53e1824 |
ANDROID: init_task: Init android vendor and oem data
Without initialization, the Android vendor and OEM data hold random
values, which vendor hooks cannot reliably interpret.
Bug: 207739506
Change-Id: I278772d87eea38c03a40d4f0bef20ac8644e2ecd
Signed-off-by: Maria Yu <quic_aiquny@quicinc.com>
(cherry picked from commit
|
||
|
|
6072c99f21 |
ANDROID: Use the notifier lock to perform file-backed vma teardown
When a file-backed vma is being released, userspace may expect the vma
and the file it pins to be released synchronously. This does not happen
when SPF is enabled, because the vma and its associated file are released
asynchronously after an RCU grace period. This is done to prevent the
pagefault handler from stepping on a deleted object. Fix this issue by
synchronizing the file-backed pagefault handler with the vma tear-down
using the notifier lock.
Fixes:
|
||
|
|
23ef56f65c |
Revert "ANDROID: Make file-backed vma teardown synchronous"
This reverts commit
|
||
|
|
fe25fc5375 |
ANDROID: Make file-backed vma teardown synchronous
When a file-backed vma is being released, userspace may expect the vma
and the file it pins to be released synchronously. This does not happen
when SPF is enabled, because the vma and its associated file are released
asynchronously after an RCU grace period. This is done to prevent the
pagefault handler from stepping on a deleted object. Fix this issue by
synchronously waiting for an RCU grace period during file-backed vma
tear-down.
Fixes:
|
||
|
|
5280d76d38 |
FROMLIST: mm: multi-gen LRU: support page table walks
To further exploit spatial locality, the aging prefers to walk page
tables to search for young PTEs and promote hot pages. A kill switch
will be added in the next patch to disable this behavior. When
disabled, the aging relies on the rmap only.
NB: this behavior has nothing in common with the page table scanning in
the 2.4 kernel [1], which searches page tables for old PTEs, adds cold
pages to the swapcache and unmaps them.
To avoid confusion, the term "iteration" specifically means the
traversal of an entire mm_struct list; the term "walk" will be applied
to page tables and the rmap, as usual.
An mm_struct list is maintained for each memcg, and an mm_struct
follows its owner task to the new memcg when this task is migrated.
Given an lruvec, the aging iterates lruvec_memcg()->mm_list and calls
walk_page_range() with each mm_struct on this list to promote hot
pages before it increments max_seq.
When multiple page table walkers iterate the same list, each of them
gets a unique mm_struct; therefore they can run concurrently. Page
table walkers ignore any misplaced pages, e.g., if an mm_struct was
migrated, pages it left in the previous memcg will not be promoted
when its current memcg is under reclaim. Similarly, page table walkers
will not promote pages from nodes other than the one under reclaim.
This patch uses the following optimizations when walking page tables:
1. It tracks the usage of mm_struct's between context switches so that
page table walkers can skip processes that have been sleeping since
the last iteration.
2. It uses generational Bloom filters to record populated branches so
that page table walkers can reduce their search space based on the
query results, e.g., to skip page tables containing mostly holes or
misplaced pages.
3. It takes advantage of the accessed bit in non-leaf PMD entries when
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y.
4. It does not zigzag between a PGD table and the same PMD table
spanning multiple VMAs. IOW, it finishes all the VMAs within the
range of the same PMD table before it returns to a PGD table. This
improves the cache performance for workloads that have large
numbers of tiny VMAs [2], especially when CONFIG_PGTABLE_LEVELS=5.
Server benchmark results:
Single workload:
fio (buffered I/O): no change
Single workload:
memcached (anon): +[5.5, 7.5]%
            Ops/sec      KB/sec
patch1-7:   1014393.57   39455.42
patch1-8:   1078507.59   41949.15
Configurations:
no change
Client benchmark results:
kswapd profiles:
patch1-7
45.54% lzo1x_1_do_compress (real work)
9.56% page_vma_mapped_walk
6.70% _raw_spin_unlock_irq
2.78% ptep_clear_flush
2.47% do_raw_spin_lock
2.22% __zram_bvec_write
1.87% lru_gen_look_around
1.78% memmove
1.77% obj_malloc
1.44% free_unref_page_list
patch1-8
47.02% lzo1x_1_do_compress (real work)
6.73% page_vma_mapped_walk
6.14% _raw_spin_unlock_irq
3.39% walk_pte_range
2.63% ptep_clear_flush
2.29% __zram_bvec_write
2.10% do_raw_spin_lock
1.81% memmove
1.73% obj_malloc
1.53% free_unref_page_list
Configurations:
no change
[1] https://lwn.net/Articles/23732/
[2] https://source.android.com/devices/tech/debug/scudo
Link: https://lore.kernel.org/lkml/20220309021230.721028-9-yuzhao@google.com/
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Bug: 227651406
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: I5a3c97cf8ebf8d65d5f9528cd979a637c190053e
|
||
|
|
0fd37220d8 |
UPSTREAM: mm: refactor vm_area_struct::anon_vma_name usage code
Avoid mixing strings and their anon_vma_name referenced pointers by using
struct anon_vma_name whenever possible. This simplifies the code and
allows easier sharing of anon_vma_name structures when they represent the
same name.
[surenb@google.com: fix comment]
Link: https://lkml.kernel.org/r/20220223153613.835563-1-surenb@google.com
Link: https://lkml.kernel.org/r/20220224231834.1481408-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Colin Cross <ccross@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 5c26f6ac9416b63d093e29c30e79b3297e425472)
Bug: 218352794
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I4a6b5602ce7151d1a4b88fac489f86d68089bd4d |
||
|
|
48e35d053f |
FROMLIST: mm: rcu safe vma->vm_file freeing
Defer freeing of vma->vm_file when freeing vmas. This is to allow
speculative page faults in the mapped file case.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20210407014502.24091-24-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic766bc2086db82eae9f3aadf0f23dd743be1c464 |
||
|
|
12230588f3 |
FROMLIST: mm: disable rcu safe vma freeing for single threaded user space
Performance tuning: as single threaded userspace does not use speculative
page faults, it does not require rcu safe vma freeing. Turn this off to
avoid the related (small) extra overheads.
For multi threaded userspace, we often see a performance benefit from the
rcu safe vma freeing, even in tests that do not have any frequent
concurrent page faults! This is because rcu safe vma freeing prevents
recently released vmas from being immediately reused in a new thread.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-30-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I81ef7ab43e2757f268c567d5bfe6ab02f1e43a1c |
||
|
|
1ae855f191 |
FROMLIST: mm: add mmu_notifier_lock
Introduce mmu_notifier_lock as a per-mm percpu_rw_semaphore, as well as
the code to initialize and destroy it together with the mm.
This lock will be used to prevent races between mmu_notifier_register()
and speculative fault handlers that need to fire MMU notifications without
holding any of the mmap or rmap locks.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-24-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I453ebe979c8b9dcc6159b41c5ec7a1ea17d85ee2 |
||
|
|
67cc8ce9a6 |
FROMLIST: mm: rcu safe vma freeing
This prepares for speculative page faults looking up and copying vmas
under protection of an rcu read lock, instead of the usual mmap read lock.
Note: it might also be feasible to just use SLAB_TYPESAFE_BY_RCU when
creating the vm_area_cachep, but that's probably too subtle to consider
here.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
Link: https://lore.kernel.org/all/20220128131006.67712-12-michel@lespinasse.org/
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I992fddb7c32c61bb4ab10b387f91c4e54c2250ef |
||
|
|
3411613611 |
sched: Fix yet more sched_fork() races
commit b1e8206582f9d680cff7d04828708c8b6ab32957 upstream.
Where commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an
invalid sched_task_group") fixed a fork race vs cgroup, it opened up a
race vs syscalls by not placing the task on the runqueue before it
gets exposed through the pidhash.
Commit 13765de8148f ("sched/fair: Fix fault in reweight_entity") is
trying to fix a single instance of this; instead, fix the whole class
of issues, effectively reverting this commit.
Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
42da9cb956 |
UPSTREAM: sched: Fix yet more sched_fork() races
Where commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an
invalid sched_task_group") fixed a fork race vs cgroup, it opened up a
race vs syscalls by not placing the task on the runqueue before it
gets exposed through the pidhash.
Commit 13765de8148f ("sched/fair: Fix fault in reweight_entity") is
trying to fix a single instance of this; instead, fix the whole class
of issues, effectively reverting this commit.
Change-Id: I4d34311eac28b23ee32e9308a21c66afe8fa8a3b
Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net
BUG: 221850698
(cherry picked from commit b1e8206582f9d680cff7d04828708c8b6ab32957)
Signed-off-by: Ashay Jaiswal <quic_ashayj@quicinc.com>
|
||
|
|
2ded03fd7c |
Merge 5.15.25 into android13-5.15
Changes in 5.15.25
drm/nouveau/pmu/gm200-: use alternate falcon reset sequence
fs/proc: task_mmu.c: don't read mapcount for migration entry
btrfs: zoned: cache reported zone during mount
scsi: lpfc: Fix mailbox command failure during driver initialization
HID:Add support for UGTABLET WP5540
Revert "svm: Add warning message for AVIC IPI invalid target"
parisc: Show error if wrong 32/64-bit compiler is being used
serial: parisc: GSC: fix build when IOSAPIC is not set
parisc: Drop __init from map_pages declaration
parisc: Fix data TLB miss in sba_unmap_sg
parisc: Fix sglist access in ccio-dma.c
mmc: block: fix read single on recovery logic
mm: don't try to NUMA-migrate COW pages that have other uses
HID: amd_sfh: Add illuminance mask to limit ALS max value
HID: i2c-hid: goodix: Fix a lockdep splat
HID: amd_sfh: Increase sensor command timeout
HID: amd_sfh: Correct the structure field name
PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology
parisc: Add ioread64_lo_hi() and iowrite64_lo_hi()
btrfs: send: in case of IO error log it
platform/x86: touchscreen_dmi: Add info for the RWC NANOTE P8 AY07J 2-in-1
platform/x86: ISST: Fix possible circular locking dependency detected
kunit: tool: Import missing importlib.abc
selftests: rtc: Increase test timeout so that all tests run
kselftest: signal all child processes
net: ieee802154: at86rf230: Stop leaking skb's
selftests/zram: Skip max_comp_streams interface on newer kernel
selftests/zram01.sh: Fix compression ratio calculation
selftests/zram: Adapt the situation that /dev/zram0 is being used
selftests: openat2: Print also errno in failure messages
selftests: openat2: Add missing dependency in Makefile
selftests: openat2: Skip testcases that fail with EOPNOTSUPP
selftests: skip mincore.check_file_mmap when fs lacks needed support
ax25: improve the incomplete fix to avoid UAF and NPD bugs
pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP
vfs: make freeze_super abort when sync_filesystem returns error
quota: make dquot_quota_sync return errors from ->sync_fs
scsi: pm80xx: Fix double completion for SATA devices
kselftest: Fix vdso_test_abi return status
scsi: core: Reallocate device's budget map on queue depth change
scsi: pm8001: Fix use-after-free for aborted TMF sas_task
scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task
drm/amd: Warn users about potential s0ix problems
nvme: fix a possible use-after-free in controller reset during load
nvme-tcp: fix possible use-after-free in transport error_recovery work
nvme-rdma: fix possible use-after-free in transport error_recovery work
net: sparx5: do not refer to skb after passing it on
drm/amd: add support to check whether the system is set to s3
drm/amd: Only run s3 or s0ix if system is configured properly
drm/amdgpu: fix logic inversion in check
x86/Xen: streamline (and fix) PV CPU enumeration
Revert "module, async: async_synchronize_full() on module init iff async is used"
gcc-plugins/stackleak: Use noinstr in favor of notrace
random: wake up /dev/random writers after zap
KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU
KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM
KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case
KVM: x86: nSVM: fix potential NULL derefernce on nested migration
KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state
iwlwifi: fix use-after-free
drm/radeon: Fix backlight control on iMac 12,1
drm/atomic: Don't pollute crtc_state->mode_blob with error pointers
drm/amd/pm: correct the sequence of sending gpu reset msg
drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.
drm/i915/opregion: check port number bounds for SWSCI display power state
drm/i915: Fix dbuf slice config lookup
drm/i915: Fix mbus join config lookup
vsock: remove vsock from connected table when connect is interrupted by a signal
drm/cma-helper: Set VM_DONTEXPAND for mmap
drm/i915/gvt: Make DRM_I915_GVT depend on X86
drm/i915/ttm: tweak priority hint selection
iwlwifi: pcie: fix locking when "HW not ready"
iwlwifi: pcie: gen2: fix locking when "HW not ready"
iwlwifi: mvm: don't send SAR GEO command for 3160 devices
selftests: netfilter: fix exit value for nft_concat_range
netfilter: nft_synproxy: unregister hooks on init error path
selftests: netfilter: disable rp_filter on router
ipv4: fix data races in fib_alias_hw_flags_set
ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
ipv6: per-netns exclusive flowlabel checks
Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
mac80211: mlme: check for null after calling kmemdup
brcmfmac: firmware: Fix crash in brcm_alt_fw_path
cfg80211: fix race in netlink owner interface destruction
net: dsa: lan9303: fix reset on probe
net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
net: dsa: lantiq_gswip: fix use after free in gswip_remove()
net: dsa: lan9303: handle hwaccel VLAN tags
net: dsa: lan9303: add VLAN IDs to master device
net: ieee802154: ca8210: Fix lifs/sifs periods
ping: fix the dif and sdif check in ping_lookup
bonding: force carrier update when releasing slave
drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit
net_sched: add __rcu annotation to netdev->qdisc
bonding: fix data-races around agg_select_timer
libsubcmd: Fix use-after-free for realloc(..., 0)
net/smc: Avoid overwriting the copies of clcsock callback functions
net: phy: mediatek: remove PHY mode check on MT7531
atl1c: fix tx timeout after link flap on Mikrotik 10/25G NIC
tipc: fix wrong publisher node address in link publications
dpaa2-switch: fix default return of dpaa2_switch_flower_parse_mirror_key
dpaa2-eth: Initialize mutex used in one step timestamping path
net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
perf bpf: Defer freeing string after possible strlen() on it
selftests/exec: Add non-regular to TEST_GEN_PROGS
arm64: Correct wrong label in macro __init_el2_gicv3
ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra
ALSA: hda/realtek: Add quirk for Legion Y9000X 2019
ALSA: hda/realtek: Fix deadlock by COEF mutex
ALSA: hda: Fix regression on forced probe mask option
ALSA: hda: Fix missing codec probe on Shenker Dock 15
ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw()
ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_range()
ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_sx()
ASoC: ops: Fix stereo change notifications in snd_soc_put_xr_sx()
cifs: fix set of group SID via NTSD xattrs
powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE
powerpc/lib/sstep: fix 'ptesync' build error
mtd: rawnand: gpmi: don't leak PM reference in error path
smb3: fix snapshot mount option
tipc: fix wrong notification node addresses
scsi: ufs: Remove dead code
scsi: ufs: Fix a deadlock in the error handler
ASoC: tas2770: Insert post reset delay
ASoC: qcom: Actually clear DMA interrupt register for HDMI
block/wbt: fix negative inflight counter when remove scsi device
NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked()
NFS: LOOKUP_DIRECTORY is also ok with symlinks
NFS: Do not report writeback errors in nfs_getattr()
tty: n_tty: do not look ahead for EOL character past the end of the buffer
block: fix surprise removal for drivers calling blk_set_queue_dying
mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe()
mtd: parsers: qcom: Fix kernel panic on skipped partition
mtd: parsers: qcom: Fix missing free for pparts in cleanup
mtd: phram: Prevent divide by zero bug in phram_setup()
mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status
HID: elo: fix memory leak in elo_probe
mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get
Drivers: hv: vmbus: Fix memory leak in vmbus_add_channel_kobj
KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id()
KVM: x86/pmu: Don't truncate the PerfEvtSeln MSR when creating a perf event
KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW
ARM: OMAP2+: hwmod: Add of_node_put() before break
ARM: OMAP2+: adjust the location of put_device() call in omapdss_init_of
phy: usb: Leave some clocks running during suspend
staging: vc04_services: Fix RCU dereference check
phy: phy-mtk-tphy: Fix duplicated argument in phy-mtk-tphy
irqchip/sifive-plic: Add missing thead,c900-plic match string
x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm
netfilter: conntrack: don't refresh sctp entries in closed state
ksmbd: fix same UniqueId for dot and dotdot entries
ksmbd: don't align last entry offset in smb2 query directory
arm64: dts: meson-gx: add ATF BL32 reserved-memory region
arm64: dts: meson-g12: add ATF BL32 reserved-memory region
arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610
pidfd: fix test failure due to stack overflow on some arches
selftests: fixup build warnings in pidfd / clone3 tests
mm: io_uring: allow oom-killer from io_uring_setup
ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
kconfig: let 'shell' return enough output for deep path names
ata: libata-core: Disable TRIM on M88V29
soc: aspeed: lpc-ctrl: Block error printing on probe defer cases
xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create
drm/rockchip: dw_hdmi: Do not leave clock enabled in error case
tracing: Fix tp_printk option related with tp_printk_stop_on_boot
display/amd: decrease message verbosity about watermarks table failure
drm/amd/display: Cap pflip irqs per max otg number
drm/amd/display: fix yellow carp wm clamping
net: usb: qmi_wwan: Add support for Dell DW5829e
net: macb: Align the dma and coherent dma masks
kconfig: fix failing to generate auto.conf
scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop
EDAC: Fix calculation of returned address and next offset in edac_align_ptr()
ucounts: Handle wrapping in is_ucounts_overlimit
ucounts: In set_cred_ucounts assume new->ucounts is non-NULL
ucounts: Base set_cred_ucounts changes on the real user
ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
lib/iov_iter: initialize "flags" in new pipe_buffer
rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user
ucounts: Move RLIMIT_NPROC handling after set_user
net: sched: limit TC_ACT_REPEAT loops
dmaengine: sh: rcar-dmac: Check for error num after setting mask
dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe
dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size
tests: fix idmapped mount_setattr test
i2c: qcom-cci: don't delete an unregistered adapter
i2c: qcom-cci: don't put a device tree node before i2c_add_adapter()
dmaengine: ptdma: Fix the error handling path in pt_core_init()
copy_process(): Move fd_install() out of sighand->siglock critical section
scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp()
ice: enable parsing IPSEC SPI headers for RSS
i2c: brcmstb: fix support for DSL and CM variants
lockdep: Correct lock_classes index mapping
Linux 5.15.25
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib129a0e11f5e82d67563329a5de1b0aef1d87928 |
||
|
|
795feafb72 |
copy_process(): Move fd_install() out of sighand->siglock critical section
commit ddc204b517e60ae64db34f9832dc41dafa77c751 upstream.
I was made aware of the following lockdep splat:
[ 2516.308763] =====================================================
[ 2516.309085] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
[ 2516.309433] 5.14.0-51.el9.aarch64+debug #1 Not tainted
[ 2516.309703] -----------------------------------------------------
[ 2516.310149] stress-ng/153663 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 2516.310512] ffff0000e422b198 (&newf->file_lock){+.+.}-{2:2}, at: fd_install+0x368/0x4f0
[ 2516.310944] and this task is already holding:
[ 2516.311248] ffff0000c08140d8 (&sighand->siglock){-.-.}-{2:2}, at: copy_process+0x1e2c/0x3e80
[ 2516.311804] which would create a new lock dependency:
[ 2516.312066] (&sighand->siglock){-.-.}-{2:2} -> (&newf->file_lock){+.+.}-{2:2}
[ 2516.312446] but this new dependency connects a HARDIRQ-irq-safe lock:
[ 2516.312983] (&sighand->siglock){-.-.}-{2:2} :
[ 2516.330700] Possible interrupt unsafe locking scenario:
[ 2516.331075]        CPU0                    CPU1
[ 2516.331328]        ----                    ----
[ 2516.331580]   lock(&newf->file_lock);
[ 2516.331790]                                local_irq_disable();
[ 2516.332231]                                lock(&sighand->siglock);
[ 2516.332579]                                lock(&newf->file_lock);
[ 2516.332922]   <Interrupt>
[ 2516.333069]     lock(&sighand->siglock);
[ 2516.333291] *** DEADLOCK ***
[ 2516.389845] stack backtrace:
[ 2516.390101] CPU: 3 PID: 153663 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-51.el9.aarch64+debug #1
[ 2516.390756] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 2516.391155] Call trace:
[ 2516.391302] dump_backtrace+0x0/0x3e0
[ 2516.391518] show_stack+0x24/0x30
[ 2516.391717] dump_stack_lvl+0x9c/0xd8
[ 2516.391938] dump_stack+0x1c/0x38
[ 2516.392247] print_bad_irq_dependency+0x620/0x710
[ 2516.392525] check_irq_usage+0x4fc/0x86c
[ 2516.392756] check_prev_add+0x180/0x1d90
[ 2516.392988] validate_chain+0x8e0/0xee0
[ 2516.393215] __lock_acquire+0x97c/0x1e40
[ 2516.393449] lock_acquire.part.0+0x240/0x570
[ 2516.393814] lock_acquire+0x90/0xb4
[ 2516.394021] _raw_spin_lock+0xe8/0x154
[ 2516.394244] fd_install+0x368/0x4f0
[ 2516.394451] copy_process+0x1f5c/0x3e80
[ 2516.394678] kernel_clone+0x134/0x660
[ 2516.394895] __do_sys_clone3+0x130/0x1f4
[ 2516.395128] __arm64_sys_clone3+0x5c/0x7c
[ 2516.395478] invoke_syscall.constprop.0+0x78/0x1f0
[ 2516.395762] el0_svc_common.constprop.0+0x22c/0x2c4
[ 2516.396050] do_el0_svc+0xb0/0x10c
[ 2516.396252] el0_svc+0x24/0x34
[ 2516.396436] el0t_64_sync_handler+0xa4/0x12c
[ 2516.396688] el0t_64_sync+0x198/0x19c
[ 2517.491197] NET: Registered PF_ATMPVC protocol family
[ 2517.491524] NET: Registered PF_ATMSVC protocol family
[ 2591.991877] sched: RT throttling activated
One way to solve this problem is to move the fd_install() call out of
the sighand->siglock critical section.
Before commit |
||
|
|
2d2d92cfcd |
ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
commit 8f2f9c4d82f24f172ae439e5035fc1e0e4c229dd upstream. Michal Koutný <mkoutny@suse.com> wrote: > It was reported that v5.14 behaves differently when enforcing > RLIMIT_NPROC limit, namely, it allows one more task than previously. > This is consequence of the commit |
||
|
|
cdb4c18935 |
FROMLIST: kasan, fork: reset pointer tags of vmapped stacks
[Combines a FROMGIT patch and a FROMLIST fix for it.]
Once tag-based KASAN modes start tagging vmalloc() allocations, kernel stacks start getting tagged if CONFIG_VMAP_STACK is enabled.
Reset the tag of kernel stack pointers after allocation in alloc_thread_stack_node().
For SW_TAGS KASAN, when CONFIG_KASAN_STACK is enabled, the instrumentation can't handle the SP register being tagged.
For HW_TAGS KASAN, there's no instrumentation-related issues. However, the impact of having a tagged SP register needs to be properly evaluated, so keep it non-tagged for now.
Note, that the memory for the stack allocation still gets tagged to catch vmalloc-into-stack out-of-bounds accesses.
Link: https://lkml.kernel.org/r/c6c96f012371ecd80e1936509ebcd3b07a5956f7.1643047180.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit 9d2dae85d689202c56068ce62e20821ad91c3606 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Link: https://lore.kernel.org/linux-mm/f50c5f96ef896d7936192c888b0c0a7674e33184.1644943792.git.andreyknvl@google.com/
Bug: 217222520
Change-Id: Ie723b03f1b857bc841cffc9a424b2791c97044a6
Signed-off-by: Andrey Konovalov <andreyknvl@google.com> |
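The tag reset described in commit cdb4c18935 above can be modelled in user space. This is only a sketch under stated assumptions, not the kernel's code: it assumes an arm64-style layout in which the tag occupies the top byte of a 64-bit pointer and bit 55 distinguishes kernel from user addresses, and the helper name `reset_pointer_tag` is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of "resetting" a pointer tag.
 * Assumption: the tag lives in bits 63..56 and bit 55 tells kernel
 * addresses (set) from user addresses (clear), as on arm64.
 * The top byte is overwritten with copies of bit 55, which yields
 * the untagged form of the address.
 */
uint64_t reset_pointer_tag(uint64_t addr)
{
    const uint64_t tag_mask = 0xffULL << 56;

    if (addr & (1ULL << 55))
        return addr | tag_mask;   /* kernel half: top byte becomes 0xff */
    return addr & ~tag_mask;      /* user half: top byte becomes 0x00 */
}
```

In this model only the pointer value changes; the memory it refers to is untouched, which matches the commit's note that the stack memory itself still gets tagged.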
||
|
|
049413278d |
UPSTREAM: mm: move anon_vma declarations to linux/mm_inline.h
The patch to add anonymous vma names causes a build failure in some
configurations:
include/linux/mm_types.h: In function 'is_same_vma_anon_name':
include/linux/mm_types.h:924:37: error: implicit declaration of function 'strcmp' [-Werror=implicit-function-declaration]
924 | return name && vma_name && !strcmp(name, vma_name);
| ^~~~~~
include/linux/mm_types.h:22:1: note: 'strcmp' is defined in header '<string.h>'; did you forget to '#include <string.h>'?
This should not really be part of linux/mm_types.h in the first place,
as that header is meant to only contain structure definitions and needs a
minimum set of indirect includes itself.
While the header clearly includes more than it should at this point,
let's not make it worse by including string.h as well, which would pull
in the expensive (compile-speed wise) fortify-string logic.
Move the new functions into a separate header that only needs to be
included in a couple of locations.
Link: https://lkml.kernel.org/r/20211207125710.2503446-1-arnd@kernel.org
Fixes: "mm: add a field to store names for private anonymous memory"
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 17fca131cee21724ee953a17c185c14e9533af5b)
Bug: 120441514
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I54719d7ea27d3cf53ef7245b2af88d2a2bc9bafe
|
||
|
|
301c56064d |
UPSTREAM: mm: add a field to store names for private anonymous memory
In many userspace applications, and especially in VM based applications like Android uses heavily, there are multiple different allocators in use. At a minimum there is libc malloc and the stack, and in many cases there are libc malloc, the stack, direct syscalls to mmap anonymous memory, and multiple VM heaps (one for small objects, one for big objects, etc.). Each of these layers usually has its own tools to inspect its usage; malloc by compiling a debug version, the VM through heap inspection tools, and for direct syscalls there is usually no way to track them.
On Android we heavily use a set of tools that use an extended version of the logic covered in Documentation/vm/pagemap.txt to walk all pages mapped in userspace and slice their usage by process, shared (COW) vs. unique mappings, backing, etc. This can account for real physical memory usage even in cases like fork without exec (which Android uses heavily to share as many private COW pages as possible between processes), Kernel SamePage Merging, and clean zero pages. It produces a measurement of the pages that only exist in that process (USS, for unique), and a measurement of the physical memory usage of that process with the cost of shared pages being evenly split between processes that share them (PSS).
If all anonymous memory is indistinguishable then figuring out the real physical memory usage (PSS) of each heap requires either a pagemap walking tool that can understand the heap debugging of every layer, or for every layer's heap debugging tools to implement the pagemap walking logic, in which case it is hard to get a consistent view of memory across the whole system.
Tracking the information in userspace leads to all sorts of problems. It either needs to be stored inside the process, which means every process has to have an API to export its current heap information upon request, or it has to be stored externally in a filesystem that somebody needs to clean up on crashes. It needs to be readable while the process is still running, so it has to have some sort of synchronization with every layer of userspace. Efficiently tracking the ranges requires reimplementing something like the kernel vma trees, and linking to it from every layer of userspace. It requires more memory, more syscalls, more runtime cost, and more complexity to separately track regions that the kernel is already tracking.
This patch adds a field to /proc/pid/maps and /proc/pid/smaps to show a userspace-provided name for anonymous vmas. The names of named anonymous vmas are shown in /proc/pid/maps and /proc/pid/smaps as [anon:<name>].
Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name)
Setting the name to NULL clears it. The name length limit is 80 bytes including NUL-terminator and is checked to contain only printable ascii characters (including space), except '[',']','\','$' and '`'.
Ascii strings are being used to have a descriptive identifiers for vmas, which can be understood by the users reading /proc/pid/maps or /proc/pid/smaps. Names can be standardized for a given system and they can include some variable parts such as the name of the allocator or a library, tid of the thread using it, etc.
The name is stored in a pointer in the shared union in vm_area_struct that points to a null terminated string. Anonymous vmas with the same name (equivalent strings) and are otherwise mergeable will be merged. The name pointers are not shared between vmas even if they contain the same name. The name pointer is stored in a union with fields that are only used on file-backed mappings, so it does not increase memory usage.
CONFIG_ANON_VMA_NAME kernel configuration is introduced to enable this feature. It keeps the feature disabled by default to prevent any additional memory overhead and to avoid confusing procfs parsers on systems which are not ready to support named anonymous vmas.
The patch is based on the original patch developed by Colin Cross, more specifically on its latest version [1] posted upstream by Sumit Semwal. It used a userspace pointer to store vma names. In that design, name pointers could be shared between vmas. However during the last upstreaming attempt, Kees Cook raised concerns [2] about this approach and suggested to copy the name into kernel memory space, perform validity checks [3] and store as a string referenced from vm_area_struct.
One big concern is about fork() performance which would need to strdup anonymous vma names. Dave Hansen suggested experimenting with worst-case scenario of forking a process with 64k vmas having longest possible names [4]. I ran this experiment on an ARM64 Android device and recorded a worst-case regression of almost 40% when forking such a process.
This regression is addressed in the followup patch which replaces the pointer to a name with a refcounted structure that allows sharing the name pointer between vmas of the same name. Instead of duplicating the string during fork() or when splitting a vma it increments the refcount.
[1] https://lore.kernel.org/linux-mm/20200901161459.11772-4-sumit.semwal@linaro.org/
[2] https://lore.kernel.org/linux-mm/202009031031.D32EF57ED@keescook/
[3] https://lore.kernel.org/linux-mm/202009031022.3834F692@keescook/
[4] https://lore.kernel.org/linux-mm/5d0358ab-8c47-2f5f-8e43-23b89d6a8e95@intel.com/
Changes for prctl(2) manual page (in the options section):
PR_SET_VMA
Sets an attribute specified in arg2 for virtual memory areas starting from the address specified in arg3 and spanning the size specified in arg4. arg5 specifies the value of the attribute to be set. Note that assigning an attribute to a virtual memory area might prevent it from being merged with adjacent virtual memory areas due to the difference in that attribute's value.
Currently, arg2 must be one of:
PR_SET_VMA_ANON_NAME
Set a name for anonymous virtual memory areas. arg5 should be a pointer to a null-terminated string containing the name. The name length including null byte cannot exceed 80 bytes. If arg5 is NULL, the name of the appropriate anonymous virtual memory areas will be reset. The name can contain only printable ascii characters (including space), except '[',']','\','$' and '`'.
This feature is available only if the kernel is built with the CONFIG_ANON_VMA_NAME option enabled.
[surenb@google.com: docs: proc.rst: /proc/PID/maps: fix malformed table]
Link: https://lkml.kernel.org/r/20211123185928.2513763-1-surenb@google.com
[surenb: rebased over v5.15-rc6, replaced userpointer with a kernel copy, added input sanitization and CONFIG_ANON_VMA_NAME config. The bulk of the work here was done by Colin Cross, therefore, with his permission, keeping him as the author]
Link: https://lkml.kernel.org/r/20211019215511.3771969-2-surenb@google.com
Signed-off-by: Colin Cross <ccross@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Glauber <jan.glauber@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Landley <rob@landley.net>
Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Shaohua Li <shli@fusionio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 9a10064f5625d5572c3626c1516e0bebc6c9fe9b)
Bug: 120441514
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I53d56d551a7d62f75341304751814294b447c04e |
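The prctl() interface described in commit 301c56064d above could be exercised from user space roughly as follows. This is a minimal sketch, not code from the patch: the helper names `anon_name_is_valid` and `name_anon_region` are hypothetical, the validator only mirrors the rules quoted in the commit message (at most 80 bytes including the NUL terminator, printable ASCII, none of `[`, `]`, `\`, `$`, `` ` ``), and the fallback constant definitions are an assumption for older userspace headers that lack them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; assumed to match the kernel's uapi values. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

#define ANON_NAME_MAX 80 /* bytes, including the NUL terminator */

/* Hypothetical pre-check mirroring the rules stated in the commit message. */
bool anon_name_is_valid(const char *name)
{
    size_t len = strlen(name);

    if (len + 1 > ANON_NAME_MAX)
        return false;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c < 0x20 || c > 0x7e)      /* printable ASCII only */
            return false;
        if (strchr("[]\\$`", c))       /* explicitly excluded characters */
            return false;
    }
    return true;
}

/*
 * Hypothetical wrapper: names the anonymous mapping [addr, addr + len),
 * or clears the name when name is NULL. Returns 0 on success and -1 on
 * failure; the prctl() call is expected to fail on kernels built without
 * CONFIG_ANON_VMA_NAME.
 */
int name_anon_region(void *addr, size_t len, const char *name)
{
    if (name && !anon_name_is_valid(name))
        return -1;
    return prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                 (unsigned long)addr, (unsigned long)len,
                 (unsigned long)name);
}
```

A caller would typically mmap() an anonymous region, pass it to `name_anon_region()` with a name such as "heap:small objects", and then look for `[anon:heap:small objects]` in /proc/self/maps. Callers should treat failure as non-fatal, since the feature is disabled by default.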
||
|
|
36de88a855 |
Merge 5.15.3 into android13-5.15
Changes in 5.15.3
xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay
usb: xhci: Enable runtime-pm by default on AMD Yellow Carp platform
Input: iforce - fix control-message timeout
Input: elantench - fix misreporting trackpoint coordinates
Input: i8042 - Add quirk for Fujitsu Lifebook T725
libata: fix read log timeout value
ocfs2: fix data corruption on truncate
scsi: scsi_ioctl: Validate command size
scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run
scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
scsi: lpfc: Don't release final kref on Fport node while ABTS outstanding
scsi: lpfc: Fix FCP I/O flush functionality for TMF routines
scsi: qla2xxx: Fix crash in NVMe abort path
scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file
scsi: qla2xxx: Fix use after free in eh_abort path
ce/gf100: fix incorrect CE0 address calculation on some GPUs
char: xillybus: fix msg_ep UAF in xillyusb_probe()
mmc: mtk-sd: Add wait dma stop done flow
mmc: dw_mmc: Dont wait for DRTO on Write RSP error
exfat: fix incorrect loading of i_blocks for large files
io-wq: remove worker to owner tw dependency
parisc: Fix set_fixmap() on PA1.x CPUs
parisc: Fix ptrace check on syscall return
tpm: Check for integer overflow in tpm2_map_response_body()
firmware/psci: fix application of sizeof to pointer
crypto: s5p-sss - Add error handling in s5p_aes_probe()
media: rkvdec: Do not override sizeimage for output format
media: ite-cir: IR receiver stop working after receive overflow
media: rkvdec: Support dynamic resolution changes
media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
media: v4l2-ioctl: Fix check_ext_ctrls
ALSA: hda/realtek: Fix mic mute LED for the HP Spectre x360 14
ALSA: hda/realtek: Add a quirk for HP OMEN 15 mute LED
ALSA: hda/realtek: Add quirk for Clevo PC70HS
ALSA: hda/realtek: Headset fixup for Clevo NH77HJQ
ALSA: hda/realtek: Add a quirk for Acer Spin SP513-54N
ALSA: hda/realtek: Add quirk for ASUS UX550VE
ALSA: hda/realtek: Add quirk for HP EliteBook 840 G7 mute LED
ALSA: ua101: fix division by zero at probe
ALSA: 6fire: fix control and bulk message timeouts
ALSA: line6: fix control and interrupt message timeouts
ALSA: mixer: oss: Fix racy access to slots
ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
ALSA: usb-audio: Line6 HX-Stomp XL USB_ID for 48k-fixed quirk
ALSA: usb-audio: Add registration quirk for JBL Quantum 400
ALSA: hda: Free card instance properly at probe errors
ALSA: synth: missing check for possible NULL after the call to kstrdup
ALSA: pci: rme: Fix unaligned buffer addresses
ALSA: PCM: Fix NULL dereference at mmap checks
ALSA: timer: Fix use-after-free problem
ALSA: timer: Unconditionally unlink slave instances, too
Revert "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
ext4: fix lazy initialization next schedule time computation in more granular unit
ext4: ensure enough credits in ext4_ext_shift_path_extents
ext4: refresh the ext4_ext_path struct after dropping i_data_sem.
fuse: fix page stealing
x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
x86/irq: Ensure PI wakeup handler is unregistered before module unload
x86/iopl: Fake iopl(3) CLI/STI usage
btrfs: clear MISSING device status bit in btrfs_close_one_device
btrfs: fix lost error handling when replaying directory deletes
btrfs: call btrfs_check_rw_degradable only if there is a missing device
KVM: x86/mmu: Drop a redundant, broken remote TLB flush
KVM: VMX: Unregister posted interrupt wakeup handler on hardware unsetup
KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
ia64: kprobes: Fix to pass correct trampoline address to the handler
selinux: fix race condition when computing ocontext SIDs
ipmi:watchdog: Set panic count to proper value on a panic
md/raid1: only allocate write behind bio for WriteMostly device
hwmon: (pmbus/lm25066) Add offset coefficients
regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled
regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property
EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
mwifiex: fix division by zero in fw download path
ath6kl: fix division by zero in send path
ath6kl: fix control-message timeout
ath10k: fix control-message timeout
ath10k: fix division by zero in send path
PCI: Mark Atheros QCA6174 to avoid bus reset
rtl8187: fix control-message timeouts
evm: mark evm_fixmode as __ro_after_init
ifb: Depend on netfilter alternatively to tc
platform/surface: aggregator_registry: Add support for Surface Laptop Studio
mt76: mt7615: fix skb use-after-free on mac reset
HID: surface-hid: Use correct event registry for managing HID events
HID: surface-hid: Allow driver matching for target ID 1 devices
wcn36xx: Fix HT40 capability for 2Ghz band
wcn36xx: Fix tx_status mechanism
wcn36xx: Fix (QoS) null data frame bitrate/modulation
PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions
mwifiex: Read a PCI register after writing the TX ring write pointer
mwifiex: Try waking the firmware until we get an interrupt
libata: fix checking of DMA state
dma-buf: fix and rework dma_buf_poll v7
wcn36xx: handle connection loss indication
rsi: fix occasional initialisation failure with BT coex
rsi: fix key enabled check causing unwanted encryption for vap_id > 0
rsi: fix rate mask set leading to P2P failure
rsi: Fix module dev_oper_mode parameter description
perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
perf/x86/intel/uncore: Fix invalid unit check
perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
ASoC: tegra: Set default card name for Trimslice
ASoC: tegra: Restore AC97 support
signal: Remove the bogus sigkill_pending in ptrace_stop
memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode
signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed
soc: samsung: exynos-pmu: Fix compilation when nothing selects CONFIG_MFD_CORE
soc: fsl: dpio: replace smp_processor_id with raw_smp_processor_id
soc: fsl: dpio: use the combined functions to protect critical zone
mtd: rawnand: socrates: Keep the driver compatible with on-die ECC engines
mctp: handle the struct sockaddr_mctp padding fields
power: supply: max17042_battery: Prevent int underflow in set_soc_threshold
power: supply: max17042_battery: use VFSOC for capacity when no rsns
iio: core: fix double free in iio_device_unregister_sysfs()
iio: core: check return value when calling dev_set_name()
KVM: arm64: Extract ESR_ELx.EC only
KVM: x86: Fix recording of guest steal time / preempted status
KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
KVM: nVMX: Handle dynamic MSR intercept toggling
can: peak_usb: always ask for BERR reporting for PCAN-USB devices
can: mcp251xfd: mcp251xfd_irq(): add missing can_rx_offload_threaded_irq_finish() in case of bus off
can: j1939: j1939_tp_cmd_recv(): ignore abort message in the BAM transport
can: j1939: j1939_can_recv(): ignore messages with invalid source address
can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM
iio: adc: tsc2046: fix scan interval warning
powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
io_uring: honour zeroes as io-wq worker limits
ring-buffer: Protect ring_buffer_reset() from reentrancy
serial: core: Fix initializing and restoring termios speed
ifb: fix building without CONFIG_NET_CLS_ACT
xen/balloon: add late_initcall_sync() for initial ballooning done
ovl: fix use after free in struct ovl_aio_req
ovl: fix filattr copy-up failure
PCI: pci-bridge-emul: Fix emulation of W1C bits
PCI: cadence: Add cdns_plat_pcie_probe() missing return
cxl/pci: Fix NULL vs ERR_PTR confusion
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Fix checking for link up via LTSSM state
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Fix reporting Data Link Layer Link Active
PCI: aardvark: Fix configuring Reference clock
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge
PCI: aardvark: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge
PCI: aardvark: Set PCI Bridge Class Code to PCI Bridge
PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge
quota: check block number when reading the block in quota file
quota: correct error number in free_dqentry()
cifs: To match file servers, make sure the server hostname matches
cifs: set a minimum of 120s for next dns resolution
mfd: simple-mfd-i2c: Select MFD_CORE to fix build error
pinctrl: core: fix possible memory leak in pinctrl_enable()
coresight: cti: Correct the parameter for pm_runtime_put
coresight: trbe: Fix incorrect access of the sink specific data
coresight: trbe: Defer the probe on offline CPUs
iio: buffer: check return value of kstrdup_const()
iio: buffer: Fix memory leak in iio_buffers_alloc_sysfs_and_mask()
iio: buffer: Fix memory leak in __iio_buffer_alloc_sysfs_and_mask()
iio: buffer: Fix memory leak in iio_buffer_register_legacy_sysfs_groups()
drivers: iio: dac: ad5766: Fix dt property name
iio: dac: ad5446: Fix ad5622_write() return value
iio: ad5770r: make devicetree property reading consistent
Documentation:devicetree:bindings:iio:dac: Fix val
USB: serial: keyspan: fix memleak on probe errors
serial: 8250: fix racy uartclk update
ksmbd: set unique value to volume serial field in FS_VOLUME_INFORMATION
io-wq: serialize hash clear with wakeup
serial: 8250: Fix reporting real baudrate value in c_ospeed field
Revert "serial: 8250: Fix reporting real baudrate value in c_ospeed field"
most: fix control-message timeouts
USB: iowarrior: fix control-message timeouts
USB: chipidea: fix interrupt deadlock
power: supply: max17042_battery: Clear status bits in interrupt handler
component: do not leave master devres group open after bind
dma-buf: WARN on dmabuf release with pending attachments
drm: panel-orientation-quirks: Update the Lenovo Ideapad D330 quirk (v2)
drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200 2-in-1
drm: panel-orientation-quirks: Add quirk for the Samsung Galaxy Book 10.6
Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
Bluetooth: fix use-after-free error in lock_sock_nested()
Bluetooth: call sock_hold earlier in sco_conn_del
drm/panel-orientation-quirks: add Valve Steam Deck
rcutorture: Avoid problematic critical section nesting on PREEMPT_RT
platform/x86: wmi: do not fail if disabling fails
drm/amdgpu: move iommu_resume before ip init/resume
MIPS: lantiq: dma: add small delay after reset
MIPS: lantiq: dma: reset correct number of channel
locking/lockdep: Avoid RCU-induced noinstr fail
net: sched: update default qdisc visibility after Tx queue cnt changes
ACPI: resources: Add DMI-based legacy IRQ override quirk
rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
smackfs: Fix use-after-free in netlbl_catmap_walk()
ath11k: Align bss_chan_info structure with firmware
crypto: aesni - check walk.nbytes instead of err
x86/mm/64: Improve stack overflow warnings
x86: Increase exception stack sizes
mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
mwifiex: Properly initialize private structure on interface type changes
spi: Check we have a spi_device_id for each DT compatible
fscrypt: allow 256-bit master keys with AES-256-XTS
drm/amdgpu: Fix MMIO access page fault
drm/amd/display: Fix null pointer dereference for encoders
selftests: net: fib_nexthops: Wait before checking reported idle time
ath11k: Avoid reg rules update during firmware recovery
ath11k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
ath11k: Change DMA_FROM_DEVICE to DMA_TO_DEVICE when map reinjected packets
ath10k: high latency fixes for beacon buffer
octeontx2-pf: Enable promisc/allmulti match MCAM entries.
media: mt9p031: Fix corrupted frame after restarting stream
media: netup_unidvb: handle interrupt properly according to the firmware
media: atomisp: Fix error handling in probe
media: stm32: Potential NULL pointer dereference in dcmi_irq_thread()
media: uvcvideo: Set capability in s_param
media: uvcvideo: Return -EIO for control errors
media: uvcvideo: Set unique vdev name based in type
media: vidtv: Fix memory leak in remove
media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe()
media: s5p-mfc: Add checking to s5p_mfc_probe().
media: videobuf2: rework vb2_mem_ops API
media: imx: set a media_device bus_info string
media: rcar-vin: Use user provided buffers when starting
media: mceusb: return without resubmitting URB in case of -EPROTO error.
ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
rtw88: fix RX clock gate setting while fifo dump
brcmfmac: Add DMI nvram filename quirk for Cyberbook T116 tablet
media: rcar-csi2: Add checking to rcsi2_start_receiver()
ipmi: Disable some operations during a panic
fs/proc/uptime.c: Fix idle time reporting in /proc/uptime
kselftests/sched: cleanup the child processes
ACPICA: Avoid evaluating methods too early during system resume
cpufreq: Make policy min/max hard requirements
ice: Move devlink port to PF/VF struct
media: imx-jpeg: Fix possible null pointer dereference
media: ipu3-imgu: imgu_fmt: Handle properly try
media: ipu3-imgu: VIDIOC_QUERYCAP: Fix bus_info
media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
net-sysfs: try not to restart the syscall if it will fail eventually
drm/amdkfd: rm BO resv on validation to avoid deadlock
tracefs: Have tracefs directories not set OTH permission bits by default
tracing: Disable "other" permission bits in the tracefs files
ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create()
KVM: arm64: Propagate errors from __pkvm_prot_finalize hypercall
mmc: moxart: Fix reference count leaks in moxart_probe
iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
ACPI: battery: Accept charges over the design capacity as full
ACPI: scan: Release PM resources blocked by unused objects
drm/amd/display: fix null pointer deref when plugging in display
drm/amdkfd: fix resume error when iommu disabled in Picasso
net: phy: micrel: make *-skew-ps check more lenient
leaking_addresses: Always print a trailing newline
thermal/core: Fix null pointer dereference in thermal_release()
drm/msm: prevent NULL dereference in msm_gpu_crashstate_capture()
thermal/drivers/tsens: Add timeout to get_temp_tsens_valid
block: bump max plugged deferred size from 16 to 32
floppy: fix calling platform_device_unregister() on invalid drives
md: update superblock after changing rdev flags in state_store
memstick: r592: Fix a UAF bug when removing the driver
locking/rwsem: Disable preemption for spinning region
lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
lib/xz: Validate the value before assigning it to an enum variable
workqueue: make sysfs of unbound kworker cpumask more clever
tracing/cfi: Fix cmp_entries_* functions signature mismatch
mt76: mt7915: fix an off-by-one bound check
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
iwlwifi: change all JnP to NO-160 configuration
block: remove inaccurate requeue check
media: allegro: ignore interrupt if mailbox is not initialized
drm/amdgpu/pm: properly handle sclk for profiling modes on vangogh
nvmet: fix use-after-free when a port is removed
nvmet-rdma: fix use-after-free when a port is removed
nvmet-tcp: fix use-after-free when a port is removed
nvme: drop scan_lock and always kick requeue list when removing namespaces
samples/bpf: Fix application of sizeof to pointer
arm64: vdso32: suppress error message for 'make mrproper'
PM: hibernate: Get block device exclusively in swsusp_check()
selftests: kvm: fix mismatched fclose() after popen()
selftests/bpf: Fix perf_buffer test on system with offline cpus
iwlwifi: mvm: disable RX-diversity in powersave
smackfs: use __GFP_NOFAIL for smk_cipso_doi()
ARM: clang: Do not rely on lr register for stacktrace
gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE
can: bittiming: can_fixup_bittiming(): change type of tseg1 and alltseg to unsigned int
gfs2: Cancel remote delete work asynchronously
gfs2: Fix glock_hash_walk bugs
ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
tools/latency-collector: Use correct size when writing queue_full_warning
vrf: run conntrack only in context of lower/physdev for locally generated packets
net: annotate data-race in neigh_output()
ACPI: AC: Quirk GK45 to skip reading _PSR
ACPI: resources: Add one more Medion model in IRQ override quirk
btrfs: reflink: initialize return value to 0 in btrfs_extent_same()
btrfs: do not take the uuid_mutex in btrfs_rm_device
spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()
wcn36xx: Correct band/freq reporting on RX
wcn36xx: Fix packet drop on resume
Revert "wcn36xx: Enable firmware link monitoring"
ftrace: do CPU checking after preemption disabled
inet: remove races in inet{6}_getname()
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled
selftests/core: fix conflicting types compile error for close_range()
perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
parisc: fix warning in flush_tlb_all
task_stack: Fix end_of_stack() for architectures with upwards-growing stack
erofs: don't trigger WARN() when decompression fails
parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
netfilter: conntrack: set on IPS_ASSURED if flows enters internal stream state
selftests/bpf: Fix strobemeta selftest regression
fbdev/efifb: Release PCI device's runtime PM ref during FB destroy
drm/bridge: anx7625: Propagate errors from sp_tx_rst_aux()
perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
drm/bridge: it66121: Initialize {device,vendor}_ids
drm/bridge: it66121: Wait for next bridge to be probed
Bluetooth: fix init and cleanup of sco_conn.timeout_work
libbpf: Don't crash on object files with no symbol tables
Bluetooth: hci_uart: fix GPF in h5_recv
rcu: Fix existing exp request check in sync_sched_exp_online_cleanup()
MIPS: lantiq: dma: fix burst length for DEU
x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
objtool: Handle __sanitize_cov*() tail calls
net/mlx5: Publish and unpublish all devlink parameters at once
drm/v3d: fix wait for TMU write combiner flush
crypto: sm4 - Do not change section of ck and sbox
virtio-gpu: fix possible memory allocation failure
lockdep: Let lock_is_held_type() detect recursive read as read
net: net_namespace: Fix undefined member in key_remove_domain()
net: phylink: don't call netif_carrier_off() with NULL netdev
drm: bridge: it66121: Fix return value it66121_probe
spi: Fixed division by zero warning
cgroup: Make rebind_subsystems() disable v2 controllers all at once
wcn36xx: Fix Antenna Diversity Switching
wilc1000: fix possible memory leak in cfg_scan_result()
Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync
drm/amdgpu: Fix crash on device remove/driver unload
drm/amd/display: Pass display_pipe_params_st as const in DML
drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage
crypto: caam - disable pkc for non-E SoCs
crypto: qat - power up 4xxx device
Bluetooth: hci_h5: Fix (runtime)suspend issues on RTL8723BS HCIs
bnxt_en: Check devlink allocation and registration status
qed: Don't ignore devlink allocation failures
rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()
mptcp: do not shrink snd_nxt when recovering
fortify: Fix dropped strcpy() compile-time write overflow check
mac80211: twt: don't use potentially unaligned pointer
cfg80211: always free wiphy specific regdomain
net/mlx5: Accept devlink user input after driver initialization complete
net: dsa: rtl8366rb: Fix off-by-one bug
net: dsa: rtl8366: Fix a bug in deleting VLANs
bpf/tests: Fix error in tail call limit tests
ath11k: fix some sleeping in atomic bugs
ath11k: Avoid race during regd updates
ath11k: fix packet drops due to incorrect 6 GHz freq value in rx status
ath11k: Fix memory leak in ath11k_qmi_driver_event_work
gve: DQO: avoid unused variable warnings
ath10k: Fix missing frame timestamp for beacon/probe-resp
ath10k: sdio: Add missing BH locking around napi_schdule()
drm/ttm: stop calling tt_swapin in vm_access
arm64: mm: update max_pfn after memory hotplug
drm/amdgpu: fix warning for overflow check
libbpf: Fix skel_internal.h to set errno on loader retval < 0
media: em28xx: add missing em28xx_close_extension
media: meson-ge2d: Fix rotation parameter changes detection in 'ge2d_s_ctrl()'
media: cxd2880-spi: Fix a null pointer dereference on error handling path
media: ttusb-dec: avoid release of non-acquired mutex
media: dvb-usb: fix ununit-value in az6027_rc_query
media: imx258: Fix getting clock frequency
media: v4l2-ioctl: S_CTRL output the right value
media: mtk-vcodec: venc: fix return value when start_streaming fails
media: TDA1997x: handle short reads of hdmi info frame.
media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'
media: imx-jpeg: Fix the error handling path of 'mxc_jpeg_probe()'
media: i2c: ths8200 needs V4L2_ASYNC
media: sun6i-csi: Allow the video device to be open multiple times
media: radio-wl1273: Avoid card name truncation
media: si470x: Avoid card name truncation
media: tm6000: Avoid card name truncation
media: cx23885: Fix snd_card_free call on null card pointer
media: atmel: fix the ispck initialization
scs: Release kasan vmalloc poison in scs_free process
kprobes: Do not use local variable when creating debugfs file
crypto: ecc - fix CRYPTO_DEFAULT_RNG dependency
drm: fb_helper: fix CONFIG_FB dependency
cpuidle: Fix kobject memory leaks in error paths
media: em28xx: Don't use ops->suspend if it is NULL
ath10k: Don't always treat modem stop events as crashes
ath9k: Fix potential interrupt storm on queue reset
PM: EM: Fix inefficient states detection
x86/insn: Use get_unaligned() instead of memcpy()
EDAC/amd64: Handle three rank interleaving mode
rcu: Always inline rcu_dynticks_task*_{enter,exit}()
rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
netfilter: nft_dynset: relax superfluous check on set updates
media: venus: fix vpp frequency calculation for decoder
media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
crypto: ccree - avoid out-of-range warnings from clang
crypto: qat - detect PFVF collision after ACK
crypto: qat - disregard spurious PFVF interrupts
hwrng: mtk - Force runtime pm ops for sleep ops
ima: fix deadlock when traversing "ima_default_rules".
b43legacy: fix a lower bounds test
b43: fix a lower bounds test
gve: Recover from queue stall due to missed IRQ
gve: Track RX buffer allocation failures
mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured
mmc: sdhci-omap: Fix context restore
memstick: avoid out-of-range warning
memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()
net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE
hwmon: Fix possible memleak in __hwmon_device_register()
hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff
ath10k: fix max antenna gain unit
kernel/sched: Fix sched_fork() access an invalid sched_task_group
net: fealnx: fix build for UML
net: intel: igc_ptp: fix build for UML
net: tulip: winbond-840: fix build for UML
tcp: switch orphan_count to bare per-cpu counters
crypto: octeontx2 - set assoclen in aead_do_fallback()
thermal/core: fix a UAF bug in __thermal_cooling_device_register()
drm/msm/dsi: do not enable irq handler before powering up the host
drm/msm: Fix potential Oops in a6xx_gmu_rpmh_init()
drm/msm: potential error pointer dereference in init()
drm/msm: unlock on error in get_sched_entity()
drm/msm: fix potential NULL dereference in cleanup
drm/msm: uninitialized variable in msm_gem_import()
net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
thermal/drivers/qcom/lmh: make QCOM_LMH depends on QCOM_SCM
mailbox: Remove WARN_ON for async_cb.cb in cmdq_exec_done
media: ivtv: fix build for UML
media: ir_toy: assignment to be16 should be of correct type
mmc: mxs-mmc: disable regulator on error and in the remove function
io-wq: Remove duplicate code in io_workqueue_create()
block: ataflop: fix breakage introduced at blk-mq refactoring
blk-wbt: prevent NULL pointer dereference in wb_timer_fn
platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
mailbox: mtk-cmdq: Validate alias_id on probe
mailbox: mtk-cmdq: Fix local clock ID usage
ACPI: PM: Turn off unused wakeup power resources
ACPI: PM: Fix sharing of wakeup power resources
drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu
mt76: mt7921: fix endianness in mt7921_mcu_tx_done_event
mt76: mt7915: fix endianness warning in mt7915_mac_add_txs_skb
mt76: mt7921: fix endianness warning in mt7921_update_txs
mt76: mt7615: fix endianness warning in mt7615_mac_write_txwi
mt76: mt7915: fix info leak in mt7915_mcu_set_pre_cal()
mt76: connac: fix mt76_connac_gtk_rekey_tlv usage
mt76: fix build error implicit enumeration conversion
mt76: mt7921: fix survey-dump reporting
mt76: mt76x02: fix endianness warnings in mt76x02_mac.c
mt76: mt7921: Fix out of order process by invalid event pkt
mt76: mt7915: fix potential overflow of eeprom page index
mt76: mt7915: fix bit fields for HT rate idx
mt76: mt7921: fix dma hang in rmmod
mt76: connac: fix GTK rekey offload failure on WPA mixed mode
mt76: overwrite default reg_ops if necessary
mt76: mt7921: report HE MU radiotap
mt76: mt7921: fix firmware usage of RA info using legacy rates
mt76: mt7921: fix kernel warning from cfg80211_calculate_bitrate
mt76: mt7921: always wake device if necessary in debugfs
mt76: mt7915: fix hwmon temp sensor mem use-after-free
mt76: mt7615: fix hwmon temp sensor mem use-after-free
mt76: mt7915: fix possible infinite loop release semaphore
mt76: mt7921: fix retrying release semaphore without end
mt76: mt7615: fix monitor mode tear down crash
mt76: connac: fix possible NULL pointer dereference in mt76_connac_get_phy_mode_v2
mt76: mt7915: fix sta_rec_wtbl tag len
mt76: mt7915: fix muar_idx in mt7915_mcu_alloc_sta_req()
rsi: stop thread firstly in rsi_91x_init() error handling
mwifiex: Send DELBA requests according to spec
iwlwifi: mvm: reset PM state on unsuccessful resume
iwlwifi: pnvm: don't kmemdup() more than we have
iwlwifi: pnvm: read EFI data only if long enough
net: enetc: unmap DMA in enetc_send_cmd()
phy: micrel: ksz8041nl: do not use power down mode
nbd: Fix use-after-free in pid_show
nvme-rdma: fix error code in nvme_rdma_setup_ctrl
PM: hibernate: fix sparse warnings
clocksource/drivers/timer-ti-dm: Select TIMER_OF
x86/sev: Fix stack type check in vc_switch_off_ist()
drm/msm: Fix potential NULL dereference in DPU SSPP
drm/msm/dsi: fix wrong type in msm_dsi_host
crypto: tcrypt - fix skcipher multi-buffer tests for 1420B blocks
smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
KVM: selftests: Fix nested SVM tests when built with clang
libbpf: Fix memory leak in btf__dedup()
bpftool: Avoid leaking the JSON writer prepared for program metadata
libbpf: Fix overflow in BTF sanity checks
libbpf: Fix BTF header parsing checks
mt76: mt7615: mt7622: fix ibss and meshpoint
s390/gmap: validate VMA in __gmap_zap()
s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap()
s390/mm: validate VMA in PGSTE manipulation functions
s390/mm: fix VMA and page table handling code in storage key handling functions
s390/uv: fully validate the VMA before calling follow_page()
KVM: s390: pv: avoid double free of sida page
KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
irq: mips: avoid nested irq_enter()
net: dsa: avoid refcount warnings when ->port_{fdb,mdb}_del returns error
ARM: 9142/1: kasan: work around LPAE build warning
ath10k: fix module load regression with iram-recovery feature
block: ataflop: more blk-mq refactoring fixes
blk-cgroup: synchronize blkg creation against policy deactivation
libbpf: Fix off-by-one bug in bpf_core_apply_relo()
tpm: fix Atmel TPM crash caused by too frequent queries
tpm_tis_spi: Add missing SPI ID
libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
tracing: Fix missing trace_boot_init_histograms kstrdup NULL checks
cpufreq: intel_pstate: Fix cpu->pstate.turbo_freq initialization
spi: spi-rpc-if: Check return value of rpcif_sw_init()
samples/kretprobes: Fix return value if register_kretprobe() failed
KVM: s390: Fix handle_sske page fault handling
libertas_tf: Fix possible memory leak in probe and disconnect
libertas: Fix possible memory leak in probe and disconnect
wcn36xx: add proper DMA memory barriers in rx path
wcn36xx: Fix discarded frames due to wrong sequence number
bpf: Avoid races in __bpf_prog_run() for 32bit arches
bpf: Fixes possible race in update_prog_stats() for 32bit arches
wcn36xx: Channel list update before hardware scan
drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()
drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
selftests/bpf: Fix fd cleanup in sk_lookup test
selftests/bpf: Fix memory leak in test_ima
sctp: allow IP fragmentation when PLPMTUD enters Error state
sctp: reset probe_timer in sctp_transport_pl_update
sctp: subtract sctphdr len in sctp_transport_pl_hlen
sctp: return true only for pathmtu update in sctp_transport_pl_toobig
net: amd-xgbe: Toggle PLL settings during rate change
ipmi: kcs_bmc: Fix a memory leak in the error handling path of 'kcs_bmc_serio_add_device()'
nfp: fix NULL pointer access when scheduling dim work
nfp: fix potential deadlock when canceling dim work
net: phylink: avoid mvneta warning when setting pause parameters
net: bridge: fix uninitialized variables when BRIDGE_CFM is disabled
selftests: net: bridge: update IGMP/MLD membership interval value
crypto: pcrypt - Delay write to padata->info
selftests/bpf: Fix fclose/pclose mismatch in test_progs
udp6: allow SO_MARK ctrl msg to affect routing
ibmvnic: don't stop queue in xmit
ibmvnic: Process crqs after enabling interrupts
ibmvnic: delay complete()
selftests: mptcp: fix proto type in link_failure tests
skmsg: Lose offset info in sk_psock_skb_ingress
cgroup: Fix rootcg cpu.stat guest double counting
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
of: unittest: fix EXPECT text for gpio hog errors
cpufreq: Fix parameter in parse_perf_domain()
staging: r8188eu: fix memory leak in rtw_set_key
arm64: dts: meson: sm1: add Ethernet PHY reset line for ODROID-C4/HC4
iio: st_sensors: disable regulators after device unregistration
RDMA/rxe: Fix wrong port_cap_flags
ARM: dts: BCM5301X: Fix memory nodes names
arm64: dts: broadcom: bcm4908: Fix UART clock name
clk: mvebu: ap-cpu-clk: Fix a memory leak in error handling paths
scsi: pm80xx: Fix lockup in outbound queue management
scsi: qla2xxx: edif: Use link event to wake up app
scsi: lpfc: Fix NVMe I/O failover to non-optimized path
ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
arm64: dts: rockchip: Fix GPU register width for RK3328
ARM: dts: qcom: msm8974: Add xo_board reference clock to DSI0 PHY
RDMA/bnxt_re: Fix query SRQ failure
arm64: dts: ti: k3-j721e-main: Fix "max-virtual-functions" in PCIe EP nodes
arm64: dts: ti: k3-j721e-main: Fix "bus-range" upto 256 bus number for PCIe
arm64: dts: ti: j7200-main: Fix "vendor-id"/"device-id" properties of pcie node
arm64: dts: ti: j7200-main: Fix "bus-range" upto 256 bus number for PCIe
arm64: dts: meson-g12a: Fix the pwm regulator supply properties
arm64: dts: meson-g12b: Fix the pwm regulator supply properties
arm64: dts: meson-sm1: Fix the pwm regulator supply properties
bus: ti-sysc: Fix timekeeping_suspended warning on resume
ARM: dts: at91: tse850: the emac<->phy interface is rmii
arm64: dts: qcom: sc7180: Base dynamic CPU power coefficients in reality
soc: qcom: llcc: Disable MMUHWT retention
arm64: dts: qcom: sc7280: fix display port phy reg property
scsi: dc395: Fix error case unwinding
MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
JFS: fix memleak in jfs_mount
pinctrl: renesas: rzg2l: Fix missing port register 21h
ASoC: wcd9335: Use correct version to initialize Class H
arm64: dts: qcom: msm8916: Fix Secondary MI2S bit clock
arm64: dts: renesas: beacon: Fix Ethernet PHY mode
iommu/mediatek: Fix out-of-range warning with clang
arm64: dts: qcom: pm8916: Remove wrong reg-names for rtc@6000
iommu/dma: Fix sync_sg with swiotlb
iommu/dma: Fix arch_sync_dma for map
ALSA: hda: Reduce udelay() at SKL+ position reporting
ALSA: hda: Use position buffer for SKL+ again
ALSA: usb-audio: Fix possible race at sync of urb completions
soundwire: debugfs: use controller id and link_id for debugfs
power: reset: at91-reset: check properly the return value of devm_of_iomap
scsi: ufs: core: Fix ufshcd_probe_hba() prototype to match the definition
scsi: ufs: core: Stop clearing UNIT ATTENTIONS
scsi: megaraid_sas: Fix concurrent access to ISR between IRQ polling and real interrupt
scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp()
driver core: Fix possible memory leak in device_link_add()
arm: dts: omap3-gta04a4: accelerometer irq fix
ASoC: SOF: topology: do not power down primary core during topology removal
iio: st_pressure_spi: Add missing entries SPI to device ID table
soc/tegra: Fix an error handling path in tegra_powergate_power_up()
memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
clk: at91: check pmc node status before registering syscore ops
powerpc/mem: Fix arch/powerpc/mm/mem.c:53:12: error: no previous prototype for 'create_section_mapping'
video: fbdev: chipsfb: use memset_io() instead of memset()
powerpc: fix unbalanced node refcount in check_kvm_guest()
powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
serial: 8250_dw: Drop wrong use of ACPI_PTR()
usb: gadget: hid: fix error code in do_config()
power: supply: rt5033_battery: Change voltage values to µV
power: supply: max17040: fix null-ptr-deref in max17040_probe()
scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
RDMA/mlx4: Return missed an error if device doesn't support steering
usb: musb: select GENERIC_PHY instead of depending on it
staging: most: dim2: do not double-register the same device
staging: ks7010: select CRYPTO_HASH/CRYPTO_MICHAEL_MIC
RDMA/core: Set sgtable nents when using ib_dma_virt_map_sg()
dyndbg: make dyndbg a known cli param
powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10
pinctrl: renesas: checker: Fix off-by-one bug in drive register check
ARM: dts: stm32: Reduce DHCOR SPI NOR frequency to 50 MHz
ARM: dts: stm32: fix STUSB1600 Type-C irq level on stm32mp15xx-dkx
ARM: dts: stm32: fix SAI sub nodes register range
ARM: dts: stm32: fix AV96 board SAI2 pin muxing on stm32mp15
ASoC: cs42l42: Always configure both ASP TX channels
ASoC: cs42l42: Correct some register default values
ASoC: cs42l42: Defer probe if request_threaded_irq() returns EPROBE_DEFER
soc: qcom: rpmhpd: Make power_on actually enable the domain
soc: qcom: socinfo: add two missing PMIC IDs
iio: buffer: Fix double-free in iio_buffers_alloc_sysfs_and_mask()
usb: typec: STUSB160X should select REGMAP_I2C
iio: adis: do not disabe IRQs in 'adis_init()'
soundwire: bus: stop dereferencing invalid slave pointer
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
serial: imx: fix detach/attach of serial console
usb: dwc2: drd: fix dwc2_force_mode call in dwc2_ovr_init
usb: dwc2: drd: fix dwc2_drd_role_sw_set when clock could be disabled
usb: dwc2: drd: reset current session before setting the new one
powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE
usb: dwc3: gadget: Skip resizing EP's TX FIFO if already resized
firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available()
soc: qcom: rpmhpd: fix sm8350_mxc's peer domain
soc: qcom: apr: Add of_node_put() before return
arm64: dts: qcom: pmi8994: Fix "eternal"->"external" typo in WLED node
arm64: dts: qcom: sdm845: Use RPMH_CE_CLK macro directly
arm64: dts: qcom: sdm845: Fix Qualcomm crypto engine bus clock
pinctrl: equilibrium: Fix function addition in multiple groups
ASoC: topology: Fix stub for snd_soc_tplg_component_remove()
phy: qcom-qusb2: Fix a memory leak on probe
phy: ti: gmii-sel: check of_get_address() for failure
phy: qcom-qmp: another fix for the sc8180x PCIe definition
phy: qcom-snps: Correct the FSEL_MASK
phy: Sparx5 Eth SerDes: Fix return value check in sparx5_serdes_probe()
serial: xilinx_uartps: Fix race condition causing stuck TX
clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL
clk: at91: clk-master: check if div or pres is zero
clk: at91: clk-master: fix prescaler logic
HID: u2fzero: clarify error check and length calculations
HID: u2fzero: properly handle timeouts in usb_submit_urb
powerpc/nohash: Fix __ptep_set_access_flags() and ptep_set_wrprotect()
powerpc/book3e: Fix set_memory_x() and set_memory_nx()
powerpc/44x/fsp2: add missing of_node_put
powerpc/xmon: fix task state output
ALSA: oxfw: fix functional regression for Mackie Onyx 1640i in v5.14 or later
iommu/dma: Fix incorrect error return on iommu deferred attach
powerpc: Don't provide __kernel_map_pages() without ARCH_SUPPORTS_DEBUG_PAGEALLOC
ASoC: cs42l42: Correct configuring of switch inversion from ts-inv
RDMA/hns: Fix initial arm_st of CQ
RDMA/hns: Modify the value of MAX_LP_MSG_LEN to meet hardware compatibility
ASoC: rsnd: Fix an error handling path in 'rsnd_node_count()'
serial: cpm_uart: Protect udbg definitions by CONFIG_SERIAL_CPM_CONSOLE
virtio_ring: check desc == NULL when using indirect with packed
vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
mips: cm: Convert to bitfield API to fix out-of-bounds access
power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
RDMA/core: Require the driver to set the IOVA correctly during rereg_mr
apparmor: fix error check
rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
mtd: rawnand: intel: Fix potential buffer overflow in probe
nfsd: don't alloc under spinlock in rpc_parse_scope_id
rtc: ds1302: Add SPI ID table
rtc: ds1390: Add SPI ID table
rtc: pcf2123: Add SPI ID table
remoteproc: imx_rproc: Fix TCM io memory type
i2c: i801: Use PCI bus rescan mutex to protect P2SB access
dmaengine: idxd: move out percpu_ref_exit() to ensure it's outside submission
rtc: mcp795: Add SPI ID table
Input: ariel-pwrbutton - add SPI device ID table
i2c: mediatek: fixing the incorrect register offset
NFS: Default change_attr_type to NFS4_CHANGE_TYPE_IS_UNDEFINED
NFS: Don't set NFS_INO_DATA_INVAL_DEFER and NFS_INO_INVALID_DATA
NFS: Ignore the directory size when marking for revalidation
NFS: Fix dentry verifier races
pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
drm/bridge/lontium-lt9611uxc: fix provided connector suport
drm/plane-helper: fix uninitialized variable reference
PCI: aardvark: Don't spam about PIO Response Status
PCI: aardvark: Fix preserving PCI_EXP_RTCTL_CRSSVE flag on emulated bridge
opp: Fix return in _opp_add_static_v2()
NFS: Fix deadlocks in nfs_scan_commit_list()
sparc: Add missing "FORCE" target when using if_changed
fs: orangefs: fix error return code of orangefs_revalidate_lookup()
Input: st1232 - increase "wait ready" timeout
drm/bridge: nwl-dsi: Add atomic_get_input_bus_fmts
mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
PCI: uniphier: Serialize INTx masking/unmasking and fix the bit operation
mtd: rawnand: arasan: Prevent an unsupported configuration
mtd: core: don't remove debugfs directory if device is in use
remoteproc: Fix a memory leak in an error handling path in 'rproc_handle_vdev()'
rtc: rv3032: fix error handling in rv3032_clkout_set_rate()
dmaengine: at_xdmac: call at_xdmac_axi_config() on resume path
dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
dmaengine: stm32-dma: fix stm32_dma_get_max_width
NFS: Fix up commit deadlocks
NFS: Fix an Oops in pnfs_mark_request_commit()
Fix user namespace leak
auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
auxdisplay: ht16k33: Connect backlight to fbdev
auxdisplay: ht16k33: Fix frame buffer device blanking
soc: fsl: dpaa2-console: free buffer before returning from dpaa2_console_read
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result`
dmaengine: tegra210-adma: fix pm runtime unbalance
dmanegine: idxd: fix resource free ordering on driver removal
dmaengine: idxd: reconfig device after device reset command
signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
m68k: set a default value for MEMORY_RESERVE
watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
ar7: fix kernel builds for compiler test
scsi: target: core: Remove from tmr_list during LUN unlink
scsi: qla2xxx: Relogin during fabric disturbance
scsi: qla2xxx: Fix gnl list corruption
scsi: qla2xxx: Turn off target reset during issue_lip
scsi: qla2xxx: edif: Fix app start fail
scsi: qla2xxx: edif: Fix app start delay
scsi: qla2xxx: edif: Flush stale events and msgs on session down
scsi: qla2xxx: edif: Increase ELS payload
scsi: qla2xxx: edif: Fix EDIF bsg
NFSv4: Fix a regression in nfs_set_open_stateid_locked()
dmaengine: idxd: fix resource leak on dmaengine driver disable
i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
gpio: realtek-otto: fix GPIO line IRQ offset
xen-pciback: Fix return in pm_ctrl_init()
nbd: fix max value for 'first_minor'
nbd: fix possible overflow for 'first_minor' in nbd_dev_add()
io-wq: fix max-workers not correctly set on multi-node system
net: davinci_emac: Fix interrupt pacing disable
kselftests/net: add missed icmp.sh test to Makefile
kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile
kselftests/net: add missed SRv6 tests
kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile
kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile
ethtool: fix ethtool msg len calculation for pause stats
openrisc: fix SMP tlb flush NULL pointer dereference
net: vlan: fix a UAF in vlan_dev_real_dev()
net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
ice: Fix replacing VF hardware MAC to existing MAC filter
ice: Fix not stopping Tx queues for VFs
kdb: Adopt scheduler's task classification
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
PCI: j721e: Fix j721e_pcie_probe() error path
nvdimm/btt: do not call del_gendisk() if not needed
scsi: bsg: Fix errno when scsi_bsg_register_queue() fails
scsi: ufs: ufshpb: Use proper power management API
scsi: ufs: core: Fix NULL pointer dereference
scsi: ufs: ufshpb: Properly handle max-single-cmd
selftests: net: properly support IPv6 in GSO GRE test
drm/nouveau/svm: Fix refcount leak bug and missing check against null bug
nvdimm/pmem: cleanup the disk if pmem_release_disk() is yet assigned
block/ataflop: use the blk_cleanup_disk() helper
block/ataflop: add registration bool before calling del_gendisk()
block/ataflop: provide a helper for cleanup up an atari disk
ataflop: remove ataflop_probe_lock mutex
PCI: Do not enable AtomicOps on VFs
cpufreq: intel_pstate: Clear HWP desired on suspend/shutdown and offline
net: phy: fix duplex out of sync problem while changing settings
block: fix device_add_disk() kobject_create_and_add() error handling
drm/ttm: remove ttm_bo_vm_insert_huge()
bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
octeontx2-pf: select CONFIG_NET_DEVLINK
ALSA: memalloc: Catch call with NULL snd_dma_buffer pointer
mfd: core: Add missing of_node_put for loop iteration
mfd: cpcap: Add SPI device ID table
mfd: sprd: Add SPI device ID table
mfd: altera-sysmgr: Fix a mistake caused by resource_size conversion
ACPI: PM: Fix device wakeup power reference counting error
libbpf: Fix lookup_and_delete_elem_flags error reporting
selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
selftests/bpf/xdp_redirect_multi: Limit the tests in netns
drm: fb_helper: improve CONFIG_FB dependency
Revert "drm/imx: Annotate dma-fence critical section in commit path"
drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
can: etas_es58x: es58x_rx_err_msg(): fix memory leak in error path
can: mcp251xfd: mcp251xfd_chip_start(): fix error handling for mcp251xfd_chip_rx_int_enable()
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
zram: off by one in read_block_state()
perf bpf: Add missing free to bpf_event__print_bpf_prog_info()
llc: fix out-of-bound array index in llc_sk_dev_hash()
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
litex_liteeth: Fix a double free in the remove function
arm64: arm64_ftr_reg->name may not be a human-readable string
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
bpf, sockmap: Remove unhash handler for BPF sockmap usage
bpf, sockmap: Fix race in ingress receive verdict with redirect to self
bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
dmaengine: stm32-dma: fix burst in case of unaligned memory address
dmaengine: stm32-dma: avoid 64-bit division in stm32_dma_get_max_width
gve: Fix off by one in gve_tx_timeout()
drm/i915/fb: Fix rounding error in subsampled plane size calculation
init: make unknown command line param message clearer
seq_file: fix passing wrong private data
drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
net: dsa: mv88e6xxx: Don't support >1G speeds on 6191X on ports other than 10
net/sched: sch_taprio: fix undefined behavior in ktime_mono_to_any
net: hns3: fix ROCE base interrupt vector initialization bug
net: hns3: fix pfc packet number incorrect after querying pfc parameters
net: hns3: fix kernel crash when unload VF while it is being reset
net: hns3: allow configure ETS bandwidth of all TCs
net: stmmac: allow a tc-taprio base-time of zero
net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
vsock: prevent unnecessary refcnt inc for nonblocking connect
net/smc: fix sk_refcnt underflow on linkdown and fallback
cxgb4: fix eeprom len when diagnostics not implemented
selftests/net: udpgso_bench_rx: fix port argument
thermal: int340x: fix build on 32-bit targets
smb3: do not error on fsync when readonly
ARM: 9155/1: fix early early_iounmap()
ARM: 9156/1: drop cc-option fallbacks for architecture selection
parisc: Fix backtrace to always include init funtion names
parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
MIPS: fix duplicated slashes for Platform file path
MIPS: fix *-pkg builds for loongson2ef platform
MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL
x86/mce: Add errata workaround for Skylake SKX37
PCI/MSI: Move non-mask check back into low level accessors
PCI/MSI: Destroy sysfs before freeing entries
KVM: x86: move guest_pv_has out of user_access section
posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
irqchip/sifive-plic: Fixup EOI failed when masked
f2fs: should use GFP_NOFS for directory inodes
f2fs: include non-compressed blocks in compr_written_block
f2fs: fix UAF in f2fs_available_free_memory
ceph: fix mdsmap decode when there are MDS's beyond max_mds
erofs: fix unsafe pagevec reuse of hooked pclusters
drm/i915/guc: Fix blocked context accounting
block: Hold invalidate_lock in BLKDISCARD ioctl
block: Hold invalidate_lock in BLKZEROOUT ioctl
block: Hold invalidate_lock in BLKRESETZONE ioctl
ksmbd: Fix buffer length check in fsctl_validate_negotiate_info()
ksmbd: don't need 8byte alignment for request length in ksmbd_check_message
dmaengine: ti: k3-udma: Set bchan to NULL if a channel request fail
dmaengine: ti: k3-udma: Set r/tchan or rflow to NULL if request fail
dmaengine: bestcomm: fix system boot lockups
net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE
9p/net: fix missing error check in p9_check_errors
mm/filemap.c: remove bogus VM_BUG_ON
memcg: prohibit unconditional exceeding the limit of dying tasks
mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
mm, oom: do not trigger out_of_memory from the #PF
mm, thp: lock filemap when truncating page cache
mm, thp: fix incorrect unmap behavior for private pages
mfd: dln2: Add cell for initializing DLN2 ADC
video: backlight: Drop maximum brightness override for brightness zero
bcache: fix use-after-free problem in bcache_device_free()
bcache: Revert "bcache: use bvec_virt"
PM: sleep: Avoid calling put_device() under dpm_list_mtx
s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove
s390/cio: check the subchannel validity for dev_busid
s390/tape: fix timer initialization in tape_std_assign()
s390/ap: Fix hanging ioctl caused by orphaned replies
s390/cio: make ccw_device_dma_* more robust
remoteproc: elf_loader: Fix loading segment when is_iomem true
remoteproc: Fix the wrong default value of is_iomem
remoteproc: imx_rproc: Fix ignoring mapping vdev regions
remoteproc: imx_rproc: Fix rsc-table name
mtd: rawnand: fsmc: Fix use of SM ORDER
mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC engines
mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines
mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC engines
mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines
mtd: rawnand: pasemi: Keep the driver compatible with on-die ECC engines
mtd: rawnand: orion: Keep the driver compatible with on-die ECC engines
mtd: rawnand: plat_nand: Keep the driver compatible with on-die ECC engines
mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC engines
powerpc/vas: Fix potential NULL pointer dereference
powerpc/bpf: Fix write protecting JIT code
powerpc/32e: Ignore ESR in instruction storage interrupt handler
powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
powerpc/security: Use a mutex for interrupt exit code patching
powerpc/64s/interrupt: Fix check_return_regs_valid() false positive
powerpc/pseries/mobility: ignore ibm, platform-facilities updates
powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
drm/sun4i: Fix macros in sun8i_csc.h
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
PCI: aardvark: Fix PCIe Max Payload Size setting
SUNRPC: Partial revert of commit
|
||
|
|
d5d21724af |
posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
commit ca7752caeaa70bd31d1714af566c9809688544af upstream.
copy_process currently copies task_struct.posix_cputimers_work as-is. If a
timer interrupt arrives while handling clone and before dup_task_struct
completes then the child task will have:
1. posix_cputimers_work.scheduled = true
2. posix_cputimers_work.work queued.
copy_process clears task_struct.task_works, so (2) will have no effect and
posix_cpu_timers_work will never run (not to mention it doesn't make sense
for two tasks to share a common linked list).
Since posix_cpu_timers_work never runs, posix_cputimers_work.scheduled is
never cleared. Since scheduled is set, future timer interrupts will skip
scheduling work, with the ultimate result that the task will never receive
timer expirations.
Together, the complete flow is:
1. Task 1 calls clone(), enters kernel.
2. Timer interrupt fires, schedules task work on Task 1.
2a. task_struct.posix_cputimers_work.scheduled = true
2b. task_struct.posix_cputimers_work.work added to
task_struct.task_works.
3. dup_task_struct() copies Task 1 to Task 2.
4. copy_process() clears task_struct.task_works for Task 2.
5. Future timer interrupts on Task 2 see
task_struct.posix_cputimers_work.scheduled = true and skip scheduling
work.
Fix this by explicitly clearing contents of task_struct.posix_cputimers_work
in copy_process(). This was never meant to be shared or inherited across
tasks in the first place.
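The flow above can be reproduced in a small user-space model. This is only a sketch: every `toy_*` name is hypothetical and stands in for the kernel structures named in the message, and the real fix lives in copy_process(), not in code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_work {
    struct toy_work *next;
};

struct toy_cputimers_work {
    struct toy_work work;
    bool scheduled;
};

struct toy_task {
    struct toy_work *task_works;              /* pending task work list */
    struct toy_cputimers_work cputimers_work;
    unsigned int expirations;                 /* expirations delivered */
};

/* Timer interrupt: queue the work unless it is already marked scheduled. */
bool toy_timer_interrupt(struct toy_task *t)
{
    if (t->cputimers_work.scheduled)
        return false;                         /* scheduling is skipped */
    t->cputimers_work.scheduled = true;
    t->cputimers_work.work.next = t->task_works;
    t->task_works = &t->cputimers_work.work;
    return true;
}

/* Return-to-user path: run queued work, which clears the flag. */
void toy_run_task_works(struct toy_task *t)
{
    while (t->task_works) {
        struct toy_work *w = t->task_works;

        t->task_works = w->next;
        if (w == &t->cputimers_work.work) {
            t->cputimers_work.scheduled = false;
            t->expirations++;
        }
    }
}

/* Fork: copy the parent as-is, clear task_works, optionally apply the fix. */
void toy_copy_process(struct toy_task *child, const struct toy_task *parent,
                      bool apply_fix)
{
    *child = *parent;                         /* dup_task_struct() analogue */
    child->task_works = NULL;                 /* step 4 in the flow above */
    child->expirations = 0;
    if (apply_fix) {
        child->cputimers_work.scheduled = false;
        child->cputimers_work.work.next = NULL;
    }
}

/* Walks through steps 1-5, first without and then with the fix. */
void toy_cputimers_demo(void)
{
    struct toy_task parent = {0}, child;

    assert(toy_timer_interrupt(&parent));     /* step 2: work queued on Task 1 */

    toy_copy_process(&child, &parent, false); /* steps 3-4, no fix */
    assert(!toy_timer_interrupt(&child));     /* step 5: skipped forever */
    toy_run_task_works(&child);
    assert(child.expirations == 0);

    toy_copy_process(&child, &parent, true);  /* same fork with the fix */
    assert(toy_timer_interrupt(&child));
    toy_run_task_works(&child);
    assert(child.expirations == 1);
}
```

Calling `toy_cputimers_demo()` from a `main()` exercises both paths; the assertions encode the expectation that the unfixed child never sees an expiration.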
Fixes:
|
||
|
|
3869eecf05 |
kernel/sched: Fix sched_fork() access an invalid sched_task_group
[ Upstream commit 4ef0c5c6b5ba1f38f0ea1cedad0cad722f00c14a ]
There is a small race between copy_process() and sched_fork()
where child->sched_task_group points to an already freed task group.
      parent doing fork()        | someone moving the parent
                                 | to another cgroup
  -------------------------------+-------------------------------
  copy_process()
      + dup_task_struct()<1>
                                  parent move to another cgroup,
                                  and free the old cgroup. <2>
      + sched_fork()
        + __set_task_cpu()<3>
        + task_fork_fair()
          + sched_slice()<4>
In the worst case, this bug can lead to a use-after-free and
cause a panic, as shown above:
(1) the parent copies its sched_task_group to the child at <1>;
(2) someone moves the parent to another cgroup and frees the old
cgroup at <2>;
(3) the sched_task_group and cfs_rq that belong to the old cgroup
are accessed at <3> and <4>, which causes a panic:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
[] Oops: 0000 [#1] SMP PTI
[] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G OE --------- - - 4.18.0.x86_64+ #1
[] RIP: 0010:sched_slice+0x84/0xc0
[] Call Trace:
[] task_fork_fair+0x81/0x120
[] sched_fork+0x132/0x240
[] copy_process.part.5+0x675/0x20e0
[] ? __handle_mm_fault+0x63f/0x690
[] _do_fork+0xcd/0x3b0
[] do_syscall_64+0x5d/0x1d0
[] entry_SYSCALL_64_after_hwframe+0x65/0xca
[] RIP: 0033:0x7f04418cd7e1
Between cgroup_can_fork() and cgroup_post_fork(), the cgroup
membership and thus sched_task_group can't change. So update the child's
sched_task_group at sched_post_fork() and move task_fork() and
__set_task_cpu() (which access the sched_task_group) from sched_fork()
to sched_post_fork().
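The ordering problem can be illustrated with a user-space model. This is a sketch under loud assumptions: all `fk_*` names are hypothetical, and a group's `alive` flag stands in for freeing the old cgroup so that the stale access can be detected instead of actually touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a task group and a task. */
struct fk_group {
    bool alive;     /* false once the group would have been freed */
    int weight;
};

struct fk_task {
    struct fk_group *group;
};

/* <1> dup_task_struct() analogue: the child inherits the group pointer. */
void fk_dup_task(struct fk_task *child, const struct fk_task *parent)
{
    *child = *parent;
}

/* <2> Migration: the parent moves and the old group is released. */
void fk_move_parent(struct fk_task *parent, struct fk_group *to)
{
    parent->group->alive = false;   /* models freeing the old cgroup */
    parent->group = to;
}

/* <3>/<4> analogue: returns -1 where real code would read freed memory. */
int fk_group_weight(const struct fk_task *t)
{
    return t->group->alive ? t->group->weight : -1;
}

/*
 * Fixed ordering: refresh the child's group from the membership that is
 * stable at this point, and only then touch it.
 */
int fk_sched_post_fork(struct fk_task *child, struct fk_group *stable_group)
{
    child->group = stable_group;
    return fk_group_weight(child);
}

void fk_sched_fork_demo(void)
{
    struct fk_group old_grp = { true, 1024 }, new_grp = { true, 512 };
    struct fk_task parent = { &old_grp }, child;

    fk_dup_task(&child, &parent);                 /* <1> */
    fk_move_parent(&parent, &new_grp);            /* <2> */
    assert(fk_group_weight(&child) == -1);        /* old ordering: stale */
    assert(fk_sched_post_fork(&child, parent.group) == 512);
}
```

The model only shows why the access has to happen after the group pointer is refreshed; it does not reproduce the locking that makes the membership stable in the kernel.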
Fixes:
|
||
|
|
bc70904edc |
ANDROID: sched: Add vendor hooks for sched.
Add vendor hooks in scheduler to support OEM's value adds. Bug: 183674818 Signed-off-by: lijianzhong <lijianzhong@xiaomi.com> Change-Id: I8415958749948b3702e411f835c227ad4f8d8e92 Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org> |
||
|
|
e74ef7cf8f |
Merge tag 'v5.15-rc1' into android-mainline
Linux 5.15-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ib4933c598d27b18268860e7549966ef7724652fc |
||
|
|
c0d1ebaba1 |
Merge 2d338201d5 ("Merge branch 'akpm' (patches from Andrew)") into android-mainline
Steps on the way to 5.15-rc1 Resolves merge conflict in: fs/proc/base.c Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic554ca8447961e52fbc6f27d91470a816b59a771 |
||
|
|
c5cd945b24 |
Merge fd47ff55c9 ("Merge tag 'usb-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb") into android-mainline
Steps on the way to 5.15-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I42ffa8818bbb2072f043923553c4d8f91d9647a5 |
||
|
|
13db8c5047 |
mm/hugetlb: initialize hugetlb_usage in mm_init
After fork, the child process will get incorrect (2x) hugetlb_usage. If
a process uses 5 2MB hugetlb pages in an anonymous mapping,
HugetlbPages: 10240 kB
and then forks, the child will show,
HugetlbPages: 20480 kB
The amount is doubled because hugetlb_usage is copied from the parent
and then increased again when the page tables are copied from parent
to child, so the child ends up with 2x the actual usage.
Fix this by adding hugetlb_count_init in mm_init.
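The arithmetic can be checked with a small user-space model. This is a sketch only: the `toy_*` names are hypothetical, and the counter reset merely stands in for what hugetlb_count_init does in mm_init.

```c
#include <assert.h>

#define TOY_HPAGE_KB 2048L   /* one 2 MB huge page, in kB */

/* Hypothetical stand-in for the per-mm hugetlb usage counter. */
struct toy_mm {
    long hugetlb_pages;
};

/* Value that would be reported as HugetlbPages, in kB. */
long toy_hugetlb_kb(const struct toy_mm *mm)
{
    return mm->hugetlb_pages * TOY_HPAGE_KB;
}

/*
 * Fork analogue: copy the mm as-is, optionally reset the counter,
 * then account every huge page again while copying page tables.
 */
void toy_dup_mm(struct toy_mm *child, const struct toy_mm *parent,
                int reset_counter)
{
    long mapped = parent->hugetlb_pages;

    *child = *parent;                   /* counter is inherited */
    if (reset_counter)
        child->hugetlb_pages = 0;       /* models the counter init */
    child->hugetlb_pages += mapped;     /* page table copy accounting */
}

void toy_hugetlb_demo(void)
{
    struct toy_mm parent = { 5 }, child;

    assert(toy_hugetlb_kb(&parent) == 10240);

    toy_dup_mm(&child, &parent, 0);     /* without the fix */
    assert(toy_hugetlb_kb(&child) == 20480);

    toy_dup_mm(&child, &parent, 1);     /* with the fix */
    assert(toy_hugetlb_kb(&child) == 10240);
}
```

The two asserted child values match the 20480 kB and 10240 kB figures quoted in the message for five 2 MB pages.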
Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
Fixes:
|
||
|
|
2d338201d5 |
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
"147 patches, based on
|
||
|
|
05da8113c9 |
kernel/fork.c: unexport get_{mm,task}_exe_file
Only used by core code and by tomoyo, which can't be a module either. Link: https://lkml.kernel.org/r/20210820095430.445242-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
bc2f6edebd |
Merge 9e9fb7655e ("Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next") into android-mainline
Steps on the way to 5.15-rc1 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I49577d606b2710975407eae3fee60bc331397810 |
||
|
|
49624efa65 |
Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux
Pull MAP_DENYWRITE removal from David Hildenbrand:
"Remove all in-tree usage of MAP_DENYWRITE from the kernel and remove
VM_DENYWRITE.
There are some (minor) user-visible changes:
- We no longer deny write access to shared libraries loaded via legacy
uselib(); this behavior matches modern user space e.g. dlopen().
- We no longer deny write access to the elf interpreter after exec
completed, treating it just like shared libraries (which it often
is).
- We always deny write access to the file linked via /proc/pid/exe:
sys_prctl(PR_SET_MM_MAP/EXE_FILE) will fail if write access to the
file cannot be denied, and write access to the file will remain
denied until the link is effectively gone (exec, termination,
sys_prctl(PR_SET_MM_MAP/EXE_FILE)) -- just as if exec'ing the file.
Cross-compiled for a bunch of architectures (alpha, microblaze, i386,
s390x, ...) and verified via ltp that especially the relevant tests
(i.e., creat07 and execve04) continue working as expected"
* tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux:
fs: update documentation of get_write_access() and friends
mm: ignore MAP_DENYWRITE in ksys_mmap_pgoff()
mm: remove VM_DENYWRITE
binfmt: remove in-tree usage of MAP_DENYWRITE
kernel/fork: always deny write access to current MM exe_file
kernel/fork: factor out replacing the current MM exe_file
binfmt: don't use MAP_DENYWRITE when loading shared libraries via uselib()
|
||
|
|
8d0920bde5 |
mm: remove VM_DENYWRITE
All in-tree users of MAP_DENYWRITE are gone. MAP_DENYWRITE cannot be set from user space, so all users are gone; let's remove it. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: David Hildenbrand <david@redhat.com> |
||
|
|
fe69d560b5 |
kernel/fork: always deny write access to current MM exe_file
We want to remove VM_DENYWRITE, which is currently only used when mapping the executable during exec. During exec, we already deny_write_access() the executable; however, after exec completes, the VMAs mapped with VM_DENYWRITE effectively keep write access denied via deny_write_access(). Let's deny write access when setting or replacing the MM exe_file. With this change, we can remove VM_DENYWRITE for mapping executables. Make set_mm_exe_file() return an error in case deny_write_access() fails; note that this should never happen, because exec code does a deny_write_access() early and keeps write access denied when calling set_mm_exe_file(). However, it makes the code easier to read and makes set_mm_exe_file() and replace_mm_exe_file() look more similar. This represents a minor user space visible change: sys_prctl(PR_SET_MM_MAP/EXE_FILE) can now fail if the file is already opened writable. Also, after sys_prctl(PR_SET_MM_MAP/EXE_FILE) the file cannot be opened writable. Note that we can already fail with -EACCES if the file doesn't have execute permissions. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: David Hildenbrand <david@redhat.com> |
||
|
|
35d7bdc860 |
kernel/fork: factor out replacing the current MM exe_file
Let's factor the main logic out into replace_mm_exe_file(), such that all mm->exe_file logic is contained in kernel/fork.c. While at it, perform some simple cleanups that are possible now that we're simplifying the individual functions. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: David Hildenbrand <david@redhat.com> |
||
|
|
506f6a3567 |
Merge 5d3c0db459 ("Merge tag 'sched-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") into android-mainline
Steps on the way to 5.15-rc1. Resolves merge conflicts with: kernel/fork.c kernel/sched/core.c Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I73e7bfc310639dae4ca5df7b63d47fa32a171760 |
||
|
|
571976d07a |
Merge tag 'v5.14' into android-mainline
Linux 5.14 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ie2edcf6ad1f7d17ffbe5322bf2378ff01be31e64 |
||
|
|
9e9fb7655e |
Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Enable memcg accounting for various networking objects.
BPF:
- Introduce bpf timers.
- Add perf link and opaque bpf_cookie which the program can read out
again, to be used in libbpf-based USDT library.
- Add bpf_task_pt_regs() helper to access user space pt_regs in
kprobes, to help user space stack unwinding.
- Add support for UNIX sockets for BPF sockmap.
- Extend BPF iterator support for UNIX domain sockets.
- Allow BPF TCP congestion control progs and bpf iterators to call
bpf_setsockopt(), e.g. to switch to another congestion control
algorithm.
Protocols:
- Support IOAM Pre-allocated Trace with IPv6.
- Support Management Component Transport Protocol.
- bridge: multicast: add vlan support.
- netfilter: add hooks for the SRv6 lightweight tunnel driver.
- tcp:
- enable mid-stream window clamping (by user space or BPF)
- allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
- more accurate DSACK processing for RACK-TLP
- mptcp:
- add full mesh path manager option
- add partial support for MP_FAIL
- improve use of backup subflows
- optimize option processing
- af_unix: add OOB notification support.
- ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the
router.
- mac80211: Target Wake Time support in AP mode.
- can: j1939: extend UAPI to notify about RX status.
Driver APIs:
- Add page frag support in page pool API.
- Many improvements to the DSA (distributed switch) APIs.
- ethtool: extend IRQ coalesce uAPI with timer reset modes.
- devlink: control which auxiliary devices are created.
- Support CAN PHYs via the generic PHY subsystem.
- Proper cross-chip support for tag_8021q.
- Allow TX forwarding for the software bridge data path to be
offloaded to capable devices.
Drivers:
- veth: more flexible channels number configuration.
- openvswitch: introduce per-cpu upcall dispatch.
- Add internet mix (IMIX) mode to pktgen.
- Transparently handle XDP operations in the bonding driver.
- Add LiteETH network driver.
- Renesas (ravb):
- support Gigabit Ethernet IP
- NXP Ethernet switch (sja1105):
- fast aging support
- support for "H" switch topologies
- traffic termination for ports under VLAN-aware bridge
- Intel 1G Ethernet
- support getcrosststamp() with PCIe PTM (Precision Time
Measurement) for better time sync
- support Credit-Based Shaper (CBS) offload, enabling HW traffic
prioritization and bandwidth reservation
- Broadcom Ethernet (bnxt)
- support pulse-per-second output
- support larger Rx rings
- Mellanox Ethernet (mlx5)
- support ethtool RSS contexts and MQPRIO channel mode
- support LAG offload with bridging
- support devlink rate limit API
- support packet sampling on tunnels
- Huawei Ethernet (hns3):
- basic devlink support
- add extended IRQ coalescing support
- report extended link state
- Netronome Ethernet (nfp):
- add conntrack offload support
- Broadcom WiFi (brcmfmac):
- add WPA3 Personal with FT to supported cipher suites
- support 43752 SDIO device
- Intel WiFi (iwlwifi):
- support scanning hidden 6GHz networks
- support for a new hardware family (Bz)
- Xen pv driver:
- harden netfront against malicious backends
- Qualcomm mobile
- ipa: refactor power management and enable automatic suspend
- mhi: move MBIM to WWAN subsystem interfaces
Refactor:
- Ambient BPF run context and cgroup storage cleanup.
- Compat rework for ndo_ioctl.
Old code removal:
- prism54 remove the obsoleted driver, deprecated by the p54 driver.
- wan: remove sbni/granch driver"
* tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits)
net: Add depends on OF_NET for LiteX's LiteETH
ipv6: seg6: remove duplicated include
net: hns3: remove unnecessary spaces
net: hns3: add some required spaces
net: hns3: clean up a type mismatch warning
net: hns3: refine function hns3_set_default_feature()
ipv6: remove duplicated 'net/lwtunnel.h' include
net: w5100: check return value after calling platform_get_resource()
net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx()
net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()
net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()
fou: remove sparse errors
ipv4: fix endianness issue in inet_rtm_getroute_build_skb()
octeontx2-af: Set proper errorcode for IPv4 checksum errors
octeontx2-af: Fix static code analyzer reported issues
octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg
octeontx2-af: Fix loop in free and unmap counter
af_unix: fix potential NULL deref in unix_dgram_connect()
dpaa2-eth: Replace strlcpy with strscpy
octeontx2-af: Use NDC TX for transmit packet data
...
|
||
|
|
5d3c0db459 |
Merge tag 'sched-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- The biggest change in this cycle is scheduler support for asymmetric
scheduling affinity, to support the execution of legacy 32-bit tasks on
AArch32 systems that also have 64-bit-only CPUs.
Architectures can fill in this functionality by defining their own
task_cpu_possible_mask(p). When this is done, the scheduler will make
sure the task will only be scheduled on CPUs that support it.
(The actual arm64 specific changes are not part of this tree.)
For other architectures there will be no change in functionality.
- Add cgroup SCHED_IDLE support
- Increase node-distance flexibility & delay determining it until a CPU
is brought online. (This enables platforms where node distance isn't
final until the CPU is online.)
- Deadline scheduler enhancements & fixes
- Misc fixes & cleanups.
* tag 'sched-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
eventfd: Make signal recursion protection a task bit
sched/fair: Mark tg_is_idle() an inline in the !CONFIG_FAIR_GROUP_SCHED case
sched: Introduce dl_task_check_affinity() to check proposed affinity
sched: Allow task CPU affinity to be restricted on asymmetric systems
sched: Split the guts of sched_setaffinity() into a helper function
sched: Introduce task_struct::user_cpus_ptr to track requested affinity
sched: Reject CPU affinity changes based on task_cpu_possible_mask()
cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq()
cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()
cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1
sched: Introduce task_cpu_possible_mask() to limit fallback rq selection
sched: Cgroup SCHED_IDLE support
sched/topology: Skip updating masks for non-online nodes
sched: Replace deprecated CPU-hotplug functions.
sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
sched: Fix UCLAMP_FLAG_IDLE setting
sched/deadline: Fix missing clock update in migrate_task_rq_dl()
sched/fair: Avoid a second scan of target in select_idle_cpu
sched/fair: Use prev instead of new target as recent_used_cpu
sched: Don't report SCHED_FLAG_SUGOV in sched_getattr()
... |
||
|
|
97c78d0af5 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
5ddf994fa2 |
ucounts: Fix regression preventing increasing of rlimits in init_user_ns
"Ma, XinjianX" <xinjianx.ma@intel.com> reported:
> When lkp team run kernel selftests, we found after these series of patches, testcase mqueue: mq_perf_tests
> in kselftest failed with following message.
>
> # selftests: mqueue: mq_perf_tests
> #
> # Initial system state:
> # Using queue path: /mq_perf_tests
> # RLIMIT_MSGQUEUE(soft): 819200
> # RLIMIT_MSGQUEUE(hard): 819200
> # Maximum Message Size: 8192
> # Maximum Queue Size: 10
> # Nice value: 0
> #
> # Adjusted system state for testing:
> # RLIMIT_MSGQUEUE(soft): (unlimited)
> # RLIMIT_MSGQUEUE(hard): (unlimited)
> # Maximum Message Size: 16777216
> # Maximum Queue Size: 65530
> # Nice value: -20
> # Continuous mode: (disabled)
> # CPUs to pin: 3
> # ./mq_perf_tests: mq_open() at 296: Too many open files
> not ok 2 selftests: mqueue: mq_perf_tests # exit=1
>
> Test env:
> rootfs: debian-10
> gcc version: 9
After investigation the problem turned out to be that ucount_max for
the rlimits in init_user_ns was being set to the initial rlimit value.
The practical problem is that ucount_max provides a limit that
applications inside the user namespace can not exceed. Which means in
practice that rlimits that have been converted to use the ucount
infrastructure were not able to exceed their initial rlimits.
Solve this by setting the relevant values of ucount_max to
RLIM_INFINITY. A limit in init_user_ns is pointless so the code
should allow the values to grow as large as possible without risking
an underflow or an overflow.
As the ltp test case was a bit of a pain I have reproduced the rlimit failure
and tested the fix with the following little C program:
> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <mqueue.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <string.h>
> #include <stdlib.h>
> #include <limits.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
> 	struct mq_attr mq_attr;
> 	struct rlimit rlim;
> 	mqd_t mqd;
> 	int ret;
>
> 	ret = getrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "getrlimit(RLIMIT_MSGQUEUE) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	printf("RLIMIT_MSGQUEUE %lu %lu\n",
> 	       rlim.rlim_cur, rlim.rlim_max);
> 	rlim.rlim_cur = RLIM_INFINITY;
> 	rlim.rlim_max = RLIM_INFINITY;
> 	ret = setrlimit(RLIMIT_MSGQUEUE, &rlim);
> 	if (ret != 0) {
> 		fprintf(stderr, "setrlimit(RLIMIT_MSGQUEUE, RLIM_INFINITY) failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	memset(&mq_attr, 0, sizeof(struct mq_attr));
> 	mq_attr.mq_maxmsg = 65536 - 1;
> 	mq_attr.mq_msgsize = 16*1024*1024 - 1;
>
> 	mqd = mq_open("/mq_rlimit_test", O_RDONLY|O_CREAT, 0600, &mq_attr);
> 	if (mqd == (mqd_t)-1) {
> 		fprintf(stderr, "mq_open failed: %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
> 	ret = mq_close(mqd);
> 	if (ret) {
> 		fprintf(stderr, "mq_close failed; %s\n", strerror(errno));
> 		exit(EXIT_FAILURE);
> 	}
>
> 	return EXIT_SUCCESS;
> }
Fixes: |