kernel_arpi

Author	SHA1	Message	Date
Greg Kroah-Hartman	5e713c48ff	Merge 5.4.35 into android-5.4-stable Changes in 5.4.35 ext4: use non-movable memory for superblock readahead watchdog: sp805: fix restart handler xsk: Fix out of boundary write in __xsk_rcv_memcpy arm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0 arm, bpf: Fix offset overflow for BPF_MEM BPF_DW objtool: Fix switch table detection in .text.unlikely scsi: sg: add sg_remove_request in sg_common_write ALSA: hda: Honor PM disablement in PM freeze and thaw_noirq ops ARM: dts: imx6: Use gpc for FEC interrupt controller to fix wake on LAN. kbuild, btf: Fix dependencies for DEBUG_INFO_BTF netfilter: nf_tables: report EOPNOTSUPP on unsupported flags/object type irqchip/mbigen: Free msi_desc on device teardown ALSA: hda: Don't release card at firmware loading error xsk: Add missing check on user supplied headroom size of: unittest: kmemleak on changeset destroy of: unittest: kmemleak in of_unittest_platform_populate() of: unittest: kmemleak in of_unittest_overlay_high_level() of: overlay: kmemleak in dup_and_fixup_symbol_prop() x86/Hyper-V: Unload vmbus channel in hv panic callback x86/Hyper-V: Trigger crash enlightenment only once during system crash. 
x86/Hyper-V: Report crash register data or kmsg before running crash kernel x86/Hyper-V: Report crash register data when sysctl_record_panic_msg is not set x86/Hyper-V: Report crash data in die() when panic_on_oops is set afs: Fix missing XDR advance in xdr_decode_{AFS,YFS}FSFetchStatus() afs: Fix decoding of inline abort codes from version 1 status records afs: Fix rename operation status delivery afs: Fix afs_d_validate() to set the right directory version afs: Fix race between post-modification dir edit and readdir/d_revalidate block, bfq: turn put_queue into release_process_ref in __bfq_bic_change_cgroup block, bfq: make reparent_leaf_entity actually work only on leaf entities block, bfq: invoke flush_idle_tree after reparent_active_queues in pd_offline rbd: avoid a deadlock on header_rwsem when flushing notifies rbd: call rbd_dev_unprobe() after unwatching and flushing notifies x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump drm/ttm: flush the fence on the bo after we individualize the reservation object clk: Don't cache errors from clk_ops::get_phase() clk: at91: usb: continue if clk_hw_round_rate() return zero net/mlx5e: Enforce setting of a single FEC mode f2fs: fix the panic in do_checkpoint() ARM: dts: rockchip: fix vqmmc-supply property name for rk3188-bqedison2qc arm64: dts: allwinner: a64: Fix display clock register range power: supply: bq27xxx_battery: Silence deferred-probe error clk: tegra: Fix Tegra PMC clock out parents arm64: tegra: Add PCIe endpoint controllers nodes for Tegra194 arm64: tegra: Fix Tegra194 PCIe compatible string arm64: dts: clearfog-gt-8k: set gigabit PHY reset deassert delay soc: imx: gpc: fix power up sequencing dma-coherent: fix integer overflow in the reserved-memory dma allocation rtc: 88pm860x: fix possible race condition NFS: alloc_nfs_open_context() must use the file cred when available NFSv4/pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid() NFSv4.2: error out when relink swapfile ARM: 
dts: rockchip: fix lvds-encoder ports subnode for rk3188-bqedison2qc KVM: PPC: Book3S HV: Fix H_CEDE return code for nested guests f2fs: fix to show norecovery mount option phy: uniphier-usb3ss: Add Pro5 support NFS: direct.c: Fix memory leak of dreq when nfs_get_lock_context fails f2fs: Fix mount failure due to SPO after a successful online resize FS f2fs: Add a new CP flag to help fsck fix resize SPO issues s390/cpuinfo: fix wrong output when CPU0 is offline hibernate: Allow uswsusp to write to swap btrfs: add RCU locks around block group initialization powerpc/prom_init: Pass the "os-term" message to hypervisor powerpc/maple: Fix declaration made after definition s390/cpum_sf: Fix wrong page count in error message ext4: do not commit super on read-only bdev um: ubd: Prevent buffer overrun on command completion cifs: Allocate encryption header through kmalloc mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS drm/nouveau/svm: check for SVM initialized before migrating drm/nouveau/svm: fix vma range check for migration include/linux/swapops.h: correct guards for non_swap_entry() percpu_counter: fix a data race at vm_committed_as compiler.h: fix error in BUILD_BUG_ON() reporting KVM: s390: vsie: Fix possible race when shadowing region 3 tables drm/nouveau: workaround runpm fail by disabling PCI power management on certain intel bridges leds: core: Fix warning message when init_data x86: ACPI: fix CPU hotplug deadlock csky: Fixup cpu speculative execution to IO area drm/amdkfd: kfree the wrong pointer NFS: Fix memory leaks in nfs_pageio_stop_mirroring() csky: Fixup get wrong psr value from phyical reg f2fs: fix NULL pointer dereference in f2fs_write_begin() ACPICA: Fixes for acpiExec namespace init file um: falloc.h needs to be directly included for older libc drm/vc4: Fix HDMI mode validation iommu/virtio: Fix freeing of incomplete domains iommu/vt-d: Fix mm reference leak SUNRPC: fix krb5p mount to provide large enough buffer in rq_rcvsize ext2: 
fix empty body warnings when -Wextra is used iommu/vt-d: Silence RCU-list debugging warning in dmar_find_atsr() iommu/vt-d: Fix page request descriptor size ext2: fix debug reference to ext2_xattr_cache sunrpc: Fix gss_unwrap_resp_integ() again csky: Fixup init_fpu compile warning with __init power: supply: axp288_fuel_gauge: Broaden vendor check for Intel Compute Sticks. libnvdimm: Out of bounds read in __nd_ioctl() iommu/amd: Fix the configuration of GCR3 table root pointer f2fs: fix to wait all node page writeback drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during init net: dsa: bcm_sf2: Fix overflow checks dma-debug: fix displaying of dma allocation type fbdev: potential information leak in do_fb_ioctl() ARM: dts: sunxi: Fix DE2 clocks register range iio: si1133: read 24-bit signed integer for measurement fbmem: Adjust indentation in fb_prepare_logo and fb_blank tty: evh_bytechan: Fix out of bounds accesses locktorture: Print ratio of acquisitions, not failures mtd: rawnand: free the nand_device object mtd: spinand: Explicitly use MTD_OPS_RAW to write the bad block marker to OOB docs: Fix path to MTD command line partition parser mtd: lpddr: Fix a double free in probe() mtd: phram: fix a double free issue in error path KEYS: Don't write out to userspace while holding key semaphore bpf: fix buggy r0 retval refinement for tracing helpers bpf: Test_verifier, bpf_get_stack return value add <0 bpf: Test_progs, add test to catch retval refine error handling bpf, test_verifier: switch bpf_get_stack's 0 s> r8 test Linux 5.4.35 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I702aba533097c8533c12561c7f1a51f3a96f6f09	2020-04-23 11:15:10 +02:00
Daniel Borkmann	3bd5bcafbb	bpf: fix buggy r0 retval refinement for tracing helpers [ no upstream commit ] See the glory details in `100605035e` ("bpf: Verifier, do_refine_retval_range may clamp umin to 0 incorrectly") for why `849fa50662` ("bpf/verifier: refine retval R0 state for bpf_get_stack helper") is buggy. The whole series however is not suitable for stable since it adds a significant amount [0] of verifier complexity in order to add 32bit subreg tracking. Something simpler is needed. Unfortunately, reverting `849fa50662` ("bpf/verifier: refine retval R0 state for bpf_get_stack helper") or just cherry-picking `100605035e` ("bpf: Verifier, do_refine_retval_range may clamp umin to 0 incorrectly") is not an option since it will break existing tracing programs badly (at least those that are using bpf_get_stack() and bpf_probe_read_str() helpers). Not fixing it in stable is also not an option since on 4.19 kernels an error will cause a soft-lockup due to hitting dead-code sanitized branch since we don't hard-wire such branches in old kernels yet. But even then for 5.x `849fa50662` ("bpf/verifier: refine retval R0 state for bpf_get_stack helper") would cause wrong bounds on the verifier simulation when an error is hit. In one of the earlier iterations of mentioned patch series for upstream there was the concern that just using smax_value in do_refine_retval_range() would nuke bounds by subsequent <<32 >>32 shifts before the comparison against 0 [1] which eventually led to the 32bit subreg tracking in the first place. While I initially went for implementing the idea [1] to pattern match the two shift operations, it turned out to be more complex than actually needed, meaning, we could simply treat do_refine_retval_range() similarly to how we branch off verification for conditionals or under speculation, that is, pushing a new reg state to the stack for later verification.
This means, instead of verifying the current path with the ret_reg in [S32MIN, msize_max_value] interval where later bounds would get nuked, we split this into two: i) for the success case where ret_reg can be in [0, msize_max_value], and ii) for the error case with ret_reg known to be in interval [S32MIN, -1]. Latter will preserve the bounds during these shift patterns and can match reg < 0 test. test_progs also succeed with this approach. [0] https://lore.kernel.org/bpf/158507130343.15666.8018068546764556975.stgit@john-Precision-5820-Tower/ [1] https://lore.kernel.org/bpf/158015334199.28573.4940395881683556537.stgit@john-XPS-13-9370/T/#m2e0ad1d5949131014748b6daa48a3495e7f0456d Fixes: `849fa50662` ("bpf/verifier: refine retval R0 state for bpf_get_stack helper") Reported-by: Lorenzo Fontana <fontanalorenz@gmail.com> Reported-by: Leonardo Di Donato <leodidonato@gmail.com> Reported-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:45 +02:00
Paul E. McKenney	0c72ec11d8	locktorture: Print ratio of acquisitions, not failures commit `80c503e0e6` upstream. The __torture_print_stats() function in locktorture.c carefully initializes local variable "min" to statp[0].n_lock_acquired, but then compares it to statp[i].n_lock_fail. Given that the .n_lock_fail field should normally be zero, and given the initialization, it seems reasonable to display the maximum and minimum number of acquisitions instead of miscomputing the maximum and minimum number of failures. This commit therefore switches from failures to acquisitions. And this turns out to be not only a day-zero bug, but entirely my own fault. I hate it when that happens! Fixes: `0af3fe1efa` ("locktorture: Add a lock-torture kernel module") Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:44 +02:00
Grygorii Strashko	f093874687	dma-debug: fix displaying of dma allocation type commit `9bb50ed747` upstream. The commit `2e05ea5cdc` ("dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs") removed "dma_debug_page" enum, but missed to update type2name string table. This causes incorrect displaying of dma allocation type. Fix it by removing "page" string from type2name string table and switch to use named initializers. Before (dma_alloc_coherent()): k3-ringacc 4b800000.ringacc: scather-gather idx 2208 P=d1140000 N=d114 D=d1140000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable k3-ringacc 4b800000.ringacc: scather-gather idx 2216 P=d1150000 N=d115 D=d1150000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable After: k3-ringacc 4b800000.ringacc: coherent idx 2208 P=d1140000 N=d114 D=d1140000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable k3-ringacc 4b800000.ringacc: coherent idx 2216 P=d1150000 N=d115 D=d1150000 L=40 DMA_BIDIRECTIONAL dma map error check not applicable Fixes: `2e05ea5cdc` ("dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:43 +02:00
Kevin Grandemange	56aaa0e8c9	dma-coherent: fix integer overflow in the reserved-memory dma allocation [ Upstream commit `286c21de32` ] pageno is an int and the PAGE_SHIFT shift is done on an int, overflowing if the memory is bigger than 2G. This can be reproduced using, for example, a reserved-memory of 4G: reserved-memory { #address-cells = <2>; #size-cells = <2>; ranges; reserved_dma: buffer@0 { compatible = "shared-dma-pool"; no-map; reg = <0x5 0x00000000 0x1 0x0>; }; }; Signed-off-by: Kevin Grandemange <kevin.grandemange@allegrodvt.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-23 10:36:30 +02:00
Greg Kroah-Hartman	265e61b656	Merge 5.4.34 into android-5.4-stable Changes in 5.4.34 amd-xgbe: Use __napi_schedule() in BH context hsr: check protocol version in hsr_newlink() l2tp: Allow management of tunnels and session in user namespace net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin net: ipv6: do not consider routes via gateways for anycast address check net: phy: micrel: use genphy_read_status for KSZ9131 net: qrtr: send msgs from local of same id as broadcast net: revert default NAPI poll timeout to 2 jiffies net: tun: record RX queue in skb before do_xdp_generic() net: dsa: mt7530: move mt7623 settings out off the mt7530 net: ethernet: mediatek: move mt7623 settings out off the mt7530 net/mlx5: Fix frequent ioread PCI access during recovery net/mlx5e: Add missing release firmware call net/mlx5e: Fix nest_level for vlan pop action net/mlx5e: Fix pfnum in devlink port attribute net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes Revert "ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add()" ovl: fix value of i_ino for lower hardlink corner case scsi: ufs: Fix ufshcd_hold() caused scheduling while atomic platform/chrome: cros_ec_rpmsg: Fix race with host event jbd2: improve comments about freeing data buffers whose page mapping is NULL acpi/nfit: improve bounds checking for 'func' perf report: Fix no branch type statistics report issue pwm: pca9685: Fix PWM/GPIO inter-operation net/bpfilter: remove superfluous testing message ext4: fix incorrect group count in ext4_fill_super error message ext4: fix incorrect inodes per group in error message clk: at91: sam9x60: fix usb clock parents clk: at91: usb: use proper usbs_mask ARM: dts: imx7-colibri: fix muxing of usbc_det pin arm64: dts: librem5-devkit: add a vbus supply to usb0 usb: dwc3: gadget: Don't clear flags before transfer ended ASoC: Intel: mrfld: fix incorrect check on p->sink ASoC: Intel: mrfld: return error 
codes when an error occurs ALSA: hda/realtek - Enable the headset mic on Asus FX505DT ALSA: usb-audio: Filter error from connector kctl ops, too ALSA: usb-audio: Don't override ignore_ctl_error value from the map ALSA: usb-audio: Don't create jack controls for PCM terminals ALSA: usb-audio: Check mapping at creating connector controls, too arm64: vdso: don't free unallocated pages keys: Fix proc_keys_next to increase position index tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation btrfs: check commit root generation in should_ignore_root nl80211: fix NL80211_ATTR_FTM_RESPONDER policy mac80211: fix race in ieee80211_register_hw() mac80211_hwsim: Use kstrndup() in place of kasprintf() net/mlx5e: Encapsulate updating netdev queues into a function net/mlx5e: Rename hw_modify to preactivate net/mlx5e: Use preactivate hook to set the indirection table drm/amd/powerplay: force the trim of the mclk dpm_levels if OD is enabled drm/amdgpu: fix the hw hang during perform system reboot and reset i2c: designware: platdrv: Remove DPM_FLAG_SMART_SUSPEND flag on BYT and CHT ext4: do not zeroout extents beyond i_disksize irqchip/ti-sci-inta: Fix processing of masked irqs x86/resctrl: Preserve CDP enable over CPU hotplug x86/resctrl: Fix invalid attempt at removing the default resource group scsi: target: remove boilerplate code scsi: target: fix hang when multiple threads try to destroy the same iscsi session x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE Linux 5.4.34 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ice46175a7478217e00e649fa26ee8631985746f1	2020-04-21 10:22:32 +02:00
Xiao Yang	0026e356e5	tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation commit `0bbe7f7199` upstream. A traced event can trigger the 'snapshot' operation (i.e. call snapshot_trigger() or snapshot_count_trigger()) when register_snapshot_trigger() has completed registration but has not yet allocated the buffer for the 'snapshot' event trigger. In this rare case, the 'snapshot' operation always detects the lack of an allocated buffer, so make register_snapshot_trigger() allocate the buffer first. trigger-snapshot.tc in kselftest reproduces the issue on a slow VM: ----------------------------------------------------------- cat trace ... ftracetest-3028 [002] .... 236.784290: sched_process_fork: comm=ftracetest pid=3028 child_comm=ftracetest child_pid=3036 <...>-2875 [003] .... 240.460335: tracing_snapshot_instance_cond: * SNAPSHOT NOT ALLOCATED * <...>-2875 [003] .... 240.460338: tracing_snapshot_instance_cond: * stopping trace here! * ----------------------------------------------------------- Link: http://lkml.kernel.org/r/20200414015145.66236-1-yangx.jy@cn.fujitsu.com Cc: stable@vger.kernel.org Fixes: `93e31ffbf4` ("tracing: Add 'snapshot' event trigger command") Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:59 +02:00
Greg Kroah-Hartman	a9372c6b57	Merge 5.4.33 into android-5.4-stable Changes in 5.4.33 ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads ARM: dts: Fix dm814x Ethernet by changing to use rgmii-id mode bpf: Fix deadlock with rq_lock in bpf_send_signal() iwlwifi: mvm: Fix rate scale NSS configuration Input: tm2-touchkey - add support for Coreriver TC360 variant soc: fsl: dpio: register dpio irq handlers after dpio create rxrpc: Abstract out the calculation of whether there's Tx space rxrpc: Fix call interruptibility handling net: stmmac: platform: Fix misleading interrupt error msg net: vxge: fix wrong __VA_ARGS__ usage hinic: fix a bug of waitting for IO stopped hinic: fix the bug of clearing event queue hinic: fix out-of-order excution in arm cpu hinic: fix wrong para of wait_for_completion_timeout hinic: fix wrong value of MIN_SKB_LEN selftests/net: add definition for SOL_DCCP to fix compilation errors for old libc cxgb4/ptp: pass the sign of offset delta in FW CMD drm/scheduler: fix rare NULL ptr race cfg80211: Do not warn on same channel at the end of CSA qlcnic: Fix bad kzalloc null test i2c: st: fix missing struct parameter description i2c: pca-platform: Use platform_irq_get_optional media: rc: add keymap for Videostrong KII Pro cpufreq: imx6q: Fixes unwanted cpu overclocking on i.MX6ULL staging: wilc1000: avoid double unlocking of 'wilc->hif_cs' mutex media: venus: hfi_parser: Ignore HEVC encoding for V1 firmware: arm_sdei: fix double-lock on hibernate with shared events null_blk: Fix the null_add_dev() error path null_blk: Handle null_add_dev() failures properly null_blk: fix spurious IO errors after failed past-wp access media: imx: imx7_mipi_csis: Power off the source when stopping streaming media: imx: imx7-media-csi: Fix video field handling xhci: bail out early if driver can't accress host in resume ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add() x86: Don't 
let pgprot_modify() change the page encryption bit dma-mapping: Fix dma_pgprot() for unencrypted coherent pages block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices debugfs: Check module state before warning in {full/open}_proxy_open() irqchip/versatile-fpga: Handle chained IRQs properly time/sched_clock: Expire timer in hardirq context media: allegro: fix type of gop_length in channel_create message sched: Avoid scale real weight down to zero selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault PCI/switchtec: Fix init_completion race condition with poll_wait() block, bfq: move forward the getting of an extra ref in bfq_bfqq_move media: i2c: video-i2c: fix build errors due to 'imply hwmon' libata: Remove extra scsi_host_put() in ata_scsi_add_hosts() pstore/platform: fix potential mem leak if pstore_init_fs failed gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty gfs2: Don't demote a glock until its revokes are written cpufreq: imx6q: fix error handling x86/boot: Use unsigned comparison for addresses efi/x86: Ignore the memory attributes table on i386 genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy() block: Fix use-after-free issue accessing struct io_cq media: i2c: ov5695: Fix power on and off sequences usb: dwc3: core: add support for disabling SS instances in park mode irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency md: check arrays is suspended in mddev_detach before call quiesce operations firmware: fix a double abort case with fw_load_sysfs_fallback spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps() block, bfq: fix use-after-free in bfq_idle_slice_timer_body btrfs: qgroup: ensure qgroup_rescan_running is only set when the worker is at least queued btrfs: remove a BUG_ON() from merge_reloc_roots() btrfs: restart relocate_tree_blocks properly btrfs: track reloc roots based on their 
commit root bytenr ASoC: fix regwmask ASoC: dapm: connect virtual mux with default value ASoC: dpcm: allow start or stop during pause for backend ASoC: topology: use name_prefix for new kcontrol usb: gadget: f_fs: Fix use after free issue as part of queue failure usb: gadget: composite: Inform controller driver of self-powered ALSA: usb-audio: Add mixer workaround for TRX40 and co ALSA: hda: Add driver blacklist ALSA: hda: Fix potential access overflow in beep helper ALSA: ice1724: Fix invalid access for enumerated ctl items ALSA: pcm: oss: Fix regression by buffer overflow fix ALSA: hda/realtek: Enable mute LED on an HP system ALSA: hda/realtek - a fake key event is triggered by running shutup ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256 ALSA: hda/realtek - Set principled PC Beep configuration for ALC256 ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups ALSA: hda/realtek - Add quirk for Lenovo Carbon X1 8th gen ALSA: hda/realtek - Add quirk for MSI GL63 media: venus: firmware: Ignore secure call error on first resume media: hantro: Read be32 words starting at every fourth byte media: ti-vpe: cal: fix disable_irqs to only the intended target media: ti-vpe: cal: fix a kernel oops when unloading module seccomp: Add missing compat_ioctl for notify acpi/x86: ignore unspecified bit positions in the ACPI global lock field ACPICA: Allow acpi_any_gpe_status_set() to skip one GPE ACPI: PM: s2idle: Refine active GPEs check thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n nvmet-tcp: fix maxh2cdata icresp parameter nvme-fc: Revert "add module to ops template to allow module references" efi/x86: Add TPM related EFI tables to unencrypted mapping checks PCI: pciehp: Fix indefinite wait on sysfs requests PCI/ASPM: Clear the correct bits when enabling L1 substates PCI: Add boot interrupt quirk mechanism for Xeon chipsets PCI: qcom: Fix the fixup of PCI_VENDOR_ID_QCOM PCI: endpoint: Fix for concurrent memory allocation 
in OB address region sched/fair: Fix enqueue_task_fair warning tpm: Don't make log failures fatal tpm: tpm1_bios_measurements_next should increase position index tpm: tpm2_bios_measurements_next should increase position index KEYS: reaching the keys quotas correctly cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus() genirq/debugfs: Add missing sanity checks to interrupt injection irqchip/versatile-fpga: Apply clear-mask earlier io_uring: remove bogus RLIMIT_NOFILE check in file registration pstore: pstore_ftrace_seq_next should increase position index MIPS/tlbex: Fix LDDIR usage in setup_pw() for Loongson-3 MIPS: OCTEON: irq: Fix potential NULL pointer dereference PM / Domains: Allow no domain-idle-states DT property in genpd when parsing PM: sleep: wakeup: Skip wakeup_source_sysfs_remove() if device is not there ath9k: Handle txpower changes even when TPC is disabled signal: Extend exec_id to 64bits x86/tsc_msr: Use named struct initializers x86/tsc_msr: Fix MSR_FSB_FREQ mask for Cherry Trail devices x86/tsc_msr: Make MSR derived TSC frequency more accurate x86/entry/32: Add missing ASM_CLAC to general_protection entry platform/x86: asus-wmi: Support laptops where the first battery is named BATT KVM: nVMX: Properly handle userspace interrupt window request KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks KVM: s390: vsie: Fix delivery of addressing exceptions KVM: x86: Allocate new rmap and large page tracking when moving memslot KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support KVM: x86: Gracefully handle __vmalloc() failure during VM allocation KVM: VMX: Add a trampoline to fix VMREAD error handling KVM: VMX: fix crash cleanup when KVM wasn't used smb3: fix performance regression with setting mtime CIFS: Fix bug which the return value by asynchronous read is error mtd: spinand: Stop using spinand->oobbuf for buffering bad block markers mtd: spinand: Do not erase the block before writing a bad block marker btrfs: 
Don't submit any btree write bio if the fs has errors Btrfs: fix crash during unmount due to race with delayed inode workers btrfs: reloc: clean dirty subvols if we fail to start a transaction btrfs: set update the uuid generation as soon as possible btrfs: drop block from cache on error in relocation btrfs: fix missing file extent item for hole after ranged fsync btrfs: unset reloc control if we fail to recover btrfs: fix missing semaphore unlock in btrfs_sync_file btrfs: use nofs allocations for running delayed items remoteproc: qcom_q6v5_mss: Don't reassign mpss region on shutdown remoteproc: qcom_q6v5_mss: Reload the mba region on coredump remoteproc: Fix NULL pointer dereference in rproc_virtio_notify crypto: rng - Fix a refcounting bug in crypto_rng_reset() crypto: mxs-dcp - fix scatterlist linearization for hash erofs: correct the remaining shrink objects io_uring: honor original task RLIMIT_FSIZE mmc: sdhci-of-esdhc: fix esdhc_reset() for different controller versions powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init() tools: gpio: Fix out-of-tree build regression net: qualcomm: rmnet: Allow configuration updates to existing devices arm64: dts: allwinner: h6: Fix PMU compatible sched/core: Remove duplicate assignment in sched_tick_remote() arm64: dts: allwinner: h5: Fix PMU compatible mm, memcg: do not high throttle allocators based on wraparound dm writecache: add cond_resched to avoid CPU hangs dm integrity: fix a crash with unusually large tag size dm verity fec: fix memory leak in verity_fec_dtr dm clone: Add overflow check for number of regions dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() XArray: Fix xas_pause for large multi-index entries xarray: Fix early termination of xas_for_each_marked crypto: caam/qi2 - fix chacha20 data size error crypto: caam - update xts sector size for large input length crypto: ccree - protect against empty or NULL scatterlists crypto: ccree - only try to map auth tag if needed 
crypto: ccree - dec auth tag size from cryptlen map scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point scsi: ufs: fix Auto-Hibern8 error detection scsi: lpfc: Fix lpfc_io_buf resource leak in lpfc_get_scsi_buf_s4 error path ARM: dts: exynos: Fix polarity of the LCD SPI bus on UniversalC210 board arm64: dts: ti: k3-am65: Add clocks to dwc3 nodes arm64: armv8_deprecated: Fix undef_hook mask for thumb setend selftests: vm: drop dependencies on page flags from mlock2 tests selftests/vm: fix map_hugetlb length used for testing read and write selftests/powerpc: Add tlbie_test in .gitignore vfio: platform: Switch to platform_get_irq_optional() drm/i915/gem: Flush all the reloc_gpu batch drm/etnaviv: rework perfmon query infrastructure drm: Remove PageReserved manipulation from drm_pci_alloc drm/amdgpu/powerplay: using the FCLK DPM table to set the MCLK drm/amdgpu: unify fw_write_wait for new gfx9 asics powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable nfsd: fsnotify on rmdir under nfsd/clients/ NFS: Fix use-after-free issues in nfs_pageio_add_request() NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() ext4: fix a data race at inode->i_blocks fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() ocfs2: no need try to truncate file beyond i_size perf tools: Support Python 3.8+ in Makefile s390/diag: fix display of diagnose call statistics Input: i8042 - add Acer Aspire 5738z to nomux list ftrace/kprobe: Show the maxactive number on kprobe_events clk: ingenic/jz4770: Exit with error if CGU init failed clk: ingenic/TCU: Fix round_rate returning error kmod: make request_module() return an error when autoloading is disabled cpufreq: powernv: Fix use-after-free hfsplus: fix crash and filesystem corruption when deleting files libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set ipmi: fix hung processes in __get_guid() xen/blkfront: fix memory allocation flags in 
blkfront_setup_indirect() powerpc/64/tm: Don't let userspace set regs->trap via sigreturn powerpc/fsl_booke: Avoid creating duplicate tlb1 entry powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs powerpc/64: Setup a paca before parsing device tree etc. powerpc/xive: Fix xmon support on the PowerNV platform powerpc/kprobes: Ignore traps that happened in real mode powerpc/64: Prevent stack protection in early boot scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug powerpc: Make setjmp/longjmp signature standard arm64: Always force a branch protection mode when the compiler has one dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone() dm clone: replace spin_lock_irqsave with spin_lock_irq dm clone: Fix handling of partial region discards dm clone: Add missing casts to prevent overflows and data corruption scsi: lpfc: Add registration for CPU Offline/Online events scsi: lpfc: Fix Fabric hostname registration if system hostname changes scsi: lpfc: Fix configuration of BB credit recovery in service parameters scsi: lpfc: Fix broken Credit Recovery after driver load Revert "drm/dp_mst: Remove VCPI while disabling topology mgr" drm/dp_mst: Fix clearing payload state on topology disable drm/amdgpu: fix gfx hang during suspend with video playback (v2) drm/i915/icl+: Don't enable DDI IO power on a TypeC port in TBT mode powerpc/kasan: Fix kasan_remap_early_shadow_ro() mmc: sdhci: Convert sdhci_set_timeout_irq() to non-static mmc: sdhci: Refactor sdhci_set_timeout() bpf: Fix tnum constraints for 32-bit comparisons mfd: dln2: Fix sanity checking for endpoints efi/x86: Fix the deletion of variables in mixed mode ASoC: stm32: sai: Add missing cleanup scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list() Linux 5.4.33 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6c37e2c64801a572781c46fc5883bcc74e6a7a1a	2020-04-17 11:26:58 +02:00
Jann Horn	b70eb420e9	bpf: Fix tnum constraints for 32-bit comparisons [ Upstream commit `604dca5e3a` ] The BPF verifier tried to track values based on 32-bit comparisons by (ab)using the tnum state via `581738a681` ("bpf: Provide better register bounds after jmp32 instructions"). The idea is that after a check like this: if ((u32)r0 > 3) exit We can't meaningfully constrain the arithmetic-range-based tracking, but we can update the tnum state to (value=0,mask=0xffff'ffff'0000'0003). However, the implementation from `581738a681` didn't compute the tnum constraint based on the fixed operand, but instead derived it from the arithmetic-range-based tracking. This means that after the following sequence of operations: if (r0 >= 0x1'0000'0001) exit if ((u32)r0 > 7) exit The verifier assumed that the lower half of r0 is in the range (0, 0) and applied the tnum constraint (value=0,mask=0xffff'ffff'0000'0000), thus causing the overall tnum to be (value=0,mask=0x1'0000'0000), which was incorrect. Provide a fixed implementation. Fixes: `581738a681` ("bpf: Provide better register bounds after jmp32 instructions") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330160324.15259-3-daniel@iogearbox.net Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
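The tnum reasoning in the commit above can be illustrated with a small userspace sketch. It is not the verifier's code: `struct tnum` here is a simplified stand-in, and `bits_needed()` and `tnum_low32_at_most()` are hypothetical helpers introduced only for illustration. The sketch derives the constraint from the fixed operand of the comparison, which is the approach the commit message describes as correct.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified tristate number: bits set in mask are unknown,
 * all other bits are known and equal to the matching bit of value. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Number of bits needed to represent x (0 when x == 0). */
static unsigned int bits_needed(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: the tnum of a register whose low 32 bits are
 * known to be <= max32 (the fixed operand of the 32-bit comparison)
 * while its upper 32 bits stay completely unknown.
 */
static struct tnum tnum_low32_at_most(uint32_t max32)
{
	unsigned int bits = bits_needed(max32);
	struct tnum t;

	assert(bits <= 32);	/* always true for a 32-bit operand */
	t.value = 0;
	t.mask = 0xffffffff00000000ULL | ((1ULL << bits) - 1);
	return t;
}
```

Under these assumptions, `tnum_low32_at_most(3)` yields the (value=0,mask=0xffff'ffff'0000'0003) state quoted in the commit message, independent of whatever the range tracking currently believes about the register.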
Eric Biggers	9cc4f52d34	kmod: make request_module() return an error when autoloading is disabled commit `d7d27cfc5c` upstream. Patch series "module autoloading fixes and cleanups", v5. This series fixes a bug where request_module() was reporting success to kernel code when module autoloading had been completely disabled via 'echo > /proc/sys/kernel/modprobe'. It also addresses the issues raised on the original thread (https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u) by documenting the modprobe sysctl, adding a self-test for the empty path case, and downgrading a user-reachable WARN_ONCE(). This patch (of 4): It's long been possible to disable kernel module autoloading completely (while still allowing manual module insertion) by setting /proc/sys/kernel/modprobe to the empty string. This can be preferable to setting it to a nonexistent file since it avoids the overhead of an attempted execve(), avoids potential deadlocks, and avoids the call to security_kernel_module_request() and thus on SELinux-based systems eliminates the need to write SELinux rules to dontaudit module_request. However, when module autoloading is disabled in this way, request_module() returns 0. This is broken because callers expect 0 to mean that the module was successfully loaded. Apparently this was never noticed because this method of disabling module autoloading isn't used much, and also most callers don't use the return value of request_module() since it's always necessary to check whether the module registered its functionality or not anyway. 
But improperly returning 0 can indeed confuse a few callers, for example get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit: if (!fs && (request_module("fs-%.*s", len, name) == 0)) { fs = __get_fs_type(name, len); WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name); } This is easily reproduced with: echo > /proc/sys/kernel/modprobe mount -t NONEXISTENT none / It causes: request_module fs-NONEXISTENT succeeded, but still no fs? WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0 [...] This should actually use pr_warn_once() rather than WARN_ONCE(), since it's also user-reachable if userspace immediately unloads the module. Regardless, request_module() should correctly return an error when it fails. So let's make it return -ENOENT, which matches the error when the modprobe binary doesn't exist. I've also sent patches to document and test this case. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jessica Yu <jeyu@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Ben Hutchings <benh@debian.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
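The behaviour change described in the kmod commit above can be modelled in userspace. This is only a sketch under stated assumptions: `modprobe_path`, `model_request_module()` and `would_warn()` are hypothetical stand-ins written for illustration, not kernel functions, and the model pretends that the helper always succeeds whenever a path is configured.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the /proc/sys/kernel/modprobe setting. */
static char modprobe_path[256] = "/sbin/modprobe";

/*
 * Model of the fixed behaviour: an empty path means autoloading is
 * disabled, so an error is reported instead of 0 ("module loaded").
 */
static int model_request_module(const char *name)
{
	assert(name != NULL);
	if (modprobe_path[0] == '\0')
		return -ENOENT;
	return 0;	/* assumption: the helper ran and reported success */
}

/*
 * Model of a get_fs_type()-style caller: returns 1 when it would hit the
 * "succeeded, but still no fs?" warning, i.e. the request claimed
 * success although the filesystem is still not registered.
 */
static int would_warn(const char *name, int fs_registered)
{
	if (!fs_registered && model_request_module(name) == 0)
		return 1;
	return 0;
}
```

With `modprobe_path` set to the empty string, `would_warn("fs-NONEXISTENT", 0)` returns 0 in this model, because the caller now sees -ENOENT rather than a false success.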
Masami Hiramatsu	7bcca67bde	ftrace/kprobe: Show the maxactive number on kprobe_events commit `6a13a0d7b4` upstream. Show maxactive parameter on kprobe_events. This allows user to save the current configuration and restore it without losing maxactive parameter. Link: http://lkml.kernel.org/r/4762764a-6df7-bc93-ed60-e336146dce1f@gmail.com Link: http://lkml.kernel.org/r/158503528846.22706.5549974121212526020.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: `696ced4fb1` ("tracing/kprobes: expose maxactive for kretprobe in kprobe_events") Reported-by: Taeung Song <treeze.taeung@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:21 +02:00
Scott Wood	1dbfae0095	sched/core: Remove duplicate assignment in sched_tick_remote() commit `82e0516ce3` upstream. A redundant "curr = rq->curr" was added; remove it. Fixes: `ebc0f83c78` ("timers/nohz: Update NOHZ load in remote tick") Signed-off-by: Scott Wood <swood@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1580776558-12882-1-git-send-email-swood@redhat.com Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:17 +02:00
Eric W. Biederman	5f2d04139a	signal: Extend exec_id to 64bits commit `d1e7fd6462` upstream. Replace the 32bit exec_id with a 64bit exec_id to make it impossible to wrap the exec_id counter. With care an attacker can cause exec_id wrap and send arbitrary signals to a newly exec'd parent. This bypasses the signal sending checks if the parent changes their credentials during exec. The severity of this problem can be seen in that, in my limited testing of a 32bit exec_id, it can take as little as 19s to exec 65536 times. Which means that it can take as little as 14 days to wrap a 32bit exec_id. Adam Zabrocki has succeeded wrapping the self_exec_id in 7 days. Even my slower timing is in the uptime of a typical server. Which means self_exec_id is simply a speed bump today, and if exec gets noticeably faster self_exec_id won't even be a speed bump. Extending self_exec_id to 64bits introduces a problem on 32bit architectures where reading self_exec_id is no longer atomic and can take two read instructions. Which means that it is possible to hit a window where the read value of exec_id does not match the written value. So with very lucky timing after this change this still remains exploitable. I have updated the update of exec_id on exec to use WRITE_ONCE and the read of exec_id in do_notify_parent to use READ_ONCE to make it clear that there is no locking between these two locations. Link: https://lore.kernel.org/kernel-hardening/20200324215049.GA3710@pi3.com.pl Fixes: 2.3.23pre2 Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:12 +02:00
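The 14-day estimate in the commit above follows from simple arithmetic on the quoted measurement (65536 execs in 19 s). A minimal sketch of that calculation, using a hypothetical helper `seconds_to_wrap()` that assumes the exec rate stays constant:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: seconds needed to wrap a counter of counter_bits
 * bits when execs_per_batch execs take secs_per_batch seconds.
 */
static double seconds_to_wrap(unsigned int counter_bits,
			      double secs_per_batch,
			      uint64_t execs_per_batch)
{
	double total = 1.0;
	unsigned int i;

	assert(counter_bits <= 64 && execs_per_batch > 0);
	/* total = 2^counter_bits, built in floating point so 64 bits work */
	for (i = 0; i < counter_bits; i++)
		total *= 2.0;
	return total / (double)execs_per_batch * secs_per_batch;
}
```

For example, `seconds_to_wrap(32, 19.0, 65536) / 86400.0` is 65536 batches of 19 s, or 1,245,184 s, which is roughly 14.4 days. The same rate applied to a 64-bit counter gives a figure on the order of a hundred million years, which is why the wider counter is treated as impossible to wrap in practice.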
Thomas Gleixner	3f3700c469	genirq/debugfs: Add missing sanity checks to interrupt injection commit `a740a423c3` upstream. Interrupts cannot be injected when the interrupt is not activated and when a replay is already in progress. Fixes: `536e2e34bd` ("genirq/debugfs: Triggering of interrupts from userspace") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200306130623.500019114@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:11 +02:00
Thomas Gleixner	6ecc37daf6	cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus() commit `e98eac6ff1` upstream. A recent change to freeze_secondary_cpus() which added an early abort if a wakeup is pending missed the fact that the function is also invoked for shutdown, reboot and kexec via disable_nonboot_cpus(). In case of disable_nonboot_cpus() the wakeup event needs to be ignored as the purpose is to terminate the currently running kernel. Add a 'suspend' argument which is only set when the freeze is in context of a suspend operation. If not set then an eventually pending wakeup event is ignored. Fixes: `a66d955e91` ("cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending") Reported-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pavankumar Kondeti <pkondeti@codeaurora.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/874kuaxdiz.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:11 +02:00
Vincent Guittot	524089fa70	sched/fair: Fix enqueue_task_fair warning commit `fe61468b2c` upstream. When a cfs rq is throttled, the latter and its child are removed from the leaf list but their nr_running is not changed which includes staying higher than 1. When a task is enqueued in this throttled branch, the cfs rqs must be added back in order to ensure correct ordering in the list but this can only happen if nr_running == 1. When cfs bandwidth is used, we call unconditionally list_add_leaf_cfs_rq() when enqueuing an entity to make sure that the complete branch will be added. Similarly unthrottle_cfs_rq() can stop adding cfs in the list when a parent is throttled. Iterate the remaining entity to ensure that the complete branch will be added in the list. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: stable@vger.kernel.org Cc: stable@vger.kernel.org #v5.1+ Link: https://lkml.kernel.org/r/20200306135257.25044-1-vincent.guittot@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:11 +02:00
Sven Schnelle	52e6985f2c	seccomp: Add missing compat_ioctl for notify commit `3db81afd99` upstream. Executing the seccomp_bpf testsuite under a 64-bit kernel with 32-bit userland (both s390 and x86) doesn't work because there's no compat_ioctl handler defined. Add the handler. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Fixes: `6a21cc50f0` ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200310123332.42255-1-svens@linux.ibm.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:09 +02:00
Boqun Feng	bd9afea9bd	locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps() [ Upstream commit `25016bd7f4` ] Qian Cai reported a bug when PROVE_RCU_LIST=y, and a read of /proc/lockdep triggered a warning: [ ] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) ... [ ] Call Trace: [ ] lock_is_held_type+0x5d/0x150 [ ] ? rcu_lockdep_current_cpu_online+0x64/0x80 [ ] rcu_read_lock_any_held+0xac/0x100 [ ] ? rcu_read_lock_held+0xc0/0xc0 [ ] ? __slab_free+0x421/0x540 [ ] ? kasan_kmalloc+0x9/0x10 [ ] ? __kmalloc_node+0x1d7/0x320 [ ] ? kvmalloc_node+0x6f/0x80 [ ] __bfs+0x28a/0x3c0 [ ] ? class_equal+0x30/0x30 [ ] lockdep_count_forward_deps+0x11a/0x1a0 The warning got triggered because lockdep_count_forward_deps() calls __bfs() without current->lockdep_recursion being set; as a result, a lockdep internal function (__bfs()) is checked by lockdep, which is unexpected, and the inconsistency between the irq-off state and the state traced by lockdep caused the warning. Apart from this warning, lockdep internal functions like __bfs() should always be protected by current->lockdep_recursion to avoid potential deadlocks and data inconsistency, therefore add the current->lockdep_recursion on-and-off section to protect __bfs() in both lockdep_count_forward_deps() and lockdep_count_backward_deps(). Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200312151258.128036-1-boqun.feng@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:05 +02:00
Alexander Sverdlin	b9d5ced37a	genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy() [ Upstream commit `87f2d1c662` ] irq_domain_alloc_irqs_hierarchy() has 3 call sites in the compilation unit but only one of them checks for the pointer which is being dereferenced inside the called function. Move the check into the function. This allows for catching the error instead of the following crash: Unable to handle kernel NULL pointer dereference at virtual address 00000000 PC is at 0x0 LR is at gpiochip_hierarchy_irq_domain_alloc+0x11f/0x140 ... [<c06c23ff>] (gpiochip_hierarchy_irq_domain_alloc) [<c0462a89>] (__irq_domain_alloc_irqs) [<c0462dad>] (irq_create_fwspec_mapping) [<c06c2251>] (gpiochip_to_irq) [<c06c1c9b>] (gpiod_to_irq) [<bf973073>] (gpio_irqs_init [gpio_irqs]) [<bf974048>] (gpio_irqs_exit+0xecc/0xe84 [gpio_irqs]) Code: bad PC value Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200306174720.82604-1-alexander.sverdlin@nokia.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:04 +02:00
Michael Wang	dd39eadc71	sched: Avoid scale real weight down to zero [ Upstream commit `26cf52229e` ] During our testing, we found a case where shares no longer work correctly. The cgroup topology is: /sys/fs/cgroup/cpu/A (shares=102400) /sys/fs/cgroup/cpu/A/B (shares=2) /sys/fs/cgroup/cpu/A/B/C (shares=1024) /sys/fs/cgroup/cpu/D (shares=1024) /sys/fs/cgroup/cpu/D/E (shares=1024) /sys/fs/cgroup/cpu/D/E/F (shares=1024) The same benchmark is running in groups C and F, no other tasks are running, and the benchmark is capable of consuming all the CPUs. We expected group C to win more CPU resources since it could enjoy all the shares of group A, but it is F that wins much more. The reason is that group B has shares of 2; since A->cfs_rq.load.weight == B->se.load.weight == B->shares/nr_cpus, A->cfs_rq.load.weight becomes very small. And in calc_group_shares() we calculate shares as: load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg); shares = (tg_shares * load) / tg_weight; Since 'cfs_rq->load.weight' is too small, the load becomes 0 after scaling down; although 'tg_shares' is 102400, the shares of the se which stands for group A on the root cfs_rq become 2, while the se of D on the root cfs_rq is far bigger than 2, so it wins the battle. Thus when scale_load_down() scales the real weight down to 0, it no longer tells the real story: the caller will have the wrong information and the calculation will be buggy. This patch adds a check in scale_load_down() so that the real weight will be >= MIN_SHARES after scaling; with the patch applied, group C wins as expected. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/38e8e212-59a1-64b2-b247-b6d0b52d8dc1@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:02 +02:00
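The scale-down problem described in the commit above can be reproduced with plain integer arithmetic. The sketch below is a userspace approximation, not the scheduler's code: `FIXEDPOINT_SHIFT` and `MIN_WEIGHT` are assumed values chosen to mirror a 10-bit fixed-point shift and a minimum of 2, and the three functions are hypothetical names used only for illustration.

```c
#include <assert.h>

#define FIXEDPOINT_SHIFT 10	/* assumption: 10 extra bits of resolution */
#define MIN_WEIGHT 2UL		/* assumption: smallest meaningful weight */

/* Scaling without a floor: any weight below 1 << 10 collapses to 0. */
static unsigned long scale_down_naive(unsigned long w)
{
	return w >> FIXEDPOINT_SHIFT;
}

/* Scaling with a floor: a non-zero weight never drops below MIN_WEIGHT. */
static unsigned long scale_down_clamped(unsigned long w)
{
	unsigned long scaled = w >> FIXEDPOINT_SHIFT;

	if (w == 0)
		return 0;
	return scaled < MIN_WEIGHT ? MIN_WEIGHT : scaled;
}

/* shares = (tg_shares * load) / tg_weight, as quoted in the commit. */
static unsigned long group_shares(unsigned long tg_shares,
				  unsigned long load,
				  unsigned long tg_weight)
{
	assert(tg_weight != 0);
	return tg_shares * load / tg_weight;
}
```

In this model a weight of 512 scales to 0 with the naive version, so `group_shares(102400, 0, tg_weight)` is 0 no matter how large `tg_shares` is, whereas the clamped version keeps the load at 2 and lets the group's configured shares count.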
Ahmed S. Darwish	2902207377	time/sched_clock: Expire timer in hardirq context [ Upstream commit `2c8bd58812` ] To minimize latency, PREEMPT_RT kernels expire hrtimers in preemptible softirq context by default. This can be overridden by marking the timer's expiry with HRTIMER_MODE_HARD. sched_clock_timer is missing this annotation: if its callback is preempted and the duration of the preemption exceeds the wrap around time of the underlying clocksource, sched clock will get out of sync. Mark the sched_clock_timer for expiry in hard interrupt context. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200309181529.26558-1-a.darwish@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:02 +02:00
Thomas Hellstrom	e88ee287fd	dma-mapping: Fix dma_pgprot() for unencrypted coherent pages [ Upstream commit `17c4a2ae15` ] When dma_mmap_coherent() sets up a mapping to unencrypted coherent memory under SEV encryption and sometimes under SME encryption, it will actually set up an encrypted mapping rather than an unencrypted one, causing devices that DMA from that memory to read encrypted contents. Fix this. When force_dma_unencrypted() returns true, the linear kernel map of the coherent pages has had the encryption bit explicitly cleared and the page content is unencrypted. Make sure that any additional PTEs we set up to these pages also have the encryption bit cleared by having dma_pgprot() return a protection with the encryption bit cleared in this case. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20200304114527.3636-3-thomas_os@shipmail.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:01 +02:00
Yonghong Song	fd29a0242f	bpf: Fix deadlock with rq_lock in bpf_send_signal() [ Upstream commit `1bc7896e9e` ] When experimenting with bpf_send_signal() helper in our production environment (5.2 based), we experienced a deadlock in NMI mode: #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24 #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012 #7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd #8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55 #9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602 #10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a #11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227 #12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140 #13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf #14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09 #15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47 #16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d #17 [ffffc9002219fc90] schedule at ffffffff81a3e219 #18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9 #19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529 #20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc #21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c #22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602 #23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068 The above call stack is actually very similar to an issue reported by Commit `eac9153f2b` ("bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()") by Song Liu. The only difference is bpf_send_signal() helper instead of bpf_get_stack() helper. The above deadlock is triggered with a perf_sw_event. Similar to Commit `eac9153f2b`, the below almost identical reproducer used tracepoint point sched/sched_switch so the issue can be easily caught. 
/* stress_test.c */ #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <pthread.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define THREAD_COUNT 1000 char *filename; void *worker(void *p) { void *ptr; int fd; char *pptr; fd = open(filename, O_RDONLY); if (fd < 0) return NULL; while (1) { struct timespec ts = {0, 1000 + rand() % 2000}; ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0); usleep(1); if (ptr == MAP_FAILED) { printf("failed to mmap\n"); break; } munmap(ptr, 4096 * 64); usleep(1); pptr = malloc(1); usleep(1); pptr[0] = 1; usleep(1); free(pptr); usleep(1); nanosleep(&ts, NULL); } close(fd); return NULL; } int main(int argc, char *argv[]) { void *ptr; int i; pthread_t threads[THREAD_COUNT]; if (argc < 2) return 0; filename = argv[1]; for (i = 0; i < THREAD_COUNT; i++) { if (pthread_create(threads + i, NULL, worker, NULL)) { fprintf(stderr, "Error creating thread\n"); return 0; } } for (i = 0; i < THREAD_COUNT; i++) pthread_join(threads[i], NULL); return 0; } and the following command: 1. run `stress_test /bin/ls` in one window 2. hack bcc trace.py with the following change: # --- a/tools/trace.py # +++ b/tools/trace.py @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s); __data.tgid = __tgid; __data.pid = __pid; bpf_get_current_comm(&__data.comm, sizeof(__data.comm)); + bpf_send_signal(10); %s %s %s.perf_submit(%s, &__data, sizeof(__data)); 3. in a different window run ./trace.py -p $(pidof stress_test) t:sched:sched_switch The deadlock can be reproduced in our production system. Similar to Song's fix, the fix is to delay sending the signal if irqs are disabled to avoid deadlocks involving rq_lock. With this change, my above stress-test in our production system won't cause deadlock any more. I also implemented a scaled-down version of the reproducer in the selftest (a subsequent commit). With latest bpf-next, it complains about the following potential deadlock. 
[ 32.832450] -> #1 (&p->pi_lock){-.-.}: [ 32.833100] _raw_spin_lock_irqsave+0x44/0x80 [ 32.833696] task_rq_lock+0x2c/0xa0 [ 32.834182] task_sched_runtime+0x59/0xd0 [ 32.834721] thread_group_cputime+0x250/0x270 [ 32.835304] thread_group_cputime_adjusted+0x2e/0x70 [ 32.835959] do_task_stat+0x8a7/0xb80 [ 32.836461] proc_single_show+0x51/0xb0 ... [ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}: [ 32.840275] __lock_acquire+0x1358/0x1a20 [ 32.840826] lock_acquire+0xc7/0x1d0 [ 32.841309] _raw_spin_lock_irqsave+0x44/0x80 [ 32.841916] __lock_task_sighand+0x79/0x160 [ 32.842465] do_send_sig_info+0x35/0x90 [ 32.842977] bpf_send_signal+0xa/0x10 [ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000 [ 32.844301] trace_call_bpf+0x115/0x270 [ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0 [ 32.845411] perf_trace_sched_switch+0x10f/0x180 [ 32.846014] __schedule+0x45d/0x880 [ 32.846483] schedule+0x5f/0xd0 ... [ 32.853148] Chain exists of: [ 32.853148] &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock [ 32.853148] [ 32.854451] Possible unsafe locking scenario: [ 32.854451] [ 32.855173] CPU0 CPU1 [ 32.855745] ---- ---- [ 32.856278] lock(&rq->lock); [ 32.856671] lock(&p->pi_lock); [ 32.857332] lock(&rq->lock); [ 32.857999] lock(&(&sighand->siglock)->rlock); Deadlock happens on CPU0 when it tries to acquire &sighand->siglock but it has been held by CPU1 and CPU1 tries to grab &rq->lock and cannot get it. This is not exactly the callstack in our production environment, but the symptom is similar: both locks are acquired with spin_lock_irqsave(), and both cases involve rq_lock. The fix to delay sending the signal when irqs are disabled also fixed this issue. 
Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200304191104.2796501-1-yhs@fb.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:49:57 +02:00
John Stultz	5327531d39	ANDROID: irq: irqchip: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent Add EXPORT_SYMBOL_GPL entries for irq_chip_retrigger_hierarchy() and irq_chip_set_vcpu_affinity_parent() so that we can allow drivers like the qcom-pdc driver to be loadable as a module. Signed-off-by: John Stultz <john.stultz@linaro.org> Bug: 153049053 Change-Id: Ie0202adb6f02fee19897d1b18df978c95ff58118	2020-04-16 18:14:36 +00:00
John Stultz	974cac1a5d	ANDROID: irq: irqdomain: Export irq_domain_update_bus_token Add export for irq_domain_update_bus_token() so that we can allow drivers like the qcom-pdc driver to be loadable as a module. Signed-off-by: John Stultz <john.stultz@linaro.org> Bug: 153049053 Change-Id: I395859c3e6fe7e8ceb4075c84267fbb68fbb1938	2020-04-16 18:14:26 +00:00
Kelly Rossmoyer	189ced91cd	ANDROID: power: wakeup_reason: wake reason enhancements These changes build upon the existing Android kernel wakeup reason code to: * improve the positioning of suspend abort logging calls in suspend flow * add logging of abnormal wakeup reasons like unexpected HW IRQs and IRQs configured as both wake-enabled and no-suspend * add support for capturing deferred-processing threaded nested IRQs as wakeup reasons rather than their synchronously-processed parents Bug: 150970830 Bug: 140217217 Bug: 120445600 Signed-off-by: Kelly Rossmoyer <krossmo@google.com> Change-Id: I903b811a0fe11a605a25815c3a341668a23de700	2020-04-09 15:26:39 +00:00
Greg Kroah-Hartman	33717ea779	Merge 5.4.31 into android-5.4 Changes in 5.4.31 nvme-rdma: Avoid double freeing of async event data kconfig: introduce m32-flag and m64-flag drm/amd/display: Add link_rate quirk for Apple 15" MBP 2017 drm/bochs: downgrade pci_request_region failure from error to warning initramfs: restore default compression behavior drm/amdgpu: fix typo for vcn1 idle check tools/power turbostat: Fix gcc build warnings tools/power turbostat: Fix missing SYS_LPI counter on some Chromebooks tools/power turbostat: Fix 32-bit capabilities warning net/mlx5e: kTLS, Fix TCP seq off-by-1 issue in TX resync flow XArray: Fix xa_find_next for large multi-index entries padata: fix uninitialized return value in padata_replace() brcmfmac: abort and release host after error misc: rtsx: set correct pcr_ops for rts522A misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices misc: pci_endpoint_test: Avoid using module parameter to determine irqtype PCI: sysfs: Revert "rescan" file renames coresight: do not use the BIT() macro in the UAPI header mei: me: add cedar fork device ids nvmem: check for NULL reg_read and reg_write before dereferencing extcon: axp288: Add wakeup support power: supply: axp288_charger: Add special handling for HP Pavilion x2 10 Revert "dm: always call blk_queue_split() in dm_process_bio()" ALSA: hda/ca0132 - Add Recon3Di quirk to handle integrated sound on EVGA X99 Classified motherboard soc: mediatek: knows_txdone needs to be set in Mediatek CMDQ helper net/mlx5e: kTLS, Fix wrong value in record tracker enum iwlwifi: consider HE capability when setting LDPC iwlwifi: yoyo: don't add TLV offset when reading FIFOs iwlwifi: dbg: don't abort if sending DBGC_SUSPEND_RESUME fails rxrpc: Fix sendmsg(MSG_WAITALL) handling IB/hfi1: Ensure pq is not left on waitlist tcp: fix TFO SYNACK undo to avoid double-timestamp-undo watchdog: iTCO_wdt: Export vendorsupport watchdog: iTCO_wdt: Make ICH_RES_IO_SMI optional i2c: i801: Do not add 
ICH_RES_IO_SMI for the iTCO_wdt device net: Fix Tx hash bound checking padata: always acquire cpu_hotplug_lock before pinst->lock mm: mempolicy: require at least one nodeid for MPOL_PREFERRED Linux 5.4.31 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I9c87409ae13ad2da7a90be98586a85904a5cdb33	2020-04-08 13:00:39 +02:00
Daniel Jordan	c3d4e6fc4b	padata: always acquire cpu_hotplug_lock before pinst->lock commit `38228e8848` upstream. lockdep complains when padata's paths to update cpumasks via CPU hotplug and sysfs are both taken: # echo 0 > /sys/devices/system/cpu/cpu1/online # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask ====================================================== WARNING: possible circular locking dependency detected 5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted ------------------------------------------------------ bash/205 is trying to acquire lock: ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120 but task is already holding lock: ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120 which lock already depends on the new lock. padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent order. Which should be first? CPU hotplug calls into padata with cpu_hotplug_lock already held, so it should have priority. Fixes: `6751fb3c0e` ("padata: Use get_online_cpus/put_online_cpus") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-08 09:08:47 +02:00
Daniel Jordan	625b940a28	padata: fix uninitialized return value in padata_replace() [ Upstream commit `41ccdbfd54` ] According to Geert's report[0], kernel/padata.c: warning: 'err' may be used uninitialized in this function [-Wuninitialized]: => 539:2 Warning is seen only with older compilers on certain archs. The runtime effect is potentially returning garbage down the stack when padata's cpumasks are modified before any pcrypt requests have run. Simplest fix is to initialize err to the success value. [0] http://lkml.kernel.org/r/20200210135506.11536-1-geert@linux-m68k.org Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: `bbefa1dd6a` ("crypto: pcrypt - Avoid deadlock by using per-instance padata queues") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-08 09:08:40 +02:00
Eric Biggers	c5c2143f73	FROMLIST: kmod: make request_module() return an error when autoloading is disabled It's long been possible to disable kernel module autoloading completely (while still allowing manual module insertion) by setting /proc/sys/kernel/modprobe to the empty string. This can be preferable to setting it to a nonexistent file since it avoids the overhead of an attempted execve(), avoids potential deadlocks, and avoids the call to security_kernel_module_request() and thus on SELinux-based systems eliminates the need to write SELinux rules to dontaudit module_request. However, when module autoloading is disabled in this way, request_module() returns 0. This is broken because callers expect 0 to mean that the module was successfully loaded. Apparently this was never noticed because this method of disabling module autoloading isn't used much, and also most callers don't use the return value of request_module() since it's always necessary to check whether the module registered its functionality or not anyway. But improperly returning 0 can indeed confuse a few callers, for example get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit: if (!fs && (request_module("fs-%.*s", len, name) == 0)) { fs = __get_fs_type(name, len); WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name); } This is easily reproduced with: echo > /proc/sys/kernel/modprobe mount -t NONEXISTENT none / It causes: request_module fs-NONEXISTENT succeeded, but still no fs? WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0 [...] This should actually use pr_warn_once() rather than WARN_ONCE(), since it's also user-reachable if userspace immediately unloads the module. Regardless, request_module() should correctly return an error when it fails. So let's make it return -ENOENT, which matches the error when the modprobe binary doesn't exist. I've also sent patches to document and test this case. 
Acked-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Jessica Yu <jeyu@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: NeilBrown <neilb@suse.com> Link: https://lore.kernel.org/r/20200318230515.171692-2-ebiggers@kernel.org Bug: 151589316 Change-Id: I5e04f85e12a4f85da23e53bc11da1ade565abcd6 Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-04-06 10:43:25 -07:00
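The caller-side effect described in the kmod commit above can be sketched in userspace C. This is a hypothetical model, not kernel code: `mock_request_module()`, `modprobe_path`, and `caller_would_warn()` are made-up names standing in for request_module(), the /proc/sys/kernel/modprobe setting, and the check in get_fs_type().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for /proc/sys/kernel/modprobe; the empty
 * string models "autoloading disabled". */
static const char *modprobe_path = "";

/* Hypothetical model of request_module(). With fixed == false the
 * disabled case returns 0 (old behavior); with fixed == true it
 * returns -ENOENT, as the commit message proposes. */
static int mock_request_module(bool fixed)
{
	assert(modprobe_path != NULL);
	if (modprobe_path[0] == '\0')
		return fixed ? -ENOENT : 0;
	return 0; /* pretend the modprobe helper ran successfully */
}

/* Models the get_fs_type() check: true when the caller would reach
 * the "succeeded, but still no fs?" warning. */
static bool caller_would_warn(bool fixed, bool fs_registered)
{
	return mock_request_module(fixed) == 0 && !fs_registered;
}
```

With the old return value the warning path is reachable even though nothing was loaded; with -ENOENT the caller skips it.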
Alistair Delva	2a3049590d	ANDROID: Fix wq fp check for CFI builds A previous change added a test on the wrong config flag; rename CFI to CFI_CLANG. Bug: 145210207 Change-Id: Id8aead2eb2c75ad6442d10165f6cb86ccfb9c2f9 Signed-off-by: Alistair Delva <adelva@google.com>	2020-04-03 19:36:46 +00:00
Qais Yousef	75fdd658cb	UPSTREAM: sched/rt: cpupri_find: Trigger a full search as fallback If we fail to find a fitting CPU in cpupri_find(), we only fall back to the level we found a hit at. But Steve suggested falling back to a second full scan instead, as this could be a better effort. https://lore.kernel.org/lkml/20200304135404.146c56eb@gandalf.local.home/ We trigger the 2nd search unconditionally, since the argument for triggering a full search is that the recorded fallback level might have become empty by then, which means storing any data about what happened would be meaningless and stale. I had a humble try at timing it and it seemed okay for the small 6-CPU system I was running on: https://lore.kernel.org/lkml/20200305124324.42x6ehjxbnjkklnh@e107158-lin.cambridge.arm.com/ On large systems this second full scan could be expensive. But there are no users outside capacity awareness for this fitness function at the moment. Heterogeneous systems tend to be small, with 8 cores in total. Bug: 120440300 Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20200310142219.syxzn5ljpdxqtbgx@e107158-lin.cambridge.arm.com (cherry picked from commit `e94f80f6c4`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I94a3abf1440f20f57d8fe9ba1bb00e1d05221565	2020-04-03 16:14:43 +01:00
Qais Yousef	a6ec67b2fc	UPSTREAM: sched/rt: Remove unnecessary push for unfit tasks In task_woken_rt() and switched_to_rt() we try to trigger push-pull if the task is unfit. But the logic is found lacking because if the task was the only one running on the CPU, then rt_rq is not in overloaded state and won't trigger a push. The necessity of this logic was under debate as well; a summary of the discussion can be found in the following thread: https://lore.kernel.org/lkml/20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com/ Remove the logic for now until a better approach is agreed upon. Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: `804d402fb6` ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-6-qais.yousef@arm.com (cherry picked from commit `d94a9df490`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ia49173b7be5dbd95e41cb1ddf248760975ad3355	2020-04-03 16:14:38 +01:00
Qais Yousef	063d4fb3fd	UPSTREAM: sched/rt: Allow pulling unfitting task When RT Capacity Awareness was implemented, the logic was done such that if a task was running on a fitting CPU, then it was sticky and we would try our best to keep it there. But as Steve suggested, to adhere to the strict priority rules of the RT class, allow pulling an RT task to an unfitting CPU to ensure it gets a chance to run ASAP. Bug: 120440300 LINK: https://lore.kernel.org/lkml/20200203111451.0d1da58f@oasis.local.home/ Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: `804d402fb6` ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-5-qais.yousef@arm.com (cherry picked from commit `98ca645f82`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ifa1b1529690e190c7ad209643fb70532d8c3b492	2020-04-03 16:14:33 +01:00
Qais Yousef	e1c68ebed9	UPSTREAM: sched/rt: Optimize cpupri_find() on non-heterogenous systems This is done by introducing a new cpupri_find_fitness() function that takes the fitness_fn as an argument and is only called when the asym_system static key is enabled. cpupri_find() is now a wrapper function that calls cpupri_find_fitness() passing NULL as a fitness_fn, hence disabling the logic that handles fitness by default. Bug: 120440300 LINK: https://lore.kernel.org/lkml/c0772fca-0a4b-c88d-fdf2-5715fcf8447b@arm.com/ Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: `804d402fb6` ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-4-qais.yousef@arm.com (cherry picked from commit `a1bd02e1f2`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I8d72829d0026219ad7d46633c30cb2653343360c	2020-04-03 16:14:27 +01:00
Qais Yousef	a3b3fe50f5	UPSTREAM: sched/rt: Re-instate old behavior in select_task_rq_rt() When RT Capacity Aware support was added, the logic in select_task_rq_rt() was modified to force a search for a fitting CPU if the task currently doesn't run on one. But if the search failed, and the search was only triggered to fulfill the fitness request, we could end up selecting a new CPU unnecessarily. Fix this and re-instate the original behavior by ensuring we bail out in that case. This behavior change only affected asymmetric systems that are using util_clamp to implement capacity awareness. Non-asymmetric systems weren't affected. Bug: 120440300 LINK: https://lore.kernel.org/lkml/20200218041620.GD28029@codeaurora.org/ Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: `804d402fb6` ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-3-qais.yousef@arm.com (cherry picked from commit `b28bc1e002`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ie3e9e56fcf7d92a1552828b20a5d9d827c030099	2020-04-03 16:14:23 +01:00
Qais Yousef	0abe667623	UPSTREAM: sched/rt: cpupri_find: Implement fallback mechanism for !fit case When searching for the best lowest_mask with a fitness_fn passed, make sure we record the lowest_level that returns a valid lowest_mask so that we can use that as a fallback in case we fail to find a fitting CPU at all levels. The intention in the original patch was not to allow a down migration to an unfitting CPU. But this missed the case where we are already running on an unfitting one. With this change, RT tasks can still move between unfitting CPUs when they're already running on such a CPU. And as Steve suggested, to adhere to the strict priority rules of RT, if a task is already running on a fitting CPU but due to priority it can't run on it, allow it to downmigrate to an unfitting CPU so it can run. Bug: 120440300 Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: `804d402fb6` ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-2-qais.yousef@arm.com Link: https://lore.kernel.org/lkml/20200203142712.a7yvlyo2y3le5cpn@e107158-lin/ (cherry picked from commit `d9cb236b94`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ifa10286907644a01e75e37c9f523188fd7b43469	2020-04-03 16:14:11 +01:00
Greg Kroah-Hartman	877f28596d	bpf: Explicitly memset some bpf info structures declared on the stack commit `5c6f258879` upstream. Trying to initialize a structure with "= {};" will not always clean out all padding locations in a structure. So be explicit and call memset to initialize everything for a number of bpf information structures that are then copied from userspace, sometimes from smaller memory locations than the size of the structure. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:11:01 +02:00
Greg Kroah-Hartman	e92528a898	bpf: Explicitly memset the bpf_attr structure commit `8096f22942` upstream. For the bpf syscall, we are relying on the compiler to properly zero out the bpf_attr union that we copy userspace data into. Unfortunately that doesn't always work properly, padding and other oddities might not be correctly zeroed, and in some tests odd things have been found when the stack is pre-initialized to other values. Fix this by explicitly memsetting the structure to 0 before using it. Reported-by: Maciej Żenczykowski <maze@google.com> Reported-by: John Stultz <john.stultz@linaro.org> Reported-by: Alexander Potapenko <glider@google.com> Reported-by: Alistair Delva <adelva@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://android-review.googlesource.com/c/kernel/common/+/1235490 Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:11:01 +02:00
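The pattern in the two memset commits above can be illustrated with a small userspace sketch. It assumes a made-up `struct demo_info` (not one of the real bpf structures) whose layout has padding between its two members on common ABIs; `demo_info_init()` and `demo_padding_is_zero()` are hypothetical helpers. Strictly speaking, ISO C leaves padding bytes unspecified after a member store, so this shows the behavior of common compilers rather than a language guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on most ABIs there is padding after 'tag'. */
struct demo_info {
	char tag;
	long value;
};

/* Zero the whole object first, padding included, then assign the
 * named members. This mirrors the "explicit memset" approach. */
static void demo_info_init(struct demo_info *out, char tag, long value)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->tag = tag;
	out->value = value;
}

/* Returns 1 if every byte between 'tag' and 'value' is zero. */
static int demo_padding_is_zero(const struct demo_info *info)
{
	const unsigned char *p = (const unsigned char *)info;
	size_t i;

	for (i = offsetof(struct demo_info, tag) + sizeof(info->tag);
	     i < offsetof(struct demo_info, value); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

A caller that later copies the whole struct to another address space then does not leak stale stack bytes through the padding.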
Greg Kroah-Hartman	a469d42c7c	Merge 5.4.29 into android-5.4 Changes in 5.4.29 mmc: core: Allow host controllers to require R1B for CMD6 mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for erase/trim/discard mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command mmc: sdhci-omap: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY mmc: sdhci-tegra: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY ACPI: PM: s2idle: Rework ACPI events synchronization cxgb4: fix throughput drop during Tx backpressure cxgb4: fix Txq restart check during backpressure geneve: move debug check after netdev unregister hsr: fix general protection fault in hsr_addr_is_self() ipv4: fix a RCU-list lock in inet_dump_fib() macsec: restrict to ethernet devices mlxsw: pci: Only issue reset when system is ready mlxsw: spectrum_mr: Fix list iteration in error path net/bpfilter: fix dprintf usage for /dev/kmsg net: cbs: Fix software cbs to consider packet sending time net: dsa: Fix duplicate frames flooded by learning net: dsa: mt7530: Change the LINK bit to reflect the link status net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop net: ena: Add PCI shutdown handler to allow safe kexec net: mvneta: Fix the case where the last poll did not process all rx net/packet: tpacket_rcv: avoid a producer race condition net: phy: dp83867: w/a for fld detect threshold bootstrapping issue net: phy: mdio-bcm-unimac: Fix clock handling net: phy: mdio-mux-bcm-iproc: check clk_prepare_enable() return value net: qmi_wwan: add support for ASKEY WWHC050 net/sched: act_ct: Fix leak of ct zone template on replace net_sched: cls_route: remove the right filter from hashtable net_sched: hold rtnl lock in tcindex_partial_destroy_work() net_sched: keep alloc_hash updated after hash allocation net: stmmac: dwmac-rk: fix error path in rk_gmac_probe NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() r8169: re-enable MSI on RTL8168c slcan: not call free_netdev before rtnl_unlock in slcan_open tcp: 
also NULL skb->dev when copy was needed tcp: ensure skb->dev is NULL before leaving TCP stack tcp: repair: fix TCP_QUEUE_SEQ implementation vxlan: check return value of gro_cells_init() bnxt_en: Fix Priority Bytes and Packets counters in ethtool -S. bnxt_en: fix memory leaks in bnxt_dcbnl_ieee_getets() bnxt_en: Return error if bnxt_alloc_ctx_mem() fails. bnxt_en: Free context memory after disabling PCI in probe error path. bnxt_en: Reset rings if ring reservation fails during open() net: ip_gre: Separate ERSPAN newlink / changelink callbacks net: ip_gre: Accept IFLA_INFO_DATA-less configuration hsr: use rcu_read_lock() in hsr_get_node_{list/status}() hsr: add restart routine into hsr_get_node_list() hsr: set .netnsok flag net/mlx5: DR, Fix postsend actions write length net/mlx5e: Enhance ICOSQ WQE info fields net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset net/mlx5e: Fix ICOSQ recovery flow with Striding RQ net/mlx5e: Do not recover from a non-fatal syndrome cgroup-v1: cgroup_pidlist_next should update position index nfs: add minor version to nfs_server_key for fscache cpupower: avoid multiple definition with gcc -fno-common drivers/of/of_mdio.c:fix of_mdiobus_register() cgroup1: don't call release_agent when it is "" dt-bindings: net: FMan erratum A050385 arm64: dts: ls1043a: FMan erratum A050385 fsl/fman: detect FMan erratum A050385 drm/amd/display: update soc bb for nv14 drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20 drm/exynos: Fix cleanup of IOMMU related objects iommu/vt-d: Silence RCU-list debugging warnings s390/qeth: don't reset default_out_queue s390/qeth: handle error when backing RX buffer scsi: ipr: Fix softlockup when rescanning devices in petitboot mac80211: Do not send mesh HWMP PREQ if HWMP is disabled dpaa_eth: Remove unnecessary boolean expression in dpaa_get_headroom sxgbe: Fix off by one in samsung driver strncpy size arg net: hns3: fix "tc qdisc del" failed issue iommu/vt-d: Fix debugfs register reads iommu/vt-d: 
Populate debugfs if IOMMUs are detected iwlwifi: mvm: fix non-ACPI function i2c: hix5hd2: add missed clk_disable_unprepare in remove Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger() Input: fix stale timestamp on key autorepeat events Input: synaptics - enable RMI on HP Envy 13-ad105ng Input: avoid BIT() macro usage in the serio.h UAPI header IB/rdmavt: Free kernel completion queue when done RDMA/core: Fix missing error check on dev_set_name() gpiolib: Fix irq_disable() semantics RDMA/nl: Do not permit empty devices names during RDMA_NLDEV_CMD_NEWLINK/SET RDMA/mad: Do not crash if the rdma device does not have a umad interface ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL ceph: fix memory leak in ceph_cleanup_snapid_map() ARM: dts: dra7: Add bus_dma_limit for L3 bus ARM: dts: omap5: Add bus_dma_limit for L3 bus x86/ioremap: Fix CONFIG_EFI=n build perf probe: Fix to delete multiple probe event perf probe: Do not depend on dwfl_module_addrsym() rtlwifi: rtl8188ee: Fix regression due to commit `d1d1a96bdb` tools: Let O= makes handle a relative path with -C option scripts/dtc: Remove redundant YYLOC global declaration scsi: sd: Fix optimal I/O size for devices that change reported values nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type mac80211: drop data frames without key on encrypted links mac80211: mark station unauthorized before key removal mm/swapfile.c: move inode_lock out of claim_swapfile drivers/base/memory.c: indicate all memory blocks as removable mm/sparse: fix kernel crash with pfn_section_valid check mm: fork: fix kernel_stack memcg stats for various stack implementations gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model bpf: Fix cgroup ref leak in cgroup_bpf_inherit on out-of-memory RDMA/core: Ensure security pkey modify is not lost afs: Fix 
handling of an abort from a service handler genirq: Fix reference leaks on irq affinity notifiers xfrm: handle NETDEV_UNREGISTER for xfrm device vti[6]: fix packet tx through bpf_redirect() in XinY cases RDMA/mlx5: Fix the number of hwcounters of a dynamic counter RDMA/mlx5: Fix access to wrong pointer while performing flush due to error RDMA/mlx5: Block delay drop to unprivileged users xfrm: fix uctx len check in verify_sec_ctx_len xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire xfrm: policy: Fix doulbe free in xfrm_policy_timer afs: Fix client call Rx-phase signal handling afs: Fix some tracing details afs: Fix unpinned address list during probing ieee80211: fix HE SPR size calculation mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX netfilter: flowtable: reload ip{v6}h in nf_flow_tuple_ip{v6} netfilter: nft_fwd_netdev: validate family and chain type netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress i2c: nvidia-gpu: Handle timeout correctly in gpu_i2c_check_status() bpf, x32: Fix bug with JMP32 JSET BPF_X checking upper bits bpf: Initialize storage pointers to NULL to prevent freeing garbage pointer bpf/btf: Fix BTF verification of enum members in struct/union bpf, sockmap: Remove bucket->lock from sock_{hash\|map}_free ARM: dts: sun8i-a83t-tbs-a711: Fix USB OTG mode detection vti6: Fix memory leak of skb if input policy check fails r8169: fix PHY driver check on platforms w/o module softdeps clocksource/drivers/hyper-v: Untangle stimers and timesync from clocksources bpf: Undo incorrect __reg_bound_offset32 handling USB: serial: option: add support for ASKEY WWHC050 USB: serial: option: add BroadMobi BM806U USB: serial: option: add Wistron Neweb D19Q1 USB: cdc-acm: restore capability check order USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback usb: musb: fix crash with highmen PIO and usbmon media: flexcop-usb: fix endpoint sanity check media: usbtv: fix control-message timeouts 
staging: kpc2000: prevent underflow in cpld_reconfigure() staging: rtl8188eu: Add ASUS USB-N10 Nano B1 to device table staging: wlan-ng: fix ODEBUG bug in prism2sta_disconnect_usb staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback ahci: Add Intel Comet Lake H RAID PCI ID libfs: fix infoleak in simple_attr_read() media: ov519: add missing endpoint sanity checks media: dib0700: fix rc endpoint lookup media: stv06xx: add missing descriptor sanity checks media: xirlink_cit: add missing descriptor sanity checks media: v4l2-core: fix a use-after-free bug of sd->devnode net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build Linux 5.4.29 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iebce1f1b95935de9229e2deb83dae66cf8661a88	2020-04-02 14:26:14 +02:00
Daniel Borkmann	8d62a8c748	bpf: Undo incorrect __reg_bound_offset32 handling commit `f2d67fec0b` upstream. Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: (b7) r0 = 808464432 1: (7f) r0 >>= r0 2: (14) w0 -= 808464432 3: (07) r0 += 808464432 4: (b7) r1 = 808464432 5: (de) if w1 s<= w0 goto pc+0 R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0 6: (07) r0 += -2144337872 7: (14) w0 -= -1607454672 8: (25) if r0 > 0x30303030 goto pc+0 R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0 9: (76) if w0 s>= 0x303030 goto pc+2 12: (95) exit from 8 to 9: safe from 5 to 6: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020;0x10000001f)) R1_w=invP808464432 R10=fp0 6: (07) r0 += -2144337872 7: (14) w0 -= -1607454672 8: (25) if r0 > 0x30303030 goto pc+0 R0_w=invP(id=0,umin_value=271581184,umax_value=271581311,var_off=(0x10300000;0x7f)) R1_w=invP808464432 R10=fp0 9: safe from 8 to 9: safe verification time 589 usec stack depth 0 processed 17 insns (limit 1000000) [...] The underlying program was xlated as follows: # bpftool p d x i 9 0: (b7) r0 = 808464432 1: (7f) r0 >>= r0 2: (14) w0 -= 808464432 3: (07) r0 += 808464432 4: (b7) r1 = 808464432 5: (de) if w1 s<= w0 goto pc+0 6: (07) r0 += -2144337872 7: (14) w0 -= -1607454672 8: (25) if r0 > 0x30303030 goto pc+0 9: (76) if w0 s>= 0x303030 goto pc+2 10: (05) goto pc-1 11: (05) goto pc-1 12: (95) exit The verifier rewrote original instructions it recognized as dead code with 'goto pc-1', but reality differs from verifier simulation in that we're actually able to trigger a hang due to hitting the 'goto pc-1' instructions. Taking different examples to make the issue more obvious: in this example we're probing bounds on a completely unknown scalar variable in r1: [...] 
5: R0_w=inv1 R1_w=inv(id=0) R10=fp0 5: (18) r2 = 0x4000000000 7: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R10=fp0 7: (18) r3 = 0x2000000000 9: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R10=fp0 9: (18) r4 = 0x400 11: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R10=fp0 11: (18) r5 = 0x200 13: R0_w=inv1 R1_w=inv(id=0) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0 13: (2d) if r1 > r2 goto pc+4 R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0 14: R0_w=inv1 R1_w=inv(id=0,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0 14: (ad) if r1 < r3 goto pc+3 R0_w=inv1 R1_w=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2_w=inv274877906944 R3_w=inv137438953472 R4_w=inv1024 R5_w=inv512 R10=fp0 15: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7fffffffff)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0 15: (2e) if w1 > w4 goto pc+2 R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0 16: R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0 16: (ae) if w1 < w5 goto pc+1 R0=inv1 R1=inv(id=0,umin_value=137438953472,umax_value=274877906944,var_off=(0x0; 0x7f00000000)) R2=inv274877906944 R3=inv137438953472 R4=inv1024 R5=inv512 R10=fp0 [...] We're first probing lower/upper bounds via jmp64, later we do a similar check via jmp32 and examine the resulting var_off there. After fall-through in insn 14, we get the following bounded r1 with 0x7fffffffff unknown marked bits in the variable section. 
Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000: max: 0b100000000000000000000000000000000000000 / 0x4000000000 var: 0b111111111111111111111111111111111111111 / 0x7fffffffff min: 0b010000000000000000000000000000000000000 / 0x2000000000 Now, in insn 15 and 16, we perform a similar probe with lower/upper bounds in jmp32. Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and w1 <= 0x400 and w1 >= 0x200: max: 0b100000000000000000000000000000000000000 / 0x4000000000 var: 0b111111100000000000000000000000000000000 / 0x7f00000000 min: 0b010000000000000000000000000000000000000 / 0x2000000000 The lower/upper bounds haven't changed since they have high bits set in u64 space and the jmp32 tests can only refine bounds in the low bits. However, for the var part the expectation would have been 0x7f000007ff or something less precise up to 0x7fffffffff. An outcome of 0x7f00000000 is not correct since it would contradict the earlier probed bounds where we know that the result should have been in [0x200,0x400] in u32 space. Therefore, tests with such info will lead to wrong verifier assumptions later on like falsely predicting conditional jumps to be always taken, etc. The issue here is that __reg_bound_offset32()'s implementation from commit `581738a681` ("bpf: Provide better register bounds after jmp32 instructions") makes an incorrect range assumption: static void __reg_bound_offset32(struct bpf_reg_state *reg) { u64 mask = 0xffffFFFF; struct tnum range = tnum_range(reg->umin_value & mask, reg->umax_value & mask); struct tnum lo32 = tnum_cast(reg->var_off, 4); struct tnum hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32); reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range)); } In the above walk-through example, __reg_bound_offset32() as-is chose a range after masking with 0xffffffff of [0x0,0x0] since umin:0x2000000000 and umax:0x4000000000 and therefore the lo32 part was clamped to 0x0 as well. 
However, in the umin:0x2000000000 and umax:0x4000000000 range above we'd end up with an actual possible interval of [0x0,0xffffffff] for u32 space instead. In case of the original reproducer, the situation looked as follows at insn 5 for r0: [...] 5: R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0 0x30303030 0x13030302f 5: (de) if w1 s<= w0 goto pc+0 R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x30303020; 0x10000001f)) R1_w=invP808464432 R10=fp0 0x30303030 0x13030302f [...] After the fall-through, we similarly forced the var_off result into the wrong range [0x30303030,0x3030302f] suggesting later on that fixed bits must only be of 0x30303020 with 0x10000001f unknowns whereas such assumption can only be made when both bounds in hi32 range match. Originally, I was thinking to fix this by moving reg into a temp reg and using the proper coerce_reg_to_size() helper on the temp reg where we can then based on that define the range tnum for later intersection: static void __reg_bound_offset32(struct bpf_reg_state *reg) { struct bpf_reg_state tmp = *reg; struct tnum lo32, hi32, range; coerce_reg_to_size(&tmp, 4); range = tnum_range(tmp.umin_value, tmp.umax_value); lo32 = tnum_cast(reg->var_off, 4); hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32); reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range)); } In the case of the concrete example, this gives us a more conservative unknown section. Thus, after knowing r1 <= 0x4000000000 and r1 >= 0x2000000000 and w1 <= 0x400 and w1 >= 0x200: max: 0b100000000000000000000000000000000000000 / 0x4000000000 var: 0b111111111111111111111111111111111111111 / 0x7fffffffff min: 0b010000000000000000000000000000000000000 / 0x2000000000 However, the new __reg_bound_offset32() above has no effect on refining the knowledge of the register contents. 
Meaning, if the bounds in hi32 range mismatch we'll get the identity function given the range reg spans [0x0,0xffffffff] and we cast var_off into lo32 only to later on binary or it again with the hi32. Likewise, if the bounds in hi32 range match, then we mask both bounds with 0xffffffff, use the resulting umin/umax for the range to later intersect the lo32 with it. However, the _prior_ call to __reg_bound_offset() already did such an intersection on the full reg and we therefore would only repeat the same operation on the lo32 part twice. Given this has no effect and the original commit had false assumptions, this patch reverts the code entirely which is also more straightforward for stable trees: apparently `581738a681` got auto-selected by Sasha's ML system and misclassified as a fix, so it got sucked into v5.4 where it should never have landed. A revert is low-risk also from a user PoV since it requires a recent kernel and llc to opt-into -mcpu=v3 BPF CPU to generate jmp32 instructions. A proper bounds refinement would need a significantly more complex approach which is currently being worked on, but is not stable material [0]. Hence a revert is the best option for stable. After the revert, the original reported program gets rejected as follows: 1: (7f) r0 >>= r0 2: (14) w0 -= 808464432 3: (07) r0 += 808464432 4: (b7) r1 = 808464432 5: (de) if w1 s<= w0 goto pc+0 R0_w=invP(id=0,umin_value=808464432,umax_value=5103431727,var_off=(0x0; 0x1ffffffff)) R1_w=invP808464432 R10=fp0 6: (07) r0 += -2144337872 7: (14) w0 -= -1607454672 8: (25) if r0 > 0x30303030 goto pc+0 R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x3fffffff)) R1_w=invP808464432 R10=fp0 9: (76) if w0 s>= 0x303030 goto pc+2 R0=invP(id=0,umax_value=3158063,var_off=(0x0; 0x3fffff)) R1=invP808464432 R10=fp0 10: (30) r0 = (u8 *)skb[808464432] BPF_LD_[ABS|IND] uses reserved fields processed 11 insns (limit 1000000) [...] 
[0] https://lore.kernel.org/bpf/158507130343.15666.8018068546764556975.stgit@john-Precision-5820-Tower/T/ Fixes: `581738a681` ("bpf: Provide better register bounds after jmp32 instructions") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330160324.15259-2-daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:13 +02:00
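The range mistake discussed in the __reg_bound_offset32 commit above can be reproduced with plain integer arithmetic. The sketch below is not verifier code and does not use tnum; `struct u32_range`, `masked_bounds()`, and `sound_lo32_range()` are hypothetical names. The first helper masks both 64-bit bounds as the reverted code did; the second only trusts the masked bounds when the upper 32 bits of both bounds match, which is the condition the commit message names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for an unsigned 32-bit interval. */
struct u32_range {
	uint32_t min;
	uint32_t max;
};

/* What the reverted code effectively assumed: truncate both 64-bit
 * bounds to their low 32 bits and treat that as the u32 range. */
static struct u32_range masked_bounds(uint64_t umin, uint64_t umax)
{
	struct u32_range r = { (uint32_t)umin, (uint32_t)umax };

	return r;
}

/* Conservative variant: the truncated bounds are only meaningful
 * when the upper 32 bits of umin and umax are identical; otherwise
 * the low 32 bits can take any value. */
static struct u32_range sound_lo32_range(uint64_t umin, uint64_t umax)
{
	struct u32_range r = { 0, UINT32_MAX };

	assert(umin <= umax);
	if ((umin >> 32) == (umax >> 32)) {
		r.min = (uint32_t)umin;
		r.max = (uint32_t)umax;
	}
	return r;
}
```

For the walk-through values umin 0x2000000000 and umax 0x4000000000, masking yields [0x0,0x0] while the conservative variant yields [0x0,0xffffffff]. For the reproducer's bounds 0x30303030 and 0x13030302f, masking yields the inverted interval [0x30303030,0x3030302f].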
Yoshiki Komachi	657559d632	bpf/btf: Fix BTF verification of enum members in struct/union commit `da6c7faeb1` upstream. btf_enum_check_member() was currently sure to recognize the size of "enum" type members in struct/union as the size of "int" even if its size was packed. This patch fixes BTF enum verification to use the correct size of member in BPF programs. Fixes: `179cde8cef` ("bpf: btf: Check members of struct/union") Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1583825550-18606-2-git-send-email-komachi.yoshiki@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:11 +02:00
Andrii Nakryiko	188aae1f3d	bpf: Initialize storage pointers to NULL to prevent freeing garbage pointer commit `62039c30c1` upstream. Local storage array isn't initialized, so if cgroup storage allocation fails for BPF_CGROUP_STORAGE_SHARED, error handling code will attempt to free uninitialized pointer for BPF_CGROUP_STORAGE_PERCPU storage type. Avoid this by always initializing storage pointers to NULLs. Fixes: `8bad74f984` ("bpf: extend cgroup bpf core to allow multiple cgroup storage types") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200309222756.1018737-1-andriin@fb.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:11 +02:00
Edward Cree	86c7d38c2b	genirq: Fix reference leaks on irq affinity notifiers commit `df81dfcfd6` upstream. The handling of notify->work did not properly maintain notify->kref in two cases: 1) where the work was already scheduled, another irq_set_affinity_locked() would get the ref and (no-op-ly) schedule the work. Thus when irq_affinity_notify() ran, it would drop the original ref but not the additional one. 2) when cancelling the (old) work in irq_set_affinity_notifier(), if there was outstanding work a ref had been got for it but was never put. Fix both by checking the return values of the work handling functions (schedule_work() for (1) and cancel_work_sync() for (2)) and put the extra ref if the return value indicates preexisting work. Fixes: `cd7eab44e9` ("genirq: Add IRQ affinity notifiers") Fixes: `59c39840f5` ("genirq: Prevent use-after-free and work list corruption") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ben Hutchings <ben@decadent.org.uk> Link: https://lkml.kernel.org/r/24f5983f-2ab5-e83a-44ee-a45b5f9300f5@solarflare.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:04 +02:00
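Case (1) of the genirq commit above can be modeled in userspace C. This is a simplified sketch, not the kernel implementation: `struct demo_notify` and the `demo_*` functions are hypothetical stand-ins for notify->kref, notify->work, schedule_work(), and the work handler. The one kernel fact it relies on is that schedule_work() returns false when the work item is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a notifier with a refcount and a
 * single work item. */
struct demo_notify {
	int refs;
	bool work_pending;
};

/* Models schedule_work(): returns false if already queued. */
static bool demo_schedule_work(struct demo_notify *n)
{
	if (n->work_pending)
		return false;
	n->work_pending = true;
	return true;
}

/* Models the fixed affinity-change path: take a ref for the work,
 * and drop it again if no new work was actually queued. */
static void demo_affinity_changed(struct demo_notify *n)
{
	n->refs++;                  /* models kref_get() */
	if (!demo_schedule_work(n))
		n->refs--;          /* models kref_put(): work already queued */
}

/* Models the work handler: clears the pending state and drops the
 * single ref that was taken when the work was queued. */
static void demo_run_work(struct demo_notify *n)
{
	assert(n->work_pending);
	n->work_pending = false;
	n->refs--;
}
```

Without the extra put, two affinity changes before the handler runs would leave the count one too high after the handler drops its single ref.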
Andrii Nakryiko	768e582a99	bpf: Fix cgroup ref leak in cgroup_bpf_inherit on out-of-memory commit `1d8006abaa` upstream. There is no compensating cgroup_bpf_put() for each ancestor cgroup in cgroup_bpf_inherit(). If compute_effective_progs returns error, those cgroups won't be freed ever. Fix it by putting them in cleanup code path. Fixes: `e10360f815` ("bpf: cgroup: prevent out-of-order release of cgroup bpf") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20200309224017.1063297-1-andriin@fb.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:04 +02:00
Roman Gushchin	159aef18f0	mm: fork: fix kernel_stack memcg stats for various stack implementations commit `8380ce4790` upstream. Depending on CONFIG_VMAP_STACK and the THREAD_SIZE / PAGE_SIZE ratio the space for task stacks can be allocated using __vmalloc_node_range(), alloc_pages_node() and kmem_cache_alloc_node(). In the first and the second cases page->mem_cgroup pointer is set, but in the third it's not: memcg membership of a slab page should be determined using the memcg_from_slab_page() function, which looks at page->slab_cache->memcg_params.memcg . In this case, using mod_memcg_page_state() (as in account_kernel_stack()) is incorrect: page->mem_cgroup pointer is NULL even for pages charged to a non-root memory cgroup. It can lead to kernel_stack per-memcg counters permanently showing 0 on some architectures (depending on the configuration). In order to fix it, let's introduce a mod_memcg_obj_state() helper, which takes a pointer to a kernel object as a first argument, uses mem_cgroup_from_obj() to get a RCU-protected memcg pointer and calls mod_memcg_state(). It allows to handle all possible configurations (CONFIG_VMAP_STACK and various THREAD_SIZE/PAGE_SIZE values) without spilling any memcg/kmem specifics into fork.c . Note: This is a special version of the patch created for stable backports. 
It contains code from the following two patches: - mm: memcg/slab: introduce mem_cgroup_from_obj() - mm: fork: fix kernel_stack memcg stats for various stack implementations [guro@fb.com: introduce mem_cgroup_from_obj()] Link: http://lkml.kernel.org/r/20200324004221.GA36662@carbon.dhcp.thefacebook.com Fixes: `4d96ba3530` ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages") Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200303233550.251375-1-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-01 11:02:03 +02:00
Tycho Andersen	b82e91ae63	cgroup1: don't call release_agent when it is "" [ Upstream commit `2e5383d790` ] Older (and maybe current) versions of systemd set release_agent to "" when shutting down, but do not set notify_on_release to 0. Since `64e90a8acb` ("Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()"), we filter out such calls when the user mode helper path is "". However, when used in conjunction with an actual (i.e. non "") STATIC_USERMODEHELPER, the path is never "", so the real usermode helper will be called with argv[0] == "". Let's avoid this by not invoking the release_agent when it is "". Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-01 11:01:51 +02:00
Vasily Averin	b51274fabe	cgroup-v1: cgroup_pidlist_next should update position index [ Upstream commit `db8dd96972` ] if seq_file .next function does not change position index, read after some lseek can generate unexpected output. # mount | grep cgroup # dd if=/mnt/cgroup.procs bs=1 # normal output ... 1294 1295 1296 1304 1382 584+0 records in 584+0 records out 584 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 83 <<< generates end of last line 1383 <<< ... and whole last line once again 0+1 records in 0+1 records out 8 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 1386 <<< generates last line anyway 0+1 records in 0+1 records out 5 bytes copied https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-01 11:01:50 +02:00
Todd Kjos	669d93f855	Revert "sched/core: Prevent race condition between cpuset and __sched_setscheduler()" This reverts commit `710da3c8ea`. When changing a thread's scheduling priority, binder calls sched_setscheduler_nocheck() while holding the node lock and proc inner lock. This was safe until v5.3 when this change was introduced where cpuset_read_lock() is called in this path which can sleep. This patch was introduced to fix a possible accounting error in sched deadline (potential oversell of CPU bandwidth) due to a race condition between cpusets and __sched_setscheduler(). This is not an issue for Android. This should be fixed in the binder driver, but that may take some time. Bug: 151861772 Change-Id: Ica1ef71b3cdcdc509b341ea1b57c41f8ee73794a Signed-off-by: Todd Kjos <tkjos@google.com>	2020-03-30 04:49:04 +00:00
Greg Kroah-Hartman	2341be6d9d	Merge 5.4.28 into android-5.4 Changes in 5.4.28 locks: fix a potential use-after-free problem when wakeup a waiter locks: reinstate locks_delete_block optimization spi: spi-omap2-mcspi: Support probe deferral for DMA channels drm/mediatek: Find the cursor plane instead of hard coding it phy: ti: gmii-sel: fix set of copy-paste errors phy: ti: gmii-sel: do not fail in case of gmii ARM: dts: dra7-l4: mark timer13-16 as pwm capable spi: qup: call spi_qup_pm_resume_runtime before suspending powerpc: Include .BTF section cifs: fix potential mismatch of UNC paths cifs: add missing mount option to /proc/mounts ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes spi: pxa2xx: Add CS control clock quirk spi/zynqmp: remove entry that causes a cs glitch drm/exynos: dsi: propagate error value and silence meaningless warning drm/exynos: dsi: fix workaround for the legacy clock name drm/exynos: hdmi: don't leak enable HDMI_EN regulator if probe fails drivers/perf: fsl_imx8_ddr: Correct the CLEAR bit definition drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer altera-stapl: altera_get_note: prevent write beyond end of 'key' dm bio record: save/restore bi_end_io and bi_integrity dm integrity: use dm_bio_record and dm_bio_restore riscv: avoid the PIC offset of static percpu data in module beyond 2G limits ASoC: stm32: sai: manage rebind issue spi: spi_register_controller(): free bus id on error paths riscv: Force flat memory model with no-mmu riscv: Fix range looking for kernel image memblock drm/amdgpu: clean wptr on wb when gpu recovery drm/amd/display: Clear link settings on MST disable connector drm/amd/display: fix dcc swath size calculations on dcn1 xenbus: req->body should be updated before req->state xenbus: req->err should be updated before req->state block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() parse-maintainers: Mark as executable binderfs: use refcount for binder control devices too Revert 
"drm/fbdev: Fallback to non tiled mode if all tiles not present" USB: Disable LPM on WD19's Realtek Hub usb: quirks: add NO_LPM quirk for RTL8153 based ethernet adapters USB: serial: option: add ME910G1 ECM composition 0x110b usb: host: xhci-plat: add a shutdown USB: serial: pl2303: add device-id for HP LD381 usb: xhci: apply XHCI_SUSPEND_DELAY to AMD XHCI controller 1022:145c usb: typec: ucsi: displayport: Fix NULL pointer dereference usb: typec: ucsi: displayport: Fix a potential race during registration USB: cdc-acm: fix close_delay and closing_wait units in TIOCSSERIAL USB: cdc-acm: fix rounding error in TIOCSSERIAL ALSA: line6: Fix endless MIDI read loop ALSA: hda/realtek - Enable headset mic of Acer X2660G with ALC662 ALSA: hda/realtek - Enable the headset of Acer N50-600 with ALC662 ALSA: seq: virmidi: Fix running status after receiving sysex ALSA: seq: oss: Fix running status after receiving sysex ALSA: pcm: oss: Avoid plugin buffer overflow ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks tty: fix compat TIOCGSERIAL leaking uninitialized memory tty: fix compat TIOCGSERIAL checking wrong function ptr iio: chemical: sps30: fix missing triggered buffer dependency iio: st_sensors: remap SMO8840 to LIS2DH12 iio: trigger: stm32-timer: disable master mode when stopping iio: accel: adxl372: Set iio_chan BE iio: magnetometer: ak8974: Fix negative raw values in sysfs iio: adc: stm32-dfsdm: fix sleep in atomic context iio: adc: at91-sama5d2_adc: fix differential channels in triggered mode iio: light: vcnl4000: update sampling periods for vcnl4200 iio: light: vcnl4000: update sampling periods for vcnl4040 mmc: rtsx_pci: Fix support for speed-modes that relies on tuning mmc: sdhci-of-at91: fix cd-gpios for SAMA5D2 mmc: sdhci-cadence: set SDHCI_QUIRK2_PRESET_VALUE_BROKEN for UniPhier CIFS: fiemap: do not return EINVAL if get nothing kbuild: Disable -Wpointer-to-enum-cast staging: rtl8188eu: Add device id for MERCUSYS MW150US v2 staging: greybus: 
loopback_test: fix poll-mask build breakage staging/speakup: fix get_word non-space look-ahead intel_th: msu: Fix the unexpected state warning intel_th: Fix user-visible error codes intel_th: pci: Add Elkhart Lake CPU support modpost: move the namespace field in Module.symvers last rtc: max8907: add missing select REGMAP_IRQ arm64: compat: Fix syscall number of compat_clock_getres xhci: Do not open code __print_symbolic() in xhci trace events btrfs: fix log context list corruption after rename whiteout error drm/amd/amdgpu: Fix GPR read from debugfs (v2) drm/lease: fix WARNING in idr_destroy stm class: sys-t: Fix the use of time_after() memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event mm, memcg: fix corruption on 64-bit divisor in memory.high throttling mm, memcg: throttle allocators based on ancestral memory.high mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case mm: do not allow MADV_PAGEOUT for CoW pages epoll: fix possible lost wakeup on epoll_ctl() path mm: slub: be more careful about the double cmpxchg of freelist mm, slub: prevent kmalloc_node crashes and memory leaks page-flags: fix a crash at SetPageError(THP_SWAP) x86/mm: split vmalloc_sync_all() futex: Fix inode life-time issue futex: Unbreak futex hashing ALSA: hda/realtek: Fix pop noise on ALC225 arm64: smp: fix smp_send_stop() behaviour arm64: smp: fix crash_smp_send_stop() behaviour nvmet-tcp: set MSG_MORE only if we actually have more to send drm/bridge: dw-hdmi: fix AVI frame colorimetry staging: greybus: loopback_test: fix potential path truncation staging: greybus: loopback_test: fix potential path truncations Linux 5.4.28 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I5d9d15d6236d8ab7374205c6ceda7efa7a9abb70	2020-03-25 16:12:11 +01:00

31836 Commits