Commit Graph

2743 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
ce5de62e20 Merge 5.4.24 into android-5.4
Changes in 5.4.24
	io_uring: grab ->fs as part of async offload
	EDAC: skx_common: downgrade message importance on missing PCI device
	net: dsa: b53: Ensure the default VID is untagged
	net: fib_rules: Correctly set table field when table number exceeds 8 bits
	net: macb: ensure interface is not suspended on at91rm9200
	net: mscc: fix in frame extraction
	net: phy: restore mdio regs in the iproc mdio driver
	net: sched: correct flower port blocking
	net/tls: Fix to avoid gettig invalid tls record
	nfc: pn544: Fix occasional HW initialization failure
	qede: Fix race between rdma destroy workqueue and link change event
	Revert "net: dev: introduce support for sch BYPASS for lockless qdisc"
	udp: rehash on disconnect
	sctp: move the format error check out of __sctp_sf_do_9_1_abort
	bnxt_en: Improve device shutdown method.
	bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs.
	bonding: add missing netdev_update_lockdep_key()
	net: export netdev_next_lower_dev_rcu()
	bonding: fix lockdep warning in bond_get_stats()
	ipv6: Fix route replacement with dev-only route
	ipv6: Fix nlmsg_flags when splitting a multipath route
	ipmi:ssif: Handle a possible NULL pointer reference
	drm/msm: Set dma maximum segment size for mdss
	sched/core: Don't skip remote tick for idle CPUs
	timers/nohz: Update NOHZ load in remote tick
	sched/fair: Prevent unlimited runtime on throttled group
	dax: pass NOWAIT flag to iomap_apply
	mac80211: consider more elements in parsing CRC
	cfg80211: check wiphy driver existence for drvinfo report
	s390/zcrypt: fix card and queue total counter wrap
	qmi_wwan: re-add DW5821e pre-production variant
	qmi_wwan: unconditionally reject 2 ep interfaces
	NFSv4: Fix races between open and dentry revalidation
	perf/smmuv3: Use platform_get_irq_optional() for wired interrupt
	perf/x86/intel: Add Elkhart Lake support
	perf/x86/cstate: Add Tremont support
	perf/x86/msr: Add Tremont support
	ceph: do not execute direct write in parallel if O_APPEND is specified
	ARM: dts: sti: fixup sound frame-inversion for stihxxx-b2120.dtsi
	drm/amd/display: Do not set optimized_require to false after plane disable
	RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready()
	drm/amd/display: Check engine is not NULL before acquiring
	drm/amd/display: Limit minimum DPPCLK to 100MHz.
	drm/amd/display: Add initialitions for PLL2 clock source
	amdgpu: Prevent build errors regarding soft/hard-float FP ABI tags
	soc/tegra: fuse: Fix build with Tegra194 configuration
	i40e: Fix the conditional for i40e_vc_validate_vqs_bitmaps
	net: ena: fix potential crash when rxfh key is NULL
	net: ena: fix uses of round_jiffies()
	net: ena: add missing ethtool TX timestamping indication
	net: ena: fix incorrect default RSS key
	net: ena: rss: do not allocate key when not supported
	net: ena: rss: fix failure to get indirection table
	net: ena: rss: store hash function as values and not bits
	net: ena: fix incorrectly saving queue numbers when setting RSS indirection table
	net: ena: fix corruption of dev_idx_to_host_tbl
	net: ena: ethtool: use correct value for crc32 hash
	net: ena: ena-com.c: prevent NULL pointer dereference
	ice: update Unit Load Status bitmask to check after reset
	cifs: Fix mode output in debugging statements
	cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE
	mac80211: fix wrong 160/80+80 MHz setting
	net: hns3: add management table after IMP reset
	net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples()
	nvme/tcp: fix bug on double requeue when send fails
	nvme: prevent warning triggered by nvme_stop_keep_alive
	nvme/pci: move cqe check after device shutdown
	ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
	audit: fix error handling in audit_data_to_entry()
	audit: always check the netlink payload length in audit_receive_msg()
	ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro
	ACPI: watchdog: Fix gas->access_width usage
	KVM: VMX: check descriptor table exits on instruction emulation
	HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock
	HID: core: fix off-by-one memset in hid_report_raw_event()
	HID: core: increase HID report buffer size to 8KiB
	drm/amdgpu: Drop DRIVER_USE_AGP
	drm/radeon: Inline drm_get_pci_dev
	macintosh: therm_windtunnel: fix regression when instantiating devices
	tracing: Disable trace_printk() on post poned tests
	Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
	amdgpu/gmc_v9: save/restore sdpif regs during S3
	cpufreq: Fix policy initialization for internal governor drivers
	io_uring: fix 32-bit compatability with sendmsg/recvmsg
	netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports
	net/smc: transfer fasync_list in case of fallback
	vhost: Check docket sk_family instead of call getname
	netfilter: ipset: Fix forceadd evaluation path
	netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()
	HID: alps: Fix an error handling path in 'alps_input_configured()'
	HID: hiddev: Fix race in in hiddev_disconnect()
	MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'
	i2c: altera: Fix potential integer overflow
	i2c: jz4780: silence log flood on txabrt
	drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime
	drm/i915/gvt: Separate display reset from ALL_ENGINES reset
	nl80211: fix potential leak in AP start
	mac80211: Remove a redundant mutex unlock
	kbuild: fix DT binding schema rule to detect command line changes
	hv_netvsc: Fix unwanted wakeup in netvsc_attach()
	usb: charger: assign specific number for enum value
	nvme-pci: Hold cq_poll_lock while completing CQEs
	s390/qeth: vnicc Fix EOPNOTSUPP precedence
	net: netlink: cap max groups which will be considered in netlink_bind()
	net: atlantic: fix use after free kasan warn
	net: atlantic: fix potential error handling
	net: atlantic: fix out of range usage of active_vlans array
	net/smc: no peer ID in CLC decline for SMCD
	net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE
	selftests: Install settings files to fix TIMEOUT failures
	kbuild: remove header compile test
	kbuild: move headers_check rule to usr/include/Makefile
	kbuild: remove unneeded variable, single-all
	kbuild: make single target builds even faster
	namei: only return -ECHILD from follow_dotdot_rcu()
	mwifiex: drop most magic numbers from mwifiex_process_tdls_action_frame()
	mwifiex: delete unused mwifiex_get_intf_num()
	KVM: SVM: Override default MMIO mask if memory encryption is enabled
	KVM: Check for a bad hva before dropping into the ghc slow path
	sched/fair: Optimize select_idle_cpu
	f2fs: fix to add swap extent correctly
	RDMA/hns: Simplify the calculation and usage of wqe idx for post verbs
	RDMA/hns: Bugfix for posting a wqe with sge
	drivers: net: xgene: Fix the order of the arguments of 'alloc_etherdev_mqs()'
	ima: ima/lsm policy rule loading logic bug fixes
	kprobes: Set unoptimized flag after unoptimizing code
	lib/vdso: Make __arch_update_vdso_data() logic understandable
	lib/vdso: Update coarse timekeeper unconditionally
	pwm: omap-dmtimer: put_device() after of_find_device_by_node()
	perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc
	perf ui gtk: Add missing zalloc object
	x86/resctrl: Check monitoring static key in the MBM overflow handler
	KVM: x86: Remove spurious kvm_mmu_unload() from vcpu destruction path
	KVM: x86: Remove spurious clearing of async #PF MSR
	rcu: Allow only one expedited GP to run concurrently with wakeups
	ubifs: Fix ino_t format warnings in orphan_delete()
	thermal: db8500: Depromote debug print
	thermal: brcmstb_thermal: Do not use DT coefficients
	netfilter: nft_tunnel: no need to call htons() when dumping ports
	netfilter: nf_flowtable: fix documentation
	bus: tegra-aconnect: Remove PM_CLK dependency
	xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLE
	locking/lockdep: Fix lockdep_stats indentation problem
	mm/debug.c: always print flags in dump_page()
	mm/gup: allow FOLL_FORCE for get_user_pages_fast()
	mm/huge_memory.c: use head to check huge zero page
	mm, thp: fix defrag setting if newline is not used
	kvm: nVMX: VMWRITE checks VMCS-link pointer before VMCS field
	kvm: nVMX: VMWRITE checks unsupported field before read-only field
	blktrace: Protect q->blk_trace with RCU
	Linux 5.4.24

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0b31557e16c72bd30d1e6938ed199918ff326d88
2020-03-05 17:42:40 +01:00
Cheng Jian
a25ae55390 sched/fair: Optimize select_idle_cpu
commit 60588bfa22 upstream.

select_idle_cpu() will scan the LLC domain for idle CPUs,
it's always expensive. so the next commit :

	1ad3aaf3fc ("sched/core: Implement new approach to scale select_idle_cpu()")

introduces a way to limit how many CPUs we scan.

But it consume some CPUs out of 'nr' that are not allowed
for the task and thus waste our attempts. The function
always return nr_cpumask_bits, and we can't find a CPU
which our task is allowed to run.

Cpumask may be too big, similar to select_idle_core(), use
per_cpu_ptr 'select_idle_mask' to prevent stack overflow.

Fixes: 1ad3aaf3fc ("sched/core: Implement new approach to scale select_idle_cpu()")
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20191213024530.28052-1-cj.chengjian@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-05 16:43:48 +01:00
Vincent Guittot
36b5fcc140 sched/fair: Prevent unlimited runtime on throttled group
[ Upstream commit 2a4b03ffc6 ]

When a running task is moved on a throttled task group and there is no
other task enqueued on the CPU, the task can keep running using 100% CPU
whatever the allocated bandwidth for the group and although its cfs rq is
throttled. Furthermore, the group entity of the cfs_rq and its parents are
not enqueued but only set as curr on their respective cfs_rqs.

We have the following sequence:

sched_move_task
  -dequeue_task: dequeue task and group_entities.
  -put_prev_task: put task and group entities.
  -sched_change_group: move task to new group.
  -enqueue_task: enqueue only task but not group entities because cfs_rq is
    throttled.
  -set_next_task : set task and group_entities as current sched_entity of
    their cfs_rq.

Another impact is that the root cfs_rq runnable_load_avg at root rq stays
null because the group_entities are not enqueued. This situation will stay
the same until an "external" event triggers a reschedule. Let trigger it
immediately instead.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/1579011236-31256-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-05 16:43:36 +01:00
Peter Zijlstra (Intel)
166d6008fa timers/nohz: Update NOHZ load in remote tick
[ Upstream commit ebc0f83c78 ]

The way loadavg is tracked during nohz only pays attention to the load
upon entering nohz.  This can be particularly noticeable if full nohz is
entered while non-idle, and then the cpu goes idle and stays that way for
a long time.

Use the remote tick to ensure that full nohz cpus report their deltas
within a reasonable time.

[ swood: Added changelog and removed recheck of stopped tick. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Scott Wood <swood@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/1578736419-14628-3-git-send-email-swood@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-05 16:43:36 +01:00
Scott Wood
5a309e3bf1 sched/core: Don't skip remote tick for idle CPUs
[ Upstream commit 488603b815 ]

This will be used in the next patch to get a loadavg update from
nohz cpus.  The delta check is skipped because idle_sched_class
doesn't update se.exec_start.

Signed-off-by: Scott Wood <swood@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/1578736419-14628-2-git-send-email-swood@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-05 16:43:35 +01:00
Suren Baghdasaryan
e61c236dcf sched/psi: Fix OOB write when writing 0 bytes to PSI files
commit 6fcca0fa48 upstream.

Issuing write() with count parameter set to 0 on any file under
/proc/pressure/ will cause an OOB write because of the access to
buf[buf_size-1] when NUL-termination is performed. Fix this by checking
for buf_size to be non-zero.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20200203212216.7076-1-surenb@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 17:22:21 +01:00
Suren Baghdasaryan
e7ce229bd8 UPSTREAM: sched/psi: Fix OOB write when writing 0 bytes to PSI files
Issuing write() with count parameter set to 0 on any file under
/proc/pressure/ will cause an OOB write because of the access to
buf[buf_size-1] when NUL-termination is performed. Fix this by checking
for buf_size to be non-zero.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20200203212216.7076-1-surenb@google.com

(cherry picked from commit 6fcca0fa48)

Bug: 148159562
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I9ec7acfc6e1083c677a95b0ea1c559ab50152873
2020-02-28 15:35:06 +00:00
Greg Kroah-Hartman
835bd1de9c Merge 5.4.22 into android-5.4
Changes in 5.4.22
	core: Don't skip generic XDP program execution for cloned SKBs
	enic: prevent waking up stopped tx queues over watchdog reset
	net/smc: fix leak of kernel memory to user space
	net: dsa: tag_qca: Make sure there is headroom for tag
	net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS
	net/sched: flower: add missing validation of TCA_FLOWER_FLAGS
	drm/gma500: Fixup fbdev stolen size usage evaluation
	ath10k: Fix qmi init error handling
	wil6210: fix break that is never reached because of zero'ing of a retry counter
	drm/qxl: Complete exception handling in qxl_device_init()
	rcu/nocb: Fix dump_tree hierarchy print always active
	rcu: Fix missed wakeup of exp_wq waiters
	rcu: Fix data-race due to atomic_t copy-by-value
	f2fs: preallocate DIO blocks when forcing buffered_io
	f2fs: call f2fs_balance_fs outside of locked page
	media: meson: add missing allocation failure check on new_buf
	clk: meson: pll: Fix by 0 division in __pll_params_to_rate()
	cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order
	brcmfmac: Fix memory leak in brcmf_p2p_create_p2pdev()
	brcmfmac: Fix use after free in brcmf_sdio_readframes()
	PCI: Fix pci_add_dma_alias() bitmask size
	drm/amd/display: Map ODM memory correctly when doing ODM combine
	leds: pca963x: Fix open-drain initialization
	ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT
	ALSA: ctl: allow TLV read operation for callback type of element in locked case
	gianfar: Fix TX timestamping with a stacked DSA driver
	pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs
	printk: fix exclusive_console replaying
	drm/mipi_dbi: Fix off-by-one bugs in mipi_dbi_blank()
	drm/msm/adreno: fix zap vs no-zap handling
	pxa168fb: Fix the function used to release some memory in an error handling path
	media: ov5640: Fix check for PLL1 exceeding max allowed rate
	media: i2c: mt9v032: fix enum mbus codes and frame sizes
	media: sun4i-csi: Deal with DRAM offset
	media: sun4i-csi: Fix data sampling polarity handling
	media: sun4i-csi: Fix [HV]sync polarity handling
	clk: at91: sam9x60: fix programmable clock prescaler
	powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE number
	clk: meson: meson8b: make the CCF use the glitch-free mali mux
	gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in grgpio_irq_map/unmap()
	iommu/vt-d: Fix off-by-one in PASID allocation
	x86/fpu: Deactivate FPU state after failure during state load
	char/random: silence a lockdep splat with printk()
	media: sti: bdisp: fix a possible sleep-in-atomic-context bug in bdisp_device_run()
	kernel/module: Fix memleak in module_add_modinfo_attrs()
	IB/core: Let IB core distribute cache update events
	pinctrl: baytrail: Do not clear IRQ flags on direct-irq enabled pins
	efi/x86: Map the entire EFI vendor string before copying it
	MIPS: Loongson: Fix potential NULL dereference in loongson3_platform_init()
	sparc: Add .exit.data section.
	net: ethernet: ixp4xx: Standard module init
	raid6/test: fix a compilation error
	uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()
	drm/amdgpu/sriov: workaround on rev_id for Navi12 under sriov
	spi: fsl-lpspi: fix only one cs-gpio working
	drm/nouveau/nouveau: fix incorrect sizeof on args.src an args.dst
	usb: gadget: udc: fix possible sleep-in-atomic-context bugs in gr_probe()
	usb: dwc2: Fix IN FIFO allocation
	clocksource/drivers/bcm2835_timer: Fix memory leak of timer
	drm/amd/display: Clear state after exiting fixed active VRR state
	kselftest: Minimise dependency of get_size on C library interfaces
	jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal
	ext4: fix deadlock allocating bio_post_read_ctx from mempool
	clk: ti: dra7: fix parent for gmac_clkctrl
	x86/sysfb: Fix check for bad VRAM size
	pwm: omap-dmtimer: Simplify error handling
	udf: Allow writing to 'Rewritable' partitions
	dmaengine: fsl-qdma: fix duplicated argument to &&
	wan/hdlc_x25: fix skb handling
	s390/pci: Fix possible deadlock in recover_store()
	powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
	tracing: Fix tracing_stat return values in error handling paths
	tracing: Fix very unlikely race of registering two stat tracers
	ARM: 8952/1: Disable kmemleak on XIP kernels
	ext4, jbd2: ensure panic when aborting with zero errno
	ath10k: Correct the DMA direction for management tx buffers
	rtw88: fix rate mask for 1SS chip
	brcmfmac: sdio: Fix OOB interrupt initialization on brcm43362
	selftests: settings: tests can be in subsubdirs
	rtc: i2c/spi: Avoid inclusion of REGMAP support when not needed
	drm/amd/display: Retrain dongles when SINK_COUNT becomes non-zero
	tracing: Simplify assignment parsing for hist triggers
	nbd: add a flush_workqueue in nbd_start_device
	KVM: s390: ENOTSUPP -> EOPNOTSUPP fixups
	Btrfs: keep pages dirty when using btrfs_writepage_fixup_worker
	drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store
	block, bfq: do not plug I/O for bfq_queues with no proc refs
	kconfig: fix broken dependency in randconfig-generated .config
	clk: qcom: Don't overwrite 'cfg' in clk_rcg2_dfs_populate_freq()
	clk: qcom: rcg2: Don't crash if our parent can't be found; return an error
	drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode
	bpf, sockhash: Synchronize_rcu before free'ing map
	drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table
	ath10k: correct the tlv len of ath10k_wmi_tlv_op_gen_config_pno_start
	drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
	drm/panel: simple: Add Logic PD Type 28 display support
	arm64: dts: rockchip: Fix NanoPC-T4 cooling maps
	modules: lockdep: Suppress suspicious RCU usage warning
	ASoC: intel: sof_rt5682: Add quirk for number of HDMI DAI's
	ASoC: intel: sof_rt5682: Add support for tgl-max98357a-rt5682
	regulator: rk808: Lower log level on optional GPIOs being not available
	net/wan/fsl_ucc_hdlc: reject muram offsets above 64K
	NFC: port100: Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().
	arm64: dts: allwinner: H6: Add PMU mode
	arm64: dts: allwinner: H5: Add PMU node
	arm: dts: allwinner: H3: Add PMU node
	opp: Free static OPPs on errors while adding them
	selinux: ensure we cleanup the internal AVC counters on error in avc_insert()
	arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core
	padata: validate cpumask without removed CPU during offline
	clk: imx: Add correct failure handling for clk based helpers
	ARM: exynos_defconfig: Bring back explicitly wanted options
	ARM: dts: imx6: rdu2: Disable WP for USDHC2 and USDHC3
	ARM: dts: imx6: rdu2: Limit USBH1 to Full Speed
	bus: ti-sysc: Implement quirk handling for CLKDM_NOAUTO
	PCI: iproc: Apply quirk_paxc_bridge() for module as well as built-in
	media: cx23885: Add support for AVerMedia CE310B
	PCI: Add generic quirk for increasing D3hot delay
	PCI: Increase D3 delay for AMD Ryzen5/7 XHCI controllers
	Revert "nfp: abm: fix memory leak in nfp_abm_u32_knode_replace"
	gpu/drm: ingenic: Avoid null pointer deference in plane atomic update
	selftests/net: make so_txtime more robust to timer variance
	media: v4l2-device.h: Explicitly compare grp{id,mask} to zero in v4l2_device macros
	reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling
	samples/bpf: Set -fno-stack-protector when building BPF programs
	r8169: check that Realtek PHY driver module is loaded
	fore200e: Fix incorrect checks of NULL pointer dereference
	netfilter: nft_tunnel: add the missing ERSPAN_VERSION nla_policy
	ALSA: usx2y: Adjust indentation in snd_usX2Y_hwdep_dsp_status
	PCI: Add nr_devfns parameter to pci_add_dma_alias()
	PCI: Add DMA alias quirk for PLX PEX NTB
	b43legacy: Fix -Wcast-function-type
	ipw2x00: Fix -Wcast-function-type
	iwlegacy: Fix -Wcast-function-type
	rtlwifi: rtl_pci: Fix -Wcast-function-type
	orinoco: avoid assertion in case of NULL pointer
	drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV
	clk: qcom: smd: Add missing bimc clock
	ACPICA: Disassembler: create buffer fields in ACPI_PARSE_LOAD_PASS1
	nfsd: Clone should commit src file metadata too
	scsi: ufs: Complete pending requests in host reset and restore path
	scsi: aic7xxx: Adjust indentation in ahc_find_syncrate
	crypto: inside-secure - add unspecified HAS_IOMEM dependency
	drm/mediatek: handle events when enabling/disabling crtc
	clk: renesas: rcar-gen3: Allow changing the RPC[D2] clocks
	ARM: dts: r8a7779: Add device node for ARM global timer
	selinux: ensure we cleanup the internal AVC counters on error in avc_update()
	scsi: lpfc: Fix: Rework setting of fdmi symbolic node name registration
	arm64: dts: qcom: db845c: Enable ath10k 8bit host-cap quirk
	iommu/amd: Check feature support bit before accessing MSI capability registers
	iommu/amd: Only support x2APIC with IVHD type 11h/40h
	iommu/iova: Silence warnings under memory pressure
	clk: actually call the clock init before any other callback of the clock
	dmaengine: Store module owner in dma_device struct
	dmaengine: imx-sdma: Fix memory leak
	bpf: Print error message for bpftool cgroup show
	net: phy: realtek: add logging for the RGMII TX delay configuration
	crypto: chtls - Fixed memory leak
	x86/vdso: Provide missing include file
	PM / devfreq: exynos-ppmu: Fix excessive stack usage
	PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency
	drm/fbdev: Fallback to non tiled mode if all tiles not present
	pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs
	reset: uniphier: Add SCSSI reset control for each channel
	ASoC: soc-topology: fix endianness issues
	fbdev: fix numbering of fbcon options
	RDMA/rxe: Fix error type of mmap_offset
	clk: sunxi-ng: add mux and pll notifiers for A64 CPU clock
	ALSA: sh: Fix unused variable warnings
	clk: Use parent node pointer during registration if necessary
	clk: uniphier: Add SCSSI clock gate for each channel
	ALSA: hda/realtek - Apply mic mute LED quirk for Dell E7xx laptops, too
	ALSA: sh: Fix compile warning wrt const
	net: phy: fixed_phy: fix use-after-free when checking link GPIO
	tools lib api fs: Fix gcc9 stringop-truncation compilation error
	vfio/spapr/nvlink2: Skip unpinning pages on error exit
	ASoC: Intel: sof_rt5682: Ignore the speaker amp when there isn't one.
	ACPI: button: Add DMI quirk for Razer Blade Stealth 13 late 2019 lid switch
	iommu/vt-d: Match CPU and IOMMU paging mode
	iommu/vt-d: Avoid sending invalid page response
	drm/amdkfd: Fix permissions of hang_hws
	mlx5: work around high stack usage with gcc
	RDMA/hns: Avoid printing address of mtt page
	drm: remove the newline for CRC source name.
	usb: dwc3: use proper initializers for property entries
	ARM: dts: stm32: Add power-supply for DSI panel on stm32f469-disco
	usbip: Fix unsafe unaligned pointer usage
	udf: Fix free space reporting for metadata and virtual partitions
	drm/mediatek: Add gamma property according to hardware capability
	staging: rtl8188: avoid excessive stack usage
	IB/hfi1: Add software counter for ctxt0 seq drop
	IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats
	soc/tegra: fuse: Correct straps' address for older Tegra124 device trees
	efi/x86: Don't panic or BUG() on non-critical error conditions
	rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls
	Input: edt-ft5x06 - work around first register access error
	bnxt: Detach page from page pool before sending up the stack
	x86/nmi: Remove irq_work from the long duration NMI handler
	wan: ixp4xx_hss: fix compile-testing on 64-bit
	clocksource: davinci: only enable clockevents once tim34 is initialized
	arm64: dts: rockchip: fix dwmmc clock name for px30
	arm64: dts: rockchip: add reg property to brcmf sub-nodes
	ARM: dts: rockchip: add reg property to brcmf sub node for rk3188-bqedison2qc
	ALSA: usb-audio: Add boot quirk for MOTU M Series
	ASoC: atmel: fix build error with CONFIG_SND_ATMEL_SOC_DMA=m
	raid6/test: fix a compilation warning
	tty: synclinkmp: Adjust indentation in several functions
	tty: synclink_gt: Adjust indentation in several functions
	misc: xilinx_sdfec: fix xsdfec_poll()'s return type
	visorbus: fix uninitialized variable access
	driver core: platform: Prevent resouce overflow from causing infinite loops
	driver core: Print device when resources present in really_probe()
	ASoC: SOF: Intel: hda-dai: fix compilation warning in pcm_prepare
	bpf: Return -EBADRQC for invalid map type in __bpf_tx_xdp_map
	vme: bridges: reduce stack usage
	drm/nouveau/secboot/gm20b: initialize pointer in gm20b_secboot_new()
	drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw
	drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handler
	drm/nouveau/drm/ttm: Remove set but not used variable 'mem'
	drm/nouveau/fault/gv100-: fix memory leak on module unload
	dm thin: don't allow changing data device during thin-pool reload
	gpiolib: Set lockdep class for hierarchical irq domains
	drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add
	perf/imx_ddr: Fix cpu hotplug state cleanup
	usb: musb: omap2430: Get rid of musb .set_vbus for omap2430 glue
	kbuild: remove *.tmp file when filechk fails
	iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
	ALSA: usb-audio: unlock on error in probe
	f2fs: set I_LINKABLE early to avoid wrong access by vfs
	f2fs: free sysfs kobject
	scsi: ufs: pass device information to apply_dev_quirks
	scsi: ufs-mediatek: add apply_dev_quirks variant operation
	scsi: iscsi: Don't destroy session if there are outstanding connections
	crypto: essiv - fix AEAD capitalization and preposition use in help text
	ALSA: usb-audio: add implicit fb quirk for MOTU M Series
	RDMA/mlx5: Don't fake udata for kernel path
	arm64: lse: fix LSE atomics with LLVM's integrated assembler
	arm64: fix alternatives with LLVM's integrated assembler
	drm/amd/display: fixup DML dependencies
	EDAC/sifive: Fix return value check in ecc_register()
	KVM: PPC: Remove set but not used variable 'ra', 'rs', 'rt'
	arm64: dts: ti: k3-j721e-main: Add missing power-domains for smmu
	sched/core: Fix size of rq::uclamp initialization
	sched/topology: Assert non-NUMA topology masks don't (partially) overlap
	perf/x86/amd: Constrain Large Increment per Cycle events
	watchdog/softlockup: Enforce that timestamp is valid on boot
	debugobjects: Fix various data races
	ASoC: SOF: Intel: hda: Fix SKL dai count
	regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage
	f2fs: fix memleak of kobject
	x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
	pwm: omap-dmtimer: Remove PWM chip in .remove before making it unfunctional
	cmd64x: potential buffer overflow in cmd64x_program_timings()
	ide: serverworks: potential overflow in svwks_set_pio_mode()
	pwm: Remove set but not set variable 'pwm'
	btrfs: fix possible NULL-pointer dereference in integrity checks
	btrfs: safely advance counter when looking up bio csums
	btrfs: device stats, log when stats are zeroed
	module: avoid setting info->name early in case we can fall back to info->mod->name
	remoteproc: Initialize rproc_class before use
	regulator: core: Fix exported symbols to the exported GPL version
	irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problems
	ALSA: hda/hdmi - add retry logic to parse_intel_hdmi()
	spi: spi-fsl-qspi: Ensure width is respected in spi-mem operations
	kbuild: use -S instead of -E for precise cc-option test in Kconfig
	objtool: Fix ARCH=x86_64 build error
	x86/decoder: Add TEST opcode to Group3-2
	s390: adjust -mpacked-stack support check for clang 10
	s390/ftrace: generate traced function stack frame
	driver core: platform: fix u32 greater or equal to zero comparison
	bpf, btf: Always output invariant hit in pahole DWARF to BTF transform
	ALSA: hda - Add docking station support for Lenovo Thinkpad T420s
	sunrpc: Fix potential leaks in sunrpc_cache_unhash()
	drm/nouveau/mmu: fix comptag memory leak
	powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV
	media: uvcvideo: Add a quirk to force GEO GC6500 Camera bits-per-pixel value
	btrfs: separate definition of assertion failure handlers
	btrfs: Fix split-brain handling when changing FSID to metadata uuid
	bcache: cached_dev_free needs to put the sb page
	bcache: rework error unwinding in register_bcache
	bcache: fix use-after-free in register_bcache()
	iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
	alarmtimer: Make alarmtimer platform device child of RTC device
	selftests: bpf: Reset global state between reuseport test runs
	jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record
	jbd2: make sure ESHUTDOWN to be recorded in the journal superblock
	powerpc/pseries/lparcfg: Fix display of Maximum Memory
	selftests/eeh: Bump EEH wait time to 60s
	ARM: 8951/1: Fix Kexec compilation issue.
	ALSA: usb-audio: add quirks for Line6 Helix devices fw>=2.82
	hostap: Adjust indentation in prism2_hostapd_add_sta
	rtw88: fix potential NULL skb access in TX ISR
	iwlegacy: ensure loop counter addr does not wrap and cause an infinite loop
	cifs: fix unitialized variable poential problem with network I/O cache lock patch
	cifs: Fix mount options set in automount
	cifs: fix NULL dereference in match_prepath
	bpf: map_seq_next should always increase position index
	powerpc/mm: Don't log user reads to 0xffffffff
	ceph: check availability of mds cluster on mount after wait timeout
	rbd: work around -Wuninitialized warning
	drm/amd/display: do not allocate display_mode_lib unnecessarily
	irqchip/gic-v3: Only provision redistributors that are enabled in ACPI
	drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
	char: hpet: Fix out-of-bounds read bug
	ftrace: fpid_next() should increase position index
	trigger_next should increase position index
	radeon: insert 10ms sleep in dce5_crtc_load_lut
	powerpc: Do not consider weak unresolved symbol relocations as bad
	btrfs: do not do delalloc reservation under page lock
	ocfs2: make local header paths relative to C files
	ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
	lib/scatterlist.c: adjust indentation in __sg_alloc_table
	reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()
	bcache: fix memory corruption in bch_cache_accounting_clear()
	bcache: explicity type cast in bset_bkey_last()
	bcache: fix incorrect data type usage in btree_flush_write()
	irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL
	nvmet: Pass lockdep expression to RCU lists
	nvme-pci: remove nvmeq->tags
	iwlwifi: mvm: Fix thermal zone registration
	iwlwifi: mvm: Check the sta is not NULL in iwl_mvm_cfg_he_sta()
	asm-generic/tlb: add missing CONFIG symbol
	microblaze: Prevent the overflow of the start
	brd: check and limit max_part par
	drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency
	drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage
	NFS: Fix memory leaks
	help_next should increase position index
	i40e: Relax i40e_xsk_wakeup's return value when PF is busy
	cifs: log warning message (once) if out of disk space
	virtio_balloon: prevent pfn array overflow
	fuse: don't overflow LLONG_MAX with end offset
	mlxsw: spectrum_dpipe: Add missing error path
	s390/pci: Recover handle in clp_set_pci_fn()
	drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2)
	bcache: properly initialize 'path' and 'err' in register_bcache()
	rtc: Kconfig: select REGMAP_I2C when necessary
	Linux 5.4.22

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iaeb3945493ecc81a0ae90ef87b19ceb2caf48164
2020-02-24 09:16:10 +01:00
Valentin Schneider
f2323c374e sched/topology: Assert non-NUMA topology masks don't (partially) overlap
[ Upstream commit ccf74128d6 ]

topology.c::get_group() relies on the assumption that non-NUMA domains do
not partially overlap. Zeng Tao pointed out in [1] that such topology
descriptions, while completely bogus, can end up being exposed to the
scheduler.

In his example (8 CPUs, 2-node system), we end up with:
  MC span for CPU3 == 3-7
  MC span for CPU4 == 4-7

The first pass through get_group(3, sdd@MC) will result in the following
sched_group list:

  3 -> 4 -> 5 -> 6 -> 7
  ^                  /
   `----------------'

And a later pass through get_group(4, sdd@MC) will "corrupt" that to:

  3 -> 4 -> 5 -> 6 -> 7
       ^             /
	`-----------'

which will completely break things like 'while (sg != sd->groups)' when
using CPU3's base sched_domain.

There already are some architecture-specific checks in place such as
x86/kernel/smpboot.c::topology.sane(), but this is something we can detect
in the core scheduler, so it seems worthwhile to do so.

Warn and abort the construction of the sched domains if such a broken
topology description is detected. Note that this is somewhat
expensive (O(t.c²), 't' non-NUMA topology levels and 'c' CPUs) and could be
gated under SCHED_DEBUG if deemed necessary.

Testing
=======

Dietmar managed to reproduce this using the following qemu incantation:

  $ qemu-system-aarch64 -kernel ./Image -hda ./qemu-image-aarch64.img \
  -append 'root=/dev/vda console=ttyAMA0 loglevel=8 sched_debug' -smp \
  cores=8 --nographic -m 512 -cpu cortex-a53 -machine virt -numa \
  node,cpus=0-2,nodeid=0 -numa node,cpus=3-7,nodeid=1

alongside the following drivers/base/arch_topology.c hack (AIUI wouldn't be
needed if '-smp cores=X, sockets=Y' would work with qemu):

8<---
@@ -465,6 +465,9 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;

+		if ((cpu < 4 && cpuid > 3) || (cpu > 3 && cpuid < 4))
+			continue;
+
 		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
 		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);

8<---

[1]: https://lkml.kernel.org/r/1577088979-8545-1-git-send-email-prime.zeng@hisilicon.com

Reported-by: Zeng Tao <prime.zeng@hisilicon.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200115160915.22575-1-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:52 +01:00
Li Guanglei
5d13f62b9e sched/core: Fix size of rq::uclamp initialization
[ Upstream commit dcd6dffb0a ]

rq::uclamp is an array of struct uclamp_rq, make sure we clear the
whole thing.

Fixes: 69842cba9a ("sched/uclamp: Add CPU's clamp buckets refcountinga")
Signed-off-by: Li Guanglei <guanglei.li@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lkml.kernel.org/r/1577259844-12677-1-git-send-email-guangleix.li@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:51 +01:00
Morten Rasmussen
6bec3fbeac BACKPORT: sched/fair: Remove wake_cap()
Capacity-awareness in the wake-up path previously involved disabling
wake_affine in certain scenarios. We have just made select_idle_sibling()
capacity-aware, so this isn't needed anymore.

Remove wake_cap() entirely.

Bug: 120440300
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[Changelog tweaks]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[Changelog tweaks]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200206191957.12325-5-valentin.schneider@arm.com
(cherry picked from commit 25ac227a25ac946e0356772012398cd1710a8bab)
[Fixed trivial conflict with 24c4437797]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: Ic91c61ee94c3aa54c573a4cd3fe6882e5fde6335
2020-02-20 11:32:43 +00:00
Valentin Schneider
7874cf9c9f UPSTREAM: sched/core: Remove for_each_lower_domain()
The last remaining user of this macro has just been removed, get rid of it.

Bug: 120440300
Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://lkml.kernel.org/r/20200206191957.12325-4-valentin.schneider@arm.com
(cherry picked from commit afbcf99785a580e2cc089646e971deceb31a18b1)
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: I554858a5dc278e666c47fb968942bc0ced8b94e8
2020-02-20 11:32:35 +00:00
Morten Rasmussen
8aa7b34bad UPSTREAM: sched/topology: Remove SD_BALANCE_WAKE on asymmetric capacity systems
SD_BALANCE_WAKE was previously added to lower sched_domain levels on
asymmetric CPU capacity systems by commit:

  9ee1cda5ee ("sched/core: Enable SD_BALANCE_WAKE for asymmetric capacity systems")

to enable the use of find_idlest_cpu() and friends to find an appropriate
CPU for tasks.

That responsibility has now been shifted to select_idle_sibling() and
friends, and hence the flag can be removed. Note that this causes
asymmetric CPU capacity systems to no longer enter the slow wakeup path
(find_idlest_cpu()) on wakeups - only on execs and forks (which is aligned
with all other mainline topologies).

Bug: 120440300
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[Changelog tweaks]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://lkml.kernel.org/r/20200206191957.12325-3-valentin.schneider@arm.com
(cherry picked from commit 38c6e4963b509c47c33539534b84195558c0376d)
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: I73ec387cdca6df6897a9d1c84a1dd13dad786f76
2020-02-20 11:32:27 +00:00
Morten Rasmussen
f042d57132 UPSTREAM: sched/fair: Add asymmetric CPU capacity wakeup scan
Issue
=====

On asymmetric CPU capacity topologies, we currently rely on wake_cap() to
drive select_task_rq_fair() towards either:

- its slow-path (find_idlest_cpu()) if either the previous or
  current (waking) CPU has too little capacity for the waking task
- its fast-path (select_idle_sibling()) otherwise

Commit:

  3273163c67 ("sched/fair: Let asymmetric CPU configurations balance at wake-up")

points out that this relies on the assumption that "[...]the CPU capacities
within an SD_SHARE_PKG_RESOURCES domain (sd_llc) are homogeneous".

This assumption no longer holds on newer generations of big.LITTLE
systems (DynamIQ), which can accommodate CPUs of different compute capacity
within a single LLC domain. To hopefully paint a better picture, a regular
big.LITTLE topology would look like this:

  +---------+ +---------+
  |   L2    | |   L2    |
  +----+----+ +----+----+
  |CPU0|CPU1| |CPU2|CPU3|
  +----+----+ +----+----+
      ^^^         ^^^
    LITTLEs      bigs

which would result in the following scheduler topology:

  DIE [         ] <- sd_asym_cpucapacity
  MC  [   ] [   ] <- sd_llc
       0 1   2 3

Conversely, a DynamIQ topology could look like:

  +-------------------+
  |        L3         |
  +----+----+----+----+
  | L2 | L2 | L2 | L2 |
  +----+----+----+----+
  |CPU0|CPU1|CPU2|CPU3|
  +----+----+----+----+
     ^^^^^     ^^^^^
    LITTLEs    bigs

which would result in the following scheduler topology:

  MC [       ] <- sd_llc, sd_asym_cpucapacity
      0 1 2 3

What this means is that, on DynamIQ systems, we could pass the wake_cap()
test (IOW presume the waking task fits on the CPU capacities of some LLC
domain), thus go through select_idle_sibling().
This function operates on an LLC domain, which here spans both bigs and
LITTLEs, so it could very well pick a CPU of too small capacity for the
task, despite there being fitting idle CPUs - it very much depends on the
CPU iteration order, on which we have absolutely no guarantees
capacity-wise.

Implementation
==============

Introduce yet another select_idle_sibling() helper function that takes CPU
capacity into account. The policy is to pick the first idle CPU which is
big enough for the task (task_util * margin < cpu_capacity). If no
idle CPU is big enough, we pick the idle one with the highest capacity.

Unlike other select_idle_sibling() helpers, this one operates on the
sd_asym_cpucapacity sched_domain pointer, which is guaranteed to span all
known CPU capacities in the system. As such, this will work for both
"legacy" big.LITTLE (LITTLEs & bigs split at MC, joined at DIE) and for
newer DynamIQ systems (e.g. LITTLEs and bigs in the same MC domain).

Note that this limits the scope of select_idle_sibling() to
select_idle_capacity() for asymmetric CPU capacity systems - the LLC domain
will not be scanned, and no further heuristic will be applied.

Bug: 120440300
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://lkml.kernel.org/r/20200206191957.12325-2-valentin.schneider@arm.com
(cherry picked from commit 913c310c8e8abb6a9eb8b3c8bfc33bd1dddded04)
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: Ia0f7a085e57709a1f2855e2905c966b90cd06c72
2020-02-20 11:32:18 +00:00
Greg Kroah-Hartman
4f48e3619d Merge 5.4.21 into android-5.4
Changes in 5.4.21
	Input: synaptics - switch T470s to RMI4 by default
	Input: synaptics - enable SMBus on ThinkPad L470
	Input: synaptics - remove the LEN0049 dmi id from topbuttonpad list
	ALSA: usb-audio: Fix UAC2/3 effect unit parsing
	ALSA: hda/realtek - Add more codec supported Headset Button
	ALSA: hda/realtek - Fix silent output on MSI-GL73
	ALSA: usb-audio: Apply sample rate quirk for Audioengine D1
	ACPI: EC: Fix flushing of pending work
	ACPI: PM: s2idle: Avoid possible race related to the EC GPE
	ACPICA: Introduce acpi_any_gpe_status_set()
	ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system
	ALSA: usb-audio: sound: usb: usb true/false for bool return type
	ALSA: usb-audio: Add clock validity quirk for Denon MC7000/MCX8000
	ext4: don't assume that mmp_nodename/bdevname have NUL
	ext4: fix support for inode sizes > 1024 bytes
	ext4: fix checksum errors with indexed dirs
	ext4: add cond_resched() to ext4_protect_reserved_inode
	ext4: improve explanation of a mount failure caused by a misconfigured kernel
	Btrfs: fix race between using extent maps and merging them
	btrfs: ref-verify: fix memory leaks
	btrfs: print message when tree-log replay starts
	btrfs: log message when rw remount is attempted with unclean tree-log
	ARM: npcm: Bring back GPIOLIB support
	gpio: xilinx: Fix bug where the wrong GPIO register is written to
	arm64: ssbs: Fix context-switch when SSBS is present on all CPUs
	xprtrdma: Fix DMA scatter-gather list mapping imbalance
	cifs: make sure we do not overflow the max EA buffer size
	EDAC/sysfs: Remove csrow objects on errors
	EDAC/mc: Fix use-after-free and memleaks during device removal
	KVM: nVMX: Use correct root level for nested EPT shadow page tables
	perf/x86/amd: Add missing L2 misses event spec to AMD Family 17h's event map
	s390/pkey: fix missing length of protected key on return
	s390/uv: Fix handling of length extensions
	drm/vgem: Close use-after-free race in vgem_gem_create
	drm/panfrost: Make sure the shrinker does not reclaim referenced BOs
	bus: moxtet: fix potential stack buffer overflow
	nvme: fix the parameter order for nvme_get_log in nvme_get_fw_slot_info
	drivers: ipmi: fix off-by-one bounds check that leads to a out-of-bounds write
	IB/mlx5: Return failure when rts2rts_qp_counters_set_id is not supported
	IB/hfi1: Acquire lock to release TID entries when user file is closed
	IB/hfi1: Close window for pq and request coliding
	IB/rdmavt: Reset all QPs when the device is shut down
	IB/umad: Fix kernel crash while unloading ib_umad
	RDMA/core: Fix invalid memory access in spec_filter_size
	RDMA/iw_cxgb4: initiate CLOSE when entering TERM
	RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create
	RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
	RDMA/core: Fix protection fault in get_pkey_idx_qp_list
	s390/time: Fix clk type in get_tod_clock
	sched/uclamp: Reject negative values in cpu_uclamp_write()
	spmi: pmic-arb: Set lockdep class for hierarchical irq domains
	perf/x86/intel: Fix inaccurate period in context switch for auto-reload
	hwmon: (pmbus/ltc2978) Fix PMBus polling of MFR_COMMON definitions.
	mac80211: fix quiet mode activation in action frames
	cifs: fix mount option display for sec=krb5i
	arm64: dts: fast models: Fix FVP PCI interrupt-map property
	KVM: x86: Mask off reserved bit from #DB exception payload
	perf stat: Don't report a null stalled cycles per insn metric
	NFSv4.1 make cachethis=no for writes
	Revert "drm/sun4i: drv: Allow framebuffer modifiers in mode config"
	jbd2: move the clearing of b_modified flag to the journal_unmap_buffer()
	jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer
	ext4: choose hardlimit when softlimit is larger than hardlimit in ext4_statfs_project()
	KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging
	gpio: add gpiod_toggle_active_low()
	mmc: core: Rework wp-gpio handling
	Linux 5.4.21

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5a17276dc0cad8748c87af9019be9ee4d3064f0e
2020-02-20 08:13:39 +01:00
Qais Yousef
9f6f61c61a sched/uclamp: Reject negative values in cpu_uclamp_write()
commit b562d14064 upstream.

The check to ensure that the new written value into cpu.uclamp.{min,max}
is within range, [0:100], wasn't working because of the signed
comparison

 7301                 if (req.percent > UCLAMP_PERCENT_SCALE) {
 7302                         req.ret = -ERANGE;
 7303                         return req;
 7304                 }

	# echo -1 > cpu.uclamp.min
	# cat cpu.uclamp.min
	42949671.96

Cast req.percent into u64 to force the comparison to be unsigned and
work as intended in capacity_from_percent().

	# echo -1 > cpu.uclamp.min
	sh: write error: Numerical result out of range

Fixes: 2480c09313 ("sched/uclamp: Extend CPU's cgroup controller")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200114210947.14083-1-qais.yousef@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-19 19:53:07 +01:00
Greg Kroah-Hartman
e736cc6873 Merge 5.4.20 into android-5.4
Changes in 5.4.20
	ASoC: pcm: update FE/BE trigger order based on the command
	hv_sock: Remove the accept port restriction
	IB/mlx4: Fix memory leak in add_gid error flow
	IB/srp: Never use immediate data if it is disabled by a user
	IB/mlx4: Fix leak in id_map_find_del
	RDMA/netlink: Do not always generate an ACK for some netlink operations
	RDMA/i40iw: fix a potential NULL pointer dereference
	RDMA/core: Fix locking in ib_uverbs_event_read
	RDMA/uverbs: Verify MR access flags
	RDMA/cma: Fix unbalanced cm_id reference count during address resolve
	RDMA/umem: Fix ib_umem_find_best_pgsz()
	scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails
	PCI/IOV: Fix memory leak in pci_iov_add_virtfn()
	ath10k: pci: Only dump ATH10K_MEM_REGION_TYPE_IOREG when safe
	PCI/switchtec: Use dma_set_mask_and_coherent()
	PCI/switchtec: Fix vep_vector_number ioread width
	PCI: tegra: Fix afi_pex2_ctrl reg offset for Tegra30
	PCI: Don't disable bridge BARs when assigning bus resources
	PCI/AER: Initialize aer_fifo
	iwlwifi: mvm: avoid use after free for pmsr request
	bpftool: Don't crash on missing xlated program instructions
	bpf, sockmap: Don't sleep while holding RCU lock on tear-down
	bpf, sockhash: Synchronize_rcu before free'ing map
	selftests/bpf: Test freeing sockmap/sockhash with a socket in it
	bpf: Improve bucket_log calculation logic
	bpf, sockmap: Check update requirements after locking
	nfs: NFS_SWAP should depend on SWAP
	NFS: Revalidate the file size on a fatal write error
	NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes()
	NFS: Fix fix of show_nfs_errors
	NFSv4: pnfs_roc() must use cred_fscmp() to compare creds
	NFSv4: try lease recovery on NFS4ERR_EXPIRED
	NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
	x86/boot: Handle malformed SRAT tables during early ACPI parsing
	rtc: hym8563: Return -EINVAL if the time is known to be invalid
	rtc: cmos: Stop using shared IRQ
	watchdog: qcom: Use platform_get_irq_optional() for bark irq
	ARC: [plat-axs10x]: Add missing multicast filter number to GMAC node
	platform/x86: intel_mid_powerbtn: Take a copy of ddata
	arm64: dts: qcom: msm8998: Fix tcsr syscon size
	arm64: dts: uDPU: fix broken ethernet
	ARM: dts: at91: Reenable UART TX pull-ups
	ARM: dts: am43xx: add support for clkout1 clock
	arm64: dts: renesas: r8a77990: ebisu: Remove clkout-lr-synchronous from sound
	arm64: dts: marvell: clearfog-gt-8k: fix switch cpu port node
	ARM: dts: meson8: use the actual frequency for the GPU's 182.1MHz OPP
	ARM: dts: meson8b: use the actual frequency for the GPU's 364MHz OPP
	ARM: dts: at91: sama5d3: fix maximum peripheral clock rates
	ARM: dts: at91: sama5d3: define clock rate range for tcb1
	tools/power/acpi: fix compilation error
	soc: qcom: rpmhpd: Set 'active_only' for active only power domains
	Revert "powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests"
	powerpc/ptdump: Fix W+X verification call in mark_rodata_ro()
	powerpc/ptdump: Only enable PPC_CHECK_WX with STRICT_KERNEL_RWX
	powerpc/papr_scm: Fix leaking 'bus_desc.provider_name' in some paths
	powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning
	powerpc/pseries: Allow not having ibm, hypertas-functions::hcall-multi-tce for DDW
	iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA
	ARM: at91: pm: use SAM9X60 PMC's compatible
	ARM: at91: pm: use of_device_id array to find the proper shdwc node
	KVM: arm/arm64: vgic-its: Fix restoration of unmapped collections
	ARM: 8949/1: mm: mark free_memmap as __init
	sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
	arm64: cpufeature: Fix the type of no FP/SIMD capability
	arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly
	arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations
	KVM: arm/arm64: Fix young bit from mmu notifier
	KVM: arm: Fix DFSR setting for non-LPAE aarch32 guests
	KVM: arm: Make inject_abt32() inject an external abort instead
	KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset
	KVM: arm64: pmu: Fix chained SW_INCR counters
	KVM: arm64: Treat emulated TVAL TimerValue as a signed 32-bit integer
	arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly
	mtd: onenand_base: Adjust indentation in onenand_read_ops_nolock
	mtd: sharpslpart: Fix unsigned comparison to zero
	crypto: testmgr - don't try to decrypt uninitialized buffers
	crypto: artpec6 - return correct error code for failed setkey()
	crypto: atmel-sha - fix error handling when setting hmac key
	crypto: caam/qi2 - fix typo in algorithm's driver name
	drivers: watchdog: stm32_iwdg: set WDOG_HW_RUNNING at probe
	media: i2c: adv748x: Fix unsafe macros
	dt-bindings: iio: adc: ad7606: Fix wrong maxItems value
	bcache: avoid unnecessary btree nodes flushing in btree_flush_write()
	selinux: revert "stop passing MAY_NOT_BLOCK to the AVC upon follow_link"
	selinux: fix regression introduced by move_mount(2) syscall
	pinctrl: sh-pfc: r8a77965: Fix DU_DOTCLKIN3 drive/bias control
	pinctrl: sh-pfc: r8a7778: Fix duplicate SDSELF_B and SD1_CLK_B
	regmap: fix writes to non incrementing registers
	mfd: max77650: Select REGMAP_IRQ in Kconfig
	clk: meson: g12a: fix missing uart2 in regmap table
	dmaengine: axi-dmac: add a check for devm_regmap_init_mmio
	mwifiex: Fix possible buffer overflows in mwifiex_ret_wmm_get_status()
	mwifiex: Fix possible buffer overflows in mwifiex_cmd_append_vsie_tlv()
	libertas: don't exit from lbs_ibss_join_existing() with RCU read lock held
	libertas: make lbs_ibss_join_existing() return error code on rates overflow
	selinux: fall back to ref-walk if audit is required
	Linux 5.4.20

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I68c0ac72422e279b38324afc91dc52df3eadc0f7
2020-02-19 08:28:27 +01:00
Qais Yousef
ba95651cef sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
commit 7226017ad3 upstream.

When a new cgroup is created, the effective uclamp value wasn't updated
with a call to cpu_util_update_eff() that looks at the hierarchy and
update to the most restrictive values.

Fix it by ensuring to call cpu_util_update_eff() when a new cgroup
becomes online.

Without this change, the newly created cgroup uses the default
root_task_group uclamp values, which is 1024 for both uclamp_{min, max},
which will cause the rq to to be clamped to max, hence cause the
system to run at max frequency.

The problem was observed on Ubuntu server and was reproduced on Debian
and Buildroot rootfs.

By default, Ubuntu and Debian create a cpu controller cgroup hierarchy
and add all tasks to it - which creates enough noise to keep the rq
uclamp value at max most of the time. Imitating this behavior makes the
problem visible in Buildroot too which otherwise looks fine since it's a
minimal userspace.

Fixes: 0b60ba2dd3 ("sched/uclamp: Propagate parent clamps")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Doug Smythies <dsmythies@telus.net>
Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-14 16:34:17 -05:00
Qais Yousef
4a192aa559 UPSTREAM: sched/rt: Make RT capacity-aware
Capacity Awareness refers to the fact that on heterogeneous systems
(like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence
when placing tasks we need to be aware of this difference of CPU
capacities.

In such scenarios we want to ensure that the selected CPU has enough
capacity to meet the requirement of the running task. Enough capacity
means here that capacity_orig_of(cpu) >= task.requirement.

The definition of task.requirement is dependent on the scheduling class.

For CFS, utilization is used to select a CPU that has >= capacity value
than the cfs_task.util.

	capacity_orig_of(cpu) >= cfs_task.util

DL isn't capacity aware at the moment but can make use of the bandwidth
reservation to implement that in a similar manner CFS uses utilization.
The following patchset implements that:

https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/

	capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime

For RT we don't have a per task utilization signal and we lack any
information in general about what performance requirement the RT task
needs. But with the introduction of uclamp, RT tasks can now control
that by setting uclamp_min to guarantee a minimum performance point.

ATM the uclamp value are only used for frequency selection; but on
heterogeneous systems this is not enough and we need to ensure that the
capacity of the CPU is >= uclamp_min. Which is what implemented here.

	capacity_orig_of(cpu) >= rt_task.uclamp_min

Note that by default uclamp.min is 1024, which means that RT tasks will
always be biased towards the big CPUs, which make for a better more
predictable behavior for the default case.

Must stress that the bias acts as a hint rather than a definite
placement strategy. For example, if all big cores are busy executing
other RT tasks we can't guarantee that a new RT task will be placed
there.

On non-heterogeneous systems the original behavior of RT should be
retained. Similarly if uclamp is not selected in the config.

[ mingo: Minor edits to comments. ]

Bug: 133435829
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 804d402fb6)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I3f173fd2264510142fbbe3e292caba8771263c92
2020-02-01 14:36:45 +00:00
Valentin Schneider
15d93f6707 UPSTREAM: sched/fair: Make EAS wakeup placement consider uclamp restrictions
task_fits_capacity() has just been made uclamp-aware, and
find_energy_efficient_cpu() needs to go through the same treatment.

Things are somewhat different here however - using the task max clamp isn't
sufficient. Consider the following setup:

  The target runqueue, rq:
    rq.cpu_capacity_orig = 512
    rq.cfs.avg.util_avg = 200
    rq.uclamp.max = 768 // the max p.uclamp.max of all enqueued p's is 768

  The waking task, p (not yet enqueued on rq):
    p.util_est = 600
    p.uclamp.max = 100

Now, consider the following code which doesn't use the rq clamps:

  util = uclamp_task_util(p);
  // Does the task fit in the spare CPU capacity?
  cpu = cpu_of(rq);
  fits_capacity(util, cpu_capacity(cpu) - cpu_util(cpu))

This would lead to:

  util = 100;
  fits_capacity(100, 512 - 200)

fits_capacity() would return true. However, enqueuing p on that CPU *will*
cause it to become overutilized since rq clamp values are max-aggregated,
so we'd remain with

  rq.uclamp.max = 768

which comes from the other tasks already enqueued on rq. Thus, we could
select a high enough frequency to reach beyond 0.8 * 512 utilization
(== overutilized) after enqueuing p on rq. What find_energy_efficient_cpu()
needs here is uclamp_rq_util_with() which lets us peek at the future
utilization landscape, including rq-wide uclamp values.

Make find_energy_efficient_cpu() use uclamp_rq_util_with() for its
fits_capacity() check. This is in line with what compute_energy() ends up
using for estimating utilization.

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-6-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 1d42509e47)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I0f74ea0b05590b0776e8e2ac568af3162d5ca29a
2020-02-01 14:36:12 +00:00
Valentin Schneider
d5c2a096a5 UPSTREAM: sched/fair: Make task_fits_capacity() consider uclamp restrictions
task_fits_capacity() drives CPU selection at wakeup time, and is also used
to detect misfit tasks. Right now it does so by comparing task_util_est()
with a CPU's capacity, but doesn't take into account uclamp restrictions.

There's a few interesting uses that can come out of doing this. For
instance, a low uclamp.max value could prevent certain tasks from being
flagged as misfit tasks, so they could merrily remain on low-capacity CPUs.
Similarly, a high uclamp.min value would steer tasks towards high capacity
CPUs at wakeup (and, should that fail, later steered via misfit balancing),
so such "boosted" tasks would favor CPUs of higher capacity.

Introduce uclamp_task_util() and make task_fits_capacity() use it.

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-5-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a7008c07a5)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Icd08c2d2a415695b980371eff8ab42cd9a83609b
2020-02-01 14:35:59 +00:00
Valentin Schneider
1356a58daa UPSTREAM: sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
The current helper returns (CPU) rq utilization with uclamp restrictions
taken into account. A uclamp task utilization helper would be quite
helpful, but this requires some renaming.

Prepare the code for the introduction of a uclamp_task_util() by renaming
the existing uclamp_util_with() to uclamp_rq_util_with().

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-4-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit d2b58a286e)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I33d3f6ad4faba10da5a0a20a9ecb444cee81d8da
2020-02-01 14:35:36 +00:00
Valentin Schneider
fedb670f02 UPSTREAM: sched/uclamp: Make uclamp util helpers use and return UL values
Vincent pointed out recently that the canonical type for utilization
values is 'unsigned long'. Internally uclamp uses 'unsigned int' values for
cache optimization, but this doesn't have to be exported to its users.

Make the uclamp helpers that deal with utilization use and return unsigned
long values.

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 686516b55e)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I8c3a191ef6ebae6b76521321ee911e9b9e4eda45
2020-02-01 14:35:03 +00:00
Valentin Schneider
6966eb9ecf BACKPORT: sched/uclamp: Remove uclamp_util()
The sole user of uclamp_util(), schedutil_cpu_util(), was made to use
uclamp_util_with() instead in commit:

  af24bde8df ("sched/uclamp: Add uclamp support to energy_compute()")

From then on, uclamp_util() has remained unused. Being a simple wrapper
around uclamp_util_with(), we can get rid of it and win back a few lines.

[QP: fixed trivial cherry-pick conflict caused by uclamp_boosted()]

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 59fe675248)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I85c97644dbb3172cb9a661c5f0b9af43dfe9c87f
2020-02-01 14:30:56 +00:00
Quentin Perret
1473e2098c Revert "ANDROID: sched/fair: EAS: Add uclamp support to find_energy_efficient_cpu()"
This reverts commit b61876ed12.

There is a better alternative upstream now, so let's drop the android
specific version.

Bug: 120440300
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I311c528c9f342035ef9da6a3833004f942daef00
2020-02-01 14:30:56 +00:00
Greg Kroah-Hartman
33c8a1b2d0 Merge 5.4.15 into android-5.4
Changes in 5.4.15
	drm/i915: Fix pid leak with banned clients
	libbpf: Fix compatibility for kernels without need_wakeup
	libbpf: Fix memory leak/double free issue
	libbpf: Fix potential overflow issue
	libbpf: Fix another potential overflow issue in bpf_prog_linfo
	libbpf: Make btf__resolve_size logic always check size error condition
	bpf: Force .BTF section start to zero when dumping from vmlinux
	samples: bpf: update map definition to new syntax BTF-defined map
	samples/bpf: Fix broken xdp_rxq_info due to map order assumptions
	ARM: dts: logicpd-torpedo-37xx-devkit-28: Reference new DRM panel
	ARM: OMAP2+: Add missing put_device() call in omapdss_init_of()
	xfs: Sanity check flags of Q_XQUOTARM call
	i2c: stm32f7: rework slave_id allocation
	i2c: i2c-stm32f7: fix 10-bits check in slave free id search loop
	mfd: intel-lpss: Add default I2C device properties for Gemini Lake
	SUNRPC: Fix svcauth_gss_proxy_init()
	SUNRPC: Fix backchannel latency metrics
	powerpc/security: Fix debugfs data leak on 32-bit
	powerpc/pseries: Enable support for ibm,drc-info property
	powerpc/kasan: Fix boot failure with RELOCATABLE && FSL_BOOKE
	powerpc/archrandom: fix arch_get_random_seed_int()
	tipc: reduce sensitive to retransmit failures
	tipc: update mon's self addr when node addr generated
	tipc: fix potential memory leak in __tipc_sendmsg()
	tipc: fix wrong socket reference counter after tipc_sk_timeout() returns
	tipc: fix wrong timeout input for tipc_wait_for_cond()
	net/mlx5e: Fix free peer_flow when refcount is 0
	phy: lantiq: vrx200-pcie: fix error return code in ltq_vrx200_pcie_phy_power_on()
	net: phy: broadcom: Fix RGMII delays configuration for BCM54210E
	phy: ti: gmii-sel: fix mac tx internal delay for rgmii-rxid
	mt76: mt76u: fix endpoint definition order
	mt7601u: fix bbp version check in mt7601u_wait_bbp_ready
	ice: fix stack leakage
	s390/pkey: fix memory leak within _copy_apqns_from_user()
	nfsd: depend on CRYPTO_MD5 for legacy client tracking
	crypto: amcc - restore CRYPTO_AES dependency
	crypto: sun4i-ss - fix big endian issues
	perf map: No need to adjust the long name of modules
	leds: tlc591xx: update the maximum brightness
	soc/tegra: pmc: Fix crashes for hierarchical interrupts
	soc: qcom: llcc: Name regmaps to avoid collisions
	soc: renesas: Add missing check for non-zero product register address
	soc: aspeed: Fix snoop_file_poll()'s return type
	watchdog: sprd: Fix the incorrect pointer getting from driver data
	ipmi: Fix memory leak in __ipmi_bmc_register
	sched/core: Further clarify sched_class::set_next_task()
	gpiolib: No need to call gpiochip_remove_pin_ranges() twice
	rtw88: fix beaconing mode rsvd_page memory violation issue
	rtw88: fix error handling when setup efuse info
	drm/panfrost: Add missing check for pfdev->regulator
	drm: panel-lvds: Potential Oops in probe error handling
	drm/amdgpu: remove excess function parameter description
	hwrng: omap3-rom - Fix missing clock by probing with device tree
	dpaa2-eth: Fix minor bug in ethtool stats reporting
	drm/rockchip: Round up _before_ giving to the clock framework
	software node: Get reference to parent swnode in get_parent op
	PCI: mobiveil: Fix csr_read()/write() build issue
	drm: rcar_lvds: Fix color mismatches on R-Car H2 ES2.0 and later
	net: netsec: Correct dma sync for XDP_TX frames
	ACPI: platform: Unregister stale platform devices
	pwm: sun4i: Fix incorrect calculation of duty_cycle/period
	regulator: bd70528: Add MODULE_ALIAS to allow module auto loading
	drm/amdgpu/vi: silence an uninitialized variable warning
	power: supply: bd70528: Add MODULE_ALIAS to allow module auto loading
	firmware: imx: Remove call to devm_of_platform_populate
	libbpf: Don't use kernel-side u32 type in xsk.c
	rcu: Fix uninitialized variable in nocb_gp_wait()
	dpaa_eth: perform DMA unmapping before read
	dpaa_eth: avoid timestamp read on error paths
	scsi: ufs: delete redundant function ufshcd_def_desc_sizes()
	net: openvswitch: don't unlock mutex when changing the user_features fails
	hv_netvsc: flag software created hash value
	rt2800: remove errornous duplicate condition
	net: neigh: use long type to store jiffies delta
	net: axienet: Fix error return code in axienet_probe()
	selftests: gen_kselftest_tar.sh: Do not clobber kselftest/
	rtc: bd70528: fix module alias to autoload module
	packet: fix data-race in fanout_flow_is_huge()
	i2c: stm32f7: report dma error during probe
	kselftests: cgroup: Avoid the reuse of fd after it is deallocated
	firmware: arm_scmi: Fix doorbell ring logic for !CONFIG_64BIT
	mmc: sdio: fix wl1251 vendor id
	mmc: core: fix wl1251 sdio quirks
	tee: optee: Fix dynamic shm pool allocations
	tee: optee: fix device enumeration error handling
	workqueue: Add RCU annotation for pwq list walk
	SUNRPC: Fix another issue with MIC buffer space
	sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util()
	mt76: mt76u: rely on usb_interface instead of usb_dev
	dma-direct: don't check swiotlb=force in dma_direct_map_resource
	afs: Remove set but not used variables 'before', 'after'
	dmaengine: ti: edma: fix missed failure handling
	drm/radeon: fix bad DMA from INTERRUPT_CNTL2
	xdp: Fix cleanup on map free for devmap_hash map type
	platform/chrome: wilco_ec: fix use after free issue
	block: fix memleak of bio integrity data
	s390/qeth: fix dangling IO buffers after halt/clear
	net-sysfs: Call dev_hold always in netdev_queue_add_kobject
	gpio: aspeed: avoid return type warning
	phy/rockchip: inno-hdmi: round clock rate down to closest 1000 Hz
	optee: Fix multi page dynamic shm pool alloc
	Linux 5.4.15

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I28b2a19657d40804406dc0e7c266296ce8768eb7
2020-01-26 11:14:46 +01:00
Vincent Guittot
0e9619ff10 sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util()
[ Upstream commit bef69dd878 ]

update_cfs_rq_load_avg() calls cfs_rq_util_change() every time PELT decays,
which might be inefficient when the cpufreq driver has rate limitation.

When a task is attached on a CPU, we have this call path:

update_load_avg()
  update_cfs_rq_load_avg()
    cfs_rq_util_change -- > trig frequency update
  attach_entity_load_avg()
    cfs_rq_util_change -- > trig frequency update

The 1st frequency update will not take into account the utilization of the
newly attached task and the 2nd one might be discarded because of rate
limitation of the cpufreq driver.

update_cfs_rq_load_avg() is only called by update_blocked_averages()
and update_load_avg() so we can move the call to
cfs_rq_util_change/cpufreq_update_util() into these two functions.

It's also interesting to note that update_load_avg() already calls
cfs_rq_util_change() directly for the !SMP case.

This change will also ensure that cpufreq_update_util() is called even
when there is no more CFS rq in the leaf_cfs_rq_list to update, but only
IRQ, RT or DL PELT signals.

[ mingo: Minor updates. ]

Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: juri.lelli@redhat.com
Cc: linux-pm@vger.kernel.org
Cc: mgorman@suse.de
Cc: rostedt@goodmis.org
Cc: sargun@sargun.me
Cc: srinivas.pandruvada@linux.intel.com
Cc: tj@kernel.org
Cc: xiexiuqi@huawei.com
Cc: xiezhipeng1@huawei.com
Fixes: 039ae8bcf7 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Link: https://lkml.kernel.org/r/1574083279-799-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-26 10:01:08 +01:00
Peter Zijlstra
37bb3c4646 sched/core: Further clarify sched_class::set_next_task()
commit a0e813f26e upstream.

It turns out there really is something special to the first
set_next_task() invocation. In specific the 'change' pattern really
should not cause balance callbacks.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: ktkhai@virtuozzo.com
Cc: mgorman@suse.de
Cc: qais.yousef@arm.com
Cc: qperret@google.com
Cc: rostedt@goodmis.org
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Fixes: f95d4eaee6 ("sched/{rt,deadline}: Fix set_next_task vs pick_next_task")
Link: https://lkml.kernel.org/r/20191108131909.775434698@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-26 10:01:03 +01:00
Greg Kroah-Hartman
fde6e0c654 Merge 5.4.11 into android-5.4
Changes in 5.4.11
	USB: dummy-hcd: use usb_urb_dir_in instead of usb_pipein
	bpf: Fix passing modified ctx to ld/abs/ind instruction
	ASoC: rt5682: fix i2c arbitration lost issue
	spi: pxa2xx: Add support for Intel Jasper Lake
	regulator: fix use after free issue
	ASoC: max98090: fix possible race conditions
	spi: fsl: Fix GPIO descriptor support
	gpio: Handle counting of Freescale chipselects
	spi: fsl: Handle the single hardwired chipselect case
	locking/spinlock/debug: Fix various data races
	netfilter: ctnetlink: netns exit must wait for callbacks
	x86/intel: Disable HPET on Intel Ice Lake platforms
	netfilter: nf_tables_offload: Check for the NETDEV_UNREGISTER event
	mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()
	libtraceevent: Fix lib installation with O=
	libtraceevent: Copy pkg-config file to output folder when using O=
	regulator: core: fix regulator_register() error paths to properly release rdev
	x86/efi: Update e820 with reserved EFI boot services data to fix kexec breakage
	ASoC: Intel: bytcr_rt5640: Update quirk for Teclast X89
	selftests: netfilter: use randomized netns names
	efi/gop: Return EFI_NOT_FOUND if there are no usable GOPs
	efi/gop: Return EFI_SUCCESS if a usable GOP was found
	efi/gop: Fix memory leak in __gop_query32/64()
	efi/earlycon: Remap entire framebuffer after page initialization
	ARM: dts: imx6ul: imx6ul-14x14-evk.dtsi: Fix SPI NOR probing
	ARM: vexpress: Set-up shared OPP table instead of individual for each CPU
	netfilter: uapi: Avoid undefined left-shift in xt_sctp.h
	netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
	netfilter: nf_tables: validate NFT_SET_ELEM_INTERVAL_END
	netfilter: nf_tables: validate NFT_DATA_VALUE after nft_data_init()
	netfilter: nf_tables: skip module reference count bump on object updates
	netfilter: nf_tables_offload: return EOPNOTSUPP if rule specifies no actions
	ARM: dts: BCM5301X: Fix MDIO node address/size cells
	selftests/ftrace: Fix to check the existence of set_ftrace_filter
	selftests/ftrace: Fix ftrace test cases to check unsupported
	selftests/ftrace: Do not to use absolute debugfs path
	selftests/ftrace: Fix multiple kprobe testcase
	selftests: safesetid: Move link library to LDLIBS
	selftests: safesetid: Check the return value of setuid/setgid
	selftests: safesetid: Fix Makefile to set correct test program
	ARM: exynos_defconfig: Restore debugfs support
	ARM: dts: Cygnus: Fix MDIO node address/size cells
	spi: spi-cavium-thunderx: Add missing pci_release_regions()
	reset: Do not register resource data for missing resets
	ASoC: topology: Check return value for snd_soc_add_dai_link()
	ASoC: topology: Check return value for soc_tplg_pcm_create()
	ASoC: SOF: loader: snd_sof_fw_parse_ext_data log warning on unknown header
	ASoC: SOF: Intel: split cht and byt debug window sizes
	ARM: dts: am335x-sancloud-bbe: fix phy mode
	ARM: omap2plus_defconfig: Add back DEBUG_FS
	ARM: dts: bcm283x: Fix critical trip point
	arm64: dts: ls1028a: fix typo in TMU calibration data
	bpf, riscv: Limit to 33 tail calls
	bpf, mips: Limit to 33 tail calls
	bpftool: Don't crash on missing jited insns or ksyms
	perf metricgroup: Fix printing event names of metric group with multiple events
	perf header: Fix false warning when there are no duplicate cache entries
	spi: spi-ti-qspi: Fix a bug when accessing non default CS
	ARM: dts: am437x-gp/epos-evm: fix panel compatible
	kselftest/runner: Print new line in print of timeout log
	kselftest: Support old perl versions
	samples: bpf: Replace symbol compare of trace_event
	samples: bpf: fix syscall_tp due to unused syscall
	arm64: dts: ls1028a: fix reboot node
	ARM: imx_v6_v7_defconfig: Explicitly restore CONFIG_DEBUG_FS
	pinctrl: aspeed-g6: Fix LPC/eSPI mux configuration
	bus: ti-sysc: Fix missing reset delay handling
	clk: walk orphan list on clock provider registration
	mac80211: fix TID field in monitor mode transmit
	cfg80211: fix double-free after changing network namespace
	pinctrl: pinmux: fix a possible null pointer in pinmux_can_be_used_for_gpio
	powerpc: Ensure that swiotlb buffer is allocated from low memory
	btrfs: Fix error messages in qgroup_rescan_init
	Btrfs: fix cloning range with a hole when using the NO_HOLES feature
	powerpc/vcpu: Assume dedicated processors as non-preempt
	powerpc/spinlocks: Include correct header for static key
	btrfs: handle error in btrfs_cache_block_group
	Btrfs: fix hole extent items with a zero size after range cloning
	ocxl: Fix potential memory leak on context creation
	bpf: Clear skb->tstamp in bpf_redirect when necessary
	habanalabs: rate limit error msg on waiting for CS
	habanalabs: remove variable 'val' set but not used
	bnx2x: Do not handle requests from VFs after parity
	bnx2x: Fix logic to get total no. of PFs per engine
	cxgb4: Fix kernel panic while accessing sge_info
	net: usb: lan78xx: Fix error message format specifier
	parisc: fix compilation when KEXEC=n and KEXEC_FILE=y
	parisc: add missing __init annotation
	rfkill: Fix incorrect check to avoid NULL pointer dereference
	ASoC: wm8962: fix lambda value
	regulator: rn5t618: fix module aliases
	spi: nxp-fspi: Ensure width is respected in spi-mem operations
	clk: at91: fix possible deadlock
	staging: axis-fifo: add unspecified HAS_IOMEM dependency
	iommu/iova: Init the struct iova to fix the possible memleak
	kconfig: don't crash on NULL expressions in expr_eq()
	scripts: package: mkdebian: add missing rsync dependency
	perf/x86: Fix potential out-of-bounds access
	perf/x86/intel: Fix PT PMI handling
	sched/psi: Fix sampling error and rare div0 crashes with cgroups and high uptime
	psi: Fix a division error in psi poll()
	usb: typec: fusb302: Fix an undefined reference to 'extcon_get_state'
	block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT
	fs: avoid softlockups in s_inodes iterators
	fs: call fsnotify_sb_delete after evict_inodes
	perf/smmuv3: Remove the leftover put_cpu() in error path
	iommu/dma: Relax locking in iommu_dma_prepare_msi()
	io_uring: don't wait when under-submitting
	clk: Move clk_core_reparent_orphans() under CONFIG_OF
	net: stmmac: selftests: Needs to check the number of Multicast regs
	net: stmmac: Determine earlier the size of RX buffer
	net: stmmac: Do not accept invalid MTU values
	net: stmmac: xgmac: Clear previous RX buffer size
	net: stmmac: RX buffer size must be 16 byte aligned
	net: stmmac: Always arm TX Timer at end of transmission start
	s390/purgatory: do not build purgatory with kcov, kasan and friends
	drm/exynos: gsc: add missed component_del
	tpm/tpm_ftpm_tee: add shutdown call back
	xsk: Add rcu_read_lock around the XSK wakeup
	net/mlx5e: Fix concurrency issues between config flow and XSK
	net/i40e: Fix concurrency issues between config flow and XSK
	net/ixgbe: Fix concurrency issues between config flow and XSK
	platform/x86: pcengines-apuv2: fix simswap GPIO assignment
	arm64: cpu_errata: Add Hisilicon TSV110 to spectre-v2 safe list
	block: Fix a lockdep complaint triggered by request queue flushing
	s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly
	s390/dasd: fix memleak in path handling error case
	block: fix memleak when __blk_rq_map_user_iov() is failed
	parisc: Fix compiler warnings in debug_core.c
	sbitmap: only queue kyber's wait callback if not already active
	s390/qeth: handle error due to unsupported transport mode
	s390/qeth: fix promiscuous mode after reset
	s390/qeth: don't return -ENOTSUPP to userspace
	llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
	hv_netvsc: Fix unwanted rx_table reset
	selftests: pmtu: fix init mtu value in description
	tracing: Do not create directories if lockdown is in affect
	gtp: fix bad unlock balance in gtp_encap_enable_socket
	macvlan: do not assume mac_header is set in macvlan_broadcast()
	net: dsa: mv88e6xxx: Preserve priority when setting CPU port.
	net: freescale: fec: Fix ethtool -d runtime PM
	net: stmmac: dwmac-sun8i: Allow all RGMII modes
	net: stmmac: dwmac-sunxi: Allow all RGMII modes
	net: stmmac: Fixed link does not need MDIO Bus
	net: usb: lan78xx: fix possible skb leak
	pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM
	sch_cake: avoid possible divide by zero in cake_enqueue()
	sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY
	tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK
	vxlan: fix tos value before xmit
	mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO
	net: sch_prio: When ungrafting, replace with FIFO
	vlan: fix memory leak in vlan_dev_set_egress_priority
	vlan: vlan_changelink() should propagate errors
	macb: Don't unregister clks unconditionally
	net/mlx5: Move devlink registration before interfaces load
	net: dsa: mv88e6xxx: force cmode write on 6141/6341
	net/mlx5e: Always print health reporter message to dmesg
	net/mlx5: DR, No need for atomic refcount for internal SW steering resources
	net/mlx5e: Fix hairpin RSS table size
	net/mlx5: DR, Init lists that are used in rule's member
	usb: dwc3: gadget: Fix request complete check
	USB: core: fix check for duplicate endpoints
	USB: serial: option: add Telit ME910G1 0x110a composition
	usb: missing parentheses in USE_NEW_SCHEME
	Linux 5.4.11

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Idb9985bebc97203fa305f881fd98a62ac08e66d9
2020-01-12 15:36:52 +01:00
Johannes Weiner
4e38135180 psi: Fix a division error in psi poll()
[ Upstream commit c3466952ca ]

The psi window size is a u64 an can be up to 10 seconds right now,
which exceeds the lower 32 bits of the variable. We currently use
div_u64 for it, which is meant only for 32-bit divisors. The result is
garbage pressure sampling values and even potential div0 crashes.

Use div64_u64.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Jingfeng Xie <xiejingfeng@linux.alibaba.com>
Link: https://lkml.kernel.org/r/20191203183524.41378-3-hannes@cmpxchg.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:21:36 +01:00
Johannes Weiner
74e2bdcb7d sched/psi: Fix sampling error and rare div0 crashes with cgroups and high uptime
[ Upstream commit 3dfbe25c27 ]

Jingfeng reports rare div0 crashes in psi on systems with some uptime:

[58914.066423] divide error: 0000 [#1] SMP
[58914.070416] Modules linked in: ipmi_poweroff ipmi_watchdog toa overlay fuse tcp_diag inet_diag binfmt_misc aisqos(O) aisqos_hotfixes(O)
[58914.083158] CPU: 94 PID: 140364 Comm: kworker/94:2 Tainted: G W OE K 4.9.151-015.ali3000.alios7.x86_64 #1
[58914.093722] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 3.23.34 02/14/2019
[58914.102728] Workqueue: events psi_update_work
[58914.107258] task: ffff8879da83c280 task.stack: ffffc90059dcc000
[58914.113336] RIP: 0010:[] [] psi_update_stats+0x1c1/0x330
[58914.122183] RSP: 0018:ffffc90059dcfd60 EFLAGS: 00010246
[58914.127650] RAX: 0000000000000000 RBX: ffff8858fe98be50 RCX: 000000007744d640
[58914.134947] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00003594f700648e
[58914.142243] RBP: ffffc90059dcfdf8 R08: 0000359500000000 R09: 0000000000000000
[58914.149538] R10: 0000000000000000 R11: 0000000000000000 R12: 0000359500000000
[58914.156837] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8858fe98bd78
[58914.164136] FS: 0000000000000000(0000) GS:ffff887f7f380000(0000) knlGS:0000000000000000
[58914.172529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[58914.178467] CR2: 00007f2240452090 CR3: 0000005d5d258000 CR4: 00000000007606f0
[58914.185765] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[58914.193061] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[58914.200360] PKRU: 55555554
[58914.203221] Stack:
[58914.205383] ffff8858fe98bd48 00000000000002f0 0000002e81036d09 ffffc90059dcfde8
[58914.213168] ffff8858fe98bec8 0000000000000000 0000000000000000 0000000000000000
[58914.220951] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[58914.228734] Call Trace:
[58914.231337] [] psi_update_work+0x22/0x60
[58914.237067] [] process_one_work+0x189/0x420
[58914.243063] [] worker_thread+0x4e/0x4b0
[58914.248701] [] ? process_one_work+0x420/0x420
[58914.254869] [] kthread+0xe6/0x100
[58914.259994] [] ? kthread_park+0x60/0x60
[58914.265640] [] ret_from_fork+0x39/0x50
[58914.271193] Code: 41 29 c3 4d 39 dc 4d 0f 42 dc <49> f7 f1 48 8b 13 48 89 c7 48 c1
[58914.279691] RIP [] psi_update_stats+0x1c1/0x330

The crashing instruction is trying to divide the observed stall time
by the sampling period. The period, stored in R8, is not 0, but we are
dividing by the lower 32 bits only, which are all 0 in this instance.

We could switch to a 64-bit division, but the period shouldn't be that
big in the first place. It's the time between the last update and the
next scheduled one, and so should always be around 2s and comfortably
fit into 32 bits.

The bug is in the initialization of new cgroups: we schedule the first
sampling event in a cgroup as an offset of sched_clock(), but fail to
initialize the last_update timestamp, and it defaults to 0. That
results in a bogusly large sampling period the first time we run the
sampling code, and consequently we underreport pressure for the first
2s of a cgroup's life. But worse, if sched_clock() is sufficiently
advanced on the system, and the user gets unlucky, the period's lower
32 bits can all be 0 and the sampling division will crash.

Fix this by initializing the last update timestamp to the creation
time of the cgroup, thus correctly marking the start of the first
pressure sampling period in a new cgroup.

Reported-by: Jingfeng Xie <xiejingfeng@linux.alibaba.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Link: https://lkml.kernel.org/r/20191203183524.41378-2-hannes@cmpxchg.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:21:36 +01:00
Greg Kroah-Hartman
861433ef01 Merge 5.4.7 into android-5.4
Changes in 5.4.7
	af_packet: set defaule value for tmo
	fjes: fix missed check in fjes_acpi_add
	mod_devicetable: fix PHY module format
	net: dst: Force 4-byte alignment of dst_metrics
	net: gemini: Fix memory leak in gmac_setup_txqs
	net: hisilicon: Fix a BUG trigered by wrong bytes_compl
	net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive()
	net: phy: ensure that phy IDs are correctly typed
	net: qlogic: Fix error paths in ql_alloc_large_buffers()
	net-sysfs: Call dev_hold always in rx_queue_add_kobject
	net: usb: lan78xx: Fix suspend/resume PHY register access error
	nfp: flower: fix stats id allocation
	qede: Disable hardware gro when xdp prog is installed
	qede: Fix multicast mac configuration
	sctp: fix memleak on err handling of stream initialization
	sctp: fully initialize v4 addr in some functions
	selftests: forwarding: Delete IPv6 address at the end
	neighbour: remove neigh_cleanup() method
	bonding: fix bond_neigh_init()
	net: ena: fix default tx interrupt moderation interval
	net: ena: fix issues in setting interrupt moderation params in ethtool
	dpaa2-ptp: fix double free of the ptp_qoriq IRQ
	mlxsw: spectrum_router: Remove unlikely user-triggerable warning
	net: ethernet: ti: davinci_cpdma: fix warning "device driver frees DMA memory with different size"
	net: stmmac: platform: Fix MDIO init for platforms without PHY
	net: dsa: b53: Fix egress flooding settings
	NFC: nxp-nci: Fix probing without ACPI
	btrfs: don't double lock the subvol_sem for rename exchange
	btrfs: do not call synchronize_srcu() in inode_tree_del
	Btrfs: make tree checker detect checksum items with overlapping ranges
	btrfs: return error pointer from alloc_test_extent_buffer
	Btrfs: fix missing data checksums after replaying a log tree
	btrfs: send: remove WARN_ON for readonly mount
	btrfs: abort transaction after failed inode updates in create_subvol
	btrfs: skip log replay on orphaned roots
	btrfs: do not leak reloc root if we fail to read the fs root
	btrfs: handle ENOENT in btrfs_uuid_tree_iterate
	Btrfs: fix removal logic of the tree mod log that leads to use-after-free issues
	ALSA: pcm: Avoid possible info leaks from PCM stream buffers
	ALSA: hda/ca0132 - Keep power on during processing DSP response
	ALSA: hda/ca0132 - Avoid endless loop
	ALSA: hda/ca0132 - Fix work handling in delayed HP detection
	drm/vc4/vc4_hdmi: fill in connector info
	drm/virtio: switch virtio_gpu_wait_ioctl() to gem helper.
	drm: mst: Fix query_payload ack reply struct
	drm/mipi-dbi: fix a loop in debugfs code
	drm/panel: Add missing drm_panel_init() in panel drivers
	drm: exynos: exynos_hdmi: use cec_notifier_conn_(un)register
	drm: Use EOPNOTSUPP, not ENOTSUPP
	drm/amd/display: verify stream link before link test
	drm/bridge: analogix-anx78xx: silence -EPROBE_DEFER warnings
	drm/amd/display: OTC underflow fix
	iio: max31856: add missing of_node and parent references to iio_dev
	iio: light: bh1750: Resolve compiler warning and make code more readable
	drm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code
	drm/amdgpu: grab the id mgr lock while accessing passid_mapping
	drm/ttm: return -EBUSY on pipelining with no_gpu_wait (v2)
	drm/amd/display: Rebuild mapped resources after pipe split
	ath10k: add cleanup in ath10k_sta_state()
	drm/amd/display: Handle virtual signal type in disable_link()
	ath10k: Check if station exists before forwarding tx airtime report
	spi: Add call to spi_slave_abort() function when spidev driver is released
	drm/meson: vclk: use the correct G12A frac max value
	staging: rtl8192u: fix multiple memory leaks on error path
	staging: rtl8188eu: fix possible null dereference
	rtlwifi: prevent memory leak in rtl_usb_probe
	libertas: fix a potential NULL pointer dereference
	Revert "pinctrl: sh-pfc: r8a77990: Fix MOD_SEL1 bit30 when using SSI_SCK2 and SSI_WS2"
	Revert "pinctrl: sh-pfc: r8a77990: Fix MOD_SEL1 bit31 when using SIM0_D"
	ath10k: fix backtrace on coredump
	IB/iser: bound protection_sg size by data_sg size
	drm/komeda: Workaround for broken FLIP_COMPLETE timestamps
	spi: gpio: prevent memory leak in spi_gpio_probe
	media: am437x-vpfe: Setting STD to current value is not an error
	media: cedrus: fill in bus_info for media device
	media: seco-cec: Add a missing 'release_region()' in an error handling path
	media: vim2m: Fix abort issue
	media: vim2m: Fix BUG_ON in vim2m_device_release()
	media: max2175: Fix build error without CONFIG_REGMAP_I2C
	media: ov6650: Fix control handler not freed on init error
	media: i2c: ov2659: fix s_stream return value
	media: ov6650: Fix crop rectangle alignment not passed back
	media: i2c: ov2659: Fix missing 720p register config
	media: ov6650: Fix stored frame format not in sync with hardware
	media: ov6650: Fix stored crop rectangle not in sync with hardware
	tools/power/cpupower: Fix initializer override in hsw_ext_cstates
	media: venus: core: Fix msm8996 frequency table
	ath10k: fix offchannel tx failure when no ath10k_mac_tx_frm_has_freq
	media: vimc: Fix gpf in rmmod path when stream is active
	drm/amd/display: Set number of pipes to 1 if the second pipe was disabled
	pinctrl: devicetree: Avoid taking direct reference to device name string
	drm/sun4i: dsi: Fix TCON DRQ set bits
	drm/amdkfd: fix a potential NULL pointer dereference (v2)
	x86/math-emu: Check __copy_from_user() result
	drm/amd/powerplay: A workaround to GPU RESET on APU
	selftests/bpf: Correct path to include msg + path
	drm/amd/display: set minimum abm backlight level
	media: venus: Fix occasionally failures to suspend
	rtw88: fix NSS of hw_cap
	drm/amd/display: fix struct init in update_bounding_box
	usb: renesas_usbhs: add suspend event support in gadget mode
	crypto: aegis128-neon - use Clang compatible cflags for ARM
	hwrng: omap3-rom - Call clk_disable_unprepare() on exit only if not idled
	regulator: max8907: Fix the usage of uninitialized variable in max8907_regulator_probe()
	tools/memory-model: Fix data race detection for unordered store and load
	media: flexcop-usb: fix NULL-ptr deref in flexcop_usb_transfer_init()
	media: cec-funcs.h: add status_req checks
	media: meson/ao-cec: move cec_notifier_cec_adap_register after hw setup
	drm/bridge: dw-hdmi: Refuse DDC/CI transfers on the internal I2C controller
	samples: pktgen: fix proc_cmd command result check logic
	block: Fix writeback throttling W=1 compiler warnings
	drm/amdkfd: Fix MQD size calculation
	MIPS: futex: Emit Loongson3 sync workarounds within asm
	mwifiex: pcie: Fix memory leak in mwifiex_pcie_init_evt_ring
	drm/drm_vblank: Change EINVAL by the correct errno
	selftests/bpf: Fix btf_dump padding test case
	libbpf: Fix struct end padding in btf_dump
	libbpf: Fix passing uninitialized bytes to setsockopt
	net/smc: increase device refcount for added link group
	team: call RCU read lock when walking the port_list
	media: cx88: Fix some error handling path in 'cx8800_initdev()'
	crypto: inside-secure - Fix a maybe-uninitialized warning
	crypto: aegis128/simd - build 32-bit ARM for v8 architecture explicitly
	misc: fastrpc: fix memory leak from miscdev->name
	ASoC: SOF: enable sync_write in hdac_bus
	media: ti-vpe: vpe: Fix Motion Vector vpdma stride
	media: ti-vpe: vpe: fix a v4l2-compliance warning about invalid pixel format
	media: ti-vpe: vpe: fix a v4l2-compliance failure about frame sequence number
	media: ti-vpe: vpe: Make sure YUYV is set as default format
	media: ti-vpe: vpe: fix a v4l2-compliance failure causing a kernel panic
	media: ti-vpe: vpe: ensure buffers are cleaned up properly in abort cases
	drm/amd/display: Properly round nominal frequency for SPD
	drm/amd/display: wait for set pipe mcp command completion
	media: ti-vpe: vpe: fix a v4l2-compliance failure about invalid sizeimage
	drm/amd/display: add new active dongle to existent w/a
	syscalls/x86: Use the correct function type in SYSCALL_DEFINE0
	drm/amd/display: Fix dongle_caps containing stale information.
	extcon: sm5502: Reset registers during initialization
	drm/amd/display: Program DWB watermarks from correct state
	x86/mm: Use the correct function type for native_set_fixmap()
	ath10k: Correct error handling of dma_map_single()
	rtw88: coex: Set 4 slot mode for A2DP
	drm/bridge: dw-hdmi: Restore audio when setting a mode
	perf test: Report failure for mmap events
	perf report: Add warning when libunwind not compiled in
	perf test: Avoid infinite loop for task exit case
	perf vendor events arm64: Fix Hisi hip08 DDRC PMU eventname
	usb: usbfs: Suppress problematic bind and unbind uevents.
	drm/amd/powerplay: avoid disabling ECC if RAS is enabled for VEGA20
	iio: adc: max1027: Reset the device at probe time
	Bluetooth: btusb: avoid unused function warning
	Bluetooth: missed cpu_to_le16 conversion in hci_init4_req
	Bluetooth: Workaround directed advertising bug in Broadcom controllers
	Bluetooth: hci_core: fix init for HCI_USER_CHANNEL
	bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()
	x86/mce: Lower throttling MCE messages' priority to warning
	drm/amd/display: enable hostvm based on roimmu active for dcn2.1
	drm/amd/display: fix header for RN clk mgr
	drm/amdgpu: fix amdgpu trace event print string format error
	staging: iio: ad9834: add a check for devm_clk_get
	power: supply: cpcap-battery: Check voltage before orderly_poweroff
	perf tests: Disable bp_signal testing for arm64
	selftests/bpf: Make a copy of subtest name
	net: hns3: log and clear hardware error after reset complete
	RDMA/hns: Fix wrong parameters when initial mtt of srq->idx_que
	drm/gma500: fix memory disclosures due to uninitialized bytes
	ASoC: soc-pcm: fixup dpcm_prune_paths() loop continue
	rtl8xxxu: fix RTL8723BU connection failure issue after warm reboot
	RDMA/siw: Fix SQ/RQ drain logic
	ipmi: Don't allow device module unload when in use
	x86/ioapic: Prevent inconsistent state when moving an interrupt
	media: cedrus: Fix undefined shift with a SHIFT_AND_MASK_BITS macro
	media: aspeed: set hsync and vsync polarities to normal before starting mode detection
	drm/nouveau: Don't grab runtime PM refs for HPD IRQs
	media: ov6650: Fix stored frame interval not in sync with hardware
	media: ad5820: Define entity function
	media: ov5640: Make 2592x1944 mode only available at 15 fps
	media: st-mipid02: add a check for devm_gpiod_get_optional
	media: imx7-mipi-csis: Add a check for devm_regulator_get
	media: aspeed: clear garbage interrupts
	media: smiapp: Register sensor after enabling runtime PM on the device
	md: no longer compare spare disk superblock events in super_load
	staging: wilc1000: potential corruption in wilc_parse_join_bss_param()
	md/bitmap: avoid race window between md_bitmap_resize and bitmap_file_clear_bit
	drm: Don't free jobs in wait_event_interruptible()
	EDAC/amd64: Set grain per DIMM
	arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
	drm/amd/display: setting the DIG_MODE to the correct value.
	i40e: initialize ITRN registers with correct values
	drm/amd/display: correctly populate dpp refclk in fpga
	i40e: Wrong 'Advertised FEC modes' after set FEC to AUTO
	net: phy: dp83867: enable robust auto-mdix
	drm/tegra: sor: Use correct SOR index on Tegra210
	regulator: core: Release coupled_rdevs on regulator_init_coupling() error
	ubsan, x86: Annotate and allow __ubsan_handle_shift_out_of_bounds() in uaccess regions
	spi: sprd: adi: Add missing lock protection when rebooting
	ACPI: button: Add DMI quirk for Medion Akoya E2215T
	RDMA/qedr: Fix memory leak in user qp and mr
	RDMA/hns: Fix memory leak on 'context' on error return path
	RDMA/qedr: Fix srqs xarray initialization
	RDMA/core: Set DMA parameters correctly
	staging: wilc1000: check if device is initialzied before changing vif
	gpu: host1x: Allocate gather copy for host1x
	net: dsa: LAN9303: select REGMAP when LAN9303 enable
	phy: renesas: phy-rcar-gen2: Fix the array off by one warning
	phy: qcom-usb-hs: Fix extcon double register after power cycle
	s390/time: ensure get_clock_monotonic() returns monotonic values
	s390: add error handling to perf_callchain_kernel
	s390/mm: add mm_pxd_folded() checks to pxd_free()
	net: hns3: add struct netdev_queue debug info for TX timeout
	libata: Ensure ata_port probe has completed before detach
	loop: fix no-unmap write-zeroes request behavior
	net/mlx5e: Verify that rule has at least one fwd/drop action
	pinctrl: sh-pfc: sh7734: Fix duplicate TCLK1_B
	ALSA: bebob: expand sleep just after breaking connections for protocol version 1
	iio: dln2-adc: fix iio_triggered_buffer_postenable() position
	libbpf: Fix error handling in bpf_map__reuse_fd()
	Bluetooth: Fix advertising duplicated flags
	ALSA: pcm: Fix missing check of the new non-cached buffer type
	spi: sifive: disable clk when probe fails and remove
	ASoC: SOF: imx: fix reverse CONFIG_SND_SOC_SOF_OF dependency
	pinctrl: qcom: sc7180: Add missing tile info in SDC_QDSD_PINGROUP/UFS_RESET
	pinctrl: amd: fix __iomem annotation in amd_gpio_irq_handler()
	ixgbe: protect TX timestamping from API misuse
	cpufreq: sun50i: Fix CPU speed bin detection
	media: rcar_drif: fix a memory disclosure
	media: v4l2-core: fix touch support in v4l_g_fmt
	nvme: introduce "Command Aborted By host" status code
	media: staging/imx: Use a shorter name for driver
	nvmem: imx-ocotp: reset error status on probe
	nvmem: core: fix nvmem_cell_write inline function
	ASoC: SOF: topology: set trigger order for FE DAI link
	media: vivid: media_device_cleanup was called too early
	spi: dw: Fix Designware SPI loopback
	bnx2x: Fix PF-VF communication over multi-cos queues.
	spi: img-spfi: fix potential double release
	ALSA: timer: Limit max amount of slave instances
	RDMA/core: Fix return code when modify_port isn't supported
	drm: msm: a6xx: fix debug bus register configuration
	rtlwifi: fix memory leak in rtl92c_set_fw_rsvdpagepkt()
	perf probe: Fix to find range-only function instance
	perf cs-etm: Fix definition of macro TO_CS_QUEUE_NR
	perf probe: Fix to list probe event with correct line number
	perf jevents: Fix resource leak in process_mapfile() and main()
	perf probe: Walk function lines in lexical blocks
	perf probe: Fix to probe an inline function which has no entry pc
	perf probe: Fix to show ranges of variables in functions without entry_pc
	perf probe: Fix to show inlined function callsite without entry_pc
	libsubcmd: Use -O0 with DEBUG=1
	perf probe: Fix to probe a function which has no entry pc
	perf tools: Fix cross compile for ARM64
	perf tools: Splice events onto evlist even on error
	drm/amdgpu: disallow direct upload save restore list from gfx driver
	drm/amd/powerplay: fix struct init in renoir_print_clk_levels
	drm/amdgpu: fix potential double drop fence reference
	ice: Check for null pointer dereference when setting rings
	xen/gntdev: Use select for DMA_SHARED_BUFFER
	perf parse: If pmu configuration fails free terms
	perf probe: Skip overlapped location on searching variables
	net: avoid potential false sharing in neighbor related code
	perf probe: Return a better scope DIE if there is no best scope
	perf probe: Fix to show calling lines of inlined functions
	perf probe: Skip end-of-sequence and non statement lines
	perf probe: Filter out instances except for inlined subroutine and subprogram
	libbpf: Fix negative FD close() in xsk_setup_xdp_prog()
	s390/bpf: Use kvcalloc for addrs array
	cgroup: freezer: don't change task and cgroups status unnecessarily
	selftests: proc: Make va_max 1MB
	drm/amdgpu: Avoid accidental thread reactivation.
	media: exynos4-is: fix wrong mdev and v4l2 dev order in error path
	ath10k: fix get invalid tx rate for Mesh metric
	fsi: core: Fix small accesses and unaligned offsets via sysfs
	selftests: net: Fix printf format warnings on arm
	media: pvrusb2: Fix oops on tear-down when radio support is not present
	soundwire: intel: fix PDI/stream mapping for Bulk
	crypto: atmel - Fix authenc support when it is set to m
	ice: delay less
	media: si470x-i2c: add missed operations in remove
	media: cedrus: Use helpers to access capture queue
	media: v4l2-ctrl: Lock main_hdl on operations of requests_queued.
	iio: cros_ec_baro: set info_mask_shared_by_all_available field
	EDAC/ghes: Fix grain calculation
	media: vicodec: media_device_cleanup was called too early
	media: vim2m: media_device_cleanup was called too early
	spi: pxa2xx: Add missed security checks
	ASoC: rt5677: Mark reg RT5677_PWR_ANLG2 as volatile
	iio: dac: ad5446: Add support for new AD5600 DAC
	bpf, testing: Workaround a verifier failure for test_progs
	ASoC: Intel: kbl_rt5663_rt5514_max98927: Add dmic format constraint
	net: dsa: sja1105: Disallow management xmit during switch reset
	r8169: respect EEE user setting when restarting network
	s390/disassembler: don't hide instruction addresses
	net: ethernet: ti: Add dependency for TI_DAVINCI_EMAC
	nvme: Discard workaround for non-conformant devices
	parport: load lowlevel driver if ports not found
	bcache: fix static checker warning in bcache_device_free()
	cpufreq: Register drivers only after CPU devices have been registered
	qtnfmac: fix debugfs support for multiple cards
	qtnfmac: fix invalid channel information output
	x86/crash: Add a forward declaration of struct kimage
	qtnfmac: fix using skb after free
	RDMA/efa: Clear the admin command buffer prior to its submission
	tracing: use kvcalloc for tgid_map array allocation
	MIPS: ralink: enable PCI support only if driver for mt7621 SoC is selected
	tracing/kprobe: Check whether the non-suffixed symbol is notrace
	bcache: fix deadlock in bcache_allocator
	iwlwifi: mvm: fix unaligned read of rx_pkt_status
	ASoC: wm8904: fix regcache handling
	regulator: core: Let boot-on regulators be powered off
	spi: tegra20-slink: add missed clk_unprepare
	tun: fix data-race in gro_normal_list()
	xhci-pci: Allow host runtime PM as default also for Intel Ice Lake xHCI
	crypto: virtio - deal with unsupported input sizes
	mmc: tmio: Add MMC_CAP_ERASE to allow erase/discard/trim requests
	btrfs: don't prematurely free work in end_workqueue_fn()
	btrfs: don't prematurely free work in run_ordered_work()
	sched/uclamp: Fix overzealous type replacement
	ASoC: wm2200: add missed operations in remove and probe failure
	spi: st-ssc4: add missed pm_runtime_disable
	ASoC: wm5100: add missed pm_runtime_disable
	perf/core: Fix the mlock accounting, again
	selftests, bpf: Fix test_tc_tunnel hanging
	selftests, bpf: Workaround an alu32 sub-register spilling issue
	bnxt_en: Return proper error code for non-existent NVM variable
	net: phy: avoid matching all-ones clause 45 PHY IDs
	firmware_loader: Fix labels with comma for builtin firmware
	ASoC: Intel: bytcr_rt5640: Update quirk for Acer Switch 10 SW5-012 2-in-1
	x86/insn: Add some Intel instructions to the opcode map
	net-af_xdp: Use correct number of channels from ethtool
	brcmfmac: remove monitor interface when detaching
	perf session: Fix decompression of PERF_RECORD_COMPRESSED records
	perf probe: Fix to show function entry line as probe-able
	s390/crypto: Fix unsigned variable compared with zero
	s390/kasan: support memcpy_real with TRACE_IRQFLAGS
	bnxt_en: Improve RX buffer error handling.
	iwlwifi: check kasprintf() return value
	fbtft: Make sure string is NULL terminated
	ASoC: soc-pcm: check symmetry before hw_params
	net: ethernet: ti: ale: clean ale tbl on init and intf restart
	mt76: fix possible out-of-bound access in mt7615_fill_txs/mt7603_fill_txs
	s390/cpumf: Adjust registration of s390 PMU device drivers
	crypto: sun4i-ss - Fix 64-bit size_t warnings
	crypto: sun4i-ss - Fix 64-bit size_t warnings on sun4i-ss-hash.c
	mac80211: consider QoS Null frames for STA_NULLFUNC_ACKED
	crypto: vmx - Avoid weird build failures
	libtraceevent: Fix memory leakage in copy_filter_type
	mips: fix build when "48 bits virtual memory" is enabled
	drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
	ice: Only disable VF state when freeing each VF resources
	ice: Fix setting coalesce to handle DCB configuration
	net: phy: initialise phydev speed and duplex sanely
	tools, bpf: Fix build for 'make -s tools/bpf O=<dir>'
	RDMA/bnxt_re: Fix missing le16_to_cpu
	RDMA/bnxt_re: Fix stat push into dma buffer on gen p5 devices
	bpf: Provide better register bounds after jmp32 instructions
	RDMA/bnxt_re: Fix chip number validation Broadcom's Gen P5 series
	ibmvnic: Fix completion structure initialization
	net: wireless: intel: iwlwifi: fix GRO_NORMAL packet stalling
	MIPS: futex: Restore \n after sync instructions
	btrfs: don't prematurely free work in reada_start_machine_worker()
	btrfs: don't prematurely free work in scrub_missing_raid56_worker()
	Revert "mmc: sdhci: Fix incorrect switch to HS mode"
	mmc: mediatek: fix CMD_TA to 2 for MT8173 HS200/HS400 mode
	tpm_tis: reserve chip for duration of tpm_tis_core_init
	tpm: fix invalid locking in NONBLOCKING mode
	iommu: fix KASAN use-after-free in iommu_insert_resv_region
	iommu: set group default domain before creating direct mappings
	iommu/vt-d: Fix dmar pte read access not set error
	iommu/vt-d: Set ISA bridge reserved region as relaxable
	iommu/vt-d: Allocate reserved region for ISA with correct permission
	can: xilinx_can: Fix missing Rx can packets on CANFD2.0
	can: m_can: tcan4x5x: add required delay after reset
	can: j1939: j1939_sk_bind(): take priv after lock is held
	can: flexcan: fix possible deadlock and out-of-order reception after wakeup
	can: flexcan: poll MCR_LPM_ACK instead of GPR ACK for stop mode acknowledgment
	can: kvaser_usb: kvaser_usb_leaf: Fix some info-leaks to USB devices
	selftests: net: tls: remove recv_rcvbuf test
	spi: dw: Correct handling of native chipselect
	spi: cadence: Correct handling of native chipselect
	usb: xhci: Fix build warning seen with CONFIG_PM=n
	drm/amdgpu: fix uninitialized variable pasid_mapping_needed
	ath10k: Revert "ath10k: add cleanup in ath10k_sta_state()"
	RDMA/siw: Fix post_recv QP state locking
	md: avoid invalid memory access for array sb->dev_roles
	s390/ftrace: fix endless recursion in function_graph tracer
	ARM: dts: Fix vcsi regulator to be always-on for droid4 to prevent hangs
	can: flexcan: add low power enter/exit acknowledgment helper
	usbip: Fix receive error in vhci-hcd when using scatter-gather
	usbip: Fix error path of vhci_recv_ret_submit()
	spi: fsl: don't map irq during probe
	spi: fsl: use platform_get_irq() instead of of_irq_to_resource()
	efi/memreserve: Register reservations as 'reserved' in /proc/iomem
	cpufreq: Avoid leaving stale IRQ work items during CPU offline
	KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails
	mm: vmscan: protect shrinker idr replace with CONFIG_MEMCG
	USB: EHCI: Do not return -EPIPE when hub is disconnected
	intel_th: pci: Add Comet Lake PCH-V support
	intel_th: pci: Add Elkhart Lake SOC support
	intel_th: Fix freeing IRQs
	intel_th: msu: Fix window switching without windows
	platform/x86: hp-wmi: Make buffer for HPWMI_FEATURE2_QUERY 128 bytes
	staging: comedi: gsc_hpdi: check dma_alloc_coherent() return value
	tty/serial: atmel: fix out of range clock divider handling
	serial: sprd: Add clearing break interrupt operation
	pinctrl: baytrail: Really serialize all register accesses
	clk: imx: clk-imx7ulp: Add missing sentinel of ulp_div_table
	clk: imx: clk-composite-8m: add lock to gate/mux
	clk: imx: pll14xx: fix clk_pll14xx_wait_lock
	ext4: fix ext4_empty_dir() for directories with holes
	ext4: check for directory entries too close to block end
	ext4: unlock on error in ext4_expand_extra_isize()
	ext4: validate the debug_want_extra_isize mount option at parse time
	iocost: over-budget forced IOs should schedule async delay
	KVM: PPC: Book3S HV: Fix regression on big endian hosts
	kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
	kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
	KVM: arm/arm64: Properly handle faulting of device mappings
	KVM: arm64: Ensure 'params' is initialised when looking up sys register
	x86/intel: Disable HPET on Intel Coffee Lake H platforms
	x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()
	x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
	x86/mce: Fix possibly incorrect severity calculation on AMD
	powerpc/vcpu: Assume dedicated processors as non-preempt
	powerpc/irq: fix stack overflow verification
	ocxl: Fix concurrent AFU open and device removal
	mmc: sdhci-msm: Correct the offset and value for DDR_CONFIG register
	mmc: sdhci-of-esdhc: Revert "mmc: sdhci-of-esdhc: add erratum A-009204 support"
	mmc: sdhci: Update the tuning failed messages to pr_debug level
	mmc: sdhci-of-esdhc: fix P2020 errata handling
	mmc: sdhci: Workaround broken command queuing on Intel GLK
	mmc: sdhci: Add a quirk for broken command queuing
	nbd: fix shutdown and recv work deadlock v2
	iwlwifi: pcie: move power gating workaround earlier in the flow
	Linux 5.4.7

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3585238149235bf73bb453e25861d9a6b9193dfa
2019-12-31 17:56:13 +01:00
Rafael J. Wysocki
da06508bcb cpufreq: Avoid leaving stale IRQ work items during CPU offline
commit 85572c2c4a upstream.

The scheduler code calling cpufreq_update_util() may run during CPU
offline on the target CPU after the IRQ work lists have been flushed
for it, so the target CPU should be prevented from running code that
may queue up an IRQ work item on it at that point.

Unfortunately, that may not be the case if dvfs_possible_from_any_cpu
is set for at least one cpufreq policy in the system, because that
allows the CPU going offline to run the utilization update callback
of the cpufreq governor on behalf of another (online) CPU in some
cases.

If that happens, the cpufreq governor callback may queue up an IRQ
work on the CPU running it, which is going offline, and the IRQ work
may not be flushed after that point.  Moreover, that IRQ work cannot
be flushed until the "offlining" CPU goes back online, so if any
other CPU calls irq_work_sync() to wait for the completion of that
IRQ work, it will have to wait until the "offlining" CPU is back
online and that may not happen forever.  In particular, a system-wide
deadlock may occur during CPU online as a result of that.

The failing scenario is as follows.  CPU0 is the boot CPU, so it
creates a cpufreq policy and becomes the "leader" of it
(policy->cpu).  It cannot go offline, because it is the boot CPU.
Next, other CPUs join the cpufreq policy as they go online and they
leave it when they go offline.  The last CPU to go offline, say CPU3,
may queue up an IRQ work while running the governor callback on
behalf of CPU0 after leaving the cpufreq policy because of the
dvfs_possible_from_any_cpu effect described above.  Then, CPU0 is
the only online CPU in the system and the stale IRQ work is still
queued on CPU3.  When, say, CPU1 goes back online, it will run
irq_work_sync() to wait for that IRQ work to complete and so it
will wait for CPU3 to go back online (which may never happen even
in principle), but (worse yet) CPU0 is waiting for CPU1 at that
point too and a system-wide deadlock occurs.

To address this problem notice that CPUs which cannot run cpufreq
utilization update code for themselves (for example, because they
have left the cpufreq policies that they belonged to), should also
be prevented from running that code on behalf of the other CPUs that
belong to a cpufreq policy with dvfs_possible_from_any_cpu set and so
in that case the cpufreq_update_util_data pointer of the CPU running
the code must not be NULL as well as for the CPU which is the target
of the cpufreq utilization update in progress.

Accordingly, change cpufreq_this_cpu_can_update() into a regular
function in kernel/sched/cpufreq.c (instead of a static inline in a
header file) and make it check the cpufreq_update_util_data pointer
of the local CPU if dvfs_possible_from_any_cpu is set for the target
cpufreq policy.

Also update the schedutil governor to do the
cpufreq_this_cpu_can_update() check in the non-fast-switch
case too to avoid the stale IRQ work issues.

Fixes: 99d14d0e16 ("cpufreq: Process remote callbacks from any CPU if the platform permits")
Link: https://lore.kernel.org/linux-pm/20191121093557.bycvdo4xyinbc5cb@vireshk-i7/
Reported-by: Anson Huang <anson.huang@nxp.com>
Tested-by: Anson Huang <anson.huang@nxp.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Peng Fan <peng.fan@nxp.com> (i.MX8QXP-MEK)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:46:06 +01:00
Valentin Schneider
c598c8a46d sched/uclamp: Fix overzealous type replacement
[ Upstream commit 7763baace1 ]

Some uclamp helpers had their return type changed from 'unsigned int' to
'enum uclamp_id' by commit

  0413d7f33e ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values")

but it happens that some do return a value in the [0, SCHED_CAPACITY_SCALE]
range, which should really be unsigned int. The affected helpers are
uclamp_none(), uclamp_rq_max_value() and uclamp_eff_value(). Fix those up.

Note that this doesn't lead to any obj diff using a relatively recent
aarch64 compiler (8.3-2019.03). The current code of e.g. uclamp_eff_value()
properly returns an 11 bit value (bits_per(1024)) and doesn't seem to do
anything funny. I'm still marking this as fixing the above commit to be on
the safe side.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: patrick.bellasi@matbug.net
Cc: qperret@google.com
Cc: surenb@google.com
Cc: tj@kernel.org
Fixes: 0413d7f33e ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values")
Link: https://lkml.kernel.org/r/20191115103908.27610-1-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31 16:45:37 +01:00
Amit Kucheria
98117f2abf UPSTREAM: cpufreq: Initialize the governors in core_initcall
Initialize the cpufreq governors earlier to allow for earlier
performance control during the boot process.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/b98eae9b44eb2f034d7f5d12a161f5f831be1eb7.1571656015.git.amit.kucheria@linaro.org
(cherry picked from commit 3f6ec871e1)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Bug: 146449535
Change-Id: Ib5599ade3811fb44a15cda6ee3791d997207cad5
2019-12-20 09:57:40 -08:00
Sami Tolvanen
ff9de73a0a FROMLIST: add support for Clang's Shadow Call Stack (SCS)
This change adds generic support for Clang's Shadow Call Stack,
which uses a shadow stack to protect return addresses from being
overwritten by an attacker. Details are available here:

  https://clang.llvm.org/docs/ShadowCallStack.html

Note that security guarantees in the kernel differ from the
ones documented for user space. The kernel must store addresses
of shadow stacks used by other tasks and interrupt handlers in
memory, which means an attacker capable reading and writing
arbitrary memory may be able to locate them and hijack control
flow by modifying shadow stacks that are not currently in use.

Bug: 145210207
Change-Id: I2a8ba6a3decac50c169731c3121c9dcab96621d2
(am from https://lore.kernel.org/patchwork/patch/1149054/)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2019-11-27 12:49:09 -08:00
Greg Kroah-Hartman
ad5859c6ae Merge 5.4-rc8 into android-mainline
Linux 5.4-rc8

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1f55e5d34dc78ddb064910ce1e1b7a7b5b39aaba
2019-11-18 08:31:11 +01:00
Qais Yousef
6e1ff0773f sched/uclamp: Fix incorrect condition
uclamp_update_active() should perform the update when
p->uclamp[clamp_id].active is true. But when the logic was inverted in
[1], the if condition wasn't inverted correctly too.

[1] https://lore.kernel.org/lkml/20190902073836.GO2369@hirez.programming.kicks-ass.net/

Reported-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: babbe170e0 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Link: https://lkml.kernel.org/r/20191114211052.15116-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-15 11:02:18 +01:00
Vincent Guittot
b90f7c9d21 sched/pelt: Fix update of blocked PELT ordering
update_cfs_rq_load_avg() can call cpufreq_update_util() to trigger an
update of the frequency. Make sure that RT, DL and IRQ PELT signals have
been updated before calling cpufreq.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: dsmythies@telus.net
Cc: juri.lelli@redhat.com
Cc: mgorman@suse.de
Cc: rostedt@goodmis.org
Fixes: 371bf42732 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e163 ("sched/dl: Add dl_rq utilization tracking")
Fixes: 91c27493e7 ("sched/irq: Add IRQ utilization tracking")
Link: https://lkml.kernel.org/r/1572434309-32512-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-13 08:01:31 +01:00
Peter Zijlstra
ff51ff84d8 sched/core: Avoid spurious lock dependencies
While seemingly harmless, __sched_fork() does hrtimer_init(), which,
when DEBUG_OBJETS, can end up doing allocations.

This then results in the following lock order:

  rq->lock
    zone->lock.rlock
      batched_entropy_u64.lock

Which in turn causes deadlocks when we do wakeups while holding that
batched_entropy lock -- as the random code does.

Solve this by moving __sched_fork() out from under rq->lock. This is
safe because nothing there relies on rq->lock, as also evident from the
other __sched_fork() callsite.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: bigeasy@linutronix.de
Cc: cl@linux.com
Cc: keescook@chromium.org
Cc: penberg@kernel.org
Cc: rientjes@google.com
Cc: thgarnie@google.com
Cc: tytso@mit.edu
Cc: will@kernel.org
Fixes: b7d5dc2107 ("random: add a spinlock_t to struct batched_entropy")
Link: https://lkml.kernel.org/r/20191001091837.GK4536@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-13 08:01:30 +01:00
Greg Kroah-Hartman
682d8bf784 Merge tag 'v5.4-rc7' into android-mainline
Linux 5.4-rc7

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I505207a0a6f68ccc3519d7f190d8faf25d9d479a
2019-11-11 06:10:55 +01:00
Greg Kroah-Hartman
8c36c8518c Revert "Revert "sched: Rework pick_next_task() slow-path""
This reverts commit 9acb7c07b6.

The root problem has now been fixed in 5.4-rc7, so revert this to
prevent merge issues.

Bug: 142182814
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-11 06:09:50 +01:00
Peter Zijlstra
6e2df0581f sched: Fix pick_next_task() vs 'change' pattern race
Commit 67692435c4 ("sched: Rework pick_next_task() slow-path")
inadvertly introduced a race because it changed a previously
unexplored dependency between dropping the rq->lock and
sched_class::put_prev_task().

The comments about dropping rq->lock, in for example
newidle_balance(), only mentions the task being current and ->on_cpu
being set. But when we look at the 'change' pattern (in for example
sched_setnuma()):

	queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
	running = task_current(rq, p); /* rq->curr == p */

	if (queued)
		dequeue_task(...);
	if (running)
		put_prev_task(...);

	/* change task properties */

	if (queued)
		enqueue_task(...);
	if (running)
		set_next_task(...);

It becomes obvious that if we do this after put_prev_task() has
already been called on @p, things go sideways. This is exactly what
the commit in question allows to happen when it does:

	prev->sched_class->put_prev_task(rq, prev, rf);
	if (!rq->nr_running)
		newidle_balance(rq, rf);

The newidle_balance() call will drop rq->lock after we've called
put_prev_task() and that allows the above 'change' pattern to
interleave and mess up the state.

Furthermore, it turns out we lost the RT-pull when we put the last DL
task.

Fix both problems by extracting the balancing from put_prev_task() and
doing a multi-class balance() pass before put_prev_task().

Fixes: 67692435c4 ("sched: Rework pick_next_task() slow-path")
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Quentin Perret <qperret@google.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
2019-11-08 22:34:14 +01:00
Qais Yousef
e3b8b6a0d1 sched/core: Fix compilation error when cgroup not selected
When cgroup is disabled the following compilation error was hit

	kernel/sched/core.c: In function ‘uclamp_update_active_tasks’:
	kernel/sched/core.c:1081:23: error: storage size of ‘it’ isn’t known
	  struct css_task_iter it;
			       ^~
	kernel/sched/core.c:1084:2: error: implicit declaration of function ‘css_task_iter_start’; did you mean ‘__sg_page_iter_start’? [-Werror=implicit-function-declaration]
	  css_task_iter_start(css, 0, &it);
	  ^~~~~~~~~~~~~~~~~~~
	  __sg_page_iter_start
	kernel/sched/core.c:1085:14: error: implicit declaration of function ‘css_task_iter_next’; did you mean ‘__sg_page_iter_next’? [-Werror=implicit-function-declaration]
	  while ((p = css_task_iter_next(&it))) {
		      ^~~~~~~~~~~~~~~~~~
		      __sg_page_iter_next
	kernel/sched/core.c:1091:2: error: implicit declaration of function ‘css_task_iter_end’; did you mean ‘get_task_cred’? [-Werror=implicit-function-declaration]
	  css_task_iter_end(&it);
	  ^~~~~~~~~~~~~~~~~
	  get_task_cred
	kernel/sched/core.c:1081:23: warning: unused variable ‘it’ [-Wunused-variable]
	  struct css_task_iter it;
			       ^~
	cc1: some warnings being treated as errors
	make[2]: *** [kernel/sched/core.o] Error 1

Fix by protetion uclamp_update_active_tasks() with
CONFIG_UCLAMP_TASK_GROUP

Fixes: babbe170e0 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/20191105112212.596-1-qais.yousef@arm.com
2019-11-08 22:34:14 +01:00
Greg Kroah-Hartman
2a71bdee3f Merge 5.4-rc6 into android-mainline
Linux 5.4-rc6

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I211f7159a46cf2a3dbb18afe56777cae1c13ac73
2019-11-04 06:51:47 +01:00
Quentin Perret
9acb7c07b6 Revert "sched: Rework pick_next_task() slow-path"
This reverts commit 67692435c4
temporarily in order to avoid the following splat:

[  222.515957] BUG: kernel NULL pointer dereference, address: 0000000000000040
[  222.518871] #PF: supervisor read access in kernel mode
[  222.518871] #PF: error_code(0x0000) - not-present page
[  222.518871] PGD 800000007cdf7067 P4D 800000007cdf7067 PUD 7c868067 PMD 0
[  222.518871] Oops: 0000 [#1] PREEMPT SMP PTI
[  222.518871] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.4.0-rc4-mainline-g6a9b34166c05 #1
[  222.518871] Hardware name: ChromiumOS crosvm, BIOS 0
[  222.518871] RIP: 0010:set_next_entity+0xf/0xf0
[  222.518871] Code: 42 0b e5 85 31 c0 e8 90 1a fc ff 0f 0b eb 9b 66 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 53 48 89 f3 49 89 fe <83> 7e 40 00 74 3d 4c 89 f7 48 89 de e8 80 f9 ff ff 4c 8d 7b 18 4d
[  222.518871] RSP: 0018:ffffb95040073de8 EFLAGS: 00010046
[  222.518871] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff85ec0c30
[  222.518871] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ec44c728500
[  222.518871] RBP: ffffb95040073e00 R08: 0000000000000000 R09: 0000000000000000
[  222.518871] R10: 0000000000000000 R11: ffffffff84ef0a30 R12: 0000000000000000
[  222.518871] R13: 0000000000000000 R14: ffff9ec44c728500 R15: ffffb95040073e60
[  222.518871] FS:  0000000000000000(0000) GS:ffff9ec44c700000(0000) knlGS:0000000000000000
[  222.518871] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  222.518871] CR2: 0000000000000040 CR3: 000000007c87a002 CR4: 00000000003606e0
[  222.518871] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  222.518871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  222.518871] Call Trace:
[  222.518871]  pick_next_task_fair+0xe8/0x410
[  222.518871]  ? rcu_note_context_switch+0x3ef/0x520
[  222.518871]  ? finish_task_switch+0x10d/0x250
[  222.518871]  __schedule+0x1f8/0x6e0
[  222.518871]  schedule_idle+0x27/0x40
[  222.518871]  do_idle+0x21c/0x240
[  222.518871]  cpu_startup_entry+0x25/0x30
[  222.518871]  start_secondary+0x18d/0x1b0
[  222.518871]  secondary_startup_64+0xa4/0xb0
[  222.518871] Modules linked in: vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock_diag vsock can_raw can snd_intel8x0 snd_ac97_codec ac97_bus virtio_input virtio_pci virtio_mmio virtio_crypto mmc_block mmc_core dummy_cpufreq rtc_test dummy_hcd can_dev vcan virtio_net net_failover failover virt_wifi virtio_blk virtio_gpu ttm virtio_rng virtio_ring virtio 8250_of crypto_engine
[  222.518871] CR2: 0000000000000040
[  222.518871] ---[ end trace ae741809965b22f2 ]---
[  222.518871] RIP: 0010:set_next_entity+0xf/0xf0
[  222.518871] Code: 42 0b e5 85 31 c0 e8 90 1a fc ff 0f 0b eb 9b 66 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 53 48 89 f3 49 89 fe <83> 7e 40 00 74 3d 4c 89 f7 48 89 de e8 80 f9 ff ff 4c 8d 7b 18 4d
[  222.518871] RSP: 0018:ffffb95040073de8 EFLAGS: 00010046
[  222.518871] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff85ec0c30
[  222.518871] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ec44c728500
[  222.518871] RBP: ffffb95040073e00 R08: 0000000000000000 R09: 0000000000000000
[  222.518871] R10: 0000000000000000 R11: ffffffff84ef0a30 R12: 0000000000000000
[  222.518871] R13: 0000000000000000 R14: ffff9ec44c728500 R15: ffffb95040073e60
[  222.518871] FS:  0000000000000000(0000) GS:ffff9ec44c700000(0000) knlGS:0000000000000000
[  222.518871] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  222.518871] CR2: 0000000000000040 CR3: 000000007c87a002 CR4: 00000000003606e0
[  222.518871] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  222.518871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  222.518871] Kernel panic - not syncing: Fatal exception
[  222.518871] Kernel Offset: 0x3e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Bug: 142182814
Test: 20/20 avd/avd_boot_test_kernel_triggered passing
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I62bb09b4710fdc593cbf1f42bf4bbc066e401458
2019-10-29 17:29:04 +00:00
Valentin Schneider
e284df705c sched/topology: Allow sched_asym_cpucapacity to be disabled
While the static key is correctly initialized as being disabled, it will
remain forever enabled once turned on. This means that if we start with an
asymmetric system and hotplug out enough CPUs to end up with an SMP system,
the static key will remain set - which is obviously wrong. We should detect
this and turn off things like misfit migration and capacity aware wakeups.

As Quentin pointed out, having separate root domains makes this slightly
trickier. We could have exclusive cpusets that create an SMP island - IOW,
the domains within this root domain will not see any asymmetry. This means
we can't just disable the key on domain destruction, we need to count how
many asymmetric root domains we have.

Consider the following example using Juno r0 which is 2+4 big.LITTLE, where
two identical cpusets are created: they both span both big and LITTLE CPUs:

    asym0    asym1
  [       ][       ]
   L  L  B  L  L  B

  $ cgcreate -g cpuset:asym0
  $ cgset -r cpuset.cpus=0,1,3 asym0
  $ cgset -r cpuset.mems=0 asym0
  $ cgset -r cpuset.cpu_exclusive=1 asym0

  $ cgcreate -g cpuset:asym1
  $ cgset -r cpuset.cpus=2,4,5 asym1
  $ cgset -r cpuset.mems=0 asym1
  $ cgset -r cpuset.cpu_exclusive=1 asym1

  $ cgset -r cpuset.sched_load_balance=0 .

(the CPU numbering may look odd because on the Juno LITTLEs are CPUs 0,3-5
and bigs are CPUs 1-2)

If we make one of those SMP (IOW remove asymmetry) by e.g. hotplugging its
big core, we would end up with an SMP cpuset and an asymmetric cpuset - the
static key must remain set, because we still have one asymmetric root domain.

With the above example, this could be done with:

  $ echo 0 > /sys/devices/system/cpu/cpu2/online

Which would result in:

    asym0   asym1
  [       ][    ]
   L  L  B  L  L

When both SMP and asymmetric cpusets are present, all CPUs will observe
sched_asym_cpucapacity being set (it is system-wide), but not all CPUs
observe asymmetry in their sched domain hierarchy:

  per_cpu(sd_asym_cpucapacity, <any CPU in asym0>) == <some SD at DIE level>
  per_cpu(sd_asym_cpucapacity, <any CPU in asym1>) == NULL

Change the simple key enablement to an increment, and decrement the key
counter when destroying domains that cover asymmetric CPUs.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hannes@cmpxchg.org
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: qperret@google.com
Cc: tj@kernel.org
Cc: vincent.guittot@linaro.org
Fixes: df054e8445 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations")
Link: https://lkml.kernel.org/r/20191023153745.19515-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29 09:58:46 +01:00
Valentin Schneider
cd1cb33505 sched/topology: Don't try to build empty sched domains
Turns out hotplugging CPUs that are in exclusive cpusets can lead to the
cpuset code feeding empty cpumasks to the sched domain rebuild machinery.

This leads to the following splat:

    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23
    Hardware name: ARM Juno development board (r0) (DT)
    Workqueue: events cpuset_hotplug_workfn
    pstate: 60000005 (nZCv daif -PAN -UAO)
    pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
    lr : build_sched_domains (kernel/sched/topology.c:1966)
    Call trace:
    build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969)
    partition_sched_domains_locked (kernel/sched/topology.c:2250)
    rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019)
    rebuild_sched_domains (kernel/cgroup/cpuset.c:1032)
    cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2))
    process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274)
    worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416)
    kthread (kernel/kthread.c:255)
    ret_from_fork (arch/arm64/kernel/entry.S:1167)
    Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853)

The faulty line in question is:

  cap = arch_scale_cpu_capacity(cpumask_first(cpu_map));

and we're not checking the return value against nr_cpu_ids (we shouldn't
have to!), which leads to the above.

Prevent generate_sched_domains() from returning empty cpumasks, and add
some assertion in build_sched_domains() to scream bloody murder if it
happens again.

The above splat was obtained on my Juno r0 with the following reproducer:

  $ cgcreate -g cpuset:asym
  $ cgset -r cpuset.cpus=0-3 asym
  $ cgset -r cpuset.mems=0 asym
  $ cgset -r cpuset.cpu_exclusive=1 asym

  $ cgcreate -g cpuset:smp
  $ cgset -r cpuset.cpus=4-5 smp
  $ cgset -r cpuset.mems=0 smp
  $ cgset -r cpuset.cpu_exclusive=1 smp

  $ cgset -r cpuset.sched_load_balance=0 .

  $ echo 0 > /sys/devices/system/cpu/cpu4/online
  $ echo 0 > /sys/devices/system/cpu/cpu5/online

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hannes@cmpxchg.org
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: qperret@google.com
Cc: tj@kernel.org
Cc: vincent.guittot@linaro.org
Fixes: 05484e0984 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection")
Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29 09:58:45 +01:00
Chris Redpath
06dc94f19d ANDROID: sched: Honor sync flag for energy-aware wakeups
Since we don't do energy-aware wakeups when we are overutilized, always
honoring sync wakeups in this state does not prevent wake-wide mechanics
overruling the flag as normal.

Having the check for (cpu_rq(cpu)->nr_running == 1) in the test condition
means that we only bypass the energy model for sync wakeups where the
cpu is about to become idle. This matches the use of this flag in Binder
which is the main place where sync wakeups come from in Android.

This patch is based upon previous work to build EAS for android products.
It replaces
commit 79e3a4a27e ("ANDROID: sched: Unconditionally honor sync flag
 for energy-aware wakeups")
in what's considered to be 'EAS'.

sync-hint code taken from wahoo
https://android.googlesource.com/kernel/msm/+/4a5e890ec60d
written by Juri Lelli <juri.lelli@arm.com>

Change-Id: I4b3d79141fc8e53dc51cd63ac11096c2e3cb10f5
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
(cherry-picked from commit f1ec666a62de)
[ Moved the feature to find_energy_efficient_cpu() and removed the
  sysctl knob ]
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
[ Add 'cpu_rq(cpu)->nr_running == 1' to the condition and remove the
  word 'Unconditionally' from the patch name ]
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
2019-10-22 09:04:43 +00:00
Greg Kroah-Hartman
630839ac24 Merge 5.4-rc3 into android-mainline
Linux 5.4-rc3

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia87ba662738dd58ddb917e32c1fbd812861e7a46
2019-10-17 05:28:13 -07:00