kernel_arpi

Author	SHA1	Message	Date
Alistair Delva	2a3049590d	ANDROID: Fix wq fp check for CFI builds A previous change added a test on the wrong config flag; rename CFI to CFI_CLANG. Bug: 145210207 Change-Id: Id8aead2eb2c75ad6442d10165f6cb86ccfb9c2f9 Signed-off-by: Alistair Delva <adelva@google.com>	2020-04-03 19:36:46 +00:00
Greg Kroah-Hartman	0d3cca0c7d	Merge 5.4.26 into android-5.4 Changes in 5.4.26 virtio_balloon: Adjust label in virtballoon_probe ALSA: hda/realtek - More constifications ALSA: hda/realtek - Add Headset Mic supported for HP cPC ALSA: hda/realtek - Fixed one of HP ALC671 platform Headset Mic supported cgroup, netclassid: periodically release file_lock on classid updating gre: fix uninit-value in __iptunnel_pull_header inet_diag: return classid for all socket types ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface ipvlan: add cond_resched_rcu() while processing muticast backlog ipvlan: do not add hardware address of master to its unicast filter list ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast() ipvlan: don't deref eth hdr before checking it's set macvlan: add cond_resched() during multicast processing net: dsa: fix phylink_start()/phylink_stop() calls net: dsa: mv88e6xxx: fix lockup on warm boot net: fec: validate the new settings in fec_enet_set_coalesce() net: hns3: fix a not link up issue when fibre port supports autoneg net/ipv6: use configured metric when add peer route netlink: Use netlink header as base to calculate bad attribute offset net: macsec: update SCI upon MAC address change. net: nfc: fix bounds checking bugs on "pipe" net/packet: tpacket_rcv: do not increment ring index on drop net: phy: bcm63xx: fix OOPS due to missing driver name net: stmmac: dwmac1000: Disable ACS if enhanced descs are not used net: systemport: fix index check to avoid an array out of bounds access r8152: check disconnect status after long sleep sfc: detach from cb_page in efx_copy_channel() slip: make slhc_compress() more robust against malicious packets taprio: Fix sending packets without dequeueing them bonding/alb: make sure arp header is pulled before accessing it bnxt_en: reinitialize IRQs when MTU is modified bnxt_en: fix error handling when flashing from file cgroup: memcg: net: do not associate sock with unrelated cgroup net: memcg: late association of sock to memcg net: memcg: fix lockdep splat in inet_csk_accept() devlink: validate length of param values devlink: validate length of region addr/len fib: add missing attribute validation for tun_id nl802154: add missing attribute validation nl802154: add missing attribute validation for dev_type can: add missing attribute validation for termination macsec: add missing attribute validation for port net: fq: add missing attribute validation for orphan mask net: taprio: add missing attribute validation for txtime delay team: add missing attribute validation for port ifindex team: add missing attribute validation for array index tipc: add missing attribute validation for MTU property nfc: add missing attribute validation for SE API nfc: add missing attribute validation for deactivate target nfc: add missing attribute validation for vendor subcommand net: phy: avoid clearing PHY interrupts twice in irq handler net: phy: fix MDIO bus PM PHY resuming net/ipv6: need update peer route when modify metric net/ipv6: remove the old peer route if change it to a new one selftests/net/fib_tests: update addr_metric_test for peer route testing net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed net: phy: Avoid multiple suspends cgroup: cgroup_procs_next should increase position index cgroup: Iterate tasks that did not finish do_exit() netfilter: nf_tables: fix infinite loop when expr is not available iwlwifi: mvm: Do not require PHY_SKU NVM section for 3168 devices virtio-blk: fix hw_queue stopped on arbitrary error iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint netfilter: nf_conntrack: ct_cpu_seq_next should increase position index netfilter: synproxy: synproxy_cpu_seq_next should increase position index netfilter: xt_recent: recent_seq_next should increase position index netfilter: x_tables: xt_mttg_seq_next should increase position index workqueue: don't use wq_select_unbound_cpu() for bound works drm/amd/display: remove duplicated assignment to grph_obj_type drm/i915: be more solid in checking the alignment drm/i915: Defer semaphore priority bumping to a workqueue mmc: sdhci-pci-gli: Enable MSI interrupt for GL975x pinctrl: falcon: fix syntax error ktest: Add timeout for ssh sync testing cifs_atomic_open(): fix double-put on late allocation failure gfs2_atomic_open(): fix O_EXCL\|O_CREAT handling on cold dcache KVM: x86: clear stale x86_emulate_ctxt->intercept value KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs ARC: define __ALIGN_STR and __ALIGN symbols for ARC fuse: fix stack use after return s390/dasd: fix data corruption for thin provisioned devices ipmi_si: Avoid spurious errors for optional IRQs blk-iocost: fix incorrect vtime comparison in iocg_is_idle() fscrypt: don't evict dirty inodes after removing key macintosh: windfarm: fix MODINFO regression x86/ioremap: Map EFI runtime services data as encrypted for SEV efi: Fix a race and a buffer overflow while reading efivars via sysfs efi: Add a sanity check to efivar_store_raw() i2c: designware-pci: Fix BUG_ON during device removal mt76: fix array overflow on receiving too many fragments for a packet perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag x86/mce: Fix logic and comments around MSR_PPIN_CTL iommu/dma: Fix MSI reservation allocation iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint iommu/vt-d: Fix RCU list debugging warnings iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page batman-adv: Don't schedule OGM for disabled interface clk: imx8mn: Fix incorrect clock defines pinctrl: meson-gxl: fix GPIOX sdio pins pinctrl: imx: scu: Align imx sc msg structs to 4 virtio_ring: Fix mem leak with vring_new_virtqueue() drm/i915/gvt: Fix dma-buf display blur issue on CFL pinctrl: core: Remove extra kref_get which blocks hogs being freed drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits driver code: clarify and fix platform device DMA mask allocation iommu/vt-d: Fix RCU-list bugs in intel_iommu_init() i2c: gpio: suppress error on probe defer nl80211: add missing attribute validation for critical protocol indication nl80211: add missing attribute validation for beacon report scanning nl80211: add missing attribute validation for channel switch perf bench futex-wake: Restore thread count default to online CPU count netfilter: cthelper: add missing attribute validation for cthelper netfilter: nft_payload: add missing attribute validation for payload csum flags netfilter: nft_tunnel: add missing attribute validation for tunnels netfilter: nf_tables: dump NFTA_CHAIN_FLAGS attribute netfilter: nft_chain_nat: inet family is missing module ownership iommu/vt-d: Fix the wrong printing in RHSA parsing iommu/vt-d: Ignore devices with out-of-spec domain number i2c: acpi: put device when verifying client fails iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE ipv6: restrict IPV6_ADDRFORM operation net/smc: check for valid ib_client_data net/smc: cancel event worker during device removal Linux 5.4.26 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ifacde9164f8ded01031a9e6a9c313d4fbcead25b	2020-03-18 08:19:15 +01:00
Hillf Danton	22540ca3d0	workqueue: don't use wq_select_unbound_cpu() for bound works commit `aa202f1f56` upstream. wq_select_unbound_cpu() is designed for unbound workqueues only, but it's wrongly called when using a bound workqueue too. Fixing this ensures work queued to a bound workqueue with cpu=WORK_CPU_UNBOUND always runs on the local CPU. Before, that would happen only if wq_unbound_cpumask happened to include it (likely almost always the case), or was empty, or we got lucky with forced round-robin placement. So restricting /sys/devices/virtual/workqueue/cpumask to a small subset of a machine's CPUs would cause some bound work items to run unexpectedly there. Fixes: `ef55718044` ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Hillf Danton <hdanton@sina.com> [dj: massage changelog] Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-18 07:17:50 +01:00
Sami Tolvanen	30063ed94e	ANDROID: Disable wq fp check in CFI builds With non-canonical CFI, LLVM generates jump table entries for external symbols in modules and as a result, a function pointer passed from a module to the core kernel will have a different address. Disable the warning for now. Bug: 145210207 Change-Id: Ifdcee3479280f7b97abdee6b4c746f447e0944e6 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Alistair Delva <adelva@google.com>	2020-02-27 00:52:08 +00:00
Sebastian Andrzej Siewior	16128944c9	workqueue: Add RCU annotation for pwq list walk [ Upstream commit `49e9d1a9fa` ] An additional check has been recently added to ensure that a RCU related lock is held while the RCU list is iterated. The `pwqs' are sometimes iterated without a RCU lock but with the &wq->mutex acquired leading to a warning. Teach list_for_each_entry_rcu() that the RCU usage is okay if &wq->mutex is acquired during the list traversal. Fixes: `28875945ba` ("rcu: Add support for consolidated-RCU reader checking") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-26 10:01:07 +01:00
Tejun Heo	26ba4f73a0	workqueue: Fix missing kfree(rescuer) in destroy_workqueue() commit `8efe1223d7` upstream. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Qian Cai <cai@lca.pw> Fixes: `def98c84b6` ("workqueue: Fix spurious sanity check failures in destroy_workqueue()") Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-17 19:56:54 +01:00
Tejun Heo	470e77ea87	workqueue: Fix pwq ref leak in rescuer_thread() commit `e66b39af00` upstream. `008847f66c` ("workqueue: allow rescuer thread to do more work.") made the rescuer worker requeue the pwq immediately if there may be more work items which need rescuing instead of waiting for the next mayday timer expiration. Unfortunately, it doesn't check whether the pwq is already on the mayday list and unconditionally gets the ref and moves it onto the list. This doesn't corrupt the list but creates an additional reference to the pwq. It got queued twice but will only be removed once. This leak later can trigger pwq refcnt warning on workqueue destruction and prevent freeing of the workqueue. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Williams, Gerald S" <gerald.s.williams@intel.com> Cc: NeilBrown <neilb@suse.de> Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-17 19:56:13 +01:00
Tejun Heo	20caa355f3	workqueue: Fix spurious sanity check failures in destroy_workqueue() commit `def98c84b6` upstream. Before actually destrying a workqueue, destroy_workqueue() checks whether it's actually idle. If it isn't, it prints out a bunch of warning messages and leaves the workqueue dangling. It unfortunately has a couple issues. * Mayday list queueing increments pwq's refcnts which gets detected as busy and fails the sanity checks. However, because mayday list queueing is asynchronous, this condition can happen without any actual work items left in the workqueue. * Sanity check failure leaves the sysfs interface behind too which can lead to init failure of newer instances of the workqueue. This patch fixes the above two by * If a workqueue has a rescuer, disable and kill the rescuer before sanity checks. Disabling and killing is guaranteed to flush the existing mayday list. * Remove sysfs interface before sanity checks. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Marcin Pawlowski <mpawlowski@fb.com> Reported-by: "Williams, Gerald S" <gerald.s.williams@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-17 19:56:13 +01:00
Daniel Jordan	509b320489	workqueue: require CPU hotplug read exclusion for apply_workqueue_attrs Change the calling convention for apply_workqueue_attrs to require CPU hotplug read exclusion. Avoids lockdep complaints about nested calls to get_online_cpus in a future patch where padata calls apply_workqueue_attrs when changing other CPU-hotplug-sensitive data structures with the CPU read lock already held. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-09-13 21:15:40 +10:00
Daniel Jordan	513c98d086	workqueue: unconfine alloc/apply/free_workqueue_attrs() padata will use these these interfaces in a later patch, so unconfine them. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-09-13 21:15:39 +10:00
Thomas Gleixner	be69d00d97	workqueue: Remove GPF argument from alloc_workqueue_attrs() All callers use GFP_KERNEL. No point in having that argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-06-27 14:12:19 -07:00
Thomas Gleixner	2c9858ecbe	workqueue: Make alloc/apply/free_workqueue_attrs() static None of those functions have any users outside of workqueue.c. Confine them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-06-27 14:12:15 -07:00
Thomas Gleixner	457c899653	treewide: Add SPDX license identifier for missed files Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:45 +02:00
Linus Torvalds	23c970608a	Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "Only three commits, of which two are trivial. The non-trivial chagne is Thomas's patch to switch workqueue from sched RCU to regular one. The use of sched RCU is mostly historic and doesn't really buy us anything noticeable" * 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Use normal rcu kernel/workqueue: Document wq_worker_last_func() argument kernel/workqueue: Use __printf markup to silence compiler in function 'alloc_workqueue'	2019-05-09 13:48:52 -07:00
Linus Torvalds	0968621917	Merge tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Allow state reset of printk_once() calls. - Prevent crashes when dereferencing invalid pointers in vsprintf(). Only the first byte is checked for simplicity. - Make vsprintf warnings consistent and inlined. - Treewide conversion of obsolete %pf, %pF to %ps, %pF printf modifiers. - Some clean up of vsprintf and test_printf code. * tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: lib/vsprintf: Make function pointer_string static vsprintf: Limit the length of inlined error messages vsprintf: Avoid confusion between invalid address and value vsprintf: Prevent crash when dereferencing invalid pointers vsprintf: Consolidate handling of unknown pointer specifiers vsprintf: Factor out %pO handler as kobject_string() vsprintf: Factor out %pV handler as va_format() vsprintf: Factor out %p[iI] handler as ip_addr_string() vsprintf: Do not check address of well-known strings vsprintf: Consistent %pK handling for kptr_restrict == 0 vsprintf: Shuffle restricted_pointer() printk: Tie printk_once / printk_deferred_once into .data.once for reset treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively lib/test_printf: Switch to bitmap_zalloc()	2019-05-07 09:18:12 -07:00
Thomas Gleixner	6d25be5782	sched/core, workqueues: Distangle worker accounting from rq lock The worker accounting for CPU bound workers is plugged into the core scheduler code and the wakeup code. This is not a hard requirement and can be avoided by keeping track of the state in the workqueue code itself. Keep track of the sleeping state in the worker itself and call the notifier before entering the core scheduler. There might be false positives when the task is woken between that call and actually scheduling, but that's not really different from scheduling and being woken immediately after switching away. When nr_running is updated when the task is retunrning from schedule() then it is later compared when it is done from ttwu(). [ bigeasy: preempt_disable() around wq_worker_sleeping() by Daniel Bristot de Oliveira ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ad2b29b5715f970bffc1a7026cabd6ff0b24076a.1532952814.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-04-16 16:55:15 +02:00
Sakari Ailus	d75f773c86	treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively %pF and %pf are functionally equivalent to %pS and %ps conversion specifiers. The former are deprecated, therefore switch the current users to use the preferred variant. The changes have been produced by the following command: git grep -l '%p[fF]' \| grep -v '^$tools\\|Documentation$/' \| \ while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done And verifying the result. Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: xen-devel@lists.xenproject.org Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: drbd-dev@lists.linbit.com Cc: linux-block@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvdimm@lists.01.org Cc: linux-pci@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: linux-btrfs@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: linux-mm@kvack.org Cc: ceph-devel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: David Sterba <dsterba@suse.com> (for btrfs) Acked-by: Mike Rapoport <rppt@linux.ibm.com> (for mm/memblock.c) Acked-by: Bjorn Helgaas <bhelgaas@google.com> (for drivers/pci) Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>	2019-04-09 14:19:06 +02:00
Thomas Gleixner	24acfb7182	workqueue: Use normal rcu There is no need for sched_rcu. The undocumented reason why sched_rcu is used is to avoid a few explicit rcu_read_lock()/unlock() pairs by the fact that sched_rcu reader side critical sections are also protected by preempt or irq disabled regions. Replace rcu_read_lock_sched with rcu_read_lock and acquire the RCU lock where it is not yet explicit acquired. Replace local_irq_disable() with rcu_read_lock(). Update asserts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bigeasy: mangle changelog a little] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-04-08 12:37:43 -07:00
Bart Van Assche	82efcab3b9	workqueue: Only unregister a registered lockdep key The recent change to prevent use after free and a memory leak introduced an unconditional call to wq_unregister_lockdep() in the error handling path. If the lockdep key had not been registered yet, then the lockdep core emits a warning. Only call wq_unregister_lockdep() if wq_register_lockdep() has been called first. Fixes: `009bb421b6` ("workqueue, lockdep: Fix an alloc_workqueue() error path") Reported-by: syzbot+be0c198232f86389c3dd@syzkaller.appspotmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: https://lkml.kernel.org/r/20190311230255.176081-1-bvanassche@acm.org	2019-03-21 12:00:18 +01:00
Bart Van Assche	8194fe94ab	kernel/workqueue: Document wq_worker_last_func() argument This patch avoids that the following warning is reported when building with W=1: kernel/workqueue.c:938: warning: Function parameter or member 'task' not described in 'wq_worker_last_func' Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-03-19 10:48:20 -07:00
Mathieu Malaterre	a2775bbc1d	kernel/workqueue: Use __printf markup to silence compiler in function 'alloc_workqueue' Silence warnings (triggered at W=1) by adding relevant __printf attributes. kernel/workqueue.c:4249:2: warning: function 'alloc_workqueue' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-03-15 08:47:22 -07:00
Linus Torvalds	9e55f87c0e	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A few fixes for lockdep: - initialize lockdep internal RCU head after initializing RCU - prevent use after free in a alloc_workqueue() error handling path - plug a memory leak in the workqueue core which fails to free a dynamically allocated lock name. - make Clang happy" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: workqueue, lockdep: Fix a memory leak in wq->lock_name workqueue, lockdep: Fix an alloc_workqueue() error path locking/lockdep: Only call init_rcu_head() after RCU has been initialized locking/lockdep: Avoid a Clang warning	2019-03-10 13:48:14 -07:00
Qian Cai	69a106c00e	workqueue, lockdep: Fix a memory leak in wq->lock_name The following commit: `669de8bda8` ("kernel/workqueue: Use dynamic lockdep keys for workqueues") introduced a memory leak as wq_free_lockdep() calls kfree(wq->lock_name), but wq_init_lockdep() does not point wq->lock_name to the newly allocated slab object. This can be reproduced by running LTP fallocate04 followed by oom01 tests: unreferenced object 0xc0000005876384d8 (size 64): comm "fallocate04", pid 26972, jiffies 4297139141 (age 40370.480s) hex dump (first 32 bytes): 28 77 71 5f 63 6f 6d 70 6c 65 74 69 6f 6e 29 65 (wq_completion)e 78 74 34 2d 72 73 76 2d 63 6f 6e 76 65 72 73 69 xt4-rsv-conversi backtrace: [<00000000cb452883>] kvasprintf+0x6c/0xe0 [<000000004654ddac>] kasprintf+0x34/0x60 [<000000001c68f311>] alloc_workqueue+0x1f8/0x6ac [<0000000003c2ad83>] ext4_fill_super+0x23d4/0x3c80 [ext4] [<0000000006610538>] mount_bdev+0x25c/0x290 [<00000000bcf955ec>] ext4_mount+0x28/0x50 [ext4] [<0000000016e08fd3>] legacy_get_tree+0x4c/0xb0 [<0000000042b6a5fc>] vfs_get_tree+0x6c/0x190 [<00000000268ab022>] do_mount+0xb9c/0x1100 [<00000000698e6898>] ksys_mount+0x158/0x180 [<0000000064e391fd>] sys_mount+0x20/0x30 [<00000000ba378f12>] system_call+0x5c/0x70 Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: catalin.marinas@arm.com Cc: jiangshanlai@gmail.com Cc: tj@kernel.org Fixes: `669de8bda8` ("kernel/workqueue: Use dynamic lockdep keys for workqueues") Link: https://lkml.kernel.org/r/20190307002731.47371-1-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-03-09 14:15:52 +01:00
Bart Van Assche	009bb421b6	workqueue, lockdep: Fix an alloc_workqueue() error path This patch fixes a use-after-free and a memory leak in an alloc_workqueue() error path. Repoted by syzkaller and KASAN: BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:197 [inline] BUG: KASAN: use-after-free in lockdep_register_key+0x3b9/0x490 kernel/locking/lockdep.c:1023 Read of size 8 at addr ffff888090fc2698 by task syz-executor134/7858 CPU: 1 PID: 7858 Comm: syz-executor134 Not tainted 5.0.0-rc8-next-20190301 #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132 __read_once_size include/linux/compiler.h:197 [inline] lockdep_register_key+0x3b9/0x490 kernel/locking/lockdep.c:1023 wq_init_lockdep kernel/workqueue.c:3444 [inline] alloc_workqueue+0x427/0xe70 kernel/workqueue.c:4263 ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732 misc_open+0x398/0x4c0 drivers/char/misc.c:141 chrdev_open+0x247/0x6b0 fs/char_dev.c:417 do_dentry_open+0x488/0x1160 fs/open.c:771 vfs_open+0xa0/0xd0 fs/open.c:880 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x46e0 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1063 __do_sys_openat fs/open.c:1090 [inline] __se_sys_openat fs/open.c:1084 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1084 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Allocated by task 7789: save_stack+0x45/0xd0 mm/kasan/common.c:75 set_track mm/kasan/common.c:87 [inline] __kasan_kmalloc mm/kasan/common.c:497 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:511 __do_kmalloc mm/slab.c:3726 [inline] __kmalloc+0x15c/0x740 mm/slab.c:3735 kmalloc include/linux/slab.h:553 [inline] kzalloc include/linux/slab.h:743 [inline] alloc_workqueue+0x13c/0xe70 kernel/workqueue.c:4236 ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732 misc_open+0x398/0x4c0 drivers/char/misc.c:141 chrdev_open+0x247/0x6b0 fs/char_dev.c:417 do_dentry_open+0x488/0x1160 fs/open.c:771 vfs_open+0xa0/0xd0 fs/open.c:880 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x46e0 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1063 __do_sys_openat fs/open.c:1090 [inline] __se_sys_openat fs/open.c:1084 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1084 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 7789: save_stack+0x45/0xd0 mm/kasan/common.c:75 set_track mm/kasan/common.c:87 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459 kasan_slab_free+0xe/0x10 mm/kasan/common.c:467 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x230 mm/slab.c:3821 alloc_workqueue+0xc3e/0xe70 kernel/workqueue.c:4295 ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732 misc_open+0x398/0x4c0 drivers/char/misc.c:141 chrdev_open+0x247/0x6b0 fs/char_dev.c:417 do_dentry_open+0x488/0x1160 fs/open.c:771 vfs_open+0xa0/0xd0 fs/open.c:880 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x46e0 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1063 __do_sys_openat fs/open.c:1090 [inline] __se_sys_openat fs/open.c:1084 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1084 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888090fc2580 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 280 bytes inside of 512-byte region [ffff888090fc2580, ffff888090fc2780) Reported-by: syzbot+17335689e239ce135d8b@syzkaller.appspotmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Fixes: `669de8bda8` ("kernel/workqueue: Use dynamic lockdep keys for workqueues") Link: https://lkml.kernel.org/r/20190303220046.29448-1-bvanassche@acm.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-03-09 14:15:52 +01:00
Linus Torvalds	b5dd0c658c	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: - some of the rest of MM - various misc things - dynamic-debug updates - checkpatch - some epoll speedups - autofs - rapidio - lib/, lib/lzo/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits) samples/mic/mpssd/mpssd.h: remove duplicate header kernel/fork.c: remove duplicated include include/linux/relay.h: fix percpu annotation in struct rchan arch/nios2/mm/fault.c: remove duplicate include unicore32: stop printing the virtual memory layout MAINTAINERS: fix GTA02 entry and mark as orphan mm: create the new vm_fault_t type arm, s390, unicore32: remove oneliner wrappers for memblock_alloc() arch: simplify several early memory allocations openrisc: simplify pte_alloc_one_kernel() sh: prefer memblock APIs returning virtual address microblaze: prefer memblock API returning virtual address powerpc: prefer memblock APIs returning virtual address lib/lzo: separate lzo-rle from lzo lib/lzo: implement run-length encoding lib/lzo: fast 8-byte copy on arm64 lib/lzo: 64-bit CTZ on arm64 lib/lzo: tidy-up ifdefs ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size ipc: annotate implicit fall through ...	2019-03-07 19:25:37 -08:00
Johannes Weiner	4b04700275	kernel: workqueue: clarify wq_worker_last_func() caller requirements This function can only be called safely from very specific scheduler contexts. Document those. Link: http://lkml.kernel.org/r/20190206150528.31198-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-07 18:32:01 -08:00
Linus Torvalds	abf7c3d8dd	Merge branch 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "All trivial. Two comment updates and one more initialization sanity check in flush_work()" * 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Fix spelling in source code comments workqueue: fix typo in comment workqueue: Try to catch flush_work() without INIT_WORK().	2019-03-07 10:09:52 -08:00
Linus Torvalds	e431f2d74e	Merge tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big driver core patchset for 5.1-rc1 More patches than "normal" here this merge window, due to some work in the driver core by Alexander Duyck to rework the async probe functionality to work better for a number of devices, and independant work from Rafael for the device link functionality to make it work "correctly". Also in here is: - lots of BUS_ATTR() removals, the macro is about to go away - firmware test fixups - ihex fixups and simplification - component additions (also includes i915 patches) - lots of minor coding style fixups and cleanups. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (65 commits) driver core: platform: remove misleading err_alloc label platform: set of_node in platform_device_register_full() firmware: hardcode the debug message for -ENOENT driver core: Add missing description of new struct device_link field driver core: Fix PM-runtime for links added during consumer probe drivers/component: kerneldoc polish async: Add cmdline option to specify drivers to be async probed driver core: Fix possible supplier PM-usage counter imbalance PM-runtime: Fix __pm_runtime_set_status() race with runtime resume driver: platform: Support parsing GpioInt 0 in platform_get_irq() selftests: firmware: fix verify_reqs() return value Revert "selftests: firmware: remove use of non-standard diff -Z option" Revert "selftests: firmware: add CONFIG_FW_LOADER_USER_HELPER_FALLBACK to config" device: Fix comment for driver_data in struct device kernfs: Allocating memory for kernfs_iattrs with kmem_cache. sysfs: remove unused include of kernfs-internal.h driver core: Postpone DMA tear-down until after devres release driver core: Document limitation related to DL_FLAG_RPM_ACTIVE PM-runtime: Take suppliers into account in __pm_runtime_set_status() device.h: Add __cold to dev_<level> logging functions ...	2019-03-06 14:52:48 -08:00
Bart Van Assche	bf393fd4a3	workqueue: Fix spelling in source code comments Change "execuing" into "executing" and "guarnateed" into "guaranteed". Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-03-05 07:52:39 -08:00
Bart Van Assche	669de8bda8	kernel/workqueue: Use dynamic lockdep keys for workqueues The following commit: `87915adc3f` ("workqueue: re-add lockdep dependencies for flushing") improved deadlock checking in the workqueue implementation. Unfortunately that patch also introduced a few false positive lockdep complaints. This patch suppresses these false positives by allocating the workqueue mutex lockdep key dynamically. An example of a false positive lockdep complaint suppressed by this patch can be found below. The root cause of the lockdep complaint shown below is that the direct I/O code can call alloc_workqueue() from inside a work item created by another alloc_workqueue() call and that both workqueues share the same lockdep key. This patch avoids that that lockdep complaint is triggered by allocating the work queue lockdep keys dynamically. In other words, this patch guarantees that a unique lockdep key is associated with each work queue mutex. ====================================================== WARNING: possible circular locking dependency detected 4.19.0-dbg+ #1 Not tainted fio/4129 is trying to acquire lock: 00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970 but task is already holding lock: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#14){+.+.}: down_write+0x3d/0x80 __generic_file_fsync+0x77/0xf0 ext4_sync_file+0x3c9/0x780 vfs_fsync_range+0x66/0x100 dio_complete+0x2f5/0x360 dio_aio_complete_work+0x1c/0x20 process_one_work+0x481/0x9f0 worker_thread+0x63/0x5a0 kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 -> #1 ((work_completion)(&dio->complete_work)){+.+.}: process_one_work+0x447/0x9f0 worker_thread+0x63/0x5a0 kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 -> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}: lock_acquire+0xc5/0x200 flush_workqueue+0xf3/0x970 drain_workqueue+0xec/0x220 destroy_workqueue+0x23/0x350 sb_init_dio_done_wq+0x6a/0x80 do_blockdev_direct_IO+0x1f33/0x4be0 __blockdev_direct_IO+0x79/0x86 ext4_direct_IO+0x5df/0xbb0 generic_file_direct_write+0x119/0x220 __generic_file_write_iter+0x131/0x2d0 ext4_file_write_iter+0x3fa/0x710 aio_write+0x235/0x330 io_submit_one+0x510/0xeb0 __x64_sys_io_submit+0x122/0x340 do_syscall_64+0x71/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#14); lock((work_completion)(&dio->complete_work)); lock(&sb->s_type->i_mutex_key#14); lock((wq_completion)"dio/%s"sb->s_id); * DEADLOCK * 1 lock held by fio/4129: #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710 stack backtrace: CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0x86/0xc5 print_circular_bug.isra.32+0x20a/0x218 __lock_acquire+0x1c68/0x1cf0 lock_acquire+0xc5/0x200 flush_workqueue+0xf3/0x970 drain_workqueue+0xec/0x220 destroy_workqueue+0x23/0x350 sb_init_dio_done_wq+0x6a/0x80 do_blockdev_direct_IO+0x1f33/0x4be0 __blockdev_direct_IO+0x79/0x86 ext4_direct_IO+0x5df/0xbb0 generic_file_direct_write+0x119/0x220 __generic_file_write_iter+0x131/0x2d0 ext4_file_write_iter+0x3fa/0x710 aio_write+0x235/0x330 io_submit_one+0x510/0xeb0 __x64_sys_io_submit+0x122/0x340 do_syscall_64+0x71/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Link: https://lkml.kernel.org/r/20190214230058.196511-20-bvanassche@acm.org [ Reworked the changelog a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-02-28 07:55:47 +01:00
Liu Song	8bdc620178	workqueue: fix typo in comment qeueue/queue Signed-off-by: Liu Song <liu.song11@zte.com.cn> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-02-21 08:03:38 -08:00
Greg Kroah-Hartman	9481caf39b	Merge 5.0-rc6 into driver-core-next We need the debugfs fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-11 09:09:02 +01:00
Johannes Weiner	1b69ac6b40	psi: fix aggregation idle shut-off psi has provisions to shut off the periodic aggregation worker when there is a period of no task activity - and thus no data that needs aggregating. However, while developing psi monitoring, Suren noticed that the aggregation clock currently won't stay shut off for good. Debugging this revealed a flaw in the idle design: an aggregation run will see no task activity and decide to go to sleep; shortly thereafter, the kworker thread that executed the aggregation will go idle and cause a scheduling change, during which the psi callback will kick the !pending worker again. This will ping-pong forever, and is equivalent to having no shut-off logic at all (but with more code!) Fix this by exempting aggregation workers from psi's clock waking logic when the state change is them going to sleep. To do this, tag workers with the last work function they executed, and if in psi we see a worker going to sleep after aggregating psi data, we will not reschedule the aggregation work item. What if the worker is also executing other items before or after? Any psi state times that were incurred by work items preceding the aggregation work will have been collected from the per-cpu buckets during the aggregation itself. If there are work items following the aggregation work, the worker's last_func tag will be overwritten and the aggregator will be kept alive to process this genuine new activity. If the aggregation work is the last thing the worker does, and we decide to go idle, the brief period of non-idle time incurred between the aggregation run and the kworker's dequeue will be stranded in the per-cpu buckets until the clock is woken by later activity. But that should not be a problem. The buckets can hold 4s worth of time, and future activity will wake the clock with a 2s delay, giving us 2s worth of data we can leave behind when disabling aggregation. If it takes a worker more than two seconds to go idle after it finishes its last work item, we likely have bigger problems in the system, and won't notice one sample that was averaged with a bogus per-CPU weight. Link: http://lkml.kernel.org/r/20190116193501.1910-1-hannes@cmpxchg.org Fixes: `eb414681d5` ("psi: pressure stall information for CPU, memory, and IO") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-02-01 15:46:23 -08:00
Alexander Duyck	8204e0c111	workqueue: Provide queue_work_node to queue work near a given NUMA node Provide a new function, queue_work_node, which is meant to schedule work on a "random" CPU of the requested NUMA node. The main motivation for this is to help assist asynchronous init to better improve boot times for devices that are local to a specific node. For now we just default to the first CPU that is in the intersection of the cpumask of the node and the online cpumask. The only exception is if the CPU is local to the node we will just use the current CPU. This should work for our purposes as we are currently only using this for unbound work so the CPU will be translated to a node anyway instead of being directly used. As we are only using the first CPU to represent the NUMA node for now I am limiting the scope of the function so that it can only be used with unbound workqueues. Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-31 14:20:54 +01:00
Tetsuo Handa	4d43d395fe	workqueue: Try to catch flush_work() without INIT_WORK(). syzbot found a flush_work() caller who forgot to call INIT_WORK() because that work_struct was allocated by kzalloc() [1]. But the message INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. by lock_map_acquire() is failing to tell that INIT_WORK() is missing. Since flush_work() without INIT_WORK() is a bug, and INIT_WORK() should set ->func field to non-zero, let's warn if ->func field is zero. [1] https://syzkaller.appspot.com/bug?id=a5954455fcfa51c29ca2ab55b203076337e1c770 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Tejun Heo <tj@kernel.org>	2019-01-25 07:28:29 -08:00
Paul E. McKenney	25b0077511	workqueue: Replace call_rcu_sched() with call_rcu() Now that call_rcu()'s callback is not invoked until after all preempt-disable regions of code have completed (in addition to explicitly marked RCU read-side critical sections), call_rcu() can be used in place of call_rcu_sched(). This commit therefore makes that change. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Acked-by: Tejun Heo <tj@kernel.org>	2018-11-27 09:21:44 -08:00
Vincent Whitchurch	cb9d7fd51d	watchdog: Mark watchdog touch functions as notrace Some architectures need to use stop_machine() to patch functions for ftrace, and the assumption is that the stopped CPUs do not make function calls to traceable functions when they are in the stopped state. Commit `ce4f06dcbb` ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") added calls to the watchdog touch functions from the stopped CPUs and those functions lack notrace annotations. This leads to crashes when enabling/disabling ftrace on ARM kernels built with the Thumb-2 instruction set. Fix it by adding the necessary notrace annotations. Fixes: `ce4f06dcbb` ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: oleg@redhat.com Cc: tj@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180821152507.18313-1-vincent.whitchurch@axis.com	2018-08-30 12:56:40 +02:00
Linus Torvalds	9022ada8ab	Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "Over the lockdep cross-release churn, workqueue lost some of the existing annotations. Johannes Berg restored it and also improved them" * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: re-add lockdep dependencies for flushing workqueue: skip lockdep wq dependency in cancel_work_sync()	2018-08-24 13:16:36 -07:00
Johannes Berg	87915adc3f	workqueue: re-add lockdep dependencies for flushing In flush_work(), we need to create a lockdep dependency so that the following scenario is appropriately tagged as a problem: work_function() { mutex_lock(&mutex); ... } other_function() { mutex_lock(&mutex); flush_work(&work); // or cancel_work_sync(&work); } This is a problem since the work might be running and be blocked on trying to acquire the mutex. Similarly, in flush_workqueue(). These were removed after cross-release partially caught these problems, but now cross-release was reverted anyway. IMHO the removal was erroneous anyway though, since lockdep should be able to catch potential problems, not just actual ones, and cross-release would only have caught the problem when actually invoking wait_for_completion(). Fixes: `fd1a5b04df` ("workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-08-22 08:31:38 -07:00
Johannes Berg	d6e89786be	workqueue: skip lockdep wq dependency in cancel_work_sync() In cancel_work_sync(), we can only have one of two cases, even with an ordered workqueue: * the work isn't running, just cancelled before it started * the work is running, but then nothing else can be on the workqueue before it Thus, we need to skip the lockdep workqueue dependency handling, otherwise we get false positive reports from lockdep saying that we have a potential deadlock when the workqueue also has other work items with locking, e.g. work1_function() { mutex_lock(&mutex); ... } work2_function() { /* nothing / } other_function() { queue_work(ordered_wq, &work1); queue_work(ordered_wq, &work2); mutex_lock(&mutex); cancel_work_sync(&work2); } As described above, this isn't a problem, but lockdep will currently flag it as if cancel_work_sync() was flush_work(), which is* a problem. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-08-22 08:31:37 -07:00
Kees Cook	6396bb2215	treewide: kzalloc() -> kcalloc() The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) \| kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) \| kzalloc(sizeof(TYPE) * C2, ...) \| kzalloc(C1 * C2 * C3, ...) \| kzalloc(C1 * C2, ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Linus Torvalds	5f85942c2e	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc, xfcp, hisi_sas, cxlflash, qla2xxx. In the absence of Nic, we're also taking target updates which are mostly minor except for the tcmu refactor. The only real core change to worry about is the removal of high page bouncing (in sas, storvsc and iscsi). This has been well tested and no problems have shown up so far" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits) scsi: lpfc: update driver version to 12.0.0.4 scsi: lpfc: Fix port initialization failure. scsi: lpfc: Fix 16gb hbas failing cq create. scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc scsi: lpfc: correct oversubscription of nvme io requests for an adapter scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx) scsi: hisi_sas: Mark PHY as in reset for nexus reset scsi: hisi_sas: Fix return value when get_free_slot() failed scsi: hisi_sas: Terminate STP reject quickly for v2 hw scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot scsi: hisi_sas: Try wait commands before before controller reset scsi: hisi_sas: Init disks after controller reset scsi: hisi_sas: Create a scsi_host_template per HW module scsi: hisi_sas: Reset disks when discovered scsi: hisi_sas: Add LED feature for v3 hw scsi: hisi_sas: Change common allocation mode of device id scsi: hisi_sas: change slot index allocation mode scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate() scsi: hisi_sas: fix a typo in hisi_sas_task_prep() ...	2018-06-10 13:01:12 -07:00
Linus Torvalds	2857676045	Merge tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull overflow updates from Kees Cook: "This adds the new overflow checking helpers and adds them to the 2-factor argument allocators. And this adds the saturating size helpers and does a treewide replacement for the struct_size() usage. Additionally this adds the overflow testing modules to make sure everything works. I'm still working on the treewide replacements for allocators with "simple" multiplied arguments: alloc(a b, ...) -> alloc_array(a, b, ...) and zalloc(a * b, ...) -> calloc(a, b, ...) as well as the more complex cases, but that's separable from this portion of the series. I expect to have the rest sent before -rc1 closes; there are a lot of messy cases to clean up. Summary: - Introduce arithmetic overflow test helper functions (Rasmus) - Use overflow helpers in 2-factor allocators (Kees, Rasmus) - Introduce overflow test module (Rasmus, Kees) - Introduce saturating size helper functions (Matthew, Kees) - Treewide use of struct_size() for allocators (Kees)" tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: treewide: Use struct_size() for devm_kmalloc() and friends treewide: Use struct_size() for vmalloc()-family treewide: Use struct_size() for kmalloc()-family device: Use overflow helpers for devm_kmalloc() mm: Use overflow helpers in kvmalloc() mm: Use overflow helpers in kmalloc_array() test_overflow: Add memory allocation overflow tests overflow.h: Add allocation size calculation helpers test_overflow: Report test failures test_overflow: macrofy some more, do more tests for free lib: add runtime test of check__overflow functions compiler.h: enable builtin overflow checkers and add fallback code	2018-06-06 17:27:14 -07:00
Kees Cook	acafe7e302	treewide: Use struct_size() for kmalloc()-family One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = kmalloc(sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL); This patch makes the changes for kmalloc()-family (and kvmalloc()-family) uses. It was done via automatic conversion with manual review for the "CHECKME" non-standard cases noted below, using the following Coccinelle script: // pkey_cache = kmalloc(sizeof pkey_cache + tprops->pkey_tbl_len // sizeof pkey_cache->table, GFP_KERNEL); @@ identifier alloc =~ "kmalloc\|kzalloc\|kvmalloc\|kvzalloc"; expression GFP; identifier VAR, ELEMENT; expression COUNT; @@ - alloc(sizeof(VAR) + COUNT * sizeof(VAR->ELEMENT), GFP) + alloc(struct_size(VAR, ELEMENT, COUNT), GFP) // mr = kzalloc(sizeof(mr) + m * sizeof(mr->map[0]), GFP_KERNEL); @@ identifier alloc =~ "kmalloc\|kzalloc\|kvmalloc\|kvzalloc"; expression GFP; identifier VAR, ELEMENT; expression COUNT; @@ - alloc(sizeof(VAR) + COUNT sizeof(VAR->ELEMENT[0]), GFP) + alloc(struct_size(VAR, ELEMENT, COUNT), GFP) // Same pattern, but can't trivially locate the trailing element name, // or variable name. @@ identifier alloc =~ "kmalloc\|kzalloc\|kvmalloc\|kvzalloc"; expression GFP; expression SOMETHING, COUNT, ELEMENT; @@ - alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP) + alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-06 11:15:43 -07:00
Mathieu Malaterre	66448bc274	workqueue: move function definitions within CONFIG_SMP block In commit `7ee681b252` ("workqueue: Convert to state machine callbacks"), three new function definitions were added: ‘workqueue_prepare_cpu’, ‘workqueue_online_cpu’ and ‘workqueue_offline_cpu’. Move these function definitions within a CONFIG_SMP block since they are not used outside of it. This will match function declarations in header <include/linux/workqueue.h>, and silence the following gcc warning (W=1): kernel/workqueue.c:4743:5: warning: no previous prototype for ‘workqueue_prepare_cpu’ [-Wmissing-prototypes] kernel/workqueue.c:4756:5: warning: no previous prototype for ‘workqueue_online_cpu’ [-Wmissing-prototypes] kernel/workqueue.c:4783:5: warning: no previous prototype for ‘workqueue_offline_cpu’ [-Wmissing-prototypes] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-23 11:16:58 -07:00
Tejun Heo	197f6accac	workqueue: Make sure struct worker is accessible for wq_worker_comm() The worker struct could already be freed when wq_worker_comm() tries to access it for reporting. This patch protects PF_WQ_WORKER modifications with wq_pool_attach_mutex and makes wq_worker_comm() test the flag before dereferencing worker from kthread_data(), which ensures that it only dereferences when the worker struct is valid. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Lai Jiangshan <jiangshanlai@gmail.com> Fixes: `6b59808bfe` ("workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status}")	2018-05-21 08:04:35 -07:00
Tejun Heo	6b59808bfe	workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status} There can be a lot of workqueue workers and they all show up with the cryptic kworker/* names making it difficult to understand which is doing what and how they came to be. # ps -ef \| grep kworker root 4 2 0 Feb25 ? 00:00:00 [kworker/0:0H] root 6 2 0 Feb25 ? 00:00:00 [kworker/u112:0] root 19 2 0 Feb25 ? 00:00:00 [kworker/1:0H] root 25 2 0 Feb25 ? 00:00:00 [kworker/2:0H] root 31 2 0 Feb25 ? 00:00:00 [kworker/3:0H] ... This patch makes workqueue workers report the latest workqueue it was executing for through /proc/PID/{comm,stat,status}. The extra information is appended to the kthread name with intervening '+' if currently executing, otherwise '-'. # cat /proc/25/comm kworker/2:0-events_power_efficient # cat /proc/25/stat 25 (kworker/2:0-events_power_efficient) I 2 0 0 0 -1 69238880 0 0... # grep Name /proc/25/status Name: kworker/2:0-events_power_efficient Unfortunately, ps(1) truncates comm to 15 characters, # ps 25 PID TTY STAT TIME COMMAND 25 ? I 0:00 [kworker/2:0-eve] making it a lot less useful; however, this should be an easy fix from ps(1) side. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Craig Small <csmall@enc.com.au>	2018-05-18 08:47:13 -07:00
Tejun Heo	8bf895931e	workqueue: Set worker->desc to workqueue name by default Work functions can use set_worker_desc() to improve the visibility of what the worker task is doing. Currently, the desc field is unset at the beginning of each execution and there is a separate field to track the field is set during the current execution. Instead of leaving empty till desc is set, worker->desc can be used to remember the last workqueue the worker worked on by default and users that use set_worker_desc() can override it to something more informative as necessary. This simplifies desc handling and helps tracking the last workqueue that the worker exected on to improve visibility. Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-18 08:47:13 -07:00
Tejun Heo	a2d812a27a	workqueue: Make worker_attach/detach_pool() update worker->pool For historical reasons, the worker attach/detach functions don't currently manage worker->pool and the callers are manually and inconsistently updating it. This patch moves worker->pool updates into the worker attach/detach functions. This makes worker->pool consistent and clearly defines how worker->pool updates are synchronized. This will help later workqueue visibility improvements by allowing safe access to workqueue information from worker->task. Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-18 08:47:13 -07:00
Tejun Heo	1258fae73c	workqueue: Replace pool->attach_mutex with global wq_pool_attach_mutex To improve workqueue visibility, we want to be able to access workqueue information from worker tasks. The per-pool attach mutex makes that difficult because there's no way of stabilizing task -> worker pool association without knowing the pool first. Worker attach/detach is a slow path and there's no need for different pools to be able to perform them concurrently. This patch replaces the per-pool attach_mutex with global wq_pool_attach_mutex to prepare for visibility improvement changes. Signed-off-by: Tejun Heo <tj@kernel.org>	2018-05-18 08:47:13 -07:00

1 2 3 4 5 ...

668 Commits