Changes in 5.15.83 clk: generalize devm_clk_get() a bit clk: Provide new devm_clk helpers for prepared and enabled clocks mmc: mtk-sd: Fix missing clk_disable_unprepare in msdc_of_clock_parse() arm64: dts: rockchip: keep I2S1 disabled for GPIO function on ROCK Pi 4 series arm: dts: rockchip: fix node name for hym8563 rtc arm: dts: rockchip: remove clock-frequency from rtc ARM: dts: rockchip: fix ir-receiver node names arm64: dts: rockchip: fix ir-receiver node names ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name fs: use acquire ordering in __fget_light() ARM: 9251/1: perf: Fix stacktraces for tracepoint events in THUMB2 kernels ARM: 9266/1: mm: fix no-MMU ZERO_PAGE() implementation ASoC: wm8962: Wait for updated value of WM8962_CLOCKING1 register spi: mediatek: Fix DEVAPC Violation at KO Remove ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188 ASoC: rt711-sdca: fix the latency time of clock stop prepare state machine transitions 9p/fd: Use P9_HDRSZ for header size regulator: slg51000: Wait after asserting CS pin ALSA: seq: Fix function prototype mismatch in snd_seq_expand_var_event selftests/net: Find nettest in current directory btrfs: send: avoid unaligned encoded writes when attempting to clone range ASoC: soc-pcm: Add NULL check in BE reparenting regulator: twl6030: fix get status of twl6032 regulators fbcon: Use kzalloc() in fbcon_prepare_logo() usb: dwc3: gadget: Disable GUSB2PHYCFG.SUSPHY for End Transfer 9p/xen: check logical size for buffer size net: usb: qmi_wwan: add u-blox 0x1342 composition mm/khugepaged: take the right locks for page table retraction mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths rtc: mc146818-lib: extract mc146818_avoid_UIP rtc: cmos: avoid UIP when writing alarm time rtc: cmos: avoid UIP when reading alarm time cifs: fix use-after-free caused by invalid pointer `hostname` drm/bridge: anx7625: Fix edid_read break case in sp_tx_edid_read() xen/netback: Ensure protocol headers don't fall in the non-linear area xen/netback: do some code cleanup xen/netback: don't call kfree_skb() with interrupts disabled media: videobuf2-core: take mmap_lock in vb2_get_unmapped_area() soundwire: intel: Initialize clock stop timeout Revert "ARM: dts: imx7: Fix NAND controller size-cells" media: v4l2-dv-timings.c: fix too strict blanking sanity checks memcg: fix possible use-after-free in memcg_write_event_control() mm/gup: fix gup_pud_range() for dax Bluetooth: btusb: Add debug message for CSR controllers Bluetooth: Fix crash when replugging CSR fake controllers net: mana: Fix race on per-CQ variable napi work_done KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field drm/vmwgfx: Don't use screen objects when SEV is active drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend drm/shmem-helper: Remove errant put in error path drm/shmem-helper: Avoid vm_open error paths net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing() HID: usbhid: Add ALWAYS_POLL quirk for some mice HID: hid-lg4ff: Add check for empty lbuf HID: core: fix shift-out-of-bounds in hid_report_raw_event HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch V 10 can: af_can: fix NULL pointer dereference in can_rcv_filter clk: Fix pointer casting to prevent oops in devm_clk_release() gpiolib: improve coding style for local variables gpiolib: check the 'ngpios' property in core gpiolib code gpiolib: fix memory leak in gpiochip_setup_dev() netfilter: nft_set_pipapo: Actually validate intervals in fields after the first one drm/vmwgfx: Fix race issue calling pin_user_pages ieee802154: cc2520: Fix error return code in cc2520_hw_init() ca8210: Fix crash by zero initializing data netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark drm/bridge: ti-sn65dsi86: Fix output polarity setting bug gpio: amd8111: Fix PCI device reference count leak e1000e: Fix TX dispatch condition igb: Allocate MSI-X vector when testing net: broadcom: Add PTP_1588_CLOCK_OPTIONAL dependency for BCMGENET under ARCH_BCM2835 drm: bridge: dw_hdmi: fix preference of RGB modes over YUV420 af_unix: Get user_ns from in_skb in unix_diag_get_exact(). vmxnet3: correctly report encapsulated LRO packet vmxnet3: use correct intrConf reference when using extended queues Bluetooth: 6LoWPAN: add missing hci_dev_put() in get_l2cap_conn() Bluetooth: Fix not cleanup led when bt_init fails net: dsa: ksz: Check return value net: dsa: hellcreek: Check return value net: dsa: sja1105: Check return value selftests: rtnetlink: correct xfrm policy rule in kci_test_ipsec_offload mac802154: fix missing INIT_LIST_HEAD in ieee802154_if_add() net: encx24j600: Add parentheses to fix precedence net: encx24j600: Fix invalid logic in reading of MISTAT register net: mdiobus: fwnode_mdiobus_register_phy() rework error handling net: mdiobus: fix double put fwnode in the error path octeontx2-pf: Fix potential memory leak in otx2_init_tc() xen-netfront: Fix NULL sring after live migration net: mvneta: Prevent out of bounds read in mvneta_config_rss() i40e: Fix not setting default xps_cpus after reset i40e: Fix for VF MAC address 0 i40e: Disallow ip4 and ip6 l4_4_bytes NFC: nci: Bounds check struct nfc_target arrays nvme initialize core quirks before calling nvme_init_subsystem gpio/rockchip: fix refcount leak in rockchip_gpiolib_register() net: stmmac: fix "snps,axi-config" node property parsing ip_gre: do not report erspan version on GRE interface net: microchip: sparx5: Fix missing destroy_workqueue of mact_queue net: thunderx: Fix missing destroy_workqueue of nicvf_rx_mode_wq net: hisilicon: Fix potential use-after-free in hisi_femac_rx() net: mdio: fix unbalanced fwnode reference count in mdio_device_release() net: hisilicon: Fix potential use-after-free in hix5hd2_rx() tipc: Fix potential OOB in tipc_link_proto_rcv() ipv4: Fix incorrect route flushing when source address is deleted ipv4: Fix incorrect route flushing when table ID 0 is used net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions() tipc: call tipc_lxc_xmit without holding node_read_lock ethernet: aeroflex: fix potential skb leak in greth_init_rings() dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove() xen/netback: fix build warning net: phy: mxl-gpy: fix version reporting net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq() ipv6: avoid use-after-free in ip6_fragment() net: thunderbolt: fix memory leak in tbnet_open() net: mvneta: Fix an out of bounds check macsec: add missing attribute validation for offload s390/qeth: fix various format strings s390/qeth: fix use-after-free in hsci can: esd_usb: Allow REC and TEC to return to zero block: move CONFIG_BLOCK guard to top Makefile io_uring: move to separate directory io_uring: Fix a null-ptr-deref in io_tctx_exit_cb() Linux 5.15.83 Change-Id: I08ef74d6ad8786c191050294dcbf1090908e7c4d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
298 lines
8.8 KiB
C
298 lines
8.8 KiB
C
/* SPDX-License-Identifier: GPL-2.0 */
|
|
#ifndef __CGROUP_INTERNAL_H
|
|
#define __CGROUP_INTERNAL_H
|
|
|
|
#include <linux/cgroup.h>
|
|
#include <linux/kernfs.h>
|
|
#include <linux/workqueue.h>
|
|
#include <linux/list.h>
|
|
#include <linux/refcount.h>
|
|
#include <linux/fs_parser.h>
|
|
|
|
#define TRACE_CGROUP_PATH_LEN 1024
|
|
extern spinlock_t trace_cgroup_path_lock;
|
|
extern char trace_cgroup_path[TRACE_CGROUP_PATH_LEN];
|
|
extern bool cgroup_debug;
|
|
extern void __init enable_debug_cgroup(void);
|
|
|
|
/*
|
|
* cgroup_path() takes a spin lock. It is good practice not to take
|
|
* spin locks within trace point handlers, as they are mostly hidden
|
|
* from normal view. As cgroup_path() can take the kernfs_rename_lock
|
|
* spin lock, it is best to not call that function from the trace event
|
|
* handler.
|
|
*
|
|
* Note: trace_cgroup_##type##_enabled() is a static branch that will only
|
|
* be set when the trace event is enabled.
|
|
*/
|
|
#define TRACE_CGROUP_PATH(type, cgrp, ...) \
|
|
do { \
|
|
if (trace_cgroup_##type##_enabled()) { \
|
|
unsigned long flags; \
|
|
spin_lock_irqsave(&trace_cgroup_path_lock, \
|
|
flags); \
|
|
cgroup_path(cgrp, trace_cgroup_path, \
|
|
TRACE_CGROUP_PATH_LEN); \
|
|
trace_cgroup_##type(cgrp, trace_cgroup_path, \
|
|
##__VA_ARGS__); \
|
|
spin_unlock_irqrestore(&trace_cgroup_path_lock, \
|
|
flags); \
|
|
} \
|
|
} while (0)
|
|
|
|
/*
|
|
* The cgroup filesystem superblock creation/mount context.
|
|
*/
|
|
struct cgroup_fs_context {
|
|
struct kernfs_fs_context kfc;
|
|
struct cgroup_root *root;
|
|
struct cgroup_namespace *ns;
|
|
unsigned int flags; /* CGRP_ROOT_* flags */
|
|
|
|
/* cgroup1 bits */
|
|
bool cpuset_clone_children;
|
|
bool none; /* User explicitly requested empty subsystem */
|
|
bool all_ss; /* Seen 'all' option */
|
|
u16 subsys_mask; /* Selected subsystems */
|
|
char *name; /* Hierarchy name */
|
|
char *release_agent; /* Path for release notifications */
|
|
};
|
|
|
|
static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc)
|
|
{
|
|
struct kernfs_fs_context *kfc = fc->fs_private;
|
|
|
|
return container_of(kfc, struct cgroup_fs_context, kfc);
|
|
}
|
|
|
|
struct cgroup_pidlist;
|
|
|
|
struct cgroup_file_ctx {
|
|
struct cgroup_namespace *ns;
|
|
|
|
struct {
|
|
void *trigger;
|
|
} psi;
|
|
|
|
struct {
|
|
bool started;
|
|
struct css_task_iter iter;
|
|
} procs;
|
|
|
|
struct {
|
|
struct cgroup_pidlist *pidlist;
|
|
} procs1;
|
|
};
|
|
|
|
/*
|
|
* A cgroup can be associated with multiple css_sets as different tasks may
|
|
* belong to different cgroups on different hierarchies. In the other
|
|
* direction, a css_set is naturally associated with multiple cgroups.
|
|
* This M:N relationship is represented by the following link structure
|
|
* which exists for each association and allows traversing the associations
|
|
* from both sides.
|
|
*/
|
|
struct cgrp_cset_link {
|
|
/* the cgroup and css_set this link associates */
|
|
struct cgroup *cgrp;
|
|
struct css_set *cset;
|
|
|
|
/* list of cgrp_cset_links anchored at cgrp->cset_links */
|
|
struct list_head cset_link;
|
|
|
|
/* list of cgrp_cset_links anchored at css_set->cgrp_links */
|
|
struct list_head cgrp_link;
|
|
};
|
|
|
|
/* used to track tasks and csets during migration */
|
|
struct cgroup_taskset {
|
|
/* the src and dst cset list running through cset->mg_node */
|
|
struct list_head src_csets;
|
|
struct list_head dst_csets;
|
|
|
|
/* the number of tasks in the set */
|
|
int nr_tasks;
|
|
|
|
/* the subsys currently being processed */
|
|
int ssid;
|
|
|
|
/*
|
|
* Fields for cgroup_taskset_*() iteration.
|
|
*
|
|
* Before migration is committed, the target migration tasks are on
|
|
* ->mg_tasks of the csets on ->src_csets. After, on ->mg_tasks of
|
|
* the csets on ->dst_csets. ->csets point to either ->src_csets
|
|
* or ->dst_csets depending on whether migration is committed.
|
|
*
|
|
* ->cur_csets and ->cur_task point to the current task position
|
|
* during iteration.
|
|
*/
|
|
struct list_head *csets;
|
|
struct css_set *cur_cset;
|
|
struct task_struct *cur_task;
|
|
};
|
|
|
|
/* migration context also tracks preloading */
|
|
struct cgroup_mgctx {
|
|
/*
|
|
* Preloaded source and destination csets. Used to guarantee
|
|
* atomic success or failure on actual migration.
|
|
*/
|
|
struct list_head preloaded_src_csets;
|
|
struct list_head preloaded_dst_csets;
|
|
|
|
/* tasks and csets to migrate */
|
|
struct cgroup_taskset tset;
|
|
|
|
/* subsystems affected by migration */
|
|
u16 ss_mask;
|
|
};
|
|
|
|
#define CGROUP_TASKSET_INIT(tset) \
|
|
{ \
|
|
.src_csets = LIST_HEAD_INIT(tset.src_csets), \
|
|
.dst_csets = LIST_HEAD_INIT(tset.dst_csets), \
|
|
.csets = &tset.src_csets, \
|
|
}
|
|
|
|
#define CGROUP_MGCTX_INIT(name) \
|
|
{ \
|
|
LIST_HEAD_INIT(name.preloaded_src_csets), \
|
|
LIST_HEAD_INIT(name.preloaded_dst_csets), \
|
|
CGROUP_TASKSET_INIT(name.tset), \
|
|
}
|
|
|
|
#define DEFINE_CGROUP_MGCTX(name) \
|
|
struct cgroup_mgctx name = CGROUP_MGCTX_INIT(name)
|
|
|
|
extern spinlock_t css_set_lock;
|
|
extern struct cgroup_subsys *cgroup_subsys[];
|
|
extern struct list_head cgroup_roots;
|
|
|
|
/* iterate across the hierarchies */
|
|
#define for_each_root(root) \
|
|
list_for_each_entry((root), &cgroup_roots, root_list)
|
|
|
|
/**
|
|
* for_each_subsys - iterate all enabled cgroup subsystems
|
|
* @ss: the iteration cursor
|
|
* @ssid: the index of @ss, CGROUP_SUBSYS_COUNT after reaching the end
|
|
*/
|
|
#define for_each_subsys(ss, ssid) \
|
|
for ((ssid) = 0; (ssid) < CGROUP_SUBSYS_COUNT && \
|
|
(((ss) = cgroup_subsys[ssid]) || true); (ssid)++)
|
|
|
|
static inline bool cgroup_is_dead(const struct cgroup *cgrp)
|
|
{
|
|
return !(cgrp->self.flags & CSS_ONLINE);
|
|
}
|
|
|
|
static inline bool notify_on_release(const struct cgroup *cgrp)
|
|
{
|
|
return test_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
|
|
}
|
|
|
|
void put_css_set_locked(struct css_set *cset);
|
|
|
|
static inline void put_css_set(struct css_set *cset)
|
|
{
|
|
unsigned long flags;
|
|
|
|
/*
|
|
* Ensure that the refcount doesn't hit zero while any readers
|
|
* can see it. Similar to atomic_dec_and_lock(), but for an
|
|
* rwlock
|
|
*/
|
|
if (refcount_dec_not_one(&cset->refcount))
|
|
return;
|
|
|
|
spin_lock_irqsave(&css_set_lock, flags);
|
|
put_css_set_locked(cset);
|
|
spin_unlock_irqrestore(&css_set_lock, flags);
|
|
}
|
|
|
|
/*
|
|
* refcounted get/put for css_set objects
|
|
*/
|
|
static inline void get_css_set(struct css_set *cset)
|
|
{
|
|
refcount_inc(&cset->refcount);
|
|
}
|
|
|
|
bool cgroup_ssid_enabled(int ssid);
|
|
bool cgroup_on_dfl(const struct cgroup *cgrp);
|
|
bool cgroup_is_thread_root(struct cgroup *cgrp);
|
|
bool cgroup_is_threaded(struct cgroup *cgrp);
|
|
|
|
struct cgroup_root *cgroup_root_from_kf(struct kernfs_root *kf_root);
|
|
struct cgroup *task_cgroup_from_root(struct task_struct *task,
|
|
struct cgroup_root *root);
|
|
struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn, bool drain_offline);
|
|
void cgroup_kn_unlock(struct kernfs_node *kn);
|
|
int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
|
|
struct cgroup_namespace *ns);
|
|
|
|
void cgroup_free_root(struct cgroup_root *root);
|
|
void init_cgroup_root(struct cgroup_fs_context *ctx);
|
|
int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask);
|
|
int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask);
|
|
int cgroup_do_get_tree(struct fs_context *fc);
|
|
|
|
int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp);
|
|
void cgroup_migrate_finish(struct cgroup_mgctx *mgctx);
|
|
void cgroup_migrate_add_src(struct css_set *src_cset, struct cgroup *dst_cgrp,
|
|
struct cgroup_mgctx *mgctx);
|
|
int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx);
|
|
int cgroup_migrate(struct task_struct *leader, bool threadgroup,
|
|
struct cgroup_mgctx *mgctx);
|
|
|
|
int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader,
|
|
bool threadgroup);
|
|
struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
|
|
bool *locked,
|
|
struct cgroup *dst_cgrp);
|
|
__acquires(&cgroup_threadgroup_rwsem);
|
|
void cgroup_procs_write_finish(struct task_struct *task, bool locked)
|
|
__releases(&cgroup_threadgroup_rwsem);
|
|
|
|
void cgroup_lock_and_drain_offline(struct cgroup *cgrp);
|
|
|
|
int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode);
|
|
int cgroup_rmdir(struct kernfs_node *kn);
|
|
int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
|
|
struct kernfs_root *kf_root);
|
|
|
|
int __cgroup_task_count(const struct cgroup *cgrp);
|
|
int cgroup_task_count(const struct cgroup *cgrp);
|
|
|
|
/*
|
|
* rstat.c
|
|
*/
|
|
int cgroup_rstat_init(struct cgroup *cgrp);
|
|
void cgroup_rstat_exit(struct cgroup *cgrp);
|
|
void cgroup_rstat_boot(void);
|
|
void cgroup_base_stat_cputime_show(struct seq_file *seq);
|
|
|
|
/*
|
|
* namespace.c
|
|
*/
|
|
extern const struct proc_ns_operations cgroupns_operations;
|
|
|
|
/*
|
|
* cgroup-v1.c
|
|
*/
|
|
extern struct cftype cgroup1_base_files[];
|
|
extern struct kernfs_syscall_ops cgroup1_kf_syscall_ops;
|
|
extern const struct fs_parameter_spec cgroup1_fs_parameters[];
|
|
|
|
int proc_cgroupstats_show(struct seq_file *m, void *v);
|
|
bool cgroup1_ssid_disabled(int ssid);
|
|
void cgroup1_pidlist_destroy_all(struct cgroup *cgrp);
|
|
void cgroup1_release_agent(struct work_struct *work);
|
|
void cgroup1_check_for_release(struct cgroup *cgrp);
|
|
int cgroup1_parse_param(struct fs_context *fc, struct fs_parameter *param);
|
|
int cgroup1_get_tree(struct fs_context *fc);
|
|
int cgroup1_reconfigure(struct fs_context *ctx);
|
|
|
|
#endif /* __CGROUP_INTERNAL_H */
|