Revert "FROMLIST: mm: multi-gen LRU: admin guide"

This reverts commit 6e815a6f34. To be replaced with upstream version.

Bug: 249601646
Change-Id: Ib1036315a5ec79a240304a865c9a33a8f79d0b3c
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
2022-11-07 15:57:11 -08:00
parent b8f8d02fd4
commit 543542a21e
3 changed files with 1 addition and 155 deletions
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -32,7 +32,6 @@ the Linux memory management.
   idle_page_tracking
   ksm
   memory-hotplug
-  multigen_lru
   nommu-mmap
   numa_memory_policy
   numaperf
--- a/Documentation/admin-guide/mm/multigen_lru.rst
+++ b/Documentation/admin-guide/mm/multigen_lru.rst
@@ -1,152 +0,0 @@
 .. SPDX-License-Identifier: GPL-2.0
 =============
 Multi-Gen LRU
 =============
 The multi-gen LRU is an alternative LRU implementation that optimizes
 page reclaim and improves performance under memory pressure. Page
 reclaim decides the kernel's caching policy and ability to overcommit
 memory. It directly impacts the kswapd CPU usage and RAM efficiency.
 Quick start
 ===========
 Build the kernel with the following configurations.
 * ``CONFIG_LRU_GEN=y``
 * ``CONFIG_LRU_GEN_ENABLED=y``
 All set!
 Runtime options
 ===============
 ``/sys/kernel/mm/lru_gen/`` contains stable ABIs described in the
 following subsections.
 Kill switch
 -----------
 ``enabled`` accepts different values to enable or disable the following
 components. Its default value depends on ``CONFIG_LRU_GEN_ENABLED``.
 All the components should be enabled unless some of them have
 unforeseen side effects. Writing to ``enabled`` has no effect when a
 component is not supported by the hardware, and valid values will be
 accepted even when the main switch is off.
 ====== ===============================================================
 Values Components
 ====== ===============================================================
 0x0001 The main switch for the multi-gen LRU.
 0x0002 Clearing the accessed bit in leaf page table entries in large
       batches, when MMU sets it (e.g., on x86). This behavior can
       theoretically worsen lock contention (mmap_lock). If it is
       disabled, the multi-gen LRU will suffer a minor performance
       degradation.
 0x0004 Clearing the accessed bit in non-leaf page table entries as
       well, when MMU sets it (e.g., on x86). This behavior was not
       verified on x86 varieties other than Intel and AMD. If it is
       disabled, the multi-gen LRU will suffer a negligible
       performance degradation.
 [yYnN] Apply to all the components above.
 ====== ===============================================================
 E.g.,
 ::
    echo y >/sys/kernel/mm/lru_gen/enabled
    cat /sys/kernel/mm/lru_gen/enabled
    0x0007
    echo 5 >/sys/kernel/mm/lru_gen/enabled
    cat /sys/kernel/mm/lru_gen/enabled
    0x0005
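The bitmask read back from ``enabled`` can be decoded with a few lines of Python. This is a minimal sketch: the bit values come from the table above, while the function name ``decode_enabled`` and the short component labels are assumptions made for illustration only.

```python
# Bit values are taken from the table above; the labels are informal
# summaries chosen for this sketch, not names used by the kernel.
LRU_GEN_COMPONENTS = {
    0x0001: "main switch",
    0x0002: "batched clearing of the accessed bit in leaf PTEs",
    0x0004: "clearing of the accessed bit in non-leaf PTEs",
}


def decode_enabled(value):
    """Return the labels of the components set in a value such as '0x0005'."""
    mask = int(value.strip(), 16)  # base 16 accepts an optional '0x' prefix
    return [label for bit, label in LRU_GEN_COMPONENTS.items() if mask & bit]


if __name__ == "__main__":
    for label in decode_enabled("0x0005"):
        print(label)
```

Passing the string read from the file, trailing newline included, yields the list of enabled components in table order.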
 Thrashing prevention
 --------------------
 Personal computers are more sensitive to thrashing because it can
 cause janks (lags when rendering UI) and negatively impact user
 experience. The multi-gen LRU offers thrashing prevention to the
 majority of laptop and desktop users who do not have ``oomd``.
 Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of
 ``N`` milliseconds from getting evicted. The OOM killer is triggered
 if this working set cannot be kept in memory. In other words, this
 option works as an adjustable pressure relief valve, and when open, it
 terminates applications that are hopefully not being used.
 Based on the average human detectable lag (~100ms), ``N=1000`` usually
 eliminates intolerable janks due to thrashing. Larger values like
 ``N=3000`` make janks less noticeable at the risk of premature OOM
 kills.
 The default value ``0`` means disabled.
 Experimental features
 =====================
 ``/sys/kernel/debug/lru_gen`` accepts commands described in the
 following subsections. Multiple command lines are supported, as is
 concatenation with the delimiters ``,`` and ``;``.
 ``/sys/kernel/debug/lru_gen_full`` provides additional stats for
 debugging. ``CONFIG_LRU_GEN_STATS=y`` keeps historical stats from
 evicted generations in this file.
 Working set estimation
 ----------------------
 Working set estimation measures how much memory an application
 requires in a given time interval, and it is usually done with little
 impact on the performance of the application. E.g., data centers want
 to optimize job scheduling (bin packing) to improve memory
 utilization. When a new job comes in, the job scheduler needs to find
 out whether each server it manages can allocate a certain amount of
 memory for this new job before it can pick a candidate. To do so, this
 job scheduler needs to estimate the working sets of the existing jobs.
 When it is read, ``lru_gen`` returns a histogram of numbers of pages
 accessed over different time intervals for each memcg and node.
 ``MAX_NR_GENS`` decides the number of bins for each histogram.
 ::
    memcg  memcg_id  memcg_path
       node  node_id
           min_gen_nr  age_in_ms  nr_anon_pages  nr_file_pages
           ...
           max_gen_nr  age_in_ms  nr_anon_pages  nr_file_pages
 Each generation contains an estimated number of pages that have been
 accessed within ``age_in_ms`` non-cumulatively. E.g., ``min_gen_nr``
 contains the coldest pages and ``max_gen_nr`` contains the hottest
 pages, since ``age_in_ms`` of the former is the largest and that of
 the latter is the smallest.
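A reader of ``lru_gen`` might turn the layout above into a nested structure along the following lines. This is a sketch under assumptions: it relies only on the field order shown in the template, the function name ``parse_lru_gen`` is hypothetical, and the sample numbers are invented for illustration rather than taken from a real system.

```python
def parse_lru_gen(text):
    """Parse text laid out like the template above.

    Returns {(memcg_id, memcg_path): {node_id: [(gen_nr, age_in_ms,
    nr_anon_pages, nr_file_pages), ...]}}.
    """
    result = {}
    memcg = node = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "memcg":
            path = fields[2] if len(fields) > 2 else ""
            memcg = result.setdefault((int(fields[1]), path), {})
        elif fields[0] == "node":
            node = memcg.setdefault(int(fields[1]), [])
        else:
            # A generation line: four integers in the order shown above.
            gen_nr, age_ms, anon, file_ = (int(f) for f in fields[:4])
            node.append((gen_nr, age_ms, anon, file_))
    return result


# Invented sample data that follows the template; not real kernel output.
SAMPLE = """\
memcg  1  /
   node  0
       0  12000  100  200
       1   3000   50   60
"""

if __name__ == "__main__":
    print(parse_lru_gen(SAMPLE))
```

Generations are kept in file order, so the first tuple of each node is ``min_gen_nr`` and the last is ``max_gen_nr``.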
 Users can write ``+ memcg_id node_id max_gen_nr
 [can_swap [full_scan]]`` to ``lru_gen`` to create a new generation
 ``max_gen_nr+1``. ``can_swap`` defaults to the swap setting and, if it
 is set to ``1``, it forces the scan of anon pages when swap is off.
 ``full_scan`` defaults to ``1`` and, if it is set to ``0``, it reduces
 the overhead as well as the coverage when scanning page tables.
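The aging command described above is positional, so ``full_scan`` can only be given when ``can_swap`` is given too. A small helper can enforce that when building the command string. This is a sketch, and the function name ``aging_command`` is an assumption; it only formats the string and leaves writing it to ``lru_gen`` to the caller.

```python
def aging_command(memcg_id, node_id, max_gen_nr, can_swap=None, full_scan=None):
    """Build '+ memcg_id node_id max_gen_nr [can_swap [full_scan]]'."""
    if full_scan is not None and can_swap is None:
        # The optional fields are positional, so full_scan needs can_swap.
        raise ValueError("full_scan requires can_swap to be given")
    parts = ["+", memcg_id, node_id, max_gen_nr]
    for optional in (can_swap, full_scan):
        if optional is not None:
            parts.append(int(optional))
    return " ".join(str(p) for p in parts)


if __name__ == "__main__":
    print(aging_command(1, 0, 3))
    print(aging_command(1, 0, 3, can_swap=1, full_scan=0))
```

Omitted optional fields are left out of the string, so the defaults described above apply.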
 A typical use case is that a job scheduler writes to ``lru_gen`` at a
 certain time interval to create new generations, and it ranks the
 servers it manages based on the sizes of their cold memory defined by
 this time interval.
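The ranking step in this use case could look roughly like the following. This is a sketch under assumptions: each server's histogram is represented as a list of ``(gen_nr, age_in_ms, nr_anon_pages, nr_file_pages)`` tuples, "cold" is taken to mean at least as old as a chosen threshold, and the names ``cold_pages`` and ``rank_servers`` as well as the server names and numbers are hypothetical.

```python
def cold_pages(generations, min_age_ms):
    """Sum anon and file pages in generations at least min_age_ms old."""
    return sum(
        anon + file_
        for _gen_nr, age_ms, anon, file_ in generations
        if age_ms >= min_age_ms
    )


def rank_servers(histograms, min_age_ms):
    """Return server names ordered from most to least cold memory."""
    return sorted(
        histograms,
        key=lambda name: cold_pages(histograms[name], min_age_ms),
        reverse=True,
    )


# Invented per-server histograms, for illustration only.
HISTOGRAMS = {
    "server-a": [(0, 12000, 100, 200), (1, 3000, 50, 60)],
    "server-b": [(0, 15000, 400, 100), (1, 2000, 10, 10)],
}

if __name__ == "__main__":
    print(rank_servers(HISTOGRAMS, min_age_ms=10000))
```

Because the per-generation counts are non-cumulative, summing the generations above the threshold gives the cold total directly.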
 Proactive reclaim
 -----------------
 Proactive reclaim induces memory reclaim when there is no memory
 pressure and usually targets cold memory only. E.g., when a new job
 comes in, the job scheduler wants to proactively reclaim memory on the
 server it has selected to improve the chance of successfully landing
 this new job.
 Users can write ``- memcg_id node_id min_gen_nr [swappiness
 [nr_to_reclaim]]`` to ``lru_gen`` to evict generations less than or
 equal to ``min_gen_nr``. Note that ``min_gen_nr`` should be less than
 ``max_gen_nr-1`` as ``max_gen_nr`` and ``max_gen_nr-1`` are not fully
 aged and therefore cannot be evicted. ``swappiness`` overrides the
 default value in ``/proc/sys/vm/swappiness``. ``nr_to_reclaim`` limits
 the number of pages to evict.
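The constraint on ``min_gen_nr`` can be checked before the eviction command is written. This is a sketch: the function name ``eviction_command`` is an assumption, and ``max_gen_nr`` is passed in only so the check can be made; it is not part of the command itself.

```python
def eviction_command(memcg_id, node_id, min_gen_nr, max_gen_nr,
                     swappiness=None, nr_to_reclaim=None):
    """Build '- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]'."""
    if min_gen_nr >= max_gen_nr - 1:
        # max_gen_nr and max_gen_nr-1 are not fully aged and cannot be evicted.
        raise ValueError("min_gen_nr must be less than max_gen_nr - 1")
    if nr_to_reclaim is not None and swappiness is None:
        # The optional fields are positional, so nr_to_reclaim needs swappiness.
        raise ValueError("nr_to_reclaim requires swappiness to be given")
    parts = ["-", memcg_id, node_id, min_gen_nr]
    for optional in (swappiness, nr_to_reclaim):
        if optional is not None:
            parts.append(optional)
    return " ".join(str(p) for p in parts)


if __name__ == "__main__":
    print(eviction_command(1, 0, 0, max_gen_nr=3))
    print(eviction_command(1, 0, 1, max_gen_nr=3, swappiness=60, nr_to_reclaim=1024))
```

Rejecting an out-of-range ``min_gen_nr`` up front keeps a scheduler from issuing a command that targets generations which cannot be evicted.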
 A typical use case is that a job scheduler writes to ``lru_gen``
 before it tries to land a new job on a server, and if it fails to
 materialize the cold memory without impacting the existing jobs on
 this server, it retries on the next server according to the ranking
 result obtained from the working set estimation step described
 earlier.
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -930,8 +930,7 @@ config LRU_GEN
 	# the following options can use up the spare bits in page flags
 	depends on !MAXSMP && (64BIT || !SPARSEMEM || SPARSEMEM_VMEMMAP)
 	help
-	  A high performance LRU implementation to overcommit memory. See
+	  A high performance LRU implementation to overcommit memory.
-	  Documentation/admin-guide/mm/multigen_lru.rst for details.
 config LRU_GEN_ENABLED
 	bool "Enable by default"