This optimization allows us to re-create higher order block mappings in
the host stage2 pagetables after we teardown a guest VM. The coalescing
code is triggered on host_stage2_set_owner_locked path when we annotate
the entries in the host stage2 page-tables with an invalid entry that has
the owner set to PKVM_ID_HOST. This can also be triggered from
page_relinquish when we do page insertion in the ballooning code.
When the host reclaims ownership during guest teardown, the page table
walker drops the refcount of the counted entries and clears out
unreferenced entries (refcount == 1). Clearing out the entry installs a
zero PTE. When the host stage2 receives a data abort because there is no
mapping associated, it will try to create the largest possible block
mapping from the founded leaf entry.
With the current patch, we increase the chances of finding a leaf entry
that has level < 3 if the requested region comes from a reclaimed torned
down VM memory. This has the advantage of reducing the TLB pressure at
host stage2.
To be able to do coalescing, we modify the way we do refcounting by not
counting the following descriptor types at host stage 2:
- non-zero invalid PTEs
- any descriptor that has at least one of the reserved-high bits(58-55)
toogled
- non-default attribute mappings
- page table descriptors
The algorithm works as presented below:
Is refcount(child(pte_table)) == 1 ?
Yes -> (because we left only default mappings)
Zap the table by setting 0 in the pte_table
and put the page that holds the level 3 entries
back into the memcache
level 2
+---------+
| |
| ... |
| pte_table---+ level 3 -> we can now re-create a 2Mb mapping
| ... | +---> +---------+
| | | |
| | | |
| | |def entry|
+---------+ | |
|def entry|
| |
| ... |
+---------+
This (v3) is a re-work of the previous version which fixes some issues on
the stage2_unmap path:
When we register a pKVM IOMMU we unmap the MMIO region from the host
stage2. While we treat most of the MMIO regions as default mappings in
the coalescing change, we end up decrementing the page table page
refcount for a default mapping which breaks the refcounting. Fix this by
adding a check which verifies if we have a default mapping before
decrementing the reference.
Bug: 222044487
Test: dump the host stage2 pagetables and view the mapping
Change-Id: I518fcbd7f022e77965eef54dd59dac07425db3a5
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Signed-off-by: Will Deacon <willdeacon@google.com>