Add two new symbols to aarch64 kernel ABI:
* pkvm_iommu_sysmmu_sync_register
* pkvm_iommu_finalize
The former allows vendor modules to register a SYSMMU_SYNC device with
the hypervisor, and the latter tells the hypervisor to stop acception
new device registrations.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I6c6948d94cb6494f07d52b4e2b7e91db40e2fcd6
Add new hypercall that the host can use to inform the hypervisor that
all hypervisor-controlled IOMMUs have been registered and no new
registrations should be allowed. This will typically be called at the
end of kernel module initialization phase.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I8c175310d5b262a67947443c5a0154056a8ebf3e
The IOMMU DABT handler currently checks if the device is considered
powered by hyp before resolving the request. If the power tracking does
not reflect reality, the IOMMU may trigger issues in the host but the
incorrect state prevents it from diagnosing the issue.
Drop the powered check from the generic IOMMU code. The host accessing
the device's SFR means that it assumes it is powered, and individual
drivers can choose to reject that DABT request.
Bug: 224891559
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I1c132c4030a61a90be4675867c9658e3bc696118
SysMMU_SYNC devices expose an interface to start a sync counter and
poll its SFR until the device signals that all memory transactions in
flight at the start have drained. This gives the hypervisor a reliable
indicator that S2MPU invalidation has fully completed and all new
transactions will use the new MPTs.
Add a new pKVM IOMMU driver that the host can use to register
SysMMU_SYNCs. Each device is expected to be a supplier to exactly one
S2MPU (parent), but multiple SYNCs can supply a single S2MPU.
To keep things simple, the SYNCs do not implement suspend/resume and are
assumed to follow the power transitions of their parent.
Following an invalidation, the S2MPU driver iterates over its children
and waits for each SYNC to signal that its transactions have drained.
The algorithm currently waits on each SYNC in turn. If latency proves to
be an issue, this could be optimized to initiate a SYNC on all powered
devices before starting to poll.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I45b832fd11d76b65987935c8548e2a214ee2fa2a
In preparation for adding new IOMMU devices that act as suppliers to
others, add the notion of a parent IOMMU device. Such device must be
registered after its parent and the driver of the parent device must
validate the addition.
The relation has no generic implications, it is up to drivers to make
use of it.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I4ee3675e5529bb73ad4546fa32380f237f054177
In preparation for needing to validate more aspects of a device that is
about to be registered, change the callback to accept the to-be-added
'struct pkvm_iommu' rather than individual inputs.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I3fb911e4280c220ddd779cf6a5fc9c302a5617f7
Private EL2 mappings currently cannot be removed. Move the creation of
IOMMU device mappings at the end of the registration function so that
other errors do not result in unnecessary mappings.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I3139e9af3345f157295eb72441a7cf3cc055116d
Memory for IOMMU device entries gets allocated from a pool donated by
the host. It is possible for pkvm_iommu_register() to allocate the
memory and then fail, in which case the memory remains unused but not
freed.
Refactor the code such that the host lock covers the entire section
where the memory is allocated. This way we can return the memory back to
the linear allocator if an error is returned.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I8c1650ba3e545741144d793de506e93c4066896f
Currently __pkvm_iommu_pm_notify always changes the value of
dev->powered following a suspend/resume attempt. This could potentially
be abused to force the hypervisor to stop issuing updates to an S2MPU
and preserving an old/invalid state.
Modify to only update the power state if suspend/resume was successful.
Bug: 190463801
Signed-off-by: David Brazdil <dbrazdil@google.com>
Change-Id: I285fc822e9fc926c49b9b5e69446790e1edccafb
The synchronous wakeup interface is available only for the
interruptible wakeup. Add it for normal wakeup and use this
synchronous wakeup interface to wakeup the userspace daemon.
Scheduler can make use of this hint to find a better CPU for
the waker task.
With this change the performance numbers for compress, decompress
and copy use-cases on /sdcard path has improved by ~30%.
Use-case details:
1. copy 10000 files of each 4k size into /sdcard path
2. use any File explorer application that has compress/decompress
support
3. start compress/decompress and capture the time.
-------------------------------------------------
| Default | wakeup support | Improvement/Diff |
-------------------------------------------------
| 13.8 sec | 9.9 sec | 3.9 sec (28.26%) |
-------------------------------------------------
Co-developed-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
Bug: 216261533
Link: https://lore.kernel.org/lkml/1638780405-38026-1-git-send-email-quic_pragalla@quicinc.com/
Change-Id: I9ac89064e34b1e0605064bf4d2d3a310679cb605
Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Currently, The fixed 512KB prealloc buffer size is too larger for
tiny memory kernel (such as 16MB memory). This patch adds the module
option "prealloc_buffer_size_kbytes" to specify prealloc buffer size.
It's suitable for cards which use the generic dmaengine pcm driver
with no config.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Link: https://lore.kernel.org/r/1632394246-59341-1-git-send-email-sugar.zhang@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Bug: 226673331
Change-Id: I13e4e65502e8e9625577e9a9bb9f3e02154f31f4
(cherry picked from commit b0e3b0a7078d71455747025e7deee766d4d43432)
Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
For SoC's skin temperature, we have to use more stringent temperature
control to make IPA can monitor and mitigate temperature control earlier
and faster, so add it to meet platform thermal requirement.
Bug: 211564753
Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>
Change-Id: Iaef87287eef93d6fdbc3c58c93f70c1525e38296
(cherry picked from commit 6709f523251f77dc1e9ea643668c630db1f7db80)
android_rvh_effective_cpu_util:
To perform vendor-specific cpu util, it is used in EAS/schedutil/thermal.
The effective_cpu_util would be called when thermal calc the dynamic power,
it's non-atomic context, so set the hook be restricted.
Bug: 226686099
Test: build pass
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Change-Id: I6fd77f44ca4328f5ef37d96989aa2e08d65e29bb
virtio pci config structures may in future have non-standard bar
values in the bar field. We should anticipate this by skipping any
structures containing such a reserved value.
The bar value should never change: check for harmful modified values
we re-read it from the config space in vp_modern_map_capability().
Also clean up an existing check to consistently use PCI_STD_NUM_BARS.
Signed-off-by: Keir Fraser <keirf@google.com>
Link: https://lore.kernel.org/r/20220323140727.3499235-1-keirf@google.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3f63a1d7f6f500b6891b1003cec3e23ea4996a2e)
Bug: 222232623
Signed-off-by: Keir Fraser <keirf@google.com>
Change-Id: Idbba48154a051cf173b9cb0bd40c77fcf02902a4
This reverts commit 9e35276a5344f74d4a3600fc4100b3dd251d5c56. Issue
were reported for the drivers that are using affinity managed IRQ
where manually toggling IRQ status is not expected. And we forget to
enable the interrupts in the restore path as well.
In the future, we will rework on the interrupt hardening.
Fixes: 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
Reported-by: Marc Zyngier <maz@kernel.org>
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220323031524.6555-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit eb4cecb453a19b34d5454b49532e09e9cb0c1529)
Bug: 196772804
Signed-off-by: Keir Fraser <keirf@google.com>
Change-Id: I05264d9e61d558522a8a20cf87399aa3578b3a6e
This reverts commit 080cd7c3ac8701081d143a15ba17dd9475313188. Since
the MSI-X interrupts hardening will be reverted in the next patch. We
will rework the interrupt hardening in the future.
Fixes: 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220323031524.6555-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7b79edfb862d6b1ecc66479419ae67a7db2d02e3)
Bug: 196772804
Signed-off-by: Keir Fraser <keirf@google.com>
Change-Id: I11ea35dcff3cfd43a535acf2033d7921f39c8be7
We no longer need to map the host's .rodata and .bss sections in the
pkvm hypervisor, so let's remove those mappings. This will avoid
creating dependencies at EL2 on host-controlled data-structures.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 225169428
Change-Id: I0fcb0e1b34d3c7c0c226b3fd30cdec0e8d7bfb44
The pkvm hypervisor may need to read the kvm_vgic_global_state variable
at EL2. Make sure to explicitely map it in the its stage-1 page-table
rather than relying on mapping all of .rodata.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 225169428
Change-Id: I72d1eba78fb6b7593d236539cd81269480856fdf
In pKVM mode, we can't trust the host not to mess with the hypervisor
per-cpu offsets, so let's move the array containing them to the nVHE
code.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 225169428
Change-Id: I9ef4175ce9cf00d6ff1c0e358551a565358f2408
The host KVM PMU code can currently index kvm_arm_hyp_percpu_base[]
through this_cpu_ptr_hyp_sym(), but will not actually dereference that
pointer when protected KVM is enabled. In preparation for making
kvm_arm_hyp_percpu_base[] unaccessible to the host, let's make sure the
indexing in hyp per-cpu pages is also done after the static key check to
avoid spurious accesses to EL2-private data from EL1.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 225169428
Change-Id: I3f4e3f7ee789c31a1ae1f67e07edf8fb34f520b9
Add code to head.S's el2_setup to detect MPAM and disable any EL2 traps.
This register resets to an unknown value, setting it to the default
parititons/pmg before we enable the MMU is the best thing to do.
Kexec/kdump will depend on this if the previous kernel left the CPU
configured with a restrictive configuration.
If linux is booted at the highest implemented exception level el2_setup
will clear the enable bit, disabling MPAM.
Signed-off-by: James Morse <james.morse@arm.com>
Bug: 221768437
(cherry picked from commit fa0ff38f06b397d8a92d88eb8083c2c5a20ac87f
git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/snapshot/v5.16)
Change-Id: I2758f7f7b236d09a207e13d1165efb6887e8611a
Signed-off-by: Valentin Schneider <Valentin.Schneider@arm.com>
[bm: amended commit msg, dropped config option and switched to named labels]
Signed-off-by: Beata Michalska <beata.michalska@arm.com>
This is a partial cherry-pick of commit:
7fe77616f156 ("arm64: cpufeature: discover CPU support for MPAM")
from git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
Bug: 221768437
Change-Id: I77101abb07f9b73dbc7cc2a53ac44fbf772f0b1d
Signed-off-by: Valentin Schneider <Valentin.Schneider@arm.com>
Signed-off-by: Beata Michalska <beata.michalska@arm.com>
While 5.15.27 LTS merge, the change of below patch might be omitted.
- 1921d1fd0e ("arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL")
To correct the patch, we need to recover the NOKPROBE_SYMBOL macro along
with EXPORT_SYMBOL_GPL macro that was added by below patch.
- b7ca6bc390 ("ANDROID: arm64: stacktrace: export start_backtrace symbol")
Bug: 227151759
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Change-Id: I2e4cf0654c6bd65a8ad606709642a27cdc5b2f72
(recover commit from 1921d1fd0e)
Make the large_file_test check if there is at least 3GB of free disk
space and skip the test if there is not. This is to make the tests pass
on a VM with limited disk size, now all functional tests are passing.
TAP version 13
1..26
ok 1 basic_file_ops_test
ok 2 cant_touch_index_test
ok 3 dynamic_files_and_data_test
ok 4 concurrent_reads_and_writes_test
ok 5 attribute_test
ok 6 work_after_remount_test
ok 7 child_procs_waiting_for_data_test
ok 8 multiple_providers_test
ok 9 hash_tree_test
ok 10 read_log_test
ok 11 get_blocks_test
ok 12 get_hash_blocks_test
ok 13 large_file_test
ok 14 mapped_file_test
ok 15 compatibility_test
ok 16 data_block_count_test
ok 17 hash_block_count_test
ok 18 per_uid_read_timeouts_test
ok 19 inotify_test
ok 20 verity_test
ok 21 enable_verity_test
ok 22 mmap_test
ok 23 truncate_test
ok 24 stat_test
ok 25 sysfs_test
Error mounting fs.: File exists
Error mounting fs.: File exists
ok 26 sysfs_rename_test
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I2260e2b314429251070d0163c70173f237f86476
Without it incfs/incfs_perf runtime fails in format_signature:
malloc(): invalid size (unsorted)
Aborted
When compiled with gcc version 11.2.0.
Also add check for NULL after the malloc, and remove unneeded
space for uint32_t in signing_section.
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I62b775140e4b89f75335cbd65665cf6a3e0fe964
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
This is a design decision and the user space component "Data Loader"
expects this to work for app re-install use case.
The mounting depth needs to be controlled, however, and only allowed
to be two levels deep. In case of more than two mount attempts the
driver needs to return an error.
In case of the issues listed below the common pattern is that the
reproducer calls:
mount("./file0", "./file0", "incremental-fs", 0, NULL)
many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:
BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN
This change also cleans up the mount error path to properly clean
allocated resources and call deactivate_locked_super(), which
causes the incfs_kill_sb() to be called, where the sb is freed.
Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I08d9b545a2715423296bf4beb67bdbbed78d1be1
Export stack_trace_save_tsk and stack_trace_save_regs so that modules
can use it to get the stacktrace of specific tasks or pt_regs.
Bug: 220704805
Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Change-Id: I568e5d8c3e2474a0dce96da6adb4eec2583d45d2
(cherry picked from commit db42a16b80a073fdf3f6f3ee207395bba0c7219e)
To support multiple kswap threads vendor modules need
access to kswapd function. So, export it.
Bug: 201263306
Change-Id: I442612710835f39836a295e9d1936f86826ab960
Signed-off-by: Vijayanand Jitta <quic_vjitta@quicinc.com>
This reverts commit 33061d0fba.
It will come back after the next ABI break as it is needed to resolve
CVE-2022-1048. But for now, while testing, it can be reverted in order
to preserve the ABI.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iebd89734d6d1b0bdb3aaa9e8b52fc07345377a64
This reverts commit 47711ff10c.
It will come back after the next ABI break as it is needed to resolve
CVE-2022-1048. But for now, while testing, it can be reverted in order
to preserve the ABI.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5d4a52d443cffa34c8550b83c90d11a69da09cfc
This reverts commit cb6a39c5eb.
It will come back after the next ABI break as it is needed to resolve
CVE-2022-1048. But for now, while testing, it can be reverted in order
to preserve the ABI.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I52c67352b87bc95efab566d76f95d5466a61da64
This reverts commit 51fce708ab.
It will come back after the next ABI break as it is needed to resolve
CVE-2022-1048. But for now, while testing, it can be reverted in order
to preserve the ABI.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8fa7d4fd7200f76aeb02f0741c6508e86146c2a3
Changes in 5.15.32
nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION
net: ipv6: fix skb_over_panic in __ip6_append_data
tpm: Fix error handling in async work
Bluetooth: btusb: Add another Realtek 8761BU
llc: fix netdevice reference leaks in llc_ui_bind()
ASoC: sti: Fix deadlock via snd_pcm_stop_xrun() call
ALSA: oss: Fix PCM OSS buffer allocation overflow
ALSA: usb-audio: add mapping for new Corsair Virtuoso SE
ALSA: hda/realtek: Add quirk for Clevo NP70PNJ
ALSA: hda/realtek: Add quirk for Clevo NP50PNJ
ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
ALSA: hda/realtek: Add quirk for ASUS GA402
ALSA: pcm: Fix races among concurrent hw_params and hw_free calls
ALSA: pcm: Fix races among concurrent read/write and buffer changes
ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls
ALSA: pcm: Fix races among concurrent prealloc proc writes
ALSA: pcm: Add stream lock during PCM reset ioctl operations
ALSA: usb-audio: Add mute TLV for playback volumes on RODE NT-USB
ALSA: cmipci: Restore aux vol on suspend/resume
ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec
drivers: net: xgene: Fix regression in CRC stripping
netfilter: nf_tables: initialize registers in nft_do_chain()
netfilter: nf_tables: validate registers coming from userspace.
ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
ACPI: battery: Add device HID and quirk for Microsoft Surface Go 3
ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU
crypto: qat - disable registration of algorithms
Bluetooth: btusb: Add one more Bluetooth part for the Realtek RTL8852AE
Revert "ath: add support for special 0x0 regulatory domain"
drm/virtio: Ensure that objs is not NULL in virtio_gpu_array_put_free()
rcu: Don't deboost before reporting expedited quiescent state
uaccess: fix integer overflow on access_ok()
mac80211: fix potential double free on mesh join
tpm: use try_get_ops() in tpm-space.c
wcn36xx: Differentiate wcn3660 from wcn3620
m68k: fix access_ok for coldfire
nds32: fix access_ok() checks in get/put_user
llc: only change llc->dev when bind() succeeds
Linux 5.15.32
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3f78747142047de7aa26e2c81bc378d4d0aedaff
Passing FOLL_FORCE when pinning guest memory pages was intended to allow
the VMM to map guest memory as PROT_NONE without prohibiting access from
the guest. As it turns out, crosvm doesn't implement this, and since
the host kernel will inject a signal into the VMM on a bad access
irrespective of the stage-1 permissions, we can drop the FOLL_FORCE flag
altogether.
Bug: 226564150
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: If21091b6adf3dbe4155c5c840753c912d283b159
This reverts commit a1bd2a6b3c.
This capability is unused, so remove it to avoid UAPI divergence from
upstream.
Bug: 226564150
[willdeacon@: Also removed additional instance in arch/arm64/kvm/arm.c]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ib3e929a5fc81dc5c9c1ff8512d48f63bdda5c404
This reverts commit 0b7e337baf.
These notifications are unused by crosvm and are no longer required now
that the host takes care of injecting a SEGV on an illegal memory access
from userspace.
Bug: 226564150
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I22c3e49b4aa5f023961c8849b79e2e0a21ebf0c1
Export these tracepoint functions to track USB data flow
for performance tuning.
Bug: 192512300
Signed-off-by: chihhao.chen <chihhao.chen@mediatek.com>
Change-Id: I37ae07e87b5b2d0fb24c1e0a2e83954ceb4aa4f9
(cherry picked from commit 5eb3930a32)
The pKVM hypervisor will currently panic if the host tries to access
memory that it doesn't own (e.g. protected guest memory). Sadly, as
guest memory can still be mapped into the VMM's address space, userspace
can trivially crash the kernel/hypervisor by poking into guest memory.
To prevent this, inject the abort back in the host with S1PTW set in the
ESR, hence allowing the host to differentiate this abort from normal
userspace faults and inject a SIGSEGV cleanly.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 215520143
Change-Id: I9636e71e2fe3eb49d2d7cddaab7774cd672cfcae
In order to simplify the injection of exceptions in the host in pkvm
context, let's factor out of enter_exception64() the code calculating
the exception offset from VBAR_EL1 and the cpsr.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 215520143
Change-Id: I97b2431a79fdec87c95c2d1f691bd3a11635c29b
Add a helper allowing to check when the pkvm static key is enabled to
ease the introduction of pkvm hooks in other parts of the code.
Signed-off-by: Quentin Perret <qperret@google.com>
Bug: 215520143
Change-Id: Iae065b09bb33d42d73a408365c803727269d0de0
If a malicious/compromised host issues a PSCI SYSTEM_RESET call in the
presence of guest-owned pages then the contents of those pages may be
susceptible to cold-reboot attacks.
Use the PSCI MEM_PROTECT call to ensure that volatile memory is wiped by
the firmware if a SYSTEM_RESET occurs while unpoisoned guest pages exist
in the system. Since this call does not offer protection for a "warm"
reset initiated by SYSTEM_RESET2, detect this case in the PSCI relay and
repaint the call to a standard SYSTEM_RESET instead.
Bug: 196204410
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5c3dd93bc83ebcd0b6cea2ec734f6e3a77f0064e
Let's check the return value of pin_user_pages() before blindly
dereferencing the struct page pointer as it may very well be NULL.
Bug: 223678931
Reported-by: Keir Fraser <keirf@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I49eb0eb14b88429cfeed3e7cc8a2a72404cfea97