[ Upstream commit 92c45b63ce22c8898aa41806e8d6692bcd577510 ]
For hardware that only supports 32-bit writes to PCI there is the
possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
messages was introduced by fb26592301, but rate-limiting is not the best
choice here. Some devices may not show the warnings they should if another
device has just produced a bunch of warnings. Also, the number of messages
can be a nuisance on devices which are otherwise working fine.
Change the ratelimit to a single warning per bus. This ensures no bus is
'starved' of emitting a warning and also that there isn't a continuous
stream of warnings. It would be preferable to have a warning per device,
but the pci_dev structure is not available here, and a lookup from devfn
would be far too slow.
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Fixes: fb26592301 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Link: https://lore.kernel.org/r/20200806041455.11070-1-mark.tomlinson@alliedtelesis.co.nz
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2226667a145db2e1f314d7f57fd644fe69863ab9 upstream.
It appears that some devices are lying about their mask capability,
pretending that they don't have it, while they actually do.
The net result is that now that we don't enable MSIs on such
endpoint.
Add a new per-device flag to deal with this. Further patches will
make use of it, sadly.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20211104180130.3825416-2-maz@kernel.org
Cc: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull VFIO updates from Alex Williamson:
- Fix dma-valid return WAITED implementation (Anthony Yznaga)
- SPDX license cleanups (Cai Huoqing)
- Split vfio-pci-core from vfio-pci and enhance PCI driver matching to
support future vendor provided vfio-pci variants (Yishai Hadas, Max
Gurtovoy, Jason Gunthorpe)
- Replace duplicated reflck with core support for managing first open,
last close, and device sets (Jason Gunthorpe, Max Gurtovoy, Yishai
Hadas)
- Fix non-modular mdev support and don't nag about request callback
support (Christoph Hellwig)
- Add semaphore to protect instruction intercept handler and replace
open-coded locks in vfio-ap driver (Tony Krowiak)
- Convert vfio-ap to vfio_register_group_dev() API (Jason Gunthorpe)
* tag 'vfio-v5.15-rc1' of git://github.com/awilliam/linux-vfio: (37 commits)
vfio/pci: Introduce vfio_pci_core.ko
vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'
vfio: Use select for eventfd
PCI / VFIO: Add 'override_only' support for VFIO PCI sub system
PCI: Add 'override_only' field to struct pci_device_id
vfio/pci: Move module parameters to vfio_pci.c
vfio/pci: Move igd initialization to vfio_pci.c
vfio/pci: Split the pci_driver code out of vfio_pci_core.c
vfio/pci: Include vfio header in vfio_pci_core.h
vfio/pci: Rename ops functions to fit core namings
vfio/pci: Rename vfio_pci_device to vfio_pci_core_device
vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h
vfio/pci: Rename vfio_pci.c to vfio_pci_core.c
vfio/ap_ops: Convert to use vfio_register_group_dev()
s390/vfio-ap: replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification
s390/vfio-ap: r/w lock for PQAP interception handler function pointer
vfio/type1: Fix vfio_find_dma_valid return
vfio-pci/zdev: Remove repeated verbose license text
vfio: platform: reset: Convert to SPDX identifier
vfio: Remove struct vfio_device_ops open/release
...
- Add domain_nr in struct pci_host_bridge (Boqun Feng)
- Use host bridge MSI domain for root buses if present (Boqun Feng)
- Allow ARM64 virtual host bridge with no ACPI companion (e.g., Hyper-V)
(Boqun Feng)
- Make Hyper-V enumeration more generic (Arnd Bergmann)
- Set Hyper-V domain_nr at probe-time (Boqun Feng)
- Set up Hyper-V MSI domain at bridge probe-time (Boqun Feng)
- Enable Hyper-V bridge probing on ARM64 (Boqun Feng)
* remotes/lorenzo/pci/hyper-v:
PCI: hv: Turn on the host bridge probing on ARM64
PCI: hv: Set up MSI domain at bridge probing time
PCI: hv: Set ->domain_nr of pci_host_bridge at probing time
PCI: hv: Generify PCI probing
arm64: PCI: Support root bridge preparation for Hyper-V
arm64: PCI: Restructure pcibios_root_bridge_prepare()
PCI: Support populating MSI domains of root buses via bridges
PCI: Introduce domain_nr in pci_host_bridge
- Cache PCIe Device Capabilities register (Amey Narkhede)
- Add pcie_reset_flr() with 'probe' argument (Amey Narkhede)
- Add pdev->reset_methods[] array to track reset method ordering (Amey
Narkhede)
- Remove reset_fn field from pci_dev (Amey Narkhede)
- Add sysfs interface to query and set device reset mechanism (Amey
Narkhede)
- Add pci_set_acpi_fwnode() to set ACPI_COMPANION (Shanker Donthineni)
- Use acpi_pci_power_manageable() instead of duplicating logic (Shanker
Donthineni)
- Set ACPI fwnode early and at the same time with OF (Shanker Donthineni)
- Add support for ACPI _RST reset method (Shanker Donthineni)
- Change reset function 'probe' argument to bool (Amey Narkhede)
* pci/reset:
PCI: Change the type of probe argument in reset functions
PCI: Add support for ACPI _RST reset method
PCI: Setup ACPI fwnode early and at the same time with OF
PCI: Use acpi_pci_power_manageable()
PCI: Add pci_set_acpi_fwnode() to set ACPI_COMPANION
PCI: Allow userspace to query and set device reset mechanism
PCI: Remove reset_fn field from pci_dev
PCI: Add array to track reset method ordering
PCI: Add pcie_reset_flr() with 'probe' argument
PCI: Cache PCIe Device Capabilities register
Expose an 'override_only' helper macro (i.e.
PCI_DRIVER_OVERRIDE_DEVICE_VFIO) for VFIO PCI sub system and add the
required code to prefix its matching entries with "vfio_" in
modules.alias file.
It allows VFIO device drivers to include match entries in the
modules.alias file produced by kbuild that are not used for normal
driver autoprobing and module autoloading. Drivers using these match
entries can be connected to the PCI device manually, by userspace, using
the existing driver_override sysfs.
For example the resulting modules.alias may have:
alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
In this example mlx5_core and mlx5_vfio_pci match to the same PCI
device. The kernel will autoload and autobind to mlx5_core but the
kernel and udev mechanisms will ignore mlx5_vfio_pci.
When userspace wants to change a device to the VFIO subsystem it can
implement a generic algorithm:
1) Identify the sysfs path to the device:
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
2) Get the modalias string from the kernel:
$ cat /sys/bus/pci/devices/0000:01:00.0/modalias
pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
3) Prefix it with vfio_:
vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
4) Search modules.alias for the above string and select the entry that
has the fewest *'s:
alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
5) modprobe the matched module name:
$ modprobe mlx5_vfio_pci
6) cat the matched module name to driver_override:
echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
7) unbind device from original module
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
8) probe PCI drivers (or explicitly bind to mlx5_vfio_pci)
echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
The algorithm is independent of bus type. In future the other buses with
VFIO device drivers, like platform and ACPI, can use this algorithm as
well.
This patch is the infrastructure to provide the information in the
modules.alias to userspace. Convert the only VFIO pci_driver which results
in one new line in the modules.alias:
alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
Later series introduce additional HW specific VFIO PCI drivers, such as
mlx5_vfio_pci.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for pci.h
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20210826103912.128972-11-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add 'override_only' field to struct pci_device_id to be used as part of
pci_match_device().
When set, a driver only matches the entry when dev->driver_override is
set to that driver.
In addition, add a helper macro named 'PCI_DEVICE_DRIVER_OVERRIDE' to
enable setting some data on it.
Next patch from this series will use the above functionality.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20210826103912.128972-10-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add a predicate that returns if PCIe PTM (Precision Time Measurement)
is enabled.
It will only return true if it's enabled in all the ports in the path
from the device to the root.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Make pci_enable_ptm() accessible from the drivers.
Exposing this to the driver enables the driver to use the
'ptm_enabled' field of 'pci_dev' to check if PTM is enabled or not.
This reverts commit ac6c26da29 ("PCI: Make pci_enable_ptm() private").
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently we retrieve the PCI domain number of the host bridge from the
bus sysdata (or pci_config_window if PCI_DOMAINS_GENERIC=y). Actually
we have the information at PCI host bridge probing time, and it makes
sense that we store it into pci_host_bridge. One benefit of doing so is
the requirement for supporting PCI on Hyper-V for ARM64, because the
host bridge of Hyper-V doesn't have pci_config_window, whereas ARM64 is
a PCI_DOMAINS_GENERIC=y arch, so we cannot retrieve the PCI domain
number from pci_config_window on ARM64 Hyper-V guest.
As the preparation for ARM64 Hyper-V PCI support, we introduce the
domain_nr in pci_host_bridge and a sentinel value to allow drivers to
set domain numbers properly at probing time. Currently
CONFIG_PCI_DOMAINS_GENERIC=y archs are only users of this
newly-introduced field.
Link: https://lore.kernel.org/r/20210726180657.142727-2-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
pci_resource_end() can be 0 only when pci_resource_start() is 0.
Otherwise, it is definitely an error. In this case, pci_resource_len()
should be regarded as 0. Therefore, determining whether
pci_resource_start() and pci_resource_end() are both 0 can be reduced to
determining only whether pci_resource_end() is 0.
Although only one condition judgment is reduced, the macro function
pci_resource_len() is widely referenced in the kernel. I used defconfig to
compile the latest kernel on X86, and its binary code size was reduced by
about 3KB.
Before:
[ 2] .rela.text RELA 0000000000000000 093bfcb0
0000000001a67168 0000000000000018 I 68 1 8
After:
[ 2] .rela.text RELA 0000000000000000 093bfcb0
0000000001a66598 0000000000000018 I 68 1 8
Link: https://lore.kernel.org/r/20210713072236.3043-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Interfaces and structs for saving and restoring PCI Capability state were
declared in include/linux/pci.h, but aren't needed outside drivers/pci/.
Move these to drivers/pci/pci.h:
struct pci_cap_saved_data
struct pci_cap_saved_state
void pci_allocate_cap_save_buffers()
void pci_free_cap_save_buffers()
int pci_add_cap_save_buffer()
int pci_add_ext_cap_save_buffer()
struct pci_cap_saved_state *pci_find_saved_cap()
struct pci_cap_saved_state *pci_find_saved_ext_cap()
Link: https://lore.kernel.org/r/20210802221728.1469304-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
All users of pci_vpd_find_info_keyword() are interested in the VPD RO
section only. In addition all calls are followed by the same activities to
calculate start of tag data area and size of the data area.
Add pci_vpd_find_ro_info_keyword() that combines these functionalities.
pci_vpd_find_info_keyword() can be phased out once all users are converted.
[bhelgaas: split pci_vpd_check_csum() to separate patch]
Link: https://lore.kernel.org/r/1643bd7a-088e-1028-c9b0-9d112cf48d63@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Several users of the VPD API use a fixed-size buffer and read the VPD into
it for further usage. This requires special handling for the case that the
buffer isn't big enough to hold the full VPD data. Also the buffer is
often allocated on the stack, which isn't too nice.
Add pci_vpd_alloc() to dynamically allocate buffer of the correct size and
read VPD into it.
Link: https://lore.kernel.org/r/955ff598-0021-8446-f856-0c2c077635d7@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Most reset methods are of the form "pci_*_reset(dev, probe)". pcie_flr()
was an exception because it relied on a separate pcie_has_flr() function
instead of taking a "probe" argument.
Add "pcie_reset_flr(dev, probe)" to follow the convention. Remove
pcie_has_flr().
Some pcie_flr() callers that did not use pcie_has_flr() remain.
[bhelgaas: commit log, rework pcie_reset_flr() to use dev->devcap directly]
Link: https://lore.kernel.org/r/20210817180500.1253-3-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Pull VFIO updates from Alex Williamson:
- Module reference fixes, structure renaming (Max Gurtovoy)
- Export and use common pci_dev_trylock() (Luis Chamberlain)
- Enable direct mdev device creation and probing by parent (Christoph
Hellwig & Jason Gunthorpe)
- Fix mdpy error path leak (Colin Ian King)
- Fix mtty list entry leak (Jason Gunthorpe)
- Enforce mtty device limit (Alex Williamson)
- Resolve concurrent vfio-pci mmap faults (Alex Williamson)
* tag 'vfio-v5.14-rc1' of git://github.com/awilliam/linux-vfio:
vfio/pci: Handle concurrent vma faults
vfio/mtty: Enforce available_instances
vfio/mtty: Delete mdev_devices_list
vfio: use the new pci_dev_trylock() helper to simplify try lock
PCI: Export pci_dev_trylock() and pci_dev_unlock()
vfio/mdpy: Fix memory leak of object mdev_state->vconfig
vfio/iommu_type1: rename vfio_group struck to vfio_iommu_group
vfio/mbochs: Convert to use vfio_register_group_dev()
vfio/mdpy: Convert to use vfio_register_group_dev()
vfio/mtty: Convert to use vfio_register_group_dev()
vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
driver core: Export device_driver_attach()
driver core: Don't return EPROBE_DEFER to userspace during sysfs bind
driver core: Flow the return code from ->probe() through to sysfs bind
driver core: Better distinguish probe errors in really_probe
driver core: Pull required checks into driver_probe_device()
vfio/platform: remove unneeded parent_module attribute
vfio: centralize module refcount in subsystem layer
Pull devicetree updates from Rob Herring:
- Refine reserved memory nomap handling
- Merge some PCI and non-PCI address handling implementations
- Simplify of_address.h header ifdefs
- Improve printk handling of some 64-bit types
- Convert adi,adv7511, Arm ccree, Arm SCMI, Arm SCU, Arm TWD timer, Arm
VIC, arm,sbsa-gwdt, Arm/Amlogic SCPI, Aspeed I2C, Broadcom iProc PWM,
linaro,optee-tz, MDIO GPIO, Mediatek RNG, MTD physmap, NXP
pcf8563/pcf85263/pcf85363, Renesas TPU, renesas,emev2-smu,
renesas,r9a06g032-sysctrl, sysc-rmobile, Tegra20 EMC, TI AM56 PCI, TI
OMAP mailbox, TI SCI bindings, virtio-mmio, Zynq FPGA, and ZynqMP RTC
to DT schema
- Convert mux and mux controller bindings to schema. This includes MDIO
IIO, and I2C muxes.
- Add Arm PL031 RTC binding schema
- Add vendor prefixes for StarFive Technology Co. Ltd. and Insignal Ltd
- Fix some stale doc references
- Remove stale property-units.txt. Superseded by schema in dt-schema
repo.
- Fixes for 'unevaluatedProperties' handling (enabled with experimental
json-schema support)
- Drop redundant usage of minItems and maxItems across the tree
- Update some examples to use bindings with a schema
* tag 'devicetree-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (83 commits)
dt-bindings: Fix 'unevaluatedProperties' errors in DT graph users
dt-bindings: display: renesas,du: Fix 'ports' reference
dt-bindings: media: adv7180: Add missing video-interfaces.yaml reference
dt-bindings: crypto: ccree: Convert to json-schema
dt-bindings: fpga: zynq: convert bindings to YAML
dt-bindings: rtc: zynqmp: convert bindings to YAML
dt-bindings: interrupt-controller: Convert ARM VIC to json-schema
of: of_reserved_mem: mark nomap memory instead of removing
of: of_reserved_mem: only call memblock_free for normal reserved memory
dt-bindings: Drop redundant minItems/maxItems
dt-bindings: spmi: Correct 'reg' schema
of: reserved-memory: Add stub for RESERVEDMEM_OF_DECLARE()
dt-bindings: clk: vc5: Fix example
dt-bindings: timer: renesas,tmu: add r8a779a0 TMU support
dt-bindings: drm: bridge: adi,adv7511.txt: convert to yaml
dt-bindings: PCI: ti,am65: Convert PCIe host/endpoint mode dt-bindings to YAML
of: Remove superfluous casts when printing u64 values
of: Fix truncation of memory sizes on 32-bit platforms
dt-bindings: rtc: nxp,pcf8563: Absorb pcf85263/pcf85363 bindings
dt-bindings: pwm: Use examples with documented/matching schema
...
Since commit 9ec37efb87 ("PCI/MSI: Make pci_host_common_probe() declare
its reliance on MSI domains"), platforms that rely on the "msi-map"
device-tree property don't get MSIs anymore.
On the Arm Fast Model for example [1], the host bridge doesn't have a
"msi-parent" property since it doesn't itself generate MSIs, and so doesn't
get a MSI domain. It has an "msi-map" property instead to describe MSI
controllers of child devices. As a result, due to the new msi_domain check
in pci_register_host_bridge(), the whole bus gets PCI_BUS_FLAGS_NO_MSI.
Check whether the root complex has an "msi-map" property before giving
up on MSIs.
[1] arch/arm64/boot/dts/arm/fvp-base-revc.dts
Fixes: 9ec37efb87 ("PCI/MSI: Make pci_host_common_probe() declare its reliance on MSI domains")
Link: https://lore.kernel.org/r/20210510173129.750496-1-jean-philippe@linaro.org
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Pull dmaengine updates from Vinod Koul:
"New drivers/devices:
- Support for QCOM SM8150 GPI DMA
Updates:
- Big pile of idxd updates including support for performance
monitoring
- Support in dw-edma for interleaved dma
- Support for synchronize() in Xilinx driver"
* tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (42 commits)
dmaengine: idxd: Enable IDXD performance monitor support
dmaengine: idxd: Add IDXD performance monitor support
dmaengine: idxd: remove MSIX masking for interrupt handlers
dmaengine: idxd: device cmd should use dedicated lock
dmaengine: idxd: support reporting of halt interrupt
dmaengine: idxd: enable SVA feature for IOMMU
dmaengine: idxd: convert sprintf() to sysfs_emit() for all usages
dmaengine: idxd: add interrupt handle request and release support
dmaengine: idxd: add support for readonly config mode
dmaengine: idxd: add percpu_ref to descriptor submission path
dmaengine: idxd: remove detection of device type
dmaengine: idxd: iax bus removal
dmaengine: idxd: fix cdev setup and free device lifetime issues
dmaengine: idxd: fix group conf_dev lifetime
dmaengine: idxd: fix engine conf_dev lifetime
dmaengine: idxd: fix wq conf_dev 'struct device' lifetime
dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime
dmaengine: idxd: use ida for device instance enumeration
dmaengine: idxd: removal of pcim managed mmio mapping
dmaengine: idxd: cleanup pci interrupt vector allocation management
...
- Convert tegra to MSI domains (Marc Zyngier)
- Use rcar controller address as MSI doorbell instead of allocating a page
(Marc Zyngier)
- Convert rcar to MSI domains (Marc Zyngier)
- Use xilinx port structure as MSI doorbell instead of allocating a page
(Marc Zyngier)
- Convert xilinx to MSI domains (Marc Zyngier)
- Remove unused Hyper-V msi_controller structure (Marc Zyngier)
- Remove unused PCI core msi_controller support (Marc Zyngier)
- Remove struct msi_controller (Marc Zyngier)
- Remove unused default_teardown_msi_irqs() (Marc Zyngier)
- Let host bridges declare their reliance on MSI domains (Marc Zyngier)
- Make pci_host_common_probe() declare its reliance on MSI domains (Marc
Zyngier)
- Advertise mediatek lack of built-in MSI handling (Thomas Gleixner)
- Document ways of ending up with NO_MSI (Marc Zyngier)
- Refactor HT advertising of NO_MSI flag (Marc Zyngier)
* remotes/lorenzo/pci/msi:
PCI: Refactor HT advertising of NO_MSI flag
PCI/MSI: Document the various ways of ending up with NO_MSI
PCI: mediatek: Advertise lack of built-in MSI handling
PCI/MSI: Make pci_host_common_probe() declare its reliance on MSI domains
PCI/MSI: Let PCI host bridges declare their reliance on MSI domains
PCI/MSI: Kill default_teardown_msi_irqs()
PCI/MSI: Kill msi_controller structure
PCI/MSI: Drop use of msi_controller from core code
PCI: hv: Drop msi_controller structure
PCI: xilinx: Convert to MSI domains
PCI: xilinx: Don't allocate extra memory for the MSI capture address
PCI: rcar: Convert to MSI domains
PCI: rcar: Don't allocate extra memory for the MSI capture address
PCI: tegra: Convert to MSI domains
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its headers files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hops group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifing ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS change race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074 a 802.11ax device
- Bluetooth: Broadcom BCM4330 and BMC4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimitting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
The "rom" sysfs attribute allows access to the PCI Option ROM. Previously
it was dynamically created either by pci_bus_add_device() or the
pci_sysfs_init() initcall, but since it doesn't need to be created or
removed dynamically, we can use a static attribute so the device model
takes care of addition and removal automatically.
Convert "rom" to a static attribute and use the .is_bin_visible() callback
to set the correct object size based on the ROM size.
Remove "rom_attr" from the struct pci_dev since it is no longer needed.
This attribute was added in the pre-git era by https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/drivers/pci/pci-sysfs.c?id=f6d553444da2
[bhelgaas: commit log]
Suggested-by: Oliver O'Halloran <oohall@gmail.com>
Link: https://lore.kernel.org/r/20210416205856.3234481-3-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>