This reverts commit 7a67b48b624b3f23cd4c7ee2ebfbba45d0b00f28.
The original commit was reverted due to the issue in speculative
handling of file-backed pages. Now that the fix is implemented by
disabling interrupts when walking the page tables this functionality
can be re-enabled.
Bug: 245389404
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ib33a02757eaab60e829941ccfd0f9caa15991282
Changes in 5.15.75
Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
ALSA: oss: Fix potential deadlock at unregistration
ALSA: rawmidi: Drop register_mutex in snd_rawmidi_free()
ALSA: usb-audio: Fix potential memory leaks
ALSA: usb-audio: Fix NULL dererence at error path
ALSA: hda/realtek: remove ALC289_FIXUP_DUAL_SPK for Dell 5530
ALSA: hda/realtek: Correct pin configs for ASUS G533Z
ALSA: hda/realtek: Add quirk for ASUS GV601R laptop
ALSA: hda/realtek: Add Intel Reference SSID to support headset keys
mtd: rawnand: atmel: Unmap streaming DMA mappings
io_uring/net: don't update msg_name if not provided
hv_netvsc: Fix race between VF offering and VF association message from host
cifs: destage dirty pages before re-reading them for cache=none
cifs: Fix the error length of VALIDATE_NEGOTIATE_INFO message
iio: dac: ad5593r: Fix i2c read protocol requirements
iio: ltc2497: Fix reading conversion results
iio: adc: ad7923: fix channel readings for some variants
iio: pressure: dps310: Refactor startup procedure
iio: pressure: dps310: Reset chip after timeout
xhci: dbc: Fix memory leak in xhci_alloc_dbc()
usb: add quirks for Lenovo OneLink+ Dock
can: kvaser_usb: Fix use of uninitialized completion
can: kvaser_usb_leaf: Fix overread with an invalid command
can: kvaser_usb_leaf: Fix TX queue out of sync after restart
can: kvaser_usb_leaf: Fix CAN state after restart
mmc: sdhci-sprd: Fix minimum clock limit
i2c: designware: Fix handling of real but unexpected device interrupts
fs: dlm: fix race between test_bit() and queue_work()
fs: dlm: handle -EBUSY first in lock arg validation
HID: multitouch: Add memory barriers
quota: Check next/prev free block number after reading from quota file
platform/chrome: cros_ec_proto: Update version on GET_NEXT_EVENT failure
ASoC: wcd9335: fix order of Slimbus unprepare/disable
ASoC: wcd934x: fix order of Slimbus unprepare/disable
hwmon: (gsc-hwmon) Call of_node_get() before of_find_xxx API
net: thunderbolt: Enable DMA paths only after rings are enabled
regulator: qcom_rpm: Fix circular deferral regression
arm64: topology: move store_cpu_topology() to shared code
riscv: topology: fix default topology reporting
RISC-V: Make port I/O string accessors actually work
parisc: fbdev/stifb: Align graphics memory size to 4MB
riscv: Allow PROT_WRITE-only mmap()
riscv: Make VM_WRITE imply VM_READ
riscv: always honor the CONFIG_CMDLINE_FORCE when parsing dtb
riscv: Pass -mno-relax only on lld < 15.0.0
UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
nvmem: core: Fix memleak in nvmem_register()
nvme-multipath: fix possible hang in live ns resize with ANA access
nvme-pci: set min_align_mask before calculating max_hw_sectors
Revert "drm/amdgpu: use dirty framebuffer helper"
dmaengine: mxs: use platform_driver_register
drm/virtio: Check whether transferred 2D BO is shmem
drm/virtio: Unlock reservations on virtio_gpu_object_shmem_init() error
drm/virtio: Use appropriate atomic state in virtio_gpu_plane_cleanup_fb()
drm/udl: Restore display mode on resume
arm64: errata: Add Cortex-A55 to the repeat tlbi list
mm/damon: validate if the pmd entry is present before accessing
mm/mmap: undo ->mmap() when arch_validate_flags() fails
xen/gntdev: Prevent leaking grants
xen/gntdev: Accommodate VMA splitting
PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge
serial: 8250: Let drivers request full 16550A feature probing
serial: 8250: Request full 16550A feature probing for OxSemi PCIe devices
NFSD: Protect against send buffer overflow in NFSv3 READDIR
NFSD: Protect against send buffer overflow in NFSv2 READ
NFSD: Protect against send buffer overflow in NFSv3 READ
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
powerpc/boot: Explicitly disable usage of SPE instructions
slimbus: qcom-ngd: use correct error in message of pdr_add_lookup() failure
slimbus: qcom-ngd: cleanup in probe error path
scsi: qedf: Populate sysfs attributes for vport
gpio: rockchip: request GPIO mux to pinctrl when setting direction
pinctrl: rockchip: add pinmux_ops.gpio_set_direction callback
fbdev: smscufx: Fix use-after-free in ufx_ops_open()
ksmbd: fix endless loop when encryption for response fails
ksmbd: Fix wrong return value and message length check in smb2_ioctl()
ksmbd: Fix user namespace mapping
fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE
btrfs: fix race between quota enable and quota rescan ioctl
btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
f2fs: complete checkpoints during remount
f2fs: flush pending checkpoints when freezing super
f2fs: increase the limit for reserve_root
f2fs: fix to do sanity check on destination blkaddr during recovery
f2fs: fix to do sanity check on summary info
hardening: Avoid harmless Clang option under CONFIG_INIT_STACK_ALL_ZERO
hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
jbd2: wake up journal waiters in FIFO order, not LIFO
jbd2: fix potential buffer head reference count leak
jbd2: fix potential use-after-free in jbd2_fc_wait_bufs
jbd2: add miss release buffer head in fc_do_one_pass()
ext4: avoid crash when inline data creation follows DIO write
ext4: fix null-ptr-deref in ext4_write_info
ext4: make ext4_lazyinit_thread freezable
ext4: fix check for block being out of directory size
ext4: don't increase iversion counter for ea_inodes
ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
ext4: place buffer head allocation before handle start
ext4: fix dir corruption when ext4_dx_add_entry() fails
ext4: fix miss release buffer head in ext4_fc_write_inode
ext4: fix potential memory leak in ext4_fc_record_modified_inode()
ext4: fix potential memory leak in ext4_fc_record_regions()
ext4: update 'state->fc_regions_size' after successful memory allocation
livepatch: fix race between fork and KLP transition
ftrace: Properly unset FTRACE_HASH_FL_MOD
ring-buffer: Allow splice to read previous partially read pages
ring-buffer: Have the shortest_full queue be the shortest not longest
ring-buffer: Check pending waiters when doing wake ups as well
ring-buffer: Add ring_buffer_wake_waiters()
ring-buffer: Fix race between reset page and reading page
tracing: Disable interrupt or preemption before acquiring arch_spinlock_t
tracing: Wake up ring buffer waiters on closing of the file
tracing: Wake up waiters when tracing is disabled
tracing: Add ioctl() to force ring buffer waiters to wake up
tracing: Move duplicate code of trace_kprobe/eprobe.c into header
tracing: Add "(fault)" name injection to kernel probes
tracing: Fix reading strings from synthetic events
thunderbolt: Explicitly enable lane adapter hotplug events at startup
efi: libstub: drop pointless get_memory_map() call
media: cedrus: Set the platform driver data earlier
media: cedrus: Fix endless loop in cedrus_h265_skip_bits()
blk-wbt: call rq_qos_add() after wb_normal is initialized
KVM: x86/emulator: Fix handing of POP SS to correctly set interruptibility
KVM: nVMX: Unconditionally purge queued/injected events on nested "exit"
KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02
KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS
staging: greybus: audio_helper: remove unused and wrong debugfs usage
drm/nouveau/kms/nv140-: Disable interlacing
drm/nouveau: fix a use-after-free in nouveau_gem_prime_import_sg_table()
drm/i915: Fix watermark calculations for gen12+ RC CCS modifier
drm/i915: Fix watermark calculations for gen12+ MC CCS modifier
drm/i915: Fix watermark calculations for gen12+ CCS+CC modifier
drm/amd/display: Fix vblank refcount in vrr transition
smb3: must initialize two ACL struct fields to zero
selinux: use "grep -E" instead of "egrep"
ima: fix blocking of security.ima xattrs of unsupported algorithms
userfaultfd: open userfaultfds with O_RDONLY
ntfs3: rework xattr handlers and switch to POSIX ACL VFS helpers
thermal: cpufreq_cooling: Check the policy first in cpufreq_cooling_register()
sh: machvec: Use char[] for section boundaries
MIPS: SGI-IP27: Free some unused memory
MIPS: SGI-IP27: Fix platform-device leak in bridge_platform_create()
ARM: 9244/1: dump: Fix wrong pg_level in walk_pmd()
ARM: 9247/1: mm: set readonly for MT_MEMORY_RO with ARM_LPAE
objtool: Preserve special st_shndx indexes in elf_update_symbol
nfsd: Fix a memory leak in an error handling path
SUNRPC: Fix svcxdr_init_decode's end-of-buffer calculation
SUNRPC: Fix svcxdr_init_encode's buflen calculation
NFSD: Protect against send buffer overflow in NFSv2 READDIR
NFSD: Fix handling of oversized NFSv4 COMPOUND requests
wifi: rtlwifi: 8192de: correct checking of IQK reload
wifi: ath10k: add peer map clean up for peer delete in ath10k_sta_state()
leds: lm3601x: Don't use mutex after it was destroyed
bpf: Fix reference state management for synchronous callbacks
wifi: mac80211: allow bw change during channel switch in mesh
bpftool: Fix a wrong type cast in btf_dumper_int
spi: mt7621: Fix an error message in mt7621_spi_probe()
x86/resctrl: Fix to restore to original value when re-enabling hardware prefetch register
xsk: Fix backpressure mechanism on Tx
bpf: Disable preemption when increasing per-cpu map_locked
bpf: Propagate error from htab_lock_bucket() to userspace
bpf: Use this_cpu_{inc|dec|inc_return} for bpf_task_storage_busy
Bluetooth: btusb: mediatek: fix WMT failure during runtime suspend
wifi: rtl8xxxu: tighten bounds checking in rtl8xxxu_read_efuse()
wifi: rtw88: add missing destroy_workqueue() on error path in rtw_core_init()
selftests/xsk: Avoid use-after-free on ctx
spi: qup: add missing clk_disable_unprepare on error in spi_qup_resume()
spi: qup: add missing clk_disable_unprepare on error in spi_qup_pm_resume_runtime()
wifi: rtl8xxxu: Fix skb misuse in TX queue selection
spi: meson-spicc: do not rely on busy flag in pow2 clk ops
bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
wifi: rtl8xxxu: gen2: Fix mistake in path B IQ calibration
wifi: rtl8xxxu: Remove copy-paste leftover in gen2_update_rate_mask
wifi: mt76: sdio: fix transmitting packet hangs
wifi: mt76: mt7615: add mt7615_mutex_acquire/release in mt7615_sta_set_decap_offload
wifi: mt76: mt7915: do not check state before configuring implicit beamform
Bluetooth: RFCOMM: Fix possible deadlock on socket shutdown/release
net: fs_enet: Fix wrong check in do_pd_setup
bpf: Ensure correct locking around vulnerable function find_vpid()
Bluetooth: hci_{ldisc,serdev}: check percpu_init_rwsem() failure
netfilter: conntrack: fix the gc rescheduling delay
netfilter: conntrack: revisit the gc initial rescheduling bias
wifi: ath11k: fix number of VHT beamformee spatial streams
x86/microcode/AMD: Track patch allocation size explicitly
x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
spi: dw: Fix PM disable depth imbalance in dw_spi_bt1_probe
spi/omap100k:Fix PM disable depth imbalance in omap1_spi100k_probe
skmsg: Schedule psock work if the cached skb exists on the psock
i2c: mlxbf: support lock mechanism
Bluetooth: hci_core: Fix not handling link timeouts propertly
xfrm: Reinject transport-mode packets through workqueue
netfilter: nft_fib: Fix for rpath check with VRF devices
spi: s3c64xx: Fix large transfers with DMA
wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
vhost/vsock: Use kvmalloc/kvfree for larger packets.
eth: alx: take rtnl_lock on resume
mISDN: fix use-after-free bugs in l1oip timer handlers
sctp: handle the error returned from sctp_auth_asoc_init_active_key
tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
spi: Ensure that sg_table won't be used after being freed
hwmon: (pmbus/mp2888) Fix sensors readouts for MPS Multi-phase mp2888 controller
net: rds: don't hold sock lock when cancelling work from rds_tcp_reset_callbacks()
bnx2x: fix potential memory leak in bnx2x_tpa_stop()
net: wwan: iosm: Call mutex_init before locking it
net/ieee802154: reject zero-sized raw_sendmsg()
once: add DO_ONCE_SLOW() for sleepable contexts
net: mvpp2: fix mvpp2 debugfs leak
drm: bridge: adv7511: fix CEC power down control register offset
drm: bridge: adv7511: unregister cec i2c device after cec adapter
drm/bridge: Avoid uninitialized variable warning
drm/mipi-dsi: Detach devices when removing the host
drm/virtio: Correct drm_gem_shmem_get_sg_table() error handling
drm/bridge: parade-ps8640: Fix regulator supply order
drm/dp_mst: fix drm_dp_dpcd_read return value checks
drm:pl111: Add of_node_put() when breaking out of for_each_available_child_of_node()
ASoC: mt6359: fix tests for platform_get_irq() failure
platform/chrome: fix double-free in chromeos_laptop_prepare()
platform/chrome: fix memory corruption in ioctl
ASoC: tas2764: Allow mono streams
ASoC: tas2764: Drop conflicting set_bias_level power setting
ASoC: tas2764: Fix mute/unmute
platform/x86: msi-laptop: Fix old-ec check for backlight registering
platform/x86: msi-laptop: Fix resource cleanup
platform/chrome: cros_ec_typec: Correct alt mode index
drm/amdgpu: add missing pci_disable_device() in amdgpu_pmops_runtime_resume()
drm/bridge: megachips: Fix a null pointer dereference bug
ASoC: rsnd: Add check for rsnd_mod_power_on
ALSA: hda: beep: Simplify keep-power-at-enable behavior
drm/bochs: fix blanking
drm/omap: dss: Fix refcount leak bugs
drm/amdgpu: Fix memory leak in hpd_rx_irq_create_workqueue()
mmc: au1xmmc: Fix an error handling path in au1xmmc_probe()
ASoC: eureka-tlv320: Hold reference returned from of_find_xxx API
drm/msm/dpu: index dpu_kms->hw_vbif using vbif_idx
drm/msm/dp: correct 1.62G link rate at dp_catalog_ctrl_config_msa()
drm/vmwgfx: Fix memory leak in vmw_mksstat_add_ioctl()
ASoC: codecs: tx-macro: fix kcontrol put
ASoC: da7219: Fix an error handling path in da7219_register_dai_clks()
ALSA: dmaengine: increment buffer pointer atomically
mmc: wmt-sdmmc: Fix an error handling path in wmt_mci_probe()
ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe
ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe
ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe
ASoC: mt6660: Fix PM disable depth imbalance in mt6660_i2c_probe
ALSA: hda/hdmi: Don't skip notification handling during PM operation
memory: pl353-smc: Fix refcount leak bug in pl353_smc_probe()
memory: of: Fix refcount leak bug in of_get_ddr_timings()
memory: of: Fix refcount leak bug in of_lpddr3_get_ddr_timings()
locks: fix TOCTOU race when granting write lease
soc: qcom: smsm: Fix refcount leak bugs in qcom_smsm_probe()
soc: qcom: smem_state: Add refcounting for the 'state->of_node'
ARM: dts: imx6qdl-kontron-samx6i: hook up DDC i2c bus
ARM: dts: turris-omnia: Fix mpp26 pin name and comment
ARM: dts: kirkwood: lsxl: fix serial line
ARM: dts: kirkwood: lsxl: remove first ethernet port
ia64: export memory_add_physaddr_to_nid to fix cxl build error
soc/tegra: fuse: Drop Kconfig dependency on TEGRA20_APB_DMA
arm64: dts: ti: k3-j7200: fix main pinmux range
ARM: dts: exynos: correct s5k6a3 reset polarity on Midas family
ARM: Drop CMDLINE_* dependency on ATAGS
ext4: don't run ext4lazyinit for read-only filesystems
arm64: ftrace: fix module PLTs with mcount
ARM: dts: exynos: fix polarity of VBUS GPIO of Origen
iio: adc: at91-sama5d2_adc: fix AT91_SAMA5D2_MR_TRACKTIM_MAX
iio: adc: at91-sama5d2_adc: check return status for pressure and touch
iio: adc: at91-sama5d2_adc: lock around oversampling and sample freq
iio: adc: at91-sama5d2_adc: disable/prepare buffer on suspend/resume
iio: inkern: only release the device node when done with it
iio: inkern: fix return value in devm_of_iio_channel_get_by_name()
iio: ABI: Fix wrong format of differential capacitance channel ABI.
iio: magnetometer: yas530: Change data type of hard_offsets to signed
RDMA/mlx5: Don't compare mkey tags in DEVX indirect mkey
usb: common: debug: Check non-standard control requests
clk: meson: Hold reference returned by of_get_parent()
clk: oxnas: Hold reference returned by of_get_parent()
clk: qoriq: Hold reference returned by of_get_parent()
clk: berlin: Add of_node_put() for of_get_parent()
clk: sprd: Hold reference returned by of_get_parent()
clk: tegra: Fix refcount leak in tegra210_clock_init
clk: tegra: Fix refcount leak in tegra114_clock_init
clk: tegra20: Fix refcount leak in tegra20_clock_init
HSI: omap_ssi: Fix refcount leak in ssi_probe
HSI: omap_ssi_port: Fix dma_map_sg error check
media: exynos4-is: fimc-is: Add of_node_put() when breaking out of loop
tty: xilinx_uartps: Fix the ignore_status
media: meson: vdec: add missing clk_disable_unprepare on error in vdec_hevc_start()
media: uvcvideo: Fix memory leak in uvc_gpio_parse
media: uvcvideo: Use entity get_cur in uvc_ctrl_set
media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init
RDMA/rxe: Fix "kernel NULL pointer dereference" error
RDMA/rxe: Fix the error caused by qp->sk
misc: ocxl: fix possible refcount leak in afu_ioctl()
fpga: prevent integer overflow in dfl_feature_ioctl_set_irq()
dmaengine: hisilicon: Disable channels when unregister hisi_dma
dmaengine: hisilicon: Fix CQ head update
dmaengine: hisilicon: Add multi-thread support for a DMA channel
dyndbg: fix static_branch manipulation
dyndbg: fix module.dyndbg handling
dyndbg: let query-modname override actual module name
dyndbg: drop EXPORTed dynamic_debug_exec_queries
clk: qcom: sm6115: Select QCOM_GDSC
mtd: devices: docg3: check the return value of devm_ioremap() in the probe
phy: amlogic: phy-meson-axg-mipi-pcie-analog: Hold reference returned by of_get_parent()
phy: phy-mtk-tphy: fix the phy type setting issue
mtd: rawnand: intel: Read the chip-select line from the correct OF node
mtd: rawnand: intel: Remove undocumented compatible string
mtd: rawnand: fsl_elbc: Fix none ECC mode
RDMA/irdma: Align AE id codes to correct flush code and event
RDMA/srp: Fix srp_abort()
RDMA/siw: Always consume all skbuf data in sk_data_ready() upcall.
RDMA/siw: Fix QP destroy to wait for all references dropped.
ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting()
ata: fix ata_id_has_devslp()
ata: fix ata_id_has_ncq_autosense()
ata: fix ata_id_has_dipm()
mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct()
md: Replace snprintf with scnprintf
md/raid5: Ensure stripe_fill happens on non-read IO with journal
md/raid5: Remove unnecessary bio_put() in raid5_read_one_chunk()
RDMA/cm: Use SLID in the work completion as the DLID in responder side
IB: Set IOVA/LENGTH on IB_MR in core/uverbs layers
xhci: Don't show warning for reinit on known broken suspend
usb: gadget: function: fix dangling pnp_string in f_printer.c
drivers: serial: jsm: fix some leaks in probe
serial: 8250: Toggle IER bits on only after irq has been set up
tty: serial: fsl_lpuart: disable dma rx/tx use flags in lpuart_dma_shutdown
phy: qualcomm: call clk_disable_unprepare in the error handling
staging: vt6655: fix some erroneous memory clean-up loops
slimbus: qcom-ngd-ctrl: allow compile testing without QCOM_RPROC_COMMON
firmware: google: Test spinlock on panic path to avoid lockups
serial: 8250: Fix restoring termios speed after suspend
scsi: libsas: Fix use-after-free bug in smp_execute_task_sg()
scsi: iscsi: Rename iscsi_conn_queue_work()
scsi: iscsi: Add recv workqueue helpers
scsi: iscsi: Run recv path from workqueue
scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()
clk: qcom: apss-ipq6018: mark apcs_alias0_core_clk as critical
clk: qcom: gcc-sm6115: Override default Alpha PLL regs
RDMA/rxe: Fix resize_finish() in rxe_queue.c
fsi: core: Check error number after calling ida_simple_get
mfd: intel_soc_pmic: Fix an error handling path in intel_soc_pmic_i2c_probe()
mfd: fsl-imx25: Fix an error handling path in mx25_tsadc_setup_irq()
mfd: lp8788: Fix an error handling path in lp8788_probe()
mfd: lp8788: Fix an error handling path in lp8788_irq_init() and lp8788_irq_init()
mfd: fsl-imx25: Fix check for platform_get_irq() errors
mfd: sm501: Add check for platform_driver_register()
clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
dmaengine: ioat: stop mod_timer from resurrecting deleted timer in __cleanup()
usb: mtu3: fix failed runtime suspend in host only mode
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
clk: vc5: Fix 5P49V6901 outputs disabling when enabling FOD
clk: baikal-t1: Fix invalid xGMAC PTP clock divider
clk: baikal-t1: Add shared xGMAC ref/ptp clocks internal parent
clk: baikal-t1: Add SATA internal ref clock buffer
clk: bcm2835: fix bcm2835_clock_rate_from_divisor declaration
clk: imx: scu: fix memleak on platform_device_add() fails
clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
clk: ast2600: BCLK comes from EPLL
mailbox: mpfs: fix handling of the reg property
mailbox: mpfs: account for mbox offsets while sending
mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg
powerpc/configs: Properly enable PAPR_SCM in pseries_defconfig
powerpc/math_emu/efp: Include module.h
powerpc/sysdev/fsl_msi: Add missing of_node_put()
powerpc/pci_dn: Add missing of_node_put()
powerpc/powernv: add missing of_node_put() in opal_export_attrs()
powerpc: Fix fallocate and fadvise64_64 compat parameter combination
x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5
powerpc: Fix SPE Power ISA properties for e500v1 platforms
powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()
powerpc/pseries/vas: Pass hw_cpu_id to node associativity HCALL
crypto: sahara - don't sleep when in softirq
crypto: hisilicon/zip - fix mismatch in get/set sgl_sge_nr
hwrng: arm-smccc-trng - fix NO_ENTROPY handling
cgroup: Honor caller's cgroup NS when resolving path
hwrng: imx-rngc - Moving IRQ handler registering after imx_rngc_irq_mask_clear()
crypto: qat - fix default value of WDT timer
crypto: hisilicon/qm - fix missing put dfx access
cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
iommu/omap: Fix buffer overflow in debugfs
crypto: akcipher - default implementation for setting a private key
crypto: ccp - Release dma channels before dmaengine unrgister
crypto: inside-secure - Change swab to swab32
crypto: qat - fix DMA transfer direction
cifs: return correct error in ->calc_signature()
iommu/iova: Fix module config properly
tracing: kprobe: Fix kprobe event gen test module on exit
tracing: kprobe: Make gen test module work in arm and riscv
tracing/osnoise: Fix possible recursive locking in stop_per_cpu_kthreads
kbuild: remove the target in signal traps when interrupted
kbuild: rpm-pkg: fix breakage when V=1 is used
crypto: marvell/octeontx - prevent integer overflows
crypto: cavium - prevent integer overflow loading firmware
thermal/drivers/qcom/tsens-v0_1: Fix MSM8939 fourth sensor hw_id
ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
f2fs: fix race condition on setting FI_NO_EXTENT flag
f2fs: fix to account FS_CP_DATA_IO correctly
selftest: tpm2: Add Client.__del__() to close /dev/tpm* handle
fs: dlm: fix race in lowcomms
rcu: Avoid triggering strict-GP irq-work when RCU is idle
rcu: Back off upon fill_page_cache_func() allocation failure
rcu-tasks: Convert RCU_LOCKDEP_WARN() to WARN_ONCE()
ACPI: video: Add Toshiba Satellite/Portege Z830 quirk
ACPI: tables: FPDT: Don't call acpi_os_map_memory() on invalid phys address
cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
MIPS: BCM47XX: Cast memcmp() of function to (void *)
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash
ARM: decompressor: Include .data.rel.ro.local
ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable
x86/entry: Work around Clang __bdos() bug
NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data
NFSD: fix use-after-free on source server when doing inter-server copy
wifi: brcmfmac: fix invalid address access when enabling SCAN log level
bpftool: Clear errno after libcap's checks
ice: set tx_tstamps when creating new Tx rings via ethtool
net: ethernet: ti: davinci_mdio: Add workaround for errata i2329
openvswitch: Fix double reporting of drops in dropwatch
openvswitch: Fix overreporting of drops in dropwatch
tcp: annotate data-race around tcp_md5sig_pool_populated
x86/mce: Retrieve poison range from hardware
wifi: ath9k: avoid uninit memory read in ath9k_htc_rx_msg()
thunderbolt: Add back Intel Falcon Ridge end-to-end flow control workaround
xfrm: Update ipcomp_scratches with NULL when freed
iavf: Fix race between iavf_close and iavf_reset_task
wifi: brcmfmac: fix use-after-free bug in brcmf_netdev_start_xmit()
Bluetooth: btintel: Mark Intel controller to support LE_STATES quirk
regulator: core: Prevent integer underflow
wifi: mt76: mt7921: reset msta->airtime_ac while clearing up hw value
Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()
Bluetooth: hci_sysfs: Fix attempting to call device_add multiple times
can: bcm: check the result of can_send() in bcm_can_tx()
wifi: rt2x00: don't run Rt5592 IQ calibration on MT7620
wifi: rt2x00: set correct TX_SW_CFG1 MAC register for MT7620
wifi: rt2x00: set VGC gain for both chains of MT7620
wifi: rt2x00: set SoC wmac clock register
wifi: rt2x00: correctly set BBP register 86 for MT7620
hwmon: (sht4x) do not overflow clamping operation on 32-bit platforms
net: If sock is dead don't access sock's sk_wq in sk_stream_wait_memory
Bluetooth: L2CAP: Fix user-after-free
r8152: Rate limit overflow messages
drm/nouveau/nouveau_bo: fix potential memory leak in nouveau_bo_alloc()
drm: Use size_t type for len variable in drm_copy_field()
drm: Prevent drm_copy_field() to attempt copying a NULL pointer
drm/komeda: Fix handling of atomic commits in the atomic_commit_tail hook
gpu: lontium-lt9611: Fix NULL pointer dereference in lt9611_connector_init()
drm/amd/display: fix overflow on MIN_I64 definition
udmabuf: Set ubuf->sg = NULL if the creation of sg table fails
drm: bridge: dw_hdmi: only trigger hotplug event on link change
ALSA: usb-audio: Register card at the last interface
drm/vc4: vec: Fix timings for VEC modes
drm: panel-orientation-quirks: Add quirk for Anbernic Win600
platform/chrome: cros_ec: Notify the PM of wake events during resume
platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading
ASoC: SOF: pci: Change DMI match info to support all Chrome platforms
drm/amdgpu: fix initial connector audio value
drm/meson: reorder driver deinit sequence to fix use-after-free bug
drm/meson: explicitly remove aggregate driver at module unload time
mmc: sdhci-msm: add compatible string check for sdm670
drm/dp: Don't rewrite link config when setting phy test pattern
drm/amd/display: Remove interface for periodic interrupt 1
ARM: dts: imx7d-sdb: config the max pressure for tsc2046
ARM: dts: imx6q: add missing properties for sram
ARM: dts: imx6dl: add missing properties for sram
ARM: dts: imx6qp: add missing properties for sram
ARM: dts: imx6sl: add missing properties for sram
ARM: dts: imx6sll: add missing properties for sram
ARM: dts: imx6sx: add missing properties for sram
kselftest/arm64: Fix validatation termination record after EXTRA_CONTEXT
arm64: dts: imx8mq-librem5: Add bq25895 as max17055's power supply
btrfs: dump extra info if one free space cache has more bitmaps than it should
btrfs: scrub: try to fix super block errors
btrfs: don't print information about space cache or tree every remount
ARM: 9242/1: kasan: Only map modules if CONFIG_KASAN_VMALLOC=n
clk: zynqmp: Fix stack-out-of-bounds in strncpy`
media: cx88: Fix a null-ptr-deref bug in buffer_prepare()
media: platform: fix some double free in meson-ge2d and mtk-jpeg and s5p-mfc
clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
usb: host: xhci-plat: suspend and resume clocks
usb: host: xhci-plat: suspend/resume clks for brcm
dmaengine: ti: k3-udma: Reset UDMA_CHAN_RT byte counters to prevent overflow
scsi: 3w-9xxx: Avoid disabling device if failing to enable it
nbd: Fix hung when signal interrupts nbd_start_device_ioctl()
iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
power: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()
staging: vt6655: fix potential memory leak
blk-throttle: prevent overflow while calculating wait time
ata: libahci_platform: Sanity check the DT child nodes number
bcache: fix set_at_max_writeback_rate() for multiple attached devices
soundwire: cadence: Don't overwrite msg->buf during write commands
soundwire: intel: fix error handling on dai registration issues
HID: roccat: Fix use-after-free in roccat_read()
eventfd: guard wake_up in eventfd fs calls as well
md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d
usb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()
usb: musb: Fix musb_gadget.c rxstate overflow bug
arm64: dts: imx8mp: Add snps,gfladj-refclk-lpm-sel quirk to USB nodes
usb: dwc3: core: Enable GUCTL1 bit 10 for fixing termination error after resume bug
Revert "usb: storage: Add quirk for Samsung Fit flash"
staging: rtl8723bs: fix potential memory leak in rtw_init_drv_sw()
staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
scsi: tracing: Fix compile error in trace_array calls when TRACING is disabled
ext2: Use kvmalloc() for group descriptor array
nvme: copy firmware_rev on each init
nvmet-tcp: add bounds check on Transfer Tag
usb: idmouse: fix an uninit-value in idmouse_open
clk: bcm2835: Make peripheral PLLC critical
clk: bcm2835: Round UART input clock up
perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
io_uring/af_unix: defer registered files gc to io_uring release
io_uring: correct pinned_vm accounting
io_uring/rw: fix short rw error handling
io_uring/rw: fix error'ed retry return values
io_uring/rw: fix unexpected link breakage
mm: hugetlb: fix UAF in hugetlb_handle_userfault
net: ieee802154: return -EINVAL for unknown addr type
ALSA: usb-audio: Fix last interface check for registration
blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
net: ethernet: ti: davinci_mdio: fix build for mdio bitbang uses
Revert "net/ieee802154: reject zero-sized raw_sendmsg()"
net/ieee802154: don't warn zero-sized raw_sendmsg()
drm/amd/display: Fix build breakage with CONFIG_DEBUG_FS=n
Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
ext4: continue to expand file system when the target size doesn't reach
thermal: intel_powerclamp: Use first online CPU as control_cpu
gcov: support GCC 12.1 and newer compilers
io-wq: Fix memory leak in worker creation
Linux 5.15.75
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id3dfe8285e830d01df7adb33729e78a75c1cac1a
commit 4bb26f2885ac6930984ee451b952c5a6042f2c0e upstream.
When inode is created and written to using direct IO, there is nothing
to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
truncated later to say 1 byte and written using normal write, we will
try to store the data as inline data. This confuses the code later
because the inode now has both normal block and inline data allocated
and the confusion manifests for example as:
kernel BUG at fs/ext4/inode.c:2721!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
RIP: 0010:ext4_writepages+0x363d/0x3660
RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293
RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180
RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000
RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b
R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128
R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001
FS: 00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0
Call Trace:
<TASK>
do_writepages+0x397/0x640
filemap_fdatawrite_wbc+0x151/0x1b0
file_write_and_wait_range+0x1c9/0x2b0
ext4_sync_file+0x19e/0xa00
vfs_fsync_range+0x17b/0x190
ext4_buffered_write_iter+0x488/0x530
ext4_file_write_iter+0x449/0x1b90
vfs_write+0xbcd/0xf40
ksys_write+0x198/0x2c0
__x64_sys_write+0x7b/0x90
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>
Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing
direct IO write to a file.
Cc: stable@kernel.org
Reported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Tested-by: Tadeusz Struk<tadeusz.struk@linaro.org>
Link: https://lore.kernel.org/r/20220727155753.13969-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To prepare for STATX_DIOALIGN support, make two changes to
fscrypt_dio_supported().
First, remove the filesystem-block-alignment check and make the
filesystems handle it instead. It previously made sense to have it in
fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction
would have to be returned to filesystems. It ends up being simpler if
filesystems handle this part themselves, especially for f2fs which only
allows fs-block-aligned DIO in the first place.
Second, make fscrypt_dio_supported() work on inodes whose encryption key
hasn't been set up yet, by making it set up the key if needed. This is
required for statx(), since statx() doesn't require a file descriptor.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220827065851.135710-4-ebiggers@kernel.org
commit 4fdccaa0d184c202f98d73b24e3ec8eeee88ab8d upstream
Add a done_before argument to iomap_dio_rw that indicates how much of
the request has already been transferred. When the request succeeds, we
report that done_before additional bytes were tranferred. This is
useful for finishing a request asynchronously when part of the request
has already been completed synchronously.
We'll use that to allow iomap_dio_rw to be used with page faults
disabled: when a page fault occurs while submitting a request, we
synchronously complete the part of the request that has already been
submitted. The caller can then take care of the page fault and call
iomap_dio_rw again for the rest of the request, passing in the number of
bytes already tranferred.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Encrypted files traditionally haven't supported DIO, due to the need to
encrypt/decrypt the data. However, when the encryption is implemented
using inline encryption (blk-crypto) instead of the traditional
filesystem-layer encryption, it is straightforward to support DIO.
Therefore, make ext4 support DIO on files that are using inline
encryption. Since ext4 uses iomap for DIO, and fscrypt support was
already added to iomap DIO, this just requires two small changes:
- Let DIO proceed when supported, by checking fscrypt_dio_supported()
instead of assuming that encrypted files never support DIO.
- In ext4_iomap_begin(), use fscrypt_limit_io_blocks() to limit the
length of the mapping in the rare case where a DUN discontiguity
occurs in the middle of an extent. The iomap DIO implementation
requires this, since it assumes that it can submit a bio covering (up
to) the whole mapping, without checking fscrypt constraints itself.
Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Link: https://lore.kernel.org/r/20220128233940.79464-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a done_before argument to iomap_dio_rw that indicates how much of
the request has already been transferred. When the request succeeds, we
report that done_before additional bytes were tranferred. This is
useful for finishing a request asynchronously when part of the request
has already been completed synchronously.
We'll use that to allow iomap_dio_rw to be used with page faults
disabled: when a page fault occurs while submitting a request, we
synchronously complete the part of the request that has already been
submitted. The caller can then take care of the page fault and call
iomap_dio_rw again for the rest of the request, passing in the number of
bytes already tranferred.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Pull ext4 updates from Ted Ts'o:
"In addition to some ext4 bug fixes and cleanups, this cycle we add the
orphan_file feature, which eliminates bottlenecks when doing a large
number of parallel truncates and file deletions, and move the discard
operation out of the jbd2 commit thread when using the discard mount
option, to better support devices with slow discard operations"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
ext4: make the updating inode data procedure atomic
ext4: remove an unnecessary if statement in __ext4_get_inode_loc()
ext4: move inode eio simulation behind io completeion
ext4: Improve scalability of ext4 orphan file handling
ext4: Orphan file documentation
ext4: Speedup ext4 orphan inode handling
ext4: Move orphan inode handling into a separate file
ext4: Support for checksumming from journal triggers
ext4: fix race writing to an inline_data file while its xattrs are changing
jbd2: add sparse annotations for add_transaction_credits()
ext4: fix sparse warnings
ext4: Make sure quota files are not grabbed accidentally
ext4: fix e2fsprogs checksum failure for mounted filesystem
ext4: if zeroout fails fall back to splitting the extent node
ext4: reduce arguments of ext4_fc_add_dentry_tlv
ext4: flush background discard kwork when retry allocation
ext4: get discard out of jbd2 commit kthread contex
ext4: remove the repeated comment of ext4_trim_all_free
ext4: add new helper interface ext4_try_to_trim_range()
ext4: remove the 'group' parameter of ext4_trim_extent
...
JBD2 layer support triggers which are called when journaling layer moves
buffer to a certain state. We can use the frozen trigger, which gets
called when buffer data is frozen and about to be written out to the
journal, to compute block checksums for some buffer types (similarly as
does ocfs2). This avoids unnecessary repeated recomputation of the
checksum (at the cost of larger window where memory corruption won't be
caught by checksumming) and is even necessary when there are
unsynchronized updaters of the checksummed data.
So add superblock and journal trigger type arguments to
ext4_journal_get_write_access() and ext4_journal_get_create_access() so
that frozen triggers can be set accordingly. Also add inode argument to
ext4_walk_page_buffers() and all the callbacks used with that function
for the same purpose. This patch is mostly only a change of prototype of
the above mentioned functions and a few small helpers. Real checksumming
will come later.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210816095713.16537-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Convert ext4 to use mapping->invalidate_lock instead of its private
EXT4_I(inode)->i_mmap_sem. This is mostly search-and-replace. By this
conversion we fix a long standing race between hole punching and read(2)
/ readahead(2) paths that can lead to stale page cache contents.
CC: <linux-ext4@vger.kernel.org>
CC: Ted Tso <tytso@mit.edu>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull ext4 updates from Ted Ts'o:
"New features for ext4 this cycle include support for encrypted
casefold, ensure that deleted file names are cleared in directory
blocks by zeroing directory entries when they are unlinked or moved as
part of a hash tree node split. We also improve the block allocator's
performance on a freshly mounted file system by prefetching block
bitmaps.
There are also the usual cleanups and bug fixes, including fixing a
page cache invalidation race when there is mixed buffered and direct
I/O and the block size is less than page size, and allow the dax flag
to be set and cleared on inline directories"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (32 commits)
ext4: wipe ext4_dir_entry2 upon file deletion
ext4: Fix occasional generic/418 failure
fs: fix reporting supported extra file attributes for statx()
ext4: allow the dax flag to be set and cleared on inline directories
ext4: fix debug format string warning
ext4: fix trailing whitespace
ext4: fix various seppling typos
ext4: fix error return code in ext4_fc_perform_commit()
ext4: annotate data race in jbd2_journal_dirty_metadata()
ext4: annotate data race in start_this_handle()
ext4: fix ext4_error_err save negative errno into superblock
ext4: fix error code in ext4_commit_super
ext4: always panic when errors=panic is specified
ext4: delete redundant uptodate check for buffer
ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()
ext4: make prefetch_block_bitmaps default
ext4: add proc files to monitor new structures
ext4: improve cr 0 / cr 1 group scanning
ext4: add MB_NUM_ORDERS macro
ext4: add mballoc stats proc file
...
Eric has noticed that after pagecache read rework, generic/418 is
occasionally failing for ext4 when blocksize < pagesize. In fact, the
pagecache rework just made hard to hit race in ext4 more likely. The
problem is that since ext4 conversion of direct IO writes to iomap
framework (commit 378f32bab3), we update inode size after direct IO
write only after invalidating page cache. Thus if buffered read sneaks
at unfortunate moment like:
CPU1 - write at offset 1k CPU2 - read from offset 0
iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT);
ext4_readpage();
ext4_handle_inode_extension()
the read will zero out tail of the page as it still sees smaller inode
size and thus page cache becomes inconsistent with on-disk contents with
all the consequences.
Fix the problem by moving inode size update into end_io handler which
gets called before the page cache is invalidated.
Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com>
Fixes: 378f32bab3 ("ext4: introduce direct I/O write using iomap infrastructure")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use the fileattr API to let the VFS handle locking, permission checking and
conversion.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Pass a set of flags to iomap_dio_rw instead of the boolean
wait_for_completion argument. The IOMAP_DIO_FORCE_WAIT flag
replaces the wait_for_completion, but only needs to be passed
when the iocb isn't synchronous to start with to simplify the
callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[djwong: rework xfs_file.c so that we can push iomap changes separately]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
When the first file is opened, ext4 samples the mountpoint of the
filesystem in 64 bytes of the super block. It does so using
strlcpy(), this means that the remaining bytes in the super block
string buffer are untouched. If the mount point before had a longer
path than the current one, it can be reconstructed.
Consider the case where the fs was mounted to "/media/johnjdeveloper"
and later to "/". The super block buffer then contains
"/\x00edia/johnjdeveloper".
This case was seen in the wild and caused confusion how the name
of a developer ands up on the super block of a filesystem used
in production...
Fix this by using strncpy() instead of strlcpy(). The superblock
field is defined to be a fixed-size char array, and it is already
marked using __nonstring in fs/ext4/ext4.h. The consumer of the field
in e2fsprogs already assumes that in the case of a 64+ byte mount
path, that s_last_mounted will not be NUL terminated.
Link: https://lore.kernel.org/r/X9ujIOJG/HqMr88R@mit.edu
Reported-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Protect all superblock modifications (including checksum computation)
with a superblock buffer lock. That way we are sure computed checksum
matches current superblock contents (a mismatch could cause checksum
failures in nojournal mode or if an unjournalled superblock update races
with a journalled one). Also we avoid modifying superblock contents
while it is being written out (which can cause DIF/DIX failures if we
are running in nojournal mode).
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20201216101844.22917-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds main fast commit commit path handlers. The overall
patch can be divided into two inter-related parts:
(A) Metadata updates tracking
This part consists of helper functions to track changes that need
to be committed during a commit operation. These updates are
maintained by Ext4 in different in-memory queues. Following are
the APIs and their short description that are implemented in this
patch:
- ext4_fc_track_link/unlink/creat() - Track unlink. link and creat
operations
- ext4_fc_track_range() - Track changed logical block offsets
inodes
- ext4_fc_track_inode() - Track inodes
- ext4_fc_mark_ineligible() - Mark file system fast commit
ineligible()
- ext4_fc_start_update() / ext4_fc_stop_update() /
ext4_fc_start_ineligible() / ext4_fc_stop_ineligible() These
functions are useful for co-ordinating inode updates with
commits.
(B) Main commit Path
This part consists of functions to convert updates tracked in
in-memory data structures into on-disk commits. Function
ext4_fc_commit() is the main entry point to commit path.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201015203802.3597742-6-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull ext4 updates from Ted Ts'o:
"Improvements to ext4's block allocator performance for very large file
systems, especially when the file system or files which are highly
fragmented. There is a new mount option, prefetch_block_bitmaps which
will pull in the block bitmaps and set up the in-memory buddy bitmaps
when the file system is initially mounted.
Beyond that, a lot of bug fixes and cleanups. In particular, a number
of changes to make ext4 more robust in the face of write errors or
file system corruptions"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits)
ext4: limit the length of per-inode prealloc list
ext4: reorganize if statement of ext4_mb_release_context()
ext4: add mb_debug logging when there are lost chunks
ext4: Fix comment typo "the the".
jbd2: clean up checksum verification in do_one_pass()
ext4: change to use fallthrough macro
ext4: remove unused parameter of ext4_generic_delete_entry function
mballoc: replace seq_printf with seq_puts
ext4: optimize the implementation of ext4_mb_good_group()
ext4: delete invalid comments near ext4_mb_check_limits()
ext4: fix typos in ext4_mb_regular_allocator() comment
ext4: fix checking of directory entry validity for inline directories
fs: prevent BUG_ON in submit_bh_wbc()
ext4: correctly restore system zone info when remount fails
ext4: handle add_system_zone() failure in ext4_setup_system_zone()
ext4: fold ext4_data_block_valid_rcu() into the caller
ext4: check journal inode extents more carefully
ext4: don't allow overlapping system zones
ext4: handle error of ext4_setup_system_zone() on remount
ext4: delete the invalid BUGON in ext4_mb_load_buddy_gfp()
...
In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.
After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:
Running time on unfixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real 0m2.051s
user 0m0.008s
sys 0m2.026s
Running time on fixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real 0m0.471s
user 0m0.004s
sys 0m0.395s
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Running with some debug patches to detect illegal blocking triggered the
extend/unaligned condition in ext4. If ext4 needs to extend the file (and
hence go to buffered IO), or if the app is doing unaligned IO, then ext4
asks the iomap code to wait for IO completion. If the caller asked for
no-wait semantics by setting IOCB_NOWAIT, then ext4 should return -EAGAIN
instead.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/76152096-2bbb-7682-8fce-4cb498bcd909@kernel.dk
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4_mark_inode_dirty() can fail for real reasons. Ignoring its return
value may lead ext4 to ignore real failures that would result in
corruption / crashes. Harden ext4_mark_inode_dirty error paths to fail
as soon as possible and return errors to the caller whenever
appropriate.
One of the possible scnearios when this bug could affected is that
while creating a new inode, its directory entry gets added
successfully but while writing the inode itself mark_inode_dirty
returns error which is ignored. This would result in inconsistency
that the directory entry points to a non-existent inode.
Ran gce-xfstests smoke tests and verified that there were no
regressions.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20200427013438.219117-1-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently we start transaction for mapping every extent for writing
using direct IO. This is unnecessary when we know we are overwriting
already allocated blocks and the overhead of starting a transaction can
be significant especially for multithreaded workloads doing small writes.
Use iomap operations that avoid starting a transaction for direct IO
overwrites.
This improves throughput of 4k random writes - fio jobfile:
[global]
rw=randrw
norandommap=1
invalidate=0
bs=4k
numjobs=16
time_based=1
ramp_time=30
runtime=120
group_reporting=1
ioengine=psync
direct=1
size=16G
filename=file1.0.0:file1.0.1:file1.0.2:file1.0.3:file1.0.4:file1.0.5:file1.0.6:file1.0.7:file1.0.8:file1.0.9:file1.0.10:file1.0.11:file1.0.12:file1.0.13:file1.0.14:file1.0.15:file1.0.16:file1.0.17:file1.0.18:file1.0.19:file1.0.20:file1.0.21:file1.0.22:file1.0.23:file1.0.24:file1.0.25:file1.0.26:file1.0.27:file1.0.28:file1.0.29:file1.0.30:file1.0.31
file_service_type=random
nrfiles=32
from 3018MB/s to 4059MB/s in my test VM running test against simulated
pmem device (note that before iomap conversion, this workload was able
to achieve 3708MB/s because old direct IO path avoided transaction start
for overwrites as well). For dax, the win is even larger improving
throughput from 3042MB/s to 4311MB/s.
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20191218174433.19380-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We were using shared locking only in case of dioread_nolock mount option in case
of DIO overwrites. This mount condition is not needed anymore with current code,
since:-
1. No race between buffered writes & DIO overwrites. Since buffIO writes takes
exclusive lock & DIO overwrites will take shared locking. Also DIO path will
make sure to flush and wait for any dirty page cache data.
2. No race between buffered reads & DIO overwrites, since there is no block
allocation that is possible with DIO overwrites. So no stale data exposure
should happen. Same is the case between DIO reads & DIO overwrites.
3. Also other paths like truncate is protected, since we wait there for any DIO
in flight to be over.
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20191212055557.11151-4-riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Earlier there was no shared lock in DIO read path. But this patch
(16c5468859: ext4: Allow parallel DIO reads)
simplified some of the locking mechanism while still allowing for parallel DIO
reads by adding shared lock in inode DIO read path.
But this created problem with mixed read/write workload. It is due to the fact
that in DIO path, we first start with exclusive lock and only when we determine
that it is a ovewrite IO, we downgrade the lock. This causes the problem, since
we still have shared locking in DIO reads.
So, this patch tries to fix this issue by starting with shared lock and then
switching to exclusive lock only when required based on ext4_dio_write_checks().
Other than that, it also simplifies below cases:-
1. Simplified ext4_unaligned_aio API to ext4_unaligned_io. Previous API was
abused in the sense that it was not really checking for AIO anywhere also it
used to check for extending writes. So this API was renamed and simplified to
ext4_unaligned_io() which actully only checks if the IO is really unaligned.
Now, in case of unaligned direct IO, iomap_dio_rw needs to do zeroing of partial
block and that will require serialization against other direct IOs in the same
block. So we take a exclusive inode lock for any unaligned DIO. In case of AIO
we also need to wait for any outstanding IOs to complete so that conversion from
unwritten to written is completed before anyone try to map the overlapping block.
Hence we take exclusive inode lock and also wait for inode_dio_wait() for
unaligned DIO case. Please note since we are anyway taking an exclusive lock in
unaligned IO, inode_dio_wait() becomes a no-op in case of non-AIO DIO.
2. Added ext4_extending_io(). This checks if the IO is extending the file.
3. Added ext4_dio_write_checks(). In this we start with shared inode lock and
only switch to exclusive lock if required. So in most cases with aligned,
non-extending, dioread_nolock & overwrites, it tries to write with a shared
lock. If not, then we restart the operation in ext4_dio_write_checks(), after
acquiring exclusive lock.
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20191212055557.11151-3-riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch introduces a new direct I/O write path which makes use of
the iomap infrastructure.
All direct I/O writes are now passed from the ->write_iter() callback
through to the new direct I/O handler ext4_dio_write_iter(). This
function is responsible for calling into the iomap infrastructure via
iomap_dio_rw().
Code snippets from the existing direct I/O write code within
ext4_file_write_iter() such as, checking whether the I/O request is
unaligned asynchronous I/O, or whether the write will result in an
overwrite have effectively been moved out and into the new direct I/O
->write_iter() handler.
The block mapping flags that are eventually passed down to
ext4_map_blocks() from the *_get_block_*() suite of routines have been
taken out and introduced within ext4_iomap_alloc().
For inode extension cases, ext4_handle_inode_extension() is
effectively the function responsible for performing such metadata
updates. This is called after iomap_dio_rw() has returned so that we
can safely determine whether we need to potentially truncate any
allocated blocks that may have been prepared for this direct I/O
write. We don't perform the inode extension, or truncate operations
from the ->end_io() handler as we don't have the original I/O 'length'
available there. The ->end_io() however is responsible fo converting
allocated unwritten extents to written extents.
In the instance of a short write, we fallback and complete the
remainder of the I/O using buffered I/O via
ext4_buffered_write_iter().
The existing buffer_head direct I/O implementation has been removed as
it's now redundant.
[ Fix up ext4_dio_write_iter() per Jan's comments at
https://lore.kernel.org/r/20191105135932.GN22379@quack2.suse.cz -- TYT ]
Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/e55db6f12ae6ff017f36774135e79f3e7b0333da.1572949325.git.mbobrowski@mbobrowski.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In preparation for implementing the iomap direct I/O modifications,
the inode extension/truncate code needs to be moved out from the
ext4_iomap_end() callback. For direct I/O, if the current code
remained, it would behave incorrrectly. Updating the inode size prior
to converting unwritten extents would potentially allow a racing
direct I/O read to find unwritten extents before being converted
correctly.
The inode extension/truncate code now resides within a new helper
ext4_handle_inode_extension(). This function has been designed so that
it can accommodate for both DAX and direct I/O extension/truncate
operations.
Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/d41ffa26e20b15b12895812c3cad7c91a6a59bc6.1572949325.git.mbobrowski@mbobrowski.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch introduces a new direct I/O read path which makes use of
the iomap infrastructure.
The new function ext4_do_read_iter() is responsible for calling into
the iomap infrastructure via iomap_dio_rw(). If the read operation
performed on the inode is not supported, which is checked via
ext4_dio_supported(), then we simply fallback and complete the I/O
using buffered I/O.
Existing direct I/O read code path has been removed, as it is now
redundant.
Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/f98a6f73fadddbfbad0fc5ed04f712ca0b799f37.1572949325.git.mbobrowski@mbobrowski.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull ext4 updates from Ted Ts'o:
"Added new ext4 debugging ioctls to allow userspace to get information
about the state of the extent status cache.
Dropped workaround for pre-1970 dates which were encoded incorrectly
in pre-4.4 kernels. Since both the kernel correctly generates, and
e2fsck detects and fixes this issue for the past four years, it'e time
to drop the workaround. (Also, it's not like files with dates in the
distant past were all that common in the first place.)
A lot of miscellaneous bug fixes and cleanups, including some ext4
Documentation fixes. Also included are two minor bug fixes in
fs/unicode"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
unicode: make array 'token' static const, makes object smaller
unicode: Move static keyword to the front of declarations
ext4: add missing bigalloc documentation.
ext4: fix kernel oops caused by spurious casefold flag
ext4: fix integer overflow when calculating commit interval
ext4: use percpu_counters for extent_status cache hits/misses
ext4: fix potential use after free after remounting with noblock_validity
jbd2: add missing tracepoint for reserved handle
ext4: fix punch hole for inline_data file systems
ext4: rework reserved cluster accounting when invalidating pages
ext4: documentation fixes
ext4: treat buffers with write errors as containing valid data
ext4: fix warning inside ext4_convert_unwritten_extents_endio
ext4: set error return correctly when ext4_htree_store_dirent fails
ext4: drop legacy pre-1970 encoding workaround
ext4: add new ioctl EXT4_IOC_GET_ES_CACHE
ext4: add a new ioctl EXT4_IOC_GETSTATE
ext4: add a new ioctl EXT4_IOC_CLEAR_ES_CACHE
jbd2: flush_descriptor(): Do not decrease buffer head's ref count
ext4: remove unnecessary error check
...
Add most of fs-verity support to ext4. fs-verity is a filesystem
feature that enables transparent integrity protection and authentication
of read-only files. It uses a dm-verity like mechanism at the file
level: a Merkle tree is used to verify any block in the file in
log(filesize) time. It is implemented mainly by helper functions in
fs/verity/. See Documentation/filesystems/fsverity.rst for the full
documentation.
This commit adds all of ext4 fs-verity support except for the actual
data verification, including:
- Adding a filesystem feature flag and an inode flag for fs-verity.
- Implementing the fsverity_operations to support enabling verity on an
inode and reading/writing the verity metadata.
- Updating ->write_begin(), ->write_end(), and ->writepages() to support
writing verity metadata pages.
- Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl().
ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
past the end of the file, starting at the first 64K boundary beyond
i_size. This approach works because (a) verity files are readonly, and
(b) pages fully beyond i_size aren't visible to userspace but can be
read/written internally by ext4 with only some relatively small changes
to ext4. This approach avoids having to depend on the EA_INODE feature
and on rearchitecturing ext4's xattr support to support paging
multi-gigabyte xattrs into memory, and to support encrypting xattrs.
Note that the verity metadata *must* be encrypted when the file is,
since it contains hashes of the plaintext data.
This patch incorporates work by Theodore Ts'o and Chandan Rajendra.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Remove unnecessary error check in ext4_file_write_iter(),
because this check will be done in upcoming later function --
ext4_write_checks() -> generic_write_checks()
Change-Id: I7b0ab27f693a50765c15b5eaa3f4e7c38f42e01e
Signed-off-by: shisiyuan <shisiyuan@xiaomi.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull libnvdimm updates from Dan Williams:
"Primarily just the virtio_pmem driver:
- virtio_pmem
The new virtio_pmem facility introduces a paravirtualized
persistent memory device that allows a guest VM to use DAX
mechanisms to access a host-file with host-page-cache. It arranges
for MAP_SYNC to be disabled and instead triggers a host fsync()
when a 'write-cache flush' command is sent to the virtual disk
device.
- Miscellaneous small fixups"
* tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
virtio_pmem: fix sparse warning
xfs: disable map_sync for async flush
ext4: disable map_sync for async flush
dax: check synchronous mapping is supported
dm: enable synchronous dax
libnvdimm: add dax_dev sync flag
virtio-pmem: Add virtio pmem driver
libnvdimm: nd_region flush callback support
libnvdimm, namespace: Drop uuid_t implementation detail
Dont support 'MAP_SYNC' with non-DAX files and DAX files
with asynchronous dax_device. Virtio pmem provides
asynchronous host page cache flush mechanism. We don't
support 'MAP_SYNC' with virtio pmem and ext4.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
According to the chattr man page, "a file with the 'i' attribute
cannot be modified..." Historically, this was only enforced when the
file was opened, per the rest of the description, "... and the file
can not be opened in write mode".
There is general agreement that we should standardize all file systems
to prevent modifications even for files that were opened at the time
the immutable flag is set. Eventually, a change to enforce this at
the VFS layer should be landing in mainline. Until then, enforce this
at the ext4 level to prevent xfstests generic/553 from failing.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: stable@kernel.org
Unaligned AIO must be serialized because the zeroing of partial blocks
of unaligned AIO can result in data corruption in case it's overlapping
another in flight IO.
Currently we wait for all unwritten extents before we submit unaligned
AIO which protects data in case of unaligned AIO is following overlapping
IO. However if a unaligned AIO is followed by overlapping aligned AIO we
can still end up corrupting data.
To fix this, we must make sure that the unaligned AIO is the only IO in
flight by waiting for unwritten extents conversion not just before the
IO submission, but right after it as well.
This problem can be reproduced by xfstest generic/538
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Ext4 needs to serialize unaligned direct AIO because the zeroing of
partial blocks of two competing unaligned AIOs can result in data
corruption.
However it decides not to serialize if the potentially unaligned aio is
past i_size with the rationale that no pending writes are possible past
i_size. Unfortunately if the i_size is not block aligned and the second
unaligned write lands past i_size, but still into the same block, it has
the potential of corrupting the previous unaligned write to the same
block.
This is (very simplified) reproducer from Frank
// 41472 = (10 * 4096) + 512
// 37376 = 41472 - 4096
ftruncate(fd, 41472);
io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
io_submit(io_ctx, 1, &iocbs[1]);
io_submit(io_ctx, 1, &iocbs[2]);
io_getevents(io_ctx, 2, 2, events, NULL);
Without this patch the 512B range from 40960 up to the start of the
second unaligned write (41472) is going to be zeroed overwriting the data
written by the first write. This is a data corruption.
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
*
0000a000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
With this patch the data corruption is avoided because we will recognize
the unaligned_aio and wait for the unwritten extent conversion.
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
*
0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
*
0000b200
Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: e9e3bcecf4 ("ext4: serialize unaligned asynchronous DIO")
Cc: stable@vger.kernel.org
This patch is reworked from an earlier patch that Dan has posted:
https://patchwork.kernel.org/patch/10131727/
VM_MIXEDMAP is used by dax to direct mm paths like vm_normal_page() that
the memory page it is dealing with is not typical memory from the linear
map. The get_user_pages_fast() path, since it does not resolve the vma,
is already using {pte,pmd}_devmap() as a stand-in for VM_MIXEDMAP, so we
use that as a VM_MIXEDMAP replacement in some locations. In the cases
where there is no pte to consult we fallback to using vma_is_dax() to
detect the VM_MIXEDMAP special case.
Now that we have explicit driver pfn_t-flag opt-in/opt-out for
get_user_pages() support for DAX we can stop setting VM_MIXEDMAP. This
also means we no longer need to worry about safely manipulating vm_flags
in a future where we support dynamically changing the dax mode of a
file.
DAX should also now be supported with madvise_behavior(), vma_merge(),
and copy_page_range().
This patch has been tested against ndctl unit test. It has also been
tested against xfstests commit: 625515d using fake pmem created by
memmap and no additional issues have been observed.
Link: http://lkml.kernel.org/r/152847720311.55924.16999195879201817653.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If fs is frozen after mount and before the first file open, the
update of s_last_mounted bypasses freeze protection and prints out
a WARNING splat:
$ mount /vdf
$ fsfreeze -f /vdf
$ cat /vdf/foo
[ 31.578555] WARNING: CPU: 1 PID: 1415 at
fs/ext4/ext4_jbd2.c:53 ext4_journal_check_start+0x48/0x82
[ 31.614016] Call Trace:
[ 31.614997] __ext4_journal_start_sb+0xe4/0x1a4
[ 31.616771] ? ext4_file_open+0xb6/0x189
[ 31.618094] ext4_file_open+0xb6/0x189
If fs is frozen, skip s_last_mounted update.
[backport hint: to apply to stable tree, need to apply also patches
vfs: add the sb_start_intwrite_trylock() helper
ext4: factor out helper ext4_sample_last_mounted()]
Cc: stable@vger.kernel.org
Fixes: bc0b0d6d69 ("ext4: update the s_last_mounted field in the superblock")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Use new return type vm_fault_t for fault handler. For now,
this is just documenting that the function returns a
VM_FAULT value rather than an errno. Once all instances are
converted, vm_fault_t will become a distinct type.
commit 1c8f422059 ("mm: change return type to vm_fault_t")
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>