Changes in 5.15.22
drm/i915: Disable DSB usage for now
selinux: fix double free of cond_list on error paths
audit: improve audit queue handling when "audit=1" on cmdline
ipc/sem: do not sleep with a spin lock held
spi: stm32-qspi: Update spi registering
ASoC: hdmi-codec: Fix OOB memory accesses
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx()
ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx()
ALSA: usb-audio: Correct quirk for VF0770
ALSA: hda: Fix UAF of leds class devs at unbinding
ALSA: hda: realtek: Fix race at concurrent COEF updates
ALSA: hda/realtek: Add quirk for ASUS GU603
ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks
ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset)
ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows
btrfs: don't start transaction for scrub if the fs is mounted read-only
btrfs: fix deadlock between quota disable and qgroup rescan worker
btrfs: fix use-after-free after failure to create a snapshot
Revert "fs/9p: search open fids first"
drm/nouveau: fix off by one in BIOS boundary checking
drm/i915/adlp: Fix TypeC PHY-ready status readout
drm/amd/pm: correct the MGpuFanBoost support for Beige Goby
drm/amd/display: watermark latencies is not enough on DCN31
drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels
nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts()
mm/debug_vm_pgtable: remove pte entry from the page table
mm/pgtable: define pte_index so that preprocessor could recognize it
mm/kmemleak: avoid scanning potential huge holes
block: bio-integrity: Advance seed correctly for larger interval sizes
dma-buf: heaps: Fix potential spectre v1 gadget
IB/hfi1: Fix AIP early init panic
Revert "fbcon: Disable accelerated scrolling"
fbcon: Add option to enable legacy hardware acceleration
mptcp: fix msk traversal in mptcp_nl_cmd_set_flags()
Revert "ASoC: mediatek: Check for error clk pointer"
KVM: arm64: Avoid consuming a stale esr value when SError occur
KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
RDMA/cma: Use correct address when leaving multicast group
RDMA/ucma: Protect mc during concurrent multicast leaves
RDMA/siw: Fix refcounting leak in siw_create_qp()
IB/rdmavt: Validate remote_addr during loopback atomic tests
RDMA/siw: Fix broken RDMA Read Fence/Resume logic.
RDMA/mlx4: Don't continue event handler after memory allocation failure
ALSA: usb-audio: initialize variables that could ignore errors
ALSA: hda: Fix signedness of sscanf() arguments
ALSA: hda: Skip codec shutdown in case the codec is not registered
iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()
iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
spi: bcm-qspi: check for valid cs before applying chip select
spi: mediatek: Avoid NULL pointer crash in interrupt
spi: meson-spicc: add IRQ check in meson_spicc_probe
spi: uniphier: fix reference count leak in uniphier_spi_probe()
IB/hfi1: Fix tstats alloc and dealloc
IB/cm: Release previously acquired reference counter in the cm_id_priv
net: ieee802154: hwsim: Ensure proper channel selection at probe time
net: ieee802154: mcr20a: Fix lifs/sifs periods
net: ieee802154: ca8210: Stop leaking skb's
netfilter: nft_reject_bridge: Fix for missing reply from prerouting
net: ieee802154: Return meaningful error codes from the netlink helpers
net/smc: Forward wakeup to smc socket waitqueue after fallback
net: stmmac: dwmac-visconti: No change to ETHER_CLOCK_SEL for unexpected speed request.
net: stmmac: properly handle with runtime pm in stmmac_dvr_remove()
net: macsec: Fix offload support for NETDEV_UNREGISTER event
net: macsec: Verify that send_sci is on when setting Tx sci explicitly
net: stmmac: dump gmac4 DMA registers correctly
net: stmmac: ensure PTP time register reads are consistent
drm/kmb: Fix for build errors with Warray-bounds
drm/i915/overlay: Prevent divide by zero bugs in scaling
drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled
ASoC: fsl: Add missing error handling in pcm030_fabric_probe
ASoC: xilinx: xlnx_formatter_pcm: Make buffer bytes multiple of period bytes
ASoC: simple-card: fix probe failure on platform component
ASoC: cpcap: Check for NULL pointer after calling of_get_child_by_name
ASoC: max9759: fix underflow in speaker_gain_control_put()
ASoC: codecs: wcd938x: fix incorrect used of portid
ASoC: codecs: lpass-rx-macro: fix sidetone register offsets
ASoC: codecs: wcd938x: fix return value of mixer put function
pinctrl: sunxi: Fix H616 I2S3 pin data
pinctrl: intel: Fix a glitch when updating IRQ flags on a preconfigured line
pinctrl: intel: fix unexpected interrupt
pinctrl: bcm2835: Fix a few error paths
scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe
nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.
gve: fix the wrong AdminQ buffer queue index check
bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
selftests/exec: Remove pipe from TEST_GEN_FILES
selftests: futex: Use variable MAKE instead of make
tools/resolve_btfids: Do not print any commands when building silently
e1000e: Separate ADP board type from TGP
rtc: cmos: Evaluate century appropriate
kvm: add guest_state_{enter,exit}_irqoff()
kvm/arm64: rework guest entry logic
perf: Copy perf_event_attr::sig_data on modification
perf stat: Fix display of grouped aliased events
perf/x86/intel/pt: Fix crash with stop filters in single-range mode
x86/perf: Default set FREEZE_ON_SMI for all
EDAC/altera: Fix deferred probing
EDAC/xgene: Fix deferred probing
ext4: prevent used blocks from being allocated during fast commit replay
ext4: modify the logic of ext4_mb_new_blocks_simple
ext4: fix error handling in ext4_restore_inline_data()
ext4: fix error handling in ext4_fc_record_modified_inode()
ext4: fix incorrect type issue during replay_del_range
net: dsa: mt7530: make NET_DSA_MT7530 select MEDIATEK_GE_PHY
cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
tools include UAPI: Sync sound/asound.h copy with the kernel sources
gpio: idt3243x: Fix an ignored error return from platform_get_irq()
gpio: mpc8xxx: Fix an ignored error return from platform_get_irq()
selftests: nft_concat_range: add test for reload with no element add/del
selftests: netfilter: check stateless nat udp checksum fixup
Linux 5.15.22
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9143b858b768a8497c1df9440a74d8c105c32271
commit 897026aaa73eb2517dfea8d147f20ddb0b813044 upstream.
While running "./check -I 200 generic/475" it sometimes gives below
kernel BUG(). Ideally we should not call ext4_write_inline_data() if
ext4_create_inline_data() has failed.
<log snip>
[73131.453234] kernel BUG at fs/ext4/inline.c:223!
<code snip>
212 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
213 void *buffer, loff_t pos, unsigned int len)
214 {
<...>
223 BUG_ON(!EXT4_I(inode)->i_inline_off);
224 BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);
This patch handles the error and prints out a emergency msg saying potential
data loss for the given inode (since we couldn't restore the original
inline_data due to some previous error).
[ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30)
Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conditionally remove all cached extents belonging to an inode
when truncating its inline data. It's only necessary to attempt to
remove cached extents when a conversion from inline to extent storage
has been initiated (!EXT4_STATE_MAY_INLINE_DATA). This avoids
unnecessary es lock overhead in the more common inline case.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210819144927.25163-2-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix a bug in how we update i_disksize, and the error path in
inline_data_end. Finally, drop an unnecessary creation of a journal
handle which was only needed for inline data, which can give us a
large performance gain in delayed allocation writes.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now that the inline_data file write end procedure are falled into the
common write end functions, it is not clear. Factor them out and do
some cleanup. This patch also drop ext4_da_write_inline_data_end()
and switch to use ext4_write_inline_data_end() instead because we also
need to do the same error processing if we failed to write data into
inline entry.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-4-yi.zhang@huawei.com
Current error path of ext4_write_inline_data_end() is not correct.
Firstly, it should pass out the error value if ext4_get_inode_loc()
return fail, or else it could trigger infinite loop if we inject error
here. And then it's better to add inode to orphan list if it return fail
in ext4_journal_stop(), otherwise we could not restore inline xattr
entry after power failure. Finally, we need to reset the 'ret' value if
ext4_write_inline_data_end() return success in ext4_write_end() and
ext4_journalled_write_end(), otherwise we could not get the error return
value of ext4_journal_stop().
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-3-yi.zhang@huawei.com
JBD2 layer support triggers which are called when journaling layer moves
buffer to a certain state. We can use the frozen trigger, which gets
called when buffer data is frozen and about to be written out to the
journal, to compute block checksums for some buffer types (similarly as
does ocfs2). This avoids unnecessary repeated recomputation of the
checksum (at the cost of larger window where memory corruption won't be
caught by checksumming) and is even necessary when there are
unsynchronized updaters of the checksummed data.
So add superblock and journal trigger type arguments to
ext4_journal_get_write_access() and ext4_journal_get_create_access() so
that frozen triggers can be set accordingly. Also add inode argument to
ext4_walk_page_buffers() and all the callbacks used with that function
for the same purpose. This patch is mostly only a change of prototype of
the above mentioned functions and a few small helpers. Real checksumming
will come later.
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210816095713.16537-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The location of the system.data extended attribute can change whenever
xattr_sem is not taken. So we need to recalculate the i_inline_off
field since it mgiht have changed between ext4_write_begin() and
ext4_write_end().
This means that caching i_inline_off is probably not helpful, so in
the long run we should probably get rid of it and shrink the in-memory
ext4 inode slightly, but let's fix the race the simple way for now.
Cc: stable@kernel.org
Fixes: f19d5870cb ("ext4: add normal write support for inline data")
Reported-by: syzbot+13146364637c7363a7de@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This brings the ext4 casefolded encryption implementation in line with
the patches that have gone upstream.
This is equivalent to reverting
"ANDROID-ext4-Handle-casefolding-with-encryption.patch"
"ANDROID-ext4-Optimize-match-for-casefolded-encrypted-dirs.patch"
and applying the equivalent upstream patches.
This now advertises the feature in sysfs under
/sys/fs/ext4/features/encrypted_casefold
Change-Id: Ia8f437c5f137f16edf0a90c090a226d10c9e5660
Bug: 161184936
Signed-off-by: Daniel Rosenberg <drosen@google.com>
This adds support for encryption with casefolding.
Since the name on disk is case preserving, and also encrypted, we can no
longer just recompute the hash on the fly. Additionally, to avoid
leaking extra information from the hash of the unencrypted name, we use
siphash via an fscrypt v2 policy.
The hash is stored at the end of the directory entry for all entries
inside of an encrypted and casefolded directory apart from those that
deal with '.' and '..'. This way, the change is backwards compatible
with existing ext4 filesystems.
[ Changed to advertise this feature via the file:
/sys/fs/ext4/features/encrypted_casefold -- TYT ]
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210319073414.1381041-2-drosen@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Steps on the way to 5.10-rc1
Resolves conflicts in:
fs/ext4/dir.c
fs/ext4/ext4.h
fs/ext4/namei.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I397787046920175eb183fa5342d2923f819bb2f4
This adds support for encryption with casefolding.
Since the name on disk is case preserving, and also encrypted, we can no
longer just recompute the hash on the fly. Additionally, to avoid
leaking extra information from the hash of the unencrypted name, we use
siphash via an fscrypt v2 policy.
The hash is stored at the end of the directory entry for all entries
inside of an encrypted and casefolded directory apart from those that
deal with '.' and '..'. This way, the change is backwards compatible
with existing ext4 filesystems.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Test: Boots, /data/media is case insensitive
Bug: 138322712
Change-Id: I07354e3129aa07d309fbe36c002fee1af718f348
ext4_mark_inode_dirty() can fail for real reasons. Ignoring its return
value may lead ext4 to ignore real failures that would result in
corruption / crashes. Harden ext4_mark_inode_dirty error paths to fail
as soon as possible and return errors to the caller whenever
appropriate.
One of the possible scnearios when this bug could affected is that
while creating a new inode, its directory entry gets added
successfully but while writing the inode itself mark_inode_dirty
returns error which is ignored. This would result in inconsistency
that the directory entry points to a non-existent inode.
Ran gce-xfstests smoke tests and verified that there were no
regressions.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20200427013438.219117-1-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Using a separate function, ext4_set_errno() to set the errno is
problematic because it doesn't do the right thing once
s_last_error_errorcode is non-zero. It's also less racy to set all of
the error information all at once. (Also, as a bonus, it shrinks code
size slightly.)
Link: https://lore.kernel.org/r/20200329020404.686965-1-tytso@mit.edu
Fixes: 878520ac45 ("ext4: save the error code which triggered...")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4") into
android-mainline
Handle the ext4 merge issues in one small merge to make it more obvious.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I988ab727a579da9475cdf907032e233a403b6fa1
This allows the cause of an ext4_error() report to be categorized
based on whether it was triggered due to an I/O error, or an memory
allocation error, or other possible causes. Most errors are caused by
a detected file system inconsistency, so the default code stored in
the superblock will be EXT4_ERR_EFSCORRUPTED.
Link: https://lore.kernel.org/r/20191204032335.7683-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
To make the 5.4-rc1 merge easier, merge at a prerelease point in time
before the final release happens.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I29b683c837ed1a3324644dbf9bf863f30740cd0b
Currently when the call to ext4_htree_store_dirent fails the error return
variable 'ret' is is not being set to the error code and variable count is
instead, hence the error code is not being returned. Fix this by assigning
ret to the error return code.
Addresses-Coverity: ("Unused value")
Fixes: 8af0f08227 ("ext4: fix readdir error in the case of inline_data+dir_index")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Move the calculation of the location of the dirent tail into
initialize_dirent_tail(). Also prefix the function with ext4_ to fix
kernel namepsace polution.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Functions such as ext4_dirent_csum_verify() and ext4_dirent_csum_set()
don't actually operate on a directory entry, but a directory block.
And while they take a struct ext4_dir_entry *dirent as an argument, it
had better be the first directory at the beginning of the direct
block, or things will go very wrong.
Rename the following functions so that things make more sense, and
remove a lot of confusing casts along the way:
ext4_dirent_csum_verify -> ext4_dirblock_csum_verify
ext4_dirent_csum_set -> ext4_dirblock_csum_set
ext4_dirent_csum -> ext4_dirblock_csum
ext4_handle_dirty_dirent_node -> ext4_handle_dirty_dirblock
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Adds tracepoints in ext4/f2fs/mpage to track readpages/buffered
write()s. This allows us to track files that are being read/written
to PIDs.
Bug: 120445624
Change-Id: I44476230324e9397e292328463f846af4befbd6d
[joelaf: Needed for storaged fsync accounting ("storaged --uid" and
"storaged --task".)]
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
[AmitP: Folded following android-4.9 commit changes into this patch
a5c4dbb05ab7 ("ANDROID: Replace spaces by '_' for some
android filesystem tracepoints.")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[astrachan: Folded 63066f4acf92 ("ANDROID: fs: Refactor FS
readpage/write tracepoints.") into this patch
Signed-off-by: Alistair Strachan <astrachan@google.com>
This patch implements the actual support for case-insensitive file name
lookups in ext4, based on the feature bit and the encoding stored in the
superblock.
A filesystem that has the casefold feature set is able to configure
directories with the +F (EXT4_CASEFOLD_FL) attribute, enabling lookups
to succeed in that directory in a case-insensitive fashion, i.e: match
a directory entry even if the name used by userspace is not a byte per
byte match with the disk name, but is an equivalent case-insensitive
version of the Unicode string. This operation is called a
case-insensitive file name lookup.
The feature is configured as an inode attribute applied to directories
and inherited by its children. This attribute can only be enabled on
empty directories for filesystems that support the encoding feature,
thus preventing collision of file names that only differ by case.
* dcache handling:
For a +F directory, Ext4 only stores the first equivalent name dentry
used in the dcache. This is done to prevent unintentional duplication of
dentries in the dcache, while also allowing the VFS code to quickly find
the right entry in the cache despite which equivalent string was used in
a previous lookup, without having to resort to ->lookup().
d_hash() of casefolded directories is implemented as the hash of the
casefolded string, such that we always have a well-known bucket for all
the equivalencies of the same string. d_compare() uses the
utf8_strncasecmp() infrastructure, which handles the comparison of
equivalent, same case, names as well.
For now, negative lookups are not inserted in the dcache, since they
would need to be invalidated anyway, because we can't trust missing file
dentries. This is bad for performance but requires some leveraging of
the vfs layer to fix. We can live without that for now, and so does
everyone else.
* on-disk data:
Despite using a specific version of the name as the internal
representation within the dcache, the name stored and fetched from the
disk is a byte-per-byte match with what the user requested, making this
implementation 'name-preserving'. i.e. no actual information is lost
when writing to storage.
DX is supported by modifying the hashes used in +F directories to make
them case/encoding-aware. The new disk hashes are calculated as the
hash of the full casefolded string, instead of the string directly.
This allows us to efficiently search for file names in the htree without
requiring the user to provide an exact name.
* Dealing with invalid sequences:
By default, when a invalid UTF-8 sequence is identified, ext4 will treat
it as an opaque byte sequence, ignoring the encoding and reverting to
the old behavior for that unique file. This means that case-insensitive
file name lookup will not work only for that file. An optional bit can
be set in the superblock telling the filesystem code and userspace tools
to enforce the encoding. When that optional bit is set, any attempt to
create a file name using an invalid UTF-8 sequence will fail and return
an error to userspace.
* Normalization algorithm:
The UTF-8 algorithms used to compare strings in ext4 is implemented
lives in fs/unicode, and is based on a previous version developed by
SGI. It implements the Canonical decomposition (NFD) algorithm
described by the Unicode specification 12.1, or higher, combined with
the elimination of ignorable code points (NFDi) and full
case-folding (CF) as documented in fs/unicode/utf8_norm.c.
NFD seems to be the best normalization method for EXT4 because:
- It has a lower cost than NFC/NFKC (which requires
decomposing to NFD as an intermediary step)
- It doesn't eliminate important semantic meaning like
compatibility decompositions.
Although:
- This implementation is not completely linguistic accurate, because
different languages have conflicting rules, which would require the
specialization of the filesystem to a given locale, which brings all
sorts of problems for removable media and for users who use more than
one language.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent()
while still holding the xattr semaphore. This is not necessary and it
triggers a circular lockdep warning. This is because
fiemap_fill_next_extent() could trigger a page fault when it writes
into page which triggers a page fault. If that page is mmaped from
the inline file in question, this could very well result in a
deadlock.
This problem can be reproduced using generic/519 with a file system
configuration which has the inline_data feature enabled.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
In case of error, ext4_try_to_write_inline_data() should unlock
and release the page it holds.
Fixes: f19d5870cb ("ext4: add normal write support for inline data")
Cc: stable@kernel.org # 3.8
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Variable retries is not initialized in ext4_da_write_inline_data_begin()
which can lead to nondeterministic number of retries in case we hit
ENOSPC. Initialize retries to zero as we do everywhere else.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: bc0ca9df3b ("ext4: retry allocation when inline->extent conversion failed")
Cc: stable@kernel.org
A specially crafted file system can trick empty_inline_dir() into
reading past the last valid entry in a inline directory, and then run
into the end of xattr marker. This will trigger a divide by zero
fault. Fix this by using the size of the inline directory instead of
dir->i_size.
Also clean up error reporting in __ext4_check_dir_entry so that the
message is clearer and more understandable --- and avoids the division
by zero trap if the size passed in is zero. (I'm not sure why we
coded it that way in the first place; printing offset % size is
actually more confusing and less useful.)
https://bugzilla.kernel.org/show_bug.cgi?id=200933
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Wen Xu <wen.xu@gatech.edu>
Cc: stable@vger.kernel.org
The inline data code was updating the raw inode directly; this is
problematic since if metadata checksums are enabled,
ext4_mark_inode_dirty() must be called to update the inode's checksum.
In addition, the jbd2 layer requires that get_write_access() be called
before the metadata buffer is modified. Fix both of these problems.
https://bugzilla.kernel.org/show_bug.cgi?id=200443
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Pull ext4 bugfixes from Ted Ts'o:
"Bug fixes for ext4; most of which relate to vulnerabilities where a
maliciously crafted file system image can result in a kernel OOPS or
hang.
At least one fix addresses an inline data bug could be triggered by
userspace without the need of a crafted file system (although it does
require that the inline data feature be enabled)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: check superblock mapped prior to committing
ext4: add more mount time checks of the superblock
ext4: add more inode number paranoia checks
ext4: avoid running out of journal credits when appending to an inline file
jbd2: don't mark block as modified if the handle is out of credits
ext4: never move the system.data xattr out of the inode body
ext4: clear i_data in ext4_inode_info when removing inline data
ext4: include the illegal physical block in the bad map ext4_error msg
ext4: verify the depth of extent tree in ext4_find_extent()
ext4: only look at the bg_flags field if it is valid
ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
ext4: always check block group bounds in ext4_init_block_bitmap()
ext4: always verify the magic number in xattr blocks
ext4: add corruption check in ext4_xattr_set_entry()
ext4: add warn_on_error mount option
Use a separate journal transaction if it turns out that we need to
convert an inline file to use an data block. Otherwise we could end
up failing due to not having journal credits.
This addresses CVE-2018-10883.
https://bugzilla.kernel.org/show_bug.cgi?id=200071
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
When converting from an inode from storing the data in-line to a data
block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
copy of the i_blocks[] array. It was not clearing copy of the
i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
used by ext4_map_blocks().
This didn't matter much if we are using extents, since the extents
header would be invalid and thus the extents could would re-initialize
the extents tree. But if we are using indirect blocks, the previous
contents of the i_blocks array will be treated as block numbers, with
potentially catastrophic results to the file system integrity and/or
user data.
This gets worse if the file system is using a 1k block size and
s_first_data is zero, but even without this, the file system can get
quite badly corrupted.
This addresses CVE-2018-10881.
https://bugzilla.kernel.org/show_bug.cgi?id=200015
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org