-
Notifications
You must be signed in to change notification settings - Fork 1.8k
zfs-2.3.2 patchset #17214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: zfs-2.3-release
Are you sure you want to change the base?
zfs-2.3.2 patchset #17214
Conversation
…zfs#17137) Adding fields to zinject_record_t unexpectedly extended zfs_cmd_t, preventing some things working properly with 2.3.1 userspace tools against 2.3.0 kernel module. This reverts commit fabdd50. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
Update the META file to reflect compatibility with the 6.14 kernel. Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: @ImAwsumm
The operator can override TEST_BASE_DIR by setting its source var FILEDIR through zfs-tests.sh -d. There were a handful of cases where this was not honoured. By default FILEDIR (and so TEST_BASE_DIR) is /var/tmp, so there should be no functional change if the operator does nothing. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
The default file vdevs, constrained binpath and temporary runfiles were all explicitly places in /var/tmp. Instead, put them under FILEDIR, which is set from -d and defaults to /var/tmp. TEST_BASE_DIR is also initialised from FILEDIR, which means all data for the run will now end up under the operator-specified data dir. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
The config file value overrides any set by the operator, making it quite difficult to put the test output elsewhere. The default is /var/tmp/test_results (via BASEDIR in test-runner) so this shouldn't change anything for the default case. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
The default outputdir had a timestamp appended in TestRun.__init__, and then the timestamp was unconditionally applied again after the runfile had been loaded, assuming that an outputdir would be set in the runfile too. If the runfile didn't have an outputdir, then the outputdir would get a second timestamp appended. Further, if test groups or individual tests themselves specificed an outputdir, those would be set on their config, but would not get a timestamp appended. It's not entirely clear if that's wrong or not, but it is certainly not consistent with the rest. To clean all this up, change things to append a timestamp to a received outputdir (from arg or runfile) before setting it in any TestRun, TestGroup or Test object. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
Many tests use mktemp to create temporary files and dirs, which will usually put them in /tmp unless instructed otherwise. This had led to many tests trying to give mktemp a useful temp path in ad-hoc ways, and others just using it directly without knowing they're potentially leaving stuff lying around. So we set TMPDIR to FILEDIR, which makes the simplest uses of mktemp put things in the wanted work dir. Included here is a hack to get TMPDIR into the test. If a test has to be run as a different user (most of them), it is run through sudo. ld.so from glibc will not pass TMPDIR to a setuid program, so instead we re-set TMPDIR after sudo before running the target command. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
In all cases, rely on mktemp itself to make the best decision about where to place the file or directory. In all cases, that decision will be $TMPDIR, which we have set globally. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
Most of these are trying to use TMPDIR to put their work files somewhere sensible. Now that we've set up correctly, they can all just use mktemp to do the job. In a couple of places cleaning up temp files wasn't being done correctly, which has been fixed. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]>
The new Fast Dedup feature has a lot of moving parts, and only some of them have tests. We have some tests for prefetch and quota, and a generic ZAP shrinking test, but we don't have anything for the pruning command or specific to DDT zap shrinking. Here we add a couple small new tests for zpool ddtprune and DDT-specific ZAP shrinking. Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes openzfs#17049
PR openzfs#14161 made spa_do_crypt_objset_mac_abd() to ignore MAC errors if local MAC can not be calculated at the time. But it does not mean we should also ignore portable MAC errors there. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17122
We had a case where we were autoreplacing a disk and zpool_prepare_disk failed for some reason, and ZED didn't log the return code. This commit logs the code. Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes openzfs#17124
since 4.10, bio->bi_opf needs to be checked to determine all kinds of flush requests. this was the case prior to the commit referenced below, but the order of ifdefs was not the usual one (newest up top), which might have caused this to slip through. this fixes a regression when using zvols as Qemu block devices, but might have broken other use cases as well. the symptoms are that all sync writes from within a VM configured to use such a virtual block devices are ignored and treated as async writes by the host ZFS layer. this can be verified using fio in sync mode inside the VM, for example with fio \ --filename=/dev/sda --ioengine=libaio --loops=1 --size=10G \ --time_based --runtime=60 --group_reporting --stonewall --name=cc1 \ --description="CC1" --rw=write --bs=4k --direct=1 --iodepth=1 \ --numjobs=1 --sync=1 which shows an IOPS number way above what the physical device underneath supports, with "zpool iostat -r 1" on the hypervisor side showing no sync IO occuring during the benchmark. with the regression fixed, both fio inside the VM and the IO stats on the host show the expected numbers. Fixes: 846b598 "config: remove HAVE_REQ_OP_* and HAVE_REQ_*" Signed-off-by: Fabian-Gruenbichler <[email protected]> Co-authored-by: Alexander Motin <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
The reward for good work is more work. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Closed openzfs#17141
Now instead of crashing when attempting to read the corrupt block pointer, ZFS will return ECKSUM, in a stack that looks like this: ``` none:set-error zfs.ko`arc_read+0x1d82 zfs.ko`dbuf_read+0xa8c zfs.ko`dmu_buf_hold_array_by_dnode+0x292 zfs.ko`dmu_read_uio_dnode+0x47 zfs.ko`zfs_read+0x2d5 zfs.ko`zfs_freebsd_read+0x7b kernel`VOP_READ_APV+0xd0 kernel`vn_read+0x20e kernel`vn_io_fault_doio+0x45 kernel`vn_io_fault1+0x15e kernel`vn_io_fault+0x150 kernel`dofileread+0x80 kernel`sys_read+0xb7 kernel`amd64_syscall+0x424 kernel`0xffffffff810633cb ``` This patch should hopefully also prevent such corrupt block pointers from being written to disk in the first place. And in zdb, don't crash when printing a block pointer with no valid DVAs. If a block pointer isn't embedded yet doesn't have any valid DVAs, that's a data corruption bug. zdb should be able to handle the situation gracefully. Finally, remove an extra check for gang blocks in SNPRINTF_BLKPTR. This check, which compares the asizes of two different DVAs within the same BP, was added by illumos-gate commit b24ab67[^1], and I can't understand why. It doesn't appear to do anything useful, so remove it. [^1]: illumos/illumos-gate@b24ab67 Fixes openzfs#17077 Sponsored by: ConnectWise Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Alek Pinchuk <[email protected]> Signed-off-by: Alan Somers <[email protected]> Closes openzfs#17078
This statx(2) mask returns the alignment restrictions for O_DIRECT access on the given file. We're expected to return both memory and IO alignment. For memory, it's always PAGE_SIZE. For IO, we return the current block size for the file, which is the required alignment for an arbitrary block, and for the first block we'll fall back to the ARC when necessary, so it should always work. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#16972
Implementation of DDT pruning introduced verification of DVAs in a block pointer during ddt_lookup() to not by mistake free previous pruned incarnation of the entry. But when writing a new block in zio_ddt_write() we might have the DVAs only from override pointer, which may never have "D" flag to be confused with pruned DDT entry, and we'll abandon those DVAs if we find a matching entry in DDT. This fixes deduplication for blocks written via dmu_sync() for purposes of indirect ZIL write records, that I have tested. And I suspect it might actually allow deduplication for Direct I/O, even though in an odd way -- first write block directly and then delete it later during TXG commit if found duplicate, which part I haven't tested. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17120
* FreeBSD 12 is EoL. Drop it. * Use the latest FreeBSD 13 and 14 versions. * Add FreeBSD 15.0-CURRENT. * Use the current python version. Sponsored by: ConnectWise Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Alan Somers <[email protected]> Closes openzfs#17139
Force receive (zfs receive -F) can rollback or destroy snapshots and file systems that do not exist on the sending side (see zfs-receive man page). This means an user having the receive permission can effectively delete data on receiving side, even if such user does not have explicit rollback or destroy permissions. This patch adds the receive:append permission, which only permits limited, non-forced receive. Behavior for users with full receive permission is not changed in any way. Fixes openzfs#16943 Reviewed-by: Ameer Hamza <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Gionatan Danti <[email protected]> Closes openzfs#17015
Looking pretty good. Still hewing very closely to master, which I like for now :) Please include:
Consider the following:
I've no skin in any of these; they just look small and important in their contexts. Maybe the last one has unintended performance consequences, which maybe isn't good for a stable series, but I think it's defensible. If you want the lot, here's a branch ready to go: |
@tonyhutter please include #17199 (87f8bf6) and #17219 (88e3885) |
9a39991
to
3038b1e
Compare
Just checking https://github.com/openzfs/zfs/commits/master/ . Between Apr 3, 2025 and today are some commits regarding bugfixing, manpage, quotas.... if they are not "too new", maybe pull them in if applicable ? |
The trivial fix for ZED would be great too 😀: 189dc26 |
Signed-off-by: Attila Fülöp <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: SHENGYI HONG <[email protected]> Closes openzfs#17101
FreeBSD nowadays supports FPU in the kernel on powerpc*, so enable it. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Piotr Kubaj <[email protected]> Closes openzfs#17191
Since spa_dspace accounts only normal allocation class space, spa_nonallocating_dspace should do the same. Otherwise we may get negative overflow or respective assertion spa_update_dspace() if removed special/dedup vdev is bigger than all normal class space. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes openzfs#17183
FreeBSD kernel's WITNESS code detected lock ordering violation in spa_vdev_remove_cancel_sync(). It took svr_lock while holding ms_lock, which is opposite to other places. I was thinking to resolve it similar to openzfs#17145, but looking closer I don't think we even need svr_lock at that point, since we already asserted svr_allocd_segs is empty, and we don't need to add there segments we are going to call free_mapped_segment_cb for. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17164
cmd/zinject/zinject.c: - use PRIu64 when printing uint64_t tests/zfs-tests/cmd/clonefile.c: - use an unsigned long long to store result from strtoull() - use %jd for printing off_t, %zu for size_t, %zd for ssize_t tests/zfs-tests/tests/functional/vdev_disk/page_alignment.c: - use %zx to print size_t Discovered when compiling on FreeBSD i386. Signed-off-by: Martin Matuska <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: @ImAwsumm
When opening a vdev and setting the nonrot property, we used to wait for each child to be opened before examining its nonrot property. When the change was made to open vdevs asynchronously, we didn't move the nonrot check out of the main loop. As a result, the nonrot property is almost always set to false, regardless of the actual type of the underlying disks. The fix is simply to move the nonrot check to a separate loop after the taskq has been waited for. Sponsored-by: Klara, Inc. Sponsored-by: Eshtek, Inc. Signed-off-by: Paul Dagnelie <[email protected]> Co-authored-by: Paul Dagnelie <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Alexander Motin <[email protected]>
- don't include foreign machine assembly files - reduce diff to FreeBSD module Makefile Discovered in FreeBSD port filesystems/openzfs-kmod Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Martin Matuska <[email protected]> Closes openzfs#17219
Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
When a dedup write fails, we try to roll the DDT entry back to a known good state. However, this also rolls the refcounts and the last-update time back to the state they were at when we started this write. This doesn't appear to be able to cause any refcount leaks (after the fix in 17123). This PR prevents that from happening by only rolling back the parts of the DDT entry that have been updated by the write so far. Sponsored-by: iXsystems, Inc. Sponsored-by: Klara, Inc. Signed-off-by: Paul Dagnelie <[email protected]> Co-authored-by: Paul Dagnelie <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
…fs#15972). (openzfs#17213) The problem was identified in handling of the zpool get state command line arguments. A pointer vdev was used to point to the argv[1], and its address set to cb.cb_vdevs.cb_names(pointer to array of strings) so any increment to cb_names resulted in a segfault. Fix covers a special case of root parameter at argv[1] and remaining cases are handled by passing in the argv + 1, which allows cb_names iteration of next command line arguments (vdevs). Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Attila Fülöp <[email protected]> Signed-off-by: Syed Shahrukh Hussain <[email protected]>
Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Richard Kojedzinszky <[email protected]> Closes openzfs#17208
Debian requires libtirpc-dev. Update our debian/control file to match Debian's upstream one. Closes: openzfs#17197 Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: @manfromafar
The 6.0 kernel removes the 'migratepage' VFS op. Check for migratepage. Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]
The tiniest typo in dd2a46b (openzfs#17106) broke it, by setting the wrong var with the test var, resulting in it always producing "no". Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Alexander Motin <[email protected]> Reviewed by: Attila Fülöp <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17236
zfs_notify_email will now include an empty line separating the header from the body of the email in case the subject is not provided via a command line argument. This is necessary for programs like sendmail to function correctly (everything up to the first empty line is interpreted as header, which previously resulted in either missing message parts or unsent emails) Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Felix Schmidt <[email protected]> Closed openzfs#17238
IVs != 96 bits get hashed with GHASH to bring them to 96 bits. Any call to GHASH will mix the ghash state in gcm_ghash. This is expected to be zero at first use in an encrypt or decrypt operation, so it needs to be zeroed after using GHASH in setup. gcm_init() does this, but gcm_avx_init() zeroed it before setup, not after, resulting in incorrect encrypt/decrypt results when using AVX GCM with an IV != 96 bits. OpenZFS _always_ uses a 96 bit IV (ZIO_DATA_IV_LEN) so this will never have been hit in any real-world use, which is extremely fortunate, as we would have incorrectly-encrypted data on-disk. Still, as long as we have this code here we should make sure it's correct. Thanks-to: Joel Low <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Attila Fülöp <[email protected]>
Fix build errors on Fedora 42 like: module/zcommon/zfs_valstr.c:193:16: error: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (3 chars into 2 available) The arrays in zpool_vdev_os.c and zfs_valstr.c don't need to be NULL terminated, but we do so to make GCC happy. Closes: openzfs#17242 Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]>
d201d49
to
49c5bd3
Compare
@tonyhutter maybe #17255 can be considered for inclusion? Thanks. |
@tonyhutter, Quick question: At what point do we say "we have collected enough bugfixes for 2.3.2" and go into doing a release candidate for 2.3.2? Typically, I wait till .2 patchset release to upgrade a big project like ZFS. So, just curious as to how far along is it. You don't have to answer if its too much bother. |
Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]>
META file and changelog updated. Signed-off-by: Tony Hutter <[email protected]>
49c5bd3
to
d9c3bd6
Compare
@devsk there's no hard and fast rule. Typically these patchset PRs are out for about 3 weeks to get feedback and gather additional last-minute patches. I don't have anything else in mind to pull in for this one. We will need to build the new zfs-release RPMs for Fedora 42 though, which is on my TODO list. |
@shodanshok #17255 has not been merged yet, so no plans for inclusion at this point. |
@tonyhutter How come the betas are not used to verify compatibility, instead waiting until release to take action? |
I admit to some naivety here so please bare with. I am mostly interested in your thoughts, so that I may better understand the processes here. If this isn't the forum for it (it likely isn't), just let me know. ( <3 ) Would it be possible to split the patch sets into more normalized sets? One for general ZFS patches, one for Linux current release, and another prepping for Linux next (or a variation thereof)? Would this effect your current branching, tagging, and build strategy? I think it may lead to a higher release cadence, but one where linux compat PR's could be merged and released in closer cadence with linux current release. I am one of the foolish who runs root-on-enc-zfs on a rolling distro (arch), with a systemd initramfs... as my daily driver. I don't really mind being behind for long, I just keep a handy lts kernel ready incase I botch an update. I'm comfortable playing with DKMS and getting zfs-current to compile for myself, but I rather enjoy knowing the ZFS releases are well tested and version tracked with my package manager. |
Motivation and Context
Proposed zfs-2.3.2 patchset
Description
Bump META file to support 6.14 kernel, support STATX_DIOALIGN, misc fixes.
How Has This Been Tested?
Runners will test
Types of changes
Checklist:
Signed-off-by
.