I’m shopping for a new NVMe SSD for my laptop, and with Linux compatibility being the second deciding factor, I looked up the names of specific drives in the Linux source code and discovered that their controllers have quirks that have to be worked around.
Now, I’ve more or less figured out how the quirks affecting one of the controllers impact its functionality under Linux, but for another controller I have trouble understanding how disabling a particular command limits its functionality, if at all. So I’d like to ask you all: under what circumstances does a host use the command, and can disabling it lower the performance or power efficiency of an affected controller/drive?
To be clear, the quirk workaround I’m talking about is NVME_QUIRK_DISABLE_WRITE_ZEROES.
Which specific NVMe drive to use on Linux hasn’t been a real problem for at least 5 years, especially if you just plan to stick it in a laptop. Just buy a drive from a reputable brand from a reputable source.
I have a WD Black SN770 in my main desktop and it works with no issues. I even have btrfs on it, and some people suggested that btrfs would have some issues with NVMe drives, but here I am over a year later with zero issues. Speed on these things is out of this world.
That’s good news, but what if this affects wear leveling? Or efficiency in some other way, that would result in these dying sooner than expected (still years probably, but… yeah)
I make backups. If this dies, I’ll get another one. They’ve become very cheap lately. 1TB is like $55
Others may still be interested in longevity. Like me. $55 is, while not expensive, not cheap either for me, or at least not in the way that I would be comfortable replacing it every 1-2 years. I’m not swimming in money, and that amount has a better place than a recurring expense. I doubt I am in a minority with this.
It’s good to know what the quirks mean, and whether they imply any longevity problem, because if that turns out to be a problem, it may be cheaper in the long term to buy a better drive.
What were the reported problems of BTRFS on an NVME? There are actually a few upsides to BTRFS on flash: if you enable compression there will be less writing to the flash, and reflinking capabilities means that copy-pastes to the same partition won’t cause extra writes.
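To put a rough number on the compression point, here’s a minimal sketch that compares how many bytes would be left to write after compression. It uses Python’s zlib module purely as a stand-in (zlib is one of the algorithms btrfs can use, but this is not how btrfs itself is invoked), and the function name and sample data are made up for illustration.

```python
import os
import zlib


def compressed_ratio(data: bytes, level: int = 3) -> float:
    """Return compressed size divided by original size (lower means fewer bytes to write)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    text_like = b"hello world\n" * 1000   # highly repetitive, compresses well
    random_like = os.urandom(4096)        # random bytes, essentially incompressible
    print(f"repetitive data ratio: {compressed_ratio(text_like):.3f}")
    print(f"random data ratio:     {compressed_ratio(random_like):.3f}")
```

Already-compressed data (video, archives) behaves like the random case, so the savings depend a lot on what you store.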
I’ve heard about freezing. Screens going black and so on. Never had any of those issues
That’s nonsense and couldn’t be affected by BTRFS short of something catastrophic happening (probably in the kernel).
The main downside for NVMe that I’m aware of is that BTRFS is a lot slower at writing lots of tiny files than something like XFS (which is generally the fastest filesystem available). I’d guess this is due to BTRFS’s metadata overhead, which includes writing two copies of metadata for redundancy. I have a very fast gen 4 NVMe drive and I can get ~2 GB/s tiny file write speed with XFS and ~300 MB/s tiny file write speed with BTRFS. IMO this is negligible though, because tiny files are inherently tiny: 300 MB of tiny files is a lot of files for 1 second of effort. Also, read speed is usually more important than write speed for day-to-day tasks. Large file writes and large/small file reads are roughly the same though, with any small advantages going towards XFS.
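If you want to try a comparison like this yourself, here’s a rough sketch of a tiny-file write test. It is not the benchmark I used for the numbers above; the function name, file count, and file size are all assumptions, and without the final sync you are mostly measuring the page cache rather than the drive.

```python
import os
import tempfile
import time


def tiny_file_write_rate(directory: str, count: int = 1000, size: int = 4096,
                         flush: bool = True) -> float:
    """Write `count` files of `size` bytes into `directory`; return throughput in MB/s."""
    payload = os.urandom(size)  # random payload so compression does not skew the result
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}"), "wb") as fh:
            fh.write(payload)
    if flush and hasattr(os, "sync"):
        os.sync()  # push dirty pages to disk so the drive is actually exercised
    elapsed = time.perf_counter() - start
    return (count * size) / elapsed / 1e6


if __name__ == "__main__":
    # Point `dir=` at a directory on the filesystem you want to measure.
    with tempfile.TemporaryDirectory() as target:
        print(f"{tiny_file_write_rate(target):.1f} MB/s")
```

Run it once on a directory on each filesystem and compare; a single run is noisy, so repeat it a few times.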
BTRFS will never be faster than something like XFS because it has way more features - with how fast NVMEs are I personally think it’s worth trading some of that ludicrous speed for features like data checksumming, compression, and snapshots. On the flip side, BTRFS is a very good idea to run on slow HDDs because its transparent compression will actually increase the drive’s speed, as the act of compressing/decompressing is way faster than the act of reading physical data from rust.
I agree on the “nonsense” part, as I’ve had none of those issues for over a year using this drive. Shit has been amazing. Also, I appreciated the lecture. I didn’t know any of what you said, so thank you. I did try xfs myself, but for my use case, I didn’t see any difference at all. Like nothing. I’m just a casual user who gets into the terminal some times, but that’s about it. So, btrfs works wonders for me with those sweet snapshots. Don’t know if xfs has snapshots, too, but I’m familiar with btrfs and timeshift, so I stuck with it.
XFS doesn’t support snapshots, but it does support reflinking like BTRFS. Reflinks allow data to be shared fully or partially between two files, which means that technically with a lot of elbow grease you could probably write a snapshotting system for XFS built on reflinks. There’s actually a “filesystem” named Stratis that takes vanilla XFS and layers a ton of modern features from e.g. BTRFS/ZFS onto it. Unfortunately it’s not as fast as XFS because of these features so it’s not a silver bullet yet.
tl;dr, BTRFS’s features are useful for most users, and I wouldn’t worry about filesystem speed unless you’ve got a very specific usecase like a database.
Nice. Thank you. I’ve learned a ton from just a couple of comments. Much appreciated
Isn’t that referring to the ATA write zeros command used to erase the whole drive?
No, it is referring to the NVMe write zeroes command that is used to set a range of blocks to zero. It seems like it is related to deallocate/TRIM functionality but I can only find documentation about the command without a good definition of why it would be used.
Some drives say they support it but don’t really, or it negatively affects performance, so the kernel has quirks for them.
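If you want to see whether the kernel will issue Write Zeroes to a given drive, the block layer exposes its limit in sysfs as queue/write_zeroes_max_bytes. Here’s a minimal sketch that reads it; the device name nvme0n1 is an assumption, and the sysfs_root parameter only exists so the function can be pointed at a test directory. A value of 0 means the kernel won’t send the command, which can be because of the quirk or simply because the controller doesn’t advertise support.

```python
from pathlib import Path
from typing import Optional


def write_zeroes_max_bytes(device: str, sysfs_root: str = "/sys/block") -> Optional[int]:
    """Return the kernel's Write Zeroes limit in bytes for `device`, or None if unreadable."""
    attr = Path(sysfs_root) / device / "queue" / "write_zeroes_max_bytes"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        return None  # attribute missing (no such device, old kernel) or not a number


if __name__ == "__main__":
    limit = write_zeroes_max_bytes("nvme0n1")  # assumed device name, adjust as needed
    if limit is None:
        print("could not read the limit for this device")
    elif limit == 0:
        print("the kernel will not issue Write Zeroes to this device")
    else:
        print(f"Write Zeroes enabled, up to {limit} bytes per request")
```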
What if you try to wipe an NVMe drive for which this quirk is enabled by default in the kernel? Does that mean that even if you used something like the ‘erase device’ function in GNOME Disks on said drive, it would in fact not actually completely zero the drive? What if you use GNOME Disks to wipe a partition on said drive?
Or does this quirk refer to an entirely different operation?
Different feature. NVMe drives include a format command that can zero the drive or do a more in-depth erasure, and newer drives also have the sanitize option. I think this command just lets the system tell the drive to zero a range of blocks in one go, instead of having to send all of the zeroes itself.
Thanks for the answer! So do I assume correctly then that things like using the ‘overwrite data’ option when running BleachBit, or wiping/erasing partitions or disks with zeroes in GNOME Disks or GParted, actually do zero the data like normal, even if the quirk is enabled?
It depends on how those tools work at a lower level. Usually it is better to use a secure erase feature in the drive itself, or the NVMe format command. Every drive has some sort of implementation.
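To illustrate the difference: when the command is unavailable, zeroing a range means pushing real zero-filled buffers through the normal write path, and as far as I understand that is roughly what Linux falls back to. A minimal sketch of that fallback on any seekable file object follows; the function name and chunk size are made up for the example, and this is not the kernel’s actual code.

```python
import io


def zero_range_by_writing(f, offset: int, length: int, chunk_size: int = 1 << 20) -> int:
    """Zero `length` bytes starting at `offset` by writing real zero bytes.

    Returns the number of write calls issued, to show that the cost grows
    with the size of the range when no offloaded zeroing command is used.
    """
    zeros = bytes(chunk_size)  # one reusable buffer of zero bytes
    f.seek(offset)
    remaining = length
    writes = 0
    while remaining > 0:
        n = min(remaining, chunk_size)
        f.write(zeros[:n])
        remaining -= n
        writes += 1
    return writes


if __name__ == "__main__":
    buf = io.BytesIO(b"\xff" * 10)
    calls = zero_range_by_writing(buf, offset=2, length=5, chunk_size=2)
    print(calls, buf.getvalue())
```

With Write Zeroes the host would instead send one small command describing the range, and no zero data would cross the bus.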
SSDs have a lot of tricks to increase performance and longevity, things like wear leveling or not actually writing all those zeroes to the NAND, so writing all zeroes may leave a lot of data untouched on the actual drive while the firmware keeps a tally.
I am coming at this from a more general angle, so this may not be as applicable.
When a repeated series of inputs is written to a drive, a number of optimisations can be applied. If the user wants a large number of zeros written, the drive’s firmware can use a short cut. Instead of writing those zeros, it can make a note that a number of blocks, at a location, are all zero.
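As a toy illustration of that shortcut (not how any real firmware is written, and all names here are invented), a drive could keep a table of blocks known to be zero instead of storing them:

```python
class ToyDrive:
    """Toy model: all-zero blocks are noted in a table instead of being stored."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.stored = {}          # block number -> data actually "written to flash"
        self.zero_blocks = set()  # block numbers only noted as being all zero

    def write(self, lba: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        if data.count(0) == len(data):
            # Shortcut: drop any stored copy and just remember the block is zero.
            self.stored.pop(lba, None)
            self.zero_blocks.add(lba)
        else:
            self.zero_blocks.discard(lba)
            self.stored[lba] = bytes(data)

    def read(self, lba: int) -> bytes:
        # Noted-zero and never-written blocks both read back as zeros.
        return self.stored.get(lba, bytes(self.block_size))


if __name__ == "__main__":
    drive = ToyDrive(block_size=4)
    drive.write(0, b"abcd")
    drive.write(1, bytes(4))
    print(len(drive.stored), sorted(drive.zero_blocks), drive.read(1))
```

The host reads zeros back either way, which is why it can’t tell from the outside whether the old data was physically overwritten.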
This becomes apparent if one runs fio against the raw block devices versus a software layer like LVM. Some operations against some layers will be unreasonably fast. There is a lot of nuance to this.
My read of the quirk is that it works around an incompatibility between what Linux expects and what the firmware does. Enable the quirk only if the dmesg errors are present. Do not expect that the drive has been zeroed. If you want your data to be secure when you dispose of the drive, then use encryption.
Thanks a lot! This clarifies it for me, and if I understand correctly, it shouldn’t be a concern for me since my laptop isn’t used for data-intensive computing.
What if you try to wipe an NVMe drive for which this quirk is enabled by default in the kernel? Does that mean that even if you used something like the ‘erase device’ function in GNOME Disks on said drive, it would in fact not actually completely zero the drive? What if you use GNOME Disks to wipe a partition on said drive?
Or does this quirk refer to an entirely different operation?
A bit outside of my knowledge, but I understand that to be a long-standing issue. Wiping issues are a good reason to encrypt an NVMe drive.
This page suggests that nvme-cli has a secure format command that will do it. http://blog.pythonaro.com/2018/05/how-to-securely-wipe-nvme-drive.html?m=1
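For reference, the nvme-cli command in question is `nvme format` with a secure erase setting. The sketch below only builds and prints the command instead of running it, because running it destroys all data on the device. The helper name and the device path are assumptions; as far as I know, --ses=1 requests a user data erase and --ses=2 a cryptographic erase, and not every drive supports both.

```python
def build_nvme_format_cmd(device: str, ses: int = 1) -> list:
    """Build (but do not run) an `nvme format` command line with a secure erase setting."""
    if ses not in (0, 1, 2):
        raise ValueError("ses must be 0 (none), 1 (user data erase) or 2 (crypto erase)")
    return ["nvme", "format", device, f"--ses={ses}"]


if __name__ == "__main__":
    # Dry run only: print the command so it can be reviewed before anyone runs it as root.
    print(" ".join(build_nvme_format_cmd("/dev/nvme0n1")))  # assumed device path
```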
Hopefully, someone more knowledgeable will also tag me in their response.