My 970 EVO and 980 Pro are reporting 98% and 97% respectively. However, I've noticed that whichever one is the OS drive ends up having bit rot corrupt a few files here and there every month (after running scandisk and checksumming against my backup). It's always old data files that haven't been accessed in a long time. But when used as a storage drive instead of an OS drive, the issue goes away.
That should never happen unless the storage device is on its way out. Whether it's used as an OS drive or a storage drive should make no difference, unless there's some kind of firmware bug. Did you check that it's running the latest firmware? Unless I could confirm that this was a known issue addressed in a newer firmware version, I would replace the device immediately.
The SSD's own controller is supposed to do a patrol scrub of the blocks, and rewrite any with too many correctable errors.
Speaking of which, I wonder what SMART reports about the drive. Does it indicate any uncorrectable reads? Because there are other ways to get bad data - it could be an interface-level error or even a problem on the host hardware or software.
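For anyone who wants to check those counters, here is a rough Python sketch that pulls the error-related fields out of smartctl's JSON output (smartctl -j -a <device>, from smartmontools). The field names are the ones smartctl uses for NVMe drives as far as I know, the function names are my own, and /dev/nvme0 is only a placeholder device path, so treat it as a starting point rather than a finished tool.

```python
import json
import subprocess


def summarize_nvme_health(smartctl_json: str) -> dict:
    """Extract error-related counters from `smartctl -j -a <device>` JSON text.

    Missing fields come back as None (e.g. for a SATA drive, which reports
    SMART attributes in a different structure).
    """
    report = json.loads(smartctl_json)
    log = report.get("nvme_smart_health_information_log", {})
    return {
        "media_errors": log.get("media_errors"),
        "error_log_entries": log.get("num_err_log_entries"),
        "percentage_used": log.get("percentage_used"),
        "critical_warning": log.get("critical_warning"),
    }


def read_device(device: str = "/dev/nvme0") -> dict:
    """Run smartctl on a device (placeholder path; usually needs root)."""
    # smartctl uses its exit status as a bitmask, so a non-zero status is
    # not treated as a failure here.
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True,
        text=True,
    )
    return summarize_nvme_health(result.stdout)
```

A non-zero media_errors count points at the drive itself. If it stays at zero while files keep getting corrupted, that makes the interface or the host a more likely suspect.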
On Linux, filesystems like Btrfs and OpenZFS keep their own checksums and will tell you when you're getting bad data - so, no need to manually check against backups.
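If you're stuck on a filesystem without checksums, the manual approach can at least be scripted. Below is a minimal Python sketch of a checksum manifest: record a SHA-256 per file once, then re-verify later. To be clear, this is a file-level stand-in and not how Btrfs or OpenZFS work internally (they checksum at the block level); the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose hash has changed."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

Build the manifest while the data is known to be good, store it somewhere other than the drive being checked, and run find_mismatches periodically. A hash mismatch on a file you never modified is the same signal a checksumming filesystem would give you on read or during a scrub.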
Maybe the controller doesn't parity check older files often enough, so errors build up on unused data faster than they get corrected? If the controller checks frequently used files more often and rarely used files less often, then I could see this happening.
No. NAND flash is based on charge storage. Since the cell charge in modern NAND chips will decay much sooner than the warranty period of the drive, the controller must periodically scan & refresh the drive contents. A little bit like DRAM refreshes, but we're talking about days or weeks, rather than milliseconds.
This is also why you can't use modern SSDs for cold storage: the controller can only refresh the cells while the drive has power.