Apologies in advance, this will be long because I want to provide as much info as possible.
I tried to copy my Witcher 3 GOG setup folder from my Samsung 970 EVO 1TB NVMe SSD (boot drive) to my Samsung 860 QVO 4TB (data drive). The folder is about 49GB. During the copy, Windows threw an error about one of the .bin files (unable to read it, I think) and gave me the option to skip it, which I took. Not long after, the copy speed dropped to about 1MB/s, and after about a minute the PC crashed with a DPC_Watchdog_Violation.
I let it sit for about 5 minutes to give it a chance to "collect some error info" but it never progressed past 0%, so I rebooted. On reboot the PC hung at a black screen for more than a minute before finally coming up with an error about no available boot drives, telling me to reboot and set one in the BIOS. (Sorry, I didn't capture the exact wording.)
Checking in the BIOS, my NVMe drive had vanished and my first boot device was the 860 QVO (which is not a boot drive). I checked all the settings I could find and there was no NVMe drive. I tried CSM on/off and changed the boot options from legacy to UEFI first, and none of the changes helped at all. Every time I rebooted it wouldn't detect the NVMe drive.
I pulled the NVMe drive out of the PC to have a look at it for visible damage but couldn't see any. I left the drive sitting on my desk for a couple of hours while I tried to arrange to test the drive in a friend's PC. I put the drive back into my PC so I could take it to my friend's, tried another boot, and lo and behold it booted into Windows just fine.
I immediately ran Samsung Magician and it says the drive health of all my drives is good and the temps are normal. I can't really tell what I should be looking for in the SMART info to see if something is going wrong. But the "Critical Warning" value is 0 which seems good at least. There are values in the "Media Errors" and "Number of Error Information Log Entries" but I don't know whether they're problematic. I can post the SMART values here if it's of use.
I thought I'd copy files off the NVMe drive to my QVO to make sure they're backed up in case the drive actually dies, and I hit copy errors again on several files (holiday videos of around 100MB each). I skipped them and the rest of the copy finished correctly. I went back and checked those files and they all play just fine in VLC. I tried copying them again individually and they copied just fine. I haven't tried copying the Witcher installers again because I'm a bit worried I'll hose my PC again if I do. I will give it a try after posting this though.
Checking the system event log I can see a bunch (several dozen) of errors: "The device, \Device\Harddisk5\DR5, has a bad block." Harddisk5 is the NVMe drive. Looking at the times, they correspond to the copying I just mentioned. Going back further there seem to be about a dozen similar errors at the time I got the DPC_Watchdog_Violation. The last entry before the crash appears to be a bad block error. The next entry in the log is several hours later and corresponds to when I managed to successfully boot it again.
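For anyone wanting to line up the bad block entries with the copy attempts and the crash more systematically, here is a minimal Python sketch of the idea. It assumes the log entries have already been exported and parsed into (timestamp, message) pairs; the function name `group_bad_block_bursts`, the 10 minute gap and the sample timestamps are all made up for illustration and are not taken from my actual log.

```python
from datetime import datetime, timedelta

# Text that identifies the disk error quoted above.
BAD_BLOCK_MARKER = "has a bad block"


def group_bad_block_bursts(events, max_gap=timedelta(minutes=10)):
    """Group bad-block events into bursts.

    events: iterable of (datetime, message) pairs (hypothetical input,
            e.g. parsed by hand from an exported System log).
    max_gap: events further apart than this start a new burst (assumed value).
    Returns a list of (first_time, last_time, count) tuples, oldest first.
    """
    times = sorted(t for t, msg in events if BAD_BLOCK_MARKER in msg)
    bursts = []
    for t in times:
        if bursts and t - bursts[-1][1] <= max_gap:
            # Close enough to the previous event: extend the current burst.
            start, _, count = bursts[-1]
            bursts[-1] = (start, t, count + 1)
        else:
            # Too far from the previous event (or first event): new burst.
            bursts.append((t, t, 1))
    return bursts


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    msg = r"The device, \Device\Harddisk5\DR5, has a bad block."
    t0 = datetime(2021, 1, 1, 12, 0)
    sample = [
        (t0, msg),
        (t0 + timedelta(minutes=1), msg),
        (t0 + timedelta(hours=3), msg),
    ]
    for first, last, count in group_bad_block_bursts(sample):
        print(f"{first} to {last}: {count} bad block error(s)")
```

Each burst can then be compared by eye against when a copy was running or when the crash happened.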
I've run chkdsk on the NVMe drive and it says "No errors found". I've also just run SFC and it "did not find any integrity violations".
In terms of hardware, I've had the NVMe drive for 2 years and never had problems with it. I've had the QVO drive for well over a year, again with no problems. I've got Kingston 64GB (2x32GB) HyperX Fury 3200MHz DDR4 RAM and have had it for more than 6 months. I recently upgraded the BIOS on the motherboard (Asus Crosshair VII Wifi) so that I could use a Ryzen 5800X on it, but that's been running just fine for a few weeks now. I also switched out a GTX1070 for an Asrock Phantom Gaming D RX6900XT and again that's been running happily for a few weeks. The PSU is "only" a Corsair AX750i, but even at peak I've never seen power draw over 620W (joy of digital monitoring), and that's when benchmarking or gaming. I'm running Windows 10 64-bit, which is up to date. No overclocking, though I am running the DOCP settings for my RAM.
So what should my next step be? I'm trying to figure out whether my NVMe drive is failing or whether there is another hardware problem. Or maybe it's a Windows or driver problem. I suppose it's possible that the crash was just a random thing, but the fact that the NVMe drive disappeared from the BIOS for about two hours, and that it's showing some bad blocks now that I've got it back, worries me. I had an old SATA SSD die on me, and basically you get no warning. So I'm trying to determine whether I should buy a new drive or try to deal with Samsung about warranty on this one.