[SOLVED] Very Slow Speeds Between Two M.2 PCIe NVME SSD Drives (Samsung 970 EVO Plus 1TB)

May 3, 2020
Hello.

I am getting terrible transfer speeds between two M.2 NVMe SSD drives. Both drives work perfectly independently and hit their advertised maximum speeds. Both also get excellent transfer speeds to other SATA 6 Gb/s SSD drives and USB 3.0 drives, and even network share writes to my UNRAID server are solid. However, when I try to transfer files (huge or small) between the two M.2 NVMe drives directly, my write speeds get destroyed and hover around 60 MB/s.

I'll run through my specs below, but there has to be a bottleneck somewhere. I have tried everything I know to do (seasoned IT guy), and this machine is a gaming beast; I have no other issues except this oddity. I searched the internet for hours and cannot find anyone asking about this particular problem, so after 15 years of watching from the stands, I figured I would finally register and make a post. I am not sure what read/write speeds to expect between two of these drives on the same motherboard, but one would assume it has to be higher than 60 MB/s! Can you even get the maximum of around 3,000 MB/s between these drives? Let me know if you can!

I know that each of these drives, in the specific M.2 slots I chose on this particular motherboard, is getting a full x4 PCIe Gen 3 link (Samsung Magician confirmed). All drivers, firmware, OS updates, etc. are done. I have even swapped the drives between slots on the motherboard. CrystalDiskMark 6 shows everything is normal, and Samsung Magician shows everything is normal.

Core Hardware:
  • OS: Windows 10 Home 64-bit
  • Motherboard: GIGABYTE Z390 AORUS ULTRA LGA 1151
  • CPU: Intel Core i9-9900K
  • RAM: Corsair Vengeance Pro 32GB DDR4 RGB
  • PSU: Corsair HX-1200i
  • Graphics Card: EVGA Nvidia GeForce RTX 2080 Ti
Disk Hardware:
  • OS Drive (SATA6): Samsung SSD 850 EVO 250GB (Firmware EMT02B6Q)
  • Drive 1 (M.2 NVME PCIe): Samsung SSD 970 EVO Plus 1TB PCIe Gen 3 X4 (Firmware 1B2QEXM7)
  • Drive 2 (M.2 NVME PCIe): Samsung SSD 970 EVO Plus 1TB PCIe Gen 3 X4 (Firmware 1B2QEXM7)
  • Other Drives: I have three Samsung SSD 850 EVO 1TB drives in a 3TB RAID 0 setup; transfers to this volume are very fast. I also have three Seagate Backup Plus 8TB USB 3.0 drives attached. Transfers to any of these drives from either of the two NVMe drives are as expected.
Finally, transfers between ALL OTHER DRIVES in EVERY COMBINATION are normal. I have run through exhaustive testing. The only drop in speed I see is between these two 970 EVO Plus NVMe drives.

I am running Drive 1 in slot "M2M" (above the graphics card) and Drive 2 in slot "M2A" (below the graphics card). See the link below for a complete breakdown. In these slots (and without using SATA ports 4/5), I have no motherboard bottlenecks and nothing disabled for my hardware configuration.
View: https://www.reddit.com/r/gigabyte/comments/a1u57t/z390_auros_ultra_m2_question/

I am dying to figure this out. Let me know what you need. I have lots of documentation and screenshots ready. Thanks in advance for any help!
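If it helps, here is a rough Python sketch of how I can time a raw chunked copy between the two drives, completely outside of Explorer or any copy utility. The D:\ and E:\ paths and the file name are just placeholders for my setup:

# Minimal timing sketch for a chunked copy between the two NVMe drives.
# The paths below are placeholders; point them at the two 970 EVO Plus volumes.
import os
import time

SRC = r"D:\bench\test_4GB.bin"   # large test file on NVMe drive 1 (placeholder path)
DST = r"E:\bench\test_4GB.bin"   # destination on NVMe drive 2 (placeholder path)
CHUNK = 16 * 1024 * 1024         # copy in 16 MiB chunks

def timed_copy(src, dst, chunk=CHUNK):
    copied = 0
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the data actually reached the destination drive
    return copied / 1e6 / (time.perf_counter() - start)  # MB/s

print(f"copy throughput: {timed_copy(SRC, DST):.1f} MB/s")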
 
This isn't the cause, but be aware that on Intel consumer boards (like yours with the Z390 chipset) all of the M.2 sockets hang off the chipset. This means they share bandwidth with everything else on the chipset, up to a maximum of roughly 3.5 GB/s for sequential traffic. It's obviously different for smaller files and such, but the link is full-duplex/bi-directional, so writing from one drive to the other should still hit up to 3 GB/s, within the SLC cache at least. You'll notice I'm using a lot of qualifiers here. The M2A and M2M sockets don't have any conflicts beyond the SATA ports, as you noted (M2P conflicts with a PCIe slot). You obviously want to be using Samsung's NVMe driver, which it seems you are, so you have all the obvious potential issues covered.

So: run two instances of CrystalDiskMark, one on each drive, simultaneously, and see if there is a conflict there. Then try to copy files from one drive to the other in Safe Mode or in some similar environment (e.g. a bootable live CD). Basically, rule out software; that's always my first suggestion. Sometimes there are issues with drives having the same identifier, although that should not be a problem on Windows 10.
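If you would rather script the simultaneous-load test than juggle two CrystalDiskMark windows, a rough Python sketch along these lines reads from one drive while writing to the other and reports each side's throughput (the paths and sizes are placeholders, not anything from your actual setup):

# Rough concurrency check: sequential read on one NVMe drive while sequentially
# writing to the other, to see whether simultaneous load over the chipset link
# is where the slowdown appears. Paths and sizes are placeholders.
import os
import threading
import time

READ_SRC  = r"D:\bench\big_read.bin"    # existing large file on drive 1 (placeholder)
WRITE_DST = r"E:\bench\big_write.bin"   # scratch file on drive 2 (placeholder)
CHUNK = 16 * 1024 * 1024                # 16 MiB per I/O
WRITE_TOTAL = 8 * 1024 ** 3             # write 8 GiB of incompressible data

def seq_read(path, results):
    done, start = 0, time.perf_counter()
    with open(path, "rb") as f:
        while True:
            buf = f.read(CHUNK)
            if not buf:
                break
            done += len(buf)
    results["read MB/s"] = done / 1e6 / (time.perf_counter() - start)

def seq_write(path, results):
    buf = os.urandom(CHUNK)             # incompressible data so the SSD cannot compress it away
    done, start = 0, time.perf_counter()
    with open(path, "wb") as f:
        while done < WRITE_TOTAL:
            f.write(buf)
            done += len(buf)
        f.flush()
        os.fsync(f.fileno())
    results["write MB/s"] = done / 1e6 / (time.perf_counter() - start)

results = {}
threads = [threading.Thread(target=seq_read, args=(READ_SRC, results)),
           threading.Thread(target=seq_write, args=(WRITE_DST, results))]
for t in threads: t.start()
for t in threads: t.join()
print(results)   # run each function alone first for a baseline, then both together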
 
May 3, 2020
This isn't the cause, but be aware that on Intel consumer boards (like yours with the Z390 chipset) all of the M.2 sockets hang off the chipset. This means they share bandwidth with everything else on the chipset, up to a maximum of roughly 3.5 GB/s for sequential traffic. It's obviously different for smaller files and such, but the link is full-duplex/bi-directional, so writing from one drive to the other should still hit up to 3 GB/s, within the SLC cache at least. You'll notice I'm using a lot of qualifiers here. The M2A and M2M sockets don't have any conflicts beyond the SATA ports, as you noted (M2P conflicts with a PCIe slot). You obviously want to be using Samsung's NVMe driver, which it seems you are, so you have all the obvious potential issues covered.

So: run two instances of CrystalDiskMark, one on each drive, simultaneously, and see if there is a conflict there. Then try to copy files from one drive to the other in Safe Mode or in some similar environment (e.g. a bootable live CD). Basically, rule out software; that's always my first suggestion. Sometimes there are issues with drives having the same identifier, although that should not be a problem on Windows 10.

Maxxify,

Thanks for this suggestion. I really appreciate it. I ran these tests both simultaneously and independently, and the exact same behavior I describe is happening within CrystalDiskMark 6. Individually, each drive hits maximum read/write speeds; when both run together, there are huge dips in performance on both drives. I was seeing this when copying/moving files between the drives, and it is interesting that it can also be triggered by sheer I/O on both drives at the same time. Makes sense, though...

The lowest Seq Q32T1 (9 passes, 4 GB) simultaneous read/write results are 1216.6/999.1 MB/s, as opposed to independent maxes of 3563.8/2844.0 MB/s. Even with these impacted speeds, though, I should still be seeing faster transfers than a locked 60 MB/s between these drives. I would add screenshots, but apparently this forum does not allow local hosting of images.
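To put a number on it: even the worst simultaneous figure above (999.1 MB/s for writes) is more than 16 times the locked 60 MB/s I see on actual file copies, so the simultaneous-load dip by itself cannot explain the copy behavior.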

So in my opinion it is one of 3 things.
  • Motherboard Hardware Limitation
    • As manufacturer advertised, and in the current slots, this should not be happening.
    • I have swapped these 2 drives in all combinations of the native X3 M.2 slots, and there is no change.
    • I have purchased 2 DIFFERENT Samsung 970 EVO Plus 1TB drives (NO change, so that rules out the drives).
      • I'll be throwing these into my UNRAID server (45 Drives Storinator Q30 Turbo) via PCIe NVMe adapters to make a RAID cache pool.
  • Windows 10 Software Issue
    • All OS/drivers/firmware are up to date.
    • Device Manager Looks Perfect.
    • Samsung Magician Looks Perfect.
    • I have even uninstalled the Samsung NVMe drivers, back to OS Default, and reinstalled the Samsung driver. No change.
    • I may just rebuild the OS clean and report back...
  • BIOS Configuration
    • I am not running the latest firmware, but I see no updates in the release notes pertaining to NVMe.
    • It is possible this is the issue, but I will have to dig further.
    • I am throwing this build into a new Corsair Crystal Series 680X RGB (Freaking Amazing Case BTW) soon, so I will update the BIOS then.
My other option is to just put down the $600 and pick up the MSI MEG Z390 GODLIKE motherboard, which is the best setup out there for this CPU/socket. I have other uses for the GODLIKE, so it wouldn't be a waste of money. I could live with simply not transferring data between these drives and save the $600, but this still bugs me nevertheless, ha ha.
 
Any writes of very large files are going to stagnate once they exceed the limited size of the write cache, which limits the entire transfer...

Being able to read at 2,000-3,000 MB/s (best case) from one drive does little when writing to the other drive, because once its write cache is full/saturated, write speeds slow down massively (ergo, the whole transfer is limited/throttled, especially with a batch of files larger than the cache).

Does the transfer at least start fast, then slow down quickly? Quite possibly just cache limitations..
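If you want to see that falloff directly, a rough Python sketch like the one below writes a pile of incompressible data to the target drive and prints the throughput of each 1 GiB slice; a fast start followed by a sharp drop would point straight at write-cache saturation (path and sizes are placeholders):

# Rough SLC/write-cache falloff check: write incompressible data to the target
# drive and print the throughput of each 1 GiB slice. A fast start that collapses
# partway through suggests the write cache has filled up. Path/sizes are placeholders.
import os
import time

TARGET = r"E:\bench\cache_test.bin"     # file on the destination NVMe drive (placeholder)
CHUNK = 64 * 1024 * 1024                # 64 MiB writes
SLICE = 1024 ** 3                       # report every 1 GiB
TOTAL = 32 * 1024 ** 3                  # write 32 GiB total

buf = os.urandom(CHUNK)
written = slice_written = 0
slice_start = time.perf_counter()
with open(TARGET, "wb") as f:
    while written < TOTAL:
        f.write(buf)
        written += len(buf)
        slice_written += len(buf)
        if slice_written >= SLICE:
            f.flush()
            os.fsync(f.fileno())        # force the slice out to the drive before timing it
            now = time.perf_counter()
            print(f"{written / 1024**3:5.1f} GiB written: "
                  f"{slice_written / 1e6 / (now - slice_start):7.1f} MB/s")
            slice_written, slice_start = 0, now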
 
May 3, 2020
Any writes of very large files are going to stagnate once they exceed the limited size of the write cache, which limits the entire transfer...

Being able to read at 2,000-3,000 MB/s (best case) from one drive does little when writing to the other drive, because once its write cache is full/saturated, write speeds slow down massively (ergo, the whole transfer is limited/throttled, especially with a batch of files larger than the cache).

Does the transfer at least start fast, then slow down quickly? Quite possibly just cache limitations..

Nope, it is a static 60 MB/s.

BUT... you helped solve my issue!

I just found out what the problem was. I have been using TeraCopy (a file-handling program with hash-check functionality) for all of my transfers to my UNRAID server to ensure file integrity. I had also made it the Windows default for all file handling, since it is pretty slick.

I know what some may be thinking already: the hash check was causing it. Well, the hash check happens after the entire copy completes, so if that were the case, I would see insane speeds up until the file fully copied, and only then would the hash check crawl. Even that would not make sense, though, because the hash check only reads the data, and read speeds are at least as fast as write speeds; I have never seen it slower to hash than to copy. In 100% of my experience, the hash check takes the same amount of time or less than copying the file being hashed and validated.
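For what it's worth, the copy-then-verify pattern I'm describing is easy to reproduce by hand. A rough Python sketch along these lines (paths are placeholders, and this is only the same idea, not how TeraCopy itself is implemented) copies the file at full speed first and only afterwards hashes both sides to confirm integrity:

# Copy first, hash-check afterwards: the copy runs at full speed, then SHA-256
# of source and destination are compared. Paths are placeholders; this is the
# general post-copy verification idea, not TeraCopy's actual implementation.
import hashlib
import shutil

SRC = r"D:\data\huge_file.bin"   # source on NVMe drive 1 (placeholder)
DST = r"E:\data\huge_file.bin"   # destination on NVMe drive 2 (placeholder)

def sha256_of(path, chunk=16 * 1024 * 1024):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()

shutil.copyfile(SRC, DST)                 # plain copy, no in-line hashing
ok = sha256_of(SRC) == sha256_of(DST)     # verification happens only after the copy
print("hash check passed" if ok else "HASH MISMATCH")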

Whatever the reason, TeraCopy causes severe lag while copying files between NVMe drives on the same OS/motherboard. If I default back to Windows Explorer, that 4 GB temp file created by CrystalDiskMark 6 (I kept a copy for testing purposes) is literally done copying between these drives in about 3 seconds. Unfreakingbelievable!

So, this is definitely a software issue. I can see the cache fill up in Windows Explorer for a split second, then BAM, the remaining part of the file copies almost instantaneously after the cache clears. So you were correct that there is not enough cache to absorb the insane speed of these drives with huge files, but it clears so quickly, again and again, that overall the transfers between these drives still run at about 3,000 MB/s.

Troubleshooting 101: what sits in between? This software has been so integrated into my workflow that I forgot it was even there. Ha ha! My bad. Thanks for getting me onto this train of thought; TeraCopy was the one piece of software that remained in place throughout all of my testing.

I am now going to use it only for UNRAID transfers over my 10 GbE NICs on a dedicated peer-to-peer connection, together with my new NVMe UNRAID cache pool. I should be able to saturate the network connection with that setup and hit 1,000+ MB/s on transfers in both directions.

Thanks again to both Maxxify & mdd1963 for the help!
 
Solution