News Microsoft's DirectStorage API to Support PCIe 3.0 NVMe SSDs, DirectX 12 GPUs

ThatMouse

Distinguished
Jan 27, 2014
I'm not a game developer, but isn't it already possible to use the GPU to decompress files? Why is this something only the CPU can do? It also looks like they are storing the compressed files in VRAM which would eat up VRAM for files the game cannot use.
 
This was actually expected. If RTX IO and whatever AMD is planning as an alternative work over PCIe in general, then they should work not only over PCIe 4.0 but also over PCIe 3.0. It will be interesting to see what uses besides gaming this introduces. It will certainly be useful for video encoding, and also for file compression and decompression, as someone mentioned above. There are probably plenty of other uses too. I wish GPUs were available :)
 

InvalidError

Titan
Moderator
I'm wondering why it is necessary to copy from storage to system RAM, then to VRAM?
Because standard IO functions expect system memory addresses as the source/destination argument.

Also, because of the BAR size limit before resizable BAR, the NVMe SSD has no way to know whether the GPU's BAR is mapped to the correct VRAM block. If you are going to waste microseconds on context switches so the kernel and drivers can run all of the checks, move the BAR around if needed, and make sure nothing else was attempting to access it, you are better off using system memory as a go-between and not worrying about it. It would make sense for resizable BAR to be a prerequisite for DirectStorage, since that would greatly simplify things.
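To put rough numbers on the BAR point, here is a minimal sketch. The 8 GiB card and the 256 MiB pre-resizable aperture are assumed figures for illustration, and `window_index` and `needs_remap` are hypothetical helper names, not anything from a real driver. It only models the arithmetic of "is the target VRAM range inside the window that is currently mapped?"

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

def window_index(vram_offset: int, bar_size: int) -> int:
    """Index of the BAR-sized window that contains a given VRAM offset."""
    return vram_offset // bar_size

def needs_remap(vram_offset: int, length: int, current_window: int, bar_size: int) -> bool:
    """True if [vram_offset, vram_offset + length) is not fully inside the current window."""
    first = window_index(vram_offset, bar_size)
    last = window_index(vram_offset + length - 1, bar_size)
    return first != current_window or last != current_window

VRAM_SIZE = 8 * GIB       # assumption: an 8 GiB card
LEGACY_BAR = 256 * MIB    # assumption: small aperture without resizable BAR

# Number of distinct windows the small aperture has to be slid across.
windows = VRAM_SIZE // LEGACY_BAR

# A 64 MiB write at the 3 GiB mark while window 0 is mapped.
small_bar = needs_remap(3 * GIB, 64 * MIB, current_window=0, bar_size=LEGACY_BAR)

# Same write when the BAR covers all of VRAM: there is only one window.
full_bar = needs_remap(3 * GIB, 64 * MIB, current_window=0, bar_size=VRAM_SIZE)

print(windows, small_bar, full_bar)
```

With the whole of VRAM exposed, the check can never fail, which is why resizable BAR would remove the bookkeeping described above.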
 

Gillerer

Distinguished
Sep 23, 2013
I'm not a game developer, but isn't it already possible to use the GPU to decompress files? Why is this something only the CPU can do?

I think the current (game asset) compression methods are only suited to general-purpose CPU cores, and as stated in the slides, new ones need to be developed before GPUs can be any good at it (or can do it without hurting game performance).

It also looks like they are storing the compressed files in VRAM which would eat up VRAM for files the game cannot use.

The data has to be in memory before the GPU (or CPU) can process it. Without memory as buffer, the CPU/GPU would spend much too much time waiting for data to trickle in from the (relatively) slow NVMe device and PCIe bus. If there is memory pressure, the compressed data can always be discarded when it's no longer needed.

I could also see the benefit of keeping compressed data in VRAM, and instead discarding unused uncompressed data sets. If decompression is easy and doesn't affect game performance, any data needed could then be quickly decompressed again without ever going to NVMe or the PCIe bus. This would be an actually working version of the "double your RAM" scam compression software of the past.
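A rough sketch of that idea, using `zlib` on the CPU as a stand-in for whatever GPU-side codec would actually be used. The class name and its methods are hypothetical, and `load_from_storage` compresses raw bytes itself only so the example is self-contained (real assets would already be compressed on disk). Compressed blobs stay resident, decompressed copies are evicted LRU-style, and a later request is served by decompressing again instead of going back to storage.

```python
import zlib
from collections import OrderedDict

class CompressedResidentCache:
    def __init__(self, max_decompressed: int):
        self.compressed = {}                # name -> compressed bytes, kept resident
        self.decompressed = OrderedDict()   # name -> raw bytes, evicted LRU-first
        self.max_decompressed = max_decompressed
        self.storage_reads = 0              # counts simulated trips to the SSD

    def load_from_storage(self, name: str, raw: bytes) -> None:
        # Stand-in for an NVMe read; compression here just simulates packed data.
        self.storage_reads += 1
        self.compressed[name] = zlib.compress(raw)

    def get(self, name: str) -> bytes:
        if name in self.decompressed:
            self.decompressed.move_to_end(name)   # mark as most recently used
            return self.decompressed[name]
        data = zlib.decompress(self.compressed[name])
        self.decompressed[name] = data
        if len(self.decompressed) > self.max_decompressed:
            self.decompressed.popitem(last=False)  # drop least recently used copy
        return data

cache = CompressedResidentCache(max_decompressed=1)
cache.load_from_storage("rock", b"rock texture " * 1000)
cache.load_from_storage("tree", b"tree texture " * 1000)

cache.get("rock")
cache.get("tree")           # evicts the decompressed copy of "rock"
again = cache.get("rock")   # decompressed again from memory, no storage read
```

After the last call the read counter is still 2, i.e. only the two initial loads touched storage.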

*

Also, they're not storing compressed files (as in up to 1GB files in the game installation) in memory, but compressed data. A game can pick and choose which parts of a file to read to memory, based on the locations of the assets it needs. This means the memory usage is only as much as the level requires. The large file sizes in games are mainly due to how file systems are so much better at handling few huge files than thousands of small ones; especially if using a hard drive.
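As a small illustration of reading only parts of a big file: the pack layout below (assets concatenated, with a name -> (offset, length) index) is an assumption made up for this example, not the format of any particular game or of DirectStorage, and `build_pack` and `read_asset` are hypothetical helpers.

```python
import io
import zlib

def build_pack(assets: dict) -> tuple:
    """Concatenate compressed assets into one blob and record where each one sits."""
    blob = io.BytesIO()
    index = {}
    for name, raw in assets.items():
        packed = zlib.compress(raw)
        index[name] = (blob.tell(), len(packed))  # (offset, compressed length)
        blob.write(packed)
    return blob.getvalue(), index

def read_asset(pack_file, index: dict, name: str) -> bytes:
    """Seek to one asset and read only its bytes, not the whole pack."""
    offset, length = index[name]
    pack_file.seek(offset)
    return zlib.decompress(pack_file.read(length))

pack_bytes, index = build_pack({
    "level1/ground": b"G" * 50_000,
    "level1/sky": b"S" * 50_000,
    "level2/lava": b"L" * 50_000,
})

# An in-memory file stands in for the big pack file on disk.
pack_file = io.BytesIO(pack_bytes)
sky = read_asset(pack_file, index, "level1/sky")
bytes_read = index["level1/sky"][1]
```

Only `bytes_read` bytes of the pack are pulled in for that one asset, so memory use follows what the level needs rather than the size of the file on disk.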

*

I worry that Microsoft will attempt to tie DirectStorage to UWP somehow... :-(
 
Because standard IO functions expect system memory addresses as the source/destination argument.
This doesn't make sense to me because if VRAM is memory mapped in the virtual address space, it should be directly addressable anyway. Or at least the portion that's been mapped.

Also, because of the BAR size limit before resizable BAR, the NVMe SSD has no way to know whether the GPU's BAR is mapped to the correct VRAM block. If you are going to waste microseconds on context switches so the kernel and drivers can run all of the checks, move the BAR around if needed, and make sure nothing else was attempting to access it, you are better off using system memory as a go-between and not worrying about it. It would make sense for resizable BAR to be a prerequisite for DirectStorage, since that would greatly simplify things.
But this provides a better answer.