According to this thread, https://forums.developer.nvidia.com/t/what-is-cores-per-sm/29997/4, CUDA cores are part of SMs (streaming multiprocessors). So does each SM have its own MMU and its own 32-bit memory data bus to specific VRAM ICs? Or is there an independent MMU that accesses all of the VRAM on behalf of the individual SMs, using some sort of ring bus or a fabric like AMD's Infinity Fabric? I thought I had read somewhere in the past that each SM had its own independent 32-bit memory data bus.
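For what it's worth, here is a rough sanity check of the "one 32-bit bus per SM" idea, sketched in Python. It only does the arithmetic: if every SM really had its own 32-bit bus, the card's total memory bus width should equal the SM count times 32. The function names are made up for illustration, and the example figures (20 SMs, 256-bit bus) are what I understand a GeForce GTX 1080 to have, so treat them as an assumption and substitute the numbers for your own card.

```python
def implied_bus_width_bits(sm_count, bits_per_sm=32):
    """Total bus width the card would need if each SM owned its own bus.

    Hypothetical helper: this is just sm_count * bits_per_sm.
    """
    return sm_count * bits_per_sm


def per_sm_bus_is_plausible(sm_count, actual_bus_width_bits, bits_per_sm=32):
    """True only if the published bus width matches the per-SM-bus assumption."""
    return implied_bus_width_bits(sm_count, bits_per_sm) == actual_bus_width_bits


if __name__ == "__main__":
    # Assumed example figures (GeForce GTX 1080): 20 SMs, 256-bit memory bus.
    sm_count = 20
    actual_bus_width_bits = 256

    print("implied width:", implied_bus_width_bits(sm_count), "bits")
    print("matches published width:",
          per_sm_bus_is_plausible(sm_count, actual_bus_width_bits))
```

If the implied and published widths disagree, that would suggest the SMs share a smaller set of memory channels rather than each owning one, but this check alone does not say how that sharing is wired up.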