This has been a topic of much debate over the years. The first graphics accelerators used multiple chips for different parts of the pipeline, and, as you mention, the early Voodoo cards even needed a separate card for 2D.
Switching from a monolithic die to chiplets has worked out well for AMD on the CPU side, although it has been noted that the 8 cores on Renoir's single die perform comparatively better than the chiplet arrangement of desktop Ryzen (a CCD with two 4-core CCXs plus a separate IO die), despite Renoir having much less cache - so that decision is very much a compromise. The penalty is usually far worse in graphics, which is why multi-chip graphics cards have never worked as well as one larger single chip - mainly because of the sheer amount of data involved. Whenever a large block of data has to be moved it costs both power and latency, which hurts graphics performance directly. It's not as big an issue on the CPU side because the cores are often working independently of each other.
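To put rough numbers on that (these are illustrative ballpark figures of my own, not anything AMD has published), here's a quick back-of-envelope calculation of what it costs just to push a GPU-sized data stream across an off-die link versus keeping it on-die:

```cpp
// Back-of-envelope: power cost of moving a GPU-scale data stream.
// The pJ/bit and bandwidth figures are assumed ballpark values for
// illustration only, not vendor numbers.
#include <cstdio>

int main() {
    const double bandwidth_GBs  = 500.0;  // assumed traffic between pipeline stages, GB/s
    const double on_die_pJ_bit  = 0.1;    // rough cost of short on-die wires
    const double off_die_pJ_bit = 2.0;    // rough cost of an on-package, off-die link

    const double bits_per_s = bandwidth_GBs * 1e9 * 8.0;
    printf("on-die : %.1f W\n", bits_per_s * on_die_pJ_bit  * 1e-12);  // ~0.4 W
    printf("off-die: %.1f W\n", bits_per_s * off_die_pJ_bit * 1e-12);  // ~8 W
    return 0;
}
```

Even with generous assumptions, the off-die case burns an order of magnitude more power for exactly the same work, and that's before counting the added latency.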
I think the 'holy grail' required to make all of this work is getting around the inherent problem of moving data. AMD have done a lot of work on this (and I'm sure nVidia and Intel have similar projects): multiple discrete chips sharing fully coherent memory, so the data can stay in one place and be worked on by several different devices without those drawbacks. I've not heard anything to suggest this is part of Ampere though.
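As a software-level sketch of what that would enable: CUDA's managed memory already exposes the programming model of one allocation that both CPU and GPU touch in place, although on most current systems the runtime still migrates pages behind the scenes rather than the hardware being genuinely coherent across discrete chips - which is exactly the data-movement cost being discussed.

```cpp
// Sketch only: the programming model coherent shared memory would enable -
// a single allocation worked on by both the CPU and the GPU "in place".
// (cudaMallocManaged gives this API today; true multi-chip hardware
// coherence would remove the page migration happening under the hood.)
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= k;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));      // one shared allocation

    for (int i = 0; i < n; ++i) data[i] = 1.0f;       // CPU writes it...
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);   // ...GPU works on it...
    cudaDeviceSynchronize();
    printf("data[0] = %.1f\n", data[0]);              // ...CPU reads the result

    cudaFree(data);
    return 0;
}
```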
The other case where it might make sense is if the RTX functions can be handled truly independently of the rest of the graphics pipeline, in which case moving them off onto a separate accelerator would be possible - although given that ray tracing is effectively a lighting calculation, I'm not sure that's the case. What I would say, though, is that in the long run ray-tracing hardware will end up integrated into the GPU, just as texture units, geometry setup engines, ROPs and so on all have. Any multi-chip approach would be a short-term stopgap.