I think it's the first option. 2x GB203 glued together, but clocked lower.
How else would two 400W dies turn into 600W?
Speculation:
Did they choose 600W because they didn't want to add another 16-pin 12VHPWR connector?
Or does the CoWoS-L interconnect become the bottleneck, so the chip wouldn't benefit from more power?
Or are they leaving headroom for an 800W 5090 Ti?
I expect the 5090 to run cool, so long as your case can keep up with the total heat it dumps. That is a huge die and would be easy to cool with a vapor chamber.
Generally, power and performance do not scale linearly. Once you pass a certain threshold, adding more power doesn't do much for performance. Just look at the Ryzen 9000 series reviews, specifically stock vs. PBO enabled: around double the power for maybe 10% more performance.
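To put rough numbers on that, here is a quick sketch of what it does to efficiency. The function name and the normalized baseline are made up for illustration, and the only inputs are the "double the power, maybe 10% more performance" ballpark above, not measured data.

```python
def relative_efficiency(perf_gain: float, power_multiplier: float) -> float:
    """Performance per watt relative to the baseline (baseline = 1.0).

    perf_gain: fractional performance increase, e.g. 0.10 for +10%.
    power_multiplier: power draw relative to baseline, e.g. 2.0 for double.
    """
    if power_multiplier <= 0:
        raise ValueError("power_multiplier must be positive")
    return (1.0 + perf_gain) / power_multiplier


# Ballpark from the post: ~2x power for ~10% more performance (not measured).
pbo_efficiency = relative_efficiency(perf_gain=0.10, power_multiplier=2.0)
print(f"Perf per watt vs stock: {pbo_efficiency:.2f}x")
```

With those ballpark inputs it comes out to about 0.55x, so roughly half the performance per watt of stock.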
It could be that with the 5080 they are pushing power as far as possible because they have the headroom. With 675 W available (600 W from the power connector plus 75 W from the PCIe slot), pushing to 400 W isn't hard, but it may not gain much performance compared with capping it at, say, 300 W. With the 5090 they may simply not be pushing it as hard: cap the whole thing at 600 W and still get most of the chip's total theoretical performance.
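The budget arithmetic, spelled out as a small sketch. The wattages are just the figures being discussed in this thread, and treating the 5090 as two 5080-class dies is the speculation from earlier, not a confirmed design. All variable names are mine.

```python
# Power available to a card with a single 16-pin connector (figures from the post).
CONNECTOR_LIMIT_W = 600   # one 16-pin 12VHPWR connector
SLOT_LIMIT_W = 75         # PCIe slot
BOARD_BUDGET_W = CONNECTOR_LIMIT_W + SLOT_LIMIT_W

# Board power figures discussed in the thread (assumptions, not confirmed specs).
THREAD_5080_W = 400
THREAD_5090_W = 600

# Hypothetical: two 5080-class dies, each run at the full 400 W.
two_dies_uncapped_w = 2 * THREAD_5080_W
over_budget_w = two_dies_uncapped_w - BOARD_BUDGET_W

# Hypothetical: the same two dies sharing a 600 W cap evenly.
per_die_share_w = THREAD_5090_W / 2

print(f"Board budget:          {BOARD_BUDGET_W} W")
print(f"Two dies at 400 W:     {two_dies_uncapped_w} W ({over_budget_w} W over budget)")
print(f"Per-die share at 600W: {per_die_share_w:.0f} W")
```

So two dies at full tilt would need 800 W, which is 125 W more than one connector plus the slot can supply, while a 600 W cap works out to 300 W per die.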
For applications where you want high performance at low power, you generally want more cores at lower clocks and lower power limits. That runs the cores more efficiently, since running them close to their maximum performance isn't very efficient.
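A toy model shows why wide-and-slow can win. It assumes the textbook approximation that dynamic power goes as C·V²·f and that voltage has to rise roughly in proportion to clock, so power per unit scales with about the cube of the clock. It also assumes throughput scales perfectly with unit count and clock. Real chips have leakage, a voltage floor, and interconnect and memory overheads, so treat this as a sketch, not a prediction for any actual GPU. The function names are hypothetical.

```python
def relative_power(units: int, clock: float) -> float:
    """Power relative to one unit at clock 1.0, assuming P ~ clock^3 per unit."""
    return units * clock ** 3


def relative_throughput(units: int, clock: float) -> float:
    """Throughput relative to one unit at clock 1.0, assuming perfect scaling."""
    return units * clock


# One unit at full clock vs two units at 80% clock (illustrative values only).
narrow_fast = (relative_throughput(1, 1.0), relative_power(1, 1.0))
wide_slow = (relative_throughput(2, 0.8), relative_power(2, 0.8))

for name, (throughput, power) in (("1 unit @ 1.0", narrow_fast),
                                  ("2 units @ 0.8", wide_slow)):
    print(f"{name}: throughput {throughput:.2f}x, power {power:.2f}x, "
          f"perf/W {throughput / power:.2f}x")
```

Under these assumptions the two slower units deliver about 1.6x the throughput for about 1.02x the power, which is the same idea as two dies clocked lower under a shared power cap.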