The main issue is the different height of the two packages, not the HBM. That 0.1 mm is a lot of space.
That would only be an issue if the HBM stacks required that much cooling. In the more likely case that they don't, AIB manufacturers can simply use a flat contact surface, locate the heatsink mounting holes so that the pressure is centered on the GPU die instead of the whole package, and let the HBM thermal paste gap be whatever it needs to be.
The package is asymmetric. You can't center the pressure on the die; this is physically impossible because the die is not in the middle (only two edges). That's the reason why they are using stoppers, to be safe. You can optimize these stoppers for 0.1 mm more or 0.1 mm less, but not for both. I discussed it with a few thermal engineers at the factory. The reference cooler is optimized for the higher package. The lower package must live with more TIM. This costs cooling performance. For these packages they are now using different screws and two more holes. But this info came too late. The result is a delay.
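To put a rough number on "more TIM costs cooling performance", here is a minimal sketch using plain one-dimensional conduction, where the extra temperature drop is power times thickness divided by (conductivity times area). The function name and every input value below are my own placeholders and were not given in this thread; real paste conductivity, contact area and heat load will differ, and heat spreading is ignored.

```python
def extra_tim_delta_t(power_w, extra_thickness_m, conductivity_w_mk, area_m2):
    """Temperature rise (K) across an additional TIM layer.

    Simplified 1-D steady-state conduction: dT = P * t / (k * A).
    Ignores heat spreading, contact resistance and pump-out effects.
    """
    if conductivity_w_mk <= 0 or area_m2 <= 0:
        raise ValueError("conductivity and area must be positive")
    # Thermal resistance of the extra layer in K/W
    resistance_k_per_w = extra_thickness_m / (conductivity_w_mk * area_m2)
    return power_w * resistance_k_per_w


if __name__ == "__main__":
    # All four values are hypothetical placeholders, not measured data.
    delta_t = extra_tim_delta_t(
        power_w=250.0,             # assumed heat load through the contact area
        extra_thickness_m=0.1e-3,  # the 0.1 mm height difference discussed above
        conductivity_w_mk=5.0,     # assumed thermal paste conductivity
        area_m2=500e-6,            # assumed contact area (500 mm^2)
    )
    print(f"Extra temperature drop across the added TIM: {delta_t:.1f} K")
```

The takeaway is only the proportionality: the penalty grows linearly with the extra gap and with the heat load, and shrinks with better paste or a larger contact area.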