FWIW, I think Intel should keep Falcon Shores focused on HPC and keep Gaudi as their AI platform. Just because Nvidia has been successful using GPUs for AI doesn't mean it's the best architecture for that problem. I predict Nvidia will fork off a line of AI accelerators that look less like GPUs than their current products do - and they might not even support CUDA!
I also think you don't need HBM to do AI right, especially if you're not constrained to a GPU-like architecture. What you need is lots of SRAM, so the thing to do is just stack your compute dies on SRAM. AI is a dataflow problem, whereas graphics isn't. That means GPUs have to be much more flexible with data movement & reuse, and that's why they need such crazy amounts of bandwidth to main memory. By contrast, AI has much better data locality, so you can do quite well with adequate amounts of SRAM and a mesh-like interconnect.
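To put a rough number on the locality argument: here's a back-of-envelope sketch (my own illustrative figures, not anything from a real chip) comparing the off-chip traffic a GPU-style design pays for one matmul against a dataflow design that keeps the weight tile resident in stacked SRAM. The function names and parameters are hypothetical.

```python
# Hedged sketch: arithmetic intensity (FLOPs per byte of DRAM traffic) for a
# (batch x k) @ (k x n) matmul, with and without weights resident in on-die SRAM.

def matmul_traffic_bytes(batch, k, n, bytes_per_elem=2, weights_resident=False):
    """Rough off-chip traffic for one matmul, assuming fp16 elements.

    weights_resident=True models an SRAM-stacked dataflow chip where the
    (k x n) weight tile never leaves on-die memory; only activations move.
    """
    acts_in = batch * k * bytes_per_elem
    acts_out = batch * n * bytes_per_elem
    weights = 0 if weights_resident else k * n * bytes_per_elem
    return acts_in + acts_out + weights

def arithmetic_intensity(batch, k, n, **kw):
    flops = 2 * batch * k * n  # each multiply-accumulate counts as 2 FLOPs
    return flops / matmul_traffic_bytes(batch, k, n, **kw)

# A 4096x4096 weight matrix at small batch: weight traffic dominates the
# GPU-style path, so SRAM residency raises arithmetic intensity enormously.
gpu_ai = arithmetic_intensity(8, 4096, 4096)                         # ~8 FLOPs/byte
sram_ai = arithmetic_intensity(8, 4096, 4096, weights_resident=True)  # 2048 FLOPs/byte
print(f"GPU-style:     {gpu_ai:7.1f} FLOPs/byte")
print(f"SRAM-resident: {sram_ai:7.1f} FLOPs/byte")
```

With the weights pinned on-die, the only off-chip traffic is activations in and out, so the bandwidth you need from main memory drops by orders of magnitude - which is exactly why adequate SRAM plus a mesh can substitute for HBM in this workload.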