News: Intel Scales Neuromorphic Loihi to 64 Chips, 8 Million Neurons

Is it efficient enough that heat isn't an issue? I'm not too knowledgeable about this type of chip, but I'm assuming its heat output is low enough that passive cooling will do.
 
The article says a GPU uses 109x the power for inferencing. According to the Intel press release, that figure is based on this paper:


...which compared it to the GK106-based Nvidia Quadro K4000, among other things. That's a suspicious choice, since it's essentially a down-clocked GTX 650 Ti. And the paper was published in December 2018, after not only the Maxwell and Pascal series but also the Turing GPUs had been released. So, either the researchers were specifically trying to make the Intel chip look as good as possible, or they literally just used an old GPU they happened to have lying around. Either way, it's pretty lame to compare a 28 nm GPU from 2013, with no optimizations for deep learning, to a purpose-built 14 nm ASIC.

Pascal is much more efficient (I'm going to guess about 3x or more, if you're using its int8 support), and Turing's Tensor cores add another multiple somewhere in the single digits. So, if they did an honest comparison with, say, a Tesla T4, they'd struggle to show even a 10x efficiency advantage.
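
A quick back-of-envelope on that, just to show the shape of it (the 3x and 4x multipliers are my guesses, not measured values):

```python
# Back-of-envelope: how the claimed 109x figure might shrink against a modern GPU.
# The two multipliers below are assumptions (my guesses), not measurements.
baseline_advantage = 109    # Loihi vs. Quadro K4000, per the paper (batch size 1)
pascal_int8_gain = 3        # assumed efficiency gain from Pascal's int8 path
turing_tensor_gain = 4      # assumed single-digit gain from Turing's Tensor cores

remaining = baseline_advantage / (pascal_int8_gain * turing_tensor_gain)
print(f"Leftover Loihi efficiency edge vs. a Turing GPU: ~{remaining:.0f}x")  # ~9x
```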

Anyway, getting back to your question about absolute power consumption, you'll find that paper quotes the dissipation of a single chip at 0.110 W. Multiply that by 64 and you get about 7 W. So, no need for pesky heatsinks.
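
If anyone wants to sanity-check that, it's just the paper's per-chip figure times the chip count:

```python
# Total dissipation for 64 Loihi chips, using the 0.110 W/chip figure from the paper.
chips = 64
watts_per_chip = 0.110
print(f"Total: ~{chips * watts_per_chip:.2f} W")  # ~7.04 W
```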
 
I just noticed they used a batch size of 1 for that data! GPUs are much faster and more efficient with large batch sizes, as the paper subsequently acknowledges. When they increase the batch size to 64, the GPU gets a 47x speedup (it was already 2.6x as fast as Intel's Loihi chip to begin with), while becoming about 50x as efficient as it was at batch size 1! That cuts Intel's efficiency benefit to a mere ~2x, unless you have some sort of workload that absolutely cannot be batched (e.g. realtime/embedded control). Once you add the efficiency gains of Pascal or even Maxwell, that massive efficiency advantage should completely fall away. And we haven't even gotten to Turing's Tensor cores.
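
Making that explicit, using only the numbers above (a rough sketch, not the paper's own calculation):

```python
# How batching alone erodes the headline 109x figure, per the paper's numbers.
loihi_edge_batch1 = 109    # Loihi efficiency advantage over the GPU at batch size 1
gpu_batch_eff_gain = 50    # ~50x better GPU efficiency at batch size 64
gpu_speed_batch1 = 2.6     # GPU already 2.6x faster than Loihi at batch size 1
gpu_batch_speedup = 47     # additional GPU speedup at batch size 64

print(f"Remaining efficiency edge: ~{loihi_edge_batch1 / gpu_batch_eff_gain:.1f}x")           # ~2.2x
print(f"GPU throughput vs. Loihi at batch 64: ~{gpu_speed_batch1 * gpu_batch_speedup:.0f}x")  # ~122x
```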

BTW, the GPU is so slow that it's even outrun by their Xeon E5-2630 CPU! They don't say which version, but I'm guessing v3 or v4, which would have AVX2. Worse, that GPU is only about 2x the speed of the Jetson TX1, whose Tegra X1 SoC is essentially what powers the Nintendo Switch. So, we're talking about a really crap GPU.

Maybe I'll give the paper a full read, and post up anything more I find. So far, what sounded like a very promising development now seems like a PR whitewashing exercise.