Nvidia JetPack 2.3 Doubles Deep Learning And Compilation Performance

Status
Not open for further replies.
Impressive that software optimizations accounted for such an improvement. I wonder how much of it is down to simply improving data access patterns.

For those who don't know, this has 4x A57 cores and 256 Maxwell cores. Memory is dual-channel LPDDR4-1600 w/ 2 MB L2 Cache. 20 nm process node. 512 GFLOPS, single-precision; 1024 GFLOPS @ fp16. ~10 W.

They call it an "embedded supercomputer". But, in terms of raw performance, a low-end gaming desktop with a GTX 950 is faster.
 
Status
Not open for further replies.