Google's Machine Learning Chip Is Up To 30x Faster, 80x More Efficient Than CPUs And GPUs

Status
Not open for further replies.
"GPU u can do anything" - Amdlova

Well, no. If GPUs could do anything in a practical sense, then there wouldn't be any need for CPUs.

Just as GPUs are optimized for graphics rendering, AI chips are optimized for AI computing. This one happens to be optimized for Google's AI methodologies.

Good for Google.

This trend of moving software algorithms to hardware will continue as the limits of silicon computation are reached. It is a natural consequence of trying to squeeze more computational power out of a technology that has reached its limits of raw computational power in the form of traditional CPU design.

The advantage of these new kinds of chips is massive parallelism optimized to suit a specific task.

Neural network processing, for example, can make use of literally hundreds of billions of parallel computing elements, each taking the place of a single neuron.

Put enough of those things together with some interneuron communication and a little internal memory, and you have yourself a simulated brain.

You can certainly simulate such things on CPUs and even GPUs, but neither is well suited to the task, since both are far, far too coarse-grained to produce rapid simulation results.
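To make the "one element per neuron" idea concrete, here is a minimal sketch in plain Python. The names (neuron, layer) and the sigmoid activation are my own assumptions for illustration, not anything taken from Google's chip. The point is that each neuron's output depends only on the shared inputs and its own weights, so every call to neuron() could in principle run on its own hardware element at the same time. A CPU just walks through them one after another.

```python
import math

def neuron(weights, bias, inputs):
    # One "computing element": a weighted sum followed by a sigmoid.
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))

def layer(weight_rows, biases, inputs):
    # Every neuron reads the same inputs and writes its own output,
    # so these calls are independent of each other. Here they run
    # sequentially; dedicated hardware could run them all at once.
    return [neuron(w, b, inputs) for w, b in zip(weight_rows, biases)]

if __name__ == "__main__":
    outputs = layer(
        weight_rows=[[0.0, 0.0], [1.0, -1.0]],
        biases=[0.0, 0.5],
        inputs=[1.0, 2.0],
    )
    print(outputs)
```

The sequential list comprehension in layer() is exactly the coarse-grained part: the work is parallel by nature, but the machine running it is not.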

The trick with massive parallelism is in who gets what messages, and when, and where the memory is located and how it is accessed.
 
"dude above me" - Romeoreject

The dude above you is a moron.


"This sounds amazing." - Romeoreject

The Google algorithms seem to work pretty well for pattern matching.

What the production of the ASIC tells you is that Google is confident enough in the methods used, and sees enough of a speed advantage in them, that it is willing to commit to producing an ASIC that implements much of those methods in hardware, which cannot be modified once produced.

Do you think the NSA uses similar ASICs to monitor you?
 
Don't know, but I also don't care. The level of sophistication of their monitoring technology isn't the issue.

What I do care about is China's monitoring of non-Chinese. I think that's far more likely to impact me, personally. Maybe not for 5 years or more, but it could be even worse than all these ad networks tracking us.
 
They are getting an advantage from a circuit with stripped-down functionality. The question is what else it is useful for. If it is only useful for a narrow range of things, then what is the use of dwelling on it?

It mentions a limited instruction set, so it obviously is not completely hardwired, and they are tailoring it for more functionality in future editions. This extra functionality likely comes at the cost of increased circuit complexity, which might take away some of the speculated speed improvements and reduce the amount of processing power per area compared to what could be. But it does mean that it will still be a lot more powerful and efficient than what we have at present. And if you want to make it more functional again, to do everything but 3D/graphics, then these advantages will shrink further, but this is the sort of circuit we really do need working alongside a GPU graphics card. I call this a Workstation Processing Unit. GPU manufacturers can work on something like this, drawing on their GP-GPU experience, alongside their GPU lines.
 
Because machine learning is interesting, as are GPU computing and the benefit they got by going with an ASIC.

Maybe, but you don't really know what constitutes the primitives. If the instructions are like computing the tensor product between A and B, then the programmability is probably adding very little overhead.
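A rough way to see why: below is a toy interpreter, purely hypothetical (the "MATMUL" opcode, the register names, and the run() function are all made up for illustration, and I'm using an ordinary matrix multiply as a stand-in for "tensor product"). It counts how many instructions get dispatched versus how many multiply-accumulates actually get done. If one instruction kicks off a whole matrix operation, the decode cost is spread over all of that arithmetic.

```python
def matmul(a, b):
    # Plain triple-loop matrix multiply that also counts the
    # multiply-accumulate (MAC) operations it performs.
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    macs = 0
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
                macs += 1
            out[i][j] = acc
    return out, macs

def run(program, registers):
    # Hypothetical coarse-grained instruction set: each instruction is
    # (opcode, destination, source_a, source_b) and names whole matrices.
    dispatched = 0
    total_macs = 0
    for op, dst, src_a, src_b in program:
        dispatched += 1
        if op == "MATMUL":
            registers[dst], macs = matmul(registers[src_a], registers[src_b])
            total_macs += macs
        else:
            raise ValueError("unknown opcode: " + op)
    return dispatched, total_macs

if __name__ == "__main__":
    regs = {"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]}
    print(run([("MATMUL", "C", "A", "B")], regs), regs["C"])
```

With 2x2 matrices that is one dispatch for 8 MACs. The work grows with the product of the dimensions while the dispatch count stays at one, so for large matrices the per-instruction overhead becomes negligible.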

They already do that. Nvidia has the Tesla cards, which have a GPU without any video output. In the case of the P100, the graphics circuitry seems to have been omitted from the chip entirely. That said, I don't know what proportion of a modern GPU is occupied by the raster engines, but it might not be very much.

GPUs can't really compete with integer-only machine learning ASICs. A GPU must have lots of fp32, and that's going to waste a lot of die area, if you're only using integer arithmetic. I expect that Nvidia is already working on dedicated machine learning chips. If they built an inference engine like Google's, on the same scale as their current GPUs, it would stomp Google's effort into the ground. And bury it.
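For anyone wondering what "integer-only" inference looks like in practice, here is a minimal sketch of a generic symmetric quantization scheme. This is not a description of Google's chip; the function names, the signed 8-bit range, and the per-tensor scale factors are my own assumptions. The idea is that the floats get converted to small integers once, the expensive inner loop is pure integer multiply-add, and a single floating-point rescale happens at the end.

```python
def quantize(values, scale):
    # Map floats to signed 8-bit integers, clamping to [-128, 127].
    return [max(-128, min(127, round(v / scale))) for v in values]

def int_dot(qa, qb):
    # The hot loop: integer multiplies and adds only. Python ints do not
    # overflow; real hardware would use a wider accumulator for this sum.
    return sum(x * y for x, y in zip(qa, qb))

def dequantize(acc, scale_a, scale_b):
    # One rescale at the end converts the integer result back to a float.
    return acc * scale_a * scale_b

if __name__ == "__main__":
    weights = quantize([0.5, -1.0], scale=0.5)
    inputs = quantize([0.75, 1.0], scale=0.25)
    print(dequantize(int_dot(weights, inputs), 0.5, 0.25))
```

Everything inside int_dot() needs no floating-point hardware at all, which is the die-area argument in a nutshell.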
 