Tom's article seem to be short on details. I got some insight.
A Finnish startup called Flow Computing is making one of the wildest claims ever heard in silicon engineering: by adding its proprietary companion chip,
techcrunch.com
CPUs have gotten very fast, but even with nanosecond-level responsiveness, there’s a tremendous amount of waste in how instructions are carried out simply because of the basic limitation that one task needs to finish before the next one starts.
What Flow claims to have done is remove this limitation, turning the CPU from a one-lane street into a multi-lane highway.
The CPU is still limited to doing one task at a time, but Flow’s PPU, as they call it, essentially performs nanosecond-scale traffic management on-die to move tasks into and out of the processor faster than has previously been possible.
Think of the CPU as a chef working in a kitchen.
The chef can only work so fast, but what if that person had a superhuman assistant swapping knives and tools in and out of the chef’s hands, clearing the prepared food and putting in new ingredients, removing all tasks that aren’t actual chef stuff?
The chef still only has two hands, but now the chef can work 10 times as fast. It’s not a perfect analogy, but it gives you an idea of what’s happening here, at least according to Flow’s internal tests and demos with the industry (and they are talking with everyone).
The PPU doesn’t increase the clock frequency or push the system in other ways that would lead to extra heat or power; in other words, the chef is not being asked to chop twice as fast. It just more efficiently uses the CPU cycles that are already taking place.
Flow’s big achievement, in other words, isn’t high-speed traffic management, but rather doing it without having to modify any code on any CPU or architecture that it has tested. It sounds kind of unhinged to say that arbitrary code can be executed twice as fast on any chip with no modification beyond integrating the PPU with the die.
Therein lies the primary challenge to Flow’s success as a business: Unlike a software product, Flow’s tech needs to be included at the chip-design level, meaning it doesn’t work retroactively, and the first chip with a PPU would necessarily be quite a ways down the road.
Chart showing improvements in an FPGA PPU-enhanced chip versus unmodified Intel chips. Increasing the number of PPU cores continually improves performance.