News - Arm Co-Founder: Nvidia Owning Arm Would Be a Disaster

To me, Nvidia's intentions for Arm are fairly obvious. They want to be even bigger in supercomputers, and the ARM architecture is very well suited for those because of its low power consumption. It becomes even clearer when you consider that Nvidia has said they are porting CUDA to run on the ARM instruction set. Then consider that ARM-based CPUs have been made with 80 cores and 4-way SMT. It seems to me that what Nvidia wants to do is make a massively parallel MCM-based machine. They will go with at least 4-way SMT, but I think they will aim for 8 or 16-way. This would make one ARM CPU look like a streaming multiprocessor (SMs are currently 64-way SMT). Then they will put 64 of these on a chip, which will be good for 512 or 1024 threads. Then they will put 8 of those chips on one MCM and have 4096 or 8192 threads available on one module. This will look like just another GPU to CUDA code. The biggest benefit is that it won't be a co-processor, so data transfers will be minimized. Altogether, this architecture will be astonishingly fast and will change the design of supercomputers for years to come.
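
To sanity-check my own arithmetic, here is a quick back-of-the-envelope sketch in C (purely speculative figures, obviously - nothing Nvidia has announced):

```c
/* Sanity check of the speculated thread counts - hypothetical figures,
   not an announced Nvidia design. */
#include <stdio.h>

int main(void)
{
    const int cores_per_chip = 64;        /* one core standing in for an SM */
    const int chips_per_mcm  = 8;
    const int smt_widths[]   = { 8, 16 }; /* speculated SMT options */

    for (int i = 0; i < 2; i++) {
        int threads_per_chip = cores_per_chip * smt_widths[i];
        printf("%2d-way SMT: %4d threads per chip, %4d per MCM\n",
               smt_widths[i], threads_per_chip,
               threads_per_chip * chips_per_mcm);
    }
    return 0;   /* prints 512 -> 4096 and 1024 -> 8192 */
}
```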

That's where I think Nvidia is going with this and they want Arm as a design resource more than anything else.
 
It is not clear NVDA can do that competitively. The last time Fujitsu built a supercomputer using ARM, it used 3.4M watts to do 2 exaflops, while Intel is likely to deliver 1.2 exaflops with 1.x M watts for the Argonne Lab - far more power efficient than the ARM machine. It is not clear NVDA is smarter than Fujitsu.
 
If I were an ARM licensee, I'd be looking into RISC-V and the few other truly open ISAs out there to make sure I don't get screwed again by ISAs getting locked up behind prohibitively steep licensing fees after another buy-out.
 
If I were an ARM licensee, I'd be looking into RISC-V and the few other truly open ISAs out there to make sure I don't get screwed again by ISAs getting locked up behind prohibitively steep licensing fees after another buy-out.

I don't think any of those open-source ISAs are close to the same maturity level as ARM. Apple could do something like this, but they are even more heavily invested in ARM since building their own ARM laptop chip. If Nvidia does buy ARM, I highly suspect open-source ISAs will look better, and you will see a large group of companies come together to produce one standard ISA. I could see Amazon, Apple, and Google joining together to make one ISA.
 
I would love this. Apple moving to ARM, which would be owned by Nvidia, would mean they would probably be forced to support Nvidia GPUs & CUDA.
 
If Nvidia owns ARM, it will only be a matter of time before the other companies that license ARM (Apple, Qualcomm, Amazon, etc.) get screwed with heavy licensing fees and are basically forced to buy ARM chips made by Nvidia. Selling to a direct competitor is a bad idea, IMHO.
 
It is not clear NVDA can do that competitively. The last time Fujitsu built a supercomputer using ARM, it used 3.4M watts to do 2 exaflops, while Intel is likely to deliver 1.2 exaflops with 1.x M watts for the Argonne Lab - far more power efficient than the ARM machine. It is not clear NVDA is smarter than Fujitsu.

You are comparing the Aurora supercomputer project - which Intel may complete two years from now, and only by using Intel GPUs manufactured at TSMC, as Intel recently disclosed - with the Fujitsu ARM computer, which exists and works right now.

Therefore the comparison is meaningless. The Fujitsu ARM CPUs have about the same power efficiency as the NVIDIA Volta GPUs, which is unprecedented for a CPU. Of course, the new NVIDIA Ampere will surpass them and take first place in power efficiency again, but this time by a much smaller margin than over past CPUs.

The current Intel Cascade Lake and Cooper Lake CPUs, when using AVX-512, have about one-third the power efficiency of either the NVIDIA Volta GPUs or the Fujitsu ARM CPUs.

Of course, the high power efficiency of the Fujitsu ARM CPUs has much less to do with implementing the ARM instruction set than with the fact that they implement the new SVE instruction set extension for vector computation.
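
For those who haven't seen SVE: it is vector-length-agnostic, so the same binary exploits whatever vector width the hardware has. A minimal sketch with ARM's ACLE intrinsics (assumes an SVE-capable compiler, e.g. gcc -march=armv8-a+sve; illustrative only):

```c
/* Vector-length-agnostic daxpy (y = a*x + y) using SVE intrinsics.
   A sketch - assumes <arm_sve.h> and an SVE-capable toolchain. */
#include <arm_sve.h>
#include <stdint.h>

void daxpy(double a, const double *x, double *y, int64_t n)
{
    for (int64_t i = 0; i < n; i += svcntd()) {   /* doubles per vector */
        svbool_t pg = svwhilelt_b64_s64(i, n);    /* predicate off the tail */
        svfloat64_t vx = svld1_f64(pg, &x[i]);    /* predicated loads */
        svfloat64_t vy = svld1_f64(pg, &y[i]);
        vy = svmla_n_f64_m(pg, vy, vx, a);        /* vy += vx * a */
        svst1_f64(pg, &y[i], vy);                 /* predicated store */
    }
}
```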
 
I don't think any of those open-source ISAs are close to the same maturity level as ARM.
ARM wasn't built in a day either; give them some time. RISC-V's biggest problem for now is that it is still mostly an academic curiosity, and as such, the ISA tends to get extensive revisions when devs run into issues writing software or designing CPUs and tweak the ISA to smooth those out.

I wouldn't put too much faith in large corporations necessarily faring a whole lot better, since they would be under considerable internal pressure to get some sort of product out the door instead of minimizing design and performance roadblocks from ISA to silicon and software.
 
RISC-V's biggest problem for now is that it is still mostly an academic curiosity
Why do you say that?

the ISA tends to get extensive revisions when devs run into issues writing software or designing CPUs and tweak the ISA to smooth those out.
Did you check out the revision history of RISC-V? They're already up to v2.1:


Here are some benchmarks of a board running Linux on RISC-V from over 2 years ago:


Of course, the performance is nothing to write home about, but the fact that they could already boot the OS and run a benchmark suite back then says something about maturity.

I wouldn't put too much faith in large corporations necessarily faring a whole lot better, since they would be under considerable internal pressure to get some sort of product out the door instead of minimizing design and performance roadblocks from ISA to silicon and software.
That's why it's good to have open source OS kernels and toolchains - because they have gatekeepers who block low-quality patches, forcing contributors to set aside the schedule and resources to do it right.
 
It becomes even clearer when you consider that Nvidia has said they are porting CUDA to run on the ARM instruction set.
They meant as a host CPU - analogous to how it currently runs on x86 CPUs, in order to utilize their GPUs.
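
In other words, the host-side code is unchanged - you just compile it for aarch64. A minimal sketch using standard CUDA runtime calls (kernel launch omitted; purely illustrative):

```c
/* Host-side CUDA usage is identical whether the host CPU is x86 or ARM -
   a sketch using standard CUDA runtime calls (kernel launch omitted). */
#include <cuda_runtime.h>
#include <stddef.h>

int copy_to_gpu(const float *host_buf, size_t n)
{
    float *dev_buf = NULL;
    if (cudaMalloc((void **)&dev_buf, n * sizeof *dev_buf) != cudaSuccess)
        return -1;
    /* The host CPU just drives the API; the GPU does the heavy lifting. */
    cudaMemcpy(dev_buf, host_buf, n * sizeof *dev_buf, cudaMemcpyHostToDevice);
    /* ... launch kernels, copy results back ... */
    cudaFree(dev_buf);
    return 0;
}
```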


Then consider that ARM-based CPUs have been made with 80 cores and 4-way SMT. It seems to me that what Nvidia wants to do is make a massively parallel MCM-based machine. They will go with at least 4-way SMT, but I think they will aim for 8 or 16-way. This would make one ARM CPU look like a streaming multiprocessor (SMs are currently 64-way SMT).
Like @Jimbojan said, that won't deliver competitive perf/W. General purpose CPUs are optimized for single-thread performance, whereas GPUs are throughput-optimized. The differences cut very deep. Failure to recognize this is what spelled doom for Intel's Xeon Phi.

That's where I think Nvidia is going with this and they want Arm as a design resource more than anything else.
I think Nvidia wants to own the full stack for HPC and embedded/self-driving SoCs, and they want to disadvantage Intel and AMD from moving into those markets with ARM cores of their own.
 
you will see a large group of companies come together to produce one standard ISA. I could see Amazon, Apple, and Google joining together to make one ISA.
Google is a platinum member of the RISC-V Foundation:



Granted, Apple and Amazon aren't there, as they're both firmly in the ARM camp, but is that group of companies not large enough for you?
 
Apple moving to ARM, which would be owned by Nvidia, would mean they would probably be forced to support Nvidia GPUs & CUDA.
Apple is already designing their own GPUs for mobile, which they'd probably rather scale up than give Nvidia even more leverage over them.

And even if they did support Nvidia, Apple has deprecated all GPU APIs besides their own Metal. Even OpenGL and OpenCL (which they co-founded!) are out! So, certainly don't get your hopes up for them embracing CUDA.
 
The Fujitsu ARM CPUs have about the same power efficiency as the NVIDIA Volta GPUs, which is unprecedented for a CPU.
Where do you get that? Last I heard, they (at 7 nm) were worse than Volta (12 nm) by a factor of 2.

CPUs cannot beat a GPU at its own game, and vice versa. That project made sense for reasons beyond the purely technical: it avoided depending on non-Japanese vendors for critical IP.

the high power efficiency of the Fujitsu ARM CPUs has much less to do with implementing the ARM instruction set than with the fact that they implement the new SVE instruction set extension for vector computation.
In this case, I agree.
 
Where do you get that? Last I heard, they (at 7 nm) were worse than Volta (12 nm) by a factor of 2.

CPUs cannot beat a GPU at its own game, and vice versa. That project made sense for reasons beyond the purely technical: it avoided depending on non-Japanese vendors for critical IP.


In this case, I agree.

See the latest Green500 list, where Fujitsu has 16.876 Gflops/watt, while the best NVIDIA Volta system has 16.285 Gflops/watt.

https://www.top500.org/lists/green500/2020/06/

For a short time Fujitsu was in 1st place, but it has since been surpassed by 3 systems - one with the new NVIDIA Ampere and 2 with custom accelerators - so none of the systems now ahead of Fujitsu use general-purpose CPUs.

The efficiencies quoted above are real, measured efficiencies of the complete systems.

If you compare just the NVIDIA Volta GPU package with the Fujitsu CPU package (both have HBM memory inside the package), the theoretical efficiency of Volta is a little better than Fujitsu's, but for the entire system Fujitsu is better, because it does not spend power on the extra host CPUs that a GPU requires.
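
The arithmetic behind those numbers is trivial - efficiency is just measured Rmax divided by measured system power. A sketch in C, using the approximate A64FX prototype figures as I remember them (~1999.5 TFlops Rmax at ~118.5 kW, which is where the 16.876 figure comes from):

```c
/* Green500 efficiency = measured Rmax / measured system power.
   Figures below are the A64FX prototype's, approximately recalled. */
#include <stdio.h>

static double gflops_per_watt(double rmax_tflops, double power_kw)
{
    return rmax_tflops / power_kw;   /* TFlops/kW == GFlops/W numerically */
}

int main(void)
{
    printf("%.3f GFlops/W\n", gflops_per_watt(1999.5, 118.48)); /* ~16.876 */
    return 0;
}
```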
 
Why do you say that?
How many full-production RISC-V ASICs are out in the wild? The 2.1 spec was published in December 2019 and broke binary compatibility with the 2.0 spec by changing how long instructions are encoded, so any chips designed prior to that will require code compiled with an alternate code path. Most RISC-V development kits I can find are either software emulators or FPGAs, not baked-in ASICs.
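
For reference, length is signaled in the low bits of the first 16-bit parcel; here is a sketch of the current scheme as I understand it (the longer formats are reserved, and those are the sort of thing that has shifted between revisions):

```c
/* Decode a RISC-V instruction's length from the low bits of its first
   16-bit parcel, per the current spec's standard scheme (a sketch). */
#include <stdint.h>

static int riscv_inst_len_bytes(uint16_t first_parcel)
{
    if ((first_parcel & 0x03) != 0x03) return 2; /* compressed (C extension) */
    if ((first_parcel & 0x1c) != 0x1c) return 4; /* standard 32-bit */
    if ((first_parcel & 0x3f) == 0x1f) return 6; /* reserved 48-bit format */
    if ((first_parcel & 0x7f) == 0x3f) return 8; /* reserved 64-bit format */
    return -1;                                   /* longer/reserved encodings */
}
```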
 
If I were an ARM licensee, I'd be looking into RISC-V and the few other truly open ISAs out there to make sure I don't get screwed again by ISAs getting locked up behind prohibitively steep licensing fees after another buy-out.

Using an alternative ISA doesn't shield one from patent-infringement lawsuits. The endeavor you described requires taking on ARM and Intel simultaneously - a kamikaze mission.
 
Using an alternative ISA doesn't shield one from patent-infringement lawsuits. The endeavor you described requires taking on ARM and Intel simultaneously - a kamikaze mission.
Between the CPU-related patents that have expired and the amount of stuff Sun/Oracle and IBM have put into the public domain by opening SPARC and POWER, it should be perfectly feasible to design a pretty decent chip while relying exclusively on 20+ year-old patents. There hasn't been any fundamentally new advance in CPU design since out-of-order execution ~25 years ago; the next-largest single-feature performance bump since then was integrated memory controllers.

ARM itself appears to be integrating new features into its architecture at about the same pace that the relevant patents are expiring, which puts it in a rather weak position to litigate any fundamental CPU tech. You shouldn't have to worry about them unless you want to rip off their HDL, in which case your main concern would be copyrights instead of patents.
 
To me, Nvidia's intentions for Arm are fairly obvious. They want to be even bigger in supercomputers, and the ARM architecture is very well suited for those because of its low power consumption. It becomes even clearer when you consider that Nvidia has said they are porting CUDA to run on the ARM instruction set. Then consider that ARM-based CPUs have been made with 80 cores and 4-way SMT. It seems to me that what Nvidia wants to do is make a massively parallel MCM-based machine. They will go with at least 4-way SMT, but I think they will aim for 8 or 16-way. This would make one ARM CPU look like a streaming multiprocessor (SMs are currently 64-way SMT). Then they will put 64 of these on a chip, which will be good for 512 or 1024 threads. Then they will put 8 of those chips on one MCM and have 4096 or 8192 threads available on one module. This will look like just another GPU to CUDA code. The biggest benefit is that it won't be a co-processor, so data transfers will be minimized. Altogether, this architecture will be astonishingly fast and will change the design of supercomputers for years to come.

That's where I think Nvidia is going with this and they want Arm as a design resource more than anything else.
Go look at the Top500 - only one system runs ARM, and the rest almost all run Nvidia GPUs. Their intention is to control the ARM market.

Superscalar execution - more than one instruction per clock - is ancient at this point; SPARC & MIPS did that years ago, and nothing in ARM makes it easier. Cool story, already seen it, watched the 25-year remaster, and now it's just old news.
 
How many full-production RISC-V ASICs are out in the wild? The 2.1 spec was published in December 2019 and broke binary compatibility with the 2.0 spec by changing how long instructions are encoded, so any chips designed prior to that will require code compiled with an alternate code path. Most RISC-V development kits I can find are either software emulators or FPGAs, not baked-in ASICs.
Bit User is crazy for RISC-V - he thinks it's the next biggest, bestest thing since Donald Trump.

RISC-V is cool and all, but I don't ever see it getting much past the novelty stage. You might have to explain FPGAs to him.
 
Where do you get that? Last I heard, they (at 7 nm) were worse than Volta (12 nm) by a factor of 2.

CPUs cannot beat a GPU at its own game, and vice versa. That project made sense for reasons beyond the purely technical: it avoided depending on non-Japanese vendors for critical IP.


In this case, I agree.
Process doesn't matter - TSMC's "7nm" is 10nm-class and as mature as GF's 12nm; they are virtually tied. Like the 300W "7nm" Vega VII that got embarrassed by one of Nvidia's 2070s. Process doesn't matter.
 
Apple is already designing their own GPUs for mobile, which they'd probably rather scale up than give Nvidia even more leverage over them.

And even if they did support Nvidia, Apple has deprecated all GPU APIs besides their own Metal. Even OpenGL and OpenCL (which they co-founded!) are out! So, certainly don't get your hopes up for them embracing CUDA.
Apple is vertically integrating - shrinking the supply chain and reducing the number of vendors. AMD was removed from some of their latest builds, and with a walled garden, they can do as they please - paying for patents / licensing is not a problem for a company sitting on $300B.

Doesn't need to be RDNA or CUDA - can be called Honeycrisp if they want.
 
Google is a platinum member of the RISC-V Foundation:



Granted, Apple and Amazon aren't there, as they're both firmly in the ARM camp, but is that group of companies not large enough for you?
Where is the silicon? Google is all-in on TPUs, and we see tons of Tensor units. Most big companies support a lot of stuff out in left field. RISC-V will remain a niche novelty for the foreseeable future - much like ARM on the Windows desktop. Yeah, MS supports it, but do they really?
 
If Nvidia owns ARM, it will only be a matter of time before the other companies that license ARM (Apple, Qualcomm, Amazon, etc.) get screwed with heavy licensing fees and are basically forced to buy ARM chips made by Nvidia. Selling to a direct competitor is a bad idea, IMHO.
The sale won't happen - it's terrible for ARM and will be nothing but an albatross around NVidia's neck - since the ARM licensees will flee.