News Arm Co-Founder: Nvidia Owning Arm Would Be a Disaster

To me, Nvidia's intentions for Arm are fairly obvious. They want to be even bigger in supercomputers, and the ARM architecture is very well suited for those because of its low power consumption. It becomes even clearer when you consider that Nvidia has said they are porting CUDA to run on the ARM instruction set. Then consider that ARM-based CPUs have been made with 80 cores and 4-way SMT.

It seems to me what Nvidia wants to do is make a massively parallel MCM-based machine. They will go with at least 4-way SMT, but I think they will aim for 8- or 16-way. This would make one ARM CPU look like a streaming multiprocessor, which are 64-way SMT currently. Then they will put 64 of these on a chip, which will be good for 512 or 1024 threads, and 8 of those chips on one MCM, giving 4096 or 8192 threads on one module. To CUDA code this will look like just another GPU. The biggest benefit is it won't be a co-processor, so data transfers will be minimized. Altogether, this architecture will be astonishingly fast and will change the design of supercomputers for years to come.
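To put the data-transfer point in concrete terms, here is a rough CUDA sketch (the kernel and buffer names are made up, purely for illustration). With managed memory today - or with a CPU that is itself the compute device, as I'm describing - the explicit copy round trips of the classic co-processor model simply go away:

```cpp
// Illustrative only: classic SAXPY, written without any explicit
// host-to-device or device-to-host copies.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;

    // Managed allocations are visible to both CPU and GPU, so the
    // cudaMemcpy round trips of the co-processor model aren't needed;
    // on a CPU that runs CUDA natively they wouldn't exist at all.
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);  // expect 4.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```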

That's where I think Nvidia is going with this and they want Arm as a design resource more than anything else.
The Ampere CPU you are referring to has 80 cores, no SMT, though they are planning a 128-core variant. That, however, is very much a server-oriented processor, having more in common with AMD's EPYC or Ryzen Threadripper Pro platforms than any consumer chips. Impressive, yes, but Nvidia would have different plans (not that I am ruling out them coming up with their own server CPU). The CUDA idea you speak of is very interesting, though I think at much lower thread counts at this point in time. Die shrinks and further advancements in the architecture will dictate whether they could pull off those kinds of thread counts.

That said, I would be very interested in seeing a desktop ARM-based solution from Nvidia. Windows having finally caught up with ARM support and offering a worthwhile version of 10 for ARM is nothing to sniff at (Windows has been the major driving force for desktop computing for decades, no matter how much Apple or Linux fans try to dismiss it). If they were to do a 32c/64t ARM processor at 3.0+GHz with dual-channel DDR5 and 20-40 PCIe 4.0 lanes at around 100W or less, that would be a very intriguing processor indeed.

As for the other ARM CPU Licensees, I really don't think Nvidia would be stupid enough to mess with them. With RISC-V as a very real option for them to move to, it would be in Nvidia's best interest to do what they can to keep ARM Licensees happy.
 
Process doesn't matter - TSMC "7nm" is 10nm class and as mature as GF's 12nm is - they are virtually tied. Like the 300W "7nm" Vega VII that got embarrassed by one of Nvidia's 2070s. Process doesn't matter.
The only reason TSMC's 7nm is considered 10nm class is because it is being compared to Intel's 10nm, and how many high-end CPUs does Intel have at 10nm? None, just some quad-core mobile variants. That is because they tried to cram too many new ideas into their 10nm process and yields have suffered greatly because of it. And now they're doing the same with their 7nm process, which is delayed until at least late 2022. Now their stock took a hit and AMD directly benefited from it.

As for AMD's GPU versus Nvidia, Nvidia just has a much better architecture at this point in time, and the memory subsystem is a major driving force behind it (read up on it, great stuff). RDNA2 will close the gap some more, but I still expect Nvidia's 3000 series to be at the top.
 
Between CPU-related patents that have expired and the amount of stuff Sun/Oracle and IBM have put into the public domain by opening SPARC and POWER, it should be perfectly feasible to design a pretty decent chip while relying exclusively on 20+ year-old patents - there hasn't been any fundamentally new advance in CPU design since out-of-order execution ~25 years ago; the next largest single-feature performance bump since then was integrated memory controllers.

Patents related to power management should be relatively fresh, since performance-per-watt didn't become an issue until the 2000's. Nothing in the SPARC or POWER portfolio would help in that regard.

Someone like Qualcomm probably has a large enough patent war-chest to deter a lawsuit. Then again, when you have that kind of clout, you can probably get reasonable terms for either ARM or Atom.
 
Google is a platinum member of the RISC-V Foundation:

Granted, Apple and Amazon aren't there, as they're both firmly in the ARM camp, but is that group of companies not large enough for you?

They have come together on paper and RISC-V has gained a lot of traction over the last year. However, nobody is really making CPUs, or has publicly said they are going to, outside of Alibaba. Alibaba are giving a talk at Hot Chips on their RISC-V CPU, which I bet will be interesting. I kind of hope Nvidia buys ARM, as that might be the push for a big investment from one of the big players. It would really be nice for competition if there was one open-source ISA from phones to servers.
 
Patents related to power management should be relatively fresh, since performance-per-watt didn't become an issue until the 2000's.
While the fanciest temperature/voltage-dependent clocking schemes may be covered under fresh patents, much of the stuff they are built on - different power states, variable clocks, clock gating - is 20+ years old too. There is no shortage of expired power management patents.

Power-efficiency on the desktop may have come to the forefront after 2000 but research in the area started for mobile and other power-conscious chips long before that.
 
The only reason TSMC's 7nm is considered 10nm class is because it is being compared to Intel's 10nm, and how many high-end CPUs does Intel have at 10nm? None, just some quad-core mobile variants. That is because they tried to cram too many new ideas into their 10nm process and yields have suffered greatly because of it. And now they're doing the same with their 7nm process, which is delayed until at least late 2022. Now their stock took a hit and AMD directly benefited from it.

As for AMD's GPU versus Nvidia, Nvidia just has a much better architecture at this point in time, and the memory subsystem is a major driving force behind it (read up on it, great stuff). RDNA2 will close the gap some more, but I still expect Nvidia's 3000 series to be at the top.

There is zero reason whatsoever to think AMD will come out on top at some point - it's unlikely, since they spend very little on R&D compared to Nvidia's $700M per quarter. It was only a couple of quarters ago that AMD cleared a massive $38M net for the quarter - most of which seems to have gone into a bonus for Dr. Su.

Whether you are building cars, airplanes or semiconductors, heavy spending on R&D is necessary. R&D spending deficits don't show up right away - they are a time bomb. For all the hoopla over a stock with less than $7B per year in sales, a 200+ PE and a $100B valuation (besides being set up for a pump and dump), AMD still doesn't have the $$$ to fund an R&D department split between its CPU and GPU divisions. Nvidia spends $2.8B on R&D per year.

As far as Intel - definitely glad they got rid of the worst hire Brian ever did - Murthy was incompetent and the C levels were incompetent in not noticing and acting on his incompetence.

Rocket Lake - Later this year
Alder Lake - sometime in the 2nd half of next year - so a year apart from RKL. The timeline for the Alder Lake release was never nailed down - it was "2021" - that's as detailed as it got.
Meteor Lake - the 7nm shrink of Alder Lake - a year after Alder Lake - so 2nd half of 2022... where is the delay again?

The big issue is that the yields are not good enough for Xe HPC - which is going into the exascale system - along with Sapphire Rapids which is a reunification of the Xeon line under 1 arch (vs Cooper Lake/14nm/PCIe3/4-8S & Ice Lake SP/10nm+ & PCIe4 / 1-2S) - which is probably 3rd/4th quarter 2021. Granite Rapids (7nm shrink of Sapphire Rapids) would be a year after that - so 3rd / 4th Quarter 2022... Again.. where is the change, other than the early Xe HPC not being ready on full EUV 7nm?

So we were going to have Ice Lake SP, then a couple months later Sapphire Rapids, and then a couple months later Granite Rapids? I never saw that cadence outlined in a slide - but that is what it would have to have been if not for the 7nm issues.

Did you think you were going to get Alder Lake and then a few months later Meteor Lake? Did you think that Granite Rapids would be out so quickly after Sapphire Rapids? (If that was the case, then why is Sapphire Rapids in the Cray sled rather than Granite Rapids?)

Only in the Casino of Wall Street can a company that posts record quarter after record quarter get hit, while a perennial dumpster fire which FINALLY got a CEO who can execute the basics (that is not an insult, quite a few C levels don't) gets its stock price buoyed by a 200x PE and over-exuberance from the half-wits on Wall Street. ~$7B in revenue vs $76B in revenue. 200:1 vs 10:1.

That high PE indicates that at some point AMD has to start to deliver - saying they have the highest market uptake since 2013 is newsworthy, but 10x 0.8% = 8% doesn't seem that impressive when enumerated. They were bound to pick up SOME. Epyc will always be hampered by the architecture itself - a dual Epyc system loaded down presents as 4 NUMA domains, with performance nosediving - which is one of the reasons that Epyc will never reach the uptake of the upcoming Ice Lake (dual CPU, 128 exposed PCIe4 lanes (same as dual Epyc), 8ch DDR4 ECC (same as dual Epyc), a few fewer cores, and under load it doesn't devolve into 4 NUMA domains)...

I am sure I am going to get all kinds of wonderful responses - don't waste your breath, I won't read them, won't be on for a few months. So bask in the artificial glow of your wonderful, damaged, AMD. Intel has issues - but when TSMC starts in with cobalt and really gets into full-stack EUV, it is likely their cadence will also slow - 5nm/4nm is where Intel 7nm is - and cobalt (copper is dead at those levels - much too high impedance/power usage) is another can of worms that TSMC has not tackled to the extent that Intel has (M0/M1 are full cobalt). Merry Xmas in advance, my special little people.

AMD GOOD! INTEL BAD! NVIDIA BAD! AMD GOOD! - there that should make the shills happy.
 
Does this guy always have to find a reason to bash or downplay anything and everything AMD says or does? Man, it's kind of pathetic. He must either have Intel stock or be paid by them; I haven't seen him post anything positive towards AMD, ever.
"So bask in the artificial glow of your wonderful, damaged, AMD. Intel has issues" - right now, Intel is the one that is damaged and has issues. 10nm STILL isn't working where Intel wants or needs it to, 7nm looks like it is also messed up, and Deicidium369 seems to be practically the ONLY one saying Intel is fine and back on track. "I won't read them, won't be on for a few months" - good, then we won't have to suffer through your BS posts about how Intel is the king of everything, and all the other BS you post. I hope you don't come back, to be honest.
 
The Ampere CPU you are referring to has 80 cores, no SMT, though they are planning a 128-core variant. That, however, is very much a server-oriented processor, having more in common with AMD's EPYC or Ryzen Threadripper Pro platforms than any consumer chips. Impressive, yes, but Nvidia would have different plans (not that I am ruling out them coming up with their own server CPU). The CUDA idea you speak of is very interesting, though I think at much lower thread counts at this point in time. Die shrinks and further advancements in the architecture will dictate whether they could pull off those kinds of thread counts.

That said, I would be very interested in seeing a desktop ARM-based solution from Nvidia. Windows having finally caught up with ARM support and offering a worthwhile version of 10 for ARM is nothing to sniff at (Windows has been the major driving force for desktop computing for decades, no matter how much Apple or Linux fans try to dismiss it). If they were to do a 32c/64t ARM processor at 3.0+GHz with dual-channel DDR5 and 20-40 PCIe 4.0 lanes at around 100W or less, that would be a very intriguing processor indeed.

As for the other ARM CPU Licensees, I really don't think Nvidia would be stupid enough to mess with them. With RISC-V as a very real option for them to move to, it would be in Nvidia's best interest to do what they can to keep ARM Licensees happy.

I got the numbers mixed up. I was actually thinking of Marvell's ThunderX processor with 96 cores and 4-way SMT for 384 threads. I think this is where Nvidia wants to go, specifically for HPC servers. The key thing is it would eliminate data transfers to co-processors. Using CUDA with an architecture like this would give GPU-level performance without the power draw of GPU co-processors. The cores would be much more capable than current GPU cores, and this would also help performance a lot, as it would be a lot less susceptible to the effects of thread divergence. In my day job I work on an industrial HPC app using CUDA, and I find the possibility of a machine like this fascinating, as it has the potential to be screaming fast. This would not be for video card GPUs. It would be for HPC server CPUs that would look like a GPU.
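To show what I mean by thread divergence, here is a toy CUDA kernel (everything about it is made up; it's just an illustration). On a GPU, threads in a warp that take different branches are serialized, so the warp pays for both paths; a fat SMT CPU core running the same code would only pay for the branch each thread actually takes:

```cpp
// Illustrative only: kernel, buffer and size names are made up.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void divergent(const int *flags, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (flags[i]) {
        // "Expensive" path: on a GPU, warp-mates with flags[i] == 0
        // are masked off and wait while this runs.
        float acc = 0.0f;
        for (int k = 0; k < 1000; ++k)
            acc += sinf(i * 0.001f + k);
        out[i] = acc;
    } else {
        // "Cheap" path, executed after the expensive one within a warp,
        // so the warp pays for both branches. An SMT CPU thread running
        // the same code pays only for the branch it actually takes.
        out[i] = 0.0f;
    }
}

int main() {
    const int n = 1 << 16;
    int *flags;
    float *out;
    cudaMallocManaged(&flags, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) flags[i] = i % 2;  // worst case: divergence in every warp

    divergent<<<(n + 255) / 256, 256>>>(flags, out, n);
    cudaDeviceSynchronize();
    printf("out[1] = %f\n", out[1]);

    cudaFree(flags);
    cudaFree(out);
    return 0;
}
```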
 
the amount of stuff Sun/Oracle and IBM have put into the public domain by opening SPARC and POWER,
Did they actually put it in the public domain, or just guarantee free use to anyone implementing their respective ISAs?

it should be perfectly feasible to design a pretty decent chip while relying exclusively on 20+ year-old patents
It needs to be more than merely decent, if we're talking about anything besides heavily cost-constrained markets.

there hasn't been any fundamentally new advance in CPU design since out-of-order execution ~25 years ago, the next largest single-feature performance bump since then was integrated memory controllers.
Somehow, I think the sheer volume of IP dredged up in a patent search would beg to differ.

Just off the top of my head, I'm guessing quite a bit has been done on power management and things like uOP-fusion, crypto acceleration, virtualization, and aspects related to multi-core. But, I'm sure there are also quite a few other areas.

ARM itself appears to be integrating new features into its architecture at about the same pace that relevant patents are expiring, which puts it in a rather weak position to litigate any fundamental CPU tech.
A major piece of its business model centers around the idea that you can't even build your own implementation of their ISA, without paying royalties. That wouldn't work if they didn't have patents that are virtually essential for implementing it.
 
Super scalar - more than 1 instruction per clock is ancient at this point - SPARC & MIPS did that years ago - nothing in ARM makes that easier. Cool story, already seen it, watched the 25 year remaster and now it's just old news.
The ARMv8-A ISA is cheaper to decode and has a larger GP register file, which potentially simplifies register renaming logic and reduces scheduling logjams. It also has relaxed memory ordering guarantees, which again simplifies scheduling and allows for greater flexibility in reordering. I'm sure there are other benefits, besides.

Compare that with x86-64, which is already like 15 years old and just not designed for the technological challenges and opportunities of this era. ARMv8-A is a whole 10 years newer, and ARMv9-A is rumored to be launching in the next year or so.
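To make the relaxed-ordering point concrete, here is a minimal producer/consumer sketch in plain C++ (the names are made up, and there is nothing ARM-specific in the source). On x86-64's comparatively strong memory model the hardware has little freedom to reorder these accesses anyway; on ARMv8-A only the release/acquire pair constrains reordering, and everything else is free to be reordered by the core:

```cpp
// Illustrative only: a one-shot handshake between two threads.
#include <atomic>
#include <thread>
#include <cassert>

std::atomic<bool> ready{false};
int payload = 0;  // ordinary, non-atomic data

void producer() {
    payload = 42;                                  // plain store
    ready.store(true, std::memory_order_release);  // publishes payload
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    assert(payload == 42);  // guaranteed visible after the acquire
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
    return 0;
}
```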

I'd have expected you to be smart & wise enough to grasp the idea that details matter, and that the world isn't just one of absolutes. In a race for CPU performance & efficiency, numerous little advantages can accumulate into something substantial.

Bit User is crazy for RISC V - he thinks it's the next biggest bestest thing since Donald Trump.
Last warning: I consider this overt trolling, and will report the next time you do it.

I am interested in RISC V, but hardly a fanboy. I think it will gain significant traction, but I'm not sure it has what's needed to truly contend for a significant market share (outside of embedded or other fairly niche markets). RISC V might end up being more significant as a stepping stone to whatever follows it - RISC VI, or maybe something arising through an entirely different consortium or organization.

As for Trump, that's completely off-topic in this thread.

You might have to explain FPGAs to him.
This is also the last time I'm going to tell you that I know quite well what FPGAs are. I even dabbled in a bit of Verilog, at one point.
 
Process doesn't matter - TSMC "7nm" is 10nm class and as mature as GF's 12nm is - they are virtually tied.
No, TSMC 7 nm is not tied with GF 12 nm.

Like the 300W "7nm" Vega VII that got embarrassed by one of Nvidia's 2070s. Process doesn't matter.
That's absurd. Of course process matters - it's just not the only factor.

I don't know which benchmark you're talking about, but let's leave that aside and consider the case of comparing like-with-like. Just look at how Radeon VII compares with Vega 64. Both are basically the same architecture and the Vega 64 even has 4 more CUs (a 6.7% advantage), yet Radeon VII beats it hands-down in every benchmark.
 
Where is the silicon? Google ...
You're taking that out of context, where I was pointing out that Google is already a member of a consortium of the sort that James Sneed was talking about.

Google is first & foremost a services company - not a chip maker. They make chips where they find competitive advantage in doing so, but it's silly to think they would eventually build every chip they ever use. And just because they don't make a chip doesn't mean they have no stake in it - they want to influence future platforms for the sake of their own needs, as primarily a developer of the software that runs on them.
 
Windows has been the major driving force for desktop computing for decades no matter how much Apple or Linux fans try to dismiss it
Nobody is dismissing Windows in desktop computing. It's only the phone market, embedded and cloud computing, and all those laptops that are either Macs or Chromebooks. So, Windows can keep the desktop, while the world moves on...

Actually, what bothers me most about Windows is that even Microsoft no longer seems very committed to it, as an end in itself. It's hard for me to see Win 10 as anything but a step backwards in reliability, privacy, and performance. Meanwhile, MS seems more interested in using it as a vehicle to monetize me like an asset, with their spyware, app store, and all the cloud services they're trying to push.
 
Somehow, I think the sheer volume of IP dredged up in a patent search would beg to differ.
Can't invention-patent math, so I bet the vast majority of them could be invalidated in court by parties with the necessary funds to go all the way, which is why nearly all such cases end up with either a cross-licensing agreement or an NDA settlement to avoid the risk of mutually assured patent destruction by judges.

ISAs themselves are trickier since they are designs, not inventions. They can be locked down indefinitely using both design patents and copyrights.
 
ISAs themselves are trickier since they are designs, not inventions. They can be locked down indefinitely using both design patents and copyrights.
I assume ARM's strategy is to patent reasonable implementations of key portions or aspects of the ISA.

IIRC, maybe the GIF file format was ensnared in a similar way - the format itself wasn't patented, but the technique required for encoding images as GIFs was.
 
I assume ARM's strategy is to patent reasonable implementations of key portions or aspects of the ISA.
Can't patent math. So unless their implementation includes some unique and non-obvious novel trick, any such patents would be very weak compared to the ISA design patents and copyrights. ARM's core business is licensing the ISA itself, its reference CPU designs in HDL and fab-ready IP-core forms, and related stuff like GPUs.

IIRC, maybe the GIF file format was ensnared in a similar way - the format itself wasn't patented, but the technique required for encoding images as GIFs was.
And when Unisys started attempting to enforce its questionable LZW patent, with the added burden of having to explain why it waited 15+ years before doing so, it was forced to relax its licensing terms: from attempting to squeeze everyone, to free for consumers, FOSS and small developers, and likely far less than the cost of a successful defense for most of the rest, once software developers started ripping GIF functionality out of their products instead of bothering with a license or a court challenge they might not be able to afford regardless of defensibility.

Same goes with most of RAMBUS' patent portfolio - much of it wouldn't stand in court but companies license it anyway since that is still cheaper than a successful defense. Pay one patent troll to keep other trolls at bay.
 
https://en.wikipedia.org/wiki/RISC-V#Implementations


Ironically, Nvidia was actually one of the earlier RISC-V implementers I heard about. However, they've been building ARM cores for longer, and it's more strategically important to their product line.

I'm aware, but if you look at the implementations we are talking about little SoCs, controllers, etc. Nothing really for high performance except for Alibaba. RISC-V needs to focus more on the core IP instead of the ISA - of course, that is the hard part. ARM focused on the core IP and that's what paid off, as it makes it cookie-cutter for high-performance CPUs to use ARM.

The irony of Nvidia buying Arm and forcing everyone to mature RISC-V wasn't lost on me; I was aware Nvidia was a long-time RISC-V member.