News US wields the banhammer against sanctions-compliant Nvidia RTX 4090D 'Dragon' — updated law prohibits 70 teraflops or greater GPUs from export to C...

atomicWAR

Glorious
Ambassador
I have been waiting for this to happen. Regardless of whether you approve of the sanctions, the constant compute-lowering tactics by both sides are getting a little old. Why make a standard for manufacturers to follow, have them comply, only to lower the processing-power bar again so their new 'high-end' products must be redesigned to fall under the new maximum allowable limit? If you want sanctions, that's all fine and well, but make up your mind about what you're trying to limit, because you know companies will design products that walk right up to that line. Not that Nvidia is free of guilt here, but, AND I can't believe I am going to say this, they aren't doing anything that any other company wouldn't do.

Let's make an analogy. Let's say the government decided it wants to sanction soda cans sold in large quantities to country X. First it says nothing can be sold that contains more than eight cans of soda in a single package. So Coke drops its twelve-packs and starts selling newly designed eight-packs as its highest can count. Then a few months later the government goes, 'Hey, we see what you're doing here, trying to skirt right up to the edge of what the law allows, and we don't like it,' and lowers the bar again, this time so soda manufacturers can only sell packages of six cans. So if the government only ever wanted soda makers to sell six-packs, why not start there? This isn't political, it's just common sense. You make a rule/law... someone will always push right up to the edge of what is allowed.

If governments didn't want China to have anything faster than a 4080/7900 XTX, for example, then just say that, be it literally (only 4080s and 7900 XTXs) or via their compute numbers. smh at compute antics...
 
The problem is that the government can't specifically target a product or company by name. So it can define rules and say, "nothing above this level" and then it can say what products fail that criterion, but it can't say, "no RTX 4090 cards."

But I totally agree with the rest of what you're saying. It's asinine that the US DoC created rules, realized it didn't like how companies were complying with those rules, and so created new rules that affected more products... and then when products were still being sold, lowered the limit yet again. That's idiotic bureaucracy at its worst.

And an even bigger part of the problem here is that if the US wants to limit sales of AI and GPU hardware that can do 70 teraflops of FP32, well, China just needs twice as many 35+ teraflops GPUs. And so many GPUs have already been sold to China before the updated restrictions were put into place that it's an attempt to put the cat back in the box.

I felt like the initial rules were fine and were designed to prevent the sale of future GPUs to China. And they would have worked for that — nothing using Blackwell B200 will be allowed for sale in China. But trying to retcon the whole situation will never work.
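
To put rough numbers on the "twice as many slower GPUs" point, here's a quick back-of-the-envelope sketch in Python (the scaling-efficiency factor is a made-up assumption for illustration; real multi-GPU scaling losses come up later in the thread):

```python
import math

def gpus_needed(target_tflops, per_gpu_tflops, scaling_efficiency=1.0):
    """Rough count of GPUs needed to hit a target aggregate FP32 throughput.

    scaling_efficiency < 1.0 is a made-up fudge factor for the real-world
    penalty from inter-GPU communication (see the NVLink discussion below).
    """
    return math.ceil(target_tflops / (per_gpu_tflops * scaling_efficiency))

# One 70-teraflop GPU vs. several 35-teraflop GPUs:
print(gpus_needed(70, 35))        # 2 -- the naive "just buy twice as many" math
print(gpus_needed(70, 35, 0.7))   # 3 -- with a hypothetical 70% scaling efficiency
```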
 

Notton

Prominent
Okay, so hear me out.
The people implementing these sanctions have no idea what a TFLOP is.
They are out of touch with technology, and haven't bothered to do their homework by hiring an expert in the field.
They probably think a TFLOP is a physical object that gets consumed when it is used, and they applied sanctions the way they were historically done for consumables.

They are, literally, trying to put restrictions on how fast mathematics can be done.
I am sure we all know that 1+1=2, and all you have to do is buy 2x 4080s to circumvent this new rule.
 

Amdlova

Distinguished
GPUs don't scale well... you'd miss the SLI and CrossFire days.
The RTX 4090D is a jab at the American government. Nvidia should be sanctioned by Uncle Sam.
 

SirStephenH

Distinguished
The reason for the change is obviously because the "compliant" 4090D can easily be overclocked to full 4090 performance. Maybe instead of simply lowering the overall performance limit the government could, oh, I don't know, target the specific ways manufacturers can skirt the law.
 
GPUs don't scale well... you'd miss the SLI and CrossFire days.
The RTX 4090D is a jab at the American government. Nvidia should be sanctioned by Uncle Sam.
We’re not talking about scaling for games. AI tends to scale much better with multi-GPU, though the inter-GPU communications do become a serious bottleneck as you start moving to hundreds and thousands of GPUs.
The reason for the change is obviously because the "compliant" 4090D can easily be overclocked to full 4090 performance. Maybe instead of simply lowering the overall performance limit the government could, oh, I don't know, target the specific ways manufacturers can skirt the law.
First, the 4090D isn’t skirting the law. It was in full compliance. The law just changed (again) because of morons who don’t know how the tech industry works.

Second, I can guarantee it’s not about end user overclocking. That only hit the news recently, and this was in the works basically since the last changes happened. Large-scale installations are not going to bother with redlining the cards for 5-10 percent more performance if it compromises stability.

The problem is that the sanctions aren’t working as well as the govt would like and so they keep trying to plug the holes with their thumbs. Meanwhile, head-sized holes keep leaking.
 

tracker1

Distinguished
tracker1.dev
The govt should have set the limit at the second tier, like between the 7900 XT and XTX, just chopping off the highest end altogether.

I think some of the people involved just like the conflict and drama in the back and forth.

As TFA mentioned, it's likely not cutting off the resellers and side markets, just capping NVidia. Not that I mind knocking NVidia down a peg or two.
 

bit_user

Polypheme
Ambassador
First, the 4090D isn’t skirting the law. It was in full compliance.
Being overclockable to exceed the limit shows Nvidia acting in bad faith. IMO, they should get slapped with a fine amounting to multiple times the value of any such units sold in China. Not that they would feel it, right now, but such a violation should not pass without some response. Especially after all the noise they made about wanting to cooperate.
 
NV got caught playing the rules, so the US gov slaps them down with a lower ceiling to show the world it means business. NV has to rejig its production line, which costs them a few shekels, then plays the rules again; rinse and repeat until the US gov sets a really low ceiling and NV just says, OK, we'll play ball. This continues for a few years and a few generations of cards, until China says F.U., we've got our own stuff that's just as good, and there's no cash for western manufacturers.
 
I want to know how the government intends to stop them, when they are made in Taiwan and China is a 10-minute plane ride away. How will they even know?
Because Taiwan wants to at least appear to support the USA. They worry about China for much larger reasons. Then again, you constantly see stories about Chinese customs finding people with hundreds of CPU chips taped to their bodies. They could just pretend they don't see the guy with 10 4090s stuffed in his pants.

That, though, is too big a hassle. A large Chinese company just has another company it owns in a country that doesn't enforce US sanctions, say Vietnam, buy a pallet of 4090s and then ship them into China.
 
The problem is that the government can't specifically target a product or company by name. So it can define rules and say, "nothing above this level" and then it can say what products fail that criterion, but it can't say, "no RTX 4090 cards."

But I totally agree with the rest of what you're saying. It's asinine that the US DoC created rules, realized it didn't like how companies were complying with those rules, and so created new rules that affected more products... and then when products were still being sold, lowered the limit yet again. That's idiotic bureaucracy at its worst.

And an even bigger part of the problem here is that if the US wants to limit sales of AI and GPU hardware that can do 70 teraflops of FP32, well, China just needs twice as many 35+ teraflops GPUs. And so many GPUs have already been sold to China before the updated restrictions were put into place that it's an attempt to put the cat back in the box.

I felt like the initial rules were fine and were designed to prevent the sale of future GPUs to China. And they would have worked for that — nothing using Blackwell B200 will be allowed for sale in China. But trying to retcon the whole situation will never work.
I completely agree about the “use 2x 35 teraflop cards instead.” All the bureaucrats are doing is making Chinese companies pay more money for 2x the cards and use more electricity to reach 70 teraflops, forcing their coal-fired power plants to spew more emissions into the air. I thought protecting the environment was an important thing to do, but like most things in politics, it’s important…until it’s not.
 
If you're following this thread, please note that we have been contacted by Nvidia regarding these new "clarifications" from the government, which it turns out were about as clear as mud. We have updated the article introduction to reflect our new understanding of the terms and limits described. In short, the RTX 4090D isn't impacted by these changes at all, nor is the Hopper H20, and both GPUs remain sanctions compliant.
 
Because Taiwan wants to at least appear to support the USA. They worry about China for much larger reasons. Then again, you constantly see stories about Chinese customs finding people with hundreds of CPU chips taped to their bodies. They could just pretend they don't see the guy with 10 4090s stuffed in his pants.

That, though, is too big a hassle. A large Chinese company just has another company it owns in a country that doesn't enforce US sanctions, say Vietnam, buy a pallet of 4090s and then ship them into China.
The “guy with 10 4090s stuffed in his pants” part reminded me of a ’90s Columbine high school shooting documentary that was shown to us at school. In one segment they were trying to fear-monger by saying kids can easily conceal full-size weapons in their clothes. Then, on camera, in walks a kid in abnormally baggy clothing, with legs that couldn’t bend at the knee, wobbling to the center of the frame before pulling shotguns out of his pants. My reaction was, “Bro, this is the most disingenuous fail I have ever seen. Not even remotely realistic; how would anyone not see this kid coming from a mile away?”
 

bit_user

Polypheme
Ambassador
I completely agree about the “use 2x 35 teraflop cards instead.” All the bureaucrats are doing is making Chinese companies pay more money for 2x the cards and use more electricity to reach 70 teraflops
Except that's not how AI works. Have you ever read about the NVLink interconnect in modern Nvidia datacenter GPUs? Do you know why they build such a high-bandwidth (e.g. 900 GB/s) interconnect into them? Because it's needed!


Using PCIe @ 64 GB/s (NVLink's figures are bidirectional, so I quoted bidirectional PCIe 4.0 x16) will seriously bottleneck training performance on large models!

Also, there are only so many GPUs you can fit into a single chassis. As I explained, the old "mining" trick of using PCIe x1 links to connect each GPU won't work, because AI training actually needs that link's bandwidth. So, by restricting the maximum compute power of any single GPU, they're definitely restricting the performance at which China can train big models.
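
For a rough sense of how much that link bandwidth matters, here's a hypothetical back-of-the-envelope sketch (the model size and the "roughly 2x gradient volume per step" ring all-reduce approximation are illustrative assumptions, not measured figures):

```python
# Rough per-step gradient-sync time for data-parallel training, comparing
# NVLink-class bandwidth to a single PCIe 4.0 x16 link (both quoted bidirectional).
# Assumptions for illustration only: a 7B-parameter model with FP16 gradients,
# and ring all-reduce traffic of roughly 2x the gradient size per GPU per step.

params = 7e9                 # hypothetical model size (parameters)
grad_bytes = params * 2      # FP16 gradients: 2 bytes per parameter
traffic = 2 * grad_bytes     # approximate ring all-reduce volume per GPU per step

for name, gb_per_s in [("NVLink (~900 GB/s)", 900), ("PCIe 4.0 x16 (~64 GB/s)", 64)]:
    seconds = traffic / (gb_per_s * 1e9)
    print(f"{name}: ~{seconds * 1000:.0f} ms of gradient sync per training step")

# Prints roughly 31 ms for NVLink vs. ~440 ms for PCIe -- the slower link
# quickly starts to dominate step time on large models.
```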

I know you're a smart guy and it bugs you when someone tries to hold forth on a topic they know next to nothing about...
 
Except that's not how AI works. Have you ever read about the NVLink interconnect in modern Nvidia datacenter GPUs? Do you know why they build such a high-bandwidth (e.g. 900 GB/s) interconnect into them? Because it's needed!
One of the things I've gleaned is that the high interconnect bandwidths are often more useful for training than for inference. Certainly it can also help a lot with inference on very large models (e.g., GPT-4), but often inference gets quantized down to levels where a model only needs a relatively small amount of VRAM, so you can run multiple instances on a single GPU. (This is what the MIG stuff can be used for, among other things.)
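
As a quick illustration of that quantization point, here's a hypothetical sketch (the model size, VRAM budget, and overhead factor are assumptions, not figures from the article):

```python
def instances_per_gpu(params_billion, bytes_per_param, vram_gb, overhead=1.2):
    """Very rough count of model copies that fit in a given VRAM budget.

    overhead is a made-up fudge factor for activations/KV cache; real
    deployments (MIG partitions, batching, etc.) will differ.
    """
    model_gb = params_billion * bytes_per_param * overhead
    return int(vram_gb // model_gb)

# A hypothetical 7B-parameter model on a 24 GB card:
print(instances_per_gpu(7, 2.0, 24))   # FP16 weights: 1 instance (~16.8 GB)
print(instances_per_gpu(7, 0.5, 24))   # INT4 weights: 5 instances (~4.2 GB each)
```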

Obviously it's much less efficient to use a larger number of slower GPUs for a lot of AI work, but I believe there are also ways to work around that. Scaling isn't as good, and maybe it hits a maximum throughput at some point where adding more GPUs wouldn't even help, but I believe there are ways to utilize non-data-center GPUs in tandem to do training. How effective are they? I have no idea, but as the saying goes, necessity is the mother of invention.

We do know that China has apparently been getting disproportionately more 4090 GPUs before the ban went into effect, and that people were paying thousands of dollars ($3~$4 grand) for such cards at one point. Were those just rich people, or profiteers, or were they actually AI and government people trying to procure hardware? Again, we don't know, and I suspect no one who actually knows whether it's the latter would be saying anything.
 
Except that's not how AI works. Have you ever read about the NVLink interconnect in modern Nvidia datacenter GPUs? Do you know why they build such a high-bandwidth (e.g. 900 GB/s) interconnect into them? Because it's needed!

Using PCIe @ 64 GB/s (NVLink's figures are bidirectional, so I quoted bidirectional PCIe 4.0 x16) will seriously bottleneck training performance on large models!

Also, there are only so many GPUs you can fit into a single chassis. As I explained, the old "mining" trick of using PCIe x1 links to connect each GPU won't work, because AI training actually needs that link's bandwidth. So, by restricting the maximum compute power of any single GPU, they're definitely restricting the performance at which China can train big models.

I know you're a smart guy and it bugs you when someone tries to hold forth on a topic they know next to nothing about...
Well, forgive my ignorance, but wouldn’t they simply add additional chassis connected via NVLink switches? Multi-chassis AI servers are a thing, I thought.
 

bit_user

Polypheme
Ambassador
One of the things I've gleaned is that the high interconnect bandwidths are often more useful for training than for inference.
That doesn't mean you can't train using cards with only a PCIe interface, such as the RTX 4090 (SLI has been shown to be broken for a couple of generations now). See Tiny Corp.'s hardware, which teams 6x gaming GPUs (now either AMD or Nvidia) for training. However, forcing each card to be slower hampers this approach even further.

Well, forgive my ignorance, but wouldn’t they simply add additional chassis connected via NVLink switches? Multi-chassis AI servers are a thing, I thought.
I'm not up-to-date on the latest limits, but I'm pretty sure they clamped down on interconnect bandwidth a while ago.