NVIDIA GT300 Taped Out

4745454b

Titan
Moderator
Hmmm, weird. First, it's great news if true. If they haven't taped it out by now, we most likely wouldn't see parts by the end of this year. I'd feel better hearing that they have taped out the finalized silicon, but at least they have something. This part didn't make sense to me, however.

512 cores, 512-bit memory controller

I thought this card was to use GDDR5? Why are they using a 512-bit memory controller if they are using GDDR5? I realize they more than doubled the number of shaders and have plenty of die pad space, but the bus width seems much too high. I also wonder about yields. The move to 40nm isn't going to allow for a lot of extra dies per wafer; they should get even fewer than they do now. First because the cores will be twice as large (512 shaders vs 240), and second because I doubt the yields are as good at 40nm as they are at 55nm.

More info, I can't wait.
 
The larger bus width would help with CUDA, thanks to being able to send more data over the bus per LOAD operation, so it makes sense NVIDIA would favor a larger bus. They still might go with GDDR5 for overkill value, though...

This card sounds more and more like a monster. Let's see if ATI can compete with this generation's equivalent of the 8800GTX...
 

jennyh

Splendid
It's possible, just.

Nvidia might be banking on the never-ending stupidity of people, which is a pretty safe bet tbh.

The problem is, we've had over a year of seeing 'as good as' cards costing half the price. How stupid are people? Because I can assure you that if those specs are anywhere near true, then the first G300s are going to be released near the $500 mark, just like the bad old days before ATI brought sense and decency to the market.
 

The_Blood_Raven

Distinguished


So nVidia are going to charge us hundreds of dollars more for extra bandwidth and all these other CUDA-related features that won't help in gaming, only in a very limited set of applications? If this is true then ATI should do just fine...
 

4745454b

Titan
Moderator
Actually, I don't understand what gamerk wrote at all. 256-bit GDDR5 provides more bandwidth than 512-bit GDDR3. By leaving it at 512-bit you have so much bandwidth I doubt the shaders will be able to use it all (though the program being run should matter).

And as I write this, someone else comes along posting that AMD is also moving back to 512-bit with GDDR5. I must be missing something here; perhaps 512 shaders is enough to use it.
 
Here's an example. This was written about CPUs, but it makes the point I want to make clear, as the same principles apply:

http://www.pcguide.com/ref/cpu/arch/extData-c.html

Every bus is composed of two distinct parts: the data bus and the address bus. The data bus is what most people refer to when talking about a bus; these are the lines that actually carry the data being transferred. The wider the data part of the bus, the more information that can be transmitted simultaneously. Wider data buses generally mean higher performance. The speed of the bus is dictated by the system clock speed and is the other main driver of bus performance.

The bandwidth of the data bus is how much information can flow through it, and is a function of the bus width (in bits) and its speed (in MHz). You can think of the data bus as a highway; its width is the number of lanes and its speed is how fast the cars are traveling. The bandwidth then is the amount of traffic the highway can carry in a given unit of time, which is a function of how many lanes there are and how fast the cars can drive in them.

Memory bus bandwidth is extremely important in modern PCs, because it is often a main bottleneck to system performance. With processors today running so much faster than other parts of the system, increasing the speed at which data can be fed to the processor from the "outside" usually has more of an impact on overall performance than speeding up the processor itself. This is why, for example, a Pentium 150 is not much faster than a Pentium 133; the P150 runs on a 60 MHz memory bus and the P133 on a 66 MHz bus. 10% more clock speed on the system bus improves overall performance much more than a 10% faster processor.
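To make that concrete, here's a minimal sketch of the width-times-speed idea (my own illustrative numbers: the Pentium's external data bus is 64 bits wide, and I'm assuming one transfer per clock, so take it as a rough sketch rather than exact specs):

    # Peak bandwidth = bus width (bytes) * transfer rate (MHz), in MB/s
    def bandwidth_mb_s(width_bits, rate_mhz):
        return width_bits / 8 * rate_mhz  # assumes 1 transfer per clock

    print(bandwidth_mb_s(64, 60))  # P150's 60 MHz bus: 480 MB/s
    print(bandwidth_mb_s(64, 66))  # P133's 66 MHz bus: 528 MB/s, ~10% more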

Now, here's the issue. Most of the time, ATI gains all the performance back because most of the data will be stored in the GPU's RAM, which is much faster compared to NVIDIA's solution. However, it is possible to load data directly into a GPU's (or CPU's) registers, execute, and send the data on again without having to store the value in the GPU's RAM. Anything mathematical would theoretically run better on a wider data bus (less time getting the necessary values from main memory, and streamlined equations can be done almost exclusively in the GPU's registers). Coincidentally, NVIDIA is going for the GPGPU solution, and already has an API that executes mathematical formulas on the GPU (PhysX). I've actually wondered for some time how much slower PhysX would run on an ATI card with a smaller data bus and faster RAM... (maybe why ATI won't support it???)

Basically, the larger bus would be useless for gaming as it exists now. For CUDA/PhysX, a larger bus is a necessity.
 

4745454b

Titan
Moderator
Except that we (should) all know you can decrease the one without harming performance if you increase the other. The 7600GT had about the same performance as the 6800GT, even though the 7600GT used a 128-bit bus while the 6800GT used a 256-bit memory bus. This is possible because the 7600GT's memory bus was faster than the 6800GT's. A 256-bit memory bus running at 500MHz moves the same amount of data as a 128-bit memory bus running at 1GHz. A 512-bit memory bus running GDDR5 moves four times the amount of data as a 512-bit memory bus running GDDR3. This is the part I'm having trouble understanding. Nvidia's "old" cards already used 448-bit and wider memory buses, and now, even with the move to GDDR5, they are increasing the bus as well. It seems like overkill to me. If my math is right, to equal a 512-bit GDDR3 bus at the same clock speeds, it would only need to be 128 bits wide with GDDR5.
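As a sanity check on the width-vs-clock trade-off, a quick sketch (illustrative numbers, assuming one transfer per clock):

    # Halve the width, double the clock: bandwidth comes out identical.
    def gb_s(width_bits, rate_mhz):
        return width_bits / 8 * rate_mhz / 1000  # GB/s

    print(gb_s(256, 500))   # 256-bit @ 500MHz -> 16.0 GB/s
    print(gb_s(128, 1000))  # 128-bit @ 1GHz   -> 16.0 GB/s, same number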

Having a 512-bit wide bus tells me either they didn't move to GDDR5, or the memory is feeding such a monster chip that they need to give it monster memory access. Here's to hoping.
 
Correct, except when you remember that most of the data going to the GPU comes from main memory. Every time the GPU does a LOAD operation, you need to access that memory via the SMB (Shared Memory Bus, or whatever they call it these days...). Of course, as only one device can access main memory at any one time, having a faster bus risks causing memory-related delays that a wider bus can avoid.

Heck, I personally think the current structure of PCs is not going to last much longer, due to all the potential bottlenecks on the system buses, the memory bus being the primary culprit. That's another debate though...
 

It's actually a doubling, not a quadrupling. Other than that, everything you said is right. A 256-bit bus of GDDR5 will be roughly equal to a 512-bit GDDR3 bus (e.g. HD4870 vs GTX 280).
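To put rough numbers on that comparison (clocks quoted from memory, so treat them as approximate; the multipliers assume GDDR3 moves 2 bits per pin per clock and GDDR5 moves 4):

    # HD4870: 256-bit GDDR5 @ 900MHz command clock
    hd4870 = 256 / 8 * 900e6 * 4 / 1e9   # ~115.2 GB/s
    # GTX 280: 512-bit GDDR3 @ 1107MHz
    gtx280 = 512 / 8 * 1107e6 * 2 / 1e9  # ~141.7 GB/s
    print(hd4870, gtx280)  # same ballpark, as the rough equivalence predicts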
 

4745454b

Titan
Moderator
Of course, as only one device can access main memory at any one time, having a faster bus risks causing memory-related delays that a wider bus can avoid.

I don't understand this at all. If you need to transfer 1GB of data, what difference does it make if you do so using a 256-bit bus @ 500MHz or a 128-bit bus @ 1GHz? They have the same amount of throughput, so what difference does it make?
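To spell out why I think they're equivalent, the transfer time for that 1GB comes out identical either way (a toy calculation, ignoring latency and protocol overhead):

    # Time = data size / (width in bytes * clock rate), 1 transfer per clock
    data = 1e9  # 1GB
    print(data / (256 / 8 * 500e6))   # 256-bit @ 500MHz: 0.0625 s
    print(data / (128 / 8 * 1000e6))  # 128-bit @ 1GHz:   0.0625 s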

It's actually a doubling, not a quadrupling. Other than that, everything you said is right. A 256-bit bus of GDDR5 will be roughly equal to a 512-bit GDDR3 bus (e.g. HD4870 vs GTX 280).

When you look at video cards, why do they say 900MHz, 3600Mbps? Sure seems like it's 4 times faster. (Wait, I bet I know the answer: actual vs effective speed, or you have to take DDR into account.)

http://www.newegg.com/Product/Product.aspx?Item=N82E16814102801
 

The thing is, 900MHz GDDR3 is 1800Mbps per pin. So 900MHz, 3600Mbps GDDR5 is only double the speed, not quadruple. It gets even more complicated if you look at the various clock domains in each memory architecture, but as a rough approximation, GDDR5 is double the speed of GDDR3.
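A quick per-pin calculation shows where the factor of two comes from (assuming the usual 2 transfers per clock for GDDR3 and 4 for GDDR5):

    clock_mhz = 900
    gddr3 = clock_mhz * 2  # 1800 Mbps per pin
    gddr5 = clock_mhz * 4  # 3600 Mbps per pin
    print(gddr5 / gddr3)   # 2.0 -> double, not quadruple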