Why doesn't NVIDIA make 512-bit and 448-bit cards today like in the Fermi era?

So, the 970 and 980 have launched. Great cards, though I'm a little disappointed that they have a 256-bit memory interface. Why doesn't NVIDIA produce wider memory interfaces now if they used to for the 5xx-series cards? 224 GB/s seems a little low compared to the 320 GB/s of the 290X and the 336 GB/s of the 780 Ti, even though the 980 is faster than the 780 Ti and has insane memory clocks - more than 7 GHz effective!
It just doesn't make sense to me that they're going for the small improvement in frequency rather than the large one in bus width. I saw the overclocking profile of the 290X Lightning and it was 1600 MHz memory (6.4 GT/s effective), which works out to around 410 GB/s! I was like, what the heck!
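For reference, here is the arithmetic behind those numbers (a rough sketch assuming standard GDDR5 rates; the Lightning line assumes its 1600 MHz command clock means a 6.4 GT/s effective data rate):

```python
# GDDR5 bandwidth: effective data rate (GT/s) x bus width (bits) / 8 bits per byte
def bandwidth_gb_s(data_rate_gt_s, bus_width_bits):
    return data_rate_gt_s * bus_width_bits / 8

print(bandwidth_gb_s(7.0, 256))  # GTX 980:    224.0 GB/s
print(bandwidth_gb_s(5.0, 512))  # R9 290X:    320.0 GB/s
print(bandwidth_gb_s(7.0, 384))  # GTX 780 Ti: 336.0 GB/s
print(bandwidth_gb_s(6.4, 512))  # 290X Lightning OC: 409.6 GB/s
```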
What do others think? Is there a logic behind this that I'm missing?
 

yialanliu
That's because the 970 and 980 are mid-range cards. Even though they aren't priced like it, there is actually another set of cards that will be slotted higher than these.
 
I know the GM200 GPU is coming, but that's also a 384-bit GPU, not 512-bit like the R9 290X.
Hence my question. If you can get 224 GB/s from a 256-bit memory interface, then you could get 448 GB/s from a similar 512-bit one. Overclocked, that could easily cross half a terabyte per second of bandwidth.
I have even seen reviews where games like Shadow of Mordor have to have AA turned down for playable FPS.
 
Hello... The price of memory is high due to mass production of SSD-type chips... I'm thinking about the sales goals for the general consumer here: 1) How many people brag about, or even know, their memory bit width/bandwidth? 2) Most people just want more GBs of memory. 3) I'm sure you can have them build you about anything you want with a phone call and $$$$, if you are in need of a professional graphics/CAD card.
I.E. http://www.tomshardware.com/answers/id-2321828/video-card.html
I myself always look for at least 256-bit memory with a new purchase.
 

Eximo
It is hard to compare architecture design choices when we only have a high-level overview of how things really work inside a GPU. If they claim more onboard cache and data compression within the memory bus can negate bandwidth limitations, well, the proof is in the output. If a 980 can beat a 780 Ti with the larger memory bus, that information is good enough for me. Obviously the same architecture with a wider bus would perform better, but you will always be in a situation where the next big thing is coming out 'soon'.
 
@noidea77 It should - it measures the width of the path through which data travels to and from the GPU and memory, much like the CPU and RAM, but with much, much more throughput.
If you have more data travelling to and from the GPU per second, as is the case with higher resolutions and more AA/AF, then you need more memory bandwidth.
 

Well, yes, that's true. NVIDIA did say they had implemented data compression and more cache.
So instead of upping memory bandwidth they've reduced the data that needs to travel per second... which seems equally good. But is it having the same effect in benchmarks? Nothing is ideal - and nothing ever will be.
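As a rough sketch of what that compression could buy (assuming the roughly 25% average saving Nvidia has claimed for its delta colour compression - a figure that comes up later in this thread - with real savings varying per game and per frame):

```python
# If lossless compression removes a fraction of the traffic, the same raw pipe
# carries proportionally more useful data.
def effective_bandwidth_gb_s(raw_gb_s, fraction_saved):
    return raw_gb_s / (1.0 - fraction_saved)

print(effective_bandwidth_gb_s(224.0, 0.25))  # GTX 980: ~298.7 GB/s effective
print(effective_bandwidth_gb_s(192.0, 0.00))  # GTX 680, no such compression: 192.0 GB/s
```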
 
Hello... Yes, you can see the difference in performance... the effect of each component choice shows up in the Passmark scores... For example, most Nvidia x60-class cards had 256-bit memory, but the GTX 660 had a 192-bit bus, while its brother the 670 and cousin the 760 had 256-bit memory installed. Here is the Passmark comparison: http://www.videocardbenchmark.net/video_lookup.php?gpu=GeForce+GTX+670 - look where the 660 is, at 4120.
 

roguecomgeek
Adding to what yialanliu said: they may have a bigger card they have been able to produce in small yields, but haven't got the manufacturing process down to the yields they want yet. I mean, there has got to be a reason why they released the GTX 750, which is also Maxwell, well before the GTX 970 and GTX 980. I wonder why they haven't been able to shrink the transistor size below 28 nm for two generations now - both the GTX 600 and GTX 700 series were 28 nm as well.

With processors I have read about caching, and how adding too much cache can actually keep the processor from staying busy. It's a delicate balance between adding more cache and having fast cache to keep the processor fed. I wonder if GPUs run into a similar situation, where adding too much degrades performance and adding too little doesn't benefit it. Or, of course, they are bringing out smaller versions when they see the market fit for it.
 

But that's limited by the data the GPU actually reads or writes to the memory, and that - in real-world applications - is far from the theoretical maximum. And synthetic benchmarks like Passmark will never tell you anything about the real performance of your software/game.
 

Contiusa


The problem is that most reviews don't push the cards. They test with 2xAA, at most 4xAA. It seems like they don't want to tackle the issue and do a full test the way any owner of a GTX 980 or 970 would use it -- in other words, with SS and AA cranked up to max.

I had the same doubt when I bought my GTX 770 (because I read an AnandTech review saying that the GTX 680 had a bus bottleneck at higher resolutions). But no one could give me a proper answer, and even today, when Nvidia insists on keeping 256 bits for high-end cards, the reviews refuse to tackle the issue and do a proper test to see if the memory bus bottlenecks at higher AA and SS settings.

And I get stuttering with my GTX 770 playing maxed out at 1920x1080 with SS and TrackIR. I have always wondered whether a bigger bus would eliminate the stuttering, but no one talks about it, and there is no test showing how much memory bus is necessary for some games to be maxed out with max AA and SS.

Are they afraid of Nvidia? It looks like it, because the question is obvious.

 
Actually, even in the Fermi era the widest memory interface Nvidia used was 384-bit, on the GTX 480 and GTX 580. The 512-bit and 448-bit interfaces were from the Tesla generation (GT200: the GTX 280 and GTX 260). Back then Nvidia used much wider memory interfaces because they really needed them. Nvidia, if they can help it, will do whatever they can to minimize cost, and going for a smaller memory interface is one of those things.

Actually, I have been wondering why people care so much about the memory interface and not the actual bandwidth, since interface width is only one factor in determining a card's total bandwidth. Will a much wider memory interface yield much better performance than much higher bandwidth achieved by other means? There is also the matter of how efficiently the GPU uses the bandwidth available to it, since it has been shown that simply having much higher bandwidth does not necessarily mean better performance in actual games.
 

Contiusa
Nowadays they quote the total memory bandwidth, taking into consideration memory clock, bus width, etc. The review I mentioned, about the GTX 680 bottleneck, existed just because the GTX 770 launched with a higher memory clock, raising the total bandwidth. They mentioned that the bandwidth increase gave it an advantage over the GTX 680, which was bottlenecking at higher resolutions.

The GTX 680 has 192 GB/s of memory bandwidth. With the higher memory clock, the GTX 770, GTX 970 and GTX 980 reach 224 GB/s. They are not that far apart, so I would say memory bandwidth remains a concern on Nvidia cards until someone does a test proving otherwise.

Summarizing: the issue is real and no one likes to talk about it.
 

Contiusa
Here is a test with the GTX 660 Ti: http://www.tomshardware.com/reviews/geforce-gtx-660-ti-memory-bandwidth-anti-aliasing,3283-11.html

It says: "Nvidia’s GeForce cards simply do not do well with 8x MSAA applied. The Radeon HD 7950 and Radeon HD 7870 not only beat their competition, but AMD's Radeon HD 7870 even beats the GeForce GTX 670."

It is just a GTX 660 Ti (144 GB/s) at full HD, but what about people who are playing at 2560x1440? Would 224 GB/s handle it properly? Or even some games at full HD? The GTX 670 (which lost to the HD 7870) has the same bus as the GTX 680.

This is the kind of review I would like to see.
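In the meantime, here is a very crude back-of-the-envelope for that 1440p question (my own assumptions, not from the review: 4 bytes of colour and 4 of depth per sample, one full write plus one resolve read per frame, and no overdraw, caching, compression or texture traffic counted):

```python
# Rough framebuffer traffic for MSAA: width x height x samples x bytes per sample,
# written once and read back once for the resolve, per frame.
def msaa_traffic_gb_s(width, height, samples, fps, bytes_per_sample=8):
    buffer_bytes = width * height * samples * bytes_per_sample
    return 2 * buffer_bytes * fps / 1e9  # one write + one resolve read

print(msaa_traffic_gb_s(1920, 1080, 8, 60))  # 1080p, 8x MSAA: ~15.9 GB/s
print(msaa_traffic_gb_s(2560, 1440, 8, 60))  # 1440p, 8x MSAA: ~28.3 GB/s
```

Real traffic is several times higher once overdraw, blending and texture fetches are added, so this only shows the scale of the question, not the answer - which is exactly why a proper review would help.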
 

RobCrezz


Yeah, with the colour compression it ends up with effectively over 300 GB/s of bandwidth.

Look at the 1440p benchmarks - it performs very well.
 

Contiusa


If you are talking about Tom's Hardware's review of the GTX 980 and 970, the ultra preset seems to always use 4xAA, which defeats the purpose, since Tom's Hardware itself says, in the review I posted, "Nvidia's GeForce cards simply do not do well with 8x MSAA applied". I don't want to buy a $600 card and be limited to 4xAA, or be afraid to crank up the resolution.

They should do a special review pushing the cards with AA and SS. According to Nvidia, their new compression improves effective memory bandwidth by about 25%, if I recall correctly, but I am skeptical of these claims. For example, Hyper-Threading seldom works at its maximum capacity.

 


I thought it was Kepler itself that is not as good at handling MSAA compared to Radeon, and not so much a lack of bandwidth. With the Kepler generation Nvidia pushed more for things like FXAA. Take, for example, the Samaritan demo that took three GTX 580s to handle: when they demoed Samaritan on a single 680, one of the changes they made was replacing MSAA with FXAA. During the Fermi generation Nvidia cards were better than Radeon with regard to MSAA - I saw one bench where the performance gap between the GTX 460 and the 5850 got smaller as higher levels of MSAA were applied.
 

RobCrezz


Why the hell would you want to use 8x AA at high resolution? Totally pointless. At 1080p I can see the benefit, but not at 1440p and upwards.
 

Contiusa
I am not that savvy about architectures so I can't comment, but bandwidth has been a gray area with Nvidia. I remember AnandTech saying the GTX 770 was an improvement over the GTX 680, which was bottlenecking at higher resolutions, and the aforementioned review also tackled the subject.

So I think it is time for some reviewer to take these new cards to the limit and prove once and for all whether they bottleneck with top-notch AA and SS. My wild guess is that after a few generations without bus improvements they are on the verge. I thought more than twice before buying my GTX 770, and now that I am considering an upgrade, I might skip this generation just because of the bandwidth (I might get a 1440p monitor). And I don't have the money for the incoming 384-bit GTX 980 Ti.
 

RobCrezz

Expert
Ambassador


The bus width by itself is irrelevant; it's the total bandwidth that matters for performance. That is affected not just by bus width, but also by VRAM speed and any compression schemes. Nvidia has tended to use much faster VRAM to make up for the smaller bus - it also means more of the die space can be used for processing cores, improving performance in other areas (VRAM bandwidth is not the only important part...).

It's clear from the benchmarks that, with the faster VRAM and compression, the 256-bit bus is not severely limiting performance with the number of CUDA cores available. When the big chip comes out it will no doubt have a bigger bus, as there will be a lot more cores to supply with data, so the larger bus will be necessary (unless they start using HBM by then).
 

Contiusa


But that's exactly what they state: total bandwidth.

GTX 680 - 192 GB/s.
GTX 770 / 970 / 980 - 224 GB/s.

If the GTX 680 was already hitting the ceiling, I am not so sure about the others, even with compression technology and all. And the GTX 670 (with the same bandwidth as the GTX 680) had trouble keeping up with the HD 7870 at 8xAA. They are just too close to say it is safe. Hence why I think a review could shed some light on it.

 

RobCrezz


... You aren't reading it correctly. The effective bandwidth of the 970/980 is over 300 GB/s because of the colour compression.

8x MSAA is mostly pointless, hence it not being reviewed much; there are better, more efficient techniques available now.

I use my GTX 680 at 1440p and I can't see any difference between 2x MSAA and 8x MSAA.

Haven't you noticed that despite the large bandwidth of the 290X, it cannot outperform the GTX 980? The 980 ends up being much more efficient partly because of its 256-bit bus, which does not consume as much power as the 512-bit bus on the 290X.
 


As I understand it, different GPUs have different efficiency in using their bandwidth. Despite having much less bandwidth, the 680 still generally performs better than the 7970 (which means Kepler is more efficient than GCN in terms of utilizing available bandwidth). Now, in the test you linked above, the 7870 beats the 670 when 8x MSAA is used. Do you think a bandwidth limitation is what causes the 670 to be beaten by the 7870? Then look again at the 7870's specs:

http://www.techpowerup.com/reviews/AMD/HD_7850_HD_7870/

It has the same 256-bit bus as the 670. The 670 has a total bandwidth of 192 GB/s while the 7870 only has around 154 GB/s. So it has the same bus width and much lower bandwidth than the 670 - then why does the 7870 slightly beat the 670 when using 8x MSAA? The way I see it, AMD's GCN architecture is simply better at handling MSAA than Kepler.
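A quick check of those two numbers, using each card's effective GDDR5 data rate (roughly 6.0 GT/s on the 670 and 4.8 GT/s on the 7870):

```python
# Same formula as earlier in the thread: data rate (GT/s) x bus width (bits) / 8
def bandwidth_gb_s(data_rate_gt_s, bus_width_bits):
    return data_rate_gt_s * bus_width_bits / 8

print(bandwidth_gb_s(6.0, 256))  # GTX 670: 192.0 GB/s
print(bandwidth_gb_s(4.8, 256))  # HD 7870: 153.6 GB/s - same bus, less bandwidth
```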

There is also talk about the 680/670 being bottlenecked by bandwidth. But to simply attribute the 680/670 being weak at higher levels of AA to a lack of bandwidth is not entirely accurate; it is more that the 680/670 in general are limited by their total bandwidth.

 
Solution