Nvidia GeForce GTX 690 4 GB: Dual GK104, Announced



There is no such thing as GK110 at all right now. Perhaps you are referring to the proposed GK100? GK110 would mean that it's a second-generation Kepler GPU; GK100 would be the first-generation Big Kepler GPU. That's why the 680 has the GK104 instead of a GK114. It's like the naming of the GTX 400 Fermi GPUs, just with the F replaced by a K, i.e., GF104, GF100. The GTX 500 series had GF114/GF110/etc. because it was second-generation Fermi. There is no second-generation Kepler yet, so there are no GK11x GPU model names.
 


RAID and CF/SLI are not comparable. RAID 0 (assuming that's what you were referring to) splits each individual access across two or more drives, whereas SLI/CF don't split each access at all; they interleave whole frames between the GPUs.
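To put that difference in rough code terms, here's a minimal Python sketch (the stripe size and function names are mine, purely for illustration, not any real driver's behavior):

```python
STRIPE_BYTES = 64 * 1024  # a typical RAID 0 stripe size, chosen for illustration

def raid0_layout(offset, length, num_drives):
    """RAID 0: a SINGLE access is split into stripes spread across every drive."""
    return [((pos // STRIPE_BYTES) % num_drives, pos)
            for pos in range(offset, offset + length, STRIPE_BYTES)]

def afr_schedule(num_frames, num_gpus):
    """SLI/CF alternate frame rendering: one frame is never split; whole
    frames are handed to the GPUs in turn."""
    return [(frame % num_gpus, frame) for frame in range(num_frames)]

print(raid0_layout(0, 256 * 1024, 2))  # one read touches both drives
print(afr_schedule(6, 2))              # frames alternate: GPU0, GPU1, GPU0...
```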

I'm sorry, but I can't find any evidence supporting your claim that the memory is less taxed in a dual GPU configuration than in a single GPU configuration running the exact same workload. Could you give me a link (or better, a few) showing this phenomenon?
 

silverblue

Distinguished
Jul 22, 2009
1,199
4
19,285
The cards will render to the best of their ability. Having two men digging the same hole would suggest one would need to wait for the other to vacate the hole; in Crossfire and SLi, however, this would generally mean two men digging as fast as they can alternately - one digs whilst the other offloads the soil he's just removed (for want of a better term).
 
[citation][nom]silverblue[/nom]The cards will render to the best of their ability. Having two men digging the same hole would suggest one would need to wait for the other to vacate the hole; in Crossfire and SLi, however, this would generally mean two men digging as fast as they can alternately - one digs whilst the other offloads the soil he's just removed (for want of a better term).[/citation]

Instead of one man digging ten holes, but only one at a time, two GPUs will dig two different holes at a time. The holes are in a line, and when one man finishes, he goes to the hole next to the other man (the man in the rear arrived first, so he'll finish first and need to move around the other man to dig the next hole, and that pattern continues).

I think that that might be the best analogy within the context of two men digging holes.
 
[citation][nom]mojorisin23[/nom]$1k is ridiculous. after 4-6 months it'll be about half that.[/citation]

Probably not. By GPU performance, it's worth about $850 to $925. If it had 4GB of VRAM per GPU, then it would be worth its $1000 price tag if you go by performance per unit of currency (such as USD). It has similar performance per dollar to the GTX 560 Ti 2GB, and those cards are some of the best bang for your buck on Nvidia's side of the high-end graphics market.
 

silverblue

Distinguished
Jul 22, 2009
1,199
4
19,285
[citation][nom]blazorthon[/nom]Instead of one man digging ten holes, but only one at a time, two GPUs will dig two different holes at a time. The holes are in a line, and when one man finishes, he goes to the hole next to the other man (the man in the rear arrived first, so he'll finish first and need to move around the other man to dig the next hole, and that pattern continues). I think that that might be the best analogy within the context of two men digging holes.[/citation]
Maybe; however, the single hole seemed more analogous (at the time, anyway) to the two cards drawing a single display area.
 
[citation][nom]silverblue[/nom]Maybe; however, the single hole seemed more analogous (at the time, anyway) to the two cards drawing a single display area.[/citation]

Perhaps, but that's not what two (or more) GPUs do. Each GPU draws a full frame itself and the two interleave the frames that they draw. Older multi-GPU technologies had GPUs work together on the same frame, but that isn't done anymore because the interleaved method provides better performance scaling.

One GPU could be said to be a single man digging holes. He needs to dig a long line of holes. He can have another man dig the hole next to him as he digs his hole. When he finishes, then he moves around the other man and starts the next hole. When the other man finishes his hole, he moves around the first man and starts another. This is a much closer analogy to how the GPUs are operating. It even works with more than two GPUs. Just add a third or fourth man and when someone finishes, instead of going to the hole next over, they simply go to the next available hole.

It can even explain the scaling difference between two GPUs and more than two GPUs (two GPUs usually scale performance very well, but adding a third and fourth offers diminishing returns on scaling).

The time it takes a man to move from a finished hole to the next available hole increases when you have more men, so the scaling from adding more men quickly diminishes if they are all digging holes that they can dig quickly. The pacing does become more uniform, though, because the chances of three or four men finishing a hole at the same time are far lower than the chances of two men finishing at the same time (if two men, or GPUs, finish at about the same time, then they are both in transit at the same time, so there is a stutter in the work being done because no work at all is being done for a short period of time).

I think that my analogy works more or less perfectly for GPUs. I can work in representations for how memory capacity and bandwidth affect the performance, and for how having two or more different GPUs (two different men in this analogy) instead of identical GPUs affects it.
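For what it's worth, the digging analogy maps almost directly onto a toy simulation. Here's a minimal Python sketch (all of the timings and the jitter are invented for illustration) of round-robin AFR, showing the uneven frame pacing you get when two diggers finish at nearly the same time:

```python
import random

def simulate_afr(num_gpus, num_frames, frame_ms=16.0, jitter_ms=4.0):
    """Toy AFR model: whole frames are handed out round-robin (one hole
    per digger). Returns the times at which frames can be displayed."""
    gpu_free_at = [0.0] * num_gpus
    display_times = []
    last_shown = 0.0
    for frame in range(num_frames):
        gpu = frame % num_gpus                          # round-robin hand-off
        gpu_free_at[gpu] += frame_ms + random.uniform(-jitter_ms, jitter_ms)
        last_shown = max(last_shown, gpu_free_at[gpu])  # frames show in order
        display_times.append(last_shown)
    return display_times

random.seed(42)
times = simulate_afr(num_gpus=2, num_frames=10)
gaps = [round(b - a, 1) for a, b in zip(times, times[1:])]
print(gaps)  # uneven gaps between frames = the stutter described above
```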
 

PCgamer81

Distinguished
Oct 14, 2011
1,830
0
19,810
But in dual card solutions the memory bandwidth is doubled, albeit not the size. It is still more beneficial and a helluva lot faster than with a single card solution, and that's a fact.
 

actionjksn

Distinguished
Jan 6, 2010
49
0
18,530
[citation][nom]j_e_d_70[/nom]Great that they're cannibalizing chips needed to get 680s out the door to paying customers so they can produce this monstrosity. Grrrr... At least wait til AMD comes out with a card that needs to be trumped. My nerd rage against nVidia is growing.[/citation]

Yeah that's bullshit, they should run their company the way you want them to.
 


The bandwidth is not doubled. It is the same. For the bandwidth to double, each GPU would need to be able to access the other GPU's memory at full speed in addition to its own (having two GPUs does not mean that the bandwidth between each GPU and its respective memory suddenly doubles either). We have already agreed that this does not happen. The memory bandwidth is still the same because each GPU can still only access its own memory. Each GPU draws an entire frame for all monitors connected to the entire system (even if one or more monitors are connected to another graphics card, so long as they are all in CF or SLI), and all of the data required for that frame must be present in the VRAM buffer of the GPU drawing that frame.

If you have three monitors, two on one GPU and another connected to the other GPU, each GPU still draws a frame for all monitors as if they were one monitor connected to the GPU drawing the frame at the time.

Assume that you have a GTX 690 with 4GB per GPU (8GB total). You have three 1080p displays connected to it for a total resolution of 5760x1080. Each 5760x1080 frame must be completed by one of the GPUs; they take turns doing every other frame, with only one GPU working on each frame at a time. Even when both GPUs are doing work at once, they are working on two separate, independent frames. All of the data necessary for a frame must be in the VRAM buffer of the GPU rendering that frame.

The GPU can only talk to its own memory, so its memory bandwidth remains unchanged. Some card vendors market dual GPU cards as having double the bandwidth, but it's a lie.

A dual GPU card generally scales performance slightly better than two single GPU cards with otherwise identical specs, but this is mainly because the two GPUs are on the same card, where they can have a higher-bandwidth, lower-latency connection between them. Beyond that better connection, they have no benefit over two of the otherwise-identical single GPU cards. They don't have double the bandwidth, and I still haven't found any benchmark that supports your claim about dual GPU cards having a VRAM advantage, nor have you supplied such a benchmark. I did look, just in case you were wondering, albeit only for a few minutes.
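The arithmetic behind that is simple enough to sketch. A back-of-the-envelope Python example (the numbers are illustrative; a real frame needs far more VRAM than a single color buffer):

```python
def per_gpu_needs(width, height, fps_target, num_gpus, bytes_per_pixel=4):
    """In AFR every GPU renders complete frames, so the full framebuffer
    (and textures, etc.) must fit in EACH GPU's own VRAM; only the frame
    rate each GPU must sustain is divided between them."""
    framebuffer_mb = width * height * bytes_per_pixel / 1024**2
    fps_per_gpu = fps_target / num_gpus
    return framebuffer_mb, fps_per_gpu

fb_mb, fps_each = per_gpu_needs(5760, 1080, fps_target=60, num_gpus=2)
print(f"One 5760x1080 color buffer: {fb_mb:.1f} MB, needed in full on BOTH GPUs")
print(f"Frames each GPU must deliver: {fps_each:.0f} of the 60 FPS total")
```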
 

silverblue

Distinguished
Jul 22, 2009
1,199
4
19,285
It's a strange one; the card overall has the capability of practically double the memory bandwidth, but this is split between the two GPUs, neither of which can use the other GPU's memory. So technically, the bandwidth isn't affected; however, as the GPUs share the work, you can make the point that bandwidth is still effectively twice that of a single card.

I hate semantics.
 

PCgamer81

Distinguished
Oct 14, 2011
1,830
0
19,810
Semantics aside, I am of the opinion that 2GB of VRAM in a SLi/CrossfireX solution is more advantageous than 2GB of VRAM in a single card.

I guess even that's subjective.
 

silverblue

Distinguished
Jul 22, 2009
1,199
4
19,285
Well, it should be; your frame rate is higher and you can get away with higher levels of AA. Each GPU would be doing less work than a single GPU would per frame.
 
Well, we could then compare the GTX 680 to the GTX 590 for this one. The 590 has less bandwidth per GPU, but if it has effectively double the bandwidth, then it should end up with more bandwidth than the 680. If it has a VRAM advantage, then it should beat the 580 when it comes to VRAM capacity bottlenecks.

The only problem is that your link is practically ancient (2006) and for all we know, the technologies are very different for modern cards.
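For concreteness, the spec-sheet math behind that 680/590/580 comparison is just data rate times bus width. A quick Python check (the clock figures are quoted from memory, so verify them against a review before leaning on the exact numbers):

```python
def mem_bandwidth_gb_s(effective_mt_s, bus_width_bits):
    """Peak memory bandwidth = data rate (MT/s) x bus width in bytes."""
    return effective_mt_s * (bus_width_bits / 8) / 1000

# Per-GPU spec-sheet values, quoted from memory:
print("GTX 680:        ", mem_bandwidth_gb_s(6008, 256), "GB/s")  # ~192
print("GTX 580:        ", mem_bandwidth_gb_s(4008, 384), "GB/s")  # ~192
print("GTX 590 per GPU:", mem_bandwidth_gb_s(3414, 384), "GB/s")  # ~164
```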
 
I've worked it all out and checked some sites, and no... Its memory bandwidth is not doubled at all.

However, the necessary memory bandwidth for a certain performance level is divided by the number of GPUs in the system. It ACTS as if the bandwidth has doubled if you have a dual GPU system, but in fact it has not doubled.

This is because each GPU is only doing a fraction of the work, so each GPU only needs a fraction of the bandwidth (the formula is simple: 1/x, where x is the number of GPUs in use, assuming that they are doing equal amounts of work per GPU; asymmetric multi-GPU setups would need a more complex formula). Each GPU still needs the full memory capacity, because each GPU needs enough capacity for an entire frame of ALL pixels across all displays connected to all graphics cards in the SLI/CF setup. Crossfire works by connecting all of the displays to all of the GPUs, so that each frame a GPU draws can be displayed across all displays.

If you have two GPUs pumping out a total of 60FPS, then each GPU only needs enough memory bandwidth for 30FPS, because each GPU is only doing half of the work. I still can't find anything supporting your claim that the per-GPU memory capacity doesn't need to be at least as high as in a single-card setup.
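That 1/x rule is trivial to put in code. A minimal Python sketch, assuming equal work per GPU (the 200GB/s workload is an invented example figure):

```python
def required_bw_per_gpu(single_gpu_bw_gb_s, num_gpus):
    """The 1/x rule above: each of x GPUs renders 1/x of the frames, so
    each needs 1/x of the bandwidth one GPU would need for the full load.
    Holds only for the symmetric, equal-work case."""
    return single_gpu_bw_gb_s / num_gpus

# A workload that would demand 200 GB/s of a single GPU:
for x in (1, 2, 3, 4):
    print(f"{x} GPU(s): {required_bw_per_gpu(200, x):.0f} GB/s needed per GPU")
```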

My reasoning is that with a dual GPU system, you will increase the resolutions, settings, and AA, and doing this can increase the memory bandwidth bottleneck (AA especially increases the amount of data passing through the memory interface). Even if it doesn't, the GK104 is already a bandwidth-bottlenecked GPU (it has the same bandwidth as the 580, yet it is so much faster than the 580). This shows up on the 680 in particularly bandwidth-heavy games such as Crysis 2 and Metro 2033, where it gets so bad that the 7970 overtakes the 680. The 7950 and 7970 have no such memory bandwidth problem, so if the 7970 scales just a little better than the 680 does in multi-GPU configurations, then it can overtake 680/690 SLI setups with three or four GPUs in more games than just Metro 2033 and Crysis 2.
 

PCgamer81

Distinguished
Oct 14, 2011
1,830
0
19,810
I feel you are mistaken concerning the memory bandwidth in dual-card configurations. And while I can't really find a whole lot online to prove it, I am kicking myself for not coming across what I have found earlier - it's almost too simple.

[Image: Memorybandwidth.png]

[Image: Memorybandwidth2.png]

As you can see, it isn't exactly doubled, but it's close. That's because even though each GPU processes data apart from the other (effectively leaving the amount of VRAM where it's at), the efficiency at which that data is processed is dramatically improved due to twice the overall bandwidth. And it helps.

As for how the size of the VRAM affects bandwidth... it doesn't necessarily have to.

[Image: Memorybandwidth3.png]

So whether or not the VRAM is doubled in general (which it isn't) is irrespective of how SLi/CF affects the memory's bandwidth.

I am content to say that we are both right. Based on what has been shown, my insistence that SLi/CF is beneficial as far as bandwidth is concerned appears to be true, and your insistence that the bandwidth isn't doubled is also true.
 


Those are synthetic benchmarks. In fact, they aren't even benchmarks; they're just a guy doing some math on a calculator, and he isn't trying to explain what is going on because whether or not the card really has the bandwidth increase has little bearing on anything: it acts as if it does, and the regular person doesn't need to know the difference because it has no bearing on them at all. I explained to you how the VRAM bandwidth works and why it isn't doubled, but it acts as if it is doubled and might as well be for all intents and purposes. It is not related to dual GPU cards specifically; it's just a phenomenon of how multiple GPUs work together with current multi-GPU technologies.

If I have two GPUs that can each do 30FPS and need 100GB/s to do it, they can work together to get 60FPS. Guess what? The bandwidth didn't double; it's the same. So why did it scale to 60FPS even if the bandwidth is ONLY enough for 30FPS?

It happened because the bandwidth only needs to be enough per GPU. Each GPU is still doing 30FPS, so the 100GB/s is still enough, despite it taking 200GB/s for a single GPU to get 60FPS. The bandwidth did not change, but it LOOKS like it did if you don't know that it's a dual GPU system. It's no different from the fact that if a program is dual-threaded, a second CPU core will more or less double performance without either core actually being any faster. If you did not know it was a dual core CPU and you thought it was a single CPU system, you would think that it has a core that is twice as fast. Whether or not that is true has no bearing on the fact that the program is running twice as fast, but the ACTUAL truth of the situation is still that two cores each supply about half of the performance. The GPUs and their memory bandwidth are the same. If you didn't know it was a dual GPU system, then you wouldn't know that it is actually two half-speed GPUs, each with half of the memory bandwidth, because the game does not tell you this unless you specifically look for that information.

The bandwidth is unchanged, but the necessary bandwidth for a certain FPS is halved, so the bandwidth could be said to have EFFECTIVELY been doubled, but not truly doubled. Another analogy would be looking at DDR memory. We call PC3-12800 "DDR3 1600MHz" when it is actually 800MHz with a double data rate; the proper terminology would be 1600MT/s, or 800MHz DDR, or something else along those lines. We call it that for simplicity's sake, because it doesn't make a difference in practical use, only when you want to know something about how it actually works. The clock frequency is unchanged by going from 800MHz to 800MHz DDR, but its effective clock frequency has doubled because the frequency needed for a certain level of performance has halved.
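The DDR arithmetic is easy to check. A minimal Python sketch of the standard formula (one transfer per clock edge; a 64-bit module bus is assumed):

```python
def ddr_marketing_numbers(real_clock_mhz, bus_width_bits=64):
    """Double data rate: one transfer on each clock edge, so the marketing
    MT/s figure is twice the real clock; bandwidth = MT/s x bus bytes."""
    mt_s = 2 * real_clock_mhz
    mb_s = mt_s * bus_width_bits // 8
    return mt_s, mb_s

mt, mb = ddr_marketing_numbers(800)  # the PC3-12800 example above
print(f"800MHz real clock -> {mt} MT/s ('DDR3 1600') -> PC3-{mb}")
```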
 


That is an old benchmark with a different card. It does not apply, especially since the bandwidth of the 580 and the 680 is the same despite the 680 being much faster, meaning that the 580 is not bottlenecked by its VRAM bandwidth, but the 680 is.

Nearly doubling a GPU's performance but giving it the same VRAM bandwidth creates a VRAM bottleneck. No doubt about it, the 680 probably scales far better with memory bandwidth improvements than the other cards would if you increased their memory bandwidth. It's the most VRAM-bandwidth-bottlenecked high-end card that I've ever heard of. It's probably not quite as bad as Llano is about this, but it's bad.

However, the second paragraph is technically speculation even if I'm damn sure about it and so are many others, so I'll get some links for you.
 

PCgamer81

Distinguished
Oct 14, 2011
1,830
0
19,810
You are the one arguing semantics, and very stubbornly and unyieldingly at that. If you remember correctly, this discussion began with me merely stating that it is more beneficial irrespective of VRAM and whether or not it is doubled. So saying that the bandwidth is effectively doubled as it pertains to performance and not as it pertains to actual facts has no bearing on my original claim. In fact, it validates it.

Furthermore, HWcompare isn't just some guy with a calculator. They are a respected and oft-quoted team of professionals providing information and actual in-game benchmarks relating to the products displayed on their site. They go to great lengths to be as informative as humanly possible.

I highly doubt they would leave something like that out, or state it so ambiguously.

And if they did, it can only be because it relays what is true for all intents and purposes. That the bandwidth is improved, whether in application, theory, performance, or what have you. And it is. The performance jump as it handles high levels of AA and AF is indicative of that much.
 


The bandwidth did not improve; only the effective bandwidth did. No amount of arguing will change that. I'm not saying that they aren't respected, or that they're stupid, or anything like that, only that they did not point this out.

Nothing will make increasing the GPU count also increase the memory bandwidth by an amount more or less proportional to the increase in active GPU count in the system. Nothing will change that. It's just like the DDR argument. It gets simplified because not everyone understands it, let alone cares. However, that does not change the fact that what we call DDR3 1600MHz is not really 1600MHz; it is 800MHz memory that transmits data on both edges of a clock cycle.

Call it semantics, but that is how it is. This isn't subjective and no one can change that either. The bandwidth did not double, plain and simple. You can try to argue about it further, but this is something that neither you nor I nor anybody else can change.

Also, if you read their pages, they clearly state that these aren't in-game benchmarks and that they aren't even really benchmarks, just math. They even tell you how they got their answers. Their *tests* are even more synthetic than a synthetic benchmark, and they even tell you that real-world usage can and will vary. There's nothing WRONG with how they do this, and they do tell you that things can vary, but they aren't really telling you how things work out in reality, nor are they trying to complicate things any more than this stuff already is. They probably don't even care that the 6990 didn't actually have that much bandwidth; it just pretended to have it. They probably also don't care that DDR3 1600 is really 800MHz memory, and the same goes for other such situations.

The argument may have started out with me saying that the bandwidth was both the same and effectively the same as a single GPU's, but I've obviously realized why that was wrong and am now right. It's not enough that I admit that I was wrong before; you're bothered by me now that I am correct. At least I admitted when I was wrong. The least you can do is either prove me wrong (you can't, because I know for a fact that I'm right, but you can try) or admit that what I'm saying now is the absolute truth about this.

Also, for triple 1080p and 5760x1200, 2GB GTX 680s, even in triple SLI, can't do any more than FXAA in many games because of their VRAM limit, so having multiple GPUs, each with their own VRAM, does not seem to help this at all.

http://www.hardocp.com/article/2012/04/25/geforce_gtx_680_3way_sli_radeon_7970_trifire_review/1

Notice how the 7970's problems have nothing to do with VRAM, because it is limited by neither VRAM capacity nor bandwidth, whereas the 680 can't even do proper AA for such a system. Also, a lot of these games (such as Batman) are Nvidia-optimized, and the 7970s are using the original drivers they had at launch due to AMD's new drivers not working on the 7970 in Eyefinity, so don't take the 7970's poor performance to heart. Despite its poor performance, it has nothing to do with VRAM capacity bottlenecks like the 680's does, and [H] makes that very clear. If it confuses you or anyone else, I'd be glad to explain things in further detail (where I can) than [H] did.

If three 680s had a VRAM advantage over a single 680 like you suggest (I doubt it), then a single 680 would be unable to run even the same AA and quality settings that the three 680s managed here. Oh, and unlike your link, this one is not as out of date (at least until groundbreaking drivers for Nvidia and/or AMD come out, but those wouldn't change VRAM capacity usage, just performance, feature additions/fixes, compatibility, and bug fixes).
 