DX 10.1 games coming soon


duzcizgi



Randomizer, in fact it's NOT free 4xAA. It's applying all shader filters AND AA in the same pass, and that's where the performance increase comes from.

When you look at the shader code for many effects, they're already doing some of the calculations required for AA. So the shader scheduler just combines and reschedules the code to execute it more efficiently.

The problem with nVidia supporting AA in shaders is that you need an enormous number of shader processors (look at the HD4800 series cards from ATI: they have 800) to get decent performance. nVidia doesn't have that many shader processors. Instead, nVidia has dedicated AA hardware which is extremely optimized and, in fact, works pretty well.

Something off topic: given the size of nVidia's GT200 GPUs, it might be a bit of a problem for them to pack that many shader processors into an area as small as RV770.
 
I could be completely wrong here, so please correct me if I've got this backwards.
I was under the impression that the ATI cards since the 2xxx series have been doing AA in the shaders, as per the original DX10 spec before the M$/Nvidia thing, and that now with the 4xxx series cards they're going back to doing it completely in the ROPs, i.e. going from software back to hardware AA.
Is that right, or have I got this all wrong?
Mactronix [edit for spelling]
 

duzcizgi


As I wrote, they aren't incapable. In fact, they are extremely efficient.
BUT they need an extra pass AFTER shaders are applied.

For this reason, the HD2900 was badly hit by nVidia-optimized games. Although it was a DX10 card, it was built with DX10.1 in mind. As you all know, DX10.1 is in fact the original DX10, which MS had to cripple in order to give nVidia's G80 cards the "DX10 supported" title, as there were no other GPUs even getting near it.

When you write a game using DX10.1 versus DX10, the DX10.1 path should be around 20-40% faster than its DX10 counterpart, because all shader effects and AA are processed in one pass, not two.
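
To make the one-pass vs. two-pass difference concrete, here's a rough D3D10-style sketch in C++. It's just an illustration I'm putting together, not code from any real engine: the texture names, the full-screen-pass helper and the shader objects are all placeholders I'm assuming.

```cpp
#include <d3d10.h>

// Minimal stand-in for however the engine runs a full-screen post-process pass:
// bind the input texture and the pixel shader, then draw a screen-covering
// triangle (vertex shader / input layout assumed to be set up elsewhere).
void BindAndDrawFullScreenPass(ID3D10Device* dev,
                               ID3D10ShaderResourceView* srv,
                               ID3D10PixelShader* ps)
{
    dev->PSSetShaderResources(0, 1, &srv);
    dev->PSSetShader(ps);
    dev->Draw(3, 0);
}

// Route A (classic path): resolve first, then post-process -> two passes.
void PostFxTwoPass(ID3D10Device* dev,
                   ID3D10Texture2D* msaaTex,               // 4x MSAA scene target
                   ID3D10Texture2D* resolvedTex,           // single-sample copy
                   ID3D10ShaderResourceView* resolvedSrv,  // view of resolvedTex
                   ID3D10PixelShader* postFxPs)            // shader reads Texture2D
{
    // Pass 1: the dedicated resolve hardware averages the 4 samples of every pixel.
    dev->ResolveSubresource(resolvedTex, 0, msaaTex, 0, DXGI_FORMAT_R8G8B8A8_UNORM);

    // Pass 2: the post-process shader only ever sees the already-resolved image.
    BindAndDrawFullScreenPass(dev, resolvedSrv, postFxPs);
}

// Route B (shader AA): the post-process shader is written against
// Texture2DMS<float4, 4>, loads all 4 samples per pixel, applies the effect to
// each and averages them itself -> effect and resolve in one pass, and no
// separate resolve is ever submitted.
void PostFxOnePass(ID3D10Device* dev,
                   ID3D10ShaderResourceView* msaaSrv,      // view of the MSAA target
                   ID3D10PixelShader* postFxResolvePs)
{
    BindAndDrawFullScreenPass(dev, msaaSrv, postFxResolvePs);
}
```

In route A the GPU can't start the post-process until the resolve has written out a whole new full-screen texture; in route B that extra read/write round trip over the frame simply doesn't exist.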

ATI HD series GPUs have a very high number of shader processors to cope with this kind of problem (HD2900 and HD3800 have 320, HD4800 has 800).

Considering that nVidia now fits 240 shader processors in an area 2.5 times larger than RV770, sticking 800 of them in would result in a GPU about the size of the graphics card itself. So nVidia is sticking with "The Way It's Meant To Be Played" as long as it can. But in the end, we can expect them to come around with the next generation (GTX 200 isn't a new generation, it's an increment on the G80 architecture), as CUDA and PhysX would also benefit greatly from this transition.
 
From what I gather, they're doing both, or to be more precise, they're capable of either way. I know that nVidia's shaders are more of a dedicated type, whereas ATI's are more flexible in what they can do. If it's being done in the shaders, they still have to have the drivers set up for it. I understand you don't need an extra pass to complete up to a 4xAA resolve; that's why it's faster. So in essence people equate that with being free, which is technically wrong, but it's much more efficient.
 

Slobogob



They would take a harder hit because there are simply fewer shaders to distribute the work to. I suspect it could mess up their SF unit, which would really, really slow things down in games that actually do complex shader processing.


You are correct. Well, technically speaking they are not going back; they never started doing AA with their shaders, since DX10 as it was specified before the NV debacle never took off. And to be really picky about it, AMD didn't fold either. They improved their ROPs, but the massive increase in shading power will keep them more than able to do AA with those units if a developer wants to use it. As I understand it, many developers didn't like shader-based AA because of its implementation. I don't know the details about that, though.
 

duzcizgi

Yeah! Exactly.
Being a physicist and applied mathematician, I can explain the logic behind it but no need for that. :)
Most of the effects already do some of the calculations required for AA, so when the AA phase comes, we already have 20-40% of the preliminary calculations done. By just completing the rest, we can finish the AA quicker than raising the interrupt to tell the app that we've finished with the shaders and having the application call back and say "OK, now do the AA". We take a performance hit, if nothing else, just because of that round trip.
 

duzcizgi



Well, the messy part is that a programmer has to think more before submitting batches of work to the GPU. With separate AA, you just set a flag saying you want this or that AA mode, but with shader AA, you need to include it in the HLSL you're submitting. Many DX programmers don't like hassling with HLSL; they just call functions that were written a couple of years ago as the "game engine".

It's not the game programmers but the game engine programmers who don't want to use it much. They have to make too many changes in their code. (At the very least, they have to remove all the calls that request AA and change all the HLSL code to include AA. That's a massive amount of work, in fact.)
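
To show what "include it into the HLSL" means in practice, here's a rough before/after sketch. The fixed-function route really is just a flag on the render target (SampleDesc.Count = 4 in the texture description, plus a ResolveSubresource call), while the shader route means editing every post-process shader so it reads the multisampled target itself. The shader names and the little tone-map effect below are made-up examples, not code from any real engine.

```cpp
// HLSL embedded as C++ strings, the way a D3D10 engine might feed its shader
// compiler. Both shaders and the tone-map "effect" are illustrative only.

// Before (fixed-function AA): the engine just sets SampleDesc.Count = 4 on the
// render target and calls ResolveSubresource, so the post-process shader only
// ever sees an ordinary, already-resolved texture.
const char* kPostFxFixedFunctionAA = R"hlsl(
Texture2D    gScene  : register(t0);
SamplerState gLinear : register(s0);

float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float3 c = gScene.Sample(gLinear, uv).rgb;
    c = c / (c + 1.0);                       // the post effect (example tone map)
    return float4(c, 1.0);
}
)hlsl";

// After (shader AA): the same shader rewritten to read the multisampled target
// directly, apply the effect to each of the 4 samples and average them, so the
// AA resolve happens inside this pass instead of in a separate one.
const char* kPostFxShaderAA = R"hlsl(
Texture2DMS<float4, 4> gSceneMS : register(t0);

float4 PSMain(float4 pos : SV_Position) : SV_Target
{
    int2   p   = int2(pos.xy);
    float3 sum = 0;
    [unroll]
    for (int s = 0; s < 4; ++s)
    {
        float3 c = gSceneMS.Load(p, s).rgb;  // fetch one MSAA sample
        c = c / (c + 1.0);                   // same effect, now applied per sample
        sum += c;
    }
    return float4(sum / 4, 1.0);             // box resolve folded into this pass
}
)hlsl";
```

Multiply that kind of change by every post-process and lighting shader an engine ships and the "massive amount of work" part becomes pretty clear.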
 

duzcizgi



Pretty much, yes. It's a PS3 & Xbox 360 title ported to PC. BTW, the Xbox 360 has the same GPU architecture as ATI's HD series GPUs. ;)
 

Slobogob


A very interesting point. Another thing to remember, though, is that Nvidia tends to clock their shaders higher than AMD. A lot higher, actually. If you compare the core clock of the 4850 to the GTX's shader clock, Nvidia runs theirs at roughly twice the speed.
I suspect that this is the only way they can keep up with the raw processing power of the RV770 (even though the raw processing power of the 48xx series measured in Tflops exceeds the GTX, there is a huge difference between theoretical flops and real code running on the card actually doing something). The extra AA pass hits AMD way harder than Nvidia. Once DX10.1 gets used, NV will take a hit, but it won't be as big as the hit AMD took with their 2xxx and 3xxx series. The 20-40% you mentioned is quite probable and a good guess.
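
Just to put rough numbers on that, using the commonly quoted paper specs and picking the GTX 280 for the comparison (so treat the figures as ballpark, not gospel):

```cpp
#include <cstdio>

// Back-of-the-envelope theoretical throughput from the commonly quoted specs:
// shader count * flops per shader per clock * clock in GHz = GFLOPS.
int main()
{
    const double hd4850 = 800 * 2 * 0.625;  // 800 SPs, MAD (2 flops), 625 MHz core clock
    const double gtx280 = 240 * 3 * 1.296;  // 240 SPs, MAD+MUL (3 flops), 1296 MHz shader clock

    std::printf("HD 4850 : ~%.0f GFLOPS theoretical\n", hd4850);  // ~1000
    std::printf("GTX 280 : ~%.0f GFLOPS theoretical\n", gtx280);  // ~933
    return 0;
}
```

So even with the shader clock running at roughly twice the speed, the sheer number of ALUs puts the 4850 ahead on paper.
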
As you already pointed out, CUDA and PhysX will play a bigger role for Nvidia's next generation. I'm curious whether they'll just add more shaders with slight modifications (to keep compatibility) or throw in a few new, more specialised "shaders" just to keep AMD out of their CUDA/PhysX scheme.
The new GTX series is nice and everything, but it is not like the 8800GTX. Time won't be as kind to it as it was to the original GTX.
 

duzcizgi



In fact, the hit ATi/AMD took with their first implementation in the R600/HD2900 came about because they were changing their architecture fundamentally and overran their deadline by 6 months. So they said: OK, let's launch this and iron out the problems later. Yes, they fixed most of the problems, but they still saw that the processing power of R600 wasn't enough for shader AA, so they upped the shader count from 320 to 800. It also gives them an edge in GPGPU and Havok implementations, as they can use all of the GPU's processing power for GPGPU (CUDA on nVidia) or physics applications. nVidia still can't use all the processing power of their GPUs. If they really want to compete with Intel's Larrabee, they have to take the same road ATi/AMD has already taken; they just need time to come up with their new design.
 

Slobogob


I agree. That's actually always the problem. Programmers usually do what they can "get away" with. I'm not blaming them. On really fast hardware, optimization is not needed - at least on the PC. Optimization may be a huge cost (and time) factor, but on the PC it is mostly voluntary. If a game doesn't run blazing fast, wait three months and buy the next generation of GPUs. At least that's what is happening. There is no real incentive to implement these features. Investing more work to get better performance on some computers (AMD GPUs) but worse performance on others (NV GPUs)? That is kind of hard to sell.
Looking at it, it is no longer the software that has to be tailored. Sure, a few tweaks here and there, but at some point it just gets easier to build more specialised hardware. And that is something Nvidia has done quite well: they look at what is coming out during the release time-frame of their GPUs and tailor their hardware to suit it best. AMD takes a more flexible approach that involves more software.
 

duzcizgi

I agree, jaydeejohn. It only kicked the door "ajar"; it's not open yet. They need more to get on the bandwagon. First of all, if they want it to be used in games, they need something that can cope with both great graphics and all those particles flying around. :) You need more than 240 shader processors for this. Even 800 might not be enough. ;)
 

duzcizgi



They both have their strong and weak points. The problem is, programmers are lazier than before. With all these RAD tools doing the dirty work of coding, they just click, drag and drop. (Hey, I also earn my living with these clicks, drags and drops. :p I know it all too well.) There's no sufficient education or training in efficient programming except msdn.microsoft.com, and even there, most of the best practices aren't implemented by programmers because they involve actually writing code!

That's why any given game written on top of a given engine has more or less the same performance level. You can expect upcoming games using CryEngine 2 to perform on par with Crysis.
 
From what I'm reading, the density of the ATI shaders is good, and when they finished the R7xx series they actually had more room than they needed, which gave them the extra numbers we see. I'm hoping it continues. Rumors of the R8xx series put it at 1600 shaders, if I recall.
 

Slobogob


Given the heat and power specifications of the R600, I really doubt they could have packed in more shaders back then, which supports your point. I suspect the shading power of GPUs has to grow by quite a lot again if both physics and regular "graphics" have to be done on the GPU. The number of threads an R700 can tackle seems too low for that. Maybe not for now, since there are no games that employ physics in a prominent manner, but once they do, the current amount won't be enough. It's interesting to watch, but I think with these more programmable shaders we will see a transition from graphics cards to something we can actually call gaming or processing cards (depending on what they're used for).
 