Ashes Of The Singularity Beta: Async Compute, Multi-Adapter & Power

Let me explain some things regarding async, because people here seem to be completely lost... As soon as async is used properly, AMD will have the advantage; there is no question about this. It is basically a free performance increase, so not using it would be quite the waste, and that's an understatement. I don't think getting developers to use async will be a problem. They've been talking about it for a long time, and it's regularly implemented on the consoles, which use GCN. Bringing it to AMD's GCN on PC would not be a huge task; porting the game itself is harder than getting the already-implemented async to work. There are also multiple games to be released this year that will be using it.
It wouldn't be the first time, though, that an ATi/AMD technology gets ignored because of nVidia's power in the gaming industry. When DX10.1 gave almost free anti-aliasing, it was ignored because nVidia hardware couldn't handle it. Even worse, it was removed from a game that already shipped with it:
http://www.anandtech.com/show/2549/7

Back to async...
To do async the way it's supposed to be done, nVidia's hardware requires a context switch via preemption. To explain what that means, I have to highlight some other things first so that you can understand what's going on.

When people talk about AMD's async compute, what they actually mean is that graphics/shader tasks and compute tasks are processed in parallel, AND at the same time in a 'random' order. The latter part is not exactly accurate, but it makes it easy to understand. Some tasks (no matter whether they're of a compute or graphics nature) are long and some are short, and what I mean by processing in this 'random' order is that you can basically slot the short tasks in between the long ones, keeping the GPU from idling. The GCN compute units can handle all the long and short graphics/shader tasks AND the long and short compute tasks mixed together like a soup. All the long, short, graphics/shader, and compute tasks are interchangeable with each other within AMD's compute units. This blending pushes GPU utilization very high.

nVidia's hardware cannot do this in the same way. It can handle either the mixing of long and short graphics/shader tasks, OR the mixing of long and short compute tasks. This is what we mean when we say it requires a context switch: you have to keep switching between graphics/shader work and compute work. That is obviously less efficient than AMD's hardware solution. And yes, being able to blend the short and long graphics/shader tasks is still more efficient than doing them in order, and the same goes for compute. But a context switch is costly. If you're doing async graphics now, you basically have to throw out your whole bowl of graphics soup to create a new compute soup, and that causes delay. What you gain by running the graphics/shader soup and the compute soup separately in an asynchronous manner is lost by having to switch between them.
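To make that a bit more concrete: from the API side, D3D12 exposes this as separate queues, and how much of the two streams actually overlaps on the GPU is entirely up to the hardware and driver. A minimal sketch (not from any shipping engine; device creation and error handling omitted):

```cpp
// Minimal D3D12 sketch: one graphics (direct) queue and one compute queue.
// GCN can execute work from both concurrently on its ACEs; hardware that
// can't ends up switching between the two streams instead.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
}
// Work submitted to the two queues has no implicit ordering between them,
// so short compute jobs are free to fill the gaps left by long graphics jobs.
```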

Obviously, nVidia is claiming it can do async, and they would not necessarily be lying, but their version is completely different from AMD's, and it's borderline useless for performance gains. And they will not admit this, because they advertised their graphics cards as superior for DX12 thanks to being capable of "DX12.1", even though no such API exists (there is only feature level 12_1). They would get some backlash if it turns out that some graphics cards from 2011 (GCN 1.0) do some things (like async) better under DX12 than their "DX12.1" 2015 cards.

I hope this clarifies some things... Ashes of the Singularity is representative of DX12 performance + async. And I hope you understand now that it is futile to hope for nVidia's performance to be anywhere near AMD's when async is used. The elimination under DX12 of the CPU overhead issue that AMD had under DX11 already gave AMD's cards a huge boost. Add in async, and the only nVidia card that can maybe compete is the 980 Ti against the Fury X. At every other price point, AMD's cards will smoke nVidia's under DX12 + async.

nVidia has admitted that their preemption problem is still a long way from being solved. It is stated here, on page 23 of their GDC 2015 presentation:
http://www.reedbeta.com/talks/VR_Direct_GDC_2015.pdf

That's why I'm saying that Pascal won't fare any better, and Polaris will be the cards to go for. Especially since Polaris is supposed to fix the front end to eliminate the DX11 issues that current GCN cards face. Combine that with the async benefits and it's a no-brainer; that is, of course, if Polaris actually delivers what it promised.
 
^You are making a very bold claim. You are claiming not only that nVidia won't utilize async (nor plans to), but also that async will be the end-all, be-all for gaming, which I don't think it will be. There is never one technology to rule them all. If there were, there should have been other times when AMD's GPUs outsold nVidia's.

For example, a 980 Ti has a peak of 4.61 TFLOPS while an HD 7970 GHz has a peak of 4.3 TFLOPS. On paper that says the HD 7970 GHz should perform similarly to the 980 Ti, but it does not, not by a long shot; I have both GPUs.
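(Those peak numbers are just paper math, by the way: shader count × clock × 2 FLOPs per clock for an FMA, which real games never come close to sustaining. A rough illustration using the HD 7970 GHz Edition figures, purely for reference:)

```cpp
// Paper math only: peak FP32 throughput = shaders * clock(GHz) * 2 FLOPs (FMA).
// Example: HD 7970 GHz Edition -> 2048 * 1.05 * 2 / 1000 ≈ 4.3 TFLOPS.
double peakTflops(int shaderCount, double clockGHz)
{
    return shaderCount * clockGHz * 2.0 / 1000.0;  // GFLOPS -> TFLOPS
}
```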

It is easy to make claims or theorize about what will happen, but the only thing that will matter is games, and by the time enough games utilize async and DX12 we will probably be beyond Pascal and Polaris.
 

TJ Hooker

Titan
Ambassador
@NightAntilli is there any guarantee that asynch compute is something that can be readily taken advantage of in a wide variety of games? As far as I can tell, we have exactly one data point (Ashes benchmark) showing the performance increase from asynch. Extrapolating that to conclude that this performance increase will become the norm in future games seems premature.

The way you describe asynch seems reminiscent of SMT (hyper-threading), i.e. trying to intelligently assign tasks to keep all execution elements busy. Sounds great in theory, but real world benefits vary based on the application.
 


Oxide itself has actually said that they are only using a 'little' async. I don't know whether they have increased it as development went along. But indeed, its benefit will depend on the application. Compute and graphics tasks will always both be present in games, though, and because AMD's compute units can handle both without requiring a context switch, async will always provide a performance increase, provided no stalls happen due to bugs etc.

Remember back in the day when graphics cards had a separate vertex shader count and a separate pixel shader count? Then came the unified shader architecture, right? It brought performance benefits across pretty much everything. This is basically the same idea, except for compute and graphics.
 

TJ Hooker

Titan
Ambassador
Fair enough. I certainly hope you're right, would make my R9 380 an even more prudent purchase :). But for now I'm going to take the wait-and-see approach.

Don't actually remember the separate shader counts, before my time, haha. Only been following PC hardware news for the last couple years.
 
Nothing that I stated here is secret. It's all out there if you actually make the effort to understand what is available and has been officially published and investigated, which is exactly why I find it atrocious that multiple tech websites, which are supposed to know better than a layman (which I am), are making unfounded claims based on nothing more than empty opinion, without any data to back them up. And it gets annoying when these comments always benefit the same one and are detrimental to the other.
If this were the other way around, with nVidia having async and not AMD, no one would be saying that it's not representative or that we have to wait and see; it would be treated as expected, because nVidia is supposedly always in the lead anyway. Objectivity is nowhere to be seen, and if someone points this out, others have the audacity to call that person biased.

Going back to what I just explained regarding async... If I, a layman who simply keeps up with PC hardware news as a hobby (with a full-time job in something completely different, while building a house, by the way), can figure these things out, why can't the staff of multiple websites do the same, even though they have superior access to information in every way? What is the media doing these days...? iknowhowtofixit just killed every concern regarding async with a simple link. I don't understand why tech websites seem so ignorant lately. It makes me wonder what is happening behind the scenes.
 


Interesting link, although that is their technical marketer; his job is to sell their ideas and make them sound better than they might be. That is the job of every marketing department, and when done well it can make rather mediocre ideas/products succeed and become very popular. Just look at the iPod: there were vastly better MP3 players out at the time, but Apple had vastly better marketing.

That said, I don't think async is a simple implementation. There has to be some work and optimization behind it; it is not just a switch you flip in the API. It is probably simpler to implement than tessellation, for example, because tessellation requires the models to be designed with it in mind, which is why some games have some models that look better than others.

A better source would be an actual game developer who can tell you how easy/complicated it is to work on something.
 
Well, I guess an indication is how Oxide started with it. Basically, they said they 'gave it a whirl'. Here:

Saying we heavily rely on async compute is a pretty big stretch. We spent a grand total of maybe 5 days on Async Shader support. It essentially entailed moving some ( a grand total of 4, IIRC) compute jobs from the graphics queue to the compute queue and setting up the dependencies. Async compute wasn't available when we began architecting (is that a word?) the engine, so it just wasn't an option to build around even if we wanted to. I'm not sure where this myth is coming from that we architected around Async compute. Not to say you couldn't do such a thing, and it might be a really interesting design, but it's not OUR current design.

Saying that Multi-Engine (aka Async Compute) is the root of performance increases on Ashes between DX11 to DX12 on AMD is definitely not true. Most of the performance gains in AMD's case are due to CPU driver overhead reductions. Async is a modest perf increase relative to that. Weirdly, though there is a marketing deal on Ashes with AMD, they never did ask us to use async compute. Since it was part of D3D12, we just decided to give it a whirl.

BTW: Just to clarify, the AoS benchmark is just a script running our game. Thus, whatever the benchmark does, the game does. There is nothing particularly special about the benchmark except a really nice UI that our intern wrote around it. It's really our own internal tool we use to optimize and test. We just decided to make it public with a polished UI.

http://www.overclock.net/t/1575638/wccftech-nano-fury-vs-titan-x-fable-legends-dx12-benchmark/100_20#post_24475280

Note that this was last year, around October, so I'm not sure how much more they've optimized for it since then.
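For anyone wondering what 'moving some compute jobs from the graphics queue to the compute queue and setting up the dependencies' actually looks like, here is a rough sketch of the general D3D12 multi-engine pattern (not Oxide's code; the queue, command list and fence objects are assumed to already exist):

```cpp
// Rough D3D12 sketch: run a compute job on the compute queue and make the
// graphics queue wait on its completion via a fence.
#include <d3d12.h>

void SubmitAsyncComputeJob(ID3D12CommandQueue* computeQueue,
                           ID3D12CommandQueue* graphicsQueue,
                           ID3D12CommandList*  computeCmdList,
                           ID3D12Fence*        fence,
                           UINT64              fenceValue)
{
    // 1. Submit the compute work to the dedicated compute queue
    //    instead of the graphics queue.
    ID3D12CommandList* lists[] = { computeCmdList };
    computeQueue->ExecuteCommandLists(1, lists);

    // 2. Signal a fence when that compute work has finished...
    computeQueue->Signal(fence, fenceValue);

    // 3. ...and make the graphics queue wait on the fence before it consumes
    //    the results. Anything submitted to the graphics queue *before* this
    //    Wait is free to overlap with the compute job.
    graphicsQueue->Wait(fence, fenceValue);
}
```

On hardware that runs the two queues concurrently, step 1 overlaps with whatever graphics work is already in flight; on hardware that can't, the driver ends up serializing them, which is the context-switch cost discussed earlier in the thread.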
 


I never said it was misleading, but you won't hear him talk badly about a specific technology his company has a vested interest in. No one would buy into it if their own marketing team talked badly about it, would they?

When you go to buy a car, are you going to buy from the sales guy who touts the features or the one who talks them down? And I know, you probably do all your own research. My point is that a marketing guy is not going to tell you that async is hard to program or requires any work; he is going to say it is seamless and easy. That is marketing.

A developer who is being paid will still be able to tell you whether something is worthwhile or not.



I would assume they did a bit more optimizing considering the performance has improved since the first few benchmarks.
 

Madmaxneo

Reputable
FYI, just last night (7 March 2016) nVidia released a new driver that specifically mentions Ashes of the Singularity in the release notes. I have an EVGA GTX 980 SC ACX 2.0, and after the update I am getting between 60 and 90 fps in the benchmark tests, with no crashes or issues whatsoever, at least so far.
 

See what I bolded above... Do you believe me now?

[Image: Hitman PC DirectX 12 benchmarks]
 

Madmaxneo

Reputable


FYI, since those benchmarks were posted nVidia has updated their drivers twice, and Hitman is mentioned in both of those updates. I am playing Ashes of the Singularity on DX12, and those drivers have made a huge difference, about 20 to 30 fps (at least in Ashes)...
FYI, the last update was just this morning... and it makes a difference.
 

They used the GeForce 364.51 driver, which states that it will ensure optimal performance for Hitman.
http://www.computerbase.de/2016-03/hitman-benchmarks-directx-12/2/#diagramm-hitman-mit-directx-12-2560-1440
 
Until now, every Gaming Evolved title ran well on both Nvidia and AMD. Gameworks on the other hand... Your argument doesn't fly.

I guess I'll leave it at this. The amount of excuses will simply escalate anyway. Ashes will be released soon in its final version. And more titles will be added. I wonder how much evidence you people need.
 


http://www.pcgameshardware.de/Hitman-Spiel-6333/Specials/DirectX-12-Benchmark-Test-1188758/

http://i.imgur.com/70wUHzT.png

Conflicting results. That one shows the 980 Ti being faster than the Fury (I set it to 2560x1440 in the options) and well ahead of the R9 390, in Windows 10 on DX12.

So which one is correct?

In fact, the link you posted has two results for Hitman: one showing the 980 Ti on top and another showing the R9 390 beating a 980 Ti.

That is why I really do not like that site. How do you write a report with conflicting information?
 

FormatC

Distinguished
PCGH is always trying to do day-one reviews. Especially with the buggy Hitman, this kind of quick shot is not the best way to get representative results. It is the same with The Division: the results have changed with each patch/driver, and all those values from the beta are now just for fun, nothing more. I know Wolfgang from ComputerBase personally and I think he did a better job, because he takes more time.

But none of these fast guys mentioned that the built-in benchmark is actually usable, if you understand that only its log file is corrupt and you have to monitor everything yourself. The game is buggy as hell and its log gives you totally wrong numbers. In the end, these guys lost a lot of time and accuracy with their own benchmark scenes.

The faster card renders fewer frames than the slower card in the same time? The built-in benchmark says yes; I say no:

[Chart: in-game frame-rate curves]


If I compare my trimmed time window of the built-in scene with dynamic-world savegames, I get 99% reproducible results, not up to 5% difference for the same card across 5 runs. Especially with more or less equal cards, you can't stay objective with that kind of variance.

[Chart: built-in benchmark scene vs. walk-through savegame]


And now we see the main problem with all these fast reviews: there is no time to work on a scientific basis! Everyone wants to be first, but nobody proofs the basics first. Everything needs enough time to be accurate :D

The complete review is here:
Tom Clancy's The Division: 22 VGA cards and 6 CPUs tested (in German)

This isn't DirectX 12, but it's a good example of the old API and of what a developer can get right or not. We don't need new APIs every time; we need better and more stable apps first ;)

And Hitman + DirectX 12? This game is too buggy (and not finished yet) for serious reviews. I've played it for a few hours and decided to wait a week (or so). All current results are also outdated after yesterday's patch. Let's wait until after CeBIT and do an accurate review ;)
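If you want to do the 'monitor all things by yourself' approach, the per-run math is simple. A minimal sketch, assuming you've captured frame times externally into a plain text file with one millisecond value per line (the capture tool and exact format are up to you):

```cpp
// Quick per-run statistics from an externally captured frame-time log.
// Assumes a plain text file with one frame time in milliseconds per line.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <vector>

int main(int argc, char** argv)
{
    if (argc < 2) { std::cerr << "usage: framestats <frametimes.txt>\n"; return 1; }

    std::ifstream in(argv[1]);
    std::vector<double> ms;
    for (double t; in >> t; ) ms.push_back(t);
    if (ms.empty()) return 1;

    double totalMs = 0.0;
    for (double t : ms) totalMs += t;

    std::sort(ms.begin(), ms.end());
    const double avgFps = 1000.0 * ms.size() / totalMs;              // average FPS over the run
    const double p99Ms  = ms[static_cast<size_t>(ms.size() * 0.99)]; // 99th-percentile frame time

    std::cout << "frames: "    << ms.size()
              << "  avg fps: " << avgFps
              << "  99th percentile frame time (ms): " << p99Ms << '\n';
    return 0;
}
```

Comparing these numbers across several runs of the same scene tells you whether your test is reproducible enough to separate nearly equal cards.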
 


There is conflicting information. In the image you linked, a 980 Ti is getting matched by an R9 390. In the article from WCCFTech there was another image with the Strix beating the Fury and the R9 390 by a wide margin at the same settings (1440p, maxed, DX12).

If that is not conflicting information, then I do not know what is. And again, who is going to buy a stock 980 Ti? Very, very few people.

@FormatC, that is an interesting observation. I am not even sure I like what the developers are doing with the game. It seems like it won't be finished for quite a while, since they are releasing it in chunks.
 

Madmaxneo

Reputable


I have the beta, and it is due for full release in the next month or two. But there are so many bugs popping up now that it may be delayed. The game was doing great for a while, but now not so much. I am having a few issues with it: it constantly crashes in DX12 mode and still overheats my GTX 980 in DX11 mode... Plus, whenever I save and quit a game to come back to it later, the game freezes. Their graphics engine does not seem to be as good as they hoped, given all the problems. There is also an issue in DX12 with on-screen displays (overlays); it apparently causes driver crashes every so often.
 


Doesn't matter what people are going to buy. One was benchmarked with a stock 980 Ti, the other with the Strix, which has a significant factory overclock. That is the only difference; there is no conflicting information. You're just grasping at straws. It doesn't change the fact that an R9 390 benchmarked at 980 Ti levels, like I predicted. It significantly beat the 980 Strix in both benchmarks as well, so... yeah. That is enough confirmation that the only difference is stock 980 Ti vs. overclocked 980 Ti.

In a way, Hitman is more tailored to AMD hardware than Ashes is. Ashes is completely unbiased, because they have a specific vendor path for nVidia hardware at nVidia's request. They have not gimped anything for anyone. It is representative of balanced DX12 titles, whether you want to believe it or not.

It's quite funny. When nVidia wins in benchmarks, everyone says nVidia is the best once again. If anyone then suggests waiting for newer drivers, etc., in AMD's case, everyone says you're wasting your time because AMD drivers suck.
If AMD wins, it's 'benchmarks are unreliable', 'we need to wait and nVidia will catch up with new drivers', 'AMD-sponsored game', 'just one game', etc.

The double standard is uncanny.
 

Madmaxneo

Reputable


Now that you mention it, I realize I also apply that double standard when it comes to the AMD/nVidia battle. In all fairness, do you blame me? I learned to be that way because it has been like that for the past 10 to 20 years: AMD would release something that was supposedly better than nVidia's (or Intel's, for that matter), the initial results would show AMD on top, and then others would get into the testing and prove otherwise.
Over time AMD has produced some good stuff that did initially beat the competition, but shortly thereafter nVidia would come out with something that took the crown back. The primary issue in more recent years has been heat and power draw, which is not good.

TBH, I will remain skeptical of AMD for now. Time will tell whether those recent numbers that put AMD on top hold up; if they do, I may change my opinion over time. FYI, there are more sites than just the ones listed in the replies above. I wonder if there is a site that compares the numbers and charts from all the sites that do testing?
 

Math Geek

Titan
Ambassador
if i was amd i would bundle it as well :)

seems like they gain a lot from the game and it is in their interest to promote that. whatever the reason may be, it's hard to deny that this game likes amd cards a lot more than nvidia cards.
 

Madmaxneo

Reputable


The funny thing is at one point it was running great on my nVidia card. They did an update and suddenly it is all kinds of buggy. I am starting to think the devs are doing this on purpose.
 