Ashes Of The Singularity Beta: Async Compute, Multi-Adapter & Power

Page 4 - Tom's Hardware community forum
Status
Not open for further replies.

Math Geek

Titan
Ambassador
Of all of this, I am most excited about being able to mix and match GPUs. This is what people have wanted for a long time, and it is finally here.

I don't pick sides and don't care which works better, but I do wonder what else can be done with multiple random GPUs. There are so many features not explored yet, and I look forward to seeing how they work out. This is only one game, and it certainly seems to favor AMD (I'm still wondering about the FX CPUs getting a bit of a boost in this game), but from what I have read on the technical side, AMD seems to have planned ahead and had the right thing in place at the right time. Sure, nVidia will catch up, but it has been a long time since AMD has been able to claim even a small victory, so it is nice to see.
 


I'm afraid that if I mixed GPUs, too few games would support it and it might be unstable.
 
Actually, Stardock (which is partnered with Oxide) has been working on a software solution for DX12 for over a year now, one that allows not only nVidia and AMD cards to work together seamlessly, but also different brands and generations.
 

Math Geek

Titan
Ambassador
I look forward to seeing how they get that going as well, though you can already mix generations and the like with this game.

The problem is that the slower card seems to cap the performance: they stated they could only ever realize double the performance of the slowest card. So a really old card would completely cripple a new one if the two are too far apart in capability.

I wonder if this is something that can be worked around, or if it will hold true across the board.
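The "double the slowest card" ceiling mentioned above follows directly from alternate-frame rendering: if two GPUs strictly take turns producing frames, the pair can never outrun twice the slower one. A toy model (my own sketch, not Oxide's code; the framerate figures are made up for illustration):

```python
# Toy model of the multi-adapter scaling ceiling described above.
# With alternate-frame rendering (AFR) the two cards take turns, so the
# combined framerate is capped at twice that of the slower card.

def afr_fps(fps_a: float, fps_b: float) -> float:
    """Combined framerate of two GPUs strictly alternating frames."""
    return 2 * min(fps_a, fps_b)

# A fast card (~60 fps alone) paired with a much older ~20 fps card:
print(afr_fps(60, 20))  # 40.0 -- the old card cripples the fast one
# Two closely matched cards scale almost ideally:
print(afr_fps(60, 55))  # 110.0
```

This is why the posters above note that pairing cards too far apart in capability is counterproductive: the fast card spends most of its time waiting.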
 


Again, no one buys a stock 980 Ti, or a stock anything for that matter. Nearly every GPU people buy, unless they are early adopters, is overclocked by some margin, so any results with a stock card are hard to trust.

And again, the game has been shown to have multiple issues, bugs, and performance problems. Look again at what FormatC wrote. It needs a lot of work before any results can be verified.

And there is no double standard here. I used Radeon GPUs for 14 years straight before finally buying an nVidia GPU again. At the time, the 980 Ti Strix was a better value than the Fury X; hell, I couldn't even find a Fury X locally when I bought my current GPU. The only thing here is that you are using unstable results, which vary wildly depending on where you get them, as proof, when the only thing actually proven right now is that Hitman is very unstable.
 


And yet another one... See the bolded quote above.
Broadly speaking, Pascal will be an improved version of Maxwell, especially in FP64 performance, but not in asynchronous compute performance. NVIDIA will bet on raw power instead of asynchronous compute abilities. This means that Pascal cards will be highly dependent on driver optimizations and on game developers' goodwill.
http://www.bitsandchips.it/52-english-news/6785-rumor-pascal-in-trouble-with-asyncronous-compute-code
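For context on why async compute keeps coming up in this thread: the potential win comes from filling idle shader time in the graphics workload with compute work instead of running the two back to back. A toy timing model of my own (not from the linked article; all millisecond figures are invented for illustration):

```python
# Toy model of the async compute benefit debated in this thread.
# A GPU that serializes graphics and compute pays for both in full;
# a GPU that can overlap them hides compute inside the graphics
# pipeline's idle "bubbles".

def frame_time_serial(gfx_ms: float, compute_ms: float) -> float:
    """Frame time when graphics and compute run back to back."""
    return gfx_ms + compute_ms

def frame_time_async(gfx_ms: float, compute_ms: float, idle_ms: float) -> float:
    """Frame time when compute overlaps idle time in the graphics work."""
    return gfx_ms + max(0.0, compute_ms - idle_ms)

# 10 ms of graphics containing 3 ms of idle bubbles, plus 4 ms of compute:
print(frame_time_serial(10, 4))    # 14.0 ms
print(frame_time_async(10, 4, 3))  # 11.0 ms
```

The "raw power" bet attributed to nVidia above amounts to shrinking `gfx_ms` and `compute_ms` outright rather than overlapping them.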
 


The most important part of what you linked is this: [RUMOR]

That word means it may or may not be true, and last I checked, most rumors are not true. Otherwise we would all be sitting pretty with 8GB HBM Fury Xs. Or was it 8GB GDDR5 R9 390Xs based on Fiji? Wait, I know, it was the GPU with every aspect of the R9 290 doubled that AMD was supposedly working on...

Rumors are rumors; they mean nothing until we have official information. Running with rumors is why some sites never become more than a one-off and fade into the background.
 


But you are assuming that nVidia either did not know about async compute at all when designing Pascal, or chose to ignore it.

Again, it is just a rumor, because we have no information on the uArch itself beyond what nVidia claims the performance gains will be.

And furthermore, I have seen people use these types of rumors as total fact in the past, only to backtrack when confronted once they turned out to be false.

The same goes for early beta-stage game benchmarks.

I am not standing up for nVidia; as I said, I used Radeon for 14 years due to bad experiences with nVidia before that. But we cannot allow anything that starts with the word rumor to be used as any sort of fact. When TH or AnandTech get their hands on Pascal and Polaris, they will tell us the real story. Until then, other sites will continue to post any rumor they can find to generate traffic.

Hell, I was watching the Android update schedule for the Galaxy S6 to see when it would get Android 6.0, and one site posted a new article on the subject every day with no real info.

They are all about those clicks.
 
There are reasons why I say things. I'm not making things up.

In order to have something on the market while awaiting the node shrink, both nVidia and AMD brought out another batch of cards. AMD did it by slightly changing and improving the R9 200 series and releasing it as the R9 300 series, along with the Fury cards. nVidia did it by adapting their future 16nm chip to the 28nm node, cutting out certain parts. In other words, Maxwell 2 is pretty much a cut-down Pascal.

Unless they managed to revamp Pascal in such a short time, you can forget async compute. I bet they tried, though... But the time frame was so short that it would be practically impossible, especially considering that at GDC 2015 they stated they were a long way from achieving it.

Again, there are reasons for what I have stated in that post.
 


Everything I have read says that Maxwell V2 is a larger, enhanced version of Maxwell.

Where are you getting the info that Maxwell V2 is a 28nm cutdown version of Pascal?
 
^

The first release of Pascal won't be anything groundbreaking except for efficiency. They will replace the 750 Ti, 960, 980 Ti, Titan X and so on with slightly faster, more efficient Pascal versions: what the 960 was to the 760/670, the 970 to the 780, or the 980 to the 780 Ti. This Pascal Titan will only barely be faster than the 980 Ti. It's going to be this time next year, April 2017, before big Pascal hits the market and we see dramatic improvements. It definitely will not happen sooner, period, end of story... Same story for AMD and their successor; we are stuck with roughly the horsepower +5% we have now.

In that sense, yes, v1 Pascal is just a die shrink and a preview of the efficiency progress; v2 will be the horsepower show-off. AMD is doing the same routine as well now. Efficiency-wise they are not quite as good as nVidia, but horsepower-wise they are definitely keeping up. It will be interesting to see VR benchmarks in a few months, once relatively optimized drivers are out for the current crop of VR games.

Those who have a Fury/980 Ti, don't fret: there is nothing coming out anytime soon that is a game changer. If you have a 390/970, same deal, except prices will drop and you may be able to afford something faster. But don't hold out thinking there is something a few months away that will offer 25%+ more improvement at the same price point.
 


Source?
 


Source?
 
All the information you need about big Pascal is there. Obviously the 1080 Ti (30-40% faster than the 980 Ti), or whatever it ends up being called, is not going to show up by surprise tomorrow or anytime soon. But we do know we are getting a new Titan, though it definitely will not be based on the big Pascal chip. And I wouldn't be surprised if, in the next few months, we see Pascal GeForce cards that replace the 970/980/980 Ti in each price tier and are a tiny bit faster. But nothing earth-shattering is happening, obviously.
 


Going from FL11_0 to (almost) 12_1 by just 'enhancing'? Yeah right...

But look at several of nVidia's recent slides. They say Pascal = Maxwell + mixed precision + 3D memory + NVLink. You can say that it's a larger, enhanced version of Maxwell, but that's merely how it looks from the outside. Internally, things have shifted quite a bit. The difference between Maxwell and Maxwell 2 is a lot bigger than the difference between Maxwell 2 and Pascal. Look at the changes on Wikipedia, and you'll see what I mean.

Also, look at their road maps...
Before:
[image: earlier nVidia GPU roadmap]


After:
[image: updated nVidia GPU roadmap]


Notice there's no Pascal in the first one? Maxwell was supposed to be the node shrink to 20nm. But since 20nm pretty much failed, they had to change their plan.
Notice how Maxwell was supposed to have unified memory, but couldn't, since the node shrink didn't happen. Instead, they upgraded Kepler and named it Maxwell, rather than it being the node shrink, in order to buy time. With things still not looking up for the node shrink, they needed more time. They had no choice but to take their future GPU and engineer it to be viable on 28nm.
So what they did was 'fuse' Volta and the dropped 20nm Maxwell, renaming the result Pascal. Notice how Pascal has adopted both unified memory (what 20nm Maxwell was supposed to have) and 3D memory/stacked RAM (what Volta was supposed to have). This design was likely still based on 20nm, meaning it needed to be cut down to a reasonable size for 28nm, and that's when Maxwell 2 was born. Pascal, being 16nm, could include everything that was designed for the 20nm node. Volta itself has been shifted to a later release, in order to be adapted to include async compute.
 


They don't say that at all.

The first one is older and shows what the plans were. The second one is an update. How many times have AMD's road maps for their CPUs changed? Or even their GPUs, due to the same 20nm flub?

Basically, you have no real proof that Maxwell V2 is Pascal at 28nm. You are just assuming it based on road maps (which are always subject to change, especially if your fabrication vendor fails to deliver a usable process node) and on what you think happened, not what probably really happened.

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/3

Drilling down to the SMMs for a second, there are a couple of small changes that need to be noted. Organizationally the GM204 SMM is identical to the GM107 SMM, however GM204 gets 96KB of shared memory versus 64KB on GM107. Separate from the combined L1/texture cache, this shared memory services a pair of SMMs and their associated texture units to further reduce the need to go to L2 cache or beyond.

The Polymorph Engines have also been updated. There are not any major performance differences with the 3.0 engines, but they are responsible for implementing some of the new functionality we’ll reference later.

Other than this, GM204’s SMM is identical to the GM107 SMM. This includes the use of 4 shared texture units per 2 SMMs, leading to a 16:1 compute-to-texture ratio, and a 512Kb register file for each SMM.
 
Road maps indeed show plans, and they show how plans change. Do you really think it's a coincidence that Pascal has what both Volta and Maxwell were supposed to have? Combine that with the 20nm node failure not being a secret, and it's not hard to figure out. Volta might also get a name change, etc., so we might not see the same name again...

But don't worry: when the cards are released, you'll see that they're pretty much 16nm Maxwell 2, with 16nm allowing room for more processing units. That's it. With NVLink etc., obviously.

And;
http://www.tweaktown.com/news/41666/nvidia-rumored-to-skip-20nm-gpus-with-tsmc-making-its-16nm-gpus/index.html

Going back to Ashes, we have these new benchmarks:
http://www.overclock3d.net/reviews/gpu_displays/ashes_of_the_singularity_pc_performance_review/10

 


In that case, we could just say that Maxwell V2 is even just Fermi, since it uses the same base design, or that Fiji XT is just Tahiti, since it is just GCN with a few new features added.

Still, you have not shown any proof. All you did was link an article rumoring what we all know to be true: that the 20nm process node failed and nVidia made Maxwell V2 on 28nm. Nowhere does that article state that Pascal was moved to 28nm.

I, on the other hand, posted a link showing how similar Maxwell and Maxwell V2 are, apart from some basic changes. And until we get the Pascal layout, we know nothing more than what we have seen.
 
In that case we could just say that Maxwell V2 is even just Fermi since it uses the same base design...
But, it's not, and it doesn't.

Fermi/Kepler designs (SMX) have 50% more CUDA cores per unit than Maxwell (SMM), a different allocation of execution resources in any given cycle, and they lack Maxwell's texture compression. Resources were traded for power efficiency.
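The "50% more CUDA cores" figure above can be checked with quick arithmetic using the commonly cited per-block core counts (192 for a Kepler SMX, 128 for a Maxwell SMM); this is purely an illustration of the claim, not new information:

```python
# Per-block CUDA core counts behind the "50% more" claim above.
smx_cores = 192  # Kepler SMX
smm_cores = 128  # Maxwell SMM

ratio = smx_cores / smm_cores
print(ratio)  # 1.5 -> the SMX has 50% more cores per block than the SMM
```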



 


Fermi was the last major change to nVidia's uArch, just as Tahiti and GCN were for AMD. Even though Fiji XT is similar to Tahiti XT, it is different: Tahiti has only 2 ACEs while Fiji XT has 8, along with some other changes.

I was just making the point that you consider anything similar to be the same.
 