News Microsoft says DirectX Raytracing 1.2 will deliver up to 2.3x performance uplift

"Also, AMD has certain scheduling optimizations that may mimic how SER works, so if game developers take time to optimize for Radeon GPUs, the latter may get some speed improvements."

So AMD cards won't get any benefits from these. Got it.

And good to see MS actually is doing something with DirectX. DX feels dead and has felt dead for a while. Zero new tech from their end. Same thing I can say about Khronos and Vulkan/OGL, though.

Regards.
 
DXR is a real advantage over Vulkan. Someone has to push for a vendor agnostic standard here, else Nvidia will continue to monopolize it just like they do the AI space.
Whut?!
The only vendor agnostic standard would be software rendering and that would be terrible.
You have to use the hardware to make it fast and that hardware is going to change from one to the other.
If they make a standard for certain hardware then hardware will not progress anymore for fear of losing compatibility.
Which is why we still have the x86 base from the 1920s (hyperbole).
 
Hopefully it's a SW thing and RDNA 2-4 will get it sooner or later 🤷‍♂
My understanding in talking with Nvidia about SER is that there are hardware features alongside the software tweaks, which is why Ampere and Turing don't support SER. You can reorganize things in software before sending the shaders to the GPU to execute, but apparently that doesn't do much.

Frankly, SER has always felt a bit like something that's mostly software and could be implemented for other GPUs. Sort of like how Framegen and MFG could be done on tensor cores. So, it wouldn't be surprising if there is a benefit to DXR 1.2 with SER on pre-Ada RTX cards, along with AMD and Intel GPUs. Maybe not as much of a benefit as if there are hardware hooks, though.
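To illustrate what software-side reorganization looks like, here's a toy CPU-side sketch (plain C++, with made-up types; purely illustrative, not anyone's actual implementation) that sorts ray-hit records by material key so consecutive items run the same shader. SER appears to do something conceptually similar on-chip, within an SM, which is where the hardware hooks come in.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical hit record: which shader a ray wants to run next.
struct HitRecord {
    uint32_t materialId;   // selects the hit shader
    uint32_t rayIndex;     // which ray produced this hit
};

// Group hits so that consecutive records invoke the same shader.
// On a SIMD machine, this turns divergent warps into coherent ones.
void reorderForCoherence(std::vector<HitRecord>& hits) {
    std::sort(hits.begin(), hits.end(),
              [](const HitRecord& a, const HitRecord& b) {
                  return a.materialId < b.materialId;
              });
}
```

The catch with doing this purely in software is the round trip through memory to gather and scatter the work, which may be why it "doesn't do much" without hardware support.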
 
My understanding in talking with Nvidia about SER is that there are hardware features alongside the software tweaks, which is why Ampere and Turing don't support SER. You can reorganize things in software before sending the shaders to the GPU to execute, but apparently that doesn't do much.

Frankly, SER has always felt a bit like something that's mostly software and could be implemented for other GPUs. Sort of like how Framegen and MFG could be done on tensor cores. So, it wouldn't be surprising if there is a benefit to DXR 1.2 with SER on pre-Ada RTX cards, along with AMD and Intel GPUs. Maybe not as much of a benefit as if there are hardware hooks, though.
You are probably wrong about SER. There are several hardware requirements for it: a huge L2 (N×10 MB) capable of holding all the register files, since the reordering happens through it, plus the reordering function itself. In addition, these calls must be coded into the game, which no one has done except CP2077. In my opinion, the technology looks terrible; if you imagine how many of these permutations would be performed in each frame, it's just a waste of L2 bandwidth and overhead...
Probably for this reason the technology isn't used, and the implementation in DX won't advance it much. More about SER on C&C.

AMD's approach to hardware thread scheduling in RDNA4 without the need for a huge L2 seems more interesting, but we don't know the implementation details yet.
 
So now Microsoft decides it wants to do something. Nvidia brought out ray tracing in 2018. Microsoft is way behind. They need to get into the gaming mood. But I guess since they're partners with AMD, they decided they're going to do something now that AMD finally has OK ray tracing support after all these years. Shame.
 
So now Microsoft decides it wants to do something. Nvidia brought out ray tracing in 2018. Microsoft is way behind. They need to get into the gaming mood. But I guess since they're partners with AMD, they decided they're going to do something now that AMD finally has OK ray tracing support after all these years. Shame.
I mean, unless you are talking about the consoles, AMD has had solid ray tracing support in its GPUs since 2020.
 
So AMD cards won't get any benefits from these. Got it.
Maybe not yet, at least.

And good to see MS actually is doing something with DirectX. DX feels dead and has felt dead for a while. Zero new tech from their end. Same thing I can say about Khronos and Vulkan/OGL, though.
I don't follow DX12 very closely, but I'm wondering where you got the idea that Vulkan feels dead or like it's not getting new tech?

Vulkan 1.4 was released just last December.

And while 1.4 might not sound like a very active or mature standard, I think they're probably using semantic versioning, where they'd advance to 2.0 when making changes not backwards compatible with earlier Vulkan revisions, which I think has yet to occur.

The main way that Vulkan advances is through extensions. According to the Mesa compatibility tracker, there are currently 252 Vulkan extensions defined.


It's so many that it's actually become a problem, with developers not knowing which they can count on all relevant implementations having. Vulkan 1.3 sought to address that by introducing Profiles.
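To give a flavor of what that burden looks like in practice, here's a minimal sketch of the per-extension runtime check an app performs (standard Vulkan API calls; the surrounding setup is assumed):

```cpp
#include <cstring>
#include <vector>
#include <vulkan/vulkan.h>

// Returns true if the physical device advertises the named extension.
bool hasDeviceExtension(VkPhysicalDevice gpu, const char* name) {
    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, nullptr);
    std::vector<VkExtensionProperties> exts(count);
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, exts.data());
    for (const auto& e : exts)
        if (std::strcmp(e.extensionName, name) == 0)
            return true;
    return false;
}

// e.g. hasDeviceExtension(gpu, VK_KHR_RAY_TRACING_PIPELINE_EXTENSION_NAME)
```

Multiply that check across 252 extensions and dozens of implementations, and the need for Profiles becomes obvious.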

Now, you also mentioned OpenGL. That, in fact, is basically in maintenance mode. Every now and then, some new extensions will get defined. Some vendors, like Imagination and even the new Nvidia Linux driver aren't even bothering to write an OpenGL driver. Instead, they simply recommend customers use Zink, which is an OpenGL compatibility layer that runs atop Vulkan.
 
So now Microsoft decides it wants to do something. Nvidia brought out ray tracing in 2018. Microsoft is way behind.
Huh? Microsoft released DXR before Nvidia ever shipped Turing! This announcement is dated March 18th, 2018:

In fact, Nvidia and Microsoft worked together to define DXR, which Nvidia then demo'd with a software implementation running on Volta (the Darth Vader + storm troopers demo).

I guess since they're partners with AMD, they decided they're going to do something now that AMD finally has OK ray tracing support after all these years. Shame.
You'd think maybe Microsoft would be working more closely with AMD, due to their collaboration on Xbox. However, that wasn't true of the original DXR and it's not true of this v1.2, which adds support for features that only Nvidia currently implements in their hardware.

I mean, unless you are talking about the consoles, AMD has had solid ray tracing support in its GPUs since 2020.
"solid" seems like an overstatement. I think ray tracing on AMD GPUs wasn't terribly usable, before RDNA3 (2022).
 
You are probably wrong about SER. There are several hardware requirements for it: a huge L2 (N×10 MB) capable of holding all the register files, since the reordering happens through it, plus the reordering function itself.
Huh?

The Chips & Cheese article you cited mentions that Ada does have enough L2 cache for all the registers, but they explicitly state they don't believe that's how it actually works!

"I’m guessing SER only reorders threads within the SM. That reduces the scope of reordering, making the process faster.
... reordering across the entire GPU would be another order of magnitude more difficult.
... doing a GPC-wide or GPU-wide barrier would be quite costly."

In my opinion, the technology looks terrible; if you imagine how many of these permutations would be performed in each frame, it's just a waste of L2 bandwidth and overhead...
They point out that "NVIDIA is apparently able to do all this quite efficiently, because they show examples where SER is invoked many times within a shader." In the section I quoted from, above, they observe that SM-local shared memory could be used to store the keys and are careful to note that it's only live registers that must be spilled.

Finally, SER is one of the areas where Blackwell introduced improvements at the microarchitecture level. I'm not sure if the improvements are limited to code involving Tensor core operations, or if the point of the slide is just to show how SER also benefits neural shaders.

[Image: Nvidia slide showing Blackwell's SER improvements]


Source: https://www.tomshardware.com/pc-com...t-the-upgrades-coming-with-rtx-50-series-gpus

Frankly, the basic idea behind SER is something I've been thinking about, pretty much since I learned how modern GPUs actually work. It just makes sense, when dealing with heavily-divergent code. However, I can think of other ways to achieve similar ends, though none without at least some overhead.

Nvidia has been designing & programming GPUs for 30 years and they're pretty good at it. And they know how to do math. So, I'm pretty confident SER is a net-positive, if you're reasonably judicious where & how you use it.
 
Maybe not yet, at least.


I don't follow DX12 very closely, but I'm wondering where you got the idea that Vulkan feels dead or like it's not getting new tech?

Vulkan 1.4 was released just last December.

And while 1.4 might not sound like a very active or mature standard, I think they're probably using semantic versioning, where they'd advance to 2.0 when making changes not backwards compatible with earlier Vulkan revisions, which I think has yet to occur.

The main way that Vulkan advances is through extensions. According to the Mesa compatibility tracker, there are currently 252 Vulkan extensions defined.


It's so many that it's actually become a problem, with developers not knowing which they can count on all relevant implementations having. Vulkan 1.3 sought to address that by introducing Profiles.

Now, you also mentioned OpenGL. That, in fact, is basically in maintenance mode. Every now and then, some new extensions will get defined. Some vendors, like Imagination and even the new Nvidia Linux driver aren't even bothering to write an OpenGL driver. Instead, they simply recommend customers use Zink, which is an OpenGL compatibility layer that runs atop Vulkan.
The thing with extensions is that they've always been there, so saying "Vulkan is not dead in the water because there's extensions" is, to me, similar to "Windows is fine as long as you install X, Y, or Z software". Why isn't it already part of the spec if it's pushing it forward? That's on Khronos to address.

And yes, none of the Vulkan updates has introduced anything that I would be willing to say "yeah, that's a really good addition to the spec and it'll help devs/games/visuals a lot".

It's a matter of expectations, for sure. Saying they haven't done anything is technically incorrect, but measured against my personal expectations, it is the case.

Regards.
 
The thing with extensions is that they've always been there,
Huh? No, there's a continual stream of new extensions being added all the time. Mostly, what Vulkan releases do is promote a subset of the extensions into the core API.

so saying "Vulkan is not dead in the water because there's extensions" is, to me, similar to "Windows is fine as long as you installl X Y or Z software".
I don't agree with that analogy, because there are two classes of extensions. There are vendor-specific extensions, which are sort of like 3rd-party software, except that even they are specified in the core Vulkan registry - they just haven't gone through the full vetting process needed to become official extensions. And even there, a middle ground exists, where you have extension proposals that haven't yet been ratified.

Why isn't it already part of the spec if it's pushing it forward? That's on Khronos to address.
The development model of Vulkan is sort of like trialing new capabilities as extensions, giving the industry time to try them and iterate, before they're eventually ratified. Due to the broad range of devices Vulkan supports, relatively few extensions are promoted all the way to mandatory features.
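Dynamic rendering is a concrete example of that promotion pipeline: it started life as the VK_KHR_dynamic_rendering extension, then was folded into core Vulkan 1.3. A minimal sketch of how an app typically straddles the two (the helper name is mine):

```cpp
#include <vulkan/vulkan.h>

// Decide whether dynamic rendering comes from core 1.3 or the extension.
// 'props' would come from vkGetPhysicalDeviceProperties().
bool needsDynamicRenderingExtension(const VkPhysicalDeviceProperties& props) {
    // Promoted to core in 1.3: no extension needed on newer drivers.
    if (props.apiVersion >= VK_API_VERSION_1_3)
        return false;
    // Older device: must enable VK_KHR_dynamic_rendering explicitly
    // (and check it's actually supported, as in the earlier snippet).
    return true;
}
```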

And yes, none of the Vulkan updates has introduced anything that I would be willing to say "yeah, that's a really good addition to the spec and it'll help devs/games/visuals a lot".
I guess, if you're talking about core features, they're not going to make something like ray tracing mandatory. At least, not yet. Unlike DX12, Vulkan is supported on everything from tiny, embedded SoCs to CPUs and big gaming GPUs. So, doing that would force them to be implemented even on low-end devices, where they'd be completely unusable and it would just be a waste of effort.

It's a matter of expectations, for sure. Saying they haven't done anything is technically incorrect, but measured against my personal expectations, it is the case.
I just don't even know what you're talking about. Vulkan even introduced an extension for cooperative vectors, at the same time as Microsoft, in order to support Neural shaders. I don't know what's more cutting-edge than that.

What's an example of something it doesn't do that you'd expect it to?
 
They point out that "NVIDIA is apparently able to do all this quite efficiently, because they show examples where SER is invoked many times within a shader." In the section I quoted from, above, they observe that SM-local shared memory could be used to store the keys and are careful to note that it's only live registers that must be spilled.
My assessment is mainly from the point of view of a software developer: an approach like NV's, inserting SER calls roughly every 5 lines, goes straight into the trash 😅

Either it is implemented completely transparently in hardware, like in a CPU, or it goes unused. Two years after the presentation, we see only a lack of adoption.

So I consider it the same crutch as DLSS: you have to mangle your code instead of doing the real optimizations you could be doing.

Likewise, it's possible to hog the entire GPU magically sorting shaders/threads/rays instead of doing the real work, namely the computations in those shaders/threads/rays. This necessitates a strict balance between these manipulations and the actual work. An unnecessary headache.

Trying to dump all this work on developers is doomed to failure in my opinion. It looks like an attempt to abandon fully transparent hardware OOO. Therefore, the implementation of OOO in RDNA 4 is of real interest.
 
My assessment is mainly from the point of view of a software developer: an approach like NV's, inserting SER calls roughly every 5 lines, goes straight into the trash 😅
No, it wouldn't be used every 5 lines. The natural places to use it are when you're already either synchronizing (i.e. barrier) or have written out most of the intermediates to memory and facing significant divergence.

As they say, using it involves computing a key that can be used for sorting. Maybe that key naturally falls out of your algorithm, but probably you'd at least have to do something extra to make it. So, it has an overhead that's visible to the developer.
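To make "computing a key" concrete, here's a hypothetical C++ sketch of the kind of coherence key a developer might construct, with the material ID in the high bits and a coarse ray-direction bucket in the low bits (the field layout is entirely my own assumption):

```cpp
#include <cstdint>

// Pack a sort key for reordering: shader selection dominates,
// directional coherence breaks ties.
uint32_t makeSortKey(uint32_t materialId, float dirX, float dirY, float dirZ) {
    // Quantize the dominant axis + sign of the ray direction into 3 bits.
    uint32_t axis = 0;
    float ax = dirX < 0 ? -dirX : dirX;
    float ay = dirY < 0 ? -dirY : dirY;
    float az = dirZ < 0 ? -dirZ : dirZ;
    float comp = dirX;
    if (ay > ax && ay > az)      { axis = 1; comp = dirY; }
    else if (az > ax && az > ay) { axis = 2; comp = dirZ; }
    uint32_t dirBits = (axis << 1) | (comp < 0 ? 1u : 0u);
    return (materialId << 3) | dirBits;
}
```

Building and consuming that key is exactly the developer-visible overhead in question.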

Either it is implemented completely transparently in hardware, like in a CPU, or it goes unused. Two years after the presentation, we see only a lack of adoption.
That's a little unfair, because prior to this DXR 1.2 update, any code using it would have to involve a codepath custom-written for Nvidia. With SER getting integrated into DirectX, it can be used in a portable fashion.

So I consider it the same crutch as DLSS: you have to mangle your code instead of doing the real optimizations you could be doing.
Didn't you hear that MS also introduced a portable API for upscaling? Yes, these upscalers require computation of analytic motion vectors (like TAA did) and providing a few other residuals and byproducts to it, but some commonality has coalesced around what sorts of inputs these scalers want.
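For reference, the "analytic motion vectors" in question are just a reprojection delta. A minimal sketch (hand-rolled math types, my own simplification; real engines compute this in a shader with their own matrix conventions):

```cpp
// Per-vertex motion vector: project the same world-space point with this
// frame's and last frame's view-projection matrices and take the NDC delta.
struct Vec4 { float x, y, z, w; };

Vec4 mulMat4(const float m[16], Vec4 p) {            // row-major 4x4 * vec4
    return { m[0]*p.x  + m[1]*p.y  + m[2]*p.z  + m[3]*p.w,
             m[4]*p.x  + m[5]*p.y  + m[6]*p.z  + m[7]*p.w,
             m[8]*p.x  + m[9]*p.y  + m[10]*p.z + m[11]*p.w,
             m[12]*p.x + m[13]*p.y + m[14]*p.z + m[15]*p.w };
}

// Result is in NDC units; upscalers typically want it scaled to UV/pixels.
void motionVector(const float currViewProj[16], const float prevViewProj[16],
                  Vec4 worldPos, float out[2]) {
    Vec4 c = mulMat4(currViewProj, worldPos);
    Vec4 p = mulMat4(prevViewProj, worldPos);
    out[0] = p.x / p.w - c.x / c.w;
    out[1] = p.y / p.w - c.y / c.w;
}
```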

Likewise, it's possible to hog the entire GPU magically sorting shaders/threads/rays instead of doing the real work, namely the computations in those shaders/threads/rays. This necessitates a strict balance between these manipulations and the actual work. An unnecessary headache.
There are zillions of ways game developers can shoot themselves in the foot, if they don't know what they're doing. I once asked an AMD driver engineer about the level of sophistication of the game developers he deals with, and he said it was quite high. He said you basically don't get to that level of game development unless you know what you're doing.

Trying to dump all this work on developers is doomed to failure in my opinion.
They don't have to use it. It's just another tool they now have in their toolbox. Even when it is used, it's likely to be employed only in a relatively small number of places where it's a big win.

It looks like an attempt to abandon fully transparent hardware OOO.
It's nothing to do with OOO. That's concerning the order in which operations happen. SER is about addressing divergence in the control flow, which is deterministic. As long as you're still using SIMD, OOO doesn't help you at all with divergence. If you're advocating for moving away from SIMD, that's something none of the GPUs are doing.
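To make the divergence point concrete, here's a toy CPU-side model in plain C++ (numbers hypothetical) of why a SIMD warp pays for every branch target any of its lanes selects, regardless of instruction ordering:

```cpp
#include <array>
#include <cstdio>

int main() {
    // A 32-wide "warp" executes in lockstep: when lanes disagree about a
    // branch, the hardware runs each branch target with the other lanes
    // masked off, serializing the work.
    std::array<int, 32> material{};
    for (int i = 0; i < 32; ++i) material[i] = i % 4;  // 4 materials, interleaved

    // Count how many serialized passes this warp needs: one per distinct
    // material present in the warp.
    int passes = 0;
    for (int m = 0; m < 4; ++m) {
        for (int lane = 0; lane < 32; ++lane) {
            if (material[lane] == m) { ++passes; break; }
        }
    }
    std::printf("serialized passes: %d\n", passes);    // prints 4

    // After sorting threads by material (SER's job), each warp holds a
    // single material and the same workload takes one pass.
    return 0;
}
```

Regrouping threads so each warp sees a single material is the only fix, and that's a thread-placement problem, not an instruction-ordering one.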
 
Huh? No, there's a continual stream of new extensions being added all the time. Mostly, what Vulkan releases do is promote a subset of the extensions into the core API.


I don't agree with that analogy, because there are two classes of extensions. There are vendor-specific extensions, which are sort of like 3rd-party software, except that even they are specified in the core Vulkan registry - they just haven't gone through the full vetting process needed to become official extensions. And even there, a middle ground exists, where you have extension proposals that haven't yet been ratified.


The development model of Vulkan is sort of like trialing new capabilities as extensions, giving the industry time to try them and iterate, before they're eventually ratified. Due to the broad range of devices Vulkan supports, relatively few extensions are promoted all the way to mandatory features.


I guess, if you're talking about core features, they're not going to make something like ray tracing mandatory. At least, not yet. Unlike DX12, Vulkan is supported on everything from tiny, embedded SoCs to CPUs and big gaming GPUs. So, doing that would force them to be implemented even on low-end devices, where they'd be completely unusable and it would just be a waste of effort.


I just don't even know what you're talking about. Vulkan even introduced an extension for cooperative vectors, at the same time as Microsoft, in order to support Neural shaders. I don't know what's more cutting-edge than that.

What's an example of something it doesn't do that you'd expect it to?
I don't want to rebut every individual quote, as your breaking things down like that is annoying to me.

So, I'll just reply to whatever is last in your text. This is a good example: https://registry.khronos.org/vulkan/specs/latest/man/html/VK_KHR_portability_subset.html

Extensions have always been a crutch in both OGL and Vulkan, where in Vulkan they could have gotten rid of them and taken back control of the full spec, but I guess Khronos can't be that brave. Yes, I'm oversimplifying it, but it's not wrong. I hate to say this, but that is one important element that Microsoft has handled very well with DirectX and the... Uh... Features was it?

Regards.
 
Extensions have always been a crutch in both OGL and Vulkan, where in Vulkan they could have gotten rid of them and taken back control of the full spec, but I guess Khronos can't be that brave.
I think you're missing the point of how Vulkan evolves. They do promote certain extensions into features and 1.3 did introduce the notion of Profiles, in order to bring some sanity and draw some lines in the sand. Profiles enable application developers to know which extensions they can count on from a certain class of device and device manufacturers know which extensions they must provide to support a certain class of applications.

Yes, I'm oversimplifying it, but it's not wrong. I hate to say this, but that is one important element that Microsoft has handled very well with DirectX and the... Uh... Features was it?
Even Direct3D 12 has lots of runtime capability queries:
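For example, here's a minimal sketch (error handling mostly elided, helper name mine) of querying the device's DXR tier through ID3D12Device::CheckFeatureSupport:

```cpp
#include <d3d12.h>

// Ask the driver which DXR tier this device implements, if any.
bool supportsDxr11(ID3D12Device* device) {
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 opts5 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                           &opts5, sizeof(opts5))))
        return false;
    return opts5.RaytracingTier >= D3D12_RAYTRACING_TIER_1_1;
}
```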

But, I think a big difference you're glossing over is the range of device support provided by Vulkan vs. Direct3D. Vulkan scales down to run on smartphones, smart watches, and little IoT devices. By contrast, Direct3D pretty much only supports iGPUs and dGPUs in PCs and laptops (yes, and Xbox). In order to cover a wider range of device classes, you need a more modular and flexible API.

If Vulkan started raising the bar and mandating lots of features that small devices either couldn't implement or that would make no sense to use on them, those devices would be left behind and their manufacturers would simply continue to implement only the last Vulkan version they could support. While that might not sound so bad, it means the devices wouldn't pick up newer features they could support, and this leaves users and app developers worse off.

Finally, let's not forget where this all started. You said:

DX feels dead and has felt dead for a while. Zero new tech from their end. Same thing I can say about Khronos and Vulkan/OGL, though.

Now that we've cleared up the matter that extensions aren't just like installing some 3rd party application on an OS, but are instead fundamental to the Vulkan standardization process - which has indeed been proceeding at a brisk pace - I'm going to consider at least that issue settled.
 
Now that we've cleared up the matter that extensions aren't just like installing some 3rd party application on an OS, but are instead fundamental to the Vulkan standardization process - which has indeed been proceeding at a brisk pace - I'm going to consider at least that issue settled.
Emphasis mine. To me, extensions are anything BUT "standardizing" the spec. Hell, I'd even argue that, by definition, allowing custom things into a spec stops it being "standard". The moment you ask developers to check if an extension is present, it's no longer something that is "part of the API". That is my beef with extensions and has always been.

Also, you mention "wide support of devices" as some kind of saving grace. Sorry, but that's not an excuse. There's a reason why OGL-ES exists. No idea if they are using "the one Vulkan" API for everything, but, again, that also is not "standard" across devices if you have a different spec (subset or not) specific to certain types of devices. You just have a lot of them, but not "the one".

I'll stop here, as my perception of Vulkan and its "evolution" is based on my own impressions and usage of it, and it won't change. I'm not an expert on DirectX or its equivalent "features", but at least I know it has been better handled historically, so it does not become fragmented and it works, for the most part, reliably. Double-edged swords and all that.

Regards.
 
Emphasis mine. To me, extensions are anything BUT "standardizing" the spec. Hell, I'd even argue that, by definition, allowing custom things into a spec stops it being "standard". The moment you ask developers to check if an extension is present, it's no longer something that is "part of the API". That is my beef with extensions and has always been.
I'm going to leave this be. As I said, Direct3D has an analogous concept of capability querying (see link), so you're dealing with runtime checks no matter what.

Also, you mention "wide support of devices" as some kind of saving grace. Sorry, but that's not an excuse.
Having the same API span a wide range of devices enables shared tooling, shared implementations (both in apps & drivers), and widens the pool of expertise.

There's a reason why OGL-ES exists.
Android deprecated OpenGL-ES and Apple dropped it even earlier. It stopped evolving even before the main OpenGL, with the last release being almost exactly 10 years ago.

FWIW, I'm actually not a huge Vulkan fan. First and foremost among its issues, it's painful to use. I wish Khronos would define a higher API layer that's still below the level of a game engine, which handles a lot of the chores and dirty details that Vulkan forces you to confront. When I've mentioned this to people, the refrain is just to use OpenGL, but that doesn't sound like a good option, especially with development on it largely having ceased. Plus, I've used modern OpenGL and it's definitely accumulated quite a lot of cruft and complexity.
 
Microsoft's DXR 1.2 can unlock additional performance potential of Intel and Nvidia GPUs.

Microsoft says DirectX Raytracing 1.2 will deliver up to 2.3x performance uplift

If Microsoft says 2.3x, it will be a lot less, or some obscure test.

I assume MS has seen the threat of SteamOS again.

Also, given that this doesn't work on Radeon GPUs and SteamOS seems to favour Radeon, I wouldn't be shocked if AMD gamers start using the OS once it's fully compatible.
"Also, AMD has certain scheduling optimizations that may mimic how SER works, so if game developers take time to optimize for Radeon GPUs, the latter may get some speed improvements."

So AMD cards won't get any benefits from these. Got it.

And good to see MS actually is doing something with DirectX. DX feels dead and has felt dead for a while. Zero new tech from their end. Same thing I can say about Khronos and Vulkan/OGL, though.

Regards.

I mean, this isn't shocking. MS had no corporate rival; now they have Steam Machines popping up left and right.
So they have to innovate to bring something to the table to keep users on Windows.

If Steam released a fully functional, plug-and-play SteamOS with full GPU support tomorrow, people would leave in droves, lol.

1. Free.
2. Practically the same features.
3. No weird add-ons like Copilot that no one asked for.

This is all speculation on my part, anyway.

Valve helped with Vulkan, so I won't be shocked if we see improvements come out on that next.


The same year that the Steam Machines were in development, MS was working on DirectX 12, and got it out the door before the release of the Steam Machines.
 