News Startup claims its Zeus GPU is 10X faster than Nvidia's RTX 5090: Bolt's first GPU coming in 2026

I registered for an account just so I could let you know what a sleazy, cheap, clickbait garbage headline this is.
Yeah, pretty much. The Path Tracing performance claim is quite far out there, by my read of the docs.

Not only that, but the 10x number is for their top-spec 500W 4-cluster version, while they only showed a board containing the single-cluster version.

It's also pretty funny to see them claim that realtime path tracing requires 280 RTX 5090 GPUs. Yes, 280 GPUs teamed together to produce 4K @ 120 Hz at 100 spp, as if they didn't know that Nvidia is already doing global illumination with only a couple of rays per pixel. Coupled with DLSS 3+, that can let you hit > 60 fps at 4K on a single RTX 5090.

Meanwhile, they're saying you need 28 of their 500W models to hit that same aggregate performance with their approach. If you've only got the budget for a single one of their 500W cards, then you have to sacrifice framerate, image quality (spp), resolution, or some combination. A bullet point in their slides seems to imply that you can get by with the 2c version, using 8 spp, 5 bounces, and denoising.
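A quick sample-budget sketch, for a sense of why that comparison is apples-to-oranges. The brute-force scenario is the one from their slides; the DLSS-style internal resolution, samples per pixel, and framerate are purely illustrative assumptions on my part.

Code:
# Camera-sample budgets: brute-force path tracing vs. a DLSS-style approach.
# Only primary samples (pixels * fps * spp) are counted; bounce and shadow rays
# are ignored, so treat this strictly as an order-of-magnitude comparison.
def samples_per_second(width, height, fps, spp):
    return width * height * fps * spp

# The target Bolt uses to justify "280x RTX 5090": native 4K @ 120 Hz, 100 spp.
brute_force = samples_per_second(3840, 2160, 120, 100)

# Illustrative DLSS-style budget: render ~1080p internally at 60 fps with ~2 rays
# per pixel, then let the AI denoiser and upscaler reconstruct the 4K frame.
dlss_style = samples_per_second(1920, 1080, 60, 2)

print(f"{brute_force / 1e9:.0f} Gsamples/s vs. {dlss_style / 1e6:.0f} Msamples/s "
      f"(~{brute_force / dlss_style:.0f}x fewer samples traced)")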
 
Bottom line, this is not a GPU, this is just a specialized accelerator board for very specific and limited calculations.
Actually, if it implements the RISC-V Vector extension, as the article says, then it should be suited for pretty much any massively parallel computing task. It can do pretty much anything that AVX-512 or ARM SVE can.

According to info from others above, it has physical hardware units also specifically for graphics, accessible through RISC-V instruction extensions.

The only thing it seems to be missing is output to a display, but that can be added.
In a large data center or supercomputer, the GPUs typically don't have them anyway. I'm guessing they're looking for large orders of that kind to gain momentum and capital before they start doing retail graphics cards, which comes with its own set of problems.

RISC-V's Vector extension was designed by experts over many years to be a good foundation for exactly this kind of hardware. This is not the first time I've heard of it being used for GPUs, and it won't be the last.

It is also a strength of RISC-V that the standard allows for proprietary extensions.
Proprietary extensions tested and evaluated in the field can lead to valuable insights that get used to develop the official standard further.
 
If it can just straight handle video encode/decode then it's gonna be light years ahead of the competition, maybe. I'm so sick to death of all this "gaming" noise all the damn time; I couldn't care less about another useless gaming benchmark! Show me some real work: how many FPS can this thing do running Topaz Video AI (or the like)? Most have no idea how much brute-force power it takes to do this kind of real work. If this card can do upscaling in Video AI at a measly 10 FPS, then it would be about 500 times faster than a 4090, which would make this card incredibly valuable to me!
 

... "show me some real work", and then you talk about AI Video, upscaling and brute force ... sorry it made me smile a little bit.

But I do get your point.

Time will tell if they get anywhere near to a final product, and available for purchase.
 
According to info from others above, it has physical hardware units also specifically for graphics, accessible through RISC-V instruction extensions.
Yeah, I think the ray tracing must be hardware-accelerated. I had initially assumed it just implemented software ray tracing on thousands of simple, in-order RISC-V cores, but the doc on their website gives me the impression it has relatively few cores and relies more on special-purpose acceleration for that.

As for how many cores per cluster, there are a couple of stats we might use to work this out. We can probably get in the ballpark if we work back from the figure of 5 fp64 TFLOPS for the single-cluster version (shown in the picture). If we assume a modest clock of 2.5 GHz, that tells us we need to account for 2k floating point ops per cycle. Figure that they're talking about FMA, which gives us 2 ops per lane. So, 1k lanes * 64 bits = 64k bits worth of SIMD pipelines, which you can divide up among cores however you like.

SIMD width per RISC-V core (bits)   Number of RISC-V cores (approx)   Rays per cycle per RISC-V core
512                                 128                               0.24
1024                                64                                0.48
2048                                32                                0.96
4096                                16                                1.93

Note that I'm considering cumulative SIMD per core, which could be divided up amongst multiple pipelines. I don't even consider less than 512-bit, because Xeon Phi implemented two pipelines of AVX-512 per core almost a decade ago. Also, 512-bit is only SIMD-16 (fp32), which only Intel GPUs support; AMD and Nvidia haven't gone below SIMD-32. For a GPU or GPU-like architecture, wider SIMD makes more sense, because you have enough data parallelism and you want to keep down the overheads of things like instruction decoding.

If we consider AMD, RDNA uses Wave-32, but (last I checked) packs two of those engines per CU, giving the equivalent of the same 2048-bit SIMD per CU that they first introduced with GCN and have retained for CDNA. That said, CDNA is 64-bit native, so I guess when you combine that with their Wave-64 ISA, it should mean that CDNA SIMD throughput per cycle is actually 4096 bits.

Finally, I think Nvidia is still using 4 warp pipelines per SM, giving them the highest width at 4096 bits of SIMD throughput per cycle per SM.

There's nothing terribly exotic about a 16-core - or even 64-core - CPU, these days. These numbers are very believable.

Edit: I've gone back to add in the number of rays/cycle/core, based on these core counts and the figure of 77 GRays/s on the base model. Also, if that figure of 77 GRays/s is theoretical and not measured, then it suggests maybe the actual clock speed is about 2.6 GHz.
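Here's the same back-of-envelope arithmetic in code form, for anyone who wants to poke at the assumptions (the 2.5 GHz clock and the FMA accounting are my guesses; the 5 fp64 TFLOPS and 77 GRays/s figures come from their material):

Code:
import math

# Back-of-envelope estimate of core count and per-core ray rate.
# Assumptions: 2.5 GHz clock (a guess), FMA counted as 2 ops, fp64 lanes are 64 bits wide.
CLOCK_HZ     = 2.5e9
FP64_FLOPS   = 5e12     # 5 fp64 TFLOPS, single-cluster version
RAYS_PER_SEC = 77e9     # 77 GRays/s, base model

ops_per_cycle   = FP64_FLOPS / CLOCK_HZ      # ~2000 fp64 ops per cycle
fma_lanes       = ops_per_cycle / 2          # ~1000 lanes, counting FMA as 2 ops
total_simd_bits = fma_lanes * 64             # ~64k bits of SIMD, chip-wide
rays_per_cycle  = RAYS_PER_SEC / CLOCK_HZ    # ~31 rays per cycle, chip-wide

print(f"total SIMD width: ~{total_simd_bits:.0f} bits")
for simd_bits_per_core in (512, 1024, 2048, 4096):
    # Round the core count to the nearest power of two, as in the table above.
    cores = 2 ** round(math.log2(total_simd_bits / simd_bits_per_core))
    print(f"{simd_bits_per_core:>4}-bit cores: ~{cores:>3}, "
          f"{rays_per_cycle / cores:.2f} rays/cycle/core")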

The only thing it seems to be missing is output to a display, but that can be added.
They actually have that. HDMI and DisplayPort.

RISC-V's Vector extension had been designed by technology experts over many years to be a good foundation for technology such as this. This is not the first time I've heard about it being used for GPUs, and won't be the last.
Think-Silicon announced it, way back in 2022. However, since they've gotten absorbed into Applied Materials, all of the old links are dead and I have no idea whether that IP went anywhere. It would be a shame, since an embedded SoC having a wide vector array you could retask for running other sorts of compute threads makes a fair bit of sense to me.
 
If it can just straight handle video encode/decode then it's gonna be light years ahead of the competition, maybe.
Their PDF, which I've been going through, claims "2x 8K60 streams" of "AV1, H.264/265" video encoding throughput, on the base model. If you look at my above analysis of how many cores I believe it has, I think maybe they're just using a pure software implementation.

According to recent benchmarks from Phoronix, a 64-core Zen 4 Threadripper can achieve about 15, 64, or 222 fps of AV1 encoding throughput when processing a single stream of 4K video, depending on the quality settings. I'd bet they're claiming towards the faster end of the presets, so we can estimate it by dividing the 222 fps number by 4 (for the 4x pixel count of 8K vs. 4K) and by 2 (for the two streams). That gives us only about 28 fps. However, keep in mind that multi-stream encoding should scale better than single-stream. Furthermore, resolution scaling appears to be super-linear: the Phoronix data shows 1080p running only 2.62x to 2.93x as fast as the corresponding 4K runs, despite having a quarter of the pixels. If the same ratio held from 4K to 8K, we'd divide by ~2.9 instead of 4, which suggests my estimate should be more along the lines of 38 fps.
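A quick sketch of that estimate, with the scaling factor as the obvious knob to play with (the 222 fps figure is Phoronix's fast-preset 4K result; everything else is my assumption):

Code:
# Rough estimate of Bolt's "2x 8K60" AV1 claim vs. Threadripper software encoding.
fps_4k_single = 222.0     # Phoronix: single 4K stream, fast preset, 64-core Zen 4 Threadripper
streams = 2               # Bolt claims two simultaneous 8K60 streams
pixel_ratio = 4           # 8K has 4x the pixels of 4K

# Naive linear scaling with pixel count and stream count:
naive = fps_4k_single / (streams * pixel_ratio)     # ~28 fps

# Super-linear resolution scaling: 1080p ran only ~2.6-2.9x faster than 4K in the
# Phoronix data despite 4x fewer pixels, so assume ~2.9x applies from 4K to 8K too:
adjusted = fps_4k_single / (streams * 2.9)          # ~38 fps

print(f"naive: {naive:.0f} fps per stream, adjusted: {adjusted:.0f} fps per stream")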

I'm so sick to death of all this "gaming" noise all the damn time, I couldn't care less about another useless gaming benchmark!
It's funny you say that, because their PDF doesn't actually cite gaming performance (although they do talk about interactive rendering and make some extrapolations for 4k @ 120 fps). The scene it claims to use for RT benchmarking is like what movies or other production renders would use - not at all the sort of geometry you'd use for gaming.

Also, that presentation spends a few slides looking at professional computing applications, hence the focus on fp64 performance. BTW, they claim 300x accuracy vs. modern GPU and CPU fp64 arithmetic, although I think that's probably just because they support denormals. Either that, or they use higher-accuracy implementations of transcendental functions.

Show me some real work: how many FPS can this thing do running Topaz Video AI (or the like)? Most have no idea how much brute-force power it takes to do this kind of real work. If this card can do upscaling in Video AI at a measly 10 FPS, then it would be about 500 times faster than a 4090, which would make this card incredibly valuable to me!
They're sort of limited by what has been optimized for RISC-V. That assumes they can even run general-purpose CPU workloads on it (see my earlier point about the lack of any mention of what OS it runs or whether their interconnect is even cache-coherent).

Also, they don't claim to surpass the RTX 5090 on AI performance. So, if it's AI you want, then this isn't going to be your savior.

TBH, I don't believe their AI numbers reflect anything remotely close to real-world performance. AI is very bandwidth-intensive and that's one of their obvious weak spots.
 
Dwelling a bit more on the matter of cores & clock speeds, they do have a slide which mentions "Cache per FP32 core" and "Memory Bandwidth per FP32 core". This doesn't directly tell us how the SIMD is distributed among their CPU cores, but it does confirm the aggregate SIMD width.

The slide (page 32, if you're following along) says 64 kB per FP32 core. On page 36, they state the smallest config has 128 MB of cache, yielding a figure of 2000 FP32 "cores". That supports my estimate of 64k bits of total SIMD width.

Likewise, page 32 cites 177 MB/s of memory bandwidth per "FP32 core". That figure of 177.08 MB/s * 2k "FP32 cores" = 354.16 GB/s, which is a little shy of their page 36 claim of 363 GB/s.
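The arithmetic, for reference (the only assumption here is decimal units for the cache figures; if they're binary, the core count comes out slightly higher):

Code:
# Cross-checking the page 32 per-"FP32 core" figures against the page 36 totals.
cache_per_core_kB = 64        # cache per FP32 core (page 32)
bw_per_core_MBps  = 177.08    # memory bandwidth per FP32 core (page 32)
total_cache_MB    = 128       # smallest config (page 36)

cores     = total_cache_MB * 1000 / cache_per_core_kB   # 2000 "FP32 cores" (decimal units)
simd_bits = cores * 32                                  # 64,000 bits of total SIMD width
total_bw  = cores * bw_per_core_MBps / 1000             # ~354 GB/s vs. their 363 GB/s claim

# Note: with binary units (128 MiB / 64 KiB) you'd get 2048 cores instead, and
# 2048 * 177.08 MB/s is roughly 363 GB/s, which lands right on their quoted total.
print(f"{cores:.0f} cores, {simd_bits:.0f} SIMD bits, {total_bw:.2f} GB/s")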

On the matter of memory bandwidth, we should also point out that the majority of it is coming from the meager 32 GB of LPDDR5X. So, if that's your "high-bandwidth" memory, then it's really not much better off than an RTX 5090, capacity-wise.

So, after going through their PDF and extracting everything I can, I tried to do a little more searching, to see if we can learn anything about their RISC-V cores, which I have a hunch they probably licensed from someone like SiFive. Although I found no announcements of such a deal, I did run across Jon Peddie's coverage of this product announcement:


I think his is clearly the best take on this thing. He didn't dig into quite as many nooks and crannies as I did, but his focus made sense and I agree with everything he said.

What he didn't say is that Nvidia has invested substantial resources into their AI-based denoising and ray-sampling technologies, which I'm sure make quite a bit of difference when using their GPUs for path tracing, and make up for (and then some?) the supposed 10x deficit in raw ray-intersection performance claimed by the Bolt folks.
 
Also, they don't claim to surpass the RTX 5090 on AI performance. So, if it's AI you want, then this isn't going to be your savior.

TBH, I don't believe their AI numbers reflect anything remotely close to real-world performance. AI is very bandwidth-intensive and that's one of their obvious weak spots.

This is the kind of "real work" I'm referencing below:

https://youtu.be/naV-J1kfZmQ


When top end hardware can only muster 2.2 FPS, then clearly we need to change the way we do things. I think you misunderstood my gaming reference. Nobody can even talk about a GPU anymore it seems without using meaningless gaming benchmarks, as if games are the only thing in the world that matters. I'm sick to death of all the meaningless "gaming" noise everywhere all the freakin time!
 
I think you misunderstood my gaming reference. Nobody can even talk about a GPU anymore it seems without using meaningless gaming benchmarks, as if games are the only thing in the world that matters. I'm sick to death of all the meaningless "gaming" noise everywhere all the freakin time!
If you're talking about the article's author or people posting in these forums, I'd just point out that (for better or for worse) this site does have a bias towards gamers. I'm not really sure why that is, but maybe the more general computing enthusiasts fled to sites like ServeTheHome or Phoronix, which cater more to non-gaming interests.

If you saw the questionnaire the site just launched, it sounds like they're currently in the process of re-evaluating their priorities. That would probably be a good avenue to make your opinions known. Just be aware that (according to others - I have yet to open it) it does seem like a feeler for a premium subscription. I think you can still just fill out what you're comfortable with and maybe it will have a positive impact.
 

Thanks for the site suggestions.

But it's not this article or even this site that I'm complaining about, it's the industry as a whole that I'm referring to in regards to GPU testing and review across the board. It's all games all the time everywhere it seems, I'm just sick of it. Gaming is probably the least important thing a GPU is capable of doing and yet seems to be the singular focus by almost everyone almost everywhere almost every time.

And that's my rant for the day 😁.

Oh, and I did fill out their survey and yes, it does seem to be about my willingness to pay for a subscription. The answer provided was a rock solid hell no!
 
But it's not this article or even this site that I'm complaining about, it's the industry as a whole that I'm referring to in regards to GPU testing and review across the board. It's all games all the time everywhere it seems, I'm just sick of it. Gaming is probably the least important thing a GPU is capable of doing and yet seems to be the singular focus by almost everyone almost everywhere almost every time.
Yeah, I've followed the evolution of GPUs since the days when the only hardware acceleration of 3D graphics was in expensive UNIX workstations and definitely not for gaming. Back then, pixel shaders were only used in Hollywood movie production (see RenderMan) and were inconceivable in anything realtime. No doubt, gaming has driven the evolution of GPUs, but then we also have things like VR that have even surpassed the CAD and scientific visualization applications that were among their first drivers.

The thing about AI is that GPUs aren't really the best architecture for it. GPUs are basically the second most general type of processor, following CPUs. They don't care how coherent your data access is or how much communication you do between threads. If you can divide your workload into tons of threads and if it's SIMD-friendly, then it will work on a GPU. That's why it was a natural choice for neural networks.

NPUs are more specialized towards exactly the data access patterns and types of arithmetic operations that are needed by deep learning. That's why, per watt or per mm^2 of silicon, they're much more efficient than GPUs. I'd even say NPUs are more closely related to DSPs than to GPUs.

It will be interesting to see if AMD's upcoming UDNA architecture truly manages to bridge the gap. It is definitely weird that phones and now laptops have separate GPU and NPU blocks, in spite of the functional overlap. Seems like a waste of silicon.

And that's my rant for the day 😁.
Thanks for explaining your points.