Yeah, pretty much. The Path Tracing performance claim is quite far out there, by my read of the docs.
I registered for an account just so I could let you know what a sleazy, cheap, clickbait garbage headline this is.
I've been told that Larrabee had been used for prototyping tech that went into Intel's current discrete GPUs, so that does not surprise me. It only reinforces the rumour.
And someone eventually found a later model Xeon Phi that still had display interfaces on it, in a dumpster outside of Intel's labs.
Actually, if it implements the RISC-V Vector extension, as the article says, then it should be suited for pretty much any massive computing task. It can do pretty much anything that AVX-512 or ARM SVE can.
Bottom line, this is not a GPU, this is just a specialized accelerator board for very specific and limited calculations.
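To illustrate why the RISC-V Vector extension is general-purpose in the same way AVX-512 or SVE are, here's a minimal sketch (my own, not from Bolt's docs) of a vector-length-agnostic SAXPY loop. It assumes the RVV 1.0 C intrinsics from `<riscv_vector.h>`; the function name is mine.

```c
#include <riscv_vector.h>
#include <stddef.h>

/* y[i] += a * x[i], written once, runs on any RVV vector width,
   from small embedded cores to the very wide datapaths being
   speculated about in this thread. */
void saxpy_rvv(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n;) {
        size_t vl = __riscv_vsetvl_e32m8(n - i);         /* hardware picks the chunk size */
        vfloat32m8_t vx = __riscv_vle32_v_f32m8(x + i, vl);
        vfloat32m8_t vy = __riscv_vle32_v_f32m8(y + i, vl);
        vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, vl);     /* fused multiply-accumulate */
        __riscv_vse32_v_f32m8(y + i, vy, vl);
        i += vl;
    }
}
```

The point is that the same code scales with whatever vector length the hardware implements, which is exactly what you'd want when strapping a huge SIMD engine onto ordinary RISC-V cores.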
I registered to like your comment and agree with you. My blood pressure went up a little when I got to "it has little to no chance to become one of the best graphics cards".
If it can just straight handle video encode/decode then it's gonna be light years ahead of the competition, maybe. I'm so sick to death of all this "gaming" noise all the damn time, I couldn't care less about another useless gaming benchmark! Show me some real work: how many FPS can this thing do running Topaz Video AI (or the like)? Most have no idea how much brute force power it takes to do this kind of real work. If this card can do upscaling in Video AI at a measly 10 FPS, then it would be about 500 times faster than a 4090, which would make this card incredibly valuable to me!
Yeah, I think the ray tracing must be hardware-accelerated. I had initially assumed it just implemented software ray tracing on thousands of simple, in-order RISC-V cores, but the doc on their website gives me the impression it has relatively fewer cores and relies more on special-purpose accel for that.
According to info from others above, it has physical hardware units also specifically for graphics, accessible through RISC-V instruction extensions.
| SIMD width per RISC-V core (bits) | Number of RISC-V cores (approx.) | Rays per cycle per RISC-V core |
|---|---|---|
| 512 | 128 | 0.24 |
| 1024 | 64 | 0.48 |
| 2048 | 32 | 0.96 |
| 4096 | 16 | 1.93 |
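If anyone wants to sanity-check the right-hand column: it's just a fixed whole-chip rays-per-clock figure spread across the assumed core count (note the total SIMD width stays at 64K bits in every row). Here's a quick sketch; the ray rate and clock are placeholder guesses of mine that roughly reproduce the column, not published specs.

```c
#include <stdio.h>

int main(void) {
    /* Placeholder inputs, not published specs: ~46 Grays/s at ~1.5 GHz
       works out to ~30.7 rays per clock for the whole chip, which is what
       the table implies (0.24 * 128 ≈ 0.48 * 64 ≈ 0.96 * 32 ≈ 30.7). */
    const double assumed_grays_per_sec = 46.0e9;
    const double assumed_clock_hz      = 1.5e9;
    const double rays_per_clock_chip   = assumed_grays_per_sec / assumed_clock_hz;

    const int core_counts[] = { 128, 64, 32, 16 };   /* from the table above */
    for (int i = 0; i < 4; i++) {
        printf("%3d cores -> %.2f rays/cycle/core\n",
               core_counts[i], rays_per_clock_chip / core_counts[i]);
    }
    return 0;
}
```

Plug in whatever claimed ray rate and clock you believe; the per-core number just scales inversely with the core count.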
They actually have that. HDMI and DisplayPort.
The only thing it seems to be missing is output to a display, but that can be added.
Think-Silicon announced it, way back in 2022. However, since they've gotten absorbed into Applied Materials, all of the old links are dead and I have no idea whether that IP went anywhere. It would be a shame, since an embedded SoC having a wide vector array you could retask for running other sorts of compute threads makes a fair bit of sense to me.
RISC-V's Vector extension had been designed by technology experts over many years to be a good foundation for technology such as this. This is not the first time I've heard about it being used for GPUs, and won't be the last.
Their PDF, which I've been going through, claims "2x 8K60 streams" of "AV1, H.264/265" video encoding throughput, on the base model. If you look at my above analysis of how many cores I believe it has, I think maybe they're just using a pure software implementation.
According to recent benchmarks from Phoronix, a 64-core Zen 4 Threadripper can achieve about 15, 64, or 222 fps of AV1 encoding throughput when processing a single stream of 4k video, depending on the quality settings. I'd bet they're claiming towards the faster end of the presets, so we could estimate it by dividing the 222 fps figure by 8 (4x the pixels going from 4k to 8k, times 2 streams). That gives us only about 28 fps. However, keep in mind that multi-stream encoding should scale better than single-stream. Furthermore, encoding cost doesn't grow in proportion to pixel count: the Phoronix data shows 1080p running only 2.62x to 2.93x as fast as the corresponding 4k test, despite having a quarter of the pixels. Extrapolating that to 8k suggests my estimate should be more like 38 fps.
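Here's the back-of-envelope in code form, in case anyone wants to plug in different presets or scaling factors. The ~2.9x cost step per 4x pixel increase is lifted from the 1080p-vs-4k ratio above; treat it as a rough extrapolation, not a measurement.

```c
#include <stdio.h>

int main(void) {
    const double fps_4k_fast = 222.0;  /* Phoronix: Threadripper, fast AV1 preset, one 4k stream */
    const double streams     = 2.0;    /* Bolt's claim: two 8K60 streams */

    /* Naive estimate: 8k is 4x the pixels of 4k, so divide by 4, then by 2 streams. */
    const double naive = fps_4k_fast / 4.0 / streams;

    /* Gentler estimate: Phoronix's 1080p results are only ~2.6-2.9x faster than 4k
       despite having 4x fewer pixels, so assume a ~2.9x cost step per 4x pixel jump. */
    const double scaled = fps_4k_fast / 2.9 / streams;

    printf("naive estimate per 8k stream (x2 streams):  %.0f fps\n", naive);   /* ~28 fps */
    printf("scaled estimate per 8k stream (x2 streams): %.0f fps\n", scaled);  /* ~38 fps */
    return 0;
}
```

Either way, it lands well short of the 60 fps per stream their claim implies, unless their encoder scales much better than a big x86 software encode does.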
It's funny you say that, because their PDF doesn't actually cite gaming performance (although they do talk about interactive rendering and make some extrapolations for 4k @ 120 fps). The scene it claims to use for RT benchmarking is like what movies or other production renders would use - not at all the sort of geometry you'd use for gaming.
Also, that presentation spends a few slides looking at professional computing applications, hence the focus on fp64 performance. BTW, they claim 300x accuracy vs. modern GPU and CPU fp64 arithmetic, although I think that's probably just because they support denormals. Either that, or they use higher-accuracy implementations of transcendental functions.
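For anyone wondering what denormal support has to do with accuracy, here's a tiny illustration of my own (not Bolt's): with IEEE-754 gradual underflow, the difference between two distinct tiny doubles stays representable, whereas hardware that flushes subnormals to zero hands you back exactly 0 and silently loses that information.

```c
#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void) {
    double x = DBL_MIN;            /* smallest normal double, ~2.2e-308 */
    double y = nextafter(x, 1.0);  /* the next representable double above it */

    /* With gradual underflow (denormals), y - x is a tiny subnormal, so the two
       values remain distinguishable. Hardware that flushes subnormals to zero
       would return exactly 0.0 here. */
    double d = y - x;
    printf("y - x = %g (subnormal: %s)\n", d, (d > 0 && d < DBL_MIN) ? "yes" : "no");
    printf("x == y ? %s\n", (x == y) ? "true" : "false");
    return 0;
}
```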
They're sort of limited by what has been optimized for RISC-V. That assumes they can even run general-purpose CPU workloads on it (see my earlier point about the lack of any mention of what OS it runs or whether their interconnect is even cache-coherent).
Also, they don't claim to surpass the RTX 5090 on AI performance. So, if it's AI you want, then this isn't going to be your savior.
TBH, I don't believe their AI numbers reflect anything remotely close to real-world performance. AI is very bandwidth-intensive and that's one of their obvious weak spots.
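To make the bandwidth point concrete, here's a roofline-style sketch. The numbers are placeholders I made up purely for illustration (Bolt hasn't published enough to do this properly); the point is just that once a workload's arithmetic intensity is low, attainable throughput is capped by memory bandwidth, not by however many TOPS the datasheet claims.

```c
#include <stdio.h>

/* Roofline model: attainable = min(peak compute, bandwidth * arithmetic intensity). */
static double attainable_tops(double peak_tops, double bw_gbps, double ops_per_byte) {
    double bw_bound = bw_gbps * ops_per_byte / 1000.0;   /* GB/s * ops/byte -> TOPS */
    return bw_bound < peak_tops ? bw_bound : peak_tops;
}

int main(void) {
    /* Hypothetical figures for illustration only -- not published specs. */
    const double peak_tops = 400.0;   /* made-up headline AI throughput */
    const double bw_gbps   = 300.0;   /* made-up memory bandwidth */

    /* Big-model inference at batch 1 often sits around ~1-2 ops/byte;
       dense matmul-heavy kernels can reach far higher intensities. */
    const double intensities[] = { 1.0, 2.0, 50.0, 500.0 };
    for (int i = 0; i < 4; i++) {
        printf("%6.1f ops/byte -> %7.2f TOPS attainable\n",
               intensities[i], attainable_tops(peak_tops, bw_gbps, intensities[i]));
    }
    return 0;
}
```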
I think you misunderstood my gaming reference. Nobody can even talk about a GPU anymore it seems without using meaningless gaming benchmarks, as if games are the only thing in the world that matters. I'm sick to death of all the meaningless "gaming" noise everywhere all the freakin time!
If you're talking about the article's author or people posting in these forums, I'd just point out that (for better or for worse) this site does have a bias towards gamers. I'm not really sure why that is, but maybe the more general computing enthusiasts fled to sites like ServeTheHome or Phoronix, which cater more to non-gaming interests.
If you saw the questionnaire the site just launched, it sounds like they're currently in the process of re-evaluating their priorities. That would probably be a good avenue to make your opinions known. Just be aware that (according to others - I have yet to open it) it does seem like a feeler for a premium subscription. I think you can still just fill out what you're comfortable with and maybe it will have a positive impact.
Yeah, I've followed the evolution of GPUs since the days when the only hardware acceleration of 3D graphics was in expensive UNIX workstations and definitely not for gaming. Back then, pixel shaders were only used in Hollywood movie production (see Renderman) and were inconceivable in anything realtime. No doubt, gaming has driven the evolution of GPUs, but then we also have things like VR that have even surpassed the CAD and scientific visualization applications that were among their first drivers.
But it's not this article or even this site that I'm complaining about, it's the industry as a whole that I'm referring to in regards to GPU testing and review across the board. It's all games all the time everywhere it seems, I'm just sick of it. Gaming is probably the least important thing a GPU is capable of doing and yet seems to be the singular focus by almost everyone almost everywhere almost every time.
Thanks for explaining your points.
And that's my rant for the day 😁.
Actually no, it doesn't. I do 3D rendering, and the tech Nvidia has for denoising/upscaling hasn't really been a gamechanger or well implemented in renderers; so far it seems to be made only for games. I don't use Nvidia denoising in V-Ray, for example, and V-Ray is extremely popular and well known, so I think it's a good example here. I just went and tested again with V-Ray: in one scene, Nvidia's AI denoising was better in the sense that it smoothed out the image a lot more than V-Ray's own denoiser, which I think worked in that scene specifically since there wasn't much detail. In another scene with much more complex materials/meshes, it literally removed all of the details, whereas V-Ray's denoiser kept the details where I wanted them. V-Ray's denoiser can also be tuned in intensity/radius, which you can't do with Nvidia's. Overall, I prefer having finer control and a denoiser that keeps the details where needed over a "smooth everything" denoiser. In any case, when it comes to offline rendering, the less denoising you need to use, the better. V-Ray also has an option to use Nvidia upscaling, and it's also bad, IMO.
I think his is clearly the best take on this thing. He didn't go into quite the nooks and crannies as I did, but his focus made sense and I agree with everything he said.
What he didn't say is that Nvidia has invested substantial resources into their AI-based denoising and ray-sampling technologies, which I'm sure make quite a bit of difference when using their GPUs for path tracing, and make up for (and then some?) the supposed 10x deficit in raw ray-intersection performance claimed by the Bolt folks.
Thanks for posting that detailed analysis. This is great info, because you also answered a lot of follow-on questions I would've had.
Yes, but imagine if Nvidia gave you the same ability to adjust the strength of their denoiser, enabling you to dial it back like V-Ray's? It seems like Nvidia's denoising tech is capable of producing better quality images for a given number of rays, and it just might be enough to compensate for the difference in performance vs. Zeus. That's my thinking, anyway.
So a card that can very quickly produce images with super high sample counts could actually be a game changer for offline 3D rendering, yes; no denoising is capable of beating that as of yet, clearly not in the way DLSS can upscale to near-native quality.
Again, thank you very much for this.
Also, at one point you asked about CPU vs. GPU in renderers, I think? I remember this video that's quite interesting and shows the same render using the CPU, CPU + GPU, or GPU: https://youtu.be/ayx0vuLbDns?t=232
Important to note: V-Ray has a V-Ray CPU and a V-Ray GPU renderer, and they're literally different renderers. That's why in the video she writes CPU and GPU at the beginning of each line, and also why the benchmark numbers differ when comparing, I think. But you can still run the GPU version on the CPU, like she did where she writes "gpu: cuda_cpu".
A ray tracing system and method of operation comprising one or more memories configured to store data used by the ray tracing system and one or more memory interfaces configured to read and/or write data to the one or more memories. A ray tracing engine, in communication with the memory via the one or more memory interfaces, comprising one or more ray generation modules configured to generate ray data defining rays. Also part of the ray tracing engine are one or more acceleration structure generators configured to process geometry data that is stored in the one or more memories to create an acceleration structure based on the geometry data. One or more intersection testers are configured to compare the ray data to the acceleration structure to determine which rays intersect which elements in the acceleration structure and generate secondary ray data, such that the secondary rays represent reflections.
Okay, I can take a swing at that.
It's wayyy above my understanding, but here's a patent filed by Bolt in 2022 that may provide some more info on their tech.
This just describes a basic Ray Tracing "core". The patent abstract said: A ray tracing system and method of operation comprising one or more memories configured to store data used by the ray tracing system and one or more memory interfaces configured to read and/or write data to the one or more memories. A ray tracing engine, in communication with the memory via the one or more memory interfaces, comprising one or more ray generation modules configured to generate ray data defining rays.
This part says they have hardware-accelerated creation of something like a BVH (Bounding Volume Hierarchy). I think this is the direction AMD and Nvidia have been moving in, but I'm not exactly sure where they both stand on this front. The patent abstract said: Also part of the ray tracing engine are one or more acceleration structure generators configured to process geometry data that is stored in the one or more memories to create an acceleration structure based on the geometry data.
And this last part basically just takes us back to the basics of what pretty much every RT core does. The patent abstract said: One or more intersection testers are configured to compare the ray data to the acceleration structure to determine which rays intersect which elements in the acceleration structure and generate secondary ray data, such that the secondary rays represent reflections.
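To make the abstract a little more concrete, here's a toy sketch of that pipeline in plain C. It's my own illustration of the general technique (ray generation, a trivial one-box stand-in for the acceleration structure, an intersection test, and a secondary reflection ray), not Bolt's design.

```c
#include <stdio.h>
#include <math.h>

typedef struct { double x, y, z; } Vec3;
typedef struct { Vec3 origin, dir; } Ray;               /* "ray data defining rays" */
typedef struct { Vec3 lo, hi; } AABB;                   /* stand-in "acceleration structure" */
typedef struct { Vec3 center; double radius; } Sphere;  /* geometry data */

static Vec3 add(Vec3 a, Vec3 b) { return (Vec3){a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 sub(Vec3 a, Vec3 b) { return (Vec3){a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 scale(Vec3 a, double s) { return (Vec3){a.x * s, a.y * s, a.z * s}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

/* "Intersection tester" part 1: ray vs. bounding box (slab method). */
static int hit_aabb(Ray r, AABB b) {
    double o[3] = { r.origin.x, r.origin.y, r.origin.z };
    double d[3] = { r.dir.x, r.dir.y, r.dir.z };
    double lo[3] = { b.lo.x, b.lo.y, b.lo.z };
    double hi[3] = { b.hi.x, b.hi.y, b.hi.z };
    double tmin = -INFINITY, tmax = INFINITY;
    for (int i = 0; i < 3; i++) {
        double t0 = (lo[i] - o[i]) / d[i];
        double t1 = (hi[i] - o[i]) / d[i];
        if (t0 > t1) { double t = t0; t0 = t1; t1 = t; }
        if (t0 > tmin) tmin = t0;
        if (t1 < tmax) tmax = t1;
    }
    return tmax >= tmin && tmax >= 0.0;
}

/* "Intersection tester" part 2: ray vs. sphere; returns hit distance or -1. */
static double hit_sphere(Ray r, Sphere s) {
    Vec3 oc = sub(r.origin, s.center);
    double a = dot(r.dir, r.dir);
    double b = 2.0 * dot(oc, r.dir);
    double c = dot(oc, oc) - s.radius * s.radius;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return -1.0;
    double t = (-b - sqrt(disc)) / (2.0 * a);
    return t > 0.0 ? t : -1.0;
}

int main(void) {
    /* "Ray generation module": one primary ray shot down the -z axis. */
    Ray primary = { {0, 0, 0}, {0, 0, -1} };

    /* "Acceleration structure generator": here, just a box bounding the scene. */
    Sphere scene = { {0, 0, -5}, 1.0 };
    AABB bounds = { {-1, -1, -6}, {1, 1, -4} };

    if (!hit_aabb(primary, bounds)) { puts("missed the acceleration structure"); return 0; }

    double t = hit_sphere(primary, scene);
    if (t < 0.0) { puts("hit the box but not the geometry"); return 0; }

    /* "Generate secondary ray data": reflect the ray about the surface normal. */
    Vec3 p = add(primary.origin, scale(primary.dir, t));
    Vec3 n = scale(sub(p, scene.center), 1.0 / scene.radius);
    Vec3 refl = sub(primary.dir, scale(n, 2.0 * dot(primary.dir, n)));
    Ray secondary = { p, refl };

    printf("hit at t=%.2f, secondary ray dir = (%.2f, %.2f, %.2f)\n",
           t, secondary.dir.x, secondary.dir.y, secondary.dir.z);
    return 0;
}
```

The whole patent pitch is essentially about doing each of those stages (BVH build, traversal, intersection, secondary-ray generation) in fixed-function hardware instead of loops like these.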