AMD's RDNA2 chips contain one Ray Accelerator per CU, which is similar to what Nvidia has done with its RT cores. Even though AMD takes broadly the same approach as Nvidia, the comparison between the two isn't clear cut. Ray tracing with a BVH (bounding volume hierarchy) depends on both ray/box intersection calculations and ray/triangle intersection calculations. AMD's RDNA2 architecture can do four ray/box intersections per CU per clock, or one ray/triangle intersection per CU per clock.
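To make those two operations concrete, here's a minimal Python sketch of the tests a BVH traversal leans on. It uses the textbook slab method for ray/box and the Möller-Trumbore method for ray/triangle. We picked these for illustration only; they are not a description of what AMD's or Nvidia's hardware actually implements, and the function names (`ray_box_hit`, `ray_triangle_hit`) are our own.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def ray_box_hit(origin: Vec3, direction: Vec3, box_min: Vec3, box_max: Vec3) -> bool:
    """Slab test: does the ray touch the axis-aligned box at some t >= 0?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def ray_triangle_hit(
    origin: Vec3, direction: Vec3, v0: Vec3, v1: Vec3, v2: Vec3, eps: float = 1e-9
) -> Optional[float]:
    """Möller-Trumbore: return the hit distance t along the ray, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:  # ray is parallel to the triangle's plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None
```

In a BVH traversal, the cheap box test runs many times per ray to discard whole groups of triangles, and the more expensive triangle test only runs on whatever is left in the leaf nodes. That's why the per-clock rates for the two operations are quoted separately.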
There's an important distinction here that we do need to point out. AMD apparently does the ray/box intersections using modified texture units. While a rate of four ray/box intersections per clock might sound good, we don't have exact details of how that compares with Nvidia's RTX hardware. What we do know is that, in general, Nvidia's ray tracing performance is better. Also note that Intel's Arc architecture does up to 12 ray/box intersections per clock.
From our understanding, Nvidia's Ampere architecture can do up to two ray/triangle intersections per RT core per clock, plus some extras, but it's not clear what the ray/box rate is. In testing, Big Navi RT performance generally doesn't come anywhere close to matching Ampere, though it can usually keep up with Turing RT performance. That's likely due to Ampere's RT cores doing more ray/box and ray/triangle intersections per clock.
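A per-clock rate only turns into a throughput figure once you multiply it by the number of units and the clock speed. The sketch below shows that arithmetic using the RDNA2 per-CU rates quoted above. The unit count and clock in the example (80 CUs at 2.0 GHz) are hypothetical round numbers we chose for illustration, not the specs of any particular card, and the helper name is our own.

```python
def peak_intersections_per_second(units: int, per_unit_per_clock: int, clock_hz: int) -> int:
    """Theoretical peak: units x intersections per unit per clock x clock rate."""
    return units * per_unit_per_clock * clock_hz


# Hypothetical configuration, for illustration only.
HYPOTHETICAL_CUS = 80
HYPOTHETICAL_CLOCK_HZ = 2_000_000_000  # 2.0 GHz

# Per-CU rates taken from the text: four ray/box or one ray/triangle per clock.
peak_box = peak_intersections_per_second(HYPOTHETICAL_CUS, 4, HYPOTHETICAL_CLOCK_HZ)
peak_tri = peak_intersections_per_second(HYPOTHETICAL_CUS, 1, HYPOTHETICAL_CLOCK_HZ)

print(f"ray/box peak:      {peak_box / 1e9:.0f} billion per second")
print(f"ray/triangle peak: {peak_tri / 1e9:.0f} billion per second")
```

These are theoretical ceilings. Real ray tracing performance also depends on memory bandwidth, cache behavior, and how much of the traversal work runs on the shader cores, which is part of why paper rates alone don't settle the AMD versus Nvidia comparison.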