Citations, please?
That's not a citation. That's an advertising/promotional gimmick, with specially selected (re)actors. digitalgriffin already explained this.
So, at least you're proving that you have no sources. I meant actual tests, reviews, etc., like the sort of thing you're demanding from Tom's Hardware, NOT videos of people reacting to it.
Hell, you're complaining that it's not benchmarked, and you're using a gimmick video to PROVE that 8K@60 is already here? Which is it? Are there benchmarks to prove it's already here, or not?
This is a simple problem to solve. Go buy an 8K display and a 3080 and test it yourself.

Nvidia's CEO said it reached 60 fps at 8K. Go sue him!
My source is Nvidia's CEO, and all I am asking is for Tom's Hardware to BENCHMARK it and see how TRUE it is. How do we know whether he is truthful or not? We BENCHMARK!
What is your problem exactly? Can't you see that I want it to be BENCHMARKED to SEE whether Nvidia's CEO is telling the truth or not?
So, your "proof" of performance is the word of the guy who's motivated to show his own products in a better light?
Well, I say 8k@60 isn't even close to being a thing. You can tell it's true because you have my word for it.
That's not a source. That's just a guy talking. There's no full test of the kind you're demanding from this site.
My problem is that you stated that 8K60 is already here - insisting that people should cater to YOUR beliefs and cave to YOUR demands for testing that serves no practical purpose.
I mean, go buy it if you want; you're free to do so. Just don't try to sell everyone else on the idea that this video, which amounts to an advertising blurb, constitutes proof of your claim that 8K gaming is here in any practical way.
This is a simple problem to solve. Go buy an 8K display and a 3080 and test it yourself.
Yeah, and the funny thing is the only proper 8K displays right now are TVs. The 8K Dell requires two cables, which means the GPU sees it as two half-8K displays, and that potentially causes some odd behavior. Anyway, I'm going to test 8K via DSR (Dynamic Super Resolution). It's not totally the same -- DSR is usually 3-5% slower, I think, due to a bit of extra work the GPU has to do to render and then scale the result -- but it will be 'close enough.' Of course, without a 65-inch 8K display, I'm not getting the proper 8K experience, so I can't do jaw drops and "holy ****!" faces for a promotional video.
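To make the DSR caveat concrete, here's a tiny sketch of the correction implied above. The function name and the 3-5% overhead figure are taken from the post as working assumptions, not measured constants:

```python
# Rough sketch: estimating what native 8K would run at, given an
# 8K-via-DSR measurement. DSR renders at 8K and then downscales,
# which the post above pegs at roughly 3-5% extra cost (assumption).

def estimate_native_fps(dsr_fps, overhead=0.04):
    """Native 8K should run slightly faster than the DSR result,
    since it skips the downscaling step."""
    return dsr_fps * (1 + overhead)

print(estimate_native_fps(55.0))  # ~57.2 fps at the assumed 4% overhead
```

In other words, DSR numbers should slightly understate real 8K performance, so they're a conservative 'close enough' proxy.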
The outside edge has mostly memory controllers on Ampere and Turing. It's interesting to see how they've been shifted around for GA102 vs. TU102, plus the whole middle section of TU102 has always been a bit of a mystery to me. Is it cache on the left and right? And the very center block -- WTF is that?

In the TU102 the SMs take up roughly the same amount of space: ~55%.
It's interesting how we can visibly see the difference between the two processes. The Samsung-produced die is more squarish, whereas the TSMC die has a definite preferred direction.
Anyway, there don't seem to be any extra functional units in Turing that would account for the extra transistors. I think it's down to the process.
Not sure what the things around the perimeter are. They're proportionally bigger in Ampere.
That TV has an advertised 6 ms response time. That, combined with it needing two cables and the other latency mumbo-jumbo I'm not familiar with, tells me that no, 8K 60 is NOT here, except in a controlled environment...
That is what you are doing. When you say that the CEO said it, then provide a link to where you read or saw it so we can see what he actually said. You left out the whole context and cut out "60 fps is here" to play around with it?
The outside edge has mostly memory controllers on Ampere and Turing.
It's interesting to see how they've been shifted around for GA102 vs. TU102, plus the whole middle section of TU102 has always been a bit of a mystery to me. Is it cache on the left and right?
And the very center block -- WTF is that?
GA102 is far more sensible to decipher, for me anyway.
TechPowerUp did it for the 3080, if you're interested: https://www.techpowerup.com/review/nvidia-geforce-rtx-3080-founders-edition/35.html

I believe the power demanded by the card is too much, and it's not emphasised enough. Performance per watt is always bad on high-end cards, but there is still the 3090, which will possibly be worse. Will we get a performance-per-watt chart after you review partner cards and such?
Would it? I mean, the cache on Intel's CPUs is extremely obvious. You get huge blocks of 'easy' transistor logic compared to the mess that is GPU cores. GPU ALUs end up being very small, but there are lots of them, so you can easily spot all of those -- but the shader cores should be very close to some L1/L2 cache. You don't want to fetch data from the other end of the chip, I don't think. Still, I'm not at all sure what the various middle blocks are on TU102, and the same goes for the lower/middle part of GA102.

Ah, that makes sense. Processing the PAM4 signaling would require more circuitry.
The cache would be too small relative to the overall size for us to pick out. I have no idea what those things could be.
I don't think so. There are 48 square 'blocks' in the bottom section of GA102. That doesn't match GPCs, RT cores, ROPs, or anything, really. It could just be miscellaneous logic for routing data. Furthermore, the SMs are supposed to contain 64 FP32 CUDA cores, 64 FP32/INT CUDA cores, 4 Tensor cores, 1 RT core, and some L1$/shared memory. The TPCs are supposed to have L2 cache, ROPs, Texture Units, and the Polymorph engine. So, with that in mind, here's my 'guess' on the GA102 die shot from Nvidia. (The top-right and bottom areas are the major unknowns -- especially that chunk just above the very bottom, where I assume the video interfaces and such are located.)

Those are probably clusters of RT cores. In the GA102 die there are seven of them, which sort of matches the reported 84 RT cores. So the 3080 can withstand one faulty RT cluster and still have half a cluster disabled for market segmentation purposes. Yield should be massively better than Turing's.
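The '48 blocks don't match anything' observation above can be sanity-checked against the publicly reported GA102 unit counts (the counts below are from Nvidia's published Ampere specs, used here as assumptions; the comparison logic is just mine):

```python
# Quick check: does the count of any major GA102 unit type equal 48,
# or divide evenly into/out of 48? If not, that supports the
# 'miscellaneous routing logic' guess for those bottom blocks.

blocks = 48
ga102_units = {
    "GPCs": 7,
    "SMs": 84,
    "RT cores": 84,
    "ROPs": 112,
}

for name, count in ga102_units.items():
    related = (count == blocks) or (count % blocks == 0) or (blocks % count == 0)
    print(f"{name:8s} count={count:3d} -> relates to {blocks} blocks: {related}")
```

None of these line up with 48, which is consistent with the post's conclusion that the blocks aren't per-SM or per-GPC structures.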
Samsung's process provides a lot of flexibility. You can lay things out on the silicon the way you want. It's error-prone, as features are lithographed and etched in four separate steps; any misalignment between steps would destroy the whole wafer. The process adopted by TSMC and Intel doesn't have this problem, but then you have to lay out your chip in a very specific way.
Processes have trade-offs. If you're making a CPU, you get a lot of benefits from denser caches and higher frequency. You don't get those when you're making a GPU. A GPU is a bunch of weak processors that win through sheer number. It makes absolutely perfect sense, if you think about it, to use the same process to make high-end GPUs as for making chips in low-end, cheap-o phones sold in India.
Furthermore, the SMs are supposed to contain 64 FP32 CUDA, 64 FP32/INT CUDA, 4 Tensor, 1 RT, and some L1$/shared memory.
So here's the big question: are the RT and Tensor cores part of the SMs, as Nvidia says in its architecture whitepaper, or are they located somewhere else?

It wouldn't make sense, though, to spread the RT cores among the SMs, since they're coupled more to each other than to the shaders. Whereas the SMs each handle a small part of the scene (a triangle or a pixel), all RT cores use the BVH of the full scene.
My guess is that the RT cores are housed in the clusters under the SMs along with the L2 cache (that bright rectangle). To the RT cores the L2 would actually be L1.
The stuff below reminds me of the "a bunch of stuff" block on the Tiger Lake die.
The 3090 will undoubtedly draw more power than the 3080, but it could be similar, or perhaps even slightly more efficient, on a performance-per-watt basis. It has 20% more active cores but only around a 10% higher TDP, so I would expect it to run into power limits and typically operate at lower boost clocks as a result. That likely means it won't be getting 20% more performance, though, and some review leaks going around are suggesting closer to 10% more performance than a 3080, more or less in line with the difference in TDP. Some of that extra power will be going to the extra memory chips too, though, which will likely be sitting around mostly unused in games for quite some time to come.
I have to assume they're in the SMs, based on the counts and everything else we know. If the RT cores were separated out from the main SMs, turning on ray tracing functions would make it more costly (latency) to pass data from the RT cores to the shader cores and vice versa, requiring a lot of extra routing of data -- and they'd be away from the L1 cache and even L2 cache. Also, we have the past three generations of Nvidia 102 chips:
FYI, I have asked my Nvidia contacts about this, and they confirm that the RT cores truly are part of the SMs. That 'unknown' area is most likely related to cache, the internal bus, and other elements. Also, the Nvidia-provided die shots should not be taken as 100% accurate representations of how those dies look and how they're laid out. "More like 70%," apparently, with some things changed to protect trade secrets, and some things just enhanced to make them look more legible. It's more like a high-level view of the chip layout and not an actual scan of the chip.

There shouldn't be all that much traffic between the SM and the RT cores. Nvidia claims 10 gigarays per second on the 2080 Ti. That's 2.3 million rays per core. Dividing the game clock by this number yields ~670 cycles. A ray intersection search is an expensive operation even with dedicated hardware. I imagine an RT core would spend most of its time waiting for different parts of the BVH tree to transfer from VRAM. Effective caching is key to performance. Spreading out the RT cores is detrimental to that without bringing any clear benefit.
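The ray-budget arithmetic above can be retraced. Note that the 2.3-million figure corresponds to dividing 10 gigarays across the 2080 Ti's 4352 shader cores; dividing across its 68 RT cores instead gives a much smaller per-ray cycle budget. Core counts and the ~1545 MHz clock are public 2080 Ti specs; 10 gigarays/s is Nvidia's marketing number:

```python
# Retracing the cycles-per-ray estimate from the post above.

gigarays = 10e9        # rays per second (Nvidia's claimed figure)
clock_hz = 1545e6      # approximate 2080 Ti boost clock
shader_cores = 4352
rt_cores = 68

# Per shader core, as in the post:
rays_per_shader = gigarays / shader_cores          # ~2.3 million rays/s
print(f"rays per shader core: {rays_per_shader:.3e}")
print(f"cycles per ray (per shader core): {clock_hz / rays_per_shader:.0f}")  # ~672

# Per RT core, for comparison -- a far tighter budget:
rays_per_rt = gigarays / rt_cores
print(f"cycles per ray (per RT core): {clock_hz / rays_per_rt:.1f}")  # ~10.5
```

Either way, the per-ray budget is small enough that BVH fetch latency, not SM-to-RT-core traffic, is plausibly the bottleneck, which is the post's point.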
While I could see why they might potentially do that, it seems like any company with the ability to manufacture their own graphics chips would also have the ability to X-ray and otherwise tear down their competitors' hardware without the need for analyzing marketing slides. I suppose it could slow them down if they didn't have access to the hardware yet, though. But if I had to guess, all these companies likely have informants working for their competitors to keep them up to date on what the competition has planned.
My guess is the unknown stuff is the equivalent of the front-end of a CPU, i.e., the instruction scheduling and whatnot. If we tried to map the die to the simpler-looking block diagrams, the GigaThread engine is the one thing that sticks out to me.

So my take is that the 'unknown' stuff is for caching, memory routing, or whatever else ... but that the RT cores aren't in that areaea and are within the SMs, as Nvidia has said in their block diagrams and such.