Mike Friesen :
How is T4 optimized for 720p? Most tablets running it are 1080/1200p etc. It has the power of S800, so I guess you think ALL current SoCs are optimized for 720p? Xiaomi MI3 is at the top of the charts and it's a phone. Note3/Xperia Z Ultra are phablets and the MI3 keeps up with them. They are both 2.3/2.2ghz S800's, and Xiaomi only uses a 1.8ghz T4 (not the 1.9 you see benched in tablets). As you can see from the benchmarks it has no trouble keeping up with S800 even clocked down a bit. So I don't understand your comment. Shield chose 720p for a great gaming experience, and I have no problem with that as it is better for that, and 1080p can't reliably be done on consoles as many games show they're upscaled (so 720p for Shield was a great choice and already WAY above Vita/3DS which is its real competition). Even T4i is S600 quality, is S600 a 720p SoC? I wouldn't want to play above that in games on either chip, but for anything else they're ok at 1080p. I'd want S805/K1 though for real 1080p gaming, but pigeonholing T4 is dumb unless you put S800 there too. None of the SoCs are great 1080p gamers yet without dumbing down graphics in games. For REAL 1080p gaming at xbox1/ps4 levels you'll probably need 14/10nm. They will probably catch Xbox1 with 14nm, and surpass both at 10nm with large cache tricks/faster die shrunk memory etc that can be done at 14/10nm vs. 28nm Jaguars. I.e. Intel already has 128MB cache on IRIS at 22nm. So what will other fabs be able to do on Volta based Tegras, and how much faster vs. current chips with large IRIS type caches or better, coupled with far faster memory too? We are talking another die shrink past 20nm which hasn't even hit SoCs yet, then another past 14nm. Even 14nm should bring IRIS sized cache, so I'm thinking 10nm will have double that IRIS cache or more.
You really think they'll hit xb1/ps4 at 10nm? You have to be joking. We're talking about phones here right? I mean, the k1 may or may not come out on phones (maybe just tablets), and admittedly it is pretty awesome - It is similar to my brother's 2 year old laptop's gpu - a 540m - but the K1 is months away, and to match a modern console? My gtx 660m is not even equivalent on graphics power to a new console's gpu (about 2/3 or 3/5 as powerful), and it has a power consumption of 45 watts. Now, maxwell halves that, 14nm halves, and 10nm miraculously halves again. Now we're still at ~6W of power, double what a phone can steadily take, and that completely ignores the cpu's power consumption. Also, consider that while my laptop is relatively close to matching the specs, it will cease to be able to run modern games fluently in 2-4 years, maybe less (the witcher 2 runs 35 fps at 1080p, med/low settings, a 2011 game - yet still the most I've struggled with frames). Also, the power consumption will not fall in half for three years straight. That would be awesome - but is a tad preposterous.
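For anyone who wants to check the halving math above, here is a rough sketch. It assumes, as the comment does, that Maxwell, 14nm and 10nm each cut the 660m's 45 watts in half. The 3W phone budget is only inferred from the "double what a phone can steadily take" remark, and the function name is made up for illustration.

```python
def power_after_halvings(start_watts, halvings):
    """Return the power draw after each successive halving step."""
    steps = []
    watts = start_watts
    for _ in range(halvings):
        watts /= 2
        steps.append(watts)
    return steps


gtx_660m_watts = 45.0       # figure quoted in the comment
phone_budget_watts = 3.0    # assumption: inferred from "double what a phone can take"

steps = power_after_halvings(gtx_660m_watts, 3)
for name, watts in zip(["Maxwell", "14nm", "10nm"], steps):
    print(f"{name}: {watts:.2f} W")

# How far the final figure still sits above the assumed phone budget.
ratio = steps[-1] / phone_budget_watts
print(f"GPU alone vs. phone budget: {ratio:.1f}x")
```

Three perfect halvings land at a bit under 6W for the GPU alone, which is where the "~6W" figure comes from.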
Now I know what you're thinking - the k1 has a full kepler unit (192 cuda cores), only half of the gtx 660m, which you claim is not far off next gen performance, and sips power enough to fit in a large phone. But the thing is - that's not entirely true. The k1 doesn't run at 1ghz, and most importantly, has huge constraints on bandwidth. The consoles have access to gddr5, and lots of it, with (I think) over 100GB/s of memory bandwidth. The k1 has 17GB/s. Memory uses a lot of power.
Anyways, I'd say that the K1 doesn't even reach xbox 360 levels of performance, although that may be close. But phones reaching next gen console perf? And by 10 nm? Not going to happen.
K1 is 28nm. So 20nm M1 (maxwell based 2015), 14nm probably V1 (volta based 2016?), 10nm (whoknows1? LOL). Volta already has on chip memory etc. Intel as I said has 128MB on Iris at 22nm. So we're talking two more die shrinks from 20nm where you would be able to add pretty much what Intel has now at 22nm. I'm guessing at 10nm we'll have 256MB-512MB on chip, and if Intel loses the ARM (pun intended) race, they'll end up fabbing stuff on their processes for OTHER people. If they buy NVDA they will surely start producing Nvidia stuff on their best process. Iris blows away Intel's regular models, so crystalwell works great. The connection to crystalwell is 50GB/s bidirectional (100GB/s aggregate). I'm guessing that number goes up in the next 2 shrinks, right?
Kepler mobile is 192 core. So 20nm will be 384, 10nm will be probably 768. Couple that with far faster memory, also shrunk twice, the on die L4, etc etc. What I said isn't much of a stretch when you consider the 10nm chips will be helped by many things that will be shrunk/added. You'll have more compute stuff being added (cores) and far faster speeds. Look at just the switch from T4 to K1. Both are on 28nm, but K1's A15's run up to 2.3ghz vs 1.9ghz for T4, probably due to some degree to A15 r3p3 and the more MOBILE optimized TSMC 28nm HPM. On the gpu side it's 192 cores vs. 72 cores.
http://www.anandtech.com/show/7622/nvidia-tegra-k1/3
Already at xbox360/ps3 level and we're talking 3 die shrinks. Note he says the same games that run on those consoles can easily be ported to K1. 20nm is about to happen, then 14nm, then finally 10nm. Already at 17GB/s bandwidth. Note he also says NV plans to catch desktop perf in 4 gens! I'm guessing that means he thinks they'll be at desktop 780 or something by maxwell, volta, x, y (he's talking the Y chip in this line). We are talking that X chip here at 10nm, and even half of 780 would be a VERY impressive mobile chip, right? Current consoles are not 1/2 780. We are talking a 28nm Jaguar vs. 10nm 2nd rev of Denver (3rd?)+Whatever comes after Volta for gpu. I think they have a good shot at catching xbox1/ps4 at 10nm, if not outright surpassing them.
http://www.tomshardware.com/reviews/tegra-4-tegra-4i-gpu-architecture,3445-2.html
T4 had 3x the throughput of T3. That was one shrink, and we're talking 3 more. So 9x for 3 shrinks? 9x17=153GB/s, and that doesn't count any other things they figure out to effectively get far more over those shrinks. You could add channels at some point also. Say two 64bit channels vs the 2x32bit T4 has now. Now that 153 turns into ~300, right? I'm pretty sure we'll be using at least 2x64bit by the time we hit 2016/2017. OpenGL will have all the overhead removed by then also, so it will be performing like AMD MANTLE, right? They are showing it now, as articles everywhere in the last few days show (anandtech, tomshardware etc talking about OpenGL basically having Mantle LIKE features today). They will be using pretty close to metal by 10nm (if not by K6, certainly K7, and certainly the K8 or whatever they call the one after Volta, so M1, V1, whoknows1). Everyone is concentrating on GPU on these now.
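The bandwidth guess above is just multiplication, so here it is as a quick sketch. The 9x total gain and the doubled channel width are the comment's own assumptions, not measured numbers, and the names below are made up for illustration.

```python
# Figures taken from the comment; the names are illustrative only.
K1_BANDWIDTH_GBPS = 17      # K1 memory bandwidth quoted in the thread
ASSUMED_TOTAL_GAIN = 9      # the comment's guess for three shrinks
CHANNEL_WIDTH_FACTOR = 2    # 2x64-bit channels instead of 2x32-bit


def projected_bandwidth(base_gbps, gain, width_factor=1):
    """Scale a base bandwidth by a throughput gain and a bus-width factor."""
    return base_gbps * gain * width_factor


narrow_bus = projected_bandwidth(K1_BANDWIDTH_GBPS, ASSUMED_TOTAL_GAIN)
wide_bus = projected_bandwidth(
    K1_BANDWIDTH_GBPS, ASSUMED_TOTAL_GAIN, CHANNEL_WIDTH_FACTOR
)

print(f"2x32-bit: {narrow_bus} GB/s")
print(f"2x64-bit: {wide_bus} GB/s")
```

Under those assumptions the narrow bus gives 153GB/s and the wider one 306GB/s, which is the "~300" figure.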
A57 is a major leap over A15 already, so there's not much to do there but increase mhz as it shrinks. It should easily take over jaguar on cpu, as even AMD has shown A57 almost makes their x86 server chips pointless for some cases already.
http://www.arm.com/products/processors/cortex-a50/cortex-a57-processor.php
A15's don't run over 1.9ghz mostly, so 2.5ghz+ and a perf/watt improvement (probably 3.5ghz+ by 10nm) will be far past jaguar's cpu perf. They will likely be applying almost everything from the 20/14/10nm shrinks to gpus. Intel isn't adding much to cpu either, as we need better gpu right now in mobile, not cpu really. I can always use more power, I'm just saying we only need to match jaguar cpu power, then everything else can be added to gpu. A57 should match or beat it anyway (and that isn't a CUSTOM one like Denver etc), so not much else is needed.
http://semiaccurate.com/forums/showpost.php?p=194897&postcount=224
IRIS 5100 is already 28w. Two more die shrinks easily puts that to 10w and you can probably cherry pick for far lower to get into a phone etc. Future chips will be helped by other advancements in tech that surround that chip also. Far faster memory etc in phones/tablets. You're thinking like everything else around it stays stagnant. Your 660m has today's memory, while 10nm SoCs will have whatever we're using then. Your 660m doesn't have a large L4 type cache like Iris either. Like Intel's 22nm IRIS, 10nm SoCs will have a large cache that drastically improves bandwidth. Volta comes with 3D stacked edram type stuff, getting 1TB/s, and it gets that to ALL memory on the card, not just on the chip, so where do you think a Volta mobile chip is? 100-200 or more? So where is the chip based on whatever Volta is replaced with? Maxwell is the first NV chip to have unified memory also, so we're talking Volta, and its replacement. Software should catch up to hardware by then, so you have to account for that also.
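To put a number on the Iris claim above: going from 28w to 10w in two shrinks needs a certain power reduction per shrink, and this rough sketch works out what that is. It assumes each shrink scales power by the same factor, which is a simplification, and the names are made up for illustration.

```python
IRIS_5100_WATTS = 28.0   # figure quoted in the comment
TARGET_WATTS = 10.0      # figure quoted in the comment
SHRINKS = 2


def per_shrink_factor(start_watts, target_watts, shrinks):
    """Power multiplier each shrink must deliver, assuming equal steps."""
    return (target_watts / start_watts) ** (1 / shrinks)


factor = per_shrink_factor(IRIS_5100_WATTS, TARGET_WATTS, SHRINKS)

# For comparison: what two perfect halvings would give.
halved = IRIS_5100_WATTS / 2 ** SHRINKS

print(f"needed power factor per shrink: {factor:.2f}")
print(f"if each shrink halved power: {halved:.1f} W")
```

So each shrink would need to cut power to roughly 60% of the previous figure to reach 10w, while two full halvings would land at 7w.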
http://images.bit-tech.net/news_images/2013/03/gtc-2013/Tegra%20Roadmap.jpg
If this is anywhere near accurate (and T3/T4/K1 all seem to be pretty close to what they've said), the Volta one should do pretty well with 3D memory, and the one past it surely is pretty dang good, especially if it can use the same tech. You're talking two chips past Parker/T6 in that roadmap, and Parker already has 3D finfet. So rev3 of finfet/3d. There are many things at work here that will vault performance ahead on mobile vs. where they are today. The gpu/soc departments are working hand in hand now at Nvidia, as they've mentioned many times before. Whatever your Maxwell brings for xmas will be on M1 in July next year. Whatever Volta brings a year later (2015 xmas) will be on V1 in July 2016 and so on. So we're talking Volta's V1 replacement in July 2017 (10nm), made at TSMC, Samsung or Intel depending on how everything plays out. While I believe they will catch them, they don't have to. They just have to get close enough to cause people to NOT buy a console, as that person would already have the phone or tablet instead. You're just looking at the shrink of the chip and calling my statement bunk. In reality everything around it gets better too, which helps this all come true. Not to mention game devs already know kepler (and every rev of NV/AMD gpus) inside out, so no learning curve like a new console. As desktop becomes mobile, that homework is already done the year before in discrete implementations.