AMD CPU speculation... and expert conjecture

Page 256
Status
Not open for further replies.

8350rocks

Distinguished


Running Virtual Machines is predominantly what benefits from it...

Xeons use it pretty heavily on the server side. It's really mostly hype though...
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


AFAIR, at the BD presentation AMD claimed the ability to do 4 discrete 64bit FP operations on the 2 128bit FMACs. Now if each of those 64bit ops can be a MADD, it will be 8... but then why didn't they claim 8 instead of 4?

Nevertheless, the 8 FLOPS could come from SR having 2 FlexFPUs... it will have "kind of" doubled decoders and for sure it will have a dedicated "dispatcher" per integer cluster/core, so 2 FlexFPUs is not far fetched given the great modularity of the design, even with those 2 FlexFPUs sharing the same FP dispatcher front-end.

The 8 FLOPS could also come from each FMAC pipe now being 256bit large and capable of 256bit ops per cycle without halves... which is one of the reveals going around (FP256)... in that case, following the logic, each FMAC pipe will be capable of 4 discrete 64bit ops per cycle, and if MADD, 8 ops per pipe, 16 per FlexFPU... and with 2 of them it will be 32 ops...

Now that will be something lol... everybody talks about Jim Keller, but I would rather see Gustafson's mark on it lol...

32x4GHz = 128 GFLOPS per module (with 2 FPUs), 256 GFLOPS for the APU, 512 GFLOPS for the CPU... umm, why do I think it's too much lol

Juanrga, 8x4GHz = 32 GFLOPS, not 128... and following the x86 scheme of the architectural RF, 32 ops will be the same number of 64bit registers, as the larger registers are zeroed. And even if the 8 ops are 64bit and single precision (32bit) is therefore double, it will be 16 ops per cycle or 64 GFLOPS, not 128.

Yes, by your logic 128 GFLOPS is for 4-module chips and for 64bit/32bit FP ops, but APUs won't have 4 modules. So I think it is pertinent to state which one you are referring to, APU or CPU.



No intel is not
i7-3770k: 224 GFLOP
i7-4770k: 448 GFLOP

For FP ops those chips have 3 128bit ports, so the max possible will be 6 MADD or 12 64/32bit FP ops,

12 x 3.6GHz = 43.2 GFLOPS per core, x 4 cores = 172.8 GFLOPS (max possible) for either IB or Hasfail. In practice it is less, because that FP throughput has to be sustained from the L/S buffers: Intel designs only have 1 L/S engine per core (2 threads), while AMD has 2 L/S engines per module (2 threads), so AMD could "potentially" sustain double what Intel does for the same number of threads per chip. (edit)

And no... I doubt in any case that Steamroller uses the same FPU as Piledriver. Officially presenting an FPU 30% smaller due to HDLs, with only 1 MMX pipe, and presenting FP256 wouldn't make sense then.

It could be quite different... and better... more so with 2 FlexFPUs per module, which is quite possible.
 

hcl123



Another urban myth out of propaganda lol

Nor would a K10 core be more efficient or faster. For a start, you forgot that each FlexFPU in a module is comparatively almost the size of a K10 core... yes, that big... so for a "size" extrapolation you'd have to take that into account, meaning a K10 design would be like 8 cores compared with Thuban, that is, a Thuban with 8 cores, of which 4 would be integer and 4 FP.

Besides, an effort was made to contain the lengthening of the pipeline (15 stages) while still having a tremendously low FO4 and high speed latches... the problem of BD not having much more speed is due to the fab process, not the design... AMD shows this with "Centurion", only at the expense of power; that SOI process could have been much better. (edit) So any design would have suffered the same.

Also don't forget that BD has double the cache size of a Thuban design (very good for large data set jobs); this didn't work out very well on the first BD, but has been very easy to fix. So a Thuban at 45nm with the same capabilities as BD would have the equivalent of 8 cores, double the cache and 1 more HyperTransport link, which is bulky... and would probably be well above 500mm² at 45nm... more if the L3 topology is distributed, like in BD, which is more efficient for code locality... and the first 32nm SOI process was a hybrid, only around a 40% shrink, so a Thuban with the same capacity as BD would have resulted in a notably larger chip than BD. It's not only design philosophy that led to sharing resources between "cores"... and here we are talking about a K10 with only 4 integer cores and 4 FP cores; it would have the same FP power (or a little better) but half the integer core count of Orochi.

Besides, having the FP traffic out of the integer path will always be more efficient, more flexible, and faster (clock prone) than having it all together. It could have been arranged for a K10-like design, but then each FPU "core" would need its own dedicated cache, enlarging the chip even further... or if sharing cache!.. then it all reverts back to the notion of "modules" LOL.

Why do I have this feeling that there was a lot of trial and error before AMD hit the right module and chip topologies for BD !!?? LOL ... they might have wanted to do it along the lines of all those anti-BD rants, but they couldn't LOL... that is why it took so long. "Vertical multithreading" especially is really superior, in a word, superior: ultra fast context switching capable of dealing with pipeline stages, and on open thread contexts, which is a world first AFAIK... it's superior, way above anything SMT (aka Hyperthreading) LOL...

 

hcl123



Thinking out loud, if i were AMD all my production would revert to FD-SOI, including GPUS.

TSMC doesn't have FD-SOI?... too bad... only chipsets and entry level GPUs would be done there, one node behind the top, more expensive node. 28nm TSMC could be really cheap in a year, 20nm in 2 years also... all the rest FD-SOI (it will be better, and better than better, cheaper too lol). If IBM opens up, the top high performance SKUs go there; if only GF, then no choice, but it's manageable.

 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


I am not sure what you are trying to say here.

First, do you have a link to that ES @ 2.6GHz, or is it just another piece of unverifiable data?

Second, ESs don't necessarily reproduce final silicon. Look at Haswell: the ESs got higher frequencies (and lower temperatures) than the final chips.

Finally, 2.6GHz is very far from 4.0GHz. Even if AMD magically got Jaguar to work @ 4.0GHz on air, it wouldn't provide twice the performance of Jaguar @ 2.0GHz, because performance doesn't scale linearly with frequency. However, as confirmed by Feldman (AMD), Steamroller offers twice the performance of Jaguar.



Therefore, my suspicion that you don't have any link or source contradicting what AMD said about Kaveri being bulk is confirmed. What you have is your interpretation of an interview with an executive of STMicroelectronics talking about Glofo!!!!!

(1) It is possible SOI was cheaper, but cost was not the main reason why AMD chose bulk.

(2) Again possible, but irrelevant to the main reason why AMD chose bulk.

(3) What delay announcement? AMD has debunked the VR-ZONE rumour of a delay and emphasized, in an official communication, that everything with Steamroller/Kaveri follows the plans that AMD communicated to partners/OEMs early this year (Computex). Shipping of Kaveri to OEMs starts this last quarter as planned in the official roadmap.

FPUs:

(1) I already said that even if the FPU is the same, improvements in the front-end will improve the performance of the module. The same goes for the integer units. They are the same as in Piledriver, but will perform better because the front-end is much better.

(2) AMD's gaming strategy starts on consoles. AMD is moving compute from the Jaguar cores to the GPU. E.g. physics computations in several of the demos and engines available today are being done on the GPU. It makes sense for AMD to maintain this strategy on the PC and develop an HSA APU where the GPU is used for compute. It makes less sense for AMD to focus on improving the FPU units in desktop/mobile CPUs when games are going to rely more and more on the GPU for compute.

The same goes for servers. AMD is releasing the Berlin APU as a substitute for the Opteron CPU, and using the GPU to count the performance improvements:

almost 8X the gigaflops per-watt compared to current AMD Opteron

http://www.amd.com/us/press-releases/Pages/amd-unveils-2013june18.aspx

(3) According to AMD, Piledriver's focus was on greater IPC (per core) and frequency. Steamroller's focus is on "greater parallelism"

AMD-Steamroller.jpg


Greater parallelism is achieved by improvements in the front-end.
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860


Going through the posts since my vacation. ... ROFL IGP

Idiot.

For the CPU power consumption test, we use the standard testing system (with an NVIDIA GTX 670 installed) and wait until the system and discrete GPU are at idle speeds in order to log the idle power consumption. After this, WPrime 1024M is looped for 15 minutes while the power consumption is logged with a calibrated power meter to determine the peak watts.
 

juanrga




On the presentation of BD, AMD claimed "4 DP/BD" for SSE2 and "8 DP/BD" for FMA4.

The 8 FLOPs claimed by AMD correspond to a single (shared) FlexFPU per module. The diagram of the Steamroller module has been posted here numerous times. I posted it again a couple of posts above: there are no "2 FlexFPUs".

The 8 FLOPs claimed by AMD correspond to 128 bit FMAC units. The diagram of the Steamroller module has been posted here numerous times. I posted it again a couple of posts above: there is no "each FMAC pipe is now 256bit large"; rather, the two 128-bit FMAC units can be fused into one 256 bit FMAC superunit, if needed.

The 128 GFLOP are not per module but per CPU (2 modules).

The 128 GFLOP are not per 4 module but per CPU (2 modules).

Adding the 128 GFLOP of the 2 Steamroller modules to the 922 GFLOP of the GPU we obtain the 1050 GFLOP claimed by AMD for the APU.



Wrong again. The above GFLOP are the numbers claimed by Intel for the CPU. The 448 GFLOP of the Haswell i7 CPU are reported on several sites.

Intel likes to use DP in its technical datasheets. If you want to obtain DP values you only need to divide the above SP numbers by 2. E.g. the 224 GFLOP (SP) correspond to 112 GFLOP (DP).
 

noob2222



From what I read, AMD designs around the node. If it's designed for SOI, it won't work well on bulk and vice-versa. That's why AMD is stuck waiting on GF for their SOI CPUs. Adding more orders to GF will make even fewer overall CPUs/APUs since it seems GF isn't doing very many SOI fabs.


As far as the Stars cores or Phenom II go, I don't think it's a design compatible with HSA, which is the main reason AMD separated the ALU/FPU. The problem with HSA is its time to deployment. It's going to take years to get it into the "mainstream", but as it gets more widely adopted, even the older "Trinity" and "Kaveri" APUs will speed up with age instead of becoming useless.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


The core design/architecture doesn't fully dictate its speed. The transistor layout/optimization and process technology have a bigger hand in that. The same design can be tuned for low power or high performance with different standard cell libraries. Intel does it all the time between their ultra-low voltage, mobile and desktop versions of chips.

The A6-5200 is their top Jaguar chip which is standard at 2.0Ghz. I doubt they would ship product at maximum rated clock if that was the true limit. It may be near the limit on the TSMC process today but Sony registered the PS4 with 2.75Ghz max clock speed.

The 2Ghz limit is likely to just meet their desired target of 25W. I expect they will release new SKUs to keep up with Intel Silvermont when they launch. Jaguar supports turbo modes so they could add a 2.0/2.4 version with say 30W TDP.

If you look in the Kabini guide the registers actually support up to 6.3Ghz.
http://support.amd.com/us/Processor_TechDocs/48751_BKDG_Fam_16h_Mod_00h-0Fh.pdf

MSRC001_0071 COFVID Status
[54:49] MaxCpuCof: maximum core COF. Specifies the maximum CPU
COF supported by the processor. The maximum frequency is 100 MHz * MaxCpuCof.

With a 6 bit field that's 0x3F or 63x100Mhz = 6.3Ghz

Maybe they just threw an extra bit in there for future proofing but even with a 5 bit field they would have a 3.1Ghz cap. They must have envisioned it going past 3.1Ghz in the not so distant future.
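To make the register arithmetic above concrete, here is a minimal Python sketch that pulls bits 54:49 out of a 64-bit value and applies the "100 MHz * MaxCpuCof" rule from the quoted BKDG text. The helper names and the sample MSR value are made up for illustration; nothing here reads a real MSR.

```python
# Sketch only: decode the MaxCpuCof field (bits 54:49) of MSRC001_0071
# as described in the BKDG excerpt quoted above.
# The raw value used in the demo is hypothetical, not read from hardware.

MAX_CPU_COF_MSB = 54
MAX_CPU_COF_LSB = 49
COF_STEP_MHZ = 100  # "The maximum frequency is 100 MHz * MaxCpuCof"


def extract_bits(value: int, msb: int, lsb: int) -> int:
    """Return the bit field value[msb:lsb] (inclusive) as an integer."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def max_cpu_freq_mhz(msr_value: int) -> int:
    """Maximum core frequency in MHz encoded in a COFVID Status value."""
    return extract_bits(msr_value, MAX_CPU_COF_MSB, MAX_CPU_COF_LSB) * COF_STEP_MHZ


def field_ceiling_mhz(width_bits: int) -> int:
    """Highest frequency a MaxCpuCof field of the given width can encode."""
    return ((1 << width_bits) - 1) * COF_STEP_MHZ


if __name__ == "__main__":
    hypothetical_msr = 20 << MAX_CPU_COF_LSB  # MaxCpuCof = 20, i.e. 2.0GHz
    print(max_cpu_freq_mhz(hypothetical_msr))
    print(field_ceiling_mhz(6))  # 6-bit field, as in the BKDG
    print(field_ceiling_mhz(5))  # hypothetical 5-bit field
```

The two ceiling calls correspond to the 6.3GHz and 3.1GHz caps mentioned above.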
 

Cazalan



Intel posts their GFLOPs here for export regulations:

http://www.intel.com/support/processors/sb/CS-017346.htm

i7-3770K
Base (3.5) 112
Turbo (3.9) 125
GPU Only 147.2

 

Cazalan



It's true that AMD hasn't updated that statement, however it was made back in early 2012. At the time the roadmap also showed Kaveri being available the entirety of 2013, and there was no mention of Richland at all. The new roadmap in 2013 put Kaveri in the back 3rd of the year, and a Richland appeared to fill the gap.

AMD_Roadmap_Kaveri.jpg


In their recent conference calls they have outright refused to answer which chips are being made at GF. There is also speculation that Kaveri is now on its 2nd generation as the first one tested too slow (the one that could have shown up instead of Richland). That first Kaveri may have been on a bulk process, matching the CTO statement at the time.

With neither TSMC nor GF shipping bulk chips past 3GHz, we can only HOPE that Kaveri is not on a bulk process. Boost "IPC" by 35% and cut frequency by 33%. You'd end up with the same performance (at best) although at a lower TDP. The same people saying Hasfail would have to say Steamfail as well.
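A quick back-of-the-envelope check of that trade-off, as a minimal Python sketch. It assumes the simplest possible model, performance = IPC x frequency, which ignores memory and everything else; the function name is made up for illustration.

```python
# Sketch only: relative single-thread performance under the naive model
# performance = IPC * frequency. Real workloads will not scale this cleanly.


def relative_performance(ipc_gain: float, freq_cut: float) -> float:
    """Performance relative to the baseline (1.0 = unchanged).

    ipc_gain: fractional IPC increase, e.g. 0.35 for +35%
    freq_cut: fractional clock reduction, e.g. 0.33 for -33%
    """
    return (1.0 + ipc_gain) * (1.0 - freq_cut)


if __name__ == "__main__":
    # +35% IPC with a 33% lower clock, the figures from the post above
    print(f"{relative_performance(0.35, 0.33):.3f}")
```

Under this model the result comes out a bit below 1.0, i.e. roughly break-even at best, which is the point being made.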

Besides this is a Speculation thread. If we had to stick to official AMD statements there would be nothing past page 2. ;)
 


just out of curiosity, why would a Thuban need:

1) 8 cores
2) Double Cache L2 and L3

Thuban's hex cores at 3.3GHz are not surpassed in performance by Vishera's 8 cores at 4GHz. A 32nm Thuban would still have consumed less die space than an 8 core or even hex core Vishera and performed faster in single threaded workloads. All they needed was to redesign the IMC and front end a bit, and I am sure an 1100T in, say, 32nm form with more transistors would have outperformed an 8350 in gaming by around 20-30% at lower clocks, that is, in the games that stress single threads, i.e. Civ 5, SC2, Shogun etc., where AMD CPUs are far behind Intel's.

 

juanrga



I didn't say that 2GHz is a limit, but that it is the top frequency used by AMD. Jaguar can probably be clocked at 2.6GHz, the same way that you can OC Richland up to 5GHz on air. But 2GHz is the frequency chosen by AMD for Jaguar based on its own safety and power consumption requirements.



Exactly as I said. 112 GFLOP (DP) for the i7-3770k. For SP the number is double: 224 GFLOP.

i7-3770k: 4C x ( 16SP FLOP per cycle ) x 3.5 GHz = 224 GFLOP
i7-3770k: 4C x ( 8DP FLOP per cycle ) x 3.5 GHz = 112 GFLOP

i7-4770k: 4C x ( 32SP FLOP per cycle ) x 3.5 GHz = 448 GFLOP
i7-4770k: 4C x ( 16DP FLOP per cycle ) x 3.5 GHz = 224 GFLOP

Piledriver: 4C x ( 8SP FLOP per cycle ) x 4.0 GHz = 128 GFLOP
Piledriver: 4C x ( 4DP FLOP per cycle ) x 4.0 GHz = 64 GFLOP

Steamroller: 4C x ( 8SP FLOP per cycle ) x 4.0 GHz = 128 GFLOP
Steamroller: 4C x ( 4DP FLOP per cycle ) x 4.0 GHz = 64 GFLOP

Total performance (CPU+GPU). I gave only SP performance and rounded:

i7-3770k: 224 + 294 GFLOP = 518 GFLOP
Richland A10: 131 + 668 GFLOP = 799 GFLOP
i7-4770k: 448 + 400 GFLOP = 848 GFLOP
Kaveri A10: 128 + 922 GFLOP = 1050 GFLOP
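The peak figures above all follow the same rule, cores x FLOP per cycle x clock. As a minimal Python sketch that reproduces the SP numbers and the CPU+GPU totals: the per-cycle widths, clocks and GPU figures are simply the ones quoted in this post, the Richland CPU figure (131) is taken as given rather than derived, and the function and dictionary names are made up for illustration.

```python
# Sketch only: theoretical peak = cores * FLOP per cycle * GHz.
# All inputs are the figures quoted in the post, not measurements.


def peak_gflop(cores: int, flop_per_cycle: int, ghz: float) -> float:
    """Theoretical peak GFLOP for a CPU."""
    return cores * flop_per_cycle * ghz


# Single precision CPU peaks
CPU_SP = {
    "i7-3770k": peak_gflop(4, 16, 3.5),
    "i7-4770k": peak_gflop(4, 32, 3.5),
    "Richland A10": 131,  # taken directly from the post
    "Kaveri A10": peak_gflop(4, 8, 4.0),
}

# Single precision GPU figures as quoted in the post
GPU_SP = {
    "i7-3770k": 294,
    "i7-4770k": 400,
    "Richland A10": 668,
    "Kaveri A10": 922,
}


def total_sp(chip: str) -> float:
    """CPU + GPU single precision peak for one chip."""
    return CPU_SP[chip] + GPU_SP[chip]


if __name__ == "__main__":
    for chip in CPU_SP:
        print(f"{chip}: {CPU_SP[chip]:.0f} + {GPU_SP[chip]} = {total_sp(chip):.0f} GFLOP")
```

The DP values follow by halving the FLOP per cycle argument, as in the list above.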




It is entirely possible that AMD has changed its plans and I have no problem with speculation, but I do have a problem when people do not differentiate between personal speculation and official claims.

No problem with people believing that Kaveri is SOI, but there is a problem when people say Kaveri is SOI as if it were an official claim.

No problem with people believing that AMD will release an FX Steamroller, but there is a problem when people say AMD confirmed that it will release an FX Steamroller.

No problem with people believing in an 8-core APU, but there is a problem when people say AMD will release an 8-core APU.

No problem with people believing that Steamroller has an improved FPU, or a doubled FPU, or two 256-bit FMAC units..., but there is a problem when people say Steamroller has that, while ignoring that AMD confirmed, during the Kaveri presentation, that Steamroller has an 8 FLOP per cycle FPU like Piledriver.

And so on.

Now about your comments. The information that I have is that kaveri is going bulk on Glofo. AMD claimed that kaveri is bulk. The Japanese article cited by 8350rocks is from July this year and affirms that kaveri is being made at Glofo on a 28nm bulk high-performance process.

Glofo claims that they can go beyond the 3 GHz on bulk.

You seem to claim that Kaveri's one year delay is because they are moving from bulk to SOI. AMD vice president Roy Taylor officially said, a couple of weeks ago, that the delay of Kaveri is not due to architecture or process but due to HSA. I interpret this as AMD delaying Kaveri while waiting for more HSA-enabled software to be available at launch.
 

juanrga



Let us see the improvement in performance per watt for the desktop.

Kaveri A10 APU is 1050 GFLOP and 100W (TDP)
Piledriver FX-8350 CPU is 256 GFLOP and 125W (TDP)
Piledriver FX-4320 CPU is 128 GFLOP and 95W (TDP)

Therefore Kaveri APU offers about 5X the gigaflops per-watt compared to Piledriver FX-8350 and almost 8X the gigaflops per-watt compared to Piledriver FX-4320.

The exact last value is 7.79X, which I rounded to 8X for comparison with AMD claim. If that is not a giant improvement in performance per-watt, I don't know what is.
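For anyone who wants to redo the division, a minimal Python sketch follows. It uses only the GFLOP and TDP figures listed above and treats TDP as if it were the power actually drawn, which is an assumption; the function and variable names are made up for illustration.

```python
# Sketch only: GFLOP per watt from the peak GFLOP and TDP figures quoted above.
# TDP is used as a stand-in for real power draw, which is an assumption.

CHIPS = {
    # name: (peak GFLOP, TDP in watts)
    "Kaveri A10": (1050, 100),
    "FX-8350": (256, 125),
    "FX-4320": (128, 95),
}


def gflop_per_watt(chip: str) -> float:
    gflop, tdp_watts = CHIPS[chip]
    return gflop / tdp_watts


def improvement(new_chip: str, old_chip: str) -> float:
    """How many times more GFLOP per watt new_chip offers than old_chip."""
    return gflop_per_watt(new_chip) / gflop_per_watt(old_chip)


if __name__ == "__main__":
    print(f"vs FX-8350: {improvement('Kaveri A10', 'FX-8350'):.2f}X")
    print(f"vs FX-4320: {improvement('Kaveri A10', 'FX-4320'):.2f}X")
```

This gives about 5.13X against the FX-8350 and 7.79X against the FX-4320, matching the rounded 5X and 8X.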

For the sake of comparison, a hypothetical Steamroller FX-4520 (4 Steamroller cores @ 4Ghz) would offer about 1.15X the performance per-watt compared to Piledriver FX-4320.

I believe it is evident why AMD is moving to HSA APUs with Berlin and Kaveri.

In one sense I consider their new HSA APU concept as revolutionary as the introduction of AMD64 was. AMD's mistake then was that they released AMD64 alone, and time was needed before software took advantage of the new architecture (Note for Intel fanboys: AMD licensed AMD64 to Intel, your Intel chip is using technology developed by AMD).

AMD did learn from its errors and founded the HSA Foundation, which brings together the biggest names in the industry (except Intel). That, and AMD's vice president affirming that the Kaveri delay was due to "HSA's marketing not ready", lead me to believe that we will see HSA-enabled software at the Kaveri launch.
 
</vacation>
<restart>
<catch up on unread posts>

<wading through thousands(according to toms post count, may include replies) of worthless flameposts containing red and blue colored b.s.>
...
<wading>
....
<disregarding r.a.l.f. vs m.i.l.f. verbal wars and textual harrassment of thread followers>
....
<found 1 useful info> kaveri (previously 2012-2013 release) release in q1 2014 and associated info </found>
....
<wading>
....
<so much crap that it isn't even amusing anymore>
<reached current page>
</restart>

result:
i am glad that absolutely didn't miss anything worth missing. turned out to be a right time to take a timeout. :D

kaveri motherboards keep coming: Gigabyte announces the FM2+ A88X motherboard line for AMD’s Kaveri
http://vr-zone.com/articles/gigabyte-announces-the-fm2-a88x-motherboard-line-for-amds-kaveri/51580.html
Richland Athlons and Semprons are coming
http://www.fudzilla.com/home/item/32238-richland-athlons-and-semprons-are-coming
Asus preps 10.1'' neo-netbook with Temash inside
http://techreport.com/news/25241/asus-preps-10-1-neo-netbook-with-temash-inside
TI comes out with DockPort silicon
http://semiaccurate.com/2013/08/05/ti-comes-out-with-dockport-silicon/

btw, does anyone think that according to amd's own uarch roadmaps, amd shouldn't have an fx-type cpu for steamroller and/or after steamroller and that some kind of a high performance apu will replace the flagship cpus. i'm disregarding prices for now.
 

noob2222

^^ double precision to single precision isn't just "x2"

8 core CPU [Piledriver architecture, four modules]
4GHz CPU core clock / 4.2GHz CPU all-core Turbo
1.2Bn transistors / 315mm2 die size
Manufactured on 32nm SOI from GlobalFoundries
Socket AM3+
Dual Channel 128-bit Memory Controller
Two DIMMS - 1866MHz
Four DIMMS - 1600MHz
Simultaneous Read/Write in unganged mode w/four DIMMs
Supports AMD Memory Profiles (AMP) & Intel XMP
125W TDP with AMD Turbo Core 2.0
256 Single Precision GLFOPS
64 Double Precision GFLOPS

AES, FMA3, FMA4, AVX, AMD-V, SSSE4.2 Instruction sets
$195USD SEP
http://www.rage3d.com/reviews/cpu/amd_vishera_fx8350_launch_review/index.php?p=1

Aside from that, what are you losing?

sandra-arith.png

sandra%20arithmetic.png


so you get 5x the gflops and 1/2 the gips. ... Can kaveri replace the 8350? good luck. Haven't seen anywhere claims of 2x the integer performance.

Aside from that, I'm sure that's assuming 0% igp load and only theoretical values.

More important question, how many threads is that 5x going to require or is HSA seamless at making the IGP look like a "single core"?
 

juanrga

Now 64/32 != 2 and Piledriver has fewer Double Precision GFLOPs than what AMD officially claims for Bulldozer, because a site (a funny site that says that Piledriver has a thing called GLFOPS) says so. LOL

Funny, also how Gigaflops per-watt are confused with gigaflops; claims of 2x integer performance are invented; nonsense about 0% igp loads, "single core" IGP; LOL
 

noob2222



So you're saying that while you're playing a game with the IGP, your gflops will be 8x as much as the 4300? LAWL. good luck with that one.

Maybe you think that the IGP can be used simultaneously for gpu and fpu at 100% usage for each, creating a 200% theoretical IGP usage? Or you're thinking Kaveri is 8x (sorry, I quoted the 5x instead of the 8x at the same watt cpu) the float performance without using the IGP?

Nonsense about "single core"? then explain how AMD is going to achieve 8x the performance through scheduling a single thread through the IGP. sum it up by telling us how many "FPU cores" are in the IGP? Is the entire IGP seen as one core?

Or is this some miracle where theoretical gflops isn't dependent on the program being multi-threaded?
 

juanrga



No.



No.



No.



No.



No.

 

maybe if you elaborate a bit further on how the single thread will be executed by the igpu. maybe it's more about how the thread(s?) will be executed and less about the igpu acting like a single core. amd has been trying real hard to blur the traditional concept of a cpu core.
i think i saw something like this being discussed before but it's tough to notice in the middle of those posts. but, i am not going back in there! the horror! :pt1cable:
 
Not wanting to create giant quote pyramids.
@sarinaide

This is a very common myth that Phenom II was faster than BD. That was only true during the first iteration and only due to BD's shared caching / scheduler causing each BD core to stall out and thus never provide full performance. The K10 IS "Stars", and the latest is Llano, which was a modified, upgraded version of it. The modular design is far and beyond superior, it just has a very steep learning curve to get right. PD was a fairly big improvement and SR will be another large improvement. People even proved this by disabling the even numbered "cores", provided you had a MB that supported that, to prevent the resource arbitration. It provided a large increase in single threaded performance but a big hit in overall performance. Also when comparing performance you need to keep in mind exactly which processor components are different. The K10h uArch (Stars) had 3 ALUs per "core", the same as SB, while the BD uArch has 2 ALUs per "core". Thus a 6 core K10 had roughly the same number of ALUs (18) as an 8 core BD (16), yet the BD had more cache, a much better SIMD/FP unit and clocked significantly higher. It was limited by resource arbitration and thus never got to actually use all its components simultaneously.

Also Jaguar is not from K10 but is a BD uArch modular chip. This was the entire purpose of the BD modular design, they can easily exchange / attach components to cheaply tailor it to any target system without needing to design a brand new chip. There is no K10 chip being released now.

About the socket, there is absolutely nothing that a new socket would have given the CPU (outside of the LVDS connectors). We've gone over this before, many pages ago, about how the IMC, HT and CPU clock are all different from each other. Changing sockets does nothing for any of those; they're all separate entities that are individually adjustable, provided you have a board that can do it. Socket AM3+ isn't even running HT at its highest speed yet, as most CPUs are set to 2.0~2.6GHz whereas the bus itself goes to 3.2GHz. The IMC resides inside the "NB" component and is completely separate from the HT components, so no amount of interface redesign will improve memory / cache performance.

If you want to see what a "die shrunk" Phenom II looks like then go buy a Llano APU. They have double the L2 cache per core (1MB vs 512KB) to compensate for not having any shared L3 cache. This is good because shared L3 is fairly low performance and really not that important. If you're hitting L3 then something has gone horribly wrong in your predictor / caching mechanism.

If people are going to complain about the BD uArch they need to at least complain about real issues instead of imaginary ones. Things like resource arbitration and cache latency are very much performance inhibitors, yet they're a result of not enough R&D being poured into the design to tweak it.
 