AMD CPU speculation... and expert conjecture



Well, nothing is stopping you from making a GPU out of x86 cores; Intel tried it (Larrabee/Knights Peak)... oh wait, it drew a ton of power and hasn't actually made it to market as a product yet. Never mind.
 


Wasn't starting a war, was just explaining what NEON was.
 


I don't care if ARM is discussed. It's the way it's been discussed that people take issue with: as if ARM were the magical solution to everything just because it's newer. The grass is always greener in the other person's yard when the scalability issues are all conveniently ignored.
 


Actually, that's still going on and shipping now; they rebranded it as Xeon Phi.

They're still dumping money into it and expanding it with on-package memory (HBM).
 


Actually, the results for threads > logical cores are not influenced by priority: at that point the app is consuming approximately 100% of the CPU and there is nothing extra to squeeze out. It might make a difference if you are running other things in the background, but the app specifically asks users to shut down background processes! And really, if you are going to run a benchmark alongside YouTube and video rendering, you can't expect sane results regardless of what you set the priority to 🙂

The place where priority affected the results was where threads < logical cores, as higher priority seemed to incline Windows to schedule the threads more efficiently. But once the cores are saturated, there is basically no difference. Here are Yuka's experiments with priority (high priority compared to normal priority):

http://www.headline-benchmark.com/results/cef25b47-463f-4aa8-a70b-f4a9c516f1c7/083b462f-9ef2-46f6-b1bb-a3c83c1f9cce

As you can see, the priority possibly gives a boost for light threading, but by the time cores are saturated there is no real difference.
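To make the saturation point concrete, here's a quick toy sketch (my own illustration, not the app's actual code) that splits a fixed amount of CPU-bound work across an increasing number of threads on Windows. Wall time should drop until the thread count reaches the logical-core count and then flatten out, priority or not:

#include <windows.h>
#include <stdio.h>

static volatile LONG64 sink;               /* keeps the work from being optimized away */

typedef struct { long long iters; } job_t; /* per-thread share of the fixed total work */

DWORD WINAPI spin(LPVOID arg)
{
    job_t *j = (job_t *)arg;
    LONG64 acc = 0;
    for (long long i = 0; i < j->iters; i++)
        acc += i ^ (acc >> 3);             /* cheap, purely CPU-bound busywork */
    InterlockedAdd64(&sink, acc);
    return 0;
}

int main(void)
{
    const long long TOTAL = 2000000000LL;  /* same total amount of work for every run */

    for (int n = 1; n <= 16; n *= 2) {
        HANDLE th[16];
        job_t  job = { TOTAL / n };

        DWORD t0 = GetTickCount();
        for (int i = 0; i < n; i++)
            th[i] = CreateThread(NULL, 0, spin, &job, 0, NULL);
        WaitForMultipleObjects(n, th, TRUE, INFINITE);
        DWORD t1 = GetTickCount();

        for (int i = 0; i < n; i++)
            CloseHandle(th[i]);
        printf("%2d threads: %lu ms\n", n, (unsigned long)(t1 - t0));
    }
    return 0;
}

On a 4-core/4-thread chip, the 8- and 16-thread runs should take about as long as the 4-thread run: once the cores are saturated, there is simply nothing left to schedule.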

 


Honestly, it depends on a few things. Intel and AMD may differ here due to how CMT operates, and there will be Linux/Windows scheduling differences on this one too. For the Windows case:

In the case where NumThreads < NumCores, I'd expect a speedup from a priority boost, as it reduces the chance of a system thread kicking one of your application's threads off a core; the scheduler would instead kick a lower-priority thread off a different core. This also has the secondary effect of greatly reducing the chance of threads bouncing between cores, which can eat performance.

For the NumThreads > NumCores case, a few things could happen depending on the CPU arch and the scheduler. On one hand, you'd have a bottleneck where no matter what you do, some of your threads can't run, and thus a priority boost really won't affect performance. On the other hand, on an HTT/CMT system, getting threads done even slightly faster can have a significant impact if it results in getting another thread off a shared HTT/CMT core (which costs you performance). That was the case I was thinking of above.

The Linux case would be the most interesting, since the default scheduler (CFS) tends to run threads in such a way that each gets roughly the same amount of total execution time. As a result, I'd expect total execution time to rise as the number of threads in the system increases (background tasks, etc.). I'd expect you'd be able to measure a difference between a heavy and a light Linux distro in purely CPU-bound benchmarks.

FYI, you should be able to invoke the WinAPI and manually set the priority of individual threads; it should be trivial to make them all the highest priority on Windows, which should remove any such issues in the future.
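Something like this minimal sketch (the worker body is just a placeholder for the real benchmark kernel):

#include <windows.h>

#define NUM_WORKERS 8

DWORD WINAPI worker(LPVOID arg)
{
    (void)arg;
    /* ... CPU-bound benchmark work goes here ... */
    return 0;
}

int main(void)
{
    HANDLE threads[NUM_WORKERS];

    for (int i = 0; i < NUM_WORKERS; i++) {
        /* Create suspended so the priority is set before the thread runs. */
        threads[i] = CreateThread(NULL, 0, worker, NULL,
                                  CREATE_SUSPENDED, NULL);
        SetThreadPriority(threads[i], THREAD_PRIORITY_HIGHEST);
        ResumeThread(threads[i]);
    }

    WaitForMultipleObjects(NUM_WORKERS, threads, TRUE, INFINITE);
    for (int i = 0; i < NUM_WORKERS; i++)
        CloseHandle(threads[i]);
    return 0;
}

Note that SetThreadPriority() only adjusts priority within the process's priority class, so combining it with SetPriorityClass() on the process covers both knobs.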
 

And how difficult would it be to insert the text "CONFIDENTIAL" on a slide, in a document or an image...?

Printing business cards saying "Title: Emperor of the Galaxy" doesn't make it true, but it is simple enough to make them... And a lot of fun...
 

It's a bit OT, so I PM'd you, gamerk.

For the sake of posterity, I believe Cazalan below is referring to the porno spam that has since been deleted, not this post... lol
 


I think even AMD isn't sure what their roadmap will be two years out. It will depend on their Kaveri and other APU sales, and on which nodes GF finally has available. They can always tape out parts and not manufacture them, depending on the economics at the time.

They even said they would release a version of the 8-core APU used in the PS4/XBone, but that's not on any roadmap yet either.
 
@Cazalan: My comment was not addressed to you.

@kviksand81: You can believe what you want. The PR representative has also said that the roadmap presented at APU13 is official, and you continue believing it is not official because you cannot find it on the AMD website...

@jdwii: Except that James Prior didn't say the roadmap was fake; he only said: "I've never seen that slide before, I don't know where that came from". And this is the same James Prior with whom I argued for days on Twitter when I leaked the Kaveri diagram just before the BSN* article was published. I got the diagram from an official talk given by one of his superiors at AMD, yet for days he claimed he was familiar with neither the talk nor the slide I was mentioning. In the end he was unable to answer my question about the slide with a simple YES or NO.



20nm bulk is ready for volume production. The first processors have been taped out on 16nm bulk FinFET, and the first tests of 10nm bulk FinFET are already under way.
 
@juan
What Cazalan said is correct. Mentioning ARM a few times here and there is one thing. Talking about ARM in everything you post is something different entirely.

Good luck on GF's 20nm being on time.
 
^^^ I didn't say whether he was correct or not, merely that my comment was not addressed to him. Also, talking about other posters in everything you post here is something different entirely.
 
@juanrga:

20nm bulk is a pipe dream... it will be FD-SOI for CPUs past 28nm. 20nm bulk is only a ULP process for ARM cores and the like that operate below 2 GHz.

16nm is a hybrid process with a 14nm FEOL and a 20nm BEOL. It is an XM process and will be geared toward FinFETs, probably on FD-SOI... though if they actually manage to get FinFETs to work, they may be able to do without FD-SOI. However, FinFET on bulk is quite a bit more expensive and complex.

14nm will be FD-SOI or FinFET, and 10nm will only be FD-SOI FinFETs, because they will have to eliminate all other complexities... or did you not read that even Intel is going to FD-SOI past 14nm?
 


Not going anywhere != New Product. Read between the lines on this one.
 


Agreed
 


I'll be "that guy" and point out the obvious: neither of those is a CPU-intensive benchie.

For all I want the new APUs to do well, credit is due when it's time. In this case, I want to see how it fares not only in an "avg joe" scenario (PCMark does that, I believe), but also in an OC scenario.

I don't know how to express it well, but they put themselves on the same level as an i5 K-series, so I'll judge them at that level. Just like they put themselves on the same level as the 980X with BD1. We all know how that went, lol.

Cheers!
 
8% uplift over the i5-4670K on CPU-only or on CPU/GPU total score? I really don't believe Kaveri can even compete with the 3570K, much less the 4670K. They should show some CPU-only benchmarks; maybe they are using the GPU to boost the total score... I don't like this, I am starting to feel Kaveri will be a disappointment in CPU-only workloads.
 
Off topic, but I installed W8.1 on my AMD A8-3520M Llano and got 15% lower scores in the Cinebench multi-core CPU test but a 5% boost in the single-core test. In wPrime I got the same scores, maybe 1-2% slower on 8.1 compared to 7. But in the Fritz chess benchmark I got a 13% boost in the 1-thread test compared to 7, and about a 1% boost using all the cores.

I noticed W8.1 seems to use turbo more efficiently, even on this older Llano laptop, but I can't explain the lower scores in Cinebench and the slightly lower scores in wPrime (tested 3 times).

Going to try gaming next to see how that feels on this thing.
 


PCMark scores include a gaming component, so the iGPU is almost certainly included in the 4670K comparison. Also, on the slide the next bullet point calls out the GCN cores.
 
PCMark 8 measures APU performance: it is measuring the CPU+GPU of Kaveri against the CPU+GPU of the i5-4670K. PCMark 8 uses code from ordinary applications that are accelerated by the iGPU: Handbrake, Photoshop, VLC player...
 