AMD CPU speculation... and expert conjecture

Page 468


Yep, bandwidth isn't the issue for most tasks, latency is. Take the CPU accessing main memory: no matter how much you can pump across the bus, if it takes 1ms to get that data across, regardless of size, you are stuck for 1ms. Hence the CPU cache, which is really just a way to reduce latency as much as possible and keep the CPU fed. VRAM operates the same way.

If the minimum read time is 1ms, regardless of how much data you can pump across the bus, every time you need to read from that bus, you stop processing for that amount of time. That's latency. And it kills performance.
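
To put rough numbers on that (purely illustrative figures, not any real part's specs): total read time is roughly fixed latency plus size divided by bandwidth, so for small reads the latency term swamps everything no matter how fat the pipe is.

```python
# Illustrative sketch only: why latency, not bandwidth, dominates small reads.
# The latency/bandwidth figures below are made-up round numbers, not real hardware specs.

def transfer_time_us(size_bytes, latency_us, bandwidth_gb_s):
    """Rough model: total time = fixed latency + size / bandwidth."""
    bytes_per_us = bandwidth_gb_s * 1e9 / 1e6  # GB/s -> bytes per microsecond
    return latency_us + size_bytes / bytes_per_us

for size in (64, 4096, 1_048_576):  # cache line, page, 1 MiB
    huge_pipe = transfer_time_us(size, latency_us=1000.0, bandwidth_gb_s=500.0)
    low_lat = transfer_time_us(size, latency_us=0.1, bandwidth_gb_s=25.0)
    print(f"{size:>8} B  1ms latency @ 500GB/s: {huge_pipe:9.3f} us   "
          f"0.1us latency @ 25GB/s: {low_lat:7.3f} us")
```

Even with 20x less bandwidth, the low-latency link wins by a wide margin at these sizes.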
 

8350rocks

Distinguished


Meet Dragonfly...500GB/s bandwidth interconnects...

Nothing discrete is outdated or slow...you do not have that much data bandwidth in cache.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Lots of things can be done. It doesn't mean that's the way it will be utilized. I saw it and dismissed it as marketing fluff. Ultimately it isn't even up to Intel; it's the system integrators that choose that. Your Crays, your SuperMicros, your Tyans. The people making building blocks for HPC.




You just said NVidia was working on a mythological 150GB/s interconnect for their exascale APU. Then when I explained to you exactly how that can be done, you say it hasn't been invented yet? LoL! :pt1cable:

Of course internal interconnects will always be faster than external interconnects, that's basic physics. CPUs/APUs may grow to 100B transistors by 2020 but you'll still need tens of thousands of them to get to exascale.
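
To see why, some back-of-the-envelope arithmetic (the per-chip throughput figures below are hypothetical round numbers, not projections for any real product): an exaflop is 10^18 FLOPS, so even very fat chips have to be replicated tens or hundreds of thousands of times.

```python
# Illustrative arithmetic: how many chips an exaflop machine would need.
# Per-chip sustained throughput figures are hypothetical round numbers.
EXAFLOP = 1e18  # FLOPS

for per_chip_tflops in (1, 10, 50):
    chips = EXAFLOP / (per_chip_tflops * 1e12)
    print(f"{per_chip_tflops:>3} TFLOPS per chip -> {chips:,.0f} chips for 1 exaflop")
```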
 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680


For some reason they cause errors in my code reader; I've had to send two back because something in the programming is not being translated correctly.
 

vmN

Honorable
Oct 27, 2013
1,666
0
12,160
Something I don't quite get is why AMD is continuing with their clumsy scheduling system.
I could imagine a shared scheduler across multiple cores (just like on a GPU) balancing the workload much better and providing better performance.
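
For what it's worth, here is a minimal sketch of that idea: one shared queue feeding several workers, so a slow task on one worker doesn't strand the work statically assigned to it. The task costs and names are invented for illustration; this says nothing about how AMD's actual front-end or a GPU's hardware scheduler is implemented.

```python
# Hypothetical illustration of shared scheduling: one common work queue,
# workers pull the next task whenever they go idle. Not any vendor's design.
import queue
import threading
import time

tasks = queue.Queue()
for cost in [0.05, 0.2, 0.05, 0.05, 0.3, 0.05, 0.05, 0.1]:  # uneven task costs
    tasks.put(cost)

def worker(name):
    while True:
        try:
            cost = tasks.get_nowait()  # pull work as soon as this "core" is free
        except queue.Empty:
            return
        time.sleep(cost)               # stand-in for real work
        print(f"{name} finished a {cost:.2f}s task")

threads = [threading.Thread(target=worker, args=(f"core{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```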
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Now split the bandwidth per Aries chip among the whole number of Xeon CPUs. Cray engineers have measured the bandwidth for a Sandy Bridge Xeon based cluster and got 15GB/s available for each Xeon CPU.

Now compare to the bandwidths per APU given above and you can see why I am far from impressed.
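
The arithmetic behind that per-CPU figure is just a division. The inputs below are assumptions chosen only to reproduce the ~15GB/s quoted above, not values from Cray's documentation:

```python
# Illustrative division only; both inputs are assumptions, not Cray specs.
aries_bandwidth_gb_s = 120.0  # hypothetical aggregate bandwidth of one Aries chip
xeons_per_aries = 8           # hypothetical number of Xeon CPUs sharing that chip

print(f"{aries_bandwidth_gb_s / xeons_per_aries:.0f} GB/s available per Xeon")  # -> 15 GB/s
```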



I find it interesting that before you took Hazra's marketing material (against heterogeneity) as gospel, but now you take Intel's roadmap and sales plans as marketing. When Hazra claims that the discrete card is for legacy users, he is not doing marketing; he is telling you the plans of the company. It is the same situation as when Feldman (AMD) tells you that the Warsaw CPU is only for legacy users and that AMD doesn't plan to release Steamroller/Excavator Opteron CPUs. That is not marketing; it is reality.

As for what supercomputer makers will do, I can tell you that Cray is one of the members of the team designing the supercomputer based on Nvidia APUs. :sarcastic:

I have said that AMD/Nvidia APUs are scheduled for 2018 or so. Why are you surprised that the products are not ready today? If I tell you that the Carrizo APU is scheduled for 2015, will you also be surprised that you cannot purchase one today?

What I tried to say above is that the new interconnects for exascale-level supercomputers are not obtained simply by taking a current interconnect and increasing the bandwidth. I already explained that exascale compute is not achieved by simply scaling up current architectures/designs. A link explaining some of the new paradigms was given. You can continue ignoring it, but that doesn't change anything.

Of course internal interconnects will always be faster than external interconnects, that's basic physics, but while you mention the obvious you ignore the relevant part. At the exascale level there is a 10x power wall, which doesn't exist at the current level. Current supercomputers are based on a CPU+dGPU design, whereas future supercomputers will not be, because of this.
 
AMD bringing dual OS solution to retail
http://www.fudzilla.com/home/item/34051-amd-bringing-dual-os-solution-to-retail
AMD to offer Android emulation on retail products
http://semiaccurate.com/2014/02/26/amd-offer-android-emulation-retail-products/

i want this on a kaveri apu rite nao http://www.techpowerup.com/198227/hybrid-memory-cube-consortium-releases-hmc-2-0-specification.html 160GB/s from a 2GB chip @ 70% less energy mmm... <3 perf power efficiency...

Catalyst Beta 14.2 V1.3 drivers are optimized for Thief
http://techreport.com/news/26085/catalyst-beta-14-2-v1-3-drivers-are-optimized-for-thief
Dual graphics DirectX 9 application issues have been resolved rly :O

AMD Press Talks Up Major Open-Source Linux Driver Features
http://www.phoronix.com/scan.php?page=news_item&px=MTYxNDc

AMD reportedly moves desktop headquarters to China to strengthen competitiveness
http://www.digitimes.com/news/a20140224PD214.html

AMD-powered fit-PC4 released
http://www.fanlesstech.com/2014/02/amd-powered-fit-pc4-released.html



 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


That ups it to 60GB/s. Between L2 and L3 cache speeds.
 


What's the minimum latency? Funny how that goes unmentioned.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Why does marketing confuse you so much? Of course Intel is going to tout that Phi can run solo. It differentiates them from the products AMD/NVidia have today that compete with it. You're focusing on a singular aspect that ultimately is not the game changer for that hardware, which is the on-package memory.






Yes I'm aware. Cray works with pretty much every major and minor player in the field. They still work with AMD too. Until you see a product announced it's just one of many things "in the works".

NVidia is also working with IBM to make a traditional PowerPC+GPU solution. But who knows where that will lead now that IBM is selling their fabs. There are a lot of nervous IBM'ers right now. 13,000 layoffs with more to come.

Heck Samsung just joined the OpenPower consortium. I thought ARM was the future. ;)



You identified an NVidia interconnect of approximately 150GB/s and I showed you how it can be done. Do you now need more than 150GB/s?

Sure, they will need something more elaborate than the existing Dragonfly topology, but the underlying building block is still an MGT. The MGTs aren't just scaling up in bandwidth; they're also reducing power. Unless you're envisioning quantum interconnects, we only have parallel and serial buses to work with.
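
Since the building block is a serial transceiver lane, aggregate link bandwidth is just lanes x per-lane rate x encoding efficiency. A quick sketch with hypothetical lane counts and rates (not any vendor's actual spec):

```python
# Illustrative only: aggregate link bandwidth from serial MGT lanes.
# Lane count, lane rate and encoding overhead are hypothetical examples.
def link_bandwidth_gb_s(lanes, gbit_per_lane, encoding_efficiency):
    return lanes * gbit_per_lane * encoding_efficiency / 8  # Gbit/s -> GB/s

print(f"{link_bandwidth_gb_s(lanes=16, gbit_per_lane=25, encoding_efficiency=64/66):.1f} GB/s")
# ~48 GB/s per direction; more lanes or faster lanes scale it toward 150GB/s and beyond
```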



You mean to tell me that discrete circuits will become integrated circuits to reduce power? :sarcastic:
 
Mantle just died:

http://techreport.com/news/26090/mantle-no-more-gdc-sessions-point-to-the-next-directx?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+techreport%2Fall+(The+Tech+Report)

Come learn how future changes to Direct3D will enable next generation games to run faster than ever before!

In this session we will discuss future improvements in Direct3D that will allow developers an unprecedented level of hardware control and reduced CPU rendering overhead across a broad ecosystem of hardware.

If you use cutting-edge 3D graphics in your games, middleware, or engines and want to efficiently build rich and immersive visuals, you don't want to miss this talk.

Driver overhead has been a frustrating reality for game developers for the entire life of the PC game industry. On desktop systems, driver overhead can decrease frame rate, while on mobile devices driver overhead is more insidious--robbing both battery life and frame rate. In this unprecedented sponsored session, Graham Sellers (AMD), Tim Foley (Intel), Cass Everitt (NVIDIA) and John McDonald (NVIDIA) will present high-level concepts available in today's OpenGL implementations that radically reduce driver overhead--by up to 10x or more. The techniques presented will apply to all major vendors and are suitable for use across multiple platforms. Additionally, they will demonstrate practical demos of the techniques in action in an extensible, open source comparison framework.

So Mantle spurred DX and OGL to get their acts together to address their overhead issues, removing the need for Mantle to exist in the first place, which I predicted several months ago...
 

8350rocks

Distinguished
@JUANRGA:

Seriously...?

Look, a single GPU is orders of magnitude more power efficient than a single APU for exascale computing.

What do you not understand? 4x GPU + 1x CPU > 4x APUs.

Are you that stubborn or that ignorant? Which is it?
 

con635

Honorable
Oct 3, 2013
644
0
11,010

I think this was the plan, it was mentioned before. Will the DX improvements be free though? I can't even get 11.x without shelling out for Windows 8. This is a big step forward for budget PC gaming, happy times.

 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


I don't think those details have been worked out fully. It is an interconnect using a packet-based protocol, with read and write sizes of 16 to 128 bytes. Latency is likely on par with what you'd get from a QPI link.
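
As a rough picture of what a packet-based read protocol with 16-128 byte payloads could look like (field names, sizes, and the 16-byte granularity below are invented for illustration, not taken from any spec):

```python
# Invented illustration of a packet-based read request/response with a
# bounded payload size; this is a toy model, not the HMC protocol.
from dataclasses import dataclass

@dataclass
class ReadRequest:
    address: int
    size: int  # requested payload in bytes, 16..128 in this toy model

    def __post_init__(self):
        if self.size % 16 or not 16 <= self.size <= 128:
            raise ValueError("payload must be 16..128 bytes in 16-byte steps")

@dataclass
class ReadResponse:
    address: int
    data: bytes

def serve(req: ReadRequest, memory: bytearray) -> ReadResponse:
    return ReadResponse(req.address, bytes(memory[req.address:req.address + req.size]))

mem = bytearray(range(256))
print(serve(ReadRequest(address=32, size=16), mem))
```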
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810



You can thank AMD for twisting their arms. Someone had to do it.
 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680


M$ will most likely turn it into an app you have to pay for, or to get the next improvements you'll have to buy a new OS.

 

ColinAP

Honorable
Jan 7, 2014
18
0
10,510


That's a paradoxical thing to say. If Mantle hadn't existed in the first place, then how would DX and OGL have been spurred to get their acts together?

Another one of your predictions was that nobody would ever use Mantle. They have. So your prediction rate on Mantle is 50%. Well done, give yourself a pat on the back.
 
So Mantle spurred DX and OGL to get their acts together to address their overhead issues, removing the need for Mantle to exist in the first place, which I predicted several months ago...

Just because MS will support something in the future that may or may not resemble Mantle isn't cause to proclaim it dead. Now we enter the competition phase, where competing standards will each evolve and attempt to solve the problem in different ways. Developers will try out both, and over time one will win out as the accepted standard while the other fades into obscurity. Regardless of which eventually wins, the process itself will enhance the final product for users, as each competitor will be trying to create a bigger, better mousetrap.
 


Remember AMD is the smallest of the three major GPU OEMs, after NVIDIA and Intel. So if DX and OGL improve, even if to a lesser extent than Mantle allows, Mantle goes away simply due to market support.

Secondly, I know for a fact the improvements to DX have been in the works for at least a year; you saw the first bit of the API speedup in 11.1 (look at BF4 in Win8 versus Win7). OGL...is a bit more surprising, since the API is just so badly written at this point.
 

vmN

Honorable
Oct 27, 2013
1,666
0
12,160
The thing is, Mantle is currently only supported by newer AMD cards, which is its biggest problem right now.
We are going through an interesting time, gentlemen.
 