1.) Where does AMD say they're replacing FX with an APU at all? You did not answer the question.
2.) Where did it say specifically 20nm bulk? You did not answer...
3.) You quoted tech websites...not AMD directly. That's the same as saying your speculation will come to pass...(oh wait! you're doing that!). Again, question is unanswered.
I await your quotes from AMD directly and the links to the quotes for verification.
This confirms what was said here before:
- MANTLE is not specific to AMD hardware.
- High-end cards such as the R9-290X are underused; ~50% idle time is not unusual in current games.
- MANTLE eliminates several GPU and CPU bottlenecks
- "PC gamer will be able to achieve higher frame-rate with a lower spec CPU!"
If AMD is getting 50% idle times, then they need to fix their drivers.
"GFLOPs calculations developed by AMD Performance Labs measuring compute capacity for the AMD A10-5800K desktop APU which is 736 GFLOPS. AMD GFLOPs calculated using GFLOPs = CPU GFLOPs + GPU GFLOPs = CPU Core Freq. (3.8GHz) X Core Count (4) X 8 FLOPS + GPU Core Freq.(800MHz) X DirectX® 11 capable Shader Count (384) X 2 FLOPS."
"theoretical GFLOPS calculated by AMD as 856 for AMD A10-7850K with AMD Radeon™ R7 Series Graphics. GFLOPS = CPU GFLOPS + GPU GFLOPS = CPU Core Freq (3.7 GHz) x Core count (4) x 8 + GPU Core Freq (720MHz) x Radeon Core (512) x 2"
You see? Different CPU and GPU cores, same formula. I don't think that Kaveri is only 16% better than Trinity. Looks like GFLOPS on slides are worthless.
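For anyone who wants to check the arithmetic in the two AMD footnotes quoted above, here is a minimal Python sketch of the same formula. The function name `peak_gflops` and its parameter names are made up for illustration; the clocks, core counts, shader counts and per-cycle FLOPS are the ones given in the footnotes. It only reproduces the theoretical peak, nothing about real-world performance.

```python
def peak_gflops(cpu_ghz, cores, gpu_ghz, shaders,
                cpu_flops_per_cycle=8, gpu_flops_per_cycle=2):
    """Return (cpu, gpu, total) theoretical peak GFLOPS.

    Same formula as in the AMD footnotes:
    CPU freq x core count x 8 + GPU freq x shader count x 2.
    """
    cpu = cpu_ghz * cores * cpu_flops_per_cycle
    gpu = gpu_ghz * shaders * gpu_flops_per_cycle
    return cpu, gpu, cpu + gpu


# Inputs taken from the quoted footnotes (GPU clocks converted to GHz).
trinity = peak_gflops(3.8, 4, 0.800, 384)   # A10-5800K
kaveri = peak_gflops(3.7, 4, 0.720, 512)    # A10-7850K

# Relative gain of Kaveri's total over Trinity's total.
uplift = kaveri[2] / trinity[2] - 1.0

for name, (cpu, gpu, total) in (("A10-5800K", trinity), ("A10-7850K", kaveri)):
    print(f"{name}: CPU {cpu:.1f} + GPU {gpu:.1f} = {total:.1f} GFLOPS")
print(f"Kaveri over Trinity: {uplift:.1%}")
```

The totals come out at roughly 736 and 856, which is where the ~16% figure comes from. Note that the CPU term is slightly lower on Kaveri (lower clock, same 8 FLOPS per core per cycle), so the whole gain comes from the GPU term.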
Max theoretical, assuming no internal/external bottlenecks exist. Absolute increases will be lower, obviously.
On the CPU side, you have a 2x edge in favor of the 8350, so in purely CPU benchmarks, FX is about twice as fast, again, assuming no bottlenecks exist. So figure ~25%-33% edge in favor of FX when factoring that in.
Throwing in the GPU, you have about a 3.5x max theoretical edge, again, assuming no bottlenecks. That being said, pair FX with a GTX 660 or something, and the equation changes back in favor of FX. So without knowing what GPU FX is paired with, the second number is kinda useless, since you have NOTHING to compare against.
That's the point Juanrga keeps skipping over: FX WILL be paired with a GPU of some sort, so you MUST factor that in when equating performance. The real issue, as 8350rocks points out, is power/cost. Obviously, Kaveri platform is cheaper, but an FX-8350 paired with a $200 GPU will still be significantly faster.
So to answer the unasked question: In games, Kaveri will roughly equate to core-i3 performance when the latter is paired with an equivalent GPU. So it puts downward pressure on i3's, but that's about it.
As explained in my BSN* article those formula provide "the maximum floating point performance".
It is the maximum allowed by the architecture: the maximum allowed by the GCN cores and the maximum allowed by the FPU in the module. Recall that the FPU in Steamroller is the same as the FPU used in Piledriver as far as performance goes.
That doesn't mean that the above maximum values are the values obtained in practice for every piece of code. The effective GFLOPS depend on the rest of the architecture (e.g. the front-end's ability to feed the FPU) and on the developer's ability to extract the maximum possible from the silicon.
That said, the GFLOPS number gives a good idea of how Kaveri will outperform an FX-8350 when using HSA software. In my BSN* article I included an HSA benchmark.
1.) Where does AMD say they're replacing FX with an APU at all? You did not answer the question.
2.) Where did it say specifically 20nm bulk? You did not answer...
3.) You quoted tech websites...not AMD directly. That's the same as saying your speculation will come to pass...(oh wait! you're doing that!). Again, question is unanswered.
I await your quotes from AMD directly and the links to the quotes for verification.
1) Answered before. I repeat: they are not replacing FX with APUs now but extending FX into 2014. However, they will replace FX in the future. Lisa Su has already stated that AMD plans to dominate the market from phones to servers using APUs. I gave the quote before.
2) She did not explicitly mention the word "bulk" either when she talked about the current products made @ 28nm: "We are fully top-to-bottom in 28nm now".
3) I quoted tech websites that reproduce what AMD has said. One of the quotes was: "The demand for greater parallel computing capabilities is building through all levels of computing, from mobile devices and PCs to cloud servers and high-performance computing systems, she said."
gamerk316 :
If AMD is getting 50% idle times, then they need to fix their drivers.
The same happens to Nvidia cards. Or do you believe that the Titan/780 are 2x faster than the 290X? Because they aren't.
gamerk316 :
That's the point Juanrga keeps skipping over: FX WILL be paired with a GPU of some sort, so you MUST factor that in when equating performance. The real issue, as 8350rocks points out, is power/cost. Obviously, Kaveri platform is cheaper, but an FX-8350 paired with a $200 GPU will still be significantly faster.
An FX paired with a GPU is not an HSA-enabled system and will not run HSA software as the Kaveri APU does.
Moreover, both of you forget that the HSA APU can be paired with an HSA-enabled dGPU to make a more powerful HSA system.
This information comes from the same source who confirmed the end of development of new chipsets for Socket AM3+, the death of Socket AM3+, the 512 SP maximum for Kaveri, and the removal of the IMC dedicated to GDDR5. It is therefore a reliable source.
NVIDIA and AMD have ALWAYS taken different approaches to GPU design; NVIDIA favors stronger shaders, AMD favors memory bandwidth. This goes back to when DX5/DX6 were not very memory intensive APIs (Which ATI pressured MSFT to change in DX7/8, hence why NVIDIA's 5000 series did so badly). NVIDIA could have other bottlenecks not related to draw call overhead. Likewise, AMD could be twice as powerful as NVIDIA, but being held back by software.
What I can say is that, in the games I play, I'm seeing 85%+ usage, so no, my 770 isn't idling 50% of the time.
The issue is, HSA is going to do nothing in heavy rendering/compiling scenarios. Sure, it might make Photoshop faster...though I would challenge you to show me a design house doing 3D modeling using Photoshop on a wide scale.
You won't find one, because Photoshop is for 2D artists. Now, Autodesk products for 3D modeling...they aren't going to favor something like HSA just yet would be my guess. It's too polarizing for the large numbers of design houses working on Intel machines with NVidia GPUs.
So don't expect HSA to catch on like you think until well after the launch and likely the next generation has released.
Is it innovative? Sure, I cannot fault AMD for their forward thinking.
Is Kaveri an improvement over Richland? Sure, though they very heavily missed the target performance, and I have word as to why that happened (the commentary has been provided already in a previous post).
The issue I see here, is that you are buying all the hype, and I am hearing other things from AMD.
Do they expect it to sell well? Sure, for mainstream PCs. Laptops and OEM desktops.
However, they do not consider it to be HEDT. Your posts manipulating comparisons by including the GPU aspect of Kaveri are entirely misleading. We are discussing CPU intensive tasks, and last time I checked 256 > 118 by 100+%. So, Kaveri is not something I would put into any graphics workstation, or other productivity machine in my office. Until that day comes...Kaveri is not High End. It may be high end for all in ones, laptops, tablets, or whatever else they want to put it into. However, it is by far and away, less raw horsepower than the 8350, and will never exceed the performance of the 8350 in CPU bound tasks.
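The "256 > 118" figures above are the CPU-only term of the same AMD peak-GFLOPS formula quoted earlier in the thread. A small Python sketch, as a sanity check only: the helper name `cpu_peak_gflops` is hypothetical, and the FX-8350 inputs (4.0 GHz base clock, 8 cores) are assumed here because they are the values that reproduce the 256 figure; the post itself does not spell them out.

```python
def cpu_peak_gflops(ghz, cores, flops_per_cycle=8):
    """CPU-only part of the quoted formula: freq x core count x 8 FLOPS."""
    return ghz * cores * flops_per_cycle


# Assumed inputs: FX-8350 at 4.0 GHz with 8 cores; A10-7850K CPU part
# at 3.7 GHz with 4 cores (the latter is from the AMD footnote).
fx_8350 = cpu_peak_gflops(4.0, 8)
kaveri_cpu = cpu_peak_gflops(3.7, 4)

# How many times larger the FX-8350 CPU peak is.
edge = fx_8350 / kaveri_cpu

print(f"FX-8350 CPU peak:    {fx_8350:.1f} GFLOPS")
print(f"Kaveri CPU-only peak: {kaveri_cpu:.1f} GFLOPS")
print(f"FX-8350 / Kaveri:     {edge:.2f}x")
```

On these theoretical numbers the FX-8350 comes out a bit over 2x the Kaveri CPU part, which matches the "by 100+%" claim. It says nothing about integer workloads or lightly threaded software.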
We won't be able to make any concrete deductions on how bad GF screwed AMD with bulk and SOI.
We know one thing for sure. If AMD had selected SOI, they couldn't have presented Kaveri at APU13 and started shipping it now, because GF has failed to execute its 28nm SOI plans. Their Fab 8 is not ready for mass production and Samsung is coming to their help, taking control of the R&D dept.
Assuming GF gets its problems solved by late 2014, it would be... too late (haha), because then Kaveri would be competing against Haswell's successor (Broadwell), and Intel promises a much more powerful iGPU for Broadwell. AMD would then cancel Kaveri and release Carrizo.
Luckily AMD selected bulk and can now start to ship Kaveri for the official launch in January. AMD is also now taping out 20nm chips on bulk and 16nm chips on bulk FinFET.
blackkstar :
Yeah, the 7850k is not that far behind Trinity in clock speed, but look at what happened with Richland overclocking. I can nearly guarantee you that you won't be seeing a single 7850k breaking 5ghz on air. And Richlands that did that were common.
To be quite honest I would be surprised if Kaveri broke the mid 4ghz range for maximum overclocks. I have a feeling that AMD used the clock speed tweaks they used on Richland to salvage a bulk product. Meaning that whatever AMD did that let Richland break 5ghz while Trinity had no chance was applied to Kaveri to prevent Kaveri from having catastrophic clock speed. RCM comes to mind, and I have a very good feeling that if AMD hadn't beta tested what they did to get clock speeds up with Richland, then we'd be looking at a 2ghz range Kaveri high end model.
As said before:
The CPU in Kaveri will be overclocked beyond 4.5 GHz without problems. It will not break world records either.
The former statement is based on leaked docs. The latter means that I don't expect Kaveri to break the 8GHz mark. ;-)
blackkstar :
I also don't expect Kaveri to overclock well at all because of what AMD did with FX. Clearly it was a better choice to release PD with a 20% clockspeed improvement than to release SR FX on bulk. That alone should speak volumes of why there is no SR FX.
I'm assuming things kind of went like this
1. AMD starts talking about Steamroller on AM3+ using 28nm SOI
2. Glofo completely screws it up with delays, tells AMD they can get their big 28nm SOI chip in mid to late 2014.
3. AMD had to decide whether to go bulk with SR FX or just not do it at all
4. They find out clockspeed and overclocking hit that comes with bulk wouldn't make the chip 10% faster
5. Release PD parts with 20%+ clockspeed bump (FX 9000 series)
Earlier I speculated that AMD would take one of two possible paths. The second was "abandon FX-4000 and use Warsaw-like dies for FX-6000/8000/9000 refresh".
Warsaw promises a further reduction in power consumption compared to the current Opterons. I assume they are using a fully enabled resonant clock mesh. Efficiency on the desktop is secondary, but the reduction in power consumption could be used for increasing base/turbo clocks and the ability to overclock beyond 5GHz on air.
But the 2014 desktop roadmap shows that they are not doing that.
So the big question is: why isn't AMD reusing the dies it is fabricating for Warsaw for the FX line as well? Why is the old FX line extended to 2014 without any refresh? Why does the FX-4000 series continue being sold when the 4-core Opterons are abandoned and Warsaw is 12/16 core?
The most likely answer to all that is "lack of demand". We know that the FX line didn't sell well. Therefore, AMD must have an inventory of old products that they want to sell and, at the same time, there is no room for a refresh that 'nobody' would purchase. By nobody I mean less than 1% of total gamers or so.
We also have leaked docs that said that the 9590 was the last FX chip.
My belief is that AMD is extending the FX line up to 2014 in order to replace it with an ultra-high-end APU in 2015 (FM3 socket). Time will tell.
blackkstar :
Take a look at S|A, Kaveri missed targets all over. It didn't miss targets because bulk is just as good as SOI.
The S|A article is overly pessimistic. Several of us (including myself) discussed it in S|A forums.
I want proof of the following:
-AMD said they are replacing FX with Kaveri
-AMD said they are going to 20nm bulk
-AMD said Kaveri APUs are high end parts
Everything I have seen so far is all contradicting your claims.
Do you read? Because in the same message that you replied to I wrote:
the old FX-line is extended to 2014 without any refresh [...]
My belief is that AMD is extending the FX line up to 2014 in order to replace it with an ultra-high-end APU in 2015 (FM3 socket). Time will tell.
About the second question, it was also answered before, but in another post. I will repeat this quote from AMD:
We are typically at the leading edge across the technology nodes. We are fully top-to-bottom in 28nm now across all of our products, and we are transitioning to both 20nm and to FinFETs over the next couple of quarters in terms of designs. So we will continue to do that across our foundry partners. […] We will do 20nm first and then we will go to FinFETs
856 >> 256, which means that the new APU will outperform the old 8350 CPU with HSA-enabled software.
As shown in my BSN* article, the CPU in Kaveri will perform like an SB/IB i5 with ordinary software.
As shown at APU13, the Kaveri CPU can perform like an 8350 at stock or an i7-4770k at stock with MANTLE-enabled software.
As leaked benchmarks show, the Kaveri CPU will outperform an 8350 in integer workloads with software using up to four threads.
But in any case since you ask about what AMD said, here is what AMD says directly or indirectly:
Su said the Kaveri chips will bring improvements over competing processors from larger rival Intel in such areas as graphics capabilities, overall performance and power efficiency. Kaveri also signals a significant step forward in AMD's heterogeneous computing vision of ramping up the parallel computing capabilities in system through the tight integration of multi-core CPUs and GPUs on the same piece of silicon."We believe this really unlocks a different level of … computing," Su said during her opening keynote address at the developer conference.
The demand for greater parallel computing capabilities is building through all levels of computing, from mobile devices and PCs to cloud servers and high-performance computing systems, she said. Driving this demand are such factors as cloud computing, big data and greater mobility. AMD's APU architecture is aimed at addressing that demand.
"The days of single-threaded performance are over," Su said. She noted that between 2011 and 2012, AMD shipped more than 80 million APUs, a number she expects to grow to more than 150 million in 2014 and more than 300 million within a few years after that, illustrating the growing demand for such heterogeneous computing capabilities in the industry.
AMD has recently updated its product roadmap and is set to release its Hawaii-based GPUs at the end of September, Kaveri-based APUs for the high-end segment and Kabini-based APUs for the entry-level segment in the first quarter of 2014, according to sources from the upstream supply chain.
Aimed at high-end desktops and laptops, Kaveri is the follow-up to last year's top-of-the-line Richland series and marks AMD's first heterogeneous APU - where a CPU and GPU are housed on the same piece of silicon to boost performance and speed.
[...]
Further details weren't available, but AMD said the new APU is due in the first six months of next year and that it will demo devices at CES in January.
If theoretical GFLOPs are all that matters, why keep the CPU and instead just use a GPU?
If we are only looking at GFLOPs to measure performance, then we might as well just get rid of the CPU entirely and replace it with GCN cores as the CPU is just wasting die space for low FLOP parts of the chip.
I realize I am going about this in a roundabout kind of way, but you and I both know a system without an x86 CPU isn't going anywhere because not all workloads can scale to GPU cores.
If you are going to include GPU GFLOPs in APU calculation and Mantle is going to work across dGPUs, why can't I use my Tahiti chip for my total GFLOP count in my rig?
Then I have 318GFLOPs for my CPU and my overclocked Tahiti (which is close to Hawaii with ~40% overclock) has 5,324.8GFLOPs.
So my rig has 5,642.8 GFLOPs. Suddenly your comparison between FX and APU with APU winning doesn't look like such a winner now.
It was obvious from watching APU13 AMD has intentions of bringing all of this to dGPUs.
Just to put this into perspective, my system has about 6.6 times more theoretical GFLOP performance than Kaveri.
And yes, I realize you can add a dGPU to kaveri, but if you do add a high end one, the iGPU's performance is so small it is almost irrelevant in comparison.
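The rig numbers above are easy to re-add. A minimal Python sketch, using only the figures given in the post (318 GFLOPS CPU, 5,324.8 GFLOPS GPU, 856 for Kaveri). The one thing not stated in the post is how the 5,324.8 was obtained; the `gpu_check` line assumes a 2048-shader Tahiti at 1.3 GHz with 2 FLOPS per shader per clock, which happens to reproduce it, so treat that line as an assumption.

```python
KAVERI_TOTAL = 856.0   # AMD's theoretical figure for the A10-7850K

rig_cpu = 318.0        # GFLOPS, as stated in the post
rig_gpu = 5324.8       # GFLOPS, as stated in the post

# Assumption: shader count x 2 FLOPS x clock in GHz for an overclocked
# 2048-shader Tahiti at 1.3 GHz. Not given in the post.
gpu_check = 2048 * 2 * 1.3

rig_total = rig_cpu + rig_gpu
ratio = rig_total / KAVERI_TOTAL

print(f"GPU check:  {gpu_check:.1f} GFLOPS")
print(f"Rig total:  {rig_total:.1f} GFLOPS")
print(f"Rig/Kaveri: {ratio:.1f}x")
```

The sum and the ~6.6x ratio both check out. Like every number in this exchange, these are theoretical peaks simply added together, not measured throughput.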
But I didn't mention the 290/290X because they are more powerful, but "Because those two cards already support "system unified addressing" and may be the perfect complement for a hUMA HSA APU"
huh? i know nothing about system unified addressing and i haven't seen it in action. that's why i ignored it, and instead considered the gfx card (did i specify a model in my reply?) as a traditional discrete gfx card, in an imaginary pc.
Nvidia has just presented new GPGPU with unified memory addressing (aka Nvidia version of AMD hUMA)
What about Solidworks, ProE and Catia? I believe they take advantage of the GPU.
i think people are focussing on the wrong strength of kaveri. kaveri is amd's first real sign to shifting fp calculations to gpus, if the leaked bench showing 16% regression and ~30% higher integer performance is true.. kaveri already has the igpu, so as long as amd can use the igpu as fp-co-processor, shouldn't be a problem. problem will be how the softwares see this new system and how they can properly take advantage of it. although i wonder how the igpu will perform if two or more tasks try to use it simultaneously... like play games and run some kind of avx code-type-thingies (that's one of the first fpu-taxing tasks google told me^_^).
although, fp regression poses a problem with cpu derivative of berlin compared to other cpus (fx).
But the way AMD is IMPLEMENTING it, they are leaving it to the developers to move the processing over. And for serial FP tasks, it won't benefit to move the processing over.
So the FP performance? It IS going to show up in benchmarks.
I find it curious that several talks about HSA and CAD were given at APU13.
What "heavily missed target performance"? Overall GFLOPS? Sure, because dropping GDDR5 support forced AMD to reduce the iGPU frequency. It has nothing to do with bulk vs SOI as you claim, because the iGPU and the 7750 are both bulk. It is memory bandwidth all the way.
At the same time AMD showed iGPU running BF4 at least 2x faster than GT630, which surprised more than one, because they were expecting less performance.
I have given you quotes from AMD, including from a vice president, that contradict you.
I compared Kaveri to CPUs using ordinary CPU workloads, using HSA workloads and using MANTLE-enabled workloads.
I have also shown non-HSA situations where Kaveri will outperform the 8350 CPU, but you insist on ignoring those.
The only manipulation here is from you: selectively picking what you want, posting misleadingly about it, and ignoring the rest.
No, Kaveri is not for tablets. As mentioned before in this same thread Beema is for tablets.
So anything that I wrote beyond GFLOPS is ignored, I see.
Also, a total of 856+5000 doesn't seem irrelevant when compared to 256+5000, especially when the former includes HSA hUMA.
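To put a number on the 856+5000 vs 256+5000 comparison, here is a short Python sketch. It assumes, as the posts do, that theoretical peaks of the APU or CPU and a ~5000 GFLOPS dGPU can simply be added; the variable names are only illustrative.

```python
DGPU = 5000.0                  # round dGPU figure used in the post

apu_system = 856.0 + DGPU      # Kaveri APU + dGPU
fx_system = 256.0 + DGPU       # FX-8350 + the same dGPU

# Relative advantage of the APU system on summed theoretical peaks.
edge = apu_system / fx_system - 1.0

print(f"APU + dGPU: {apu_system:.0f} GFLOPS")
print(f"FX + dGPU:  {fx_system:.0f} GFLOPS")
print(f"APU system edge: {edge:.1%}")
```

On summed peaks the APU system is ahead by a bit over 11%, so the iGPU contribution is small next to a high-end dGPU but not zero. Whether software can actually use both at once is a separate question that this arithmetic does not answer.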
Here's the issue Juan: in the one benchmark where Kaveri beats PD, they underclocked PD to 1.8. Which sounds good, until you factor in what happens when a SINGLE PD core gets over 100% work. Here's a hint: You take a giant performance hit. I'm guessing that's what happened here; basically, PD's design ended up bottlenecking when underclocked. If Kaveri were OC'd, rather than PD UC'd, then I'd expect the opposite (and expected) result of PD thrashing Kaveri.
So as always, you need to qualify your statement, to "Kaveri beats PD when PD is underclocked to half its normal speed", also noting how the PD arch is not designed to run at those low frequencies. Or did we suddenly decide to get away from stock v stock testing when it became convenient? We're talking performance, not architectural efficiency, right?
I think people are focusing on the wrong strength of Kaveri. Kaveri is AMD's first real sign of shifting FP calculations to GPUs.
Exactly. Since the first day (since AMD's acquisition of ATI), AMD's concept of the APU has been that of a heterogeneous compute processor, where the GPU is used as a giant FPU.
Kaveri is the first APU that fulfills that long-held dream, and yet some people continue to believe that an APU is something for cheap gaming or cheap graphics. LOL
Well, I do believe the lower GFLOPs are due to Bulk not reaching the same clocks as FD-SOI. The formula uses the speed for each component (CPU and GPU) to calculate the FLOPs, so it's kind of obvious that the process DOES matter in terms of how APU behaves now. Indirect effect, but it still sucks.
And in those same lines, not giving official Turbo speeds is really something to be wary of. They can't back out of the 4GHz mark after Richland broke it so easily. I am doubting the initial batch of Kaveri parts will be as good as Richland when OCing and hence will have lower performance after tweaking. I wonder how the GCN part will stack against VLIW4.
Cheers!
EDIT: Looks like HSA software will save Kaveri. If it doesn't, then Kaveri starts to smell like a flop.
At the same time AMD showed iGPU running BF4 at least 2x faster than GT630, which surprised more than one, because they were expecting less performance.
Guess what? Because AMD is leaving it to developers to change the way they do things, an APU is going to be something for cheap gaming.
Now, if AMD simply routed all FP calculations to the APU, or had something trivial (say, a one-liner) to do this in code, then that would be one thing. Instead, AMD is forcing software updates, forcing new coding styles, and forcing something that won't be supported by the other giant in the market. Typically, that doesn't end well; anyone remember how many studios came out behind PhysX?
What I am afraid we are seeing is the beginning of an API war, which means we WILL be returning to the days of software incompatibility. I remember the days when Glide got all the graphical options, OpenGL had the high quality textures, and DirectX had lower quality textures, but hardware T&L and dynamic shadows. If you didn't have a Voodoo, you had reduced quality. That's where things are starting to head, and I don't like it one bit.
I'm *hoping* MSFT preempts all this by greatly expanding the DirectX API, preferably with a Physics engine built in (so we get one standard adopted by everyone), and gives a mechanism for lower-level hardware access (which they can do better than anyone, since they can change the OS to make it work).
At the same time AMD showed iGPU running BF4 at least 2x faster than GT630, which surprised more than one, because they were expecting less performance.
192 cores @ 875 vs 512 @ 700 ... I would hope that kaveri can be faster ...
Who in their right mind, in any shape or form, would expect less performance out of ~2.5x as much hardware? That's a dumb statement in itself.
To be fair, NVIDIA's shaders have ALWAYS been faster than AMD/ATI's, but the number of extra ones Kaveri has is significant enough that AMD is going to win that matchup.
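A rough sketch of the paper comparison behind the "192 cores @ 875 vs 512 @ 700" line above. It assumes the same 2 FLOPs per shader per clock on both sides, which is exactly the simplification the "NVIDIA's shaders are faster" caveat pushes back on, so read it as raw hardware only. The function name is made up for illustration.

```python
def peak_gflops(shaders, clock_mhz, flops_per_clock=2):
    """Theoretical peak: shaders x clock x FLOPs per shader per clock."""
    return shaders * clock_mhz * flops_per_clock / 1000.0

# Shader counts and clocks as quoted in the post above.
gt630 = peak_gflops(192, 875)
kaveri_igpu = peak_gflops(512, 700)

shader_ratio = 512 / 192                 # raw shader count advantage
throughput_ratio = kaveri_igpu / gt630   # advantage after the clock deficit

print(f"GT 630: {gt630:.1f} GFLOPs, Kaveri iGPU: {kaveri_igpu:.1f} GFLOPs")
print(f"shader count ratio: {shader_ratio:.2f}x")
print(f"shader x clock ratio: {throughput_ratio:.2f}x")
```

On these numbers the shader count advantage is larger than the shader-times-clock advantage, because the GT 630 runs the higher clock.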
That is why I was talking about whether or not HSA gets adopted. If HSA fails, Kaveri fails.
There is no reported clock speed, and seeing as this started out at a 2544 integer score, what are the chances that this isn't always running at 1.8 GHz?
1.) Where does AMD say they're replacing FX with an APU at all? You did not answer the question.
2.) Where did it say specifically 20nm bulk? You did not answer...
3.) You quoted tech websites...not AMD directly. That's the same as saying your speculation will come to pass...(oh wait! you're doing that!). Again, question is unanswered.
I await your quotes from AMD directly and the links to the quotes for verification.
1) Answered before. I repeat. They are not replacing FX with APUs now but extending FX towards 2014. However, they will replace FX in the future. Lisa Su has already stated AMD plans to dominate the market from phones to servers using APUs. I gave the quote before.
2) Nor did she explicitly mention the word "bulk" when she described the current products made at 28nm: "We are fully top-to-bottom in 28nm now".
3) I quoted tech websites that reproduce what AMD has said. One of the quotes was: "The demand for greater parallel computing capabilities is building through all levels of computing, from mobile devices and PCs to cloud servers and high-performance computing systems, she said."
gamerk316 :
If AMD is getting 50% idle times, then they need to fix their drivers.
The same happens to Nvidia cards. Or do you believe that Titan/780 are 2x faster than 290X? Because they aren't.
gamerk316 :
That's the point Juanrga keeps skipping over: FX WILL be paired with a GPU of some sort, so you MUST factor that in when equating performance. The real issue, as 8350rocks points out, is power/cost. Obviously, Kaveri platform is cheaper, but an FX-8350 paired with a $200 GPU will still be significantly faster.
An FX paired with a GPU is not an HSA enabled system and will not run HSA software as the Kaveri APU does.
Moreover, both of you forget that the HSA APU can be paired with an HSA enabled dGPU to make a more powerful HSA system.
The issue is, HSA is going to do nothing in heavy rendering/compiling scenarios. Sure, it might make photoshop faster...though I would challenge you to show me a design house doing 3D modeling using photoshop on a wide scale.
You won't find one, because photoshop is for 2D artists. Now, Autodesk products for 3D modeling...they aren't going to favor something like HSA just yet would be my guess. It's too polarizing for the large numbers of design houses working on Intel machines with NVidia GPUs.
So don't expect HSA to catch on like you think until well after the launch and likely the next generation has released.
Is it innovative? Sure, I cannot fault AMD for their forward thinking.
Is Kaveri an improvement over Richland? Sure, though they very heavily missed the target performance, and I have word as to why that happened (the commentary has been provided already in a previous post).
The issue I see here, is that you are buying all the hype, and I am hearing other things from AMD.
Do they expect it to sell well? Sure, for mainstream PCs. Laptops and OEM desktops.
However, they do not consider it to be HEDT. Your posts manipulating comparisons by including the GPU aspect of Kaveri are entirely misleading. We are discussing CPU intensive tasks, and last time I checked 256 > 118 by 100+%. So, Kaveri is not something I would put into any graphics workstation, or other productivity machine in my office. Until that day comes...Kaveri is not High End. It may be high end for all in ones, laptops, tablets, or whatever else they want to put it into. However, it is by far and away, less raw horsepower than the 8350, and will never exceed the performance of the 8350 in CPU bound tasks.
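The 256 and 118 figures above fall out of the formula in the AMD footnotes quoted earlier in the thread (CPU freq x core count x 8, plus GPU freq x shader count x 2). A minimal sketch, assuming the FX-8350 is counted as 8 cores at 4.0 GHz with the same 8 FLOPs per core per clock; the function name is made up for illustration.

```python
def apu_gflops(cpu_ghz, cores, gpu_mhz=0, shaders=0,
               cpu_flops_per_clock=8, gpu_flops_per_clock=2):
    """Return (cpu, gpu, total) theoretical GFLOPs using AMD's slide formula."""
    cpu = cpu_ghz * cores * cpu_flops_per_clock
    gpu = gpu_mhz / 1000.0 * shaders * gpu_flops_per_clock
    return cpu, gpu, cpu + gpu

# Clocks, core counts and shader counts as given in the AMD footnotes.
trinity = apu_gflops(3.8, 4, gpu_mhz=800, shaders=384)   # A10-5800K
kaveri = apu_gflops(3.7, 4, gpu_mhz=720, shaders=512)    # A10-7850K
# Assumption: FX-8350 at its 4.0 GHz base clock, 8 cores, no iGPU.
fx8350 = apu_gflops(4.0, 8)

for name, (cpu, gpu, total) in [("A10-5800K", trinity),
                                ("A10-7850K", kaveri),
                                ("FX-8350", fx8350)]:
    print(f"{name}: CPU {cpu:.1f} + GPU {gpu:.1f} = {total:.1f} GFLOPs")

print(f"FX-8350 CPU / Kaveri CPU: {fx8350[0] / kaveri[0]:.2f}x")
```

The totals land on the 736 and 856 slide figures after rounding, so both sides of this argument are reading the same formula; the disagreement is over whether the GPU term counts for the workload in question.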
I find it curious that several talks about HSA and CAD were given at APU13.
What "heavily missed target performance"? Overall GFLOPs? Sure, because dropping GDDR5 support obliged AMD to reduce the iGPU frequency. It has nothing to do with bulk vs SOI as you claim, because the iGPU and the 7750 are both bulk. It has been memory bandwidth all along.
At the same time AMD showed iGPU running BF4 at least 2x faster than GT630, which surprised more than one, because they were expecting less performance.
I have given you quotes from AMD, including from a vice president, that contradict you.
I compared Kaveri to CPUs using ordinary CPU workloads, using HSA workloads and using MANTLE enabled workloads.
I have also shown non-HSA situations where Kaveri will outperform the 8350 CPU, but you insist on ignoring those.
The only manipulation here is from you selectively picking what you want, misleading posting about it, and ignoring the rest.
No, Kaveri is not for tablets. As mentioned before in this same thread Beema is for tablets.
856 >> 256 which means that the new APU will outperform the old 8350 CPU with HSA enabled software.
As shown in my BSN* article, the CPU in Kaveri will perform like an SB/IB i5 with ordinary software.
As shown at APU13, the Kaveri CPU can perform like an 8350 at stock or an i7-4770k at stock with MANTLE enabled software.
As leaked benchmarks show, the Kaveri CPU will outperform an 8350 in integer workloads with software that uses up to four threads.
How about the 8% they missed the CPU target and the 24% they missed the GPU target?
They didn't hit the CPU target because of clockspeed, and they didn't hit the GPU target because of clockspeed.
Both of those factors correlate to several things:
Target TDP limits (influenced by bulk vs. FD-SOI)
Target power consumption limits (influenced by bulk vs. FD-SOI)
Clockspeed headroom (influenced by bulk vs. FD-SOI)
The issue is, bulk substrate means higher leakage, which means more power is required to reach the same clockspeed that can be achieved on FD-SOI because of the insulator.
This means, basically, that while Kaveri runs at 3.7 GHz at 95W TDP, Richland runs at 4.1 GHz + Turbo at 100W TDP or 10% faster on a less advanced process with larger/fewer transistors.
Now, your 20% performance gain just took a 10% hit because of loss of clockspeed. See the difference?
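A quick sketch of that clock arithmetic, using the base clocks quoted above. It assumes the "20% performance gain" is a per-clock gain and that performance scales linearly with clock speed; both are simplifications made here, not claims from AMD.

```python
KAVERI_GHZ = 3.7
RICHLAND_GHZ = 4.1
ASSUMED_PER_CLOCK_GAIN = 0.20   # the "20% performance gain" from the post

# How much faster Richland clocks than Kaveri, and the deficit seen from Kaveri's side.
clock_advantage = RICHLAND_GHZ / KAVERI_GHZ - 1
clock_deficit = 1 - KAVERI_GHZ / RICHLAND_GHZ

# Net gain if the per-clock improvement is scaled by the lower clock.
net_gain = (1 + ASSUMED_PER_CLOCK_GAIN) * (KAVERI_GHZ / RICHLAND_GHZ) - 1

print(f"Richland clock advantage: {clock_advantage:.1%}")
print(f"Kaveri clock deficit: {clock_deficit:.1%}")
print(f"net gain over Richland: {net_gain:.1%}")
```

Under those assumptions the two percentages multiply rather than subtract, so the net figure is a little different from a straight 20 minus 10.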
High leakage in bulk substrate is why Intel uses so many tricks to make FinFET work on bulk, like 3D transistors and all the other super expensive to R&D stuff they use in their fab just to keep clockspeeds competitive on a bulk process.
Additionally, bulk is far more temperature sensitive to power consumption at lower voltages than FD-SOI. If you crank the vcore up on a Haswell chip, it gets hot fast. You can turn up the vcore on an FX chip and it doesn't require nearly as drastic cooling to run cooler than the Intel option.
These are all factors that you have not addressed and lead directly into the issues AMD had with bulk process.
They are also why I believe my contact mentioned directly that they would not pursue HEDT on bulk substrate, and would not be doing bulk beyond 28nm.
In the end, if AMD keeps this up, we all lose. FinFET is not the way of the future, it's many times more costly to develop a FinFET on bulk with all the technology Intel has, to compete, rather than take a simple planar UTBB FD-SOI that competes without all the dog and pony show tricks to get there.
But what do I know about CPUs? You clearly are the master of everything about CPUs, if nothing else because you said so, and no one else could possibly be right about something.
What about AMD going with gate first as opposed to Intel's more expensive gate last on bulk? Is this the substrate you're referring to, 8350rocks?
Gate last actually does make a difference, the back biasing and other advantages it affords help. However, the issue is that bulk substrate is just a generic silicon wafer, while FD-SOI (or Fully Depleted Silicon On Insulator) has a thin slice of insulating material in the middle of the wafer. This has tremendous advantages over a standard silicon wafer.
The big question is: can AMD come out with a mainstream Steamroller CPU that's not too expensive but delivers better performance than Intel's mid-range offerings? That's what I want to know.