Professional Help: Getting The Best Overclock From AMD's A8-3870K

[citation][nom]strub[/nom]I have no clue why people still favour g620 + discrete. I recently built a Llano 3870k system with stock speed but undervolting. Idle: 27W full power prime95+3d game: 87W. This is by FAR less power than any intel + discrete could do. And I can still play all the games with twice the frame rates than my old 4670 was capable of.[/citation]

An Intel Celeron or Pentium plus a discrete card can deliver several times the graphics performance for the same cost at fairly similar power consumption. A Celeron or Pentium plus a Radeon 7750 is far more power-efficient in frames per second per watt than a lone Llano APU. An APU plus a discrete card can be pretty good, but it won't match the 7750, let alone the 7770. I'm not thrilled with that setup's fairly low CPU performance, but it isn't so low that even CPU-bound games are likely to let a lone APU win against the 7750 or the 7770. An A6 or A8 plus a Radeon 6670 might manage it, but not a lone APU.

Also, a Celeron or Pentium plus a 7750 can draw less power than your A8 while providing greater performance. So, no, your setup is not using far less power than some alternative Intel solutions can.
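If it helps, here's the kind of math I mean, as a minimal sketch. The fps and wattage figures are invented placeholders just to show the metric, not benchmark results:

[code]
# A minimal sketch of the "frames per second per watt" comparison I'm making.
# The fps and watt figures are invented placeholders, NOT measurements --
# swap in your own numbers from a benchmark run and a wall meter.

def fps_per_watt(avg_fps: float, system_watts: float) -> float:
    """Efficiency metric: average frame rate divided by whole-system power draw."""
    return avg_fps / system_watts

apu_alone    = fps_per_watt(avg_fps=30.0, system_watts=90.0)    # hypothetical lone A8 build
pentium_7750 = fps_per_watt(avg_fps=60.0, system_watts=110.0)   # hypothetical Pentium + HD 7750 build

print(f"APU alone:      {apu_alone:.2f} fps/W")
print(f"Pentium + 7750: {pentium_7750:.2f} fps/W")
[/code]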
 

strub (Honorable, joined Aug 11, 2012)
Well, I happened to have a G630 system on hand for comparison. Idle: 27 W, full CPU load: 72 W.
That system had no graphics card, and its general performance was way below the Llano, at least for the stuff I need it for.

Maven compile of the same project:
A8-3870K, 8 GB, Toshiba 2.5" HDD: 2:17
MacBook Pro quad i7 (mid-2011), 8 GB, 256 GB SSD: 2:12
G630, 4 GB, Samsung 3.5" 7,200 RPM 500 GB HDD: 3:50

RAM wasn't a limiting factor for the compile, though; MAVEN_OPTS was set to 1 GB.

The G630 and Llano setups are pretty comparable; the G630 even had a faster disk (though the 3.5" drive might add a few watts to the power bill).

Most games don't benefit from more than two cores, so there they might be on par. But in general work the 3870K is notably faster. My personal observations seem to be supported by a few public benchmarks (the ratios are worked out below):
PassMark G630: 2620
PassMark A8-3870K: 4758
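Just to make that concrete, the ratios work out like this (plain arithmetic on the numbers above):

[code]
# Back-of-the-envelope check of the ratios above (same numbers, just arithmetic).

def to_seconds(mm_ss: str) -> int:
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)

a8_3870k = to_seconds("2:17")   # 137 s -- A8-3870K, 8 GB, 2.5" HDD
mbp_i7   = to_seconds("2:12")   # 132 s -- MacBook Pro quad i7, SSD
g630     = to_seconds("3:50")   # 230 s -- G630, 4 GB, 3.5" HDD

print(f"Maven compile, G630 vs A8-3870K: {g630 / a8_3870k:.2f}x slower")  # ~1.68x
print(f"PassMark, A8-3870K vs G630:      {4758 / 2620:.2f}x higher")      # ~1.82x
[/code]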

Now consider that the Llano build was running a 3D game in parallel (windowed, 1680x1050) and only drew 15 W more. Of course a 7750 is faster, but it also adds an extra 75 W to cool! And it's much easier to keep steady airflow in a small box if you only need to cool one part. After replacing the stock cooler the system is really quiet, even under load.
Btw, Linux performance is fine on Llano out of the box. For Windows, you should definitely replace Microsoft's AHCI driver with AMD's; it almost doubled my HDD performance in MySQL and PostgreSQL benchmarks.

I haven't looked into whether undervolting is possible on locked Intel chips like the G630, so there might be room for improvement on that front.
 
[citation][nom]strub[/nom]Well, I happened to have a G630 system on hand for comparison. [...][/citation]

There seem to be a few things wrong with some of your numbers. The Pentium G630 shouldn't break 60 W by itself, let alone 70 W. The 7750 does not use anywhere near 75 W; it has a 55 W TDP and doesn't even reach that during gaming workloads.

Of course the A8 has higher heavily threaded CPU performance than the Pentium G630: it has twice as many cores, and each core is fairly close in performance to a G630 core. You're also ignoring that the Pentium would use even less power when paired with a discrete card, because its IGP would be disabled and draw almost nothing. Furthermore, even if the 7750 did use anywhere near 75 W, that wouldn't be enough to generate much heat.

I'd have to check, but if you can undervolt the Pentiums, then that is definitely an option too. Undervolting the 7750 is also worth considering. An Intel CPU plus a discrete AMD card can use less power than the top APUs while providing far more performance, something Tom's has already shown in at least one previous article.
 

strub (Honorable, joined Aug 11, 2012)
All the numbers I posted were measured at the wall with the same EMT707 power meter. Of course the G630 CPU alone doesn't consume 72 W; the whole system did. The 3870K is housed in an Antec 1380. The G630 box belonged to a colleague and was self-built; I don't know the power supply maker because I've already returned the box. I'd guess it was a standard 80 Plus 350 W unit (probably Seasonic, like all his builds).

You were right about the 7750; it's specced at 55 W, though a few vendors seem to overclock it and list 61 W.

I also have a question for you: how did you pull almost 160 W out of your 3870K system? Even if I disable the undervolting, I don't pass the 130 W mark with my box running Prime95 plus a game (NoS World, windowed 1680x1050, best quality). Again: my 90 W with the undervolted 3870K (1.185 V plus GPU undervolting, but stock clock speeds) is measured at the wall for the whole system.
I recently updated the BIOS and still need to redo all my tweaking. But with 1.2 V plainly set in the BIOS, I measure 107 W while running the Prime95 torture test plus NoS World in the configuration above. What really surprises me, btw, is that the game is absolutely playable even while Prime95 keeps all cores at 100%.
NoS World alone is 74 W, Prime95 alone is 94 W.
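Arranged as increments, those readings look like this (whole-system wall measurements, so they don't simply add up), which lines up roughly with the ~15 W figure I quoted earlier:

[code]
# The same wall readings, arranged as increments. They are whole-system numbers,
# so the idle baseline is inside each figure and they don't simply add up.

prime95_only = 94    # W, Prime95 torture test alone
game_only    = 74    # W, NoS World alone, windowed 1680x1050
both         = 107   # W, Prime95 + NoS World at 1.2 V

print(f"Game on top of a fully loaded CPU: +{both - prime95_only} W")  # +13 W
print(f"Prime95 on top of the game:        +{both - game_only} W")     # +33 W
[/code]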


While I was measuring the G630 there was no 3D load running, so the graphics core was mostly idle. Disabling the internal GPU and adding a 7750 (about 10 W in its 2D low-power mode) wouldn't change much, I assume.

Of course the 7750 is better for gaming, but if the 3870K is fast enough for you, then it's roughly $50 cheaper overall, easier to cool, quieter, and faster for real-world apps, imo.
 

strub (Honorable, joined Aug 11, 2012)
I did some more tests. It seems my ASRock motherboard drivers enable some undervolting even if I set nothing in the BIOS: I get 129 W in Prime95, rising to 142 W once all the fans fully kick in.
After disabling all drivers and services, dynamic clocking, etc., it went up to 156 W.
With undervolting I get 94 W, and after a while (the fans again, I guess) it rises to 101 W.
 
The bottom line for gaming is STILL INTEL ONLY, and has been for a few years now. Even the 'cheap' Intel CPUs beat the ever-living crap out of AMD's high-end CPUs. Sorry, there's just no comparing the two.
Even if you have to spend a few more dollars, that's basically 50% to 100% faster than anything AMD has to offer. More than worth it.
 
[citation][nom]computertech82[/nom]The bottom line for gaming is STILL INTEL ONLY, and has been for a few years now. [...][/citation]

Where did you get that idea? Compare the i3 to the Phenom II and the FX-4xxx: the three series are all fairly close, and one only pulls ahead of the others in some games. These days the i3 mostly only wins by a wide margin in older titles, because many newer games handle four threads fairly well, and even then it's not a loss once overclocking is considered.

Furthermore, disabling one core per module on an FX-8120 or FX-8150 gives the remaining core in each module a significant per-clock boost from not having to share resources (up to about 25%) while cutting power consumption by roughly 30% to 40%. With overclocking, they can give the LGA 1155 i5s quite a run for their money, although they still fall a little short of the K-edition i5s in typical top-overclock gaming performance.

The same can be done on the six-core FX CPUs, but that turns them into tri-core rather than quad-core CPUs (still more than enough to pass the i3s by a substantial margin). I'd say the bottom line for top energy efficiency and user-friendliness is Intel, but AMD is still a very strong competitor (except for Steam users, who often have problems) if you know how to use the CPUs to their full potential. Sure, you're then fixing some of AMD's mistakes for them, but that diminishes AMD's reputation, not the effectiveness of the processor.
 

shiitaki (Distinguished, joined Aug 20, 2011)
To do this APU thing right, AMD needs to realize the same thing Intel has refused to accept: system RAM is way too damn slow for graphics! The way to make the APU idea work is to add two additional 64-bit channels to the system, like Intel's quad channel, but these two extra channels wouldn't go far, just far enough to reach a socket similar to a SO-DIMM slot where DDR5 modules would reside. The system could use it for frame and texture data when gaming, or in a server to cache the hard drives and bolster the CPU cache. With ridiculously fast graphics RAM acting as CPU cache, the on-die cache could be reduced and those transistors used for more processing power. A gig of high-speed RAM could hold the entire game engine as well as the graphics work at lower resolutions.
 
[citation][nom]shiitaki[/nom]To do this APU thing right, AMD needs to realize the same thing Intel has refused to accept: system RAM is way too damn slow for graphics! [...][/citation]

You mean GDDR5, not DDR5, right?

The concept of adding dedicated GDDR5 memory channels for the IGP has been brought up for APUs by several Tom's members (including myself) in the past. The problem with using it as cache is that it isn't actually low-latency: GDDR5 has very high bandwidth, but graphics memory also has very high latency, so it would make a very poor replacement for cache. That means added complexity (granted, I think it's worth it) and die area. It would be a great fit for the next-gen APUs that are supposed to be die shrinks, because the shrink would leave more room for two high-bandwidth 64-bit GDDR5 controllers.
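To put rough numbers on the bandwidth gap, here's a quick sketch. These are peak theoretical figures only, and the 4.5 GT/s GDDR5 rate is borrowed from the HD 7750, not from any announced APU design:

[code]
# Rough peak-bandwidth math behind the side-port GDDR5 idea (a sketch, not AMD's design).

def bandwidth_gb_s(bus_width_bits: int, transfers_per_second: float) -> float:
    """Peak theoretical bandwidth: bus width in bytes times per-pin transfer rate."""
    return bus_width_bits / 8 * transfers_per_second / 1e9

# Dual-channel DDR3-1866 (128 bits total), roughly what an A8-3870K can feed its IGP:
print(f"DDR3-1866, dual channel:  {bandwidth_gb_s(128, 1.866e9):.1f} GB/s")  # ~29.9 GB/s

# Two extra 64-bit GDDR5 channels at 4.5 GT/s (the HD 7750's data rate):
print(f"128-bit GDDR5 @ 4.5 GT/s: {bandwidth_gb_s(128, 4.5e9):.1f} GB/s")    # 72.0 GB/s
[/code]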
 

army_ant7 (Distinguished, joined May 31, 2009)
@shiitaki and blaz:
AMD may be waiting for DDR4, which may somehow prove to be sufficient. :) Plus, digging through material shown at their convention a month or two ago, I learned that they plan to unify the CPU and GPU memory space in an APU, so that there's no need to reserve RAM for the GPU and both the CPU and GPU can access and work on the same chunks of RAM. At least I think that's what it is. I remember this may be planned for next year's Steamroller-based Kaveri APUs (the ones following Trinity).

I found this interesting because I think they said it could remove unnecessary latency from data having to "go through" the CPU and then the GPU, the possibly redundant copying of data between the two memory spaces, and so on. I could imagine a huge boost in GPGPU performance and in plain application performance. I also remember reading that the GPU would be able to schedule its own tasks and be somewhat independent of the CPU.

I could be wrong about a bunch of what I've said, but if you guys are interested, may this serve as a lead for your own research if you haven't looked into it already.
 
[citation][nom]army_ant7[/nom]@shiitaki and blaz: AMD may be waiting for DDR4, which may somehow prove to be sufficient. [...][/citation]

Much of what you've read is correct. However, DDR4 isn't coming to consumers any time soon (late 2014 or early 2015 are the earliest expected dates); merging the CPU and GPU memory more seamlessly won't fix the fact that there simply isn't enough bandwidth for the GPU, let alone for the GPU and CPU together; and even DDR4 might not be enough by the time it arrives, because the IGPs would still be bottlenecked by it.

GDDR5 chips often run at 1500 MHz and beyond. To even come close to its performance for GPUs, DDR4 chips would need to surpass 3000 MHz, so DDR4 won't beat GDDR5 any time soon; it probably won't clock that high until well after it launches at lower frequencies. By then, GDDR6 or GDDR7 would probably be out and would be needed to satisfy the IGP's bandwidth needs. If not, it would take something like XDR2 memory.
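Here's the arithmetic behind that 3000 MHz figure, as a sketch: GDDR5 effectively moves four data words per command clock, while plain DDR moves two per I/O clock:

[code]
# The arithmetic behind "DDR4 would need to surpass 3000 MHz": GDDR5 transfers four
# data words per command clock, while plain DDR transfers two per I/O clock.

def gddr5_rate_gt_s(command_clock_mhz: float) -> float:
    return 4 * command_clock_mhz / 1000          # per-pin transfer rate in GT/s

def ddr_clock_needed_mhz(target_gt_s: float) -> float:
    return target_gt_s * 1000 / 2                # I/O clock a double-data-rate bus would need

gddr5 = gddr5_rate_gt_s(1500)                    # 6.0 GT/s per pin
print(f"GDDR5 at 1500 MHz:            {gddr5:.1f} GT/s per pin")
print(f"DDR clock needed to match it: {ddr_clock_needed_mhz(gddr5):.0f} MHz")  # 3000 MHz
[/code]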

AMD's planned memory-latency fixes and the like will help, but they won't come close to a complete solution.
 

army_ant7 (Distinguished, joined May 31, 2009)
[citation][nom]blazorthon[/nom]Much of what you've read is correct. However, DDR4 isn't coming to consumers any time soon. [...][/citation]
I see... I can't argue with you there, but I wonder whether they'd really be able to implement a solution to this, unless they just end up using some dedicated graphics memory like you guys said. Maybe some sort of data compression to reduce bandwidth needs in exchange for some additional processing, like the texture compression (I think) used by Civilization V, which uses DirectCompute to decompress it. (Haha! I don't really know what I'm talking about.)

I'd probably have to do a lot more research, since I've only looked into this lightly, but why can't GDDR RAM be used in place of DDR RAM? Sorry for the newb question, though some people might appreciate the info as well. I vaguely recall reading that the two handle data differently, which is probably the obvious answer.

Another question might be why Rambus' XDR2 isn't being used. I forget whether it's simply not being adopted or whether it has a real disadvantage compared to GDDR5 (and beyond). Thanks! :)
 
[citation][nom]army_ant7[/nom]I see... I can't argue with you there, but I wonder whether they'd really be able to implement a solution to this. [...][/citation]

Using compression and decompression through DirectCompute or OpenCL is a good solution, but it has to be implemented in-game; otherwise AMD would need to do a whole lot of work to get it done in hardware or in the driver. It's certainly a concept worth implementing, but I think it's better left to the games to try.
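Just to illustrate the trade-off, here's a minimal CPU-side sketch with zlib; it's not how the DirectCompute/OpenCL path would work on the GPU, and the payload is a made-up stand-in for texture data:

[code]
# CPU-side illustration of the trade-off only -- NOT the DirectCompute/OpenCL path
# discussed above. It just shows bytes saved versus time spent unpacking them.
import time
import zlib

# Stand-in for texture-like data: a repetitive 4 MiB buffer (assumed, compresses well).
payload = bytes(range(256)) * 16384

packed = zlib.compress(payload, 6)
start = time.perf_counter()
zlib.decompress(packed)
decompress_ms = (time.perf_counter() - start) * 1000

print(f"original:   {len(payload) / 1e6:.1f} MB")
print(f"compressed: {len(packed) / 1e6:.3f} MB ({len(packed) / len(payload):.1%} of original)")
print(f"decompress: {decompress_ms:.2f} ms of CPU time traded for the bandwidth saved")
[/code]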

The problem with graphics memory is that it has huge latency compared to system memory. Very loose timings help it clock very high, but not to the point where the clock speed makes up for those timings enough to keep latency low.

As far as I'm aware, XDR2 is avoided simply because Rambus is a patent troll and will attack other companies over pretty much anything it thinks it can win. Most companies today have made strong efforts to have nothing to do with Rambus despite its superior memory technology in multiple markets. For example, its mobile XDR memory is far faster than LPDDR3 and uses less power, yet I don't know of any products that use it. At this point, companies might prefer to wait for Rambus to nearly go out of business and then buy it out before using Rambus tech, strictly to avoid giving Rambus any leverage to sue.
 

army_ant7 (Distinguished, joined May 31, 2009)
[citation][nom]blazorthon[/nom]The problem with graphics memory is that it has huge latency compared to system memory. Very loose timings help it clock very high, but not to the point where the clock speed makes up for those timings enough to keep latency low.[/citation]
Hm... That's new to me, and I'm glad you told me. I know a bit about memory timings and clock rates, but I'd probably need a deeper understanding of how our RAM technologies work to see how latency and clock rate affect different types of data.

[citation][nom]blazorthon[/nom]As far as I'm aware, XDR2 is avoided simply because Rambus is a patent troll and will attack other companies over pretty much anything it thinks it can win. Most companies today have made strong efforts to have nothing to do with Rambus despite its superior memory technology in multiple markets. For example, its mobile XDR memory is far faster than LPDDR3 and uses less power, yet I don't know of any products that use it. At this point, companies might prefer to wait for Rambus to nearly go out of business and then buy it out before using Rambus tech, strictly to avoid giving Rambus any leverage to sue.[/citation]
It's cool that you know about that issue with Rambus. Hearing how business practices end up not benefiting consumers is a shame, but we can't really get angry at any of those companies, except maybe Rambus if it's being unjust with its patent claims.
I could research these things to confirm them, but I'll take your word for it since you're a reliable fellow researcher who shares what he learns and thinks. :) Thanks yet again!
 

army_ant7 (Distinguished, joined May 31, 2009)
Thinking about Civilization V's compression and decompression via DirectCompute, I wonder whether it performs particularly well on APUs compared to other games. I remember seeing benchmarks; I'd have to look back at them and compare against systems with discrete CPUs and GPUs.
 
[citation][nom]army_ant7[/nom]Thinking about Civilization V's compression and decompression via DirectCompute, I wonder whether it performs particularly well on APUs compared to other games. [...][/citation]

That's a good question; I think I'll look into it. I bet it would work incredibly well on AMD's next-gen APUs that fully converge the CPU and GPU memory and use GCN or better for the graphics cores. The CPU and GPU portions of Llano and even Trinity still communicate with fairly high latency (though much lower than with discrete cards), so that might affect their performance relative to DDR3-equipped Radeon 6570s and 6670s.
 