AMD CPU speculation... and expert conjecture

Status
Not open for further replies.

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The main reasons for choosing a FreeDOS computer are (i) you don't pay for an OS license and (ii) you can format and install your favorite OS (e.g. Linux, Windows XP/7...).
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


If you use older binaries, the CPU gain is in the single digits. If you use updated binaries that can use the extra hardware in the Haswell CPU*, you can see big gains of up to 40% over IB.

* Haswell has twice the peak single-precision FLOPs per cycle per core of IB: 32 vs 16.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790
Power consumption and OC easiness

[Image: Kaveri power draw table, Kaveri Power Draw_575px.png]

http://www.anandtech.com/show/7677/amd-kaveri-review-a8-7600-a10-7850k/9

Kaveri can be overclocked from 3.7GHz up to Richland's base frequency without changing the voltage. The increase in power consumption at load is small (single-digit watts). This data seems to agree with my original explanation that the reduction in CPU frequencies was mainly due to the reduction in TDP rating (100W-->95W) and not to the move from SOI to bulk. A hypothetical Kaveri APU rated at 100W, like Richland, could be clocked at 4.0--4.1GHz. The difference would be in overclocking headroom, where Richland does better.

The above table also shows that you don't need to raise the voltage heavily to hit 4.4GHz. Personally, I would prefer to stop at 4.3GHz with 1.275V.
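
To put a rough number on the "same voltage, higher clock" argument, here is a minimal sketch using the textbook CMOS relation that dynamic power scales with V^2 * f. The 50W CPU-side figure is purely an assumption for illustration (it is not taken from the review), and leakage and the iGPU are ignored.

```python
def dynamic_power(base_power_w, base_freq_ghz, new_freq_ghz,
                  base_volt=1.0, new_volt=1.0):
    """Scale dynamic power with P ~ C * V^2 * f, holding capacitance C constant."""
    freq_ratio = new_freq_ghz / base_freq_ghz
    volt_ratio = new_volt / base_volt
    return base_power_w * freq_ratio * volt_ratio ** 2


# ASSUMPTION: hypothetical CPU-side dynamic power at 3.7GHz, not a measured value.
ASSUMED_CPU_POWER_W = 50.0

# Same voltage, 3.7GHz -> 4.0GHz: power grows only linearly with the clock.
same_voltage = dynamic_power(ASSUMED_CPU_POWER_W, 3.7, 4.0)
print(f"3.7 -> 4.0GHz at constant voltage: +{same_voltage - ASSUMED_CPU_POWER_W:.1f}W")

# Raising the voltage as well makes the cost grow much faster (V^2 term).
with_voltage = dynamic_power(ASSUMED_CPU_POWER_W, 3.7, 4.3, 1.24, 1.368)
print(f"3.7 -> 4.3GHz with 1.24V -> 1.368V: +{with_voltage - ASSUMED_CPU_POWER_W:.1f}W")
```

Under that assumption, a constant-voltage bump costs only a few watts, while the voltage-assisted overclock costs several times more.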
 
btw, 8GB RAM is too little for me (previously I thought that I could do with 4GB, but then I got 64-bit Windows), especially because I don't use virtual memory. Windows 7 takes 1-1.5GB and Kaspersky chews 1GB of RAM when I start a full system scan, and after using Chrome for a few hours with a few tabs (30-50 only :p because my monitor is only 18.5") I get a Windows notification telling me that I ran out of memory.

Technically, you ALWAYS use virtual memory. Windows is designed so that every allocation of RAM by the OS is done in terms of virtual, not real, memory addresses. When you disable the page file, Windows simply loses its disk backing store, so every committed page has to fit in physical RAM; addresses are still translated through the page tables. You don't stop using the virtual memory subsystem.
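
A toy model may make the point clearer. This is only a sketch: `ToyAddressSpace` and its methods are hypothetical names, and it is in no way how Windows actually implements paging. It just shows that address translation still happens with the page file off; the only thing missing is somewhere to spill pages when RAM runs out.

```python
PAGE_SIZE = 4096  # bytes; the common x86 page size


class ToyAddressSpace:
    """Toy model of one process's page table (hypothetical, not the Windows design)."""

    def __init__(self, free_frames, has_page_file):
        self.free_frames = list(free_frames)  # physical frame numbers still available
        self.has_page_file = has_page_file
        self.page_table = {}                  # virtual page number -> physical frame

    def commit(self, virtual_page):
        """Back a virtual page with a physical frame, if one is left."""
        if not self.free_frames:
            if self.has_page_file:
                raise NotImplementedError("eviction to disk is out of scope here")
            raise MemoryError("out of physical memory and no page file to fall back on")
        self.page_table[virtual_page] = self.free_frames.pop()

    def translate(self, virtual_address):
        """Translate a virtual address to a physical one through the page table."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        return self.page_table[page] * PAGE_SIZE + offset


space = ToyAddressSpace(free_frames=[7, 3], has_page_file=False)
space.commit(0)
# Virtual address 100 lands in whatever frame the OS picked, not at physical 100.
print(space.translate(100))
```

Committing a third page in this example raises `MemoryError`, which is roughly the "ran out of memory" notification described above.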
 

truegenius

Distinguished
BANNED
^
to me, it looks like a poor overclocking chip

stock voltage is around 1.24V, and voltage at load at 4.3GHz is 1.368V,
so that chip requires a ~10% increase in voltage to get a ~16% overclock (maybe the iGPU, or it being an ES, is the cause of the smaller overclock)

if we compare this to Phenom:
the stock voltage of my Phenom is 1.4V, and if I add 10% to it, it becomes ~1.55V
with a 10% increase in voltage a Phenom II can also hit 4.3GHz stable, and 4.3GHz over 3.2GHz is a ~35% overclock. Generally, Phenoms (at least lower-clocked versions like the 1090T) don't require a voltage boost to get a 20% overclock
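
For anyone who wants to check the percentages, here is a quick sketch of the arithmetic using only the voltages and clocks quoted in this post (`pct_increase` is just a helper name for illustration).

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0


kaveri_voltage = pct_increase(1.24, 1.368)  # stock voltage -> load voltage at 4.3GHz
kaveri_clock = pct_increase(3.7, 4.3)       # Kaveri base clock -> 4.3GHz
phenom_clock = pct_increase(3.2, 4.3)       # Phenom II 1090T base clock -> 4.3GHz
phenom_voltage = 1.4 * 1.10                 # Phenom stock voltage plus 10%

print(f"Kaveri: +{kaveri_voltage:.1f}% voltage for +{kaveri_clock:.1f}% clock")
print(f"Phenom II: {phenom_voltage:.2f}V (+10%) for +{phenom_clock:.1f}% clock")
```

The Phenom voltage works out to 1.54V, so the ~1.55V above is a slight round-up.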
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780


Trying to find the good median of software to performance is hard. In a perfect world everyone would run Gentoo or Funtoo and they'd get the best performance possible. I'd be interested in getting my hands on a Haswell and installing Gentoo on it and then comparing it to Windows but I don't have that luxury.

I have done a little playing with x264 compiling in Gentoo and on my FX it was one of the lower yielding performance gains. IIRC it was about 10%. But I recall x264 having a lot of hand tuned assembly, so I wonder if the Haswell binary is a version of x264 that's compiled differently, is hand optimized for Haswell, or some sort of combination of the two.

Haswell saw massive gains in Dolphin Emulator too. The kind of gains I expect to see from recompiling programs in Gentoo on my FX.

I've been trying to get Dolphin to compile with GCC and bdver2 use flags with mingw in Gentoo. I had a good native linux version but I'm forced to use the OpenGL plugin, which defeats the entire purpose of speeding up the CPU because the OpenGL plugin is so slow. If I get it with mingw, I can have a windows binary with GCC optimizations for AMD. I think it would give AMD a massive fighting chance in Dolphin.

I am planning on doing some solid comparisons between Gentoo optimized for bdver2 (piledriver) and Windows, but I am slow at doing things, so it's going to take some time. I am kind of waiting on getting a Jaguar chip too.

EDIT: haha as soon as I post I'm having issues, I get x86_64-emerge to start compiling and making dependencies. Maybe I will have some luck soon. I am making some progress. There are no documents or anything online regarding how to compile from Gentoo with Mingw to Windows, so I'm going it alone here.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


One doesn't need a source-based distro to see the benefits of the architecture. Several Windows sites reviewed Haswell with updated binaries. E.g. with an updated x264 binary, the i7-4770k's advantage over the i7-3770k went from 7% to 16%. The biggest advantage that I can recall now was C-ray under Linux: the i7-4770k was 40% faster than the i7-3770k.

The gains of Kaveri with HSA enabled software are bigger as shown before.
 


Eh? The OGL plugin is faster than the old DX9 one. And all of them should be significantly faster than the software renderer, unless something is REALLY mucked up.
 


Except you're ignoring ~50% of the silicon on the CPU. The GCN iGPU eats up more power than the previous VLIW4 design. That translates into more of the TDP having to be used for the iGPU, so less is available for the CPU to clock up. When AMD designs these chips, they have to take both components into account when setting specs. If you have a dGPU then you can safely ignore the iGPU's thermal usage, but then why would you be using an APU in the first place?
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


I think you mean ~50% of the silicon on the APU. But you are missing my point. I am comparing a hypothetical 100W Kaveri (4.0--4.1GHz) against the 95W Kaveri (3.7GHz), with the iGPU being the same in both.

I will explain it again: A hypothetical 100W Kaveri would have the CPU clocked at 4.0--4.1GHz using the same voltage (iGPU unchanged); the reduction in CPU clocks to 3.7GHz is not due to bulk but due to the reduction in TDP.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Yes, the scores for C-ray with different -march settings in CFLAGS are:

Nocona --> 23.07 seconds
Core2 --> 22.95 seconds
Corei7 --> 22.95 seconds
Corei7-avx --> 22.84 seconds
Core-avx-i --> 22.83 seconds
Core-avx2 --> 17.02 seconds

where Nocona is the target for the old Xeons, Core2 is for the original Intel Core 2 CPUs (SSSE3 support), Corei7 is Nehalem, Corei7-avx is Sandy Bridge, Core-avx-i is Ivy Bridge, and Core-avx2 is Haswell.
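
As a rough sketch of what those timings mean in relative terms, the snippet below turns the listed run times into speedups against the Ivy Bridge target. The numbers are the ones from the list; the dictionary keys use the lowercase spellings that GCC accepts for -march.

```python
# C-ray run times in seconds, as listed above (lower is better).
cray_seconds = {
    "nocona": 23.07,
    "core2": 22.95,
    "corei7": 22.95,
    "corei7-avx": 22.84,
    "core-avx-i": 22.83,
    "core-avx2": 17.02,
}

baseline = cray_seconds["core-avx-i"]  # Ivy Bridge target as the reference point

for target, seconds in cray_seconds.items():
    speedup = baseline / seconds                  # >1.0 means faster than baseline
    time_saved = (1.0 - seconds / baseline) * 100.0
    print(f"{target:>11}: {speedup:.3f}x vs core-avx-i ({time_saved:+.1f}% run time saved)")
```

On the same Haswell chip, going from the Ivy Bridge target to the Haswell target is about a 1.34x speedup (roughly 25% less run time), while everything older than AVX2 sits within about 1% of each other.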

In the GraphicsMagick benchmark the gap is even larger: the i7-4770k is 42% faster thanks to AVX2.

Haswell's new FMA units (introduced alongside AVX2) also double the maximum floating-point performance: i7-3770k (224 GFLOPS SP) vs i7-4770k (448 GFLOPS SP).

Excavator will introduce AVX2 support and double-size FMAC units: Kaveri (118 GFLOPS SP) vs Carrizo (224 GFLOPS SP)*

* I assume that Carrizo base clock will be around 3.5GHz.
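
The peak figures above can be reproduced with cores x clock x FLOPs per cycle per core. A minimal sketch follows; the 16 and 32 FLOPs/cycle values for Ivy Bridge and Haswell are the ones quoted earlier in the thread, while the quad-core counts, the 3.5GHz Intel base clocks, and the 8 and 16 FLOPs/cycle for Kaveri and Carrizo are my assumptions, chosen because they reproduce the quoted totals.

```python
def peak_gflops_sp(cores, ghz, flops_per_cycle_per_core):
    """Theoretical peak single-precision GFLOPS (CPU cores only, no iGPU)."""
    return cores * ghz * flops_per_cycle_per_core


# (cores, base clock in GHz, SP FLOPs per cycle per core)
chips = {
    "i7-3770k": (4, 3.5, 16),
    "i7-4770k": (4, 3.5, 32),
    "Kaveri @ 3.7GHz": (4, 3.7, 8),    # ASSUMPTION: 8 FLOPs/cycle per core
    "Carrizo @ 3.5GHz": (4, 3.5, 16),  # ASSUMPTION: doubled FMACs, 3.5GHz base clock
}

for name, (cores, ghz, fpc) in chips.items():
    print(f"{name}: {peak_gflops_sp(cores, ghz, fpc):.1f} GFLOPS SP")
```

These are paper numbers only; real code rarely gets close to them.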
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Do you read? Earlier I mentioned a commercial x264 "real program" that comes with native AVX2 support. There are more. And people using source-based "real programs" can recompile them using -march=core-avx2
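
As a sketch of how one might choose that setting, the hypothetical helper below maps CPU feature flags (spelled as Linux reports them in /proc/cpuinfo) to one of the Intel -march targets from the C-ray list. The mapping is a deliberate simplification and an assumption on my part: it ignores AMD targets such as bdver2, and in practice `-march=native` lets GCC decide for you.

```python
def pick_march(cpu_flags):
    """Pick a GCC -march value for Intel CPUs from feature flags (simplified, hypothetical)."""
    flags = set(cpu_flags)
    if {"avx2", "fma"} <= flags:
        return "core-avx2"    # Haswell
    if {"avx", "f16c"} <= flags:
        return "core-avx-i"   # Ivy Bridge
    if "avx" in flags:
        return "corei7-avx"   # Sandy Bridge
    if "sse4_2" in flags:
        return "corei7"       # Nehalem
    if "ssse3" in flags:
        return "core2"
    return "nocona"


haswell_flags = ["ssse3", "sse4_2", "avx", "f16c", "fma", "avx2"]
print(f"CFLAGS=\"-O2 -march={pick_march(haswell_flags)}\"")
```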
 

NICE. \o/
just checked mc, $160 for an a10 7700k with asus A55BM-E combo.
meanwhile, newegg has dropped only 5 bucks.

otoh, core i3 4340 is $140 at mc.
 

Embra

Distinguished


Very nice. :) -minus BF4.

 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


"JOIN THE REVOLUTION. Buy this processor, get a Battlefield 4 game code. Ask your sales associate for a coupon. Offer good through March 29, 2014 or while supplies last. Coupon availability varies by store. "

You May get one. ;)
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780


The DX9 plugin has been deprecated for quite some time. The DX11 version is vastly superior; that's what you normally use now. Perhaps I should have been more clear: the OGL plugin is slower than the DX11 version.

Not to mention I have compiz running on a 1440p monitor and 1200p monitor. I spend all this time customizing compiler settings and running Gentoo, only to enjoy fglrx driver and poorly coded OGL applications causing problems with tons of eye candy enabled.

I'm not gonna whine about fglrx like everyone loves to do though. I've been using fglrx since mid 00s on my x1600 mobility, and I've been using windows ATI driver since Rage 128 Pro. People who think Catalyst and fglrx suck now must have amnesia, because old Nvidia driver from 8800GTS days and old ATI/fglrx sucked way, way more than fglrx and catalyst do now.

I still remember going back to Nvidia with 8800GTS. Everyone went "NVIDIA THE DRIVERS ARE SO MUCH BETTER!"

The first time the drivers crashed and SMARTGART wasn't there to save me from a BSOD, I was pretty pissed off at every idiot who told me the drivers were better for Nvidia, haha. At least we don't have to worry about that anymore.

I am just hoping for a Mantle plugin for Dolphin. It would more than likely help a ton with getting the program to thread better as the render thread would be spread amongst x number of cores.

I might even give it a shot, I wonder if people would send me bitcoins in exchange for it. I love me some bitcoins. Set up file server for house, install 6950, leave it mining 24/7, make profit from running home file server.
 
And So It Begins.

Opteron A1100 8 core, 64bit arm soc
http://www.anandtech.com/show/7724/it-begins-amd-announces-its-first-arm-based-server-soc-64bit8core-opteron-a1100
http://vr-zone.com/articles/opteron-a1100-amds-first-arm-based-soc/70876.html
http://www.techpowerup.com/197351/amd-announces-arm-based-server-cpu-and-development-platform.html
http://techreport.com/news/25977/amd-reveals-arm-based-opteron-a1100-series

Analysis: Is the microserver market real?
http://semiaccurate.com/2014/01/28/analysis-microserver-market-real/
part of it is beyond ze paywall.
 


Umm, x86 can address a whole lot more than that. Maybe you're just talking about the low-power Atoms?
 

Yep, currently the Atom servers and the Jaguar servers from AMD can only do 32GB.
 