AMD CPU speculation... and expert conjecture


juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680


I've got a 5450, a 6850, and an old Alienware with an Opteron; wonder if I could get it to mine.

 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680



Guess that answers that.
Yup, and I have a 9800 Nvidia card too... lol, three GPUs plus the Opteron... mine, you ancient conglomeration :eek: Monster mine!!! :bounce: :pt1cable:

 

truegenius

Distinguished
BANNED

you can check out how much you may earn at a particular mining rate
http://www.dustcoin.com/
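If you'd rather run the math yourself, the estimate behind those calculators is simple. Here's a rough C++ sketch; every constant is an illustrative placeholder, so plug in current network stats and your own rig's numbers:

[code]
// Rough mining-earnings estimate, same idea as the dustcoin calculator.
// Every constant below is an illustrative placeholder, not live network data.
#include <iostream>

int main() {
    const double my_hashrate_khs     = 600.0;              // your rig, kH/s (scrypt)
    const double net_hashrate_khs    = 150e6;              // network total, kH/s
    const double blocks_per_day      = 24.0 * 60.0 / 2.5;  // ~2.5 min Litecoin block time
    const double block_reward_ltc    = 50.0;
    const double ltc_price_usd       = 15.0;
    const double rig_power_kw        = 0.25;
    const double electricity_usd_kwh = 0.12;

    // Your expected share of the blocks found each day, times the reward.
    const double ltc_per_day = (my_hashrate_khs / net_hashrate_khs)
                               * blocks_per_day * block_reward_ltc;
    const double revenue     = ltc_per_day * ltc_price_usd;
    const double power_cost  = rig_power_kw * 24.0 * electricity_usd_kwh;

    std::cout << "LTC/day: "    << ltc_per_day
              << "  revenue: $" << revenue
              << "  power: $"   << power_cost
              << "  profit: $"  << (revenue - power_cost) << "\n";
    return 0;
}
[/code]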
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


I'm not missing your point; I'm just saying it doesn't make any sense for Intel's model. You're suggesting that Intel is introducing a third type of core to its Xeon + Xeon Phi model, such that Phi would be heterogeneous on its own, having an LCU and a TCU. No one is suggesting that besides you.

Why would Intel add that kind of complexity to a solution that is already working quite well after just one generation in use? Xeon (LCU) + Phi (TCU). They will continue to offer more efficient Xeons and more efficient Phis.

The "neo-heterogeneity" just means they offer a simplified/unified programming model for deployments of Xeon + Phi cores. Some liken it to the term HSA. Of course the end solution remains heterogeneous: Xeon + Xeon Phi.

 


I did mine. I had a few bitcoins before GPUs became pointless for that, then the site got hacked and I lost them.

I was doing Litecoin, then got tired of the noise, so I stopped. I don't want a 290X for mining; my 7970's fans are on their way out, and since I don't have a receipt, RMAing it is a pain in the arse. I want a replacement GPU for gaming, and I'm also thinking about the future, as I'm hoping VALVe releases a new game engine soon, and HL3.

Of course I am just hoping.

But yeah, most countries outside the US have not seen the insane price increases. Guess their e-tailers actually like their customers.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Was that $30 before or after the $10 or so in electricity? ;)
 

It's like I said: APUs don't make sense if you're adding a dGPU. The entire design is to have the graphics unit on the chip so you don't need a separate graphics card. That saves space and power, so you can go with a smaller design. Make it small enough and you can use a picoPSU, which is ridiculously small and allows for extremely tiny cases. I like referring to the M350 because it's the smallest case you'll get on the mini-ITX platform and it really emphasizes what you can do with it. VIA's original vision wasn't these monstrous micro-ATX cases branded "mini-ITX" fitted with 300W+ PSUs running a dual-slot dGPU; it was full-featured, cheap miniature computers that could do 99% of desktop computing at a fraction of the space and energy usage.
 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680


They are going further than just putting graphics on the chip; they are also looking at the massive parallel processing ability of the GPU/PPU (example: a 290X for mining). They will essentially become faster than a dedicated CPU alone. The dedicated CPU is coming to an end, I think, except for major high-end workstations and servers. The APU will soon be at enthusiast levels: cheaper, faster, and better than dedicated parts alone. As I stated before, the only thing I see staying around is the dGPU for intense gaming purposes.

When I watched AMD announce the APU, they specifically stated that they wanted the massive parallel processing power of the PPU (what they called it) for processing, not just graphics. They explained that a CPU searching a 10,000-page catalog could take up to 2 minutes, whereas a PPU could do it in 2 seconds (just what I remember them saying). The idea is more about parallel processing than actual graphics.

 


Realistically, that isn't going to happen.

I've explained several times the difference between scalar and vector processing and how code works in regard to each. The most you can hope for is a co-processor capable of specialized tasks, which can be fairly useful when properly employed. 90%+ of your code won't take advantage of it, but the parts that do would see a significant improvement.

If anyone doubts this, then they don't understand ASM and that those coprocessors are horrible at running integer logic.

To those thinking to themselves, "We don't use ASM anymore, why would he reference that instead of a high-level language?": it's because all code, regardless of language, compiles to binary before it hits the CPU. CPUs only understand binary machine code, so whether the code is compiled to binary at runtime (JIT Java) or during development (C++, etc.) doesn't matter; it's all binary before the CPU gets it. ASM is the closest thing to machine code that's still human readable. It's often a 1:1 translation between the code and the compiled binary, and thus it's very useful as a teaching tool to learn exactly how CPUs process data. Learning ASM also teaches that the vast majority of code is just moving data around memory or to/from registers, doing logical compares (integer logic) on the contents of the registers, then jumping to another piece of code depending on the results of those compares. The actual amount of math involved is fairly small in comparison. Array processors (GPUs / vector processors) really suck at those logical compares and at checking values in memory; they are slow and easily bottlenecked there. What they excel at is crunching large datasets of predefined numbers that require minimal logical work. So unless you're calculating the density of a neutron star, or modeling non-thermalized plasma on your home computer, a massive array coprocessor wouldn't be of much use.
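Just to put some code behind that, here's a minimal C++ sketch of the two kinds of work being contrasted; it's purely illustrative, not anyone's actual workload.

[code]
// Illustrative contrast between "typical" integer/branch code and the kind of
// loop an array processor is actually good at.
#include <cstddef>
#include <vector>

struct Node { int key; Node* next; };

// Typical program work: pointer chasing, compares, and branches. Each step
// depends on the previous load, so wide vector/GPU hardware gains almost nothing.
int count_matches(const Node* head, int wanted) {
    int hits = 0;
    for (const Node* n = head; n != nullptr; n = n->next)
        if (n->key == wanted)
            ++hits;
    return hits;
}

// The kind of loop array processors are built for: a long, predictable stream
// of independent multiply-adds with no data-dependent branching. A compiler can
// auto-vectorize this, and it maps trivially to a GPU kernel.
void scale_and_add(std::vector<float>& y, const std::vector<float>& x, float a) {
    for (std::size_t i = 0; i < y.size(); ++i)
        y[i] = a * x[i] + y[i];
}
[/code]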
 

truegenius

Distinguished
BANNED


It's after subtracting the electricity bill :D
It means I will get 4 cents per hour for not using my PC :( That's too low.
Also, this crypto mining looks too good to be true.
I mean, I saw that a guy in a pool is earning around 160 LTC a day, which means about $72,000 per month (at the current LTC rate of $15 per LTC; when I started mining it was around $17). That's a supercharged Camaro ZL1 every month just for running a PC 24/7 :ouch: No ads, no formatting, no work, just relax and earn :heink:
I also want a Camaro, but after 30-35 hours of mining my LTC balance became 0.106534 :( Enough to buy a McDonald's burger and a Coke :p
This mining is stealing my peace of mind (at least at the current mining rate), thus not worth the hassle as of now :pfff: Will think about it after completing Sleeping Dogs.
 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680
Realistically, that isn't going to happen. I've explained several times the difference between scalar and vector processing and how code works in regard to each. [...] So unless you're calculating the density of a neutron star, or modeling non-thermalized plasma on your home computer, a massive array coprocessor wouldn't be of much use.

That's why AMD built the modular CPU they did (i.e. massive multi-threading ability); they want this to work as closely together as possible. The PPU runs its thing and communicates to the CPU what it can no longer process: heterogeneous computing. The PPU will do all the major grunt work on the parallel side and shift all other work to the CPU. Combined, they will use less power, with each doing what it does best: the CPU for serial processing, the PPU for parallel processing. The programs that can use the PPU/GPU side of these chips extensively haven't hit yet. That's what we can't see at the moment: the CPU and GPU/PPU working so closely together.

In no way am I saying you're wrong; that's just what I am seeing and hearing. I don't think dedicated components will go away entirely for at least another 4-5 years at the enthusiast level; after that I don't think we will see anything dedicated except the dGPU for intense gaming purposes.
I think we will only see dedicated CPUs and PPUs in major high-end machines as $500+ parts. Everything from there down will be dominated by APUs. But the other side is that APUs will be at the enthusiast level by then; buying a $200 AMD APU in the future would be like buying an 8350 with a 290 on the same die. You would really only need a good graphics card to handle the game.
I disagree with juan that dedicated parts will go away entirely; that surely won't happen, as different machines are dedicated to certain tasks and it would be a waste for one or the other compute unit to sit there idling all the time.
I misstated earlier that AMD will shift all processors to APUs; I should have said mostly APUs, with low production of dedicated parts for certain tasks. Though I think it will be very heavy in dGPUs for gaming, as that will just intensify beyond APU capability.

 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Since AMD presented its server roadmap and plans in early 2013, we knew that AMD was not releasing Steamroller Opteron CPUs. I was one of the first to predict that the server roadmap implied AMD was killing the idea of an FX Steamroller CPU. I received lots of replies, from a hard "you have no idea, what you say is impossible" to a weak "wait for the desktop roadmap". All the roadmaps for 2014 and 2015 are now well known and confirm what I said.

I assume that you have no link to give in response to my request.

I already gave you a link from Intel acknowledging that the problem is in the fabrication process. I also told you that Altera is considering leaving Intel for TSMC. The problem cannot be due to the Broadwell arch.

Neither the Apple A7 nor Nvidia's Denver uses ARM core designs as a base. Both companies are designing their own cores from scratch, especially Nvidia, whose design is derived from their initial x86 core. I already mentioned this to you.

Apple's predecessor A6 SoC (32-bit) already had OoO. Even standard Cortex cores from ARM have been OoO designs since 32 bits. I think you should also compare performance before making claims about complexity:

http://www.anandtech.com/show/7335/the-iphone-5s-review/6

The Apple A7 SoC (a dual core @ 1.3GHz) is able to compete with "the best AMD and Intel have to offer in this space", aka quad-cores at higher frequencies: 1.46GHz for Intel (Turbo of 2.39GHz) and 1.5GHz for AMD. You need to double the number of x86 cores and maintain higher frequencies to match the performance of ARM cores from Apple.

The ultra-high-performance APU from Nvidia that I mentioned in the past is designed for 10nm.



ARM has already fabbed 14nm chips:

[Image: ARMFinFetSamsung_689.jpg]


And I already mentioned the next slide to you. Check the part about taping out a 10nm chip:

[Image: ARMFinFet_10nmTapeOut.jpg]

 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Intel is introducing a new core, the KL core, for the new Phi. I already gave you a link with details of the new core arch.

I have given you a quote from an Intel representative saying that the new KL Phi further advances "heterogeneous computing" (his own words).

The discrete card version of Phi cannot compete against the discrete cards from Nvidia, neither in raw performance nor in ease of coding. CUDA GPUs are about 2x faster than Phi. People prefer Xeon+CUDA rather than Xeon+Phi. Intel is trying to change the rules of the game with the new socketed version. If you purchase the socketed Phi and it works well alone, without requiring a discrete card, then you don't need to purchase from Nvidia or AMD anymore.

As mentioned above, "neo-heterogeneity" is a marketing term. Intel is not offering a simplified/unified programming model. In fact, the kind of modifications that your code requires to work efficiently on a Phi is close to the modifications needed for CUDA. The only difference is that the CUDA tools make the optimization simpler.

Results gathered on Intel’s Xeon Phi were surprisingly disappointing… It took quite some effort to create solutions with good performance due to vectorization tuning, despite that the Xeon Phi is said to be easily programmable.

http://link.springer.com/chapter/10.1007%2F978-3-642-38750-0_25

On the Phi front, the porting of so many users' code was relatively simple, which was beneficial in terms of getting up and running, but there's far more to the story past the pure port. According to TACC Director of Scientific Applications, Dr. Karl Schulz, getting code clicked over to Phi is the relatively easy part (unless they’re reliant on a large number of third party libraries). It’s getting the code optimized that's the real challenge.

[...]

“You can port easily, but the things you do in CUDA to vectorize your code still have to be done for Phi,” he explained.

http://archive.hpcwire.com/hpcwire/2013-05-17/saddling_phi_for_tacc%E2%80%99s_stampede.html

In short:

HSA >> x86|ARM + CUDA >> "neo-heterogeneity"
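To make the quoted point concrete, here's a minimal C++ sketch of the kind of rework "porting vs. optimizing" refers to; it assumes an OpenMP 4.0-capable compiler and is purely illustrative, not code from TACC or the paper above.

[code]
// A loop that "ports" anywhere unchanged vs. the tuned form a wide-vector
// target (Phi-style or GPU-style) actually needs. Illustrative only.
#include <cstddef>
#include <vector>

// Naive port: compiles and runs, but the data-dependent branch and the serial
// accumulation keep the compiler from using the wide vector units.
float dot_above_threshold(const std::vector<float>& a,
                          const std::vector<float>& b, float thr) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] > thr)
            acc += a[i] * b[i];
    return acc;
}

// Tuned version: branch turned into arithmetic and the reduction made explicit,
// so the loop vectorizes. This restructuring is the same kind of work CUDA
// would also force on you.
float dot_above_threshold_simd(const std::vector<float>& a,
                               const std::vector<float>& b, float thr) {
    float acc = 0.0f;
    #pragma omp simd reduction(+:acc)
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float keep = (a[i] > thr) ? 1.0f : 0.0f;  // branchless mask
        acc += keep * a[i] * b[i];
    }
    return acc;
}
[/code]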

 


That's why AMD built the modular CPU they did (i.e. massive multi-threading ability); they want this to work as closely together as possible. [...] That's what we can't see at the moment: the CPU and GPU/PPU working so closely together.

Simple question: How is the PPU (or APU, or whatever we call it) going to know which code to process, and which code to leave up to the CPU?

Hence the root issue here: you need S/W integration, and, as I've said MANY times since BD was first announced, that is NOT going to happen. GPUs and the like only benefit when stuff scales, and the things that are easy to offload to GPUs already have been.

You also ignore OS-side problems: you can't use an AMD APU with an NVIDIA dGPU, for instance, because the Windows driver model only supports one primary graphics driver at a time. Which makes handling different use cases a LOT harder for the developer.

Then you get the classic performance questions: "What if I have a REALLY strong CPU and a REALLY weak GPU, and only vector processing? Should I offload? How much? How do I maximize performance? What if the reverse case is true?" Optimizing for one config leads to a regression for another. You can't guess at the performance characteristics of the hardware you are running on.

So putting aside Palladin [who is 100% correct in what he says here], I argue that the mere fact that you need to handle the loading in S/W makes AMD's approach not work.
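To make that concrete, here's a rough, hypothetical C++ sketch of the dispatch decision the application itself has to make; the function names and the threshold are made up for illustration and don't correspond to any real HSA or driver API.

[code]
// Hypothetical sketch of the software-side offload decision described above.
// process_on_cpu / process_on_gpu and kOffloadThreshold are illustrative only.
#include <cstddef>
#include <vector>

// Stub paths; real code would have a scalar loop here and a GPU kernel launch there.
static void process_on_cpu(std::vector<float>& data) { for (auto& x : data) x *= 2.0f; }
static void process_on_gpu(std::vector<float>& data) { for (auto& x : data) x *= 2.0f; }

void process(std::vector<float>& data) {
    // The hardware can't make this call for you: the app has to guess whether the
    // transfer/launch overhead is worth it, and the right threshold differs on a
    // strong-CPU/weak-GPU box vs. the reverse.
    constexpr std::size_t kOffloadThreshold = 1u << 20;  // made-up tuning knob

    if (data.size() >= kOffloadThreshold)
        process_on_gpu(data);   // big, regular work: maybe worth the offload
    else
        process_on_cpu(data);   // small or branchy work: keep it on the CPU
}
[/code]

Pick that threshold wrong for a given CPU/GPU combo and you get exactly the regression described above.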
 

Sigmanick

Honorable
Sep 28, 2013
26
0
10,530
So, whether you like WCCFTech or not, we have some (I hate to admit this) second-hand benchmarks for HSA software. Assuming this information has not been previously posted, and that these are 100% valid, I am impressed. The first set of benchies is x86 code, the second set is HSA enabled.

http:/Korean benchies via WCCF/
 

juggernautxtr

Honorable
Dec 21, 2013
101
0
10,680


A 7850K with full-on HSA whopping a 4670K Intel.
That's a pounding.

 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810



I wouldn't put any money in it myself because there is zero consumer protection. One virus can wipe your wallet. If my cc gets stolen I can get it blocked and charges reversed. Same with a checking/savings account.

IMHO, the spare CPU/GPU time is better spent on F@H or something. Everyone gets affected by cancer and other diseases at some point in their life.

70k a month would be like half a million dollars in hardware expense, requiring a couple of people just to set up and maintain the hardware, plus rent on a facility that can power and house that much equipment. Or he's a rogue IT guy who infected his own network for cash. ;)
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810



Xeon Phi has homogeneous cores. Intel fully intends that Xeon Phi will be coupled with Xeon in an HPC deployment; that is their game plan for heterogeneous HPC. I haven't seen anyone besides yourself even suggest that there would be HPC deployments of just Phi. Phi does not have any high-speed cores; they are all slow cores.

Yet I have seen discussions about a DP (2-way) or MP (4-way) system where there would be one Xeon and one Phi. Or 2 Xeons and 2 Phis, or 1 Xeon and 3 Phis. This is what socketing allows them to do.

Change the subject to which is going to perform better or which is easier to code for if you like, but I have no skin in that game. Nvidia has a full suite of tools and Intel has a full suite of tools. Engineers can cry all day about how hard their job is, but it doesn't make them look good; they're paid to get results. If it were just push-button, their jobs would be outsourced to interns.

As this discussion is going nowhere, let's just bookmark this SuperMicro page and see what is offered later in the year, shall we?

http://www.supermicro.com/products/nfo/Xeon_Phi.cfm


 

Sigmanick

Honorable
Sep 28, 2013
26
0
10,530
Wouldn't you just love to be able to set up cgminer on 30-40 computers to run during the off hours and not pay a dime in utility costs or hardware? All those machines tied to your wallet. Even better if you could keep those machines mining after you are fired.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Getting fired would be the most lenient outcome. People have done it, though.

http://www.wired.com/science/discoveries/news/2002/01/49961

"McOwen was facing a fine of up to $415,000 and eight felony counts of computer theft and computer trespass, which could have resulted in anywhere from eight to 120 years in jail, for loading a distributed computing client onto PCs at DeKalb Technical College in Atlanta, Georgia, where he worked until December of 1999. "
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Amazon has some at retail price, in very low quantities. With reviews promoting the Litecoin fad like this one, lol. :)

"My power supply didn't support this, but it worked in my teenage-daughter's computer. She's now making $5 a day mining LiteCoins and is really happy, considering that we had been paying her $4/month for allowance. "


 

truegenius

Distinguished
BANNED

:beurk:
And I was thinking of doing this by putting a batch file in the startup folder of every net cafe, training institute, and college computer :miam:
But I am retiring from this whole mining job now :mmmfff:
Though if anyone needs 41.54191457 DOGE and 0.106534 LTC, you can PM me your wallet address :ange: (because all these coins are hypnotizing me to mine more :hebe: and have made me a night owl :p ). That's all I managed to get in the past few days :whistle:
 


That benchmark you posted doesn't show anything but a Bay Trail, an A4-5000, and the Apple A7 in a bunch of web browser benchmarks. How is that equal performance? Those benchmarks show nothing, really, and besides the CPU they also depend on which browser is used; since each CPU was tested on a different OS (Windows 8 for the AMD A4, 8.1 for the Atom, and iOS for the Apple A7), meaning a different browser for each (IE10, IE11, and Safari), it is a null and void comparison.

The only way to get a truly equal comparison is to compare the chips on an equal OS with the same software, thereby eliminating any per-OS optimizations (iOS is optimized for Apple's hardware, much like games for a 360 are optimized for that specific hardware). Without that equal playing field, you can make a weaker CPU like the A7 look as powerful as a Core i5 through software optimizations.

And when I see the 14nm chips out (Intel has fabbed them, they just haven't gotten the yields they desire) from Samsung or whoever is putting up the billions to set up the fab, I will believe it. ARM is not really doing any fabbing or process manufacturing; they develop archs and sell them to others.



I am not sure about WCCFTech. They posted, and had people believing, every rumor for the R9 290 series, even the one showing Bulldozer cores with a large GPU attached.

I will wait and see before I judge HSA.

My issue is that, as with Mantle, the software has to be made to take advantage of the feature, and if only a select few titles do, that renders such abilities null and void.

One thing about Intel is that they always work with software vendors to push their newest tech abilities. Of course Intel has the money while AMD is still not quite there, but they could find a mutual way to get it sorted.



You can, and everyone should, back up your Bitcoin/Litecoin wallet to a USB stick. Hell, it is a small file, so you can easily throw it on a cloud drive, a thumb drive, and an eHDD to be extra safe. And that should be done daily.
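If you want to automate that, here's a rough C++17 sketch of a daily datestamped copy; the wallet path and backup drive are placeholders, so point them at your actual wallet.dat and your USB stick and run it from Task Scheduler or cron.

[code]
// Rough sketch of a daily wallet backup. Paths below are placeholders.
#include <cstdio>
#include <ctime>
#include <filesystem>
#include <iostream>
#include <string>

int main() {
    namespace fs = std::filesystem;
    const fs::path wallet     = "wallet.dat";          // assumed wallet location
    const fs::path backup_dir = "E:/wallet-backups";   // e.g. a USB stick

    // Build a datestamped name like wallet-20140115.dat
    std::time_t t = std::time(nullptr);
    char stamp[16];
    std::strftime(stamp, sizeof stamp, "%Y%m%d", std::localtime(&t));

    std::error_code ec;
    fs::create_directories(backup_dir, ec);
    fs::copy_file(wallet, backup_dir / ("wallet-" + std::string(stamp) + ".dat"),
                  fs::copy_options::overwrite_existing, ec);

    if (ec) { std::cerr << "backup failed: " << ec.message() << "\n"; return 1; }
    std::cout << "wallet backed up\n";
    return 0;
}
[/code]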



That is $200 higher than MSRP. The 290 was supposed to be $400, and $550 for the 290X. I would say add $50 for the aftermarket cooling and BF4, so $450 for that one, still $150 higher than it should be.

http://www.shopblt.com/cgi-bin/shop/shop.cgi?action=thispage&thispage=0110040015014_BTQ7722P.shtml&order_id=!ORDERID!

That's about where the price should be.
 

logainofhades

Titan
Moderator


How do you figure they are staying out of the price gouging? The R9 290's MSRP is in the $400 range.
 