8350rocks :
juanrga :
No.
Before starting: could you snip the replies to other authors before replying to me, or is that too much work for you? I have done it for you this time.
First. I gave you baselineS ==> Plural ==> More than one.
Baselines of what 64 bit server capable ARM architecture?
LOL. Do you want a 64-bit ARM baseline for the _first_ 64-bit ARM chip? That is very easy to solve, because only one baseline is possible. Do you need a hint?
8350rocks :
Second. A9/A15 are not "x86 CPU baseline".
No, they're tablet chips, which, as you so astutely pointed out earlier, are not the same thing as a DT or server CPU.
Nope. I said «A9/A15 are not "x86 CPU baseline"».
The A9 and the A15 can be found in phones. They can also be found in some _early_ ARM servers.
8350rocks :
Third, one can compare ARM performance to x86 performance, in the same way that one can compare PowerPC performance to x86 performance, and MIPS performance to ARM performance... This is standard practice.
Comparing is one thing; making assumptions based on marketing slides and hype, with no currently equivalent architectural baseline, involves too many assumptions and extrapolations to be useful. We will find out what Seattle does in 2H 2014. How about no more ARM discussion until we at least have ES's? Everyone good with that? I know I would be fine with it.
You don't need an "equivalent architectural baseline" to compare the performance of ARM to x86, PowerPC to x86, MIPS to ARM, ARM to PowerPC...
If you don't want to discuss ARM any further, you can stop posting. If you stop posting nonsense, I will stop correcting it.
8350rocks :
Fourth. Nobody told you that ARM and x86 perform equally in many tasks. We know that some tasks will be best suited to ARM and others best suited to x86. That was emphasized before. That is why the word "about" is used. That is why the symbol "~" is used.
As per above, you are making too many extrapolations from minimal data. It would be like trying to extrapolate how long a 1960s supercomputer would take to run modern x86-64 instructions if you converted it from punch-card instructions. You might be close... you might miss the mark by an entire galaxy's worth of error. It's useless information.
How many extrapolations do you believe are being made? 200? 58? 3.1416?
Your claim that comparing the A57 to Jaguar, Piledriver, or Sandy Bridge is like comparing a 1960s punch-card supercomputer to a modern x86-64 chip is either a clever joke, or you have gone overboard with _your_ notion that x86 is modern and complete whereas ARM is incomplete and old.
8350rocks :
Unsurprisingly, this also happens when comparing x86 to x86. We know that an FX-8350 can be faster than an i7-3770K in some tasks but barely match an i3 in others. But this has never stopped you from making x86-to-x86 comparisons or from making comments against the i3. Another instance of how you use double standards.
When comparing them, we have as close to an actual "fair" comparison as possible. I also would like to point out that I recommend most people take benchmarks with a grain of salt as you can never be entirely sure of 100% of the variables in play. I give that advice with working hardware you can buy in the real world. Imagine how I feel about trying to guess numbers from something that doesn't exist, and has no prior precedent from a similar architecture...?
You only make that recommendation when it is about x86. When you gave your 30% figure for Steamroller, you didn't mention any "grain of salt", nor did you use any benchmark. You merely claimed the expected performance of "something that doesn't exist". The fact that you use the "doesn't exist" argument only against ARM reflects, again, how you play by double-standard rules.
8350rocks :
Fifth. Evidently the A7 and A9 are not the same thing as the A57, but we can measure performance. Benchmarks of the A57 against the A15 exist. Benchmarks of the A15 against the A9 exist, and so on.
As you so astutely mentioned directly above when someone showed you the inferiority of ARM in compute benchmarks:
"Tablet chips != DT or server chips, this is not a fair comparison"
So, trying to extrapolate data from incomparable architectures is ok now? Is that because you are doing it and it serves your purpose? Because everyone else is scratching their heads. It would be like trying to extrapolate Xeon E5 performance from an Intel Atom... or trying to make assumptions about Steamroller based on Temash performance. Yet according to you, we are ludicrous for trying to do such things. However, it's clearly ok for you to throw around those numbers and claim they're comparable... right? :sarcasm:
You are mixing things up; no wonder you are so confused. You cannot compare the _raw performance_ of an ARM phone chip to the _raw performance_ of an x86 DT chip and pretend, as _you_ do, that this shows x86 is faster. Anyone who does that either has absolutely no idea about the topic or is being dishonest.
Of course, that is different from using a phone chip as a baseline to estimate the performance of a server chip, which can be done independently of the architecture (see the sketch below).
Nobody is _extrapolating_ the performance of a Xeon E5 from an Atom, nor _assuming_ Steamroller performance from Temash.
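For what it is worth, here is a minimal sketch in C of what that kind of baseline scaling looks like; every number below (baseline score, IPC gain, clocks, core count) is a hypothetical placeholder for illustration, not data for any real chip:

/* Minimal sketch of scaling a per-core baseline up to a server-class part.
   All figures are hypothetical placeholders, not data for any real chip. */
#include <stdio.h>

int main(void)
{
    double baseline_score = 100.0;      /* measured per-core score of the phone chip  */
    double ipc_gain       = 1.30;       /* assumed per-clock gain of the newer core   */
    double clock_ratio    = 2.0 / 1.6;  /* assumed server clock over phone clock      */
    int    cores          = 8;          /* core count of the hypothetical server part */

    double est_per_core = baseline_score * ipc_gain * clock_ratio;
    printf("Estimated per-core score: %.1f\n", est_per_core);
    printf("Estimated chip score:     %.1f\n", est_per_core * cores);
    return 0;
}

The point is only that the scaling arithmetic is architecture-independent; the quality of the estimate depends entirely on how good the assumed factors are.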
8350rocks :
Finally. The only ludicrous thing here is your anti-ARM crusade: from your initial nonsense that CISC means "Complete" while ARM is not, to your recent "what is the baseline", by way of your attempt to compare architectures using x86-chip versus ARM-cluster benchmarks. LOL
Ok, I have had enough of this reference. Explain to me what you understand the difference between CISC and RISC to be, without cutting and pasting from Wikipedia. You have still failed to do so, and I think you misunderstand the difference entirely. So let me break it down for you:
Could you get a RISC architecture to do everything a CISC architecture can? Sure, with enough coding effort you likely could.
HOWEVER:
Because you cannot use the same level of abstraction in the code, and the instructions are far simpler, your code would be simpler. This means it would take more code to do the same things comparatively, and the CPU would spend more time processing instructions. Why, you ask? Well, because when you have higher level instruction sets in the CPU uarch, you can use more advanced instructions that take longer than 1 clock cycle to run. This means less code can do more work because you can run more complex instructions that a RISC architecture would have to break down into multiple operations.
So, what that means is that in RISC, to do the same high-level abstraction, you would have more bloated code to get all of the same functionality. Your CPU would be more bogged down running code longer because it doesn't have the high-level instructions. Think Windows is huge on x86?? Want to bet a DT version of Windows for ARM architecture would be even more bloated if they included the same features? Want to bet it would run significantly slower too for many operations, because it would just take more time to process the extra code to implement the abstractions?
The answer to the above is yes, it would be slower, it would take more time to run code that requires higher level instructions in x86.
THAT is why ARM will not beat x86 in raw compute. It simply would not happen. No matter how much you try to brute force it, x86 is better at raw compute.
Now get off your dead ARM horse, and stop beating the poor thing, it's dead...ok?
Do your lack of an answer and your refusal to cite the same link again imply that you did know that you cannot evaluate the performance/efficiency of two CPU architectures by comparing a chip to a cluster?
Regarding CISC/RISC, I know enough to tell you the following.
In the first place, any modern CISC instruction set is so complex that it cannot be implemented directly in silicon. In any modern AMD or Intel chip, CISC instructions are internally translated into RISC-like uops, which are then executed by the hardware.
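As an illustration, here is a sketch of the idea (conceptual only, not the actual decoder output of any real core): a single x86 instruction with a memory operand is split into simpler RISC-like micro-ops, one for the memory access and one for the ALU operation.

/* Conceptual sketch only, not real decoder output.
   x86:   add rax, [rbx]      ; one "CISC" instruction
   uops:  tmp = load [rbx]    ; micro-op 1: memory access
          rax = rax + tmp     ; micro-op 2: simple register ALU add */
long add_reg_mem(long rax, const long *rbx)
{
    long tmp = *rbx;    /* micro-op 1: load */
    return rax + tmp;   /* micro-op 2: add  */
}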
_Your_ old claim that a RISC architecture cannot do everything a CISC architecture can in a natural way is based on _your_ confusion that CISC means Complete and RISC means Incomplete.
Since the CISC instructions are never executed directly in a modern CISC processor, but are first translated to RISC uops, all your arguments vanish into thin air. But I will bite.
First mistake. A RISC CPU doesn't spend more time processing instructions; it merely processes more instructions. Since the RISC approach minimizes the execution time of each instruction, because each one is simpler, the time needed to execute the program is minimized, _not_ maximized. This is RISC 101 stuff.
Second mistake. A repetition of the first. Your any-ARM-version-of-Windows-will-be-slower nonsense was dealt with above.
Third mistake. A ___genuine___ CISC CPU doesn't spend less time processing instructions; it merely processes fewer instructions, but since each instruction is more complex, the execution time of each instruction is longer, and the time needed to execute the program is _not_ minimized. Again, this applies to a genuine CISC CPU that executes CISC on silicon. Modern CISC CPUs are implemented as RISC on silicon, and the code actually executed by the hardware is RISC-like.
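The first and third points are just the classic execution-time equation: time = instruction count x cycles per instruction x clock period. A toy calculation in C, with numbers invented purely for illustration (they describe no real chip), shows that executing more instructions does not automatically mean taking more time:

/* Toy illustration of: time = instruction count * CPI * clock period.
   All numbers are invented for illustration and describe no real chip. */
#include <stdio.h>

int main(void)
{
    double clock_hz   = 3.0e9;                  /* same 3 GHz clock for both       */
    double risc_insns = 1.3e9, risc_cpi = 1.0;  /* more, but simpler instructions  */
    double cisc_insns = 1.0e9, cisc_cpi = 1.6;  /* fewer, but complex instructions */

    printf("RISC program time: %.3f s\n", risc_insns * risc_cpi / clock_hz);
    printf("CISC program time: %.3f s\n", cisc_insns * cisc_cpi / clock_hz);
    return 0;
}

With these made-up figures the RISC side finishes first despite executing 30% more instructions; what matters is the product of the three factors, not the instruction count alone.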
The funniest part of your anti-RISC rant is that your beloved CISC CPUs are really executing RISC-like code at the metal level. When you write rants against RISC, you are really writing rants about how an FX-8350 works at the most fundamental (silicon) level. I think Intel was the first to notice the advantages of the RISC approach and the first to implement RISC uops on silicon; AMD followed a bit later, if my memory doesn't fail me.
Fourth mistake. Using _your_ misunderstanding of RISC to claim that ARM cannot be faster than x86 is lovely, but the fact that you are unaware that some of the most powerful CPUs on the planet are RISC gives a degree of tragedy to your posts.
For instance, POWER7 offers 33 GFLOPS per core, whereas the Ivy Bridge i7-3770K offers 28 GFLOPS per core. POWER7 was replaced by POWER7+, and that in turn by POWER8, which IBM states is two to three times as fast as POWER7. Stop pretending that RISC cannot be as fast as x86.
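Those per-core figures come straight from clock rate times double-precision FLOPs per cycle. A sketch of the arithmetic, under my assumptions (8 DP FLOPs per cycle for both cores: 256-bit AVX add plus mul on Ivy Bridge, four FMA-capable DP pipes on POWER7, at the nominal clocks I recall):

/* Sketch: peak per-core GFLOPS = clock (GHz) * double-precision FLOPs per cycle.
   The FLOPs-per-cycle and clock values are my assumptions, quoted from memory.  */
#include <stdio.h>

static double peak_gflops(double ghz, double dp_flops_per_cycle)
{
    return ghz * dp_flops_per_cycle;
}

int main(void)
{
    /* Ivy Bridge i7-3770K: 256-bit AVX add + mul -> 8 DP FLOPs/cycle at 3.5 GHz */
    printf("i7-3770K: ~%.0f GFLOPS/core\n", peak_gflops(3.5, 8.0));
    /* POWER7: four FMA-capable DP pipes -> 8 DP FLOPs/cycle at ~4.14 GHz        */
    printf("POWER7:   ~%.0f GFLOPS/core\n", peak_gflops(4.14, 8.0));
    return 0;
}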
Again, if you don't want to discuss ARM any further, you can stop posting. If you stop posting nonsense, I will stop correcting it.
noob2222 :
juanrga :
Marketing didn't play any role in calculating the GFLOPs.
ya, ok so you came up with the figures from real world benchmarks on a chip that won't be made till next year and have the schematics for building it yourself. got it.
While (Juanrga = wrong) {printf("JUANRGA IS NEVER WRONG")};
Nope. I already explained to you how they were obtained. Maybe you should stop taking nightmare-code lessons from Mr. if(0!=1) then{} else{} and pay some attention to the material in the posts you reply to...