How much performance improvement does a hyper-threaded core provide?

leaf__00

Oct 14, 2016
Let's assume we are comparing two identical CPUs: same brand (Intel), same generation (architecture, clock, cache), same environment (cooling, temperature, OS, RAM speed, game version, motherboard, CPU revision, etc.), BUT one with and one without hyper-threading.
FOR EXAMPLE,
Let's say an i3-6100 is running the Cinebench R15 multi-core bench. What is the maximum performance improvement (as a PERCENTAGE) with hyper-threading compared to without hyper-threading?

OR

1. For example, take an i3-6100 with hyper-threading DISABLED and an i5-6600K with turbo boost DISABLED and overclocked to 3.7 GHz (just like the i3). SHOULDN'T the i5 provide exactly twice the performance (maximum performance) of the i3?
(Provided the software used has no problem maxing out cores and doesn't have any specific requirements, like Cinebench?)
I understand that if everything is the SAME BUT with DIFFERENT numbers of REAL CORES, performance should scale linearly.

3. I do understand people will say things like: "oh no, it won't, games use only a single core, no, you can't predict it, blablabla".
But we are talking about a controlled environment, or just the maximum possible performance.
Please don't make it any harder to ask or explain my post.

4. THE REASON TO ASK.
I just want to know the maximum possible raw performance that is achievable with hyper-threaded cores.

I'm still a noob and guessing Cinebench could do the job.
I don't have any of the CPUs mentioned above, so I'm hoping someone can provide the data.

Thank you very much if you can help.
 
1. Ideally you get a 100% performance boost from doubling the cores of a CPU on CPU-dependent software. On some tasks this is possible (you can actually execute threads simultaneously). Sometimes you can't, because a thread has to wait for results from other threads.

4. https://www.youtube.com/watch?v=N7jETCzhaVY Here's a video of two identical systems running side by side, one of them with hyperthreading on. And no, you won't get 100% extra performance, although there is a noticeable bump. Depending on the software, hyperthreading can also hurt the performance of a CPU. Hyperthreading allows almost seamless switching of task execution on real cores, which brings the extra performance (if you want to read more, google hyperthreading and thread execution; it's too long to explain here).

There are marketing reasons for having different versions of the same product(s).
 
HT has been around a long time; you could easily have researched this yourself. It lets the CPU execute some code from a second thread while the first logical core is waiting for another instruction to complete (i.e. due to a slow operation like a memory fetch).

It can provide a 5-20% or so boost to some multithreaded applications depending on the code and the workload. It is not in any way a replacement for a second real core. It doesn't help much in games.
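To put a rough number on the "fill the stalls" idea, here is a toy model in Python. It is only a sketch under loud assumptions: `smt_gain` and `busy_fraction` are made-up names, the second thread is assumed to use every stalled cycle perfectly, and the two threads are assumed never to fight over cache or execution units. Real chips do not behave this cleanly, so treat the output as a best-case ceiling, not a measurement.

```python
def smt_gain(busy_fraction: float) -> float:
    """Best-case relative throughput gain from a second hardware thread.

    busy_fraction: share of cycles a single thread keeps the core busy;
    the remainder is stall time (e.g. waiting on a memory fetch).
    ASSUMPTION (toy model): the second thread fills stalled cycles for free
    and the two threads never slow each other down.
    """
    if not 0.0 < busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be in (0, 1]")
    # Two threads together cannot keep the core more than 100% busy.
    combined = min(1.0, 2.0 * busy_fraction)
    return combined / busy_fraction - 1.0


for busy in (0.5, 0.75, 0.9, 1.0):
    print(f"single thread busy {busy:.0%}: best-case HT gain {smt_gain(busy):.0%}")
```

In this model a thread that already keeps the core 90% busy leaves room for about 11% at most, and you only approach 100% if a single thread stalls half the time, which is the "terribly optimised code with perfectly timed stalls" case.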
 
There is no way you'd get anything close to 100% in anything but terribly optimised (compiler) code with a lot of perfectly timed stalls. The most improvement you'll see will be from rendering/encoding apps, and even then 30% (nothing to sneeze at) might be the absolute peak improvement. This is probably because they fetch a lot of data rather than the core workloads fitting within the L1 or L2 cache.

All this being said, a CPU with only 2 real cores will benefit more from HT taking it to 4 logical cores than a 4-core CPU will from HT taking it to 8, because of the diminishing returns of having extra logical cores. But a 4-core CPU will easily outperform a 2-core one with HT in any well-threaded workload.
 
Solution
The trouble with ignoring all the situational variables that make determining performance 'difficult' is that it's unrealistic. If we ignore how real-world programs perform in various environments, or the differences from one application to the next regarding HT, we may as well say hyper-threading offers purple performance and smells like apples.

Theory doesn't process data, reality does, and unfortunately there's no 'perfect scenario'. I would say a 50% improvement due to HT over a non-HT core is extremely optimistic. In the real world, as mentioned, it is generally a 5-20% improvement overall, with extremes of up to about 30-32%. In other cases it is 0% improvement or worse: a detriment, where it's better to shut HT off.
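For anyone who wants to turn two benchmark runs into one of these percentages, a minimal sketch follows. The function name `percent_gain` and both sets of scores are invented purely for illustration; they are not measurements of any real CPU.

```python
def percent_gain(score_ht_off: float, score_ht_on: float) -> float:
    """Percentage change in a benchmark score when HT is switched on.

    Positive means HT helped, negative means HT hurt.
    """
    if score_ht_off <= 0:
        raise ValueError("score_ht_off must be positive")
    return (score_ht_on - score_ht_off) / score_ht_off * 100.0


# HYPOTHETICAL scores (HT off, HT on), made up for this example only.
runs = {
    "render-style benchmark": (300.0, 390.0),
    "lightly threaded game": (100.0, 98.0),
}

for name, (off, on) in runs.items():
    print(f"{name}: {percent_gain(off, on):+.1f}% with HT on")
```

The same helper covers both ends of the range above: a gain near 30% for a workload that threads well, and a small negative number for one where HT gets in the way.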

HT doesn't double cores or anything else; that's just how Windows sees it and reports it. If a CPU is dual core with HT then it has 2 cores, period. HT allows data to be processed during what would normally be stalls in the stream of data being processed; in other words, it keeps the physical cores that actually exist a bit busier and more consistently fed with data to process.

It does get a bit muddied, 'blah blah' as you put it, but that's the reality of it. To ignore the real-world performance and variables suggests you don't want to know. Breaking it down to make it simpler is fine for theory, but it's not reality, so there's really no point in theorizing about fictional things. At that point the concept is oversimplified to the point of being wrong. Long story short, having HT is usually better than not having it at all, and 2 physical plus 2 logical (HT) 'cores' are worse than 4 physical cores.

It's a bit hard to speculate using terms like 'raw performance'. What is raw performance? Is it processing game physics, is it encoding video, is it running virtual machines? In order to exhibit 'performance' it must be 'doing something'. The raw performance of a car parked in a driveway, not actually driving, and that of a rock are identical. It's only once the car begins to actually drive that performance can be measured.
 
You could add that HT makes a system more responsive when it is running as many heavily used threads as there are physical cores. This was especially noticeable on the first HT-enabled CPUs, which had only one physical core: no more lag when opening programs (you could actually do something while the other program was loading up)!
 


Well, the reason I sound like "regardless of variables or situation" is that I'm NOT trying to estimate real-world performance. That's a logic I understand, but it is not what I'm after. Else why would I mention it? To make it clear: if we want to estimate the effect of changing the value of just one variable, then the test carried out should have the values of the other variables fixed. As simple as that.

"Theory doesn't process data, reality does and unfortunately there's no 'perfect scenario'". Again, I'm not after the real-world scenario. Theory does process data, but in reality there are other variables to be considered. With that said, theory and reality are not opposed to each other (sorry, my English is not good). It doesn't work like that.

When I said "blabla", it's not ignorance, it's just like "etc.", got it? Chill, dude. And by raw performance (like I said, I'm a noob) I mean the maximum performance possible to achieve, let's say in video processing or just a benchmark.

But let's just get to a resolution. Like you said, 30-32%, yep, that's it. After very long research (and knowing my limits), I found it to be 33% (that is, 1 - 0.66).
I don't know if this is just a coincidence, but this is it. I know cache is not something Intel provides based on price or on a whim; it is there as a sufficient resource for the available pipelines (threads).
With the 2-core Pentium it's 3 MB; with the i3 (2 cores, 4 threads) it's 4 MB (+33% over 3 MB); with the i5 it's 6 MB (twice as many cores as the Pentium); and with the i7 it's 8 MB (+33% over the i5's 6 MB).
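The cache ratios quoted in this post can be checked with a few lines of Python. This is only an arithmetic check: the cache sizes are the ones given in the post (not verified against spec sheets), `pct_increase` is a made-up helper name, and a matching ratio does not by itself show that HT delivers 33% more performance.

```python
def pct_increase(base_mb: float, new_mb: float) -> float:
    """Percentage increase going from base_mb to new_mb."""
    return (new_mb - base_mb) / base_mb * 100.0


# Cache sizes in MB exactly as quoted in the post (ASSUMED, not verified).
cache_mb = {"Pentium": 3, "i3": 4, "i5": 6, "i7": 8}

print(f"Pentium -> i3 (HT added):      +{pct_increase(cache_mb['Pentium'], cache_mb['i3']):.0f}%")
print(f"i5 -> i7 (HT added):           +{pct_increase(cache_mb['i5'], cache_mb['i7']):.0f}%")
print(f"Pentium -> i5 (cores doubled): +{pct_increase(cache_mb['Pentium'], cache_mb['i5']):.0f}%")
```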
Thanks, anyway.

As for explanations of hyperthreading, I have read them so many times all over the internet, and you guys regurgitate them here. After all, none of us has sufficient qualifications to (truly) understand it, just enough to make a purchase. BUT I just need a number. Truly sorry if this sounds harsh, but I hope you understand; it's very simple.

 
The problem is that you're trying to isolate variables that are dependent upon one another. Like trying to make one angle of a triangle larger without decreasing either of the other two angles. The theoretical performance of hyperthreading, and non hyperthreading for that matter, comes down to efficiency. And while you can go ahead and calculate using 100% efficiency, it won't reflect any real scenario.

Another reason you can't say 4 cores should be exactly twice as fast as 2 cores is that "more" isn't the same thing as "faster." 4 people may be able to carry twice as much as 2 people. But they can't run any faster than 1 person could.
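The "more isn't faster" point is usually formalised as Amdahl's law, which this thread does not name, so take the following as one standard way to illustrate it and not as anyone's measured result. The function name and the parallel fractions below are chosen only for the example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel.

    parallel_fraction: share of the work that can be split across cores (0..1).
    The remaining serial share runs on one core no matter how many you add.
    """
    if not 0.0 <= parallel_fraction <= 1.0 or cores < 1:
        raise ValueError("need 0 <= parallel_fraction <= 1 and cores >= 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


# Illustrative fractions only; real programs have to be profiled.
for p in (1.0, 0.95, 0.8):
    ratio = amdahl_speedup(p, 4) / amdahl_speedup(p, 2)
    print(f"{p:.0%} parallel: 4 cores are {ratio:.2f}x as fast as 2 cores")
```

Only the perfectly parallel case gives exactly 2x going from 2 to 4 cores. With 80% of the work parallel, the step from 2 to 4 cores gives 1.5x under this formula.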
 
TMTOWTSAC may have explained it better with the triangle analogy.

I just meant that when you ignore real-world variables, the performance (or relative performance) figures being speculated about will be wrong. It's not as simple as saying disregard all the variables and the raw performance difference is xyz%. That percentage relies on the combination of variables surrounding the situation.

You're right, the cache does differ between an i3, i5 and i7. It will make some difference since cache is basically the ram or data storage closest to the cpu keeping it fed. When that runs out the slower dram has to be accessed to fill it back up.

If you've ever burned optical media to disc by creating a cd or dvd especially on older/slower pc's it's similar to the buffer that keeps the optical drive fed. Everything runs smooth so long as that buffer stays full, if it can't keep up with the process there's a bottleneck there. Obviously a cpu won't suffer buffer underrun errors and crash like the cd/dvd drive will.

Cache like L1 or L3 alone won't make a huge difference, but it's all part of the equation. With an i7 having 4 cores and HT keeping the cores working harder or more efficiently, it benefits from having slightly more cache than an i5. DRAM speed will also factor into the equation, as well as the demands of the program and everything else tied together.

You can have 2 school buses, one faster than the other flat out on a long, empty, straight road. But add in one of the variables, like picking up students, and say one student is consistently late boarding the faster bus: the slower of the two school buses will actually perform better and arrive sooner. If I discounted that variable and told you 'bus 1 is faster' yet bus 2 consistently arrived 5 min sooner every day, I would be giving you bad information.