Discussion: Thoughts on Hyper-Threading removal?

jnjnilson6

Distinguished
https://videocardz.com/newz/intel-a...-xe-cores-no-hyper-threading-and-ddr4-support

Hyper-Threading - the revolutionary technology Intel has kept going ever since the Pentium 4 days.
AMD adopted its own equivalent (SMT) only much later.
It was a huge deal back in the day. Core i7s had it and Core i5s did not, and it sure did provide a notable bang for the buck.

I would like it very much if you would share your opinions on the removal of Hyper-Threading in Arrow Lake CPUs and maybe recapture the vague nostalgic drift the technology brings about from the sun-strewn days of the Pentiums (P4s).

Thank you and do write up! :)
 

Order 66

Grand Moff
Apr 13, 2023
2,164
909
2,570
No! Hyperthreading should not be removed until there is a suitable alternative. I don't understand why it is being removed.
 

Eximo

Titan
Ambassador
What do you mean? Also, won't the performance loss be massive? I've heard that hyperthreading improves performance by about 40%.
Many of the side-channel attacks and vulnerabilities took advantage of SMT/HT in one way or another, and this applies to pretty much every CPU that has it. Removing it, along with its mitigations, should make things safer and more efficient.

The extra threads will be replaced by efficiency cores or small cores. So we won't really lose anything.
 

Order 66

Grand Moff
The extra threads will be replaced by efficiency cores or small cores. So we won't really lose anything.
I'm not saying you're wrong, it could be my lack of knowledge, but wouldn't replacing hyperthreading with e cores result in worse performance due to them being lower power and thus lower clocked? Or would it result in more performance due to more physical cores? I could see both things happening.
 

Eximo

Titan
Ambassador

If you presume that 40% figure, which is faster: a hyperthread on a P core, or an E-core thread running at 60-80% of the speed of a P-core thread? (It gets a little skewed when you look at something like a 6 GHz P core. The more normal chips stick to around 5 GHz.)
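As a rough back-of-the-envelope sketch of that comparison (Python; the function names are made up for illustration, and the 40% and 60-80% figures are just the ballpark numbers from this thread, not measurements):

```python
# Back-of-the-envelope comparison. All figures are the thread's rough
# assumptions, normalised so one thread alone on a P core = 1.0.
P_THREAD = 1.0
HT_UPLIFT = 0.40  # assumed extra throughput from a P core's second thread


def extra_from_hyperthread(uplift=HT_UPLIFT):
    """Throughput added by enabling the second thread on one P core."""
    return P_THREAD * uplift


def extra_from_e_core(relative_speed):
    """Throughput added by one E core running at a fraction of P-core speed."""
    return P_THREAD * relative_speed


def total_throughput(p_cores, e_cores=0, hyperthreading=False, e_speed=0.7):
    """Very naive sum of per-core throughput; ignores power and memory limits."""
    per_p_core = P_THREAD + (extra_from_hyperthread() if hyperthreading else 0.0)
    return p_cores * per_p_core + e_cores * extra_from_e_core(e_speed)


if __name__ == "__main__":
    print("one hyperthread adds:", extra_from_hyperthread())
    for speed in (0.6, 0.7, 0.8):
        print(f"one E core at {speed:.0%} adds:", extra_from_e_core(speed))
    print("8P with HT:     ", total_throughput(8, hyperthreading=True))
    print("8P + 8E, no HT: ", total_throughput(8, e_cores=8, e_speed=0.6))
```

Under those assumptions, even the slowest E core in the range adds more than a hyperthread does. The model deliberately ignores power budgets, cache and memory contention, so treat it as a way to frame the question rather than a prediction.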
 

kanewolf

Titan
Moderator
Once CPUs became non-uniform (P cores / E cores), hyperthreading just became more complexity for the scheduler. If you have a CPU-limited program, hyperthreading provides only small improvements.
 

Order 66

Grand Moff
If you presume that 40% figure, which is faster: a hyperthread on a P core, or an E-core thread running at 60-80% of the speed of a P-core thread? (It gets a little skewed when you look at something like a 6 GHz P core. The more normal chips stick to around 5 GHz.)
I'm a bit confused about Zen 4c cores. It seems that they are the same as normal Zen 4 cores, but smaller. I suppose with that, 8 Zen 4 cores + 8 Zen 4c cores would be better than 8 Zen 4 cores without HT, but Zen 4c cores also have hyperthreading (SMT). As far as Intel's efficiency cores go, on the i5-12600K for example, there is a significant 1.3 GHz difference between the max boost clocks of the E cores and the P cores (3.6 GHz vs 4.9 GHz).
 

Eximo

Titan
Ambassador
I did give a range of 60-80%; 3.6/4.9 is 73%, so that works. The clock speeds go up together until you hit the high-end K SKUs.

But we are talking about future CPUs. Existing products don't really apply.

If Intel removes hyperthreading and makes a successful product then others may follow.
 

Order 66

Grand Moff
Yes, but I just feel like this should be rolled out slowly so we don't have to suffer potential issues with a first-gen product, assuming AMD launches a competing product at the same time. If this somehow fails, then we may be stuck with lower performance until the next generation of these new non-HT CPUs releases. Just my thoughts; I don't know if anything I've said is even possible.
 

Eximo

Titan
Ambassador
I don't think there is a way to roll it out slowly. Either it has the capability or it doesn't. Just remember that Intel has upcoming silicon in testing long before we see it. So they will have tested how removing hyperthreading and relying more on E cores, LP cores, and AI-centric cores changes things.

Also keep in mind that desktop is never first. Mobile is where the money is really at, so you'll see laptops with this before we see it in desktop.
 

Order 66

Grand Moff
What about Zen 4c? I don't really understand how those are different, because they still have HT, unlike Intel's E cores. I suppose I don't understand how AMD would release something similar.
 

Eximo

Titan
Ambassador

They would design a new core. Again, we are talking about future CPUs. Zen 4 and Zen 4c already exist. Zen 5 is already in the works, but Zen 6 could have anything in it. I'm not sure I have actually heard about Zen 5c cores, but presumably they would make them again.

If they can drop all the security and SMT complexities in the hardware and free up space for more cores, it seems a logical thing to do.
 
HT isn't free performance; there's a fairly significant power consumption hit. I'd imagine removing HT will also simplify scheduling a bit. There's also the rentable unit concept, which has been kicked around, but I think it might be a bit early for that, architecturally speaking.

Keep in mind it is unlikely Intel is going to put out a desktop CPU that is slower than what it's replacing, as that has blown up in their face the two times it happened (though RKL was mostly a backport proof of concept). So even if HT is simply removed, I'd expect to see at worst the same level of multithreaded performance out of each tier.
 
The extra threads will be replaced by efficiency cores or small cores. So we won't really lose anything.
But if the extra threads stay and they add more cores then we would gain extra performance...
Also you would lose the ability to run 16 threads at high speed/clock and would only be able to run 8 threads at high speeds.
HT isn't free performance; there's a fairly significant power consumption hit.
Could you link to what you are basing this on?
I haven't seen anybody cover that for decades.
If the power hit is in line with the performance gain then who cares.
 
It is in line with performance gains in multithreaded workloads. The point I'm making is that removing it allows for that power to be spent elsewhere. So it's not like Intel doesn't get anything to work with should they be removing it.
 

Eximo

Titan
Ambassador
But if the extra threads stay and they add more cores then we would gain extra performance...
Also you would lose the ability to run 16 threads at high speed/clock and would only be able to run 8 threads at high speeds.
Of course, we could also have 16 P cores from Intel and run them at reasonable clock speeds to keep power in check. Or have lots of e-cores that do the same thing which is the way they went.

I wouldn't mind 8 P cores without hyperthreading as long as they have a decent 20% IPC lift, and paired with like 24 e cores for multithreaded work. Not that I really need that at the moment. All my heavy lifting gets done by older Xeons.
 
It is in line with performance gains in multithreaded workloads. The point I'm making is that removing it allows for that power to be spent elsewhere. So it's not like Intel doesn't get anything to work with should they be removing it.
But that power is only being used if it is being used, what difference would it make if that power provides performance with HT or with small cores?
Smaller cores will not be using less power to provide the same amount of performance, unless there will be more than one small core for every lost HT and that would not be profitable.
I wouldn't mind 8 P cores without hyperthreading as long as they have a decent 20% IPC lift,
IPC lift in what?! Because the major problem right now is that there is zero software that can even use the amount of IPC that CPU cores have right now. That's why HT/SMT can provide a 40% increase in performance even in benchmarks, which are code meant to push the cores as much as possible.
 

iTRiP

Honorable
Feb 4, 2019
929
86
11,090
Interesting read. All the CPUs I've had with Hyper-Threading were some of my favorites. I rather hope it stays, or if it has to be removed, that it is replaced by something just as awesome. Personally, I have an HT CPU in my PC right now, and my thinking about upgrading is just that: upgrading, not replacing.
 

Eximo

Titan
Ambassador
But that power is only being used if it is being used, what difference would it make if that power provides performance with HT or with small cores?
Smaller cores will not be using less power to provide the same amount of performance, unless there will be more than one small core for every lost HT and that would not be profitable.

IPC lift in what?! Because the major problem right now is that there is zero software that can even use the amount of IPC that CPU cores have right now. That's why HT/SMT can provide a 40% increase in performance even in benchmarks, which are code meant to push the cores as much as possible.

I think that is exactly the path they chose. 8P cores 16E cores. So they kept the thread count that 16P cores would have.

IPC uplift in general. With a new CPU, I expect there to be some gains.
 
But that power is only being used if it is being used, what difference would it make if that power provides performance with HT or with small cores?
Smaller cores will not be using less power to provide the same amount of performance, unless there will be more than one small core for every lost HT and that would not be profitable.
Somehow the point is being lost on you, and I just don't know how you're not getting it. I'm not saying their strategy is good or bad, or that I even know what it is. I'm simply saying HT isn't free so if they're removing it that opens up power to be used elsewhere.
 
One of the reasons I got the 7950X is so I can disable SMT and still have 16 full cores at my disposal.
HT/SMT is not bad per se, and it still has its uses in many of today's workloads, but for gaming and general desktop usage the power and thermal overhead required for HT can now be put to better use.

Better security, less power, lower temps, and higher frequencies win over HT now.
 
I'm not saying you're wrong, it could be my lack of knowledge, but wouldn't replacing hyperthreading with e cores result in worse performance due to them being lower power and thus lower clocked? Or would it result in more performance due to more physical cores? I could see both things happening.

It really depends on the workload.

First we have to understand what SMT really is: a second set of x86 architectural registers. So, a quick class on some basic superscalar uArch stuff.

No modern processor executes x86 directly, and none has for a long time. Instead, both Intel and AMD CPUs run their own proprietary internal instruction format. The front-end instruction decoders accept x86 instructions, then convert them into smaller proprietary instructions (micro-ops) that get shipped to the scheduler, which then schedules them to be executed on internal resources. Basic integer operations are done on the Arithmetic Logic Units (ALUs), and memory instructions are executed on the Address Generation Units (AGUs) or Memory Management Units (MMUs). Floating-point and SIMD (SSE/AVX/etc.) instructions are shipped off to the FPU / SIMD units. After the work is done, the result is written back to the register set in a format identical to what x86 would have produced.

How does SMT fit into this? Well, just because x86 only allows for one operation at a time per instruction stream doesn't mean we can't have multiple of those processor resources. CPU cores frequently have multiple ALUs, AGUs, MMUs and FPUs, meaning there is always some amount of resource units sitting around not doing anything. If we introduce a second x86 register set on the front-end decoder, then we can accept two separate work streams, the decoder / scheduler can assign work from both to those units, and we can increase our total performance. The downside is that there is no preference between even- and odd-numbered logical cores. If the OS assigns one thread to logical core 2 and then another thread to logical core 3, there is a high likelihood those threads will end up fighting over the same physical core's processing resources. There have been many methods to get around this; normally the OS's scheduler sees the CPU family, looks up from that how to treat different cores, and tries not to assign two busy threads to the two logical cores of the same pair.
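To make the idle-units argument concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the unit counts, the rule that each thread issues at most one micro-op per cycle in order, and the name `cycles_to_finish`. Real cores are out-of-order and far more complicated.

```python
# Toy model only: the unit counts and the one-micro-op-per-thread-per-cycle
# rule are illustrative assumptions, not how any real core is built.
UNITS = {"ALU": 2, "AGU": 1, "FPU": 1}


def cycles_to_finish(streams, units=UNITS):
    """Count cycles until every micro-op stream has been fully issued.

    Each stream is a list of unit names. Per cycle, each thread may issue
    its next micro-op only if a unit of that kind is still free.
    """
    positions = [0] * len(streams)
    n = len(streams)
    cycles = 0
    while any(pos < len(s) for pos, s in zip(positions, streams)):
        free = dict(units)  # every unit is free again at the start of a cycle
        # Rotate which thread gets first pick so neither one is starved.
        for i in ((cycles + k) % n for k in range(n)):
            if positions[i] == len(streams[i]):
                continue  # this thread has nothing left to issue
            kind = streams[i][positions[i]]
            if free[kind] > 0:
                free[kind] -= 1
                positions[i] += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    integer_work = ["ALU", "ALU", "AGU", "ALU"]
    float_work = ["FPU", "FPU", "AGU", "FPU"]
    back_to_back = cycles_to_finish([integer_work]) + cycles_to_finish([float_work])
    together = cycles_to_finish([integer_work, float_work])
    print("one after the other:", back_to_back, "cycles")
    print("two SMT threads:    ", together, "cycles")
    print("two FPU-heavy threads:", cycles_to_finish([["FPU"] * 4, ["FPU"] * 4]), "cycles")
```

Two threads that want different units overlap almost perfectly, while two threads that both hammer the single FPU gain nothing in this model, which is the "fighting over resources" case.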

With the advent of heterogeneous computing, instead of an 8-core SMT CPU, it might be better to do four heavy cores and 16 thin cores. The heavy cores might end up with unused resources, but lately it's been thermal dissipation that has limited performance, not processor resource availability.
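If you want to see which logical cores share a physical core on your own machine, a minimal sketch for Linux follows. It assumes the sysfs file `thread_siblings_list` under `/sys/devices/system/cpu/cpuN/topology/` is present; the helper names are made up for illustration, and on other operating systems the reader function just returns nothing.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0,8' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_siblings(sibling_lists):
    """Turn {logical_cpu: list string} into one tuple of siblings per physical core."""
    return sorted({tuple(parse_cpu_list(text)) for text in sibling_lists.values()})


def one_cpu_per_core(groups):
    """Pick the first logical CPU of each group, so no two picks share a core."""
    return [group[0] for group in groups]


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every cpuN directory (empty dict if absent)."""
    result = {}
    for path in Path(root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parent.parent.name[3:])  # directory name is 'cpuN'
        result[cpu] = path.read_text()
    return result


if __name__ == "__main__":
    groups = group_siblings(read_sibling_lists())
    print("sibling groups:", groups)
    print("one logical CPU per physical core:", one_cpu_per_core(groups))
```

On Linux, the resulting list could be passed to `os.sched_setaffinity` to keep a process's busy threads off each other's sibling, which is roughly what the OS scheduler tries to do on its own.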
 

Order 66

Grand Moff
OK, but then how do threads not fight over resources in a demanding benchmark like Cinebench? Or do they? I have no idea.
 