News: Intel's next-gen Arrow Lake CPUs might come without hyperthreaded cores — leak points to 24 CPU cores, DDR5-6400 support, and a new 800-series chipset

Tin, no VMs.
That would mean you would have to disable HT on a host and migrate the VM to said host to find out whether HT gave you such a speedup. That said, the VM will still have a certain number of vCPUs allocated to it. A VM with 8 vCPUs on an HT-enabled host would use 4 cores and 8 total threads; on a host without HT enabled, you would get 8 full cores. The only way 4c/8t can be as fast as 8c/8t in threaded applications is if the instructions are small enough that half or less of the resources on a core are being used. At that point you would, in theory, get a 100% increase in performance from HT. That just doesn't happen.

What is probably happening is that by disabling HT on a host, you have halved the number of vCPUs available in the virtualized environment. Therefore your 8 vCPU VM now has to wait for 8 physical cores to be available before any processing can be done, whereas on a host with HT it only had to wait for 4 physical cores. Most likely your hosts are over-provisioned on CPU resources (assigning out 10 vCPUs while only having 8 physical CPUs), which is common practice; about 25% over-provisioning is standard. Since there aren't enough physical resources available, the VM has to wait for resources to free up.

For example, say you have 16 physical cores and the VM needs 8 of them. The VM MUST wait until it can gain access to 8 cores before it gets CPU time. Now say you have two 2 vCPU VMs on the same host also waiting for resources, and 6 cores come available. They will take 4 of those cores (each only needs 2) and leave your 8 vCPU VM still waiting for CPU time. If you have HT enabled, those 6 physical cores are now viewed as 12 vCPUs by the hypervisor, which means you can run both 2 vCPU VMs AND the 8 vCPU VM at the same time. So the "speed up" you are seeing from HT is due to CPU Ready % being lower, since there are 2 threads per core, rather than HT itself being responsible in conjunction with your code.
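If it helps, here's a toy sketch of that co-scheduling effect. It assumes strict gang scheduling and made-up VM sizes; real hypervisors use relaxed co-scheduling, but the CPU Ready effect is the same in spirit:

```python
# Toy model of strict gang scheduling: a VM runs only when it can grab
# all of its vCPUs at once. Purely illustrative, not how any particular
# hypervisor actually works.

def runnable(vms, free_threads):
    """Greedily pick VMs (in order) that fit in the free thread budget."""
    running = []
    for name, vcpus in vms:
        if vcpus <= free_threads:
            running.append(name)
            free_threads -= vcpus
    return running

waiting = [("vm_a", 2), ("vm_b", 2), ("vm_c", 8)]  # hypothetical VMs

# 6 physical cores free, HT off: 6 schedulable threads.
print(runnable(waiting, 6))   # ['vm_a', 'vm_b'] -- the 8 vCPU VM waits

# Same 6 cores with HT on: the hypervisor sees 12 threads.
print(runnable(waiting, 12))  # ['vm_a', 'vm_b', 'vm_c'] -- all run at once
```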
 
Let me restate myself; perhaps it was misunderstood:

SQL Server running on tin. No VMs. HT gave improvements of up to 80%, with no cases of performance loss for the workloads that we ran, which were data warehouse and ODS loads.
 
SQL Server running on tin. No VMs. HT gave improvements of up to 80%, with no cases of performance loss for the workloads that we ran.
By tin you mean bare metal and/or a physical appliance? Even so, your experience is the exception and not the rule. In most cases I have seen, the speedup from HT is about 30%, as HT allows more of the execution units on a physical core to be used. In order to get an 80% speedup, your execution units would need to be only half used.
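As a back-of-the-envelope check of that claim (a deliberately crude model where the SMT gain is capped by how idle a core's execution resources are; it ignores cache and front-end contention entirely):

```python
# Crude SMT model: one thread uses fraction u of a core's execution
# resources, so two sibling threads scale by min(2, 1/u) at best.
def smt_gain(u):
    return min(2.0, 1.0 / u) - 1.0  # fractional speedup from HT

print(f"{smt_gain(0.77):.0%}")   # ~30% -- the typical case cited above
print(f"{smt_gain(0.555):.0%}")  # ~80% -- needs execution units barely half busy
```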
 
SQL Server running on tin. No VMs. HT gave improvements of up to 80%
What kind of CPU?

When running with HT disabled, is it possible the server was configured to use more threads than there were CPU cores available for it to use (i.e. not being used by anything else)? That's definitely a way I could see it get a disproportionate speedup from using HT - if threads were getting starved out. However, it'd be an artificial case, because the solution would be just to decrease the number of threads and suddenly your non-HT performance would improve.
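A quick illustration of that starvation effect, using an idealized round-robin model with equal-length CPU-bound tasks (and treating a hardware thread as a full core, which flatters HT, but the mechanism is the point):

```python
import math

# Idealized model: n equal CPU-bound threads time-slicing on c cores,
# each needing 1 unit of CPU time. Wall time grows in whole "waves" of
# ceil(n / c), so running more threads than cores just stretches the run.
def wall_time(threads, cores):
    return math.ceil(threads / cores)

print(wall_time(24, 12))  # 2 -- HT off, threads starved, run takes 2x
print(wall_time(24, 24))  # 1 -- HT on exposes 24 hardware threads
print(wall_time(12, 12))  # 1 -- or simply run fewer threads, same win
```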
 
I agree that our use case was the exception.

It wasn't an appliance. They were two servers, exactly the same config. It was initially SQL Server 2008, eventually upgraded all the way to 2017. I don't recall the exact CPU model, but they were dual-CPU Xeons, 6C/12T per CPU. We followed Microsoft's recommended settings for things like MAXDOP (the maximum degree of parallelism) for the servers, experimented with our own (because, of course, "we knew better" lol), and then discovered that Microsoft knew this particular aspect of their own product better than we did, since their settings for our server setup worked best. (I guess MS and Intel really did earn the "Wintel" label, the way SQL Server works so well with Intel CPUs. ;-)
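For reference, Microsoft's classic MAXDOP rule of thumb from that era boiled down to roughly this (my paraphrase from memory, so check the actual docs for your version before trusting the exact cutoffs):

```python
# Rough paraphrase of Microsoft's old MAXDOP guidance for SQL Server:
# cap parallelism at the number of logical processors in one NUMA node,
# and never go above 8.
def recommended_maxdop(logical_cpus_per_numa_node: int) -> int:
    return min(logical_cpus_per_numa_node, 8)

# Dual-socket box, 6C/12T per CPU, one NUMA node per socket (as above):
print(recommended_maxdop(12))  # 8
```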

Our bottleneck was always the storage in our daily runs. (Our workloads were bursty: typically batch loads in the morning, then staggered batch processing scheduled every few hours on weekdays, and then the system got hammered at month-end and year-end.)
 
It wasn't an appliance. They were two servers, exactly the same config. It was initially SQL Server 2008, eventually upgraded all the way to 2017.
That is a physical appliance.

When running with HT disabled, is it possible the server was configured to use more threads than there were CPU cores available for it to use (i.e. not being used by anything else)? That's definitely a way I could see it get a disproportionate speedup from using HT - if threads were getting starved out. However, it'd be an artificial case, because the solution would be just to decrease the number of threads and suddenly your non-HT performance would improve.
they were dual CPU Xeons, 6C12T per CPU
That is certainly a possibility with only 12c/24t. Where I work we do some cloud hosting, and I've seen Server 2008 R2 VMs running Oracle with 28 vCPUs. It is very possible that your workload needed more CPU grunt than 12 cores alone could provide, hence the large increase in performance from HT. If you had had 24 real cores, your speedup would probably have been 100%.

Our bottleneck was always the storage in our daily runs
To be honest, unless you have an in-RAM DB, the storage layer will always be your bottleneck. DBs can get around some of the storage bottlenecks with caching and predictive reads; however, if you make a call to a table it didn't expect, then it still has to go to storage.
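What's being described is essentially a read-through cache; here's a minimal sketch of the pattern (the storage function is just a stand-in for a real page read):

```python
# Minimal read-through cache: hot tables are answered from memory,
# while an unexpected table forces the slow trip to storage.
cache = {}

def read_from_storage(table):
    # Placeholder for an actual page read from disk or SAN.
    return f"rows of {table}"

def read_table(table):
    if table in cache:
        return cache[table]          # fast path: already cached in RAM
    data = read_from_storage(table)  # slow path: go to storage
    cache[table] = data              # keep it warm for next time
    return data
```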
 
That is a physical appliance.
Nope. Servers ordered from our provider to our specifications. Initially it was supposed to be used only for our DW and reporting, but then over the years its role expanded to internal file share, FTP server, one VM for two years, and some other (very!) small applications because other teams "didn't have budget." We also upgraded the machine's RAM once and the storage twice.

An appliance is a single-purpose sealed black box from a vendor that you're not allowed to touch. We almost bought a Microsoft Parallel Data Warehouse appliance, but exco didn't approve the funding.

To be honest unless you have an in RAM DB the storage layer will always be your bottleneck. DBs can get around some of the storage bottlenecks with caching and predictive calls, however, if you make a call to a table it didn't expect then it still has to go to storage.
I know.

If you want some fun, check out Thomas Grohser's presentation from the EightKB conference "Scaling SQL Server beyond 2 CPUs." He goes into a _wonderful_ amount of detail regarding SQL Server and CPUs. <3

https://eightkb.online/previous/

 
They were two servers, exactly the same config. It was initially SQL Server 2008, eventually upgraded all the way to 2017. I don't recall the exact CPU model, but they were dual CPU Xeons, 6C12T per CPU.
So HT was disabled on only one? If so, did you ever check that they indeed performed identically when HT was either enabled or disabled on both?

Our bottleneck was always the storage in our daily runs
In this case, I wonder if simply increasing the number of threads in your non-HT case could have improved performance. If the threads were blocking on I/O, then having more could increase the effective queue depth, resulting in higher IOPS - thereby netting you more performance. In that case, having more threads scheduled on the HT machine could improve performance beyond what HT itself is doing.

All I'm saying is that it would need further investigation to know if the real differentiator was HT. There's certainly a chance that the IPC in executing your queries was exceptionally low, but I wouldn't expect so.
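To make the queue depth point concrete, a sketch along these lines would show it (the file path, offsets, and worker counts are hypothetical; the idea is just that blocking readers keep more I/O in flight):

```python
import concurrent.futures as cf

# Sketch: with blocking reads, the number of requests in flight (the
# effective queue depth) roughly tracks the number of worker threads,
# so more threads can yield more IOPS even with no extra CPU work done.
def read_block(path, offset, size=8192):
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

def scan(path, offsets, workers):
    with cf.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda off: read_block(path, off), offsets))

# Hypothetical experiment: time scan(...) with workers=12 vs workers=24
# against a disk array and watch IOPS climb with the deeper queue,
# up to whatever the array can actually sustain.
```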
 
If the threads were blocking on I/O, then having more could increase the effective queue depth, resulting in higher IOPS - thereby netting you more performance.
I'm assuming they were using local disk (the age makes me think 10k HDDs) instead of a SAN. 24 10k disks will only get you about 3,600 IOPS, with a max of about 6 GB/s sequential reads. If they were on a SAN, it would have been 8Gb Fibre Channel at first, so that would have really killed your throughput.
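The math behind those numbers, for anyone curious (the per-drive figures are rough rules of thumb for 10k RPM disks, not measurements):

```python
# Ballpark aggregate performance of a 24-drive 10k RPM array.
drives = 24
random_iops_per_drive = 150   # ~150 random IOPS per 10k HDD
seq_mb_per_drive = 250        # ~250 MB/s sequential per drive

print(drives * random_iops_per_drive)            # 3600 IOPS
print(drives * seq_mb_per_drive / 1000, "GB/s")  # 6.0 GB/s sequential
```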
 
All I'm saying is that it would need further investigation to know if the real differentiator was HT. There's certainly a chance that the IPC in executing your queries was exceptionally low, but I wouldn't expect so.
Considering the likely hardware era, it'd most likely be somewhere between Nehalem and Ivy Bridge (maybe Haswell). Boosting and TDP worked differently then, so there was likely no clock speed difference between HT on and off like there would be now. That certainly wouldn't account for all of the difference, but it helps.
 