This is very much a work in progress by members of the enthusiast community in an unsanctioned effort, and there are still several unknowns:
The NUMA Dissociater is confirmed to work on EPYC 7551 and Threadripper 2970WX/2990WX, and may also work on other HCC NUMA platforms.
Just for reference, this is not a 'fix' for the Windows scheduler, either. It does not make any alterations to the scheduler whatsoever.
The ‘fix’ is also bizarrely imprecise. For affected processes, a call to SetProcessAffinityMask, even one that leaves the affinity unchanged (e.g. all CPUs to all CPUs), resolves it, at least most of the time. The best guess is that the call removes the preferred NUMA node for the process, and that this causes the Windows scheduler to change behavior, as evidenced by the thread ideal processor selections and, more importantly, the massive change in performance.
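
To make the "set the affinity to what it already is" idea concrete, here is a minimal sketch in Python using ctypes. It is not Coreprio's actual implementation. It only illustrates reading a process's affinity mask with GetProcessAffinityMask and writing the same value back with SetProcessAffinityMask. The function names reapply_affinity and mask_to_cpus are hypothetical helpers invented for this example, and the sketch assumes a Windows machine where the caller has sufficient rights on the target process.

```python
import ctypes
import sys

# Documented Win32 process access rights.
PROCESS_SET_INFORMATION = 0x0200
PROCESS_QUERY_INFORMATION = 0x0400


def mask_to_cpus(mask: int) -> list:
    """Expand an affinity bitmask into a list of logical CPU indices."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def reapply_affinity(pid: int) -> int:
    """Read a process's affinity mask and set it back unchanged (hypothetical helper).

    Returns the mask that was re-applied. Windows only.
    """
    if sys.platform != "win32":
        raise OSError("SetProcessAffinityMask is only available on Windows")

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    DWORD_PTR = ctypes.c_size_t  # pointer-sized unsigned integer

    # Declare signatures so handles and masks are not truncated on 64-bit Windows.
    kernel32.OpenProcess.argtypes = [wintypes.DWORD, wintypes.BOOL, wintypes.DWORD]
    kernel32.OpenProcess.restype = wintypes.HANDLE
    kernel32.GetProcessAffinityMask.argtypes = [
        wintypes.HANDLE, ctypes.POINTER(DWORD_PTR), ctypes.POINTER(DWORD_PTR)]
    kernel32.GetProcessAffinityMask.restype = wintypes.BOOL
    kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, DWORD_PTR]
    kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]
    kernel32.CloseHandle.restype = wintypes.BOOL

    access = PROCESS_QUERY_INFORMATION | PROCESS_SET_INFORMATION
    handle = kernel32.OpenProcess(access, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        process_mask = DWORD_PTR()
        system_mask = DWORD_PTR()
        if not kernel32.GetProcessAffinityMask(
                handle, ctypes.byref(process_mask), ctypes.byref(system_mask)):
            raise ctypes.WinError(ctypes.get_last_error())
        # Write back exactly the mask we just read: the affinity itself does not change.
        if not kernel32.SetProcessAffinityMask(handle, process_mask.value):
            raise ctypes.WinError(ctypes.get_last_error())
        return process_mask.value
    finally:
        kernel32.CloseHandle(handle)
```

On Windows, calling reapply_affinity(pid) with the PID of an affected process and passing the result to mask_to_cpus shows which logical CPUs the process was (and still is) allowed to run on. Note that these masks are relative to a single processor group, so on machines with more than 64 logical processors the sketch may need group-aware handling.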
It has been difficult to reach solid conclusions because the behavior is so bizarre. Hopefully the community can contribute to our understanding of the details and scope of this sub-optimal behavior.
We're not afraid of beta testing things, but given time constraints, it's best to wait until this project is further along.
Update: It must be noted that the NUMA Dissociater doesn’t ‘take’ sometimes, particularly on the 2990WX. EPYC is less impacted. On the 2990WX, when the Indigo process is restarted 10 times, 50% of those instances won’t end up running in the ‘fast’ state. This is still a massive improvement, as without the NUMA Dissociater, 100% of runs will be slow. This is not a bug in Coreprio, but something inherent to the nature of the performance issue.