News Take control of your Intel CPU's P-Cores and E-Cores with CoreDirector software

Well, now I don't feel so bad that I was confused, too...

I bought Process Lasso specifically to get the sort of control over Windows applications that I have via 'numactl' on Linux, and then found it rather less intuitive than I had hoped.
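
For reference, the numactl-style pinning I mean boils down to CPU affinity. A minimal sketch of the Linux side (sched_setaffinity is roughly what numactl --physcpubind arranges; the core numbers are just an example):

```cpp
// Minimal sketch: pin the current process to cores 0-3, roughly what
// `numactl --physcpubind=0-3 <cmd>` arranges. g++ defines _GNU_SOURCE,
// which the affinity API requires.
#include <sched.h>
#include <cstdio>

int main() {
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu = 0; cpu < 4; ++cpu)
        CPU_SET(cpu, &set);                              // allow cores 0..3 only
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {  // 0 = this process
        perror("sched_setaffinity");
        return 1;
    }
    // ... the workload now runs only on the selected cores ...
    return 0;
}
```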

Worse, the commercial Lasso then contained a link to CoreDirector, and I felt a groan coming on, as I had already been bombarded with upgrade mails by Bitsum.

AFAIK CoreDirector is actually the older tool and hasn't seen updates for a while, but it also can't override the hard-wired defaults of the hardware: the OS will use P-cores no matter what, and you can't entirely shut them down on any hybrid device (while the opposite is easy).

What's still missing is a comprehensive solution that also lets you set the power budgets, PL1, PL2 and Tau for starters, so that perhaps you could say things like "give 8 Watts to all things Firefox, but choose cores to match what's active".
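
To make the power-budget idea concrete: on Linux, those limits are at least readable today through the intel-rapl powercap interface. A minimal sketch, assuming the driver is loaded and package 0 sits at intel-rapl:0 (paths vary by system, and writing the limits needs root):

```cpp
// Minimal sketch: read PL1, PL2 and the PL1 time window (Tau) from the
// Linux powercap interface. Values are plain integers in microwatts
// and microseconds.
#include <fstream>
#include <iostream>
#include <string>

static long read_sysfs(const std::string& path) {
    std::ifstream f(path);
    long v = -1;
    f >> v;
    return v;
}

int main() {
    const std::string base = "/sys/class/powercap/intel-rapl:0/";
    long pl1 = read_sysfs(base + "constraint_0_power_limit_uw");  // long-term
    long pl2 = read_sysfs(base + "constraint_1_power_limit_uw");  // short-term
    long tau = read_sysfs(base + "constraint_0_time_window_us");  // Tau
    std::cout << "PL1: " << pl1 / 1e6 << " W\n"
              << "PL2: " << pl2 / 1e6 << " W\n"
              << "Tau: " << tau / 1e6 << " s\n";
}
```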

To be frank, my goal is really to test these facilities for server and embedded workloads; I'm not keen on actually regulating the notebook I am running this on quite that way.
 
Gamers Nexus did test APO, and it seems to make some difference, but it was a pain to install and only supports 14th-gen CPUs.
I can't see this being a bad thing when you have people turning off half their processors so they can force a 7950X3D to only use the cores with the extra cache.
 
When I have the Xeon v4 with 22 cores, Process Lasso works great. But with the 12700T and 13500T, it has some difficulty working perfectly. Windows does a great job of selecting which cores to use, so it doesn't make any difference when gaming.

E-cores are a love-them-or-hate-them thing. (The FPS of the i5 family is a downfall; it's why Intel doesn't put APO on it.)

Disable the E-cores and the system becomes slower.
Disable Hyper-Threading and it becomes slower.
Disable both and you get an i5-8600, lol.
 
What would be some scenarios where one would want to keep apps off E-cores, outside of poorly optimized programs?

I know what it's doing but I'm not seeing the real world use for the average user.
 
The makers of Process Lasso have debuted a new application for Intel processors called CoreDirector that allows users to manually control how applications utilize an Intel chip's E-cores and P-cores.

Take control of your Intel CPU's P-Cores and E-Cores with CoreDirector software : Read more
Intel already knows this, since it has developed a new utility called Application Optimization (APO) that adjusts thread affinities even more aggressively than what Thread Director already does. However, it's very limited in use and currently only supports two titles.
Unless you have some insider info that you would like to share, it's not more aggressive, just more targeted: to the specific needs of the game and to the specific CPU.
 
That's not enough for you?!

If you use one app or game that runs worse than it did on an older CPU, that would be enough for most people to look into this.
Geez, untuck that cameltoe. I was simply acknowledging one use case and asking if there were any others.

Also, it's poor forum etiquette, and clutters the thread, to post twice in a row. One post, multiple quotes. But I'm not a mod here, so I'm not telling you what to do; just a suggestion.
 
Yes, it would be cool to see some sort of review of this. A perfect game for a benchmark, in my opinion, would be Cities: Skylines 2 and its simulation speed in developer mode, though you'd have to have a big city to check this. I've been reading in a simulation-speed thread that even people with a 13700 or 13900K get completely destroyed on simulation speed in big cities, making a 13600K almost pointless to upgrade to for this game. It would be interesting to know whether CoreDirector or Process Lasso could actually make that CPU viable for Skylines.
 
Would be a great opportunity to test stock behavior, APO and ProcessLasso/CoreDirector against one another in Metro Exodus/R6 Siege to see how it plays out.
There is also the option of per-core disabling in the BIOS; then the disabled core is invisible to the OS. I heard of APO disabling all but one E-core per bank, and thought the extra L2 cache per E-core might be a benefit, so I tried it. Not a lot of difference in CP2077.
I haven't tried it vs. all cores enabled in Exodus.

But where would the testing end? There are so many options. Maybe a poll, and a small number of the most popular get run?
 
There is also the option of per-core disabling in the BIOS; then the disabled core is invisible to the OS. I heard of APO disabling all but one E-core per bank, and thought the extra L2 cache per E-core might be a benefit, so I tried it. Not a lot of difference in CP2077.
I haven't tried it vs. all cores enabled in Exodus.

But where would the testing end? There are so many options. Maybe a poll, and a small number of the most popular get run?
Well, until/unless Intel expands APO, the two titles I mentioned are the only ones with it, so they're the only ones worth testing. It's entirely possible the limited title support in APO is because they're only putting in things that show a benefit, and they haven't found a whole lot yet.

In HUB's R6 testing, they weren't able to match the APO performance by disabling HT and/or E-cores, despite those configurations getting higher performance than stock. That's why I'd like to see a direct comparison between stock, third-party software, and APO, to see what sort of advantage Intel is or isn't bringing to the table.
 
That's why the true solution is better APIs and OS support. The more that software can tell the OS about the threads and workload, the more effectively the OS can schedule it.
All of this has existed forever; the problem is that devs, and especially game devs, completely ignore it. They make all the thread decisions for the current PlayStation console, since that is the biggest market, and everybody else gets ports with no changes made to them at all. We all know how many games were completely unplayable at launch, even on high-end systems; they only do any extra work if the game completely fails to work on Windows.
 
All of this has existed forever; the problem is that devs, and especially game devs, completely ignore it.
Well... a few things are new. Hybrid CPUs create new challenges for thread schedulers, although some of that complexity already existed with hyperthreading.

As for stuff that "existed since forever", I actually happen to know a fair bit about threading APIs and can tell you that what's provided on non-realtime operating systems doesn't do a good enough job of enabling the OS to do more optimal scheduling.

The problem with RTOS-style scheduling is that it's too high-overhead. You basically end up designing for the worst case. You get deterministic performance, but at the expense of opportunistic optimizations. Neither is ideal for something like gaming.
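
To illustrate how coarse the hints are today: about the most a thread can say on current Windows is "I prefer efficiency over speed" (EcoQoS), which on hybrid parts tends to steer it toward E-cores. A minimal sketch, assuming Windows 11 and a recent SDK:

```cpp
#include <windows.h>

// Minimal sketch: mark the current thread as preferring power efficiency
// (EcoQoS). The OS scheduler takes this as a hint, typically favoring
// E-cores for the thread on hybrid CPUs.
void prefer_efficiency() {
    THREAD_POWER_THROTTLING_STATE state = {};
    state.Version     = THREAD_POWER_THROTTLING_CURRENT_VERSION;
    state.ControlMask = THREAD_POWER_THROTTLING_EXECUTION_SPEED;
    state.StateMask   = THREAD_POWER_THROTTLING_EXECUTION_SPEED;  // opt in
    SetThreadInformation(GetCurrentThread(), ThreadPowerThrottling,
                         &state, sizeof(state));
}
```

Note it's essentially a one-bit preference; there's no way to describe deadlines, dependencies, or which threads belong together.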
 
The problem with RTOS-style scheduling is that it's too high-overhead. You basically end up designing for the worst case. You get deterministic performance, but at the expense of opportunistic optimizations. Neither is ideal for something like gaming.
This is where part of the problem comes in, as you basically need real-time monitoring and control built into the hardware (DTT), together with something like Thread Director, to make custom scheduling low-latency enough.
That's why the true solution is better APIs and OS support. The more that software can tell the OS about the threads and workload, the more effectively the OS can schedule it.
The biggest issue I see with an API taking the place of something like APO is the seeming requirement of running everything through each CPU to create the model. If Intel could circumvent this, then APO would magically work on everything that had the hardware required for real-time scheduling (or, if you want to be cynical, it would run on all of the 14th-gen K/KF SKUs instead of 2 of 3). As long as this part is a requirement (and assuming Intel isn't lying about testing time), I just don't see broad implementation happening, due to hardware access (developers having access to every CPU that could do it) and/or time (generating the model), if nothing else.

Ideally you're right, and I certainly agree: if developers could do something quick, akin to preloading shaders, it would be great. Then you'd just need whatever driver software was required to interface with the scheduling hardware.
 
Well... a few things are new. Hybrid CPUs create new challenges for thread schedulers, although some of that complexity already existed with hyperthreading.

As for stuff that "existed since forever", I actually happen to know a fair bit about threading APIs and can tell you that what's provided on non-realtime operating systems doesn't do a good enough job of enabling the OS to do more optimal scheduling.

The problem with RTOS-style scheduling is that it's too high-overhead. You basically end up designing for the worst case. You get deterministic performance, but at the expense of opportunistic optimizations. Neither is ideal for something like gaming.
The problem is that making an API will be completely useless, because no game dev will ever produce any data for it. If they would, they'd already be using the tools they currently have, like basic thread priorities and affinity masks, but they never do, so why would you think they would use a new API?
A new API would have to do everything on its own, so basically it would be APO again.

What you are talking about is similar to what people talked about with DX12 multi-GPU. Do you remember that?! That went nowhere, because no dev would ever do that much more work when they get enough money by making the game run well on one single console config.
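
And for anyone wondering, this is essentially the whole toolbox being ignored: a bitmask and a priority level. A minimal Win32 sketch (the mask value is just an example):

```cpp
#include <windows.h>

// Minimal sketch of the existing tools: restrict the process to the first
// eight logical processors and raise one thread's priority. Blunt, static,
// and easy to get wrong, which is partly why game devs skip them.
int main() {
    // Bits 0..7 set: only logical processors 0-7 may run our threads.
    SetProcessAffinityMask(GetCurrentProcess(), 0xFF);

    // Nudge the current thread above normal priority.
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL);

    // ... latency-sensitive work would go here ...
    return 0;
}
```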
 
This is where part of the problem comes in, as you basically need real-time monitoring and control built into the hardware (DTT), together with something like Thread Director, to make custom scheduling low-latency enough.
I don't agree with that. What you're talking about is preemption, but most threads in a game need to run tasks to completion.

Plus, the Thread Director doesn't even do thread scheduling. It just collects statistics that inform the OS thread scheduler whether the thread is more suitable for a P or E core.

The biggest issue I see with an API taking the place of something like APO is the seeming requirement of running everything through each CPU to create the model.
I think that's not necessary. I won't go into detail about what I think the API should look like, since I don't have an easy way to prototype it and test my theories.
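
For what it's worth, the P/E split itself is already visible to software via the CPU-set API, so "which kind of core is this" isn't the hard part. A minimal sketch listing each logical CPU's EfficiencyClass (a higher class means a more performant core):

```cpp
#include <windows.h>
#include <cstdio>
#include <vector>

// Minimal sketch: enumerate the system CPU sets and print the efficiency
// class of each logical processor (P-cores report a higher class than
// E-cores on hybrid parts).
int main() {
    ULONG len = 0;
    GetSystemCpuSetInformation(nullptr, 0, &len, GetCurrentProcess(), 0);
    std::vector<char> buf(len);

    auto* first = reinterpret_cast<PSYSTEM_CPU_SET_INFORMATION>(buf.data());
    if (!GetSystemCpuSetInformation(first, len, &len, GetCurrentProcess(), 0))
        return 1;

    for (char* p = buf.data(); p < buf.data() + len;) {
        auto* e = reinterpret_cast<PSYSTEM_CPU_SET_INFORMATION>(p);
        if (e->Type == CpuSetInformation)
            printf("CPU %u: efficiency class %u\n",
                   static_cast<unsigned>(e->CpuSet.LogicalProcessorIndex),
                   static_cast<unsigned>(e->CpuSet.EfficiencyClass));
        p += e->Size;
    }
    return 0;
}
```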
 
The problem is that making an API will be completely useless, because no game dev will ever produce any data for it. If they would, they'd already be using the tools they currently have, like basic thread priorities and affinity masks, but they never do, so why would you think they would use a new API?
Affinity masks are a pain to use and can create more problems than they solve, if you're not very careful and knowledgeable about what you're doing. I'm sure many game devs have been burned by trying to do such things, which are basically second-guessing the OS' thread scheduler and tying its hands.

Priorities are even more fraught. If you're not careful, you can create priority-inversion scenarios, where a lower-priority thread starves a higher one.
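
For anyone who hasn't been burned by it, the classic inversion looks like this: a low-priority thread takes a lock the high-priority thread needs, then a medium-priority thread that never touches the lock preempts the low one. A minimal sketch (the thread bodies are stand-ins):

```cpp
#include <mutex>
#include <thread>
#include <windows.h>

// Minimal sketch of priority inversion: `high` ends up waiting on a busy
// medium-priority thread it never shares data with, because that thread
// preempts `low` while `low` still holds the mutex `high` needs.
std::mutex shared_state;

void low_priority_worker() {
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);
    std::lock_guard<std::mutex> lk(shared_state);  // takes the lock first
    // ... long update; a medium-priority thread can preempt us here ...
}

void high_priority_worker() {
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
    std::lock_guard<std::mutex> lk(shared_state);  // blocked behind `low`
    // ... urgent work, indirectly starved by the medium thread ...
}

int main() {
    std::thread low(low_priority_worker);
    std::thread high(high_priority_worker);
    low.join();
    high.join();
}
```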

A new API would have to do everything on its own, so basically it would be APO again.
The idea is to tell the OS more about the application's needs and the relationships between its threads, so the OS can do a better job of scheduling them. macOS has some interesting ideas, but I think they don't go far enough in the area of what games would need.
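
To give one concrete example of those ideas: on macOS, a thread declares what kind of work it does, and the kernel maps that QoS class onto P- or E-cores; the thread never picks cores itself. A minimal sketch (the two functions are just illustrative call sites):

```cpp
#include <pthread/qos.h>

// Minimal sketch of macOS QoS classes: threads state their purpose,
// and the scheduler decides which kind of core fits.
void tag_render_thread() {
    // User-interactive work: the scheduler favors performance cores.
    pthread_set_qos_class_self_np(QOS_CLASS_USER_INTERACTIVE, 0);
}

void tag_asset_loader() {
    // Utility work: eligible to run on efficiency cores.
    pthread_set_qos_class_self_np(QOS_CLASS_UTILITY, 0);
}
```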
 
Affinity masks are a pain to use and can create more problems than they solve, if you're not very careful and knowledgeable about what you're doing. I'm sure many game devs have been burned by trying to do such things, which are basically second-guessing the OS' thread scheduler and tying its hands.

Priorities are even more fraught. If you're not careful, you can create priority-inversion scenarios, where a lower-priority thread starves a higher one.
Yes, and that's why game devs completely ignore all that stuff on Windows and only do it for the one console, because there all consoles are the same, and we get stuck with the console config on drastically different hardware.
There is no reason to believe they would start to care because of a new API when they didn't care until now; an API is not going to make anything easier for game devs.

It's also why only two games and two CPUs are supported so far: it IS a pain.
The idea is to tell the OS more about the application's needs and the relationships between its threads, so the OS can do a better job of scheduling them. macOS has some interesting ideas, but I think they don't go far enough in the area of what games would need.
Yup, see above.
If game devs won't do that with affinities and priorities, then they aren't going to do it with anything else either.
 
I don't agree with that. What you're talking about is preemption, but most threads in a game need to run tasks to completion.
The only things that run to completion in a game are the scripts that determine when and what should happen; everything else is never-ending loops.
Plus, the Thread Director doesn't even do thread scheduling. It just collects statistics that inform the OS thread scheduler whether the thread is more suitable for a P or E core.
It constantly collects statistics, informs the OS about any changes in workloads, and flags opportunities to move workloads to other cores/threads on the fly. That is important for making custom scheduling low-latency enough, even if it doesn't do any scheduling itself.
 
There is no reason to believe they would start to care because of a new API when they didn't care until now; an API is not going to make anything easier for game devs.
Maybe APO will wake up both game developers and Microsoft, thanks to showing them just how much performance is being left on the table. It's pretty easy to be complacent when you're ignorant. Harder, when you know the true cost of the status quo.

If game devs won't do that with affinities and priorities, then they aren't going to do it with anything else either.
I disagree. I think if you make it easier and less error-prone than affinities and thread priorities, then we might see greater uptake - especially if there's strong data showing the benefits of doing so.

The only things that run to completion in a game are the scripts that determine when and what should happen; everything else is never-ending loops.
I meant the individual chunks of computation needed to generate a frame. In order to have a frame you can display, all of the steps needed to compute it must complete.
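
In other words: the game loop is endless, but each frame decomposes into finite tasks that all have to finish before the frame can be shown. A minimal sketch (the task names are made up):

```cpp
#include <future>

// Minimal sketch: a frame is a set of run-to-completion chunks, even
// though the loop that issues frames never ends.
void simulate() { /* ... physics step ... */ }
void animate()  { /* ... skeletal animation ... */ }
void build_draw_calls() { /* ... command buffers ... */ }

void run_one_frame() {
    auto a = std::async(std::launch::async, simulate);
    auto b = std::async(std::launch::async, animate);
    a.get();                 // every chunk must complete...
    b.get();
    build_draw_calls();      // ...before this frame can be displayed
}

int main() {
    for (int frame = 0; frame < 3; ++frame)  // in a real game: for (;;)
        run_one_frame();
}
```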
 
I think that's not necessary. I won't go into detail about what I think the API should look like, since I don't have an easy way to prototype it and test my theories.
If you were actually right about this one, then Intel's approach wouldn't make any sense, given how many steps they're going through. They have the ability to monitor the threads (Thread Director) and to optimize what goes where (DTT). It's the latter step that clearly needs the per-CPU optimization, and given the limited application/hardware support, it isn't a quick step. We know it has nothing to do with clock speeds/turbo frequency, so it has to be something in the P/E/HT configuration that they're analyzing. There are a lot of clues as to how ridiculously involved everything is in the APO FAQ and the DTT information.

If there were actually a way to bypass all of this work with something that would be straightforward for developers to implement, I can't imagine why Intel wouldn't be going that route instead.
I don't agree with that. What you're talking about is preemption, but most threads in a game need to run tasks to completion.

Plus, the Thread Director doesn't even do thread scheduling. It just collects statistics that inform the OS thread scheduler whether the thread is more suitable for a P or E core.
I absolutely snipped the wrong part of your post; I meant to refer to improving scheduling for Windows et al. I don't think it can be done purely on the software side.
 
If you were actually right about this one, then Intel's approach wouldn't make any sense, given how many steps they're going through.
Intel's approach is non-intrusive. Mine is very intrusive (API changes require app changes, by definition) and therefore won't benefit the current generation of games or CPUs. It also needs to hook into the OS' thread scheduler, which further stretches lead times.
 