noob2222 :
so you're saying programmers won't know which threads to schedule to a locked core? Could be the main problem right here if programmers aren't taught what their thread actually does.
noob, really simple example here: say you hard-lock your heavy workload thread to core 7 of an 8-core system, since that core almost never gets used. For the sake of argument, you give this thread "above normal" priority, to ensure it gets control of that core. Now some OTHER program comes along, schedules its heavy workload thread to core 7 for the exact same reason (that core is almost never used), and gives its thread "high" priority. Guess what? Your application's thread will almost never run, and it can't move to another core because you locked it to just that one. You just tanked your application's performance. This is why, outside of platforms where you have a guarantee of what resources are available (consoles, and most integrated/embedded platforms), you do NOT hard-lock threads to cores.
You are making the exact same arguments that were made back in the '70s regarding manually assigning variables to CPU registers. Most C-family languages still support some form of the "register" keyword, which lets the developer ask the compiler to keep a variable in a register. The idea at the time was that some heavily used variables would be better off kept in a register [say, the control variable of a FOR loop], to avoid a lot of costly memory reads. However, that also reduces the number of registers available for everything else, and if you do a lot of processing, performance could suffer. FYI: no one would consider doing this anymore, because in every instance the compiler is smarter than you are when it comes to putting variables in registers. The same thing applies to threads: the OS has a FAR better understanding of the entire system state than you do, and will do a better job than you at thread scheduling.
The ONLY instances I'd ever modify the set of cores a thread could run on:
A: I know I have two high-workload, totally independent threads. In this case, I'd take steps to try and ensure they go on different cores [and even this is risky if not handled really carefully].
B: The CPU cache is split between different cores, so I would try to limit threads to the set of cores that share the same cache.
And gamerk, don't dismiss 64 bits that easily in multi-threading. You'll have a bigger address space to actually allocate resources for threading. In the calculation part, you'll be able to keep more data in memory at a given time for your threads to consume. You have to look at it from that POV. I know it won't be a huge deal, but it will be like the transition from 16 bits to 32 bits. All in all, it's not a game changer, but it adds to the party, hahaha.
The overhead, in terms of Address Space usage, is trivial, unless you are copying large amounts of data across the thread boundary. And I'd point out: if you need to replicate that much data across threads, you probably shouldn't be using a separate thread in the first place.