It's been many years since I wrote software, but I know a fair bit about operating systems and software development, so here is my take on it:
Basically, what I think you do is parallelize as much of your code as you can using threads that can run concurrently; the operating system then parcels out those threads to however many cores the user has in their system. That is one of the primary tasks of an operating system: CPU scheduling, which includes time-slicing. So if you have spawned 10 threads that can run simultaneously and there are only four cores, the operating system will assign threads to cores until it runs out of cores. If there are more threads than cores, then time-slicing will occur (again, under the...
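To make the "spawn 10 threads and let the OS place them" idea concrete, here is a minimal sketch in Python. The names `run_workers` and `worker` are made up for illustration, and the work each thread does is just a placeholder. One caveat I'm fairly sure of: in standard CPython the global interpreter lock keeps CPU-bound Python code from truly running on several cores at once, so treat this as showing the structure (the program never picks a core, the OS scheduler does), not as a speed-up recipe.

```python
import os
import threading


def run_workers(num_threads=10):
    """Start num_threads OS threads and return the results they produce."""
    results = {}
    lock = threading.Lock()

    def worker(index):
        # Placeholder work: real code would do one independent
        # piece of the overall job here.
        total = sum(range(1000))
        with lock:  # protect the shared dict while writing
            results[index] = total

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()  # from here on, the OS scheduler decides when and where it runs
    for t in threads:
        t.join()   # wait for every worker to finish
    return results


if __name__ == "__main__":
    cores = os.cpu_count()  # logical cores reported by the OS; may be None
    results = run_workers(10)
    print(f"{len(results)} threads finished; OS reports {cores} logical cores")
```

Note that nothing in the code mentions which core a thread runs on. With 10 threads and four cores, the scheduler is what decides who runs when.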