cgner :
And yet games don't use more than 2-3 cores in most cases.
Again, you can't speed up serial processes by adding more cores.
noob2222 :
Not possible, or not practical? There is a thin line, and most often everything gets put into the "not possible" category because it's not cost-effective.
From what I understand, there are alternatives to locks, but they are more difficult to implement, which means more cost.
Semaphores, mutexes, and interlocks are all functionally the same at a really low level: they keep data locked for a period of time so only one thread can access it.
Any time you have two threads running at the same time that can both read/write a variable, you need to lock it every time you access it to ensure your changes aren't clobbered by the other thread. [Note that in the special case where one thread only ever writes and the other only ever reads a single word-sized value, you can usually get away with an atomic rather than a full lock. In any other case, locks are a necessity.]
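As a minimal sketch of what that looks like in practice (assuming C++11 with std::thread and std::mutex; the names here are made up for illustration):

    #include <iostream>
    #include <mutex>
    #include <thread>

    int counter = 0;
    std::mutex counter_lock;

    // Both threads read-modify-write `counter`, so every access takes
    // the lock; without it, increments from the two threads get lost.
    void bump(int times) {
        for (int i = 0; i < times; ++i) {
            std::lock_guard<std::mutex> guard(counter_lock); // lock...
            ++counter;                                       // ...modify...
        }                                                    // ...unlock at scope exit
    }

    int main() {
        std::thread a(bump, 100000);
        std::thread b(bump, 100000);
        a.join();
        b.join();
        std::cout << counter << "\n"; // reliably 200000 with the lock; unpredictable without
    }

Note that while one thread holds the lock, the other simply waits. That waiting is exactly the serialization I'm talking about.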
The problem also is that the solution (from what I gathered) will actually slow down lower-core-count computers: instead of a lock-and-wait-your-turn, it's three simultaneous threads, each executing, plus one comparator that must always be running. However, on a high-core-count system this would be considerably faster, as you don't have threads sitting idle because the data is locked; instead you will occasionally have a thread given new data to re-run if the data itself was changed.
Uhhh...no. You don't give a thread "new data", because you, the developer, have no clue your thread is blocked. That's the domain of the OS.
It's the OS that determines if a thread can run or not. It's the OS that schedules threads to run. It's the OS that puts threads on a specific core. All developers can do is make the heavy workload threads as parallel as possible and hope the OS allocates them in a semi-parallel way. [At this point, choice of compiler can be a HUGE performance factor.]
Within the application, I have no way of knowing if a thread is ever blocked, because if the thread WAS blocked, it would be unable to run and determine that it is blocked! If a thread can't run, the OS simply doesn't schedule it, and some other thread (maybe from your program, maybe not) runs instead.
My point being, the OS scheduler plays a role.
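For completeness: the closest practical form of that "compare and re-run" idea doesn't involve handing a blocked thread anything. The thread itself re-reads the data and retries. A minimal sketch, assuming C++11 std::atomic (the function is a hypothetical stand-in):

    #include <atomic>

    std::atomic<int> shared_value{0};

    // Optimistic retry: read the value, compute, then publish only if
    // the value hasn't changed since we read it. On failure,
    // compare_exchange_weak reloads `seen`, so the loop naturally
    // re-runs with the fresh data; no lock is held, no thread blocks.
    void optimistic_double() {
        int seen = shared_value.load();
        while (!shared_value.compare_exchange_weak(seen, seen * 2)) {
            // `seen` now holds the current value; retry the computation.
        }
    }

Under contention this burns CPU re-running work, which is part of why it can lose on low-core-count machines, and it only works for small, simple updates.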
How often is data locked, examined, the value found unchanged, and then unlocked again?
Explicitly, whenever you have two threads that can both access the same data object; implicitly, by the OS, whenever you do memory access. Depending on design, this can be a measurable performance impact or a negligible one; it depends a lot on how many threads need to access the same data structures. If you have a lot of threads that need access to the same object, there's not much you can do performance-wise.
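One partial exception worth noting: if most of those threads only read the object and writes are rare, a reader-writer lock lets the reads overlap. A sketch, assuming C++17's std::shared_mutex (the Scene type is made up for illustration):

    #include <shared_mutex>
    #include <vector>

    // Readers take a shared lock and can overlap with each other;
    // writers take an exclusive lock and serialize against everyone.
    // If every thread writes, this degrades to a plain mutex.
    struct Scene {
        mutable std::shared_mutex lock;
        std::vector<float> vertices;
    };

    float first_vertex(const Scene& s) {
        std::shared_lock<std::shared_mutex> guard(s.lock); // many readers at once
        return s.vertices.empty() ? 0.0f : s.vertices[0];
    }

    void add_vertex(Scene& s, float v) {
        std::unique_lock<std::shared_mutex> guard(s.lock); // writers are exclusive
        s.vertices.push_back(v);
    }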
-------------------------------------------------
Now let's look at games again. You have a LOT of data that ends up being shared (specifically, the geometry matrix, which touches the rendering, physics, AI, and audio engines). Every time it's accessed, no one else can touch it until it's explicitly unlocked again. That by itself will limit scaling, because you have to design around that possibility (try to ensure, by design, that the threads won't need to access the structure at the same time). But by doing so, you limit how parallel you are (you now have serial processes).
Now, a lot of this is done under the hood by game engines these days, so developers don't notice. But THINK about how the different engine subsystems interact: audio cues can affect AI processing, so audio must be done before AI is processed. Physics can affect geometry, which in turn can affect audio processing (assuming a 3D sound engine that accounts for terrain features). And so on.
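A sketch of what that ordering constraint looks like per frame (the stage functions here are hypothetical stand-ins):

    #include <thread>

    // Physics must finish before audio (3D sound needs the updated
    // geometry), and audio before AI (AI reacts to the audio cues).
    // Those three stages are forced serial; only work with no
    // dependency on them can overlap.
    void update_physics() { /* move objects, update geometry */ }
    void update_audio()   { /* 3D sound against the new geometry */ }
    void update_ai()      { /* react to this frame's audio cues */ }
    void stream_assets()  { /* independent background work */ }

    void run_frame() {
        std::thread background(stream_assets); // safe to overlap: no shared data
        update_physics();  // must come first
        update_audio();    // depends on physics
        update_ai();       // depends on audio
        background.join();
    }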
You begin to realize that a lot of what has to happen must be done in a SPECIFIC ORDER. That itself limits how parallel you can be, and thus limits the benefit of adding more cores. Per Amdahl's law, if only 20% of the program is actually parallel, you can't speed the program up by more than about 25%, no matter how many cores you add.
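To put numbers on that (a quick back-of-the-envelope using the 20%-parallel figure above):

    #include <cstdio>

    // Amdahl's law: with parallel fraction p and n cores,
    // speedup = 1 / ((1 - p) + p / n).
    // At p = 0.2, even infinite cores cap out at 1 / 0.8 = 1.25x.
    double amdahl(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int main() {
        for (int n : {1, 2, 4, 8, 64}) {
            std::printf("%2d cores: %.2fx\n", n, amdahl(0.2, n));
        }
    }

Going from 8 cores (about 1.21x) to 64 cores (about 1.25x) buys almost nothing; the serial 80% dominates.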