gamerk316 :
hcl123 :
Better tools, cooperative threads, and a better threading model, which should arrive in the next toolchains, probably including the HSA one.
No one is willing to go back to cooperative threading. No chance in hell. Anyone who proposes it to a SW engineer gets laughed out of the room.
I don't think so... But as with so many things, it depends on the targets, the tools and the developers. Speculative threading (or speculative multithreading, spMT) is a good target for a cooperative threading model. As a matter of fact, I think it was Intel that originated the "Mitosis" compiler, which centered on doing a good job of (semi)automatically parallelizing code speculatively.
http://pdf.aminer.org/000/542/803/mitosis_compiler_an_infrastructure_for_speculative_threading_based_on_pre.pdf
gamerk316 :
The hardware to the rescue would be transactional memory, if it could be done more simply. Locks would then be much less of a headache... memory management, synchronization and race condition issues could be gone or tremendously mitigated.
Understand how a transaction works: essentially, do the processing without a lock, and if the memory contents have not changed from what they were when you started, then you can safely save your data. If they have, then you have to put a traditional lock in place and do the processing again.
Now, in a latency-sensitive program, why in the hell would you take the chance that your processing time could DOUBLE in the worst case, for a very minimal potential speedup?
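For anyone following along, the try-then-fall-back pattern described above can be sketched in plain software. This is only an illustration under stated assumptions, not how any real HTM or lock elision hardware is programmed: `VersionedCell`, `optimistic_update` and the `interfere` hook are hypothetical names, and a version counter checked under a short lock stands in for the hardware's conflict detection.

```python
import threading


class VersionedCell:
    """A shared value plus a version counter bumped on every write (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()

    def write(self, value):
        """An ordinary locked write, used here to play the role of another thread."""
        with self.lock:
            self.value = value
            self.version += 1


def optimistic_update(cell, compute, interfere=None):
    """Apply compute() to the cell; return how many times compute() had to run."""
    # Optimistic attempt: take a snapshot, then do the work with no lock held.
    with cell.lock:
        seen_version, seen_value = cell.version, cell.value
    result = compute(seen_value)

    if interfere is not None:
        interfere()  # illustration hook: simulates a concurrent writer

    with cell.lock:
        if cell.version == seen_version:
            # Nothing changed since the snapshot: commit the result.
            cell.value = result
            cell.version += 1
            return 1
        # Conflict: redo the processing while holding a traditional lock.
        cell.value = compute(cell.value)
        cell.version += 1
        return 2
```

A return value of 2 is the worst case in question: the work ran once speculatively and once more under the lock. How often that happens depends entirely on how contended the data is.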
You are describing *only* "hardware lock elision", in what seems to be a *software* transactional approach, which, depending on the implementation, may not be all that advantageous.
A true Hardware Transactional Memory model, as I like to see it (lol)... is (or can be) a speculative threading mechanism above all, or better said, a thread-oriented "data" speculation mechanism. It goes well beyond locks, though those are essential... its threads are, by inherent property, cooperative and speculative (abort or commit).
Is it not possible to build good "hardware" support for speculative multithreading? ... I think it is... and nothing can raise the IPC of a sequential piece of code more than dynamically, on the fly, breaking that code into pieces that execute in parallel, even if it is done speculatively. ILP (instruction level parallelism) is pretty much a dead end, so it doesn't hurt to try to speculate a little lol
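To make the abort-or-commit idea concrete, here is a rough software sketch of speculative multithreading, assuming a toy model: each piece of the sequential code is a task that touches a shared dictionary only through hypothetical `load` and `store` callbacks. All names here (`run_speculatively`, `speculative_run`) are made up for illustration; real spMT hardware would track read and write sets in the cache hierarchy rather than in Python sets.

```python
from concurrent.futures import ThreadPoolExecutor


def run_speculatively(task, snapshot):
    """Run one task against a snapshot, recording what it read and buffering its writes."""
    reads, writes = set(), {}

    def load(key):
        if key in writes:          # the task sees its own buffered writes first
            return writes[key]
        reads.add(key)
        return snapshot[key]

    def store(key, value):
        writes[key] = value        # buffered, not yet visible to anyone else

    task(load, store)
    return reads, writes


def speculative_run(tasks, memory):
    """Run tasks in parallel, commit in program order, redo any that read stale data.

    Returns the number of aborted (re-executed) tasks.
    """
    snapshot = dict(memory)        # every speculative task starts from the same state
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda t: run_speculatively(t, snapshot), tasks))

    dirty = set()                  # keys written by tasks that already committed
    aborts = 0
    for task, (reads, writes) in zip(tasks, results):
        if reads & dirty:
            # Abort: an earlier task wrote something this one read. Re-execute
            # against the committed state so the result matches sequential order.
            aborts += 1
            reads, writes = run_speculatively(task, memory)
        memory.update(writes)      # commit
        dirty.update(writes)
    return aborts
```

Independent tasks commit straight away, and only the dependent ones pay for a second run. Note that in CPython the global interpreter lock means this sketch shows the control flow only, not an actual speedup.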
HTM the AMD way (over 100% speedup possible on that "test")
http://llvm.org/pubs/2010-04-EUROSYS-DresdenTM.pdf
The beast (can't remember if this was posted here already)
http://translate.google.com/translate?langpair=auto|en&u=http%3A%2F%2Fdiybbs.zol.com.cn%2F11%2F11_106489.html
(EDIT: Original link
http://diybbs.zol.com.cn/11/11_106489.html , the Google Translate link seems to break on this site)
4x speedup over the performance of an FX8150 on integer... and a FlexFPU based on GCN compute cores on those modules, for a new kind of APU... AVX kind of code can see almost a 9x speedup lol... based on what could be an HTM-based speculative multithreading uarch (spMT).
(probably based on simulations, but it was not April 1st and the piece says it was an AMD presentation in Beijing)
Fake? Pipe dreams? Then academia has been full of those for a long time, meaning none of those ideas is far-fetched or has never been thought of before.
(EDIT 2: if HTM-based and spMT, this could have advantages on a 64-bit ARM uarch)