juanrga :
Each Steamroller _module_ contains three FPU units and two integer units. But I am discussing something different. I am asking AMD if each _integer unit_ has 2 or 3 ALUs. Hot Chips 2012 showed two integer units with 2 ALUs each (4 ALUs per module). Sep Kaveri talk showed two integer units with 3 ALUs each (6 ALUs per module).
It's all in the code there. A module has 2 bdver3 cores. The FP units are shared, of course, but each core can access them; it just might have to wait.
These are your 2 ALUs.
(define_cpu_unit "bdver3-ieu0" "bdver3_ieu")
(define_cpu_unit "bdver3-ieu1" "bdver3_ieu")
(define_reservation "bdver3-ieu" "(bdver3-ieu0|bdver3-ieu1)")
These are your 2 AGUs.
(define_cpu_unit "bdver3-agu0" "bdver3_agu")
(define_cpu_unit "bdver3-agu1" "bdver3_agu")
(define_reservation "bdver3-agu" "(bdver3-agu0|bdver3-agu1)")
Together those make up the typical 4 integer pipelines you see in the block diagrams: 2 ALUs plus 2 AGUs per core. Same as Bulldozer.
Further down you can see how many clock cycles each operation takes. That's where you can make a finer-grained comparison between architecture changes.
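To give a rough idea of what those latency entries look like, here is a made-up one in the same syntax. It is not copied from the file: the name "example_load_op" and the 4-cycle latency are placeholders I picked for illustration, and only the two reservations quoted above are reused.

```lisp
;; Hypothetical entry for illustration only, NOT taken from bdver3.md.
;; The number after the name is the latency in clock cycles (4 is a placeholder).
(define_insn_reservation "example_load_op" 4
  ;; Condition: which instructions this entry applies to.
  (and (eq_attr "cpu" "bdver3")
       (and (eq_attr "type" "alu")
            (eq_attr "memory" "load")))
  ;; Units used: either AGU in the first cycle, then either ALU in the next.
  "bdver3-agu,bdver3-ieu")
```

In the reservation string a comma means "next cycle", and the `|` inside the `define_reservation` lines above means "either of these units". So when you compare two cores, the things to look at are the latency number and which units each instruction class ties up.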
