IPC is the inverse of CPI, where CPI = Σ(CPI_i)(IC_i)/IC, summing over each instruction class i (CPI_i is the cycles per instruction for class i, IC_i is the number of instructions in that class, and IC is the total instruction count).
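To make that formula concrete, here is a minimal sketch with two made-up instruction classes (the class names, counts, and cycle costs are illustrative assumptions, not real CPU data):

```python
# Sketch of the textbook CPI/IPC relationship with hypothetical numbers.
# Each entry is (CPI_i, IC_i): cycles per instruction and count for class i.
classes = [(1, 800),   # e.g. ALU ops: 1 cycle each, 800 executed
           (3, 200)]   # e.g. memory ops: 3 cycles each, 200 executed

total_instructions = sum(ic_i for _, ic_i in classes)            # IC = 1000
total_cycles = sum(cpi_i * ic_i for cpi_i, ic_i in classes)      # Σ(CPI_i)(IC_i) = 1400

cpi = total_cycles / total_instructions   # 1.4 cycles per instruction
ipc = total_instructions / total_cycles   # ≈ 0.714, the inverse of CPI
```

Note that IPC × CPI = 1 by construction; the two are just reciprocal views of the same cycle count.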
Attempting to leave this out and claim IPC = performance/clock only works when comparing identical architectures. Do you know the program's value of IC?
Let me demonstrate.
Say you have a program with millions of lines of code, some sabotage happens, and you get if cpuid == "intel" { b++; };
Let's say this is hidden and gets compiled. How would any end user be able to detect this anomaly? And how does it affect IPC = performance/clock?
Say this line runs once per loop iteration because it is embedded in some animation call, and each iteration is 10 instructions long. That means for every 10 instructions, you get 11 instructions to complete if your name is Intel.
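A toy model of that hidden branch (the vendor strings, the 10-instruction loop body, and the helper name are all illustrative; real vendor detection would query the CPUID instruction for a string like "GenuineIntel"):

```python
# Toy model of the hypothetical sabotage: the same loop body retires one
# extra instruction per iteration when the vendor string matches.
def instructions_per_iteration(vendor, base_instructions=10):
    b = 0
    if vendor == "intel":   # the hidden branch from the example above
        b += 1              # one extra instruction retired per iteration
    return base_instructions + b

amd_count = instructions_per_iteration("amd")      # 10 instructions
intel_count = instructions_per_iteration("intel")  # 11 instructions
```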
So IPC = IC/Σ(CPI_i)(IC_i). For the sake of argument, let's say this is a very simple program where every instruction takes 10 clocks, regardless of type.
For AMD: 10 instructions × 10 clocks = 100 cycles per loop, so IPC = 10/100 = 0.1.
For Intel: 11 instructions × 10 clocks = 110 cycles per loop, so IPC = 11/110 = 0.1.
That is the instruction stream at the software level, and the hardware IPC comes out identical. Now let's see what happens to performance at the hardware level.
Say both CPUs are capable of running this loop at their peak, 3 GHz.
AMD: 100 cycles / 3 GHz ≈ 33.3 ns per loop.
Intel: 110 cycles / 3 GHz ≈ 36.7 ns per loop.
Because this program artificially introduced extra instructions, even though the hardware IPC is 100% identical, it runs about 10% slower when "Intel" is detected.
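The arithmetic above can be re-checked directly (assumed numbers: 10 cycles per instruction, a 3 GHz clock, and 10 vs. 11 instructions per loop iteration):

```python
# Re-checking the worked example: identical hardware IPC, different wall time.
CPI = 10        # every instruction takes 10 clocks in this toy program
FREQ_HZ = 3e9   # both CPUs running at 3 GHz

def loop_stats(instructions):
    cycles = instructions * CPI           # total cycles per loop iteration
    ipc = instructions / cycles           # hardware IPC
    time_ns = cycles / FREQ_HZ * 1e9      # wall time per loop iteration
    return ipc, time_ns

amd_ipc, amd_ns = loop_stats(10)      # 0.1 IPC, ~33.3 ns per loop
intel_ipc, intel_ns = loop_stats(11)  # 0.1 IPC, ~36.7 ns per loop

slowdown = intel_ns / amd_ns - 1      # 10% slower despite identical IPC
```

The extra instruction changes neither chip's IPC here, only the amount of work per loop, which is exactly why time per loop diverges.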
Now do you see why you need to know the instruction count before you can equate software-measured performance per clock with hardware IPC, instead of just stating performance = IPC × frequency?
Now for the non-theoretical part: this is happening IRL. Just scroll up.
P.S. Ignore any syntax problems; I know it's not correct, it's just simple enough to grasp what is being done.
For users and purchasers of a computer system, instructions per clock is not a particularly useful indication of the performance of their system.