Question: What is the relationship between worst-case execution time (WCET) and CPU utilization on a single-core processor?

Jan 9, 2022
I am working on a single-core processor, i.e. a Raspberry Pi Zero. I want to know how the worst-case execution time of tasks executing on the Pi (assuming sequential execution of tasks on the single core) is related to CPU utilization and/or core frequency.
Any possible article or reference would help!

From the 'perf' command in Linux, I obtained the following values for a piece of code (an application); a rough sketch of how these values relate to each other follows the list:
  1. Core frequency at which the code was run
  2. Time elapsed (ms) for the whole code to execute
  3. L1 dcache loads, percentage of L1 dcache loads, and L1 dcache loads in M/sec
  4. L1 dcache load misses, percentage of L1 dcache load misses among all hits
  5. L1 dcache store misses, percentage of L1 dcache store misses, and L1 dcache store misses in M/sec
  6. L1 dcache stores, percentage of L1 dcache stores, and L1 dcache stores in M/sec
  7. L1 icache load misses, percentage of L1 icache load misses
  8. Number of cycles required to execute the code, percentage of cycles required at a particular core frequency (GHz)
  9. Total number of instructions, percentage of total instructions, IPC
  10. CPU clock (ms), fraction of the CPU utilised
  11. Branches, percentage of branches
  12. dTLB load misses, percentage of dTLB load misses
  13. dTLB store misses, percentage of dTLB store misses
  14. iTLB load misses, percentage of iTLB load misses
  15. Stalled cycles, front end; stalled cycles, back end
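Roughly, these numbers relate to each other as in the sketch below (the counter values are made up, and the exact perf event names depend on the kernel and PMU):

```python
# Minimal sketch with made-up counter values of how the numbers above relate.
# Collected with something like:  perf stat -e cycles,instructions,task-clock -- ./app
# (exact event names depend on the kernel and PMU).

cycles        = 8.2e9      # "number of cycles required to execute the code"
instructions  = 5.1e9      # "total number of instructions"
task_clock_ms = 82_000.0   # "CPU clock (ms)" -- time the core was busy on the task
elapsed_ms    = 102_000.0  # "time elapsed" -- wall-clock duration of the run
freq_hz       = 100e6      # core frequency the run was pinned to

ipc         = instructions / cycles        # instructions per cycle
utilization = task_clock_ms / elapsed_ms   # fraction of the single core used
busy_time_s = cycles / freq_hz             # CPU time implied by cycles / frequency

print(f"IPC ~ {ipc:.2f}")
print(f"CPU utilization ~ {utilization:.1%}")
print(f"busy time ~ {busy_time_s:.1f} s of {elapsed_ms / 1000:.0f} s elapsed")
```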
 

kanewolf (Moderator)
I am working on a single-core processor, i.e. a Raspberry Pi Zero. I want to know how the worst-case execution time of tasks executing on the Pi (assuming sequential execution of tasks on the single core) is related to CPU utilization and/or core frequency.
Any possible article or reference would help!
There are a lot of things to consider: the time required for a context switch, how much data has to be loaded into cache, the speed of storage. Even your choice of OS will matter. I don't know if a value can be easily determined.
 
Jan 9, 2022
There are a lot of things to consider: the time required for a context switch, how much data has to be loaded into cache, the speed of storage. Even your choice of OS will matter. I don't know if a value can be easily determined.
Considering all of those factors as variables, how can we then sum them up in equation form? I want an idea of the relationship the two factors share!
 
Jan 9, 2022
Start simple: research how many clock cycles it takes to make a context switch.
I have no clue what that would be.
Your basic summation will be clock cycles.
Let's say I run a program on the processor at an X MHz core frequency and then measure the clock cycles required to make a context switch. What would that say about the relationship between WCET and CPU utilization?
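As a rough way to write that summation down (all the numbers below are hypothetical, and it ignores caches, memory stalls and everything else the OS does):

```python
# Hypothetical back-of-the-envelope sum -- all numbers below are made up.
# CPU time    = (application cycles + context-switch overhead) / core frequency
# Utilization = CPU time / wall-clock time of the run

freq_hz       = 100e6   # "X MHz" core frequency
app_cycles    = 8.0e9   # cycles the application itself needs
switch_cycles = 5_000   # cycles per context switch (would have to be measured)
n_switches    = 2_000   # times the scheduler switched away from the task and back
elapsed_s     = 102.0   # wall-clock duration of the run, measured separately

cpu_time_s  = (app_cycles + n_switches * switch_cycles) / freq_hz
utilization = cpu_time_s / elapsed_s

print(f"CPU time ~ {cpu_time_s:.1f} s, CPU utilization ~ {utilization:.1%}")
```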
 
Jan 9, 2022
The mix of instructions used in the app can vary the number of clock cycles required.
Can you test a sample run and time it?
A single-threaded app should run at 100% CPU utilization unless some sort of I/O is required.
Yes, I have data with me from running a short application that doesn't require any I/O at different core frequencies, along with its relative duration.
 
Jan 9, 2022
Yes, I have data with me from running a short application that doesn't require any I/O at different core frequencies, along with its relative duration.
For example, at 100 MHz the CPU utilization was 80.724% and the total execution time, as recorded in practice, was 102 seconds.
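Working backwards from those two numbers, purely as a sketch (and assuming the cycle count stays roughly constant across frequencies, which real memory stalls will violate):

```python
# Just the arithmetic from the run above: 100 MHz, 80.724% utilization, 102 s
# wall-clock. Assumes the cycle count stays roughly constant across frequencies,
# which ignores memory stalls that do not scale with the core clock.

freq_hz     = 100e6
utilization = 0.80724
elapsed_s   = 102.0

cpu_time_s = utilization * elapsed_s    # time the core actually spent on the task
cycles     = cpu_time_s * freq_hz       # ~cycles consumed at 100 MHz

for f in (100e6, 250e6, 500e6, 1000e6):
    est_s = cycles / f                  # naive estimate of CPU time at frequency f
    print(f"{f / 1e6:6.0f} MHz -> ~{est_s:6.1f} s of CPU time (plus overheads)")
```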
 

kanewolf (Moderator)
Let's say I run a program on the processor at an X MHz core frequency and then measure the clock cycles required to make a context switch. What would that say about the relationship between WCET and CPU utilization?
I assume you read the Wikipedia article on WCET -- https://en.wikipedia.org/wiki/Worst-case_execution_time -- and the references at the bottom.
WCET is more of a real-time QoS type of measurement. It doesn't sound like your
total execution time, as recorded in practice, was 102 seconds
is very "real time". If you were worried about servicing a GPIO pin in less than 1/20 of a second, WCET would be the more appropriate measure.
A 100+ second execution doesn't seem like an appropriate test case.
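As a sketch of that real-time framing (the response-time samples below are invented; only the 50 ms budget comes from the example above):

```python
# Sketch of the real-time framing: compare an observed worst response time to a
# 50 ms deadline. The response-time samples here are invented.

deadline_s = 1 / 20                                  # 50 ms budget for the GPIO pin
samples_s  = [0.012, 0.018, 0.009, 0.041, 0.015]     # measured response times (hypothetical)

observed_worst = max(samples_s)                      # only a lower bound on the true WCET
verdict = "within" if observed_worst <= deadline_s else "misses"
print(f"observed worst case: {observed_worst * 1000:.1f} ms ({verdict} the 50 ms deadline)")
```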
 
Jan 9, 2022
The mix of instructions used in the app can vary the number of clock cycles required.
Can you test a sample run and time it?
A single-threaded app should run at 100% CPU utilization unless some sort of I/O is required.
I have edited the question and provided all the details that I have on this processor!
 
If I understand this question correctly, what you're trying to do is take some real-time value and map it to CPU utilization (%) or clock speed, to come up with a formula that perfectly, or nearly perfectly, determines the execution time of some task.

I believe this is a pointless endeavor, because both the software (the OS) and the hardware (the CPU and its various features) make finding a deterministic value to a near-perfect degree practically impossible. You have to contend with:
  • Software:
    • The OS keeping your task on the CPU
    • Nothing interrupting the task (which is kind of impossible to have a handle on)
    • If you're using software built with a normally interpreted or JIT compiled language, initial runs of it will typically be slower than later runs.
    • If you have an I/O access, then you're basically screwed for any sort of predictability.
  • Hardware:
    • Can the application reside entirely in cache? Will the CPU even keep it there?
    • If you have an out-of-order execution processor, the retire stage of the pipeline may vary depending on how many things were done out of order
    • If there's any sort of branch prediction, then branches in your code may have an unpredictable impact on execution time
Those are off the top of my head. In any case, there are a lot of factors that can add time and that are beyond your control. At best, all you can do is time enough runs to build a model of how the thing behaves that you have high confidence in.
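A minimal sketch of that last step, assuming you already have a list of measured run times (the values below are placeholders):

```python
# Sketch: summarize many timed runs of the same task. The run times below are
# placeholders -- in practice they would come from a measurement harness.
import statistics

run_times_s = [101.8, 102.3, 101.9, 104.7, 102.1, 103.5, 102.0, 102.2]

mean_s  = statistics.mean(run_times_s)
worst_s = max(run_times_s)             # observed worst case; the true WCET may be higher
spread  = statistics.stdev(run_times_s)

print(f"mean {mean_s:.1f} s, stdev {spread:.2f} s, observed worst {worst_s:.1f} s")
```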
 
The main issue here is that even for a single-threaded task, you still have an OS managing everything under the hood. So depending on your current/past workloads, your performance could change by quite a bit (percentage-wise, at least). The biggest factor for a simple workload is probably whether everything is already cached or not, but depending on what you are doing (I/O, data, etc.), other components (HDD/RAM especially) start to become factors.

Honestly, this is something that's hard to really define; there are too many permutations to consider. What we typically do is just run the program a number of times, varying how we do so (e.g. reboots between runs? run other apps first?), and look at the "worst case" results to get an idea of what worst-case execution looks like. It's still a ballpark number, but better than nothing.
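As a sketch, that kind of repeated measurement can be as crude as shelling out to the program in a loop; "./app" below is a placeholder for whatever you are timing, and varying the surrounding conditions between runs is still up to you:

```python
# Crude "run it many times, keep the worst" harness. "./app" is a placeholder;
# cache state, other processes, etc. still vary between runs in ways this
# script does not control.
import subprocess
import time

durations = []
for _ in range(20):
    start = time.perf_counter()
    subprocess.run(["./app"], check=True)     # placeholder for the program under test
    durations.append(time.perf_counter() - start)

print(f"best {min(durations):.3f} s, worst {max(durations):.3f} s over {len(durations)} runs")
```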
 
If you really want to dig down and do some analysis on the code that you're running, you can try to get an assembly dump of what the program is executing and then, using ARM's documentation, map each instruction to the number of clock cycles it takes and add it all up.
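A sketch of that bookkeeping, assuming a disassembly from objdump -d and a hand-made (here invented) cycle table from the core's technical reference manual; it is a static, one-pass count that ignores loops, pipeline effects and caches:

```python
# Sketch: sum rough per-instruction cycle costs over an objdump disassembly.
# CYCLES is a hand-made, incomplete table -- the real numbers come from the
# core's technical reference manual -- and the count is static: it ignores
# loops, pipelines, caches, and branch behaviour.
import re
import subprocess

CYCLES = {"mov": 1, "add": 1, "sub": 1, "cmp": 1, "ldr": 3, "str": 2, "b": 3, "bl": 3}

disasm = subprocess.run(["objdump", "-d", "./app"],          # "./app" is a placeholder
                        capture_output=True, text=True, check=True).stdout

total = 0
for line in disasm.splitlines():
    # objdump lines look like: "   10078:  e59f0010   ldr  r0, [pc, #16]"
    m = re.match(r"\s*[0-9a-f]+:\s+[0-9a-f]+\s+(\w+)", line)
    if m:
        total += CYCLES.get(m.group(1).lower(), 1)           # default to 1 cycle if unknown

print(f"naive static estimate: {total} cycles (one pass over the listing, loops not counted)")
```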

Of course, this only accounts for the program itself running on the CPU. Again, there are going to be a million other things affecting its actual run time.