Because GPUs have far more processing elements than CPUs, in some comparisons more than 10 times as many.
The best CPUs have 28 or 32 cores, each with 16 or 32 processing elements, which comes to roughly 1024 at most (32 × 32).
The best GPUs have 4096 or 5120 processing elements.
Doing 5120 multiplications every clock cycle is harder than doing 1024 multiplications every clock cycle. So GPU designers relax each cycle by lowering the frequency, which keeps the strain on the transistors bearable.
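As a rough sketch of that trade-off, the arithmetic below compares idealized peak multiplications per second for the two designs. The element counts are the ones given above; the clock frequencies (4 GHz for the CPU, 1.5 GHz for the GPU) are illustrative assumptions, not measurements of any specific chip.

```python
# Idealized peak rate = processing elements x clock frequency
# (one multiplication per element per cycle).
# Element counts come from the text; the frequencies are ASSUMED for illustration.

def peak_ops_per_second(processing_elements: int, frequency_hz: float) -> float:
    """Peak operations per second for a chip issuing one op per element per cycle."""
    return processing_elements * frequency_hz

CPU_ELEMENTS = 32 * 32   # 32 cores x 32 processing elements each
GPU_ELEMENTS = 5120
CPU_FREQ_HZ = 4.0e9      # assumed CPU clock
GPU_FREQ_HZ = 1.5e9      # assumed GPU clock

cpu_peak = peak_ops_per_second(CPU_ELEMENTS, CPU_FREQ_HZ)
gpu_peak = peak_ops_per_second(GPU_ELEMENTS, GPU_FREQ_HZ)
ratio = gpu_peak / cpu_peak

print(f"CPU peak: {cpu_peak:.3e} ops/s")
print(f"GPU peak: {gpu_peak:.3e} ops/s")
print(f"GPU / CPU: {ratio:.2f}x")
```

Under these assumed clocks the GPU still comes out ahead on peak rate despite running at well under half the frequency, because the element count more than makes up for it.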
Each GPU processing element works with 40–100 threads at once. Each CPU processing element works with only 1 or 2 threads concurrently. This means the GPU also puts stress on its non-processing parts (instruction fetch, data fetch, texture fetch, result writeback, assorted bookkeeping calculations, and so on).
If one processing element can’t reach a given frequency, the whole GPU is stuck at the lower frequency that element can manage (because of shared power distribution or the lack of fine-grained, per-core dynamic boost). So having fewer processing elements makes it easier to reach a higher frequency across the whole package.
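The "slowest element sets the pace" argument can be sketched with a toy model. Everything numeric here is made up for illustration: each element's maximum stable frequency is drawn from a normal distribution (mean 4.0 GHz, standard deviation 0.2 GHz), and the chip is assumed to run at the minimum over its elements. Real silicon variation and binning are more complicated than this.

```python
import random

def chip_frequency_ghz(element_limits):
    """Toy model: the whole chip runs at the speed of its slowest element."""
    return min(element_limits)

rng = random.Random(0)  # fixed seed so the run is repeatable

# ASSUMED distribution of per-element frequency limits (illustrative numbers only).
limits = [rng.gauss(4.0, 0.2) for _ in range(5120)]

# A chip built from the first 1024 elements vs. one that uses all 5120.
small_chip = chip_frequency_ghz(limits[:1024])
large_chip = chip_frequency_ghz(limits)

print(f"1024 elements: {small_chip:.2f} GHz")
print(f"5120 elements: {large_chip:.2f} GHz")
```

Because the larger chip contains every element of the smaller one plus 4096 more, its slowest element can only be as slow or slower, so its frequency in this model can never be higher.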
A GPU is built for throughput. It cares more about square roots per second than about finishing a single square root sooner. It gets there by hiding the latency of each square root behind other work, such as global memory data fetches, which also have high latency.
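Here is a minimal sketch of latency hiding, using a made-up model rather than any real GPU scheduler: one pipelined unit can start at most one operation per cycle, and each thread has to wait `latency` cycles for its previous result before it can issue its next operation. The 20-cycle latency and the thread counts are assumptions chosen to keep the numbers small.

```python
def cycles_to_finish(num_threads: int, ops_per_thread: int, latency: int) -> int:
    """Cycles until every operation has completed on one pipelined unit.

    The unit starts at most one operation per cycle. A thread may only
    issue its next operation once its previous one has finished.
    """
    ready_at = [0] * num_threads               # cycle when each thread may issue again
    remaining = [ops_per_thread] * num_threads
    cycle = 0
    finish = 0
    while any(remaining):
        for t in range(num_threads):
            if remaining[t] and ready_at[t] <= cycle:
                remaining[t] -= 1
                ready_at[t] = cycle + latency  # result arrives `latency` cycles later
                finish = max(finish, ready_at[t])
                break                          # only one issue per cycle
        cycle += 1                             # idle cycles pass here if no thread was ready
    return finish

LATENCY = 20          # assumed latency of one square root, in cycles
OPS_PER_THREAD = 100

for threads in (1, 5, 20):
    total_ops = threads * OPS_PER_THREAD
    cycles = cycles_to_finish(threads, OPS_PER_THREAD, LATENCY)
    print(f"{threads:2d} threads: {total_ops} ops in {cycles} cycles "
          f"({total_ops / cycles:.2f} ops/cycle)")
```

Each individual square root still takes the same 20 cycles no matter how many threads there are. What changes is how many of them are in flight at once, which is why throughput climbs toward one operation per cycle as the thread count approaches the latency.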
Of course, you can put a GPU under liquid nitrogen cooling to reach some CPU frequencies, but power efficiency is very bad at that point, and it should be done only when the world is being invaded by Martians and calculation speed is a matter of life and death. Now that I mention it, “life” science depends on throughput, so today's GPUs can outperform all CPUs (sometimes by an order of magnitude), even at half the frequency. Even gaming is made possible solely by GPUs running at half the frequency. CPUs can’t handle gaming by themselves, even at 10 GHz.