Whether or not your algorithm extensively exercises a new architecture's deeper re-order buffers, wider execution units, etc., the extra transistors are still baked into the critical paths and still dictate the maximum attainable clock frequency. I'm talking about specific benchmarks that don't need any kind of upstream or downstream complexity, where you can just feed all the instructions into the CPU and the output needs no sorting.
Since one of the things Intel did with Ice Lake is double the throughput of some AVX instructions, workloads that lean heavily on those instructions will obviously see substantial gains. For more typical workloads, though, the IPC improvement barely offsets the clock frequency deficit.
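To put rough numbers on that trade-off, here is a back-of-the-envelope sketch that treats single-thread throughput as roughly IPC times clock frequency. Every figure in it (the clock speeds, the IPC ratios) is a made-up placeholder for illustration, not a measurement of Ice Lake or any other real chip, and `relative_throughput` is just a helper name I picked.

```python
def relative_throughput(ipc_ratio: float, clock_ratio: float) -> float:
    """Throughput of the new chip relative to the old one.

    Simplified model: instructions per second = IPC * clock frequency,
    so the relative throughput is the product of the two ratios.
    """
    return ipc_ratio * clock_ratio


# Hypothetical placeholder values, NOT measurements of any real CPU.
OLD_CLOCK_GHZ = 5.0          # assumed peak clock of the older design
NEW_CLOCK_GHZ = 4.3          # assumed peak clock of the newer, wider design
TYPICAL_IPC_RATIO = 1.18     # assumed IPC gain on a typical workload
AVX_HEAVY_IPC_RATIO = 2.0    # assumed gain where doubled AVX throughput dominates

clock_ratio = NEW_CLOCK_GHZ / OLD_CLOCK_GHZ

typical = relative_throughput(TYPICAL_IPC_RATIO, clock_ratio)
avx_heavy = relative_throughput(AVX_HEAVY_IPC_RATIO, clock_ratio)

print(f"typical workload:   {typical:.2f}x")
print(f"AVX-heavy workload: {avx_heavy:.2f}x")
```

With these placeholder inputs the typical case comes out at about 1.01x, i.e. basically a wash, while the AVX-heavy case lands around 1.72x. Plug in whatever clocks and IPC figures you trust for the parts you are comparing; the point is only that the clock ratio multiplies everything, whether or not the workload uses the wider hardware.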