I have to run more tests, but I think A1111 + AMD GPUs under Linux is much faster than Nod.ai/Windows.
Entirely possible, though these days I suspect the difference has shrunk quite a bit. That's partly because I'm quite sure Nod.ai is heavily invested in making AMD GPUs look as good as possible. I say that for two reasons: when I last tried Nod.ai on Nvidia, the results were universally poor, and when I tried the latest A1111 instructions for running it on AMD under Windows, performance was worse than with Nod.ai.
Anyway, I'm also sure there are more optimized variants of SD than A1111 for Nvidia GPUs, especially if we wanted to start looking for tuned versions that use the FP8 mode on the tensor cores, which could potentially be an easy doubling of performance. But the good thing is that Nod.ai, OpenVINO, and the base Automatic1111 setup can all be up and running with a minimum of hassle under Windows.