The only reasonable explanation I can find for why two identical CPUs would perform differently under the same circumstances is that the RAM subtimings are different. There are a few subtimings that, if you don't keep an eye on them, can change performance slightly in certain scenarios. When you enable XMP, the motherboard gets to decide most of the subtimings, and what it picks is almost always arbitrary. Some cheaper motherboards don't even expose them. Beyond that, it's just Windows having random services running during the tests, though that's less likely in a controlled environment.
I'd say you usually don't go that deep when configuring test benches, unless you're specifically testing RAM. Happy to be wrong on this one.
Regards.