The topic of Steam hardware survey efficacy came up on one of the HUB videos recently and I thought they had a really good take. They basically viewed the data as mostly worthless due to the random nature of the survey. They also suggested a much better method would be for Valve to make it an opt-in program when you first load up Steam and register a system.
Neither of them could remember the last time they'd gotten a survey pop-up, and that got me thinking that I couldn't recall either. I did get one on the first of this year, though, which got me thinking about that video again.
If they said it was worthless because of the random nature, they don't understand statistics. My problem with it is that I worry it's not completely random. If the sampling is triggered by anything other than pure randomness, it's inherently skewed and untrustworthy.
And I often go in spurts on being sampled, which is also worrisome. I'll go for maybe five or six months without getting pinged, and then suddenly I'll get sampled on five PCs all in the same week. It's almost like they're basing the sampling on user ID rather than randomness, which again would be bad statistics.
What I do know is that the data tends to look internally consistent. That means it's not useless; we just can't extrapolate exactly how many people own various GPUs. Minor fluctuations on a monthly basis are expected if you're doing random sampling, because any random sample comes with a margin of error. Randomly sample 1% of the population ten times and you'll get ten slightly different results. But the month-to-month fluctuations from the Steam HW Survey don't even seem to match that sort of variability, so again: worrisome.
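To put rough numbers on the "sample 1% ten times" point, here's a quick Python sketch. Everything in it is made up for illustration: the population size, the 5% "true" share for some GPU, and the function names are my own assumptions and have nothing to do with how Valve actually runs the survey.

```python
import math
import random


def simulate_surveys(population_size=200_000, true_share=0.05,
                     sample_fraction=0.01, rounds=10, seed=1):
    """Draw several simple random samples from a made-up population.

    All defaults are hypothetical; they are not Steam figures.
    Returns the observed share in each round and the sample size.
    """
    rng = random.Random(seed)
    owners = round(population_size * true_share)
    # True = owns the GPU in question, False = does not.
    population = [True] * owners + [False] * (population_size - owners)
    sample_size = round(population_size * sample_fraction)

    shares = []
    for _ in range(rounds):
        # Sampling without replacement, every user equally likely.
        sample = rng.sample(population, sample_size)
        shares.append(sum(sample) / sample_size)
    return shares, sample_size


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a sampled proportion."""
    return z * math.sqrt(share * (1 - share) / sample_size)


if __name__ == "__main__":
    shares, n = simulate_surveys()
    for i, share in enumerate(shares, start=1):
        print(f"survey {i:2d}: {share:.2%}")
    print(f"expected margin of error: +/- {margin_of_error(0.05, n):.2%}")
```

With these made-up numbers (2,000 users per sample, 5% true share), the 95% margin of error works out to roughly plus or minus one percentage point, so ten honest random surveys should all land close to 5% without ever matching exactly. Swings much bigger than the margin of error for the real sample size would point to something other than sampling noise.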
The thing that strikes me as odd is why Valve even bothers to make it opt-in sampling. Given how much data Valve already has access to, it seems to me they could just have all the data, all the time, and there's not much anyone could do about it. I would wager other launchers (Epic, Ubi, EA, etc.) that don't provide any public statistics may already be gathering all the data from all the users. Valve is just being slightly kind by providing some guidance to the marketplace.
What I really wish Valve would do is provide more insight into the sampling, and also collect CPU strings and show that data as well! I'd really love to know how many people (percentage-wise) have an i9-13900K, and how many are still holding on to an old Bulldozer CPU, and stuff like that. Just grouping by clocks and core counts means we know next to nothing about the various CPUs.