News Nvidia's RTX 4090 Appears on Latest Steam Hardware Survey


KyaraM

Admirable
Mar 11, 2022
1,465
638
6,690
I've been looking at Steam's data for years. It's questionable what they're doing at times, but the numbers do stay consistent. If Steam were just using some weird sampling of only PCs whose hardware changed, as an example, we'd see major swings in the use of various parts over time. The fact that the most popular cards usually show only a slight month-to-month variation suggests that, at the least, proper random sampling is being done. That's the most important thing.

We can't give things like margin of error or confidence levels, we don't know how many PCs are sampled each month, we don't know why the GPU API pages don't sum up to 100% in each column (less in DX12/11/10, more in Vulkan), we don't know why there's a large percentage of "Other" in Vulkan but much lower in the DX lists, and we don't know the population being sampled either (all Steam users who logged in during a given month? Maybe, but maybe not). But the data does mean something on some level, if only that Nvidia GPUs remain far more popular than AMD. I'm very curious to see how long it takes for Intel Arc GPUs to start showing up. 🙃
Definitely not "people who logged in during a given month". I was on Steam a lot in December and January, but didn't get the survey. In other months, I got the survey on all my machines after logging in exactly once. So I think we can kinda rule that one out.
 

InvalidError

Titan
Moderator
Even if you used the most scientific sampling method, the scenario of getting hit back-to-back on some months and not at all on others is still possible. I get the Steam survey 3-4 times per year on the only PC I have Steam on and it has been about two months since the most recent one.
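To put rough numbers on that, here's a minimal sketch. It assumes, purely for illustration, that each PC is picked independently every month with the same probability, set here to match about 3.5 surveys per year. `P_MONTHLY`, `prob_pattern`, and the two predicates are made-up names, and nothing here reflects how Valve actually selects machines.

```python
from itertools import product

# Assumption: about 3.5 surveys per year, spread evenly over 12 months.
P_MONTHLY = 3.5 / 12


def prob_pattern(p, months, predicate):
    """Add up the probability of every hit/miss sequence matching predicate.

    Each month is an independent draw: surveyed (1) with probability p,
    skipped (0) with probability 1 - p.
    """
    total = 0.0
    for seq in product((0, 1), repeat=months):
        if predicate(seq):
            hits = sum(seq)
            total += p ** hits * (1 - p) ** (months - hits)
    return total


def has_back_to_back(seq):
    """True if two consecutive months were both surveyed."""
    return any(a and b for a, b in zip(seq, seq[1:]))


def has_two_month_gap(seq):
    """True if two consecutive months were both skipped."""
    return any(not a and not b for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    print("P(back-to-back surveys within a year):",
          round(prob_pattern(P_MONTHLY, 12, has_back_to_back), 3))
    print("P(two straight months without a survey):",
          round(prob_pattern(P_MONTHLY, 12, has_two_month_gap), 3))
```

Under those assumptions, neither back-to-back surveys nor a two-month dry spell within a year is an unusual outcome.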
 

KyaraM

Admirable
Of course it's still possible. All I said was that it's definitely not "everyone who logged in during the month".
 

JarredWaltonGPU

Senior GPU Editor
Editor
In statistics, the "population" is the whole group you want to draw conclusions about, while the PCs that actually get surveyed are the "sample." So you can be part of the population (Steam users in a given month) and rarely if ever get sampled. Generally speaking, even if you sample less than 1% of the total population, if the sampling is truly random you end up with reasonably consistent data. The margin of error depends mostly on how many PCs you sample, for whatever confidence level you pick (95% is typical). If you sample 100%, you have a 0% margin of error. If you only sample a small slice, the margin grows: around 1,000 randomly chosen PCs gets you something like a 3% margin of error at a 95% confidence level.
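For anyone who wants to play with the arithmetic, here's a rough sketch using the textbook normal-approximation formula for a proportion, with a finite population correction. It assumes simple random sampling. The population sizes are invented (Valve doesn't publish them), and `margin_of_error` is just an illustrative helper.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(sample_size, population_size=None, proportion=0.5, z=Z_95):
    """Margin of error for a proportion estimated from a simple random sample.

    proportion=0.5 is the worst case (widest margin). If population_size is
    given, the finite population correction is applied, which shrinks the
    margin as the sample approaches the whole population.
    """
    std_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None and population_size > 1:
        std_error *= math.sqrt(
            (population_size - sample_size) / (population_size - 1)
        )
    return z * std_error


if __name__ == "__main__":
    # Hypothetical populations; each one is sampled at exactly 1%.
    for population in (100_000, 10_000_000):
        sample = population // 100
        moe = margin_of_error(sample, population)
        print(f"population={population:>10,}  sample={sample:>7,}  "
              f"margin of error at 95% = {100 * moe:.2f}%")
```

With these made-up numbers, the same 1% sample gives a much tighter margin for the larger population, because the margin tracks the absolute sample size far more than the sampled fraction.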

But as I said before, we don't know the population size, and we don't know how many PCs/users are sampled, which means we can't determine the margin of error at all. Valve could, yet it chooses not to, which has always struck me as rather odd. I mean, go look at Backblaze's HDD reliability statistics. That's REAL statistical analysis, and you could pretty much automate the whole thing with Steam so that it would generate the monthly pages and have a short sentence at the end saying, "Steam sampled XXX PCs out of a population of YYY, giving a margin of error of ZZZ% and a WWW% confidence level."

Which is probably the main reason Valve doesn't do exactly that: It doesn't want to give a truly hard piece of data on how large the population is. But Valve does release a statement every couple of years about how many Steam users there are, peak concurrent users are always given, and stuff like that. And Valve doesn't actually need to provide people with any data at all, so it's kind of cool that it chooses to at least give us some insight. But if Valve listed the user base every month along with a summary of the user statistics, we'd undoubtedly have others writing, "OMG, Steam is imploding because there were 5 million fewer users this month than last month!" And then the next month: "OMG, Steam had its best month ever with 10 million more users than the previous month!"

So in summary, I bet Valve is using reasonably proper statistical methods, and the only reason it doesn't provide more insight is that doing so would allow people to work out how large the "Steam Population" is every month, and it doesn't want to provide that data.
 

InvalidError

Titan
Moderator
It doesn't strike me as odd: these statistics have no impact on the customer base, so there is neither a reason nor an obligation to disclose any of it. If I were to put myself in Valve's shoes, the main reason I'd want to publish those numbers is for developer guidance: "this is the sort of hardware we are seeing on our platform's install base, develop your Steam games accordingly."

From that point of view, Valve does have a vested interest in ensuring its statistics are reasonably representative of its entire install base.

Based on my own experience of being consistently surveyed 3-4 times per year, I'm guessing Valve keeps census data for a few months and a pretty large chunk of the total install base spanning that many months back is represented in the survey data. That would contribute quite a bit to stabilizing stats against wild month-to-month fluctuations.
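That stabilizing effect is easy to sketch. The simulation below assumes a fixed true share for one GPU, an independent random sample each month, and a published figure that pools the latest three months. The share, sample size, window, and function names are all hypothetical, and this is a guess at the mechanism, not a description of Valve's pipeline.

```python
import random
import statistics


def simulate_monthly_shares(true_share, sample_size, months, rng):
    """Observed share of one GPU in each month's independent random sample."""
    shares = []
    for _ in range(months):
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        shares.append(hits / sample_size)
    return shares


def pooled_shares(shares, window):
    """Average each month with the previous window - 1 months.

    A plain average equals pooling the raw samples here because every
    month has the same sample size.
    """
    return [
        sum(shares[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(shares))
    ]


def month_to_month_swing(shares):
    """Standard deviation of the change from one month to the next."""
    return statistics.pstdev([b - a for a, b in zip(shares, shares[1:])])


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    raw = simulate_monthly_shares(
        true_share=0.05, sample_size=2_000, months=120, rng=rng
    )
    pooled = pooled_shares(raw, window=3)
    print("swing, single-month samples:", month_to_month_swing(raw))
    print("swing, 3-month pooled data: ", month_to_month_swing(pooled))
```

In this toy setup, pooling three equal-sized months should cut the typical month-to-month swing to roughly a third of the single-month figure.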
 

KyaraM

Admirable
I understand statistical analysis and random sampling; it was part of my university education. I misread your post as saying "everyone who logged in during a given month gets sampled" rather than "x% of all people online during a month", and pointed out that it can't be the former. That's all.
 

JarredWaltonGPU

Senior GPU Editor
Editor
The rest of my post points out why Valve likely doesn't give additional details. So "odd" is maybe more "unfortunate" than truly something I don't get. Because you know that if there were stats saying exactly how many people used Steam in a given month, Valve would get all sorts of additional scrutiny.
 
