News Intel’s 5.0 GHz Core i9-9900KS Ships Next Month, Cascade-Lake X Offering 2x Perf-Per-Dollar

Actually, what's really on offer here is the binning and speed gating required to deliver an all-core 5GHz processor. This means that this i9 series processor likely has some untapped overclocking potential, IMO.

Though, Intel hasn't improved their single-core boost clock speeds with this new iteration of the i9-9900K.
 
BTW, I assume these CPUs are based on the LGA2066 socket, offering more performance per dollar than the existing Skylake-X architecture (it does not state if we are looking at 9th Gen Refresh or 7th Gen).

I think this processor has an iGPU on board, IMO. Nothing new or special about this whole release though.....
 

GetSmart

BTW, I assume these CPUs are based on the LGA2066 socket, offering more performance per dollar than the existing Skylake-X architecture (it does not state if we are looking at 9th Gen Refresh or 7th Gen).

I think this processor has an iGPU on board, IMO. Nothing new or special about this whole release though.....
That Intel Core i9-9900KS is for the LGA1151 socket (the mainstream desktop platform). The other Intel Cascade Lake-X CPUs in that chart are most likely for the LGA2066 socket (HEDT platforms). Looking at those bars, we can anticipate big price cuts for Intel's HEDT CPUs (notice that it is performance per dollar). For example, look at the HWBOT benchmark rankings for 12 cores: never mind the new AMD Ryzen 9 3900X entries, you can clearly see that the Intel Core i9-9920X was not popular (only 2 entries). So this could well be Intel's new response to AMD's HCC (high core count) CPUs.
 
Ok, so the headline is actually two different stories. Because what I read was that Intel was slashing the price of the 9900K in half with the KS model and the KS would be a new core family... which would be AMAZING, but SUPER uncharacteristic of Intel. So, new 9900KS is going to just be another 9900K but factory overclocked, and Intel HEDT is going to get closer to more realistic pricing, I guess that part is good.
 
The tide has turned.
What AMD was doing to Intel before, Intel is now doing to AMD. LOL!

yeah I also find it extremely ironic
When I bought my i7 4790K back in 2015, I chose it because it offered so much more performance while still drawing much less power than any CPU AMD had to offer

Now I hope AMD will do the same to Nvidia
offering a cheaper GPU while drawing much less power
like rtx 2080Ti performance for 700 usd at 150W :D:rolleyes:
 
That Intel Core i9-9900KS is for the LGA1151 socket (the mainstream desktop platform). The other Intel Cascade Lake-X CPUs in that chart are most likely for the LGA2066 socket (HEDT platforms). Looking at those bars, we can anticipate big price cuts for Intel's HEDT CPUs (notice that it is performance per dollar). For example, look at the HWBOT benchmark rankings for 12 cores: never mind the new AMD Ryzen 9 3900X entries, you can clearly see that the Intel Core i9-9920X was not popular (only 2 entries). So this could well be Intel's new response to AMD's HCC (high core count) CPUs.

I would love to see their HEDT platform CPUs drop in price. But I also think it means they will cut the lower core counts, because if they cut prices almost in half, the 8-core i9 would be cheaper than the 9900K. So my best guess would be that they are going to up the minimum core count and cut prices, but it will still be a higher entry cost than a top-end mainstream CPU.
 

chaz_music

And unlike the Ryzen CPUs that Intel is comparing to, Intel STILL sells their mainstream CPUs that don't support ECC memory. There is no reason for that anymore, and Intel should be ashamed.

Data corruption is very common, even with perfectly fine hardware. Google showed this in a memory study in 2014 in their server farms. There were a high number of ECC triggers over a 3-5 month period during the study, which took much root cause analysis to figure out. They found that the ECC triggers occurred at the same time as very high solar flares and sunspots, which causes a ton of Gamma Rays. Lightning can do the same thing by creating emag fields and small amounts of Gamma and X-rays. It is not caused by failing hardware but the consequence of natural disturbances.

I know that many people are into overclocking, and ECC functionality does add a slight amount of added time delay, so for enthusiasts, the BIOS should allow disabling ECC. Simple.

Now come on Intel - stop forcefeeding your customers heavy handed tactics to make them buy Xeon processors. That crap only lasts until your monopoly stops. Which it has.
 
And unlike the Ryzen CPUs that Intel is comparing to, Intel STILL sells their mainstream CPUs that don't support ECC memory. There is no reason for that anymore, and Intel should be ashamed.

Data corruption is very common, even with perfectly fine hardware. Google showed this in a memory study in 2014 in their server farms. There were a high number of ECC triggers over a 3-5 month period during the study, which took much root cause analysis to figure out. They found that the ECC triggers occurred at the same time as very high solar flares and sunspots, which causes a ton of Gamma Rays. Lightning can do the same thing by creating emag fields and small amounts of Gamma and X-rays. It is not caused by failing hardware but the consequence of natural disturbances.

I know that many people are into overclocking, and ECC functionality does add a slight amount of added time delay, so for enthusiasts, the BIOS should allow disabling ECC. Simple.

Now come on Intel - stop forcefeeding your customers heavy handed tactics to make them buy Xeon processors. That crap only lasts until your monopoly stops. Which it has.

Short of professionals and servers, what use would ECC be for the mainstream? The same mainstream that is vastly made up of common people who don't even know what DDR is much less ECC?

The amount of people that need ECC outside of servers is small enough to not even be a blip on the radar.
 

Arbie

Thank you AMD. Not that I'll be buying Intel, but it's nice to see them forced to offer better and better value.

Now back to my 2C/2T laptop that could and should have been twice as powerful, had Intel not fleeced us for so many years.
 

chaz_music

Short of professionals and servers, what use would ECC be for the mainstream? The same mainstream that is vastly made up of common people who don't even know what DDR is much less ECC?

The amount of people that need ECC outside of servers is small enough to not even be a blip on the radar.

Ohhh, no sir. Maybe this is true in your world, but that is not true from my experience. My mother hated having computer crashes, but did not care that it was a memory issue. You are essentially saying that she should be OK with her computer crashing. So to expect her to know about ECC is a rather nerdy viewpoint. You are used to understanding complex issues and expect others to know as well.

When your grandmother or any other non-techie can use a computer for email, it is truly just an appliance. I don't think that the common user should know what ECC or DDR is. It should be built in, like ABS brakes or airbags in your car. This is my point: reliability should not be optional. Steve Jobs is quoted as saying that "It is not the consumers' responsibility to know what they want". I would not expect a PC buyer to know what they want. But I am quite sure that they expect it to be reliable, and it is not that difficult to make that happen.

ECC is not new (this is 2019, BTW), and it is actually very inexpensive to implement. The cost overhead that the memory manufacturers charge is due to heavy handed tactics to keep profits and prices up, and the much lower volume of ECC memory sales. In terms of IC development and silicon real estate, adding USB3.0 costs way more than ECC circuitry.

Ever had Windows give you a BSOD? Yes it is often caused by Windows, but it has been found that many of the crashes are just data corruption. As an engineer, I know that these come from a host of sources: memory, CPU, HDD, and chipsets. We have RAID for HDDs, and file systems with ReFS and ZFS. The CPUs already have error detection and correction for several internal soft errors.

My last 3 home PCs have ECC. My NAS has ECC. My work PC has ECC. The computer that I set up for my sister has ECC. My viewpoint is that ECC is for anyone who isn't casually playing only games on their computer and would like to have "no crashes" or data loss.

IMO.

-Charles
 
Ohhh, no sir. Maybe this is true in your world, but that is not true from my experience. My mother hated having computer crashes, but did not care that it was a memory issue. You are essentially saying that she should be OK with her computer crashing. So to expect her to know about ECC is a rather nerdy viewpoint. You are used to understanding complex issues and expect others to know as well.

When your grandmother or any other non-techie can use a computer for email, it is truly just an appliance. I don't think that the common user should know what ECC or DDR is. It should be built in, like ABS brakes or airbags in your car. This is my point: reliability should not be optional. Steve Jobs is quoted as saying that "It is not the consumers' responsibility to know what they want". I would not expect a PC buyer to know what they want. But I am quite sure that they expect it to be reliable, and it is not that difficult to make that happen.

ECC is not new (this is 2019, BTW), and it is actually very inexpensive to implement. The cost overhead that the memory manufacturers charge is due to heavy handed tactics to keep profits and prices up, and the much lower volume of ECC memory sales. In terms of IC development and silicon real estate, adding USB3.0 costs way more than ECC circuitry.

Ever had Windows give you a BSOD? Yes it is often caused by Windows, but it has been found that many of the crashes are just data corruption. As an engineer, I know that these come from a host of sources: memory, CPU, HDD, and chipsets. We have RAID for HDDs, and file systems with ReFS and ZFS. The CPUs already have error detection and correction for several internal soft errors.

My last 3 home PCs have ECC. My NAS has ECC. My work PC has ECC. The computer that I set up for my sister has ECC. My viewpoint is that ECC is for anyone who isn't casually playing only games on their computer and would like to have "no crashes" or data loss.

IMO.

-Charles

Short of issues caused by my tinkering, such as overclocking, I have never had a BSoD on any of my home PCs that I have built for myself or family. My grandmother currently uses an Intel NUC that I built and mounted to a monitor and it has been working great.

If a system is having crashes ECC is not some end all be all magic solution that can fix said crashes. It is there to ensure data integrity in data critical operations.

The vast majority of people do not use RAID either, and those in the mainstream who do typically use RAID 0, even though it's pointless with SSDs these days. RAID 1 only protects against hardware-level failure, not potential corruption due to a missed bit, so it's not even in the same league as ECC, which has dedicated hardware for that purpose.

ReFS is still new and still is not bootable. The vast majority of mainstream users still use NTFS. Hell, I would bet most servers still use NTFS, as the cost and hassle of migrating to ReFS is currently not worth the benefits it has over NTFS.

If a system is built properly there should not be any BSoD. If there is the most likely culprit is faulty hardware or drivers, not because they are not using ECC.
 

bit_user

And unlike the Ryzen CPUs that Intel is comparing to, Intel STILL sells their mainstream CPUs that don't support ECC memory. There is no reason for that anymore, and Intel should be ashamed.
It's been this way since they moved the memory controller on-die, in the first i5/i7 chips. Before that, you just had to buy a motherboard that supported it.

Data corruption is very common, even with perfectly fine hardware.
All good points, but the public doesn't know, doesn't care, and hardware vendors seem to have decided it's not worth trying to educate them. Until memory errors become a major source of system instability or data corruption, I'm resigned to the fact that this won't change.

It is not caused by failing hardware but the consequence of natural disturbances.
Well, I don't regularly check my logs, but the bulk of memory errors I've seen on both ECC and non-ECC hardware in fact have been caused by hardware issues.

The fact is that if a few bit errors occur in the fifty or hundred billion in a modern desktop PC, they're unlikely to be somewhere that will cause a problem noticeable by users. So, for client PCs used in a casual fashion (gaming, media consumption, web), it's rather overkill. However, that quickly changes as the stakes go up. So, for business use & especially any kind of file server or database server, ECC is a must-have!

I know that many people are into overclocking, and ECC functionality does add a slight amount of added time delay, so for enthusiasts, the BIOS should allow disabling ECC. Simple.
Not really. Unbuffered ECC DIMMs usually aren't available in the highest speeds or lowest latencies, but ECC is otherwise no slower. The actual integrity checking happens in the CPU's memory controller, which might add a couple of CPU cycles, but I've not really noticed it in benchmarks.

Now come on Intel - stop forcefeeding your customers heavy handed tactics to make them buy Xeon processors. That crap only lasts until your monopoly stops. Which it has.
Well, most i3's have it, as well as some random Pentium SKUs. These are a good option, for NAS-builders. (Also Ryzen, but unfortunately not the APUs - so you have to add a dGPU, which is annoying.)

https://ark.intel.com/content/www/u...efilter.html?productType=873&0_ECCMemory=True

Tellingly, even though Ryzen CPUs support it (aside from their APUs), you don't find ECC support in most AM4 motherboards. So, I wouldn't count on competitive forces driving it, because there just doesn't seem to be the market demand.

Also, Xeons often don't cost much more than the mainstream counterparts. You can't (usually) overclock them, but the bigger issue for ECC-users is the limited selection & higher price of the motherboards that support it.
 

bit_user

I should probably mention that I feel a little weird being on this side of the issue. Usually, I'm the one complaining about lack of ECC support. That said, I think you're being rather alarmist, though I otherwise agree with your thinking.

Ohhh, no sir. Maybe this is true in your world, but that is not true from my experience. My mother hated having computer crashes, but did not care that it was a memory issue.
Yeah, I thought about mentioning this in my last reply, but let's be clear about the real issue. In your previous message, you were concerned with natural causes of memory errors, but now you're really talking about using components of marginal quality.

Not that it's not an issue, but sourcing better quality memory is usually enough (and still cheaper than ECC). When I can't use ECC, I buy memory rated for a higher speed than I plan to use, and I make sure to memtest it for at least 12 hours. If it can pass that test (and given that I use high-quality UPS and PSUs), then I've found it not to be an issue.

Granted, my experience represents only a few data points, but there's just loads of hardware out there that's not using ECC, from reputable OEMs. If memory errors were as frequent as you suggest, the big PC OEMs would be using it for the simple reason of lowering their support costs.

When your grandmother or any other non-techie can use a computer for email, it is truly just an appliance. I don't think that the common user should know what ECC or DDR is. It should be built in, like ABS brakes or airbags in your car.
You're confusing an issue of life-safety with the (often minor) inconvenience of computer unreliability.

This is my point: reliability should not be optional. Steve Jobs is quoted as saying that "It is not the consumers' responsibility to know what they want". I would not expect a PC buyer to know what they want. But I am quite sure that they expect it to be reliable, and it is not that difficult to make that happen.
It's funny that you cite Steve Jobs, because I'm pretty sure Apple sold plenty of machines without ECC, even when he still ran it.

The problem with the simple idea that "PCs should be reliable" is that it's often difficult to ascertain when crashes occur due to a hardware error vs. software bug. And I'll bet that software bugs are still more common. So, as long as the finger of blame doesn't have a clear target, it's tough to convince vendors to add this cost, that will likely come mostly out of their margins.

ECC is not new (this is 2019, BTW), and it is actually very inexpensive to implement.
Well, you need 12.5% more memory cells. So, that's your floor. Then, you need the additional motherboard traces and additional QA time to test that everything works properly. So, while it's not huge, it's also not trivial.
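
To spell out where that 12.5% floor comes from, here is a rough Python sketch of the arithmetic. It assumes the usual 64 data bits plus 8 check bits per word and a SECDED-style (single-error-correct, double-error-detect) Hamming code; the function names are made up for illustration.

```python
def secded_check_bits(data_bits: int) -> int:
    """Minimum check bits for a SECDED Hamming code over `data_bits` data bits."""
    r = 1
    # Hamming bound for single-error correction: 2^r >= data_bits + r + 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall-parity bit adds double-error detection


def ecc_overhead(data_bits: int = 64) -> float:
    """Extra memory cells needed, as a fraction of the data cells."""
    return secded_check_bits(data_bits) / data_bits


print(secded_check_bits(64))      # 8 check bits per 64 data bits
print(f"{ecc_overhead(64):.1%}")  # 12.5%
```

Narrower words would be worse: the same formula gives 7 check bits for 32 data bits, which is part of why the 64+8 layout is the common one.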

Ever had Windows give you a BSOD? Yes it is often caused by Windows, but it has been found that many of the crashes are just data corruption.
IIRC, Microsoft tried to make ECC memory mandatory for Windows Vista-qualified PCs. I'm guessing they got too much push back from OEMs.

We have RAID for HDDs, and file systems with ReFS and ZFS.
Pfft. RAIDs only save you from data corruption if you scrub them, which most people don't. However, filesystems with built-in CRCs are definitely a positive development. That's one of the features I like about BTRFS.
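
As a rough illustration of what a checksumming filesystem does on a scrub, here is a minimal Python sketch. It uses `zlib.crc32` from the standard library as a stand-in checksum; real filesystems use their own checksum algorithms and on-disk layouts, and the `store`/`scrub` helpers are hypothetical names, not any filesystem's API.

```python
import zlib


def store(block: bytes):
    """Return the block together with the checksum recorded at write time."""
    return block, zlib.crc32(block)


def scrub(block: bytes, stored_crc: int) -> bool:
    """Re-read a block and report whether it still matches its checksum."""
    return zlib.crc32(block) == stored_crc


data, crc = store(b"important dissertation chapter")

# Simulate silent corruption: flip a single bit in the first byte.
corrupted = bytearray(data)
corrupted[0] ^= 0x01

print(scrub(data, crc))              # True: block is intact
print(scrub(bytes(corrupted), crc))  # False: the scrub catches the flipped bit
```

Note that the checksum only detects the damage. Repairing it still needs a second good copy, which is where mirroring or parity comes back in.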

My last 3 home PCs have ECC. My NAS has ECC. My work PC has ECC. The computer that I set up for my sister has ECC. My viewpoint is that ECC is for anyone who isn't casually playing only games on their computer and would like to have "no crashes" or data loss.
While you're at it, be sure to put in a word for running at stock clock speeds and using a good UPS and PSU - both things that anyone valuing stability should really do.

I think most users probably have cheap PSUs and don't use a UPS, yet power issues are probably a more common underlying cause of system instability. Of course, more and more people have laptops, where a UPS isn't needed and the issue of PSU quality is largely out of their hands (though you can buy 19V Seasonic power bricks).
 

bit_user

Short of issues caused by my tinkering, such as overclocking, I have never had a BSoD on any of my home PCs that I have built for myself or family.
Never is a strong word. Given how long you've been doing this, I don't believe it.

Do you turn off or reboot your PCs daily? Windows has gotten damn stable (Win 7, at least) but it wasn't always so, and most of my blue screens have been on machines that have been up for weeks or months.

If a system is having crashes ECC is not some end all be all magic solution that can fix said crashes. It is there to ensure data integrity in data critical operations.
ECC will avoid crashes caused by memory errors. Not double-bit, but those are practically non-existent. It's just that a lot of crashes aren't caused by memory errors.
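
For anyone curious why single-bit errors get fixed while double-bit errors are only detected, here is a toy Python sketch of a SECDED code. It is an assumption-laden miniature: it protects 4 data bits with an extended Hamming (8,4) code, whereas real ECC DIMMs protect 64 data bits with 8 check bits, and the `encode`/`decode` names are purely illustrative.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    code = [0] * 8  # index 0 = overall parity, indices 1..7 = Hamming(7,4)
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):  # parity bit p covers every position whose index has bit p set
        for pos in range(1, 8):
            if pos != p and pos & p:
                code[p] ^= code[pos]
    code[0] = sum(code[1:]) % 2  # overall parity makes double errors detectable
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data); data is None when the error cannot be corrected."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    parity_bad = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        fixed[syndrome] ^= 1  # single-bit error: the syndrome is its position
        status = "corrected"
    else:
        return "uncorrectable", None  # two flips: detected but not fixable
    data = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return status, data


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1

print(decode(word))       # ('ok', 11)
print(decode(one_flip))   # ('corrected', 11)
print(decode(two_flips))  # ('uncorrectable', None)
```

The last case is the important one for stability: the hardware knows the data is bad and can raise an error instead of silently handing corrupted bits to the OS.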

If a system is built properly there should not be any BSoD.
Bad RAM is a real thing. I've gotten new RAM that didn't pass memtest, and I've seen working machines become unstable due to memory failures, after prolonged use (though I'll grant you that the symptom was an app crashing - not the whole OS). Neither are terribly common, but your statement is simply not true.

If there is the most likely culprit is faulty hardware or drivers, not because they are not using ECC.
As I mentioned above, you're overlooking A/C power issues.
 

bit_user

Now come on Intel - stop forcefeeding your customers heavy handed tactics to make them buy Xeon processors.
I'll credit you with one point, here.

Intel seems to be walking away from the LGA 2066 platform, for Xeon W. So, that leaves a gap between LGA 1151 and LGA 3647. If Intel isn't offering any new Xeon W's for their LGA 2066, then the lack of ECC in their X-series really does push people onto a much more expensive platform (assuming they have some need that isn't met by LGA 1151).
 
Never is a strong word. Given how long you've been doing this, I don't believe it.

Do you turn off or reboot your PCs daily? Windows has gotten damn stable (Win 7, at least) but it wasn't always so, and most of my blue screens have been on machines that have been up for weeks or months.


ECC will avoid crashes caused by memory errors. Not double-bit, but those are practically non-existent. It's just that a lot of crashes aren't caused by memory errors.


Bad RAM is a real thing. I've gotten new RAM that didn't pass memtest, and I've seen working machines become unstable due to memory failures, after prolonged use (though I'll grant you that the symptom was an app crashing - not the whole OS). Neither are terribly common, but your statement is simply not true.


As I mentioned above, you're overlooking A/C power issues.

I actually have not. I have been very lucky, TBH, and I am thankful for that, but the only BSoDs I have gotten tend to be from me tinkering with things like overclocking or memory timings. I have also only had one bad stick of RAM (system wouldn't even POST), one GPU fan go out, and one motherboard have caps go bad after it was alive for 8 years.

I do turn my PC off every day. I guess I am old school like that.

I am not saying ECC won't help, but the guy is making it seem like it would somehow stop all crashes, when that's not the case, or that it has some major use in the mainstream, which it does not.

Let me expand on it. A properly built system with good parts should not have BSoDs short of software or driver issues. I didn't mean to include bad hardware; of course faulty hardware can cause BSoDs and other crashes. In fact, I even stated that the majority of BSoDs I have experienced are faulty-hardware related, more so than software.

In my experience, the need for ECC RAM is in servers, HPC systems, and workstations that require data integrity over all else. Mainstream has little to no use for ECC RAM, much like mainstream has almost no use for RAID anymore, especially RAID 0. There are outliers and one-offs, but again I am talking about the vast majority, which is what the mainstream is and where LGA1151 is marketed. Even we enthusiasts are a very small percentage of that, and that's where the need may arise more than anywhere, so it is an even smaller fraction of an already small market.
 

chaz_music

Hello fellows - Charles here again.

I appreciate the comments, but I should clarify some of my own comments. But I do acknowledge the thought you both have put into this.

1. As a point of reference, I am a VERY experienced hardware engineer, having worked with ultrahigh reliability designs in many places, including NASA, Aerojet/Rocketdyne, Sierra Nevada, and Honeywell. It is very much in my wheelhouse to know and limit failure modes. And as a consultant, I've gotten very good at the craft. They hire me to tell them why it hurts and how to fix it. In military and aerospace designs, they work with the assumption that a cosmic ray error is going to happen, and plan for how to deal with it. This is why they require MIL-SPEC parts, with added features like rad-hard or redundant configurations. The higher the altitude, the better the chance of a soft failure. Electronics on planes have more data failures than they do on the ground.

What is the most reliable type of transistor for radiation? A simple P-channel MOSFET. The carriers are holes, which is the absence of an electron. There is nothing for gamma rays to hit.

2. As many of your observations show, there are many sources of failures, including the power supplies (differential and common mode noise, regulation, conducted noise), AC line quality (yep, I gotta UPS), and even user / assembler issues such as ESD damage or supply-chain damage (shipping and warehousing: from vibration, humidity, forklift handling, etc.). A computer is very complex, and the success of it running correctly has to do with everyone who played a part in it and/or touched it: concept engineering, design, manufacturing, and distribution. Did the person at Best Buy drop it on the floor right before you bought it? They undoubtedly added micro-cracks in the solder joints. Oh I hate BGAs.

3. OMG the number of problems I have found coming back to the power system. My earliest was a Xeon configuration in which the HDDs would just die. The failures seemed to be right after power up. Put a scope on it - and found the PSU 12V rail was overshooting to nearly 20V at power up. PITA. How about power line transients? And as CPU and memory voltages get lower, the motherboard power quality is going to be even more of a factor.

4. I am not implying that ECC is an end-all be-all fix, but I find it aggravating that a technology that has been around since at least 1990 is not de facto in our computers now (29 years later). The hold-off is only due to wanting more profit. If most computers had ECC now, it would be a cost-parity :). I found the wikipedia page for ECC entertaining. I did not know Cray originally left it out of their earliest parallel systems, and had to add it in later. Who knew?
https://en.wikipedia.org/wiki/ECC_memory

5. I never never overclock my main machines, and am very careful about heat and dust buildup. I have overclocked some old machines just for fun, though. What's that smell?


Short of issues caused by my tinkering, such as overclocking, I have never had a BSoD on any of my home PCs that I have built for myself or family. My grandmother currently uses an Intel NUC that I built and mounted to a monitor and it has been working great.

Hmm. You my friend have won the Wintel lottery. I learned to be a backup and save fiend in the W95/W98/W2000 days. When W98 first came out, the FAT32 driver had a bug. By the time I figured out there was a problem, it had blitzed my file - and the OS.

The vast majority of people do not use RAID either, and those in the mainstream who do typically use RAID 0, even though it's pointless with SSDs these days. RAID 1 only protects against hardware-level failure, not potential corruption due to a missed bit, so it's not even in the same league as ECC, which has dedicated hardware for that purpose.

ReFS is still new and still is not bootable. The vast majority of mainstream users still use NTFS. Hell, I would bet most servers still use NTFS, as the cost and hassle of migrating to ReFS is currently not worth the benefits it has over NTFS.

Just because a technology like RAID has limitations does not mean it should be ignored. RAID has been around for a long time also, and yes, it has a lot of barnacles on it. That is why Sun/Oracle started making ZFS and MS started making ReFS. Both of these are intended to catch hardware level soft errors using data redundancy (CRCs) and other features such as COW (copy-on-write) to get around the need for RAID controllers requiring batteries for retaining any unwritten data in a power failure scenario. And batteries fail a lot also, so I call that a "D'oh!". They should use EDLCs (supercaps), which have an insane lifetime. Just don't get them hot.

I think that there are several *nix systems that can boot from ZFS natively, such as FreeBSD / FreeNAS and Ubuntu. MS has had some setbacks on ReFS, which was originally supposed to be in W7 (remember Longhorn?). BTW: I have been using RAID for at least 15 years. And I still have my old files from 1989. Everything before that was, well, pre-bedroom fire days ...

And yes, most BSODs are due to drivers. Ever wonder why hardware for the Mac typically costs more? Same for iOS app development. Apple is very good at keeping garbage out of their customer universe.

You're confusing an issue of life-safety with the (often minor) inconvenience of computer unreliability.

If you are only using a PC for games, then it is only an inconvenience, although when it happened to my mom, it was a hassle for me because she was 600 miles away. But if you corrupt a Ph.D. dissertation or kill a presentation on your laptop as you fly to a trade show, it can be a calamity. I think you overlooked my point that these features in autos are now standard. I believe having ECC in PCs should be standard. It is very old technology.

IIRC, Microsoft tried to make ECC memory mandatory for Windows Vista-qualified PCs. I'm guessing they got too much push back from OEMs.

MS did that because they found that they were getting blamed for BSOD failures that were hardware-caused and not from the OS. I remember that there was a report around then that showed that in one generation of CPUs (not saying who!) the memory corruption was happening in cache, not actually in memory. That really bites.

-Chas
 

bit_user

one GPU fan go out
At work, I lost a Radeon 9700 Pro due to fan failure. That's the only time I've had that happen.

Funny thing was, I never did anything that remotely required that card. We had some plans, but I ended up only using it for Windows desktop graphics.

I do turn my PC off every day. I guess I am old school like that.
I started using sleep mode, I think back on Windows XP.
 

bit_user

1. As a point of reference, I am a VERY experienced hardware engineer,
Thanks for the background, but I've found that disclosing one's credentials has more downside than upside, in online conversations. If someone is being an idiot, credentials don't usually do much to convince them of your points. So, if it comes to that, I usually try to walk away.

What is the most reliable type of transistor for radiation? A simple P-channel MOSFET. The carriers are holes, which is the absence of an electron. There is nothing for gamma rays to hit.
Um, so with RAM being a charge-storage device, is there any way for it to store holes? Or is this just a moot point, for RAM?

If most computers had ECC now, it would be a cost-parity :).
But, I don't see how you can get around the fact that you actually need more physical cells. As I said, adding 8 bits of ECC per 64 should always cost at least 12.5% more.

5. I never never overclock my main machines, and am very careful about heat and dust buildup. I have overclocked some old machines just for fun, though.
I even slightly-underclocked the first fileserver I ever built.

That is why Sun/Oracle started making ZFS and MS started making ReFS. Both of these are intended to catch hardware level soft errors using data redundancy (CRCs) and other features such as COW (copy-on-write) to get around the need for RAID controllers requiring batteries for retaining any unwritten data in a power failure scenario.
Oracle walked away from ZFS, and instead focused on BTRFS. That was a few years ago, so I don't know what their current position is, but they had even refused to re-license ZFS so it could be included in the mainline Linux kernel. Since then, I think someone did a clean rewrite of ZFS to address the licensing problem:

https://www.phoronix.com/scan.php?page=search&q=ZFS+On+Linux

Anyway, BTRFS provides most of the benefits of ZFS.

And I'm not so sure CoW is really anything to do with reliability, so much as enabling features like snapshots. It's pretty cool, but you occasionally need to be aware of it, so you can disable it! It causes horrendous problems for cases where large files are frequently modified, like databases and Virtual Machine disk images.
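
To make the snapshot point concrete, here is a toy Python sketch of copy-on-write at the block level. It is only a model under simplifying assumptions (an in-memory block store, a flat block table, no freeing of blocks), and `CowFile` is a hypothetical class, not how any real filesystem is implemented.

```python
class CowFile:
    """Toy copy-on-write file: a table maps logical blocks to physical block ids."""

    def __init__(self, blocks):
        self.store = {}   # physical block id -> bytes
        self.next_id = 0
        self.table = [self._alloc(b) for b in blocks]

    def _alloc(self, data):
        pid = self.next_id
        self.store[pid] = data
        self.next_id += 1
        return pid

    def snapshot(self):
        # A snapshot copies only the table, never the data blocks.
        return list(self.table)

    def write(self, index, data):
        # Never overwrite in place: put the new data in a fresh physical block.
        self.table[index] = self._alloc(data)

    def read(self, table=None):
        table = self.table if table is None else table
        return [self.store[pid] for pid in table]


f = CowFile([b"a", b"b", b"c"])
snap = f.snapshot()
f.write(1, b"B")

print(f.read())      # [b'a', b'B', b'c']  current contents
print(f.read(snap))  # [b'a', b'b', b'c']  snapshot still sees the old block
print(f.table)       # [0, 3, 2]  logical neighbours are no longer physically adjacent
```

That last line is the database / VM image problem in miniature: every in-place update relocates a block, so a frequently modified large file ends up scattered.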

IMO, putting a filesystem journal on persistent RAM is the best way to avoid needing battery-backed RAID.

I have been using RAID for at least 15 years.
Cool. Same.

And I still have my old files from 1989.
I have everything from about 2004. I never got around to copying off the data from older hard disks, so that data is toast. But, there was nothing really of value, either.

I think you overlooked my point that these features in autos are now standard.
No, I did not. That's where I tried to point out that ABS in cars is about saving lives. However, I can also imagine it pays for itself through lower insurance premiums.

In one sense, this gets back to my earlier point about the blame game with computer unreliability and not knowing whether it was a hardware or software failure. Without ABS, you can literally measure the skid marks and make the case that a car with ABS would've been able to avoid the collision (or stay on the road, etc.). However, most computer crashes go undiagnosed, so the full impact of memory errors remains unclear, to many.

I remember that there was a report around then that showed that in one generation of CPUs (not saying who!) the memory corruption was happening in cache, not actually in memory. That really bites.
I see a lot of CPU specs mentioning ECC-protected L2, for instance.