News 39-Year-Old 4.77 MHz DOS Web Server Hits 2,500 Hours of Uptime

Maybe 2,500 days would be worth talking about.
Yeah, my initial assumption was that they meant days, not hours. That'd be 6.8 years, which is remarkable but not unheard of for more modern hardware.

Again, I get that MS-DOS lacks memory protection. However, once you have a stable implementation, there shouldn't really be any reason why it wouldn't just run indefinitely, AFAIK. The only potential issue that even comes to mind is heap fragmentation.
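For long-running DOS code, the usual answer to fragmentation is to never give memory back to the heap at all: allocate a pool of fixed-size buffers once at startup and recycle them forever. A minimal sketch of that pattern (purely illustrative - the buffer size and count here are invented, and this is not the server's actual code):

[code]
#include <stddef.h>    /* NULL */

#define BUF_SIZE  1536    /* hypothetical: room for one Ethernet frame */
#define BUF_COUNT 8       /* hypothetical pool depth */

static char  pool[BUF_COUNT][BUF_SIZE];   /* allocated once, never freed */
static char *free_list[BUF_COUNT];
static int   free_top = 0;

void pool_init(void)
{
    int i;
    for (i = 0; i < BUF_COUNT; i++)
        free_list[free_top++] = pool[i];
}

char *buf_get(void)        /* O(1); no malloc, so nothing to fragment */
{
    return (free_top > 0) ? free_list[--free_top] : NULL;
}

void buf_put(char *b)      /* return a buffer to the pool */
{
    free_list[free_top++] = b;
}
[/code]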

@mbbrutman , by any chance did you do any software development on other systems? If it were me, I'd develop and unit test as much of it as I could under Linux, where I could use tools like valgrind to check for memory bugs.

What language, compiler, and runtime libraries did you use?
 
What InvalidError said about printers is exactly true (the PC's character set contained non-printing control characters dedicated to basic printer control). About drivers: there were a few - the mouse driver, CGA emulation on Hercules graphics, and some sound cards (if you had one) or ATAPI drives needed a DOS driver. Stuff like modems or network cards did, too.
Sound card drivers weren't really a thing in the DOS days; you just set your ULTRASND/BLASTER environment variables to tell software what the IRQs, port ranges, DMAs, etc. were, and most software went bare-metal from there. The only "drivers" I remember were for sound card emulation (mainly AdLib and Roland), and I don't remember those working particularly well.
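The "driver" logic usually lived in the program itself, which just parsed that environment variable at startup. Something along these lines, assuming the classic BLASTER=A220 I5 D1 T3 format (a rough sketch, not any particular program's code):

[code]
#include <stdio.h>
#include <stdlib.h>

/* Pull the I/O port, IRQ and DMA out of e.g. SET BLASTER=A220 I5 D1 T3 */
int main(void)
{
    const char *bl = getenv("BLASTER");
    unsigned port = 0x220, irq = 7, dma = 1;   /* common defaults */

    if (bl != NULL) {
        const char *p;
        for (p = bl; *p != '\0'; p++) {
            if (*p == 'A')      port = (unsigned)strtoul(p + 1, NULL, 16);
            else if (*p == 'I') irq  = (unsigned)strtoul(p + 1, NULL, 10);
            else if (*p == 'D') dma  = (unsigned)strtoul(p + 1, NULL, 10);
        }
    }
    printf("Sound Blaster at port %Xh, IRQ %u, DMA %u\n", port, irq, dma);
    return 0;
}
[/code]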

There were no drivers for ISA and serial port modems; ISA modems behaved exactly like an extra serial port. I've always bare-metal-interfaced with serial ports/modems in DOS - pretty simple stuff. I doubt any decent terminal software relied on OS services to interface with serial ports either; it's much easier and faster to do it bare-metal, especially on a 4.77MHz 8088/8086/V20 where every cycle counts.
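Bare-metal on the serial port really is only a handful of lines. A polled transmit on the 8250 UART looks roughly like this (a sketch, assuming COM1 at its usual 3F8h base and the inp()/outp() routines that Watcom, Borland, and Microsoft C all provide in <conio.h>):

[code]
#include <conio.h>            /* inp() / outp() */

#define COM1      0x3F8       /* base I/O address of COM1 */
#define THR       (COM1 + 0)  /* Transmit Holding Register */
#define LSR       (COM1 + 5)  /* Line Status Register */
#define LSR_THRE  0x20        /* bit 5: THR empty, ready for next byte */

static void serial_putc(char c)
{
    while ((inp(LSR) & LSR_THRE) == 0)
        ;                     /* spin until the UART can take another byte */
    outp(THR, c);
}
[/code]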

Then you have the standard device drivers like ANSI.SYS. If you are using a parallel port storage device like a Zip drive you have a driver there.
ANSI.SYS was only necessary if you wanted to do things like colors from batch files or pipe ANSI-formatted serial data straight to console to render a remote text interface without parsing the stuff yourself. If you don't want your local text output to feel like a serial console from all of the encoding and decoding overhead that goes with ANSI, you write directly to the character buffer at b800:0000.
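And writing to that buffer directly is about as simple as it sounds. Roughly, in 16-bit real-mode C (a sketch, assuming 80x25 color text mode and Open Watcom's MK_FP() from <i86.h>; Borland keeps it in <dos.h>):

[code]
#include <i86.h>    /* MK_FP() in Open Watcom; Borland has it in <dos.h> */

/* Each cell of the 80x25 color text screen is 2 bytes: character, attribute */
void putc_at(int row, int col, char ch, unsigned char attr)
{
    unsigned char far *vram = (unsigned char far *)MK_FP(0xB800, 0x0000);
    unsigned offset = (unsigned)(row * 80 + col) * 2;

    vram[offset]     = (unsigned char)ch;   /* the character itself */
    vram[offset + 1] = attr;                /* color, e.g. 0x1F = white on blue */
}
[/code]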

As for the Zip drive, only the parallel port version (and the floppy version, if there was one - not sure) required drivers to make the OS aware that there was a drive on there, since the parallel port does not support any form of native discovery. The PATA version (I had one of those) just shows up like a normal removable block device - no Iomega-specific sauce required in Windows; I don't remember trying to use it in DOS.
 
I think folks are confusing TSR (Terminate and Stay Resident) programs and drivers. DOS didn't need many actual "drivers" because most anything could, and often did, talk directly to hardware to get things done. DOS had a ton of TSRs: because of how bare-metal DOS was, many people didn't want to have to rewrite basic software for everything, so lots of little programs appeared that loaded into memory and made various useful functions available on demand.
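The classic TSR skeleton is only a few lines: hook an interrupt vector, then ask DOS to terminate while keeping the program's memory. A rough sketch using the <dos.h> helpers in Open Watcom/Microsoft C (the resident size here is a placeholder; a real TSR computes it from its PSP and releases its environment block):

[code]
#include <dos.h>   /* _dos_getvect, _dos_setvect, _dos_keep, _chain_intr */

static void (__interrupt __far *old_1c)(void);   /* saved INT 1Ch handler */
static volatile unsigned long ticks = 0;

/* Runs ~18.2 times per second off the BIOS timer tick (INT 1Ch) */
static void __interrupt __far new_1c(void)
{
    ticks++;                  /* do something tiny here, then... */
    _chain_intr(old_1c);      /* ...hand control to the previous handler */
}

int main(void)
{
    old_1c = _dos_getvect(0x1C);
    _dos_setvect(0x1C, new_1c);

    /* Terminate but stay resident, keeping 256 paragraphs (4KB) of memory.
       Placeholder value: a real TSR computes its true resident size. */
    _dos_keep(0, 256);
    return 0;                 /* never reached */
}
[/code]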
 
@mbbrutman , by any chance did you do any software development on other systems? If it were me, I'd develop and unit test as much of it as I could under Linux, where I could use tools like valgrind to check for memory errors.

What language, compiler, and runtime libraries did you use?
I don't think any modern *NIX could ever run on that hardware, apart from MINIX. And the simple act of porting would introduce this kind of bug in droves. OTOH, 16-bit dev meant you could almost track your pointers by eyeballing them...
 
You mis-quoted my message. Please fix your post, even if you just add a bare [quote] tag at the beginning.

I don't think any modern *NIX could ever run on that hardware, apart from MINIX.
I said "as much of it as I could", meaning all the parts of the TCP stack, etc. which didn't interact with DOS or the bare hardware.

And the simple act of porting would introduce this kind of bug in droves.
Assuming it's C code, probably not. In general, getting code to work with another toolchain and even on another OS usually improves the quality, but that also depends on how much you have to touch just to make it portable. Hence, I'd focus on unit testing just the pure, generic C routines.
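To make that concrete, here's the sort of thing I mean: a pure routine like an RFC 1071-style ones'-complement checksum can be compiled and exercised with gcc on a modern box long before it ever meets the 16-bit toolchain (an illustrative sketch, not mbbrutman's actual code):

[code]
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071-style ones'-complement checksum over a byte buffer. */
static uint16_t ip_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)(data[i] << 8 | data[i + 1]);
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8);   /* pad the odd trailing byte */

    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);      /* fold the carries back in */
    return (uint16_t)~sum;
}

int main(void)
{
    /* Bytes from the worked example in RFC 1071: the sum folds to 0xDDF2 */
    const uint8_t pkt[] = { 0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7 };
    assert(ip_checksum(pkt, sizeof pkt) == (uint16_t)~0xDDF2);
    return 0;
}
[/code]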
 
You mis-quoted my message. Please fix your post, even if you just add a bare [quote] tag at the beginning.


I said "as much of it as I could", meaning all the parts of the TCP stack, etc. which didn't interact with DOS or the bare hardware.


Assuming it's C code, probably not. In general, getting code to work with another toolchain and even on another OS usually improves the quality, but that also depends on how much you have to touch just to make it portable. Hence, I'd focus on unit testing just the pure, generic C routines.
  1. That's what happens when replying on phones, dang...
  2. Agreed; however, IMHO that's not the part where leaks or overruns are most likely to occur.
  3. It would still need to be on a 16-bit OS (or use a C compiler that can cross-compile to a 16-bit target), as the meaning of int, long, etc. may change from one to the other. Not many compilers still support 16-bit platforms, so the port itself may hit bugs in the compilation toolchain.
I'm not saying you're wrong, only that the port itself may hit more bugs than it would solve, compared with developing on period-correct software.
 
2.93 years might sound like a long time between restarts, but the computer in the Voyager spacecraft has been running for over 48 years (and counting) without a reboot, for example.
This factoid gets trotted out a lot, despite not being true. The Voyager probes have triplicate main processors and many subsystem-specific processors (e.g. the comms subsystem, instrument processors, etc.), and resets are hardly uncommon, nor are software updates (which end with a reset to switch to the new code). The Voyagers also contain multiple CLRTs (Command Loss Reset Timers) for multiple subsystems, so components can reset themselves if they lose communication - and part of the regular communication with the probes involves sending CLRT reset commands - in addition to commanded resets. Total system resets have also occurred, e.g. for Voyager 2 in 2010: https://voyager.jpl.nasa.gov/news/details.php?article_id=16
 
It all depends on how much data the CPU handles. AMD Ryzen 2900 and 2950X, for instance, never made it further than a week to two weeks of uptime under 100% load. They're horrible for full-load applications. Intel, on the other hand: some of my modern Celerons, Pentiums, and Core i5 units ran quite literally for months before freezing.

I believe Intel runs more stably than AMD, but the CPU isn't the only one at fault. Sometimes RAM running XMP profiles can cause errors too.

For a DOS server, the guy might as well have run it on an Atom CPU with registered RAM. It uses the same power, but is quite literally 100-400x faster, resulting in lower lag.
 
It all depends on how much data the CPU handles. AMD Ryzen 2900 and 2950X, for instance, never made it further than a week to two weeks of uptime under 100% load. They're horrible for full-load applications. Intel, on the other hand: some of my modern Celerons, Pentiums, and Core i5 units ran quite literally for months before freezing.

I believe Intel runs more stably than AMD, but the CPU isn't the only one at fault. Sometimes RAM running XMP profiles can cause errors too.

For a DOS server, the guy might as well have run it on an Atom CPU with registered RAM. It uses the same power, but is quite literally 100-400x faster, resulting in lower lag.
Erm... The Ryzen 2900 and 2950, you sure about that? There are Threadripper CPUs under those names - 300W beasts that require a LOT of cooling to work properly. Loading them at 100% for weeks at a time will tax a motherboard quite a lot - a LOT more than Celeron or Pentium processors, which are weak-@r$e entry-level processors that use only a little power. An i5 is not much better.
At least the AMD Zen platforms support ECC memory, that's one thing worth considering for a server - where Intel requires you to go Xeon for ECC.
 
IDK if I'm missing some aspect of this or what. I saw the posts where individuals are discussing what old hardware it is and whatnot... but >105 days of uptime is supposed to be a big deal?
The only thing that might be somewhat of a big deal is the stack of bodges surviving the traffic surge from such a story, which likely prompted some people to try crashing the thing by testing whether its IP stack and minimal HTTP server could handle weird packets.
 
@bit_user The development environment is Open Watcom 1.9. I run it under Windows/Cygwin so that I can use multiple windows for editing, SVN for source control, etc. For testing I use a virtual machine first, then move to real hardware.

The development environment is limited. I can test snippets of code using gcc, but my environment is 16-bit DOS. I have to do a lot of unnatural things to get reasonable performance in that environment, and the nature of communications code is asynchronous, so my best testing tools are a good flight recorder and not being too clever with the code.
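A flight recorder of that sort is conceptually just a small ring buffer of trace records that you only dump after something goes wrong - cheap enough to leave enabled all the time. Something along these lines, heavily simplified (an illustration, not the actual server code):

[code]
#include <stdio.h>

/* Fixed-size ring of trace records: cheap to write, dumped only on demand. */
#define TRACE_SLOTS 64

struct trace_rec {
    unsigned long when;      /* e.g. BIOS tick count at the time */
    unsigned      event;     /* event code: connect, timeout, drop, ... */
    unsigned      detail;    /* event-specific value */
};

static struct trace_rec ring[TRACE_SLOTS];
static unsigned next = 0;

void trace(unsigned long when, unsigned event, unsigned detail)
{
    ring[next].when   = when;
    ring[next].event  = event;
    ring[next].detail = detail;
    next = (next + 1) % TRACE_SLOTS;      /* overwrite the oldest record */
}

void trace_dump(FILE *out)                /* called after something goes wrong */
{
    unsigned i, idx;
    for (i = 0; i < TRACE_SLOTS; i++) {
        idx = (next + i) % TRACE_SLOTS;   /* oldest first */
        fprintf(out, "%lu %u %u\n",
                ring[idx].when, ring[idx].event, ring[idx].detail);
    }
}
[/code]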
 
As of this writing the machine has survived a "hug of death" from Hacker News and now the traffic rush from Tom's, which has had the machine running in an overloaded state for the last 24+ hours. (It is serving what it can but there is no doubt that requests are timing out.) It's totally exposed on port 80 so it has to deal with all of the garbage that comes in there.

If you think this is a trivial milestone, well, try it sometime. ;-0 (You'll have to write your own TCP/IP stack and web server too.) There are a lot of things that can go wrong, both in hardware and in software. Things overheat. 40-year-old capacitors and resistors fail. The power has gone out twice. (Thank you to the UPS!) DOS fragments memory if you are not careful. You can't leak any resources. You have to sync the time regularly because the machine clock drifts. The list goes on ...
 
For a DOS server, the guy might as well have run it on an Atom CPU with registered RAM. It uses the same power, but is quite literally 100-400x faster, resulting in lower lag.

If you want to run a web server, sure. That wasn't the point of the project. The point of the project was to have fun by stretching the bounds while being slightly silly in the process. The Dymo tape label and the ridiculous choice of hardware should have been the giveaways.

I think that is being lost on a lot of people. It's not practical to spend the time to create something like this. That's why it's "retrocomputing performance art."
 
3. It would still need to be on a 16-bit OS (or use a C compiler that can cross-compile to a 16-bit target), as the meaning of int, long, etc. may change from one to the other. Not many compilers still support 16-bit platforms, so the port itself may hit bugs in the compilation toolchain.

I'm not saying you're wrong, only that the port itself may hit more bugs than it would solve, compared with developing on period-correct software.
Good point about the types changing sizes. Not using the same sizes could mean failing to find overflow bugs in counters and index variables.

I guess, if you're not using a C99-compliant compiler, then you'd want to use your own typedefs along the lines of what's in <stdint.h>.
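Something like this, with a poor man's static assert so the build fails loudly if a size is wrong (a sketch; the type names are just examples):

[code]
/* Fixed-width types for a pre-C99 16-bit compiler (e.g. Open Watcom),
   mirroring what <stdint.h> would give you on a modern toolchain. */
typedef unsigned char  u8;
typedef unsigned short u16;   /* 16 bits on both 16-bit DOS and modern targets */
typedef unsigned long  u32;   /* 'unsigned int' would shrink to 16 bits under DOS */
typedef signed   char  s8;
typedef signed   short s16;
typedef signed   long  s32;

/* Poor man's static_assert: the array size goes negative if a size is wrong,
   so the compile fails immediately instead of overflowing at runtime. */
typedef char assert_u16_is_2_bytes[(sizeof(u16) == 2) ? 1 : -1];
typedef char assert_u32_is_4_bytes[(sizeof(u32) == 4) ? 1 : -1];
[/code]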
 
It all depends on how much data the CPU handles. AMD Ryzen 2900 and 2950X, for instance, never made it further than a week to two weeks of uptime under 100% load. They're horrible for full-load applications. Intel, on the other hand: some of my modern Celerons, Pentiums, and Core i5 units ran quite literally for months before freezing.
Source? If you're not using ECC memory, then I could see how you might get a kernel panic that way. Otherwise, I find that claim highly suspect.

I believe Intel runs more stably than AMD, but the CPU isn't the only one at fault. Sometimes RAM running XMP profiles can cause errors too.
Intel Xeons are specified for 24/7 heavy compute loads. Most of their client processors aren't.

For a DOS server, the guy might as well have run it on an Atom CPU with registered RAM. It uses the same power, but is quite literally 100-400x faster, resulting in lower lag.
Agreed that old hardware is inefficient. That clearly wasn't the point.
 
At least the AMD Zen platforms support ECC memory, that's one thing worth considering for a server - where Intel requires you to go Xeon for ECC.
Most Intel i3 models support ECC memory, if you use them in a motherboard which supports it. In the case of Alder Lake and Raptor Lake, they opted to do the same for most of the upper product stack, rather than making a Xeon E-series version.

 
Most Intel i3 models support ECC memory, if you use them in a motherboard which supports it. In the case of Alder Lake and Raptor Lake, they opted to do the same for most of the upper product stack, rather than making a Xeon E-series version.
Yeah, well, since DDR5 requires ECC, they had little choice but to enable it - or look stupid when asked why the same motherboard with the same CPU would support ECC on its DDR5 slots and not on the DDR4 ones.
You could enable ECC on a Ryzen 1600 with a B350 motherboard and DDR4 ECC RAM - ECC support was required at the motherboard level, but there were some - it wasn't a matter of chipset, more of wiring and BIOS/UEFI support.
 
Depends on what the software was doing. Because it wasn't multithreaded, if the software would go away and do something for a while, you could easily get ahead of it.


DOS had TSR's (Terminate and Stay Resident programs - sort of like a UNIX process running in the background), but those tended to be extremely simple so you basically had just 1 program running at a time.

It's hard to believe the 386 launched in 1985 and it took another 10 years for Windows 95 to come along and finally put the first nail in DOS' coffin.
TSR = DOS interrupt 21h (the terminate-and-stay-resident call).
 
Yeah, well, since DDR5 requires ECC, they had little choice but to enable it
There are a few points of confusion, here.

First, the whole thing with i3's supporting ECC memory goes way back. I have a Haswell i3 in a server board (which cost more than twice as much as the CPU, lol), with ECC memory, as my main fileserver. I think because Intel's Xeon E3 series didn't overlap with some of the i3 models, Intel decided just to allow ECC on those.

Second, DDR5 has on-die ECC, but it's invisible to the outside and doesn't protect against data transmission errors. That's why ECC DIMMs are still a "thing" in the DDR5 era. Unfortunately, because DDR5 splits each DIMM into two 32-bit subchannels, each subchannel needs its own ECC byte. So, instead of an ECC DIMM having 9 (or 18) chips, it now has 10 (or 20) - that's 25% more chips than a non-ECC DIMM's 8 (or 16), versus the 12.5% extra that DDR4 ECC needed. That pushes prices at least 25% higher than the equivalent speed & capacity non-ECC DDR5 UDIMM.

Finally, Intel did not enable ECC on all of the Gen 12 and Gen 13 models. You can see a complete list of the LGA-1700 CPUs with it here:

- or look stupid when asked why the same motherboard with the same CPU would support ECC on its DDR5 slots and not on the DDR4 ones.
Oh, and if you want to use ECC UDIMMs on a LGA-1700 CPU, you'd better have a motherboard with the W680 chipset.

You could enable ECC on a Ryzen 1600 with a B350 motherboard and DDR4 ECC RAM - ECC support was required at the motherboard level, but there were some - it wasn't a matter of chipset, more of wiring and BIOS/UEFI support.
AMD played games with ECC support. First, there was the whole game around whether it was officially supported. Then, the question of whether you could get reporting of ECC errors. Finally, they took a page from Intel's book and disabled it on their non-Pro APUs.

This was a pretty shameful turn, for AMD. They started out on the high road, and then went down the same path as Intel.