Archived from groups: rec.games.roguelike.angband
Sherm Pendley wrote:
> No, someone posted a link to a well-known neutral, third party authority
> that conclusively shows that the server has been running Apache on
> Solaris for over four years.
Why would there even *be* a neutral, third party authority keeping close
tabs on what OS a *totally unrelated machine* was running? And supposing
there was, why should I trust you? I can think of two excellent reasons
why I *shouldn't*:
1. Apache servers on unix don't generally foul up mime types or time
stamps, but the server in question did during a period of time a lot
more recent than four years ago.
2. You're a prick.
Also, even supposing this is true, and some site (though it is certainly
not "well known" since I for one have never heard of it and I'm the
quintessential random sample member) is for some unfathomable reason
tracking the OS being used by one or more completely unrelated machines
elsewhere in the world (what is it anyway, the CIA? Or maybe Microsoft
keeping an eye on the size of the competition's market share?) then
there's the question of its sampling resolution and accuracy. The
numbers someone quoted suggest it samples only every several months.
Who's to say they didn't give IIS a try a week or so ago, and the
problems we observed were fallout from (and contributed to the end of)
that disastrous flirtation? Also, however it "diagnoses" the server and
OS of an utterly unrelated machine, it won't get it right 100% of the
time. If it relies on those machines to self-report, using normal HTTP
header info, then those machines may sometimes spoof that info. It's
well known that client headers are notoriously unreliable for
identifying browser usage accurately -- a lot of browsers spoof being
Internet Exploder because sites either don't work or dumb themselves
down if they don't; people deliberately spoof their browser's identity
for privacy or even paranoia reasons; and so forth. Servers sometimes
behaving similarly would not surprise me much. If instead it diagnoses
the server type by doing various things with it it's still liable to get
it wrong. If it times responses, IIS on XP Pro on a beefy box will look
like Apache on Linux on a 486. If it looks for an exploit (which would
be ethically dubious) it will not correctly identify IIS machines that
have had the vulnerability patched (all 3 of them). If it sees if it
accepts a URL altered to use backslashes ... well, that might be pretty
darn accurate. Or not. I don't actually know in that case. If it
monitors the server for problems and it's trouble-free for days on end,
it will naturally assume Apache running on some unix, and it might be an
IIS/XP box that had a lucky few days and keeled over the very next day. On
the other hand, if the box is in yoyo mode it will suspect Windows
running, probably, IIS, even if it's a rock-solid Apache/Solaris config
that happens to be running on a box that's dependent on the world's
flakiest power grid (which, incidentally, is in Quebec). And that's
leaving aside the whole issue of whether a problem with a server that
takes the specific form of connections timing out and no-route-to-host
errors is due to the box itself being down or intervening routing
problems -- this will be a weakness with any scheme dependent on
sampling from a central point. One that uses server self-reporting will
be least impacted, if it simply jots "not reachable" for any server that
wasn't reachable at a sampling time, or drops such sampling attempts
entirely. On the other hand those are precisely the schemes dependent on
the server accurately identifying itself, which is not a safe assumption.
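Just to show how flimsy server self-identification is: the banner a survey reads is nothing but a header string, and it says whatever the admin configured it to say. A quick sketch of mine (the raw response below is made up):

```python
# Sketch: the "Server" banner that surveys read is just a response
# header -- pulling it out of a raw HTTP response takes a few lines,
# and the server operator can set it to anything at all.
def server_banner(raw_response: bytes) -> str:
    """Return the self-reported Server: header from a raw HTTP response."""
    head = raw_response.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"server":
            return value.strip().decode("latin-1")
    return "(no Server header)"

# An IIS box can trivially claim to be Apache, and vice versa:
spoofed = (b"HTTP/1.1 200 OK\r\n"
           b"Server: Apache/1.3.27 (Unix)\r\n"
           b"Content-Type: text/html\r\n"
           b"\r\n"
           b"<html>hello</html>")
print(server_banner(spoofed))  # -> Apache/1.3.27 (Unix)
```

Which is exactly why any census built on that header is only as honest as the admins being counted.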
> When you see articles
> trumpeting about how IIS has lost a few more points of market share to
> Apache, those articles are nearly always quoting Netcraft data.
Not this Netcraft thing again! I thought we'd been over that.
Anyway this makes the data even more suspect -- there are ideological
points to be won and lost and actual stocks with dollar values at stake.
This provides motivation for partisans of either side to spoof server
self-identifications for various reasons.
* A partisan with access at a low level to a machine running the *other*
server -- say a unix-lover forced to administer an IIS farm to make
ends meet -- may arrange for it to contribute to their own side's
market share figures dishonestly. Microsoft itself almost certainly
runs a bunch of linux/apache boxen for mission-critical stuff and
makes them incorrectly claim to be its own products for the same
reason. (What? You don't honestly think they actually eat their own
dogfood, do you? The whole empire would collapse the minute they
trusted anything truly crucial to their business to a Windows box --
trust me.)
* A partisan might perversely run their own side's server and spoof the
other's identity, so as to make the market share figures support a
shrieking cry of "They're going to win! Look at their figures! We've
gotta DOOOO SOMETHING!!" -- alarmists and extremists of course, and
anyone with something to gain from a little rabble-rousing. Most
likely to make an Apache misidentify as IIS however to generate
figures that will alarm the anti-Microsoft crowd.
On top of which, a partisan could simply run a bunch of unnecessary
extra servers of one type or the other and make sure they get counted in
the census -- but that wouldn't be salient here.
> But I'll give you this one anyway - the fact is, no I can't *prove*
> beyond a shadow of a doubt that their log is genuine.
Ah -- some sense and reason at last.
> Actually, it does. Netcraft has data for that server that goes back to
> November 2000.
Data of dubious and ideologically-charged provenance.
> I think it's fairly clearly *neither* the server, nor the browser. If
> the .zip file were being damaged in transit, it would presumably fail
> its CRC check upon being unzipped.
What, an issue with intermediate hosts? You're joking right? First of
all multiple people reported problems; presumably not all using the same
ISP. It's quite likely the only hop they all had in common in
downloading the file was the source -- the server whose wonkiness or
lack thereof is in dispute. Perhaps there's a gateway or similar
upstream of that server, perhaps belonging to their connectivity
provider, that the transfers all bottlenecked through, so it's not a
certainty however. But then, internet routing has been pretty stable and
reliable as to data integrity for a decade or two now. (Actual route
selection is subject to screwups and traffic snarlups, resulting in
slowdowns, out of order packet arrivals, and packets dropped, but not in
actual data integrity compromise. Routers simply do not ever actually
edit packets, save to modify the headers, particularly the TTL and any other
loop-prevention state, such as the last router to touch the packet if
that's even tracked.) The high level protocols, meanwhile, have
pedigrees nearly as long -- about a decade for HTTP, longer for FTP and
the like. The only likely errors with an HTTP transfer are a lost
connection or reordered or missing packets. Reordered packets are put
back in the order they were meant to arrive, and missing packets are
simply asked for again. If some of the file never arrives,
the transfer hangs or eventually aborts with a timeout error of some
sort. It does not appear to succeed and give a bogus file. And a file
that's three times LONGER than what should have arrived? That requires
not missing packets or overwritten bytes, but the spontaneous creation
of bogus *extra* packets, something I've never heard of outside of (bad)
science fiction (usually bad cyberpunk to be exact).
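To make that concrete, here's a toy sketch (mine, nothing from the thread) of what a receiver does with out-of-order segments: sort by sequence offset and concatenate. Reordering can scramble arrival order, and loss leaves a detectable hole, but neither can invent bytes out of thin air:

```python
# Toy reassembly: segments arrive as (offset, payload) pairs in any
# order; the receiver sorts by offset and concatenates. A missing
# segment leaves the result short -- the transfer stalls or errors
# out, it never silently "succeeds" with a longer file.
def reassemble(segments, expected_length):
    data = b"".join(payload for _, payload in sorted(segments))
    if len(data) != expected_length:
        raise IOError("transfer incomplete -- would hang or time out")
    return data

# Arrival order scrambled, result still correct:
print(reassemble([(9, b"ld"), (0, b"hello "), (6, b"wor")], 11))
# -> b'hello world'
```

Drop one of those segments and `reassemble` raises instead of returning garbage, which is the whole point.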
> Such checks aren't foolproof, but the
> odds against a mangled file having the same CRC as the original are
> astronomical.
MD5 and SHA-1 are even more trustworthy, but becoming obsolescent; SHA-256
is now recommended for serious spoof-resistance. But that assumes
deliberate attempts to make a bogus file with a matching checksum (say,
a virus-infected executable that has the same checksum as the clean
original) are being guarded against, rather than accidental corruption.
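For the record, here's roughly what that looks like with Python's standard library (my sketch): a single flipped byte changes both the CRC-32 and the SHA-256 digest, and an *accidental* corruption slips past CRC-32 with odds of only about 1 in 2^32:

```python
import hashlib
import zlib

original = b"competition savefile contents" * 64
corrupted = original[:100] + b"X" + original[101:]  # one flipped byte

# CRC-32: fine against random damage (~1 in 2^32 false-pass rate),
# useless against a deliberately crafted collision.
print(zlib.crc32(original) != zlib.crc32(corrupted))   # True

# SHA-256: also resists deliberate forgery, not just accidents.
print(hashlib.sha256(original).hexdigest() !=
      hashlib.sha256(corrupted).hexdigest())           # True
```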
> I suspect that either the files in the .zip are broken, or the app
> that's unzipping it is breaking them at that point.
It's the 30th competition savefile; you'd expect the guy who posts them
to know how to zip them correctly by now. If there was ever going to be
a bad zip at the source, it was going to be the 1st, not the 30th. As
for the unzipper, we're talking multiple people with problems fetching
it over multiple routes, using different browsers, and using different
unzippers. One used whatever obscure unzipper is common on RiscOS, which
is still implied by this thread's subject line after all this time.
Another had (I think) Windows. They were probably using the built-in XP
unzipper, but might have been using Winzip and (slight chance) possibly
even pkunzip. If a Mac user had trouble (I'm unsure of that) they would
likely have used Stuffit Expander; a linux user gunzip or perhaps an X
app, though such an app would probably either wrap gunzip or use zlib
directly -- gunzip itself being basically just a wrapper that exposes
zlib to shell scripts and the command line.
(Attempting to un-gzip a pkzipped file would cause problems, but
probably a wrong archive format error message rather than a failed
checksum error message. Also, it wouldn't have hit the RiscOS user,
since RiscOS is quite clearly not Unix judging by the ability of a hung
unprivileged app to bring RiscOS to its knees as discussed here recently.)
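Incidentally, telling a pkzip archive from a gzip stream takes all of four bytes -- any sane unzipper checks the magic numbers up front, which is why you get a "wrong archive format" message rather than a checksum failure. A sketch (the magic values are from the respective format specs):

```python
import gzip
import io
import zipfile

def sniff_archive(data: bytes) -> str:
    """Guess the archive format from its leading magic bytes."""
    if data[:4] == b"PK\x03\x04":   # pkzip local file header
        return "zip"
    if data[:2] == b"\x1f\x8b":     # gzip magic number
        return "gzip"
    return "unknown"

# Build one of each in memory and check:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("save.txt", "hello")
print(sniff_archive(buf.getvalue()))           # zip
print(sniff_archive(gzip.compress(b"hello")))  # gzip
```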
--
http://www.gnu.org/philosophy/right-to-read.html
Palladium? Trusted Computing? DRM? Microsoft? Sauron.
"One ring to rule them all, one ring to find them
One ring to bring them all, and in the darkness bind them."