[ANNC] BetaComp 2004 Results [LONG]

Archived from groups: rec.games.int-fiction (More info?)

ANNC: BetaComp 2004 Results

After much delay, the results of this year's BetaComp are in. I was pleased
to see that once again, each entrant found at least *something* that no
other entrant did, and no single entrant found more than about 50% of all
the bugs that were found. This reinforces the prevailing theory that several
testers (and possibly multiple rounds of testing) are most effective in
ferreting out all the bugs.

Unlike last year, the comp used a current game-in-progress to test rather
than a game written just for the comp. Neil Bowers provided the game, a port
of a 1982 game called "Planet of Death" (or PoD). Because it's a port, Mr.
Bowers is trying to preserve the feel of the original, but as you might
guess, the original doesn't play very well by today's standards. The
BetaComp entrants struggled with this conflict in many ways, as we'll see.
But first, the awards:

We start with an Honorable Mention. Mike Jones was a last-minute drop-out,
but he still made some nice comments about the game. I think he put on his
1982 hat too firmly and was too kind to the game. And he liked it!

Next: the award for Pulling No Punches goes to Mark Tilford. Although he had
a hard time judging the game in the context of it being a port, he was very
clear about what worked and what didn't. Kudos!

___ Bond wins the award for Best Use of Undo to Try Different Approaches.
Mr. Bowers pointed out how useful this method can be to test a game.

Best Encouragement (With Appropriate Fixes) goes to Apollo Hogan. There's
definitely something to be said for positive criticism: as an author, it's
probably my favorite part.

Another important part of any beta-test is pointing out alternative
phrasings and missing synonyms: for the second year in a row, Esa E. Peuha
wins the award for Most Thorough Listing of Missing Synonyms. BetaComp's
first returning champ.

A solid, entertaining beta-test report earns Adam Thornton this year's award
for Funniest Report and Most Critical Without Actually Insulting the Author.
My favorite line: "In 1982 I might well have enjoyed this game... In 2004,
however, 'Planet of Death' was only marginally more fun than plunging heated
skewers into my eardrums." Adam Thornton: killing them with comedy.

The coveted Grammar King title goes to Andrew Walters. His report picked out
a lot of problems no one else mentioned, which is invaluable to a game
author.

We're down to our top two comp entrants. It really is difficult to judge
beta-test reports, since so much of it comes down to subjective criteria.
Mr. Bowers and I were split on which of them was the absolute best entry,
but since we're giving out two gift certificates of equal value I don't feel
too bad about it.

So, I give you the runner-up of this year's BetaComp: with a thorough and
very well organized report, Andrew Krywaniuk wins second place and a $25
gift certificate.

And in first place, with an extremely thorough and explanatory report, the
winner of BetaComp 2004, Graham Holden, who also wins a $25 gift
certificate.

Congratulations to all the entrants! It's not easy to have someone judge
what you normally do as a favor to people. Thanks for entering, and keep on
testing. Detailed scores and notes to follow. See you next summer for
BetaComp 2005!

Jess Knoch
BetaComp organizer

* * * * * * * * * * * * * * * * * * * *

Now for the overall breakdown of scores and general notes on the reports.
The game, as I mentioned earlier, presented quite a challenge this year,
especially since my listed criteria for judging included commenting and
criticizing "the bigger issues in the game," which can cover quite a lot in
a game ported from 1982. Mr. Bowers' notes to testers made it pretty clear
he was not looking for too many comments about the nostalgic bits, so
testers had to use their best judgment on how much to say about the glaring
plot holes, nonsensical geography, and guess-the-verb troubles. Mr. Bowers
and I judged the entrants separately, and then I took a straight average of
our scores to come up with the final winners list. We weren't always in line
regarding how well entries did in each category, but I think our votes hold
equal weight.

The other challenge this year was that the first release that went out to
competitors had a (near-)fatal bug, so that the walkthrough provided didn't
work and it seemed to be unfinishable. Actually, one tester did manage to
finish the first release by exploiting a different bug. But everyone else
used the next release either in whole or in part to finish testing the game,
which (I'm sure) led to some frustration. I'd like to think it just added
some realism to the artifice of the competition.

Mr. Bowers used a scale of 1-10 points in each of my four categories
(Organization, Clarity, Thoroughness, and Ability to Criticize), plus bonus
points. I am using the scale of ability from Puzzle Pirates
(www.puzzlepirates.com) because I like it. You can think of it as a scale of
1-8 points if you like. From worst to best, that scale is:

Able
Distinguished
Respected
Master
Renowned
Grand-Master
Legendary
Ultimate
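
For anyone who wants to line the two judges' scores up side by side, here is
a minimal sketch of reading the labels above as 1-8 points. The post only
says a "straight average" was taken and does not say how the 1-10 and 1-8
scales were reconciled, so the rescaling in combined() is purely an
assumption for illustration, and the function names are made up.

```python
# Puzzle Pirates ability labels, worst to best, as listed in the post.
SCALE = [
    "Able", "Distinguished", "Respected", "Master",
    "Renowned", "Grand-Master", "Legendary", "Ultimate",
]


def label_to_points(label: str) -> int:
    """Return 1 for 'Able' up to 8 for 'Ultimate'."""
    return SCALE.index(label) + 1


def combined(nb_points: int, jk_label: str) -> float:
    """Average the two judges' scores as fractions of their own maximum.

    ASSUMPTION: both scales are rescaled to 0..1 and weighted equally.
    The post does not describe the actual reconciliation.
    """
    nb_fraction = nb_points / 10
    jk_fraction = label_to_points(jk_label) / 8
    return (nb_fraction + jk_fraction) / 2


# Example: a 5/10 from NB alongside "Master" from JK.
print(label_to_points("Master"))   # 4
print(combined(5, "Master"))       # 0.5
```

Under that assumption, a 10/10 paired with "Ultimate" comes out at 1.0, the
top of the combined range.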

Now, in alphabetical order, notes and comments on the entries (JK = Jess
Knoch, NB = Neil Bowers):

Entrant: ___ Bond
Organization: 6/10 (NB), Master (JK)
Clarity: 5/10 (NB), Respected (JK)
Thoroughness: 7/10 (NB), Respected (JK)
Ability to Criticize: 6/10 (NB), Grand-Master (JK)

Mr. Bond did a great job commenting on various aspects of the game,
including whether or not it was desirable to include things like mazes,
snarky error messages, instant-death actions, weird geography, and requiring
bizarre unclued actions to progress. I thought this was his best category.
He included suggestions for alternate solutions to puzzles and places where
the parser could help out a struggling player. As for format, his report had
a few introductory comments and then the rest as a game transcript, which is
helpful since the author gets to see exactly what was tried (although Mr.
Bond doesn't always flag the bugs explicitly). As mentioned above, Mr.
Bowers was impressed with his use of "undo" to see how different states
affected the responses and outcomes. And I agree, the combination was
reminiscent of something I would put on my luggage ^_^.

Entrant: Apollo Hogan
Organization: 6/10 (NB), Renowned (JK)
Clarity: 8/10 (NB), Legendary (JK)
Thoroughness: 5/10 (NB), Distinguished (JK)
Ability to Criticize: 5/10 (NB), Master (JK)

This report might have been the most encouraging of Mr. Bowers' endeavors
with PoD. Whether or not it should be encouraged is something that other
entrants might take issue with, but that's beside the point. Mr. Hogan
impressed the judges with the quality and clarity of his suggestions and
comments, even if he didn't pick out as many individual bugs as some other
entries. Still, it is always nice to see a positive report.

Entrant: Graham Holden
Organization: 9/10 (NB), Legendary (JK)
Clarity: 9/10 (NB), Renowned (JK)
Thoroughness: 10/10 (NB), Ultimate (JK)
Ability to Criticize: 10/10 (NB), Legendary (JK)

This was definitely the prettiest report, and the prettiness was matched by
the quality of the report. He started with a brief introduction, so that the
author would have a good idea of where he was coming from with his
suggestions. The author was very pleased with that section. After that, the
executive summary highlighted his overall impressions and suggestions for
the game. There can be no denying that the report was well organized and
very, very thorough. But... and I can't believe I'm about to say this... it
was almost too wordy. I feel the Wrath of the Verbosity Gods about to smite
me for saying it, but I can't help it. It is indeed a beautiful report, but
there is so much text to get through up front that the actual suggestions
are just a tad harder to read. I feel like a heel for mentioning the bad
parts of such a conscientious report. Anyway, another good part (besides the
depth and organization) is that Mr. Holden will occasionally suggest how
something should be fixed. This is always helpful. Overall, a great report.
Mr. Bowers said it read "like a report from a professional tester."

Entrant: Andrew Krywaniuk
Organization: 7/10 (NB), Ultimate (JK)
Clarity: 7/10 (NB), Grand-Master (JK)
Thoroughness: 9/10 (NB), Legendary (JK)
Ability to Criticize: 9/10 (NB), Legendary (JK)

Mr. Krywaniuk was the only tester to use the first release exclusively,
which I see as something of an accomplishment. The report starts with a few
introductory notes and an executive summary, which I think is the best way
to begin, since it gives the author a prioritized list of things to do.
After that, the bugs were pretty much listed by location, making them easy
to fix in the code. His number one suggestion was that the author re-think
his goals in porting this game to Inform. He was very critical of the plot
holes and "old-school" problems with the game. In addition to his report, he
included four transcripts of his play-sessions, which contained (by my
count) about five additional bugs that weren't listed in his report. It just
goes to show you, it's nice to have the actual transcripts in addition to
the crafted report. Mr. Bowers mentioned that the bit about the intro was
very helpful: the entrant made a special effort to comment on the importance
of the introduction and suggest alternative and more exciting openings. The
author also awarded him a bonus point for saying the game isn't ready for
beta in its current state (the only entrant to do so).

Entrant: Esa E. Peuha
Organization: 6/10 (NB), Grand-Master (JK)
Clarity: 8/10 (NB), Legendary (JK)
Thoroughness: 8/10 (NB), Renowned (JK)
Ability to Criticize: 6/10 (NB), Renowned (JK)

For the second year in a row, I admire Mr. Peuha's ability to find
alternative names for objects, places, and people. He started his report
with some overall comments on the game, and followed it up with the master
list of objects that need additional names. This is very important input
into any game, although other areas of the game don't receive as much
attention. The author singled out Mr. Peuha's comments on the ice cavern as
especially noteworthy. He was also impressed that the disassembly of the
game turned up errors which couldn't be found any other way, such as name
properties made invalid by spaces. For example, the author mistakenly
included "small " in the name property for the "small, green man," meaning
you cannot refer to him as "small" or a "small man." Several testers
mentioned that "small" is needed as a synonym for the small, green man, but
the author, when looking over his code, might see "small " and not realize
that the space was messing things up. So these comments are quite helpful.
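
A rough sketch of the kind of check that would catch this automatically:
scan a list of name words and flag any that contain whitespace. This is
plain Python, not Inform and not part of any Inform tool; the function name
is hypothetical, and only "small " comes from the report (the other words
are filled in for the example).

```python
def invalid_name_words(words):
    """Return the words that contain whitespace anywhere.

    A word such as "small " (trailing space) looks fine at a glance in
    source code but can never match what a player types.
    """
    return [w for w in words if any(ch.isspace() for ch in w)]


# Hypothetical name list for the small, green man; only "small " is
# taken from the report.
names = ["small ", "green", "man"]
print(invalid_name_words(names))  # ['small ']
```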

Entrant: Adam Thornton
Organization: 7/10 (NB), Legendary (JK)
Clarity: 7/10 (NB), Grand-Master (JK)
Thoroughness: 6/10 (NB), Master (JK)
Ability to Criticize: 7/10 (NB), Ultimate (JK)

This was probably the most entertaining report for me to read, perhaps only
because I am not the author. Mr. Thornton is extremely critical of the
higher-level stuff, and talks extensively about the porting issue. I can say
pretty confidently that he prefers some of today's modern conventions to the
old-school style, on matters of instant death, mazes, and guess-the-verb
problems. ("Yay-for-instant-death" is sarcasm, right?) Anyway, he somehow
manages to avoid it feeling like an attack on the author (this straight from
the author's notes), and in fact is rather complimentary of the author's
skills in coding. He would rather see said skills applied to an original
work than to dredging up a game that wasn't so great two decades ago.
In any case, Mr. Thornton is a very competent beta-tester, and this report
picks up on a lot of important issues in the game.

Entrant: Mark Tilford
Organization: 4/10 (NB), Master (JK)
Clarity: 5/10 (NB), Renowned (JK)
Thoroughness: 4/10 (NB), Distinguished (JK)
Ability to Criticize: 3/10 (NB), Grand-Master (JK)

Mr. Tilford's report was a bit on the terse side, at least compared to some
of the others. He found the requirement that the game be, first and
foremost, a port of the original *too* restrictive. Still, he did a good job
commenting on the porting issues, and did play through the game using "bug"
to mark comments and problems. According to the author's master list, he
found two bugs that no one else explicitly reported, which just goes to show
you that even if one person's report is short, it can still be fruitful. He
does point out the worst bugs, so in a pinch this report would lead to a
huge improvement in the game.

Entrant: Andrew Walters
Organization: 8/10 (NB), Ultimate (JK)
Clarity: 8/10 (NB), Grand-Master (JK)
Thoroughness: 7/10 (NB), Legendary (JK)
Ability to Criticize: 7/10 (NB), Master (JK)

This year's surprise third-place finisher, Mr. Walters did quite well in
several categories. His report was organized into four sections: bugs,
typos, general puzzle and game design issues, and a few suggested things to
implement (objects and verbs). I liked this organization a lot, and I may
adopt it myself. Mr. Walters also wins my award for most bugs found,
although the way I came up with that distinction is a bit indirect. I split
all the bugs/issues/suggestions that all the entrants mentioned (and one I
found myself that didn't get mentioned) into two broad, messy categories:
Bugs, which I would definitely want to fix if I were the author, and
Suggestions, which I would add if I had the time. This is imperfect, since
people will come at things in different ways, and sometimes the same bug
gets listed twice, or two bugs get mixed into one comment, but it works for
me. In this system, I count 91 total Bugs, and Mr. Walters found 48 of them,
or 53%, which was the highest of all the entrants. If you're interested,
second was Mr. Holden with 46, and third was Mr. Krywaniuk with 39. The
other list, Suggestions, had 68 items, of which Mr. Krywaniuk found 31 (46%)
and Mr. Holden found 30.
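
The percentages can be rechecked from the counts quoted above. This is only
a sketch of the arithmetic: the totals and per-entrant counts are the ones
given in this post, and the share() helper is a made-up name.

```python
TOTAL_BUGS = 91
TOTAL_SUGGESTIONS = 68

# Counts as quoted in the post.
bugs_found = {"Walters": 48, "Holden": 46, "Krywaniuk": 39}
suggestions_found = {"Krywaniuk": 31, "Holden": 30}


def share(found: int, total: int) -> int:
    """Percentage of the list that one entrant found, rounded to a whole number."""
    return round(100 * found / total)


for name, count in bugs_found.items():
    print(f"{name}: {count}/{TOTAL_BUGS} Bugs = {share(count, TOTAL_BUGS)}%")

for name, count in suggestions_found.items():
    print(f"{name}: {count}/{TOTAL_SUGGESTIONS} Suggestions = "
          f"{share(count, TOTAL_SUGGESTIONS)}%")
```

This reproduces the 53% and 46% figures and gives 51% and 43% of the Bugs
for Mr. Holden and Mr. Krywaniuk, and 44% of the Suggestions for Mr. Holden.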

Anyhow, I give Mr. Walters the award for Most Bugs Found, however slippery
that title might be. In addition, he made several good comments about
grammar, earning him the previously mentioned Grammar King title. Definitely
a solid report. There were several things that no one else mentioned, which
makes his report a very valuable one.

Thanks again to all the entrants! This will be posted at:
http://www.strangebreezes.com/if/comps/beta2004/default.htm
.... but I wanted to get it on the newsgroup first.
 

"Jess Knoch" <jessicaknoch@mindspring.com> writes:

> Another important part of any beta-test is pointing out alternative
> phrasings and missing synonyms: for the second year in a row, Esa E. Peuha
> wins the award for Most Thorough Listing of Missing Synonyms. BetaComp's
> first returning champ.

Cool! (BTW, I actually have two middle initials, but that's no big deal.)

> For the second year in a row, I admire Mr. Peuha's ability to find
> alternative names for objects, places, and people.

I can't help feeling that your admiration is a bit undeserved, because
nearly all of them were already in the various descriptions of the game;
I mostly just tried "x foo" for every word "foo" I saw on the screen
that I thought the game should recognize. I'm actually surprised that
the other contestants apparently didn't do the same thing.

> For example, the author mistakenly
> included "small " in the name property for the "small, green man," meaning
> you cannot refer to him as "small" or a "small man." Several testers
> mentioned that "small" is needed as a synonym for the small, green man, but
> the author, when looking over his code, might see "small " and not realize
> that the space was messing things up.

I think it is actually a bug in Inform that it silently accepts invalid
dictionary words; it should at least warn about them.

--
Esa Peuha
student of mathematics at the University of Helsinki
http://www.helsinki.fi/~peuha/
 

On Tue, 14 Sep 2004 23:58:47 -0400, "Jess Knoch"
<jessicaknoch@mindspring.com> wrote:

>And in first place, with an extremely thorough and explanatory report, the
>winner of BetaComp 2004, Graham Holden, who also wins a $25 gift
>certificate.

*blushes*

I would like to thank my agent, my mother, ... oops, wrong awards!

Seriously: my thanks (and, I'm sure, of the other entrants) to you
for organising the competition; to Neil for providing something to
work on; and especially to the pair of you for sifting through our
collected efforts in so much detail. Also, please accept a special
prize for this year: The "Captain Mainwaring 'I Was Wondering When
Someone Was Going To Spot That' Award" for not claiming that you'd
included the walkthrough problem as a deliberate test of the beta-
testers!

(For the title of this award to make any sense, "Dad's Army" needs
to have made it across the Atlantic; if it hasn't, just accept the
kudos).

I suppose, then, that it would be churlish in the extreme to point
out that the BetaComp results are not formatted to the same sixty-
six-characters-without-redundant-spaces restriction that you (self
imposed) upon the original announcement? Yep, thought so 🙂

>I feel like a heel for mentioning the bad parts of such a conscientious report.

But, just as a beta-tester should mention the bad parts -- so that
an author can improve the work under test; so too should a "tester
of testers" -- in the hope that the tester can improve. The format
of my reports is still evolving, so your comments will be taken on
board.

> I split
>all the bugs/issues/suggestions that all the entrants mentioned (and one I
>found myself that didn't get mentioned) into two broad, messy categories:
>Bugs, which I would definitely want to fix if I were the author, and
>Suggestions, which I would add if I had the time. This is imperfect, since
>people will come at things in different ways, and sometimes the same bug
>gets listed twice, or two bugs get mixed into one comment, but it works for
>me. In this system, I count 91 total Bugs,
<snip>
>The other list, Suggestions, had 68 items,

Purely from a "I wonder what else was there" point of view, I'd be
very pleased if you could post these lists, either here or to your
website.



Regards,
Graham Holden (g-holden AT dircon DOT co DOT uk)
--
There are 10 types of people in the world;
those that understand binary and those that don't.
 

Jess Knoch wrote:
> ANNC: BetaComp 2004 Results

The results, entries, and bug lists are all posted on my website. Link off
the main BetaComp 2004 page at

http://www.strangebreezes.com/if/comps/beta2004/default.htm

I have fixed the names on the website that were posted incorrectly here. I
also forgot to thank Neil Bowers for the use of his game and his extensive
help with the secondary release, judging, and running the comp in general.

Changes for next year: a more original work of IF, and earlier start dates
and deadlines.

--Jess
 

"Jess Knoch" <jessicaknoch@mindspring.com> wrote in message news:<2r09tjF13ske2U1@uni-berlin.de>...
>
> Changes for next year: a more original work of IF, and earlier start dates
> and deadlines.

Regarding timing, I was thinking the other day that running the beta
comp in the run-up to the IF Comp probably means that lots of
potential beta testers are busy testing someone else's IF Comp entry.

Cheers,
drj