Calibrating the game review system


I’ve said for a while that I don’t think individual game review scores are particularly useful, because individual tastes vary. It’s always important to read the detail of a review rather than take the score at face value; in the text, a decent reviewer will (hopefully) explain the reasoning behind the aspects he/she did and did not like, so you can judge how much they apply to you. Even this, however, is a hit-and-miss affair. English being the rich and imprecise language it is, all sorts of emphasis creeps in based on the reviewer’s own opinions, which may or may not be your own. It’s easier to detect and cancel out when you read the text rather than just look at the score, but even so there’s no getting away from the fact that a review is far from an empirical measurement.

I do tend to find Metacritic useful - despite my mistrust of individual review scores, when taken in aggregate, the natural statistical process tends to smooth out the anomalies and result in a reasonably good guide. Again though, you have to take into account your genre preferences: Halo 3 and Oblivion scored consistently highly, yet personally I’m not keen on either of them, and that’s down to my preference rather than the quality of the games themselves. When restricted to a genre though (and often restricting to a subgenre is required; for example, Halo 3, Gears of War and Bioshock should all be in separate subgenres IMO), Metacritic’s results do seem to align with my overall opinion, such as rating Rock Band 1/2 and Guitar Hero 2 as the leaders in the music performance/imitation genre, and Geometry Wars 2 and Rez HD in the arcade shooter genre. I don’t think the overall numerical rankings are at all useful, but within subgenres the relative ranking does seem pretty sound. So while you might want to filter out your non-preferred game types, if you’re looking for a good game in subgenre X, Metacritic is quite a good way to find one - provided you can reliably identify those subgenres (again, reading the full review text should help here).
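For the curious, here’s roughly what I mean by ranking within a subgenre, as a quick Python sketch. The titles, subgenres and scores are entirely made up for illustration, and this is certainly not how Metacritic itself weights things; it just averages each game’s scores (so one grumpy outlier gets smoothed out) and then compares games only against others of the same type.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical review scores (0-100); titles, subgenres and numbers are invented.
reviews = [
    ("Game A", "music", 92), ("Game A", "music", 70), ("Game A", "music", 90),
    ("Game B", "music", 80), ("Game B", "music", 78), ("Game B", "music", 82),
    ("Game C", "arcade shooter", 88), ("Game C", "arcade shooter", 90),
    ("Game D", "arcade shooter", 75), ("Game D", "arcade shooter", 95),
]

def rank_within_subgenres(rows):
    """Average each game's scores, then rank games only against their own subgenre."""
    scores = defaultdict(list)
    for title, subgenre, score in rows:
        scores[(subgenre, title)].append(score)

    ranking = defaultdict(list)
    for (subgenre, title), values in scores.items():
        ranking[subgenre].append((title, mean(values)))

    # Highest average first, separately for every subgenre.
    for games in ranking.values():
        games.sort(key=lambda pair: pair[1], reverse=True)
    return dict(ranking)

rankings = rank_within_subgenres(reviews)
for subgenre, games in rankings.items():
    print(subgenre, games)
```

Note that the result never produces a single league table across subgenres, which is rather the point.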

However, Metacritic scores can take a little time to settle down, and inevitably purchases are made based on the earliest 2-3 web reviews (or sometimes even previews). In these cases, you need to evaluate the reviewer as much as the game, IMO. While most sites do credit the author of the review, few of them make it easy to find what other games this reviewer did or did not like, which is vital information. Some people categorise individual sites as reliable for them or not, but I don’t buy that - you can’t say that you trust 1UP or Eurogamer universally; it’s all about the individual reviewers. I tend to read Eurogamer for the humour, but whether I agree with their reviewers varies wildly. I often agree with Kieron Gillen (a veteran of several PC games mags I used to read too) but have strongly disagreed with their Keza MacDonald, who for example said that Guitar Hero III was ‘in every conceivable way, a better product than its predecessors’, which is pure, unadulterated tosh. It’s partly that review’s fault I bought GHIII, only to abandon it in disgust within 2 weeks, so I now discount any opinion its author has on music games.

So ideally, in online game reviews I’d like to see a box-out summary of a few other games in the same genre that this reviewer has judged, so I can figure out how much weight to give to their opinions. Any chance of that, Eurogamer/1UP/IGN et al? I think games reviewers should put their own face / personality out there more, so we can identify those we do / do not tend to agree with - this is the kind of thing some 8-bit mags used to do in the ’80s, with several people pitching in and each identifying themselves with a little picture or something. Why is everything so impersonal on the web?

The folly of game scores


I’ve believed for a while that the process of reviewing games (or indeed, most things) by allocating them a numerical score is akin to trying to pin a rosette to a charging rhino: an exercise in utter futility. The very act of attaching an absolute number to a review rests on a fallacy - that you can legitimately assign a piece of pseudo-empirical data, which will be processed as such downstream, to what is in fact an entirely subjective opinion.

If a game gets a 7 or better it’s generally ‘good’, and if it gets a 9 or 10 it’s generally considered ‘fecking awesome’. But it’s entirely based on what the reviewer tends to like - someone who dislikes JRPGs isn’t going to rate a JRPG as highly as someone who loves them, regardless of the inherent quality. Now, most reputable review sources will match their games to reviewers who are knowledgeable in the genre, which usually means that they also like that genre - otherwise they wouldn’t have played enough games in it to be considered authoritative. If this were achieved perfectly, each review would then be an assessment of pure quality and not of the genre itself, because each reviewer would be equally enthusiastic about the genre in question. Fine in theory, except that humans, even those highly trained game reviewers, are imprecise measuring devices at the best of times.

So, in practice it often falls apart entirely. Take the recent review of MGS4 by Eurogamer, which has been raising some hackles among fans because they ‘only’ gave the game an 8. If you just skipped to the number, you might be forgiven for thinking that this denotes disappointment (everyone seems to assume a headline game has to get a 10), and a number of people have posted in the comments along the lines of “I was going to get this, but now it’s got an 8, I won’t”. And yet, if you read the actual review and you’re an MGS fan, this game is probably going to be exactly what you wanted - by the sounds of it, it’s going to be a superbly realised sequel consistent with the deep tradition of the series. The point the reviewer makes, and presumably one of the reasons for the score, is that this cuts both ways: if you didn’t like MGS to begin with, you’re sure as hell not going to like this one either. It’s a Marmite thing. So how do you assign a single number that represents the fact that probably equal numbers of people are going to adore and despise it? Simply put, you can’t. Any number you assign is going to be wrong.
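To put the Marmite problem in numbers, here’s a tiny Python sketch. The audience is completely hypothetical (half the players adore the game and would give it a 10, the other half can’t stand it and would give it a 2), and I’m assuming the ‘fair’ headline score is just the plain average of those opinions.

```python
from statistics import mean

# Hypothetical audience: half adore the game (10), half can't stand it (2).
opinions = [10] * 50 + [2] * 50

# The 'fair' single number, assuming it is simply the average opinion.
headline_score = mean(opinions)

# How many people actually hold an opinion within one point of that number?
nearby = sum(1 for o in opinions if abs(o - headline_score) <= 1)

print(headline_score, nearby)
```

With this made-up split the average lands on a middling score that none of the hundred people would actually give, so the number describes nobody’s experience of the game.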

The only ‘right’ review is in the text. A good reviewer will explain why he/she likes or dislikes certain elements of the game, and if those are down to the subject matter or the genre, the reader is able to decide whether that applies to them or not. I read the review and knew for sure that MGS4 isn’t for me, and I’m sure MGS fans will read the review and get the exact opposite message. That’s perfect! A single score just disseminates false information, boiling away delicious reasoned argument and analysis into a nasty primeval sludge of a number at the bottom. I also think Eurogamer’s Rock Band review is accurate in the text but not the score, for the same reason - the downsides they mention don’t dampen my enthusiasm for the game one jot, so the 8 they gave it is probably not representative for me, even though for some people (perhaps those less into music games and/or more price sensitive) a lower score might be appropriate. It’s subjective!

The final nail in the coffin is that scores invite comparison. According to the average game scores, Mario Kart DS is a ‘better’ game than Puzzle Quest, something certainly not borne out by my personal play times on these 2 titles, but in any case these games are so different as to be completely incomparable. It’s like saying lemon is better than chocolate. It’s nonsense. Gibberish.

I realise that simple scores are perfect for the lazy, sound-bite loving MTV generation. But that doesn’t make them right.