The Case for Metacritic

Posted on April 3, 2009 by Soren Johnson

Over the last few years, Metacritic has become a popular whipping boy within the games industry. A recent example would be Adam Sessler’s bit at GDC’s journalist rant session. At the risk of beginning to sound like a reactionary contrarian, I feel a case needs to be made for Metacritic. Unlike my argument for used games (or, rather, for thinking critically about what we are trying to sell consumers for $60), I feel much less conflicted in this case, so let me state my thesis very clearly: Metacritic has been an incredible boon for consumers and the games industry in general. The core reason is simple – publishers need a metric for quality.

What should executives do if they want to objectively raise the quality bar at their companies? They certainly don’t have enough time to play and judge their games for themselves. Even if they did, they would invariably overvalue their own tastes and opinions. Should they instead rely on their own internal play-testers? Trust the word of the developers? Simply listen to the market? I’ve been in the industry for ten years now, and when I started, the only objective measuring stick we had for “quality” was sales. Is that really what we want to return to?

Yes, I know translating all ratings onto a 100-point scale distorts them – a C is not a 60 is not three stars – but we need to not let the perfect be the enemy of the good. What are the odds that we can get every outlet onto the same scoring scale? Not likely. Can Metacritic improve the way it converts non-numeric ratings into scores? Absolutely. However, the whole point of an aggregator is that these issues come out in the wash. When 50 opinions are being thrown into the machine, a 74 is actually different from a 73.
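The "come out in the wash" claim can be illustrated with a little arithmetic. The sketch below is only an illustration, not anything Metacritic publishes: it assumes each review is an independent reading with the same spread, and the 10-point spread (`ASSUMED_SD`) is a made-up figure chosen for the example. Under those assumptions, the noise in the average shrinks with the square root of the number of reviews.

```python
import math


def standard_error(per_review_sd: float, n_reviews: int) -> float:
    """Standard error of the mean of n independent reviews sharing one spread."""
    if n_reviews <= 0:
        raise ValueError("n_reviews must be positive")
    return per_review_sd / math.sqrt(n_reviews)


# Hypothetical spread of individual review scores, in points on a 100-point scale.
ASSUMED_SD = 10.0

for n in (1, 5, 50):
    # Print the review count and the resulting noise in the averaged score.
    print(n, round(standard_error(ASSUMED_SD, n), 2))
```

With these assumed numbers, a single review carries about 10 points of noise, five reviews about 4.5, and fifty reviews about 1.4. How much a one-point gap means in practice depends on the real spread, which this sketch does not know.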

I use Metacritic all the time, and I love it. It’s changed my game-buying (and movie-watching and music-listening) habits for the better, which of course funnels money into the pockets of deserving developers and encourages publishers to aim for critically-acclaimed products. Have we gotten so jaded that we have lost sight of what a wondrous thing this is? Metacritic puts an army of critics at our fingertips. Further, consumers are not morons who can’t judge a score within a larger context. We all realize that, due to the tastes of the average professional reviewer, some games are going to be over-rated and some will be under-rated.

Ultimately, the argument against Metacritic seems to revolve around whether publishers should take these numbers seriously. Some contracts are even beginning to include clauses tying bonuses to Metacritic scores. Others are concerned that publishers are too obsessed with raising their Metacritic averages. Actually, let’s think about that last sentence in detail. Note that when I just wrote “others,” I was referring to journalists, not to investors. As John Riccitiello famously said, “I don’t think investors give a shit about our quality.” How bizarre is it that once the game industry starts taking journalists’ work seriously, they complain about it?

I’ll give my own perspective on this issue. Over the years, I have seen many great ideas shut down because someone in charge thinks they won’t impact sales. However, when I am in an EA meeting in which we talk about the need to raise our Metacritic scores – and the concrete steps or extra development time thus required – I’ll tell you what I feel like doing. I feel like jumping for joy. How incredible is it to work for a publisher who cares about improving the quality of our games in the eyes of critics and uses an independent metric to prove it.

As for the remuneration issue, isn’t it a good thing that there is a second avenue for rewarding developers who have made a great game? Certainly, contracts are not going to stop favoring high game sales, so – hopefully – Metacritic clauses can ensure that a few developers with overlooked but highly-rated games will still be compensated. Now, if a game doesn’t have high sales and also doesn’t get a good Metacritic score, well, there’s a name for that type of game, and these developers should not be protesting. Further, developers also need to stop complaining that a few specific reviews are dragging down their Metacritic scores. Besides the fact that both good and bad reviews are earned, in a world without Metacritic, one low score from GameSpot, GameSpy, 1Up, or IGN becomes a disaster. Score aggregation, by definition, protects developers from too much power being in the hands of one critic.
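To put rough numbers on how aggregation dilutes a single low review, here is a minimal sketch. The scores and the weights are invented for illustration; Metacritic does not publish its actual weights, so `weighted_mean` and the 3x weight below are assumptions, not its formula.

```python
def weighted_mean(scores, weights=None):
    """Weighted average of scores; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(scores)
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and the same length")
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Hypothetical game: nine outlets score it 85, one outlet scores it 50.
scores = [85] * 9 + [50]

# Every outlet counted equally.
equal = weighted_mean(scores)

# Same scores, but the low-scoring outlet is (hypothetically) weighted 3x.
heavy = weighted_mean(scores, [1.0] * 9 + [3.0])

print(equal, heavy)
```

In this made-up case the lone 50 pulls an equal-weight average down to 81.5, and to 76.25 if that outlet counts triple. Either way the aggregate sits far closer to the other nine reviews than to the single low score.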

Journalists also need to have the guts to give games a score and stick by it. Putting a score on a review doesn’t take away the ability to add nuance to one’s criticism. My favorite music book is the Third Edition of the Rolling Stone Album Guide. As the reviews were written by just four critics, I have learned to understand the exact difference between five and four-and-a-half stars (or, for that matter, between two-and-a-half and three stars). If you are a great reviewer, the score you give a game helps me place it in context with everything else you have rated. Moreover, your score lets you contribute, via Metacritic and all the other aggregators, to the meta-critique of games on the Net. What exactly is the problem here?

21 thoughts on “The Case for Metacritic”

steve on April 3, 2009 at 11:27 am said:

“The core reason is simple – publishers need a metric for quality”

“Quality” isn’t something that can be turned into a simple metric. It’s the result of a number of data points—feedback, sales, reviews, awards, whatever. Relying on critics isn’t going to give you meaningful results any more than relying solely on sales.

“How incredible is it to work for a publisher who cares about improving the quality of our games in the eyes of critics and uses an independent metric to prove it.”

Are critics representative of the overall audience? The Sims isn’t necessarily a well-reviewed game by the IGNs of the world, so would you modify the game to match their tastes at the expense of the actual people who buy and love the game?

“As for the remuneration issue, isn’t it wonderful that there is a second avenue for rewarding developers who have made a great game?”

Not when you know more about the game review process. Unless you’re releasing the biggest game of the year, you’re going to be lucky if reviewers spend more than a day with your game.

AAA games get better reviews for various reasons, one of which is that they do get more money, more time, and more polish.

But let’s not overlook the impact of hype. Everyone goes into a review expecting it to be great, and few will dare defy the pre-determined consensus. That’s how GTA IV ends up with such high ratings; those reviews were written in the heads of critics before the game even shipped.

“Score aggregation, by definition, protects developers from too much power being in the hands of one critic.”

Maybe you’re not familiar with how Metacritic works. It weights scores from certain publications considerably higher than others. (Game Rankings is what you’re looking for if you want everything weighted identically.) With Metacritic, that one bad IGN or GameSpot review can tank your score.

There’s also the sheep effect. The first major site that publishes a review generally sets the tone for the ones that follow. Most critics are afraid to rock the boat because they fear the angry Internet nerd.
Soren Johnson on April 3, 2009 at 11:53 am said:

I am aware that Metacritic weights reviews differently, which is why I prefer it over GameRankings, but that’s just my personal taste.

I am not trying to argue here for Metacritic over the other aggregators. Metacritic is obviously not the perfect metric, but I think every aggregation scheme is going to have its pluses and minuses. At the end of the day, though, publishers really do need a metric for quality, and the aggregation sites are, by far, the closest the industry has come to having one.

Why are we assuming that people can’t think for themselves and use these scores as simply a valuable tool? Everyone knows that kids games, for example, are going to have a different score range from AAA games. That doesn’t mean that the scores aren’t “meaningful.”
Dale on April 3, 2009 at 12:53 pm said:

“However, when I am in an EA meeting in which we talk about the need to raise our Metacritic scores – and the concrete steps or extra development time thus required – I’ll tell you what I feel like doing. I feel like jumping for joy. How incredible is it to work for a publisher who cares about improving the quality of our games in the eyes of critics and uses an independent metric to prove it.”

I’m sorry, but you lost me there, Soren. For years, EA has done exactly the opposite. They’ve pushed sales numbers over quality. And this is popularised in Will Wright’s own comment: “We’d Rather Have the Metacritic and Sales of Sims 2 (65-75) than Those of Half-Life (90-96)”. From the latest and upcoming games there is no evidence to suggest a change of focus: Red Alert, The Sims 3, Spore, and others are purely focused on grabbing sales and not quality.

I hope you’re right and EA is turning to quality over quantity, but until some actual evidence is presented by them I can’t help but be totally skeptical.
Soren Johnson on April 3, 2009 at 1:14 pm said:

EA’s first job, obviously, is to sell games, but raising the Metacritic average of our games has been a significant internal priority for the last couple years. (Quality and quantity are certainly not mutually exclusive.) If we fail, well, that’s a different story. My main point is that there is now at least a somewhat concrete metric to aim for rather than the old, vague “make our games better” trope.
George Geczy on April 3, 2009 at 2:03 pm said:

There is a lot I could say on this topic, but let me jump first to my #1 reason why metacritic is evil: It doesn’t care about, and in fact it often screws badly with, titles from second-tier (and lower) publishers.

Of course an EA title is going to be reviewed by everyone so their metacritic list will get full, and the concept generally works. But a title by a publisher like Paradox (our publisher for Supreme Ruler 2020), or Matrix, or Shrapnel, or (insert small publisher name here) gets screwed because the coverage of reviews is spotty and inconsistent.

For instance, our game had about a dozen print magazine scored reviews published worldwide – most in Europe of course, since that’s where most of the PC print mags are. Metacritic posted the two worst, for instance PC Gamer US with a score of 50%, and didn’t post a much more detailed review from PC Gamer Sweden (also futuremark owned) which gave us 80% for the same game. Metacritic is often North-American and mainstream centric, ignoring more detailed and thorough reviews from genre-specific sites and foreign press.

When it comes to metacritic I could support the concept, but certainly not the execution.
Rathanel on April 3, 2009 at 2:07 pm said:

There are several points I’d like to respond to, but I’ll keep it short and pick two.

First: “However, when I am in an EA meeting in which we talk about the need to raise our Metacritic scores – and the concrete steps or extra development time thus required – I’ll tell you what I feel like doing. I feel like jumping for joy. How incredible is it to work for a publisher who cares about improving the quality of our games in the eyes of critics and uses an independent metric to prove it.”

I appreciate your optimism, but that is emphatically not what they just said. They want to improve their scores, not their quality.

Much like learning a subject in school, just because your grades are higher does not mean you have a better grasp of the material. Even short of flat-out cheating, there are many ways to improve your test scores without increasing your overall knowledge of the subject: learning the teacher’s habits and biases, studying previous exams the teacher has given, or observing which problems or points the teacher focuses on in class. Once you start doing that, you are no longer studying the subject, you are studying the tests for the subject.

And this is precisely what the above quote says to me. Hearing that the company is more concerned about pleasing the critics than the consumer is disturbing.

From my perspective, it seems rather insane that the work of many people over the course of years, work intended to be consumed by as many people as possible, is being held accountable to the judgment of a handful of people known as “games journalists”. The target audience should be “all gamers with any interest in genre X”, not “all journalists who may or may not be assigned to cover this game when it is released”.

And from the company’s perspective, the only metric that matters is how many people were willing to pay money for their product. Quality, execution, innovation, and any other metric are all subordinate to paying the bills, and only matter if they help you pay the bills for longer. A focus on quality in the long-term will of course benefit everyone, but that too is a means to an end.

— “Why are we assuming that people can’t think for themselves and use these scores as simply a valuable tool? Everyone knows that kids games, for example, are going to have a different score range from AAA games. That doesn’t mean that the scores aren’t “meaningful.””

We are assuming that the scores themselves have such a significant impact because so much energy is spent discussing the scores and their impact. The complaints in regards to “bad” reviews are not that “the reviewer is biased against the musical style of the game and let this color their review of the game” but that “the score is too low, clearly the reviewer hates this company/console/publisher/genre/the color blue” or “the score is too high, clearly the reviewer has been bribed.”

The contractual bonuses based on MetaCritic scores are just another example where the scores are not being used as “valuable tools” but as something with inherent value, separate from the reviews which feed into them.

Imagine a scenario where a game receives a mediocre to poor MetaCritic score (a 60, we’ll say) but sells ten times as many copies as were projected, well into the millions. Is this a good game? A bad game?

Clearly the company is making money on this game, shouldn’t the people who worked on this game be rewarded for producing something above and beyond what was expected? Or should they be fired or their bonuses withheld because the game didn’t review well, and thus performed worse than expected?

In fact, when did we decide that MetaCritic is a valuable tool? How many people use MetaCritic as a guide for purchasing decisions, sole or otherwise? How well does the MetaCritic score align with their own tastes and experiences? How well does a high MetaCritic score align with sales, both initial and over the course of a year plus?

/endrant

A question for you, if I may… As someone who works in the industry, how have you seen or heard about the MetaCritic scores factoring into bonuses and contracts (to the extent you are allowed to answer, of course)? Is it only developers who are stuck with the score? Is marketing held accountable as well?

And how do sales figure in? Are total sales applied to everyone for bonuses, or only select groups?
steve on April 3, 2009 at 2:30 pm said:

“At the end of the day, though, publishers really do need a metric for quality, and the aggregation sites are, by far, the closest the industry has come to having one.”

The problem is that people do ascribe meaning to a meaningless number, despite the fact you can’t put a number on something as unquantifiable and fuzzy as “quality.” I realize you’re not saying it’s some perfect metric, but the number lacks the context that all of the text accompanying those numbers gives.

For example, I’ve written reviews where I said a game was terrible, but awesomely terrible. If I give it a high or low rating based on some very personal criteria, it won’t translate at all because it’s been separated from its context.

“Why are we assuming that people can’t think for themselves and use these scores as simply a valuable tool?”

I think that once you involve math, you add a certain amount of certainty. But all of those numbers are arbitrary and weighted differently by each source, much less by the aggregator, and those formulas are arbitrary in how they convert everyone else’s arbitrary number. I’m no mathematician, but it seems like it gets less meaningful over time instead of more meaningful.

Or to put it more pithily, I don’t believe math can be used to put a valuation on art.
Soren Johnson on April 3, 2009 at 3:49 pm said:

I understand your points. However, what I am trying to say – and perhaps my post should have said just this for clarity – is that I have seen real, tangible benefits as a developer for publishers wanting to increase their Metacritic score. I find this effect so significant that I hate to see this not acknowledged when people fall into the typical “Metacritic sucks” rant. It is easy to be cynical and believe that executives don’t care about quality – but if they loudly focus internally on improving the aggregate scores, clearly that is an important message for the developers.
George Geczy on April 3, 2009 at 5:33 pm said:

“but if they loudly focus internally on improving the aggregate scores, clearly that is an important message for the developers.”

Maybe instead the publishers should try to engage their player communities more, or at least follow the feedback and opinions of their fans. And I’m certainly not talking about Creative Assembly deleting references on their forums to Tom Chick’s critical review, or EA pushing gaming sites to remove user reviews that focused on Spore’s DRM, etc. It’s true that player feedback is not a ‘measurable metric’ that you can put into a contract or even a boardroom meeting, but following forums would give decision-makers a much better idea of what players like and what they don’t.
George Geczy on April 3, 2009 at 5:38 pm said:

“If I give it a high or low rating based on some very personal criteria, it won’t translate at all because it’s been separated from its context.”

And this is also a great point – I noticed that one of the reviews that metacritic posted for us said “Too hardcore for casual gamers, but strategists should add two to the score”. Add two means add 20% to the 60% mark they gave us – and since casual gamers are NOT our audience, metacritic should have posted that review at 80%, which of course they did not.

I’ll go back under my rock now…
Soren Johnson on April 4, 2009 at 9:16 am said:

@George: Yeah, I think for more niche titles, aggregators are going to be more problematic. You won’t be able to compare the score for Supreme Ruler 2020 fairly with, say, Dawn of War 2 or Empire: Total War. (Perhaps it is still useful for comparisons with Europa Universalis 3 or SR 2010, but the sample size is still quite small…) As I argued in my main post, we have to view Metacritic scores in a larger context of what type of audience the game is aiming for and what the preferences of the typical reviewer are. For my own part, I am more optimistic than some that consumers, developers, and publishers are capable of making this comparison.
Soren Johnson on April 4, 2009 at 9:22 am said:

@Rathanel: EA is certainly more concerned with making the consumer happy than with making the critics happy. The problem is that the voice of the consumer is very hard to hear through all the noise. The only thing in black and white, ultimately, is the sales figures, and we all can think of many great titles that didn’t measure up by that metric.

As for contracts, I don’t have a lot of direct experience with that myself. My HOPE is that we can use aggregators to provide an additional way to reward developers for doing a good job, but I have heard disturbing reports that sometimes you need to hit a sales AND a Metacritic target to get a bonus. I think these types of clauses are too restrictive and are not helping anyone.
Dale on April 4, 2009 at 1:47 pm said:

Soren, it’s good to hear that EA is focusing on gamers again rather than sales & shareholders. I really hope this is the truth. Thanks for your comments.
David desJardins on April 4, 2009 at 11:40 pm said:

The problem with Metacritic is not games with 50 ratings, it’s games with 5 ratings, where some reviewers have systematic bias (i.e., their average scores are lower or higher than others), and so the semi-random effect of who happens to review a particular game can appreciably affect the scores.

That said, I think Metacritic is great, and seems almost entirely positive from my point of view. And it’s especially helpful to the indie publishers who don’t get very many reviews, but still appear on a more-or-less equal footing.
Andrew on April 6, 2009 at 5:42 am said:

Hmm, I think the point of Metacritic is a flawed one. I hope reviewers move more and more away from scores and simply toward “Is it worth playing this game, yes or no,” like how many film reviewers work. This can be aggregated too, and hopefully worldwide coverage will be better in the future (as said by others, all the review aggregators are pretty spotty).

I’m happy that games are moving forwards on “quality” (whatever that is), but doing it on the basis of a number isn’t a good way of doing it for me. It is much easier to say if something is worth playing than to say what percentage of it is quality.
D P on April 6, 2009 at 8:57 am said:

Would this be as much of a controversy or problem if game reviewers were independent of “the industry”? (I.e., not tied to needing to pad reviews up to maintain advertising funding.) Is this actually the reason that an 80 out of 100 is an “average” for most places, and we see so many titles at 90+?
Impaler[WrG] on April 13, 2009 at 5:08 pm said:

Perhaps Metacritic and the other aggregators could benefit from including other types of data in their mix. In politics, a scientific, demographically weighted, random poll is considered the gold standard for measuring success (not counting actual elections, of course). Why not conduct real, scientific post-purchase polls of consumer satisfaction with the product?

Though critics have their place, I find it disappointing that games have fallen into the clutches of ‘critics’ in much the same way movies and most other creative works have. A few things, like books, are driven more by sales-based statistics or word of mouth. But for the most part, post-purchase feedback from consumers is shockingly absent from almost all discourse on the quality of arts & entertainment in America, and there is no good reason for that.
cephalo on May 6, 2009 at 12:42 pm said:

Personally, I feel I’ve been burned more and more by game reviews. I think metacritic helps, but it’s clear to me that the industry is pretty well saturated with some kind of payola. In the late 90’s you could really trust game reviews, but lately I feel I can only trust consumer reviews. I’ve seen a lot of shining reviews coming out before games are even released, only to find that the game doesn’t even come close to the hype, and sometimes isn’t even stable enough to suffer through a whole game.

DESIGNER NOTES

Soren Johnson's Game Design Journal