02 Jan

Why most rating systems suck

Most websites let visitors rate articles, files, videos and other content. It’s a good way to keep popular content the most visible. But if the ratings are to be used seriously, we need accurate data, and most rating systems do not necessarily provide it.

The biggest reason why rating content doesn’t work as well as it could is that humans use the system. Humans are very noisy when it comes to input. While the law of large numbers eventually ensures consensus, choosing a good rating system could make ratings stabilize much faster.

Pick a number

Probably the most common way to rate content is to give it a score, usually between one and five. Some sites use a larger scale, notably IMDB, which scores movies between one and ten points. Now, here’s the big revelation: this is not a very good way to do this.

The more options you have, the harder it is to choose accurately. IMDB probably has a user base of hundreds of thousands (most movies have tens of thousands of voters), which means they could simply have two ratings: “I liked this movie” and “I didn’t like this movie”. Maybe they could throw in “I didn’t like this but I don’t hate it either”, too. Given the huge number of votes, they would still get an average score accurate enough for their charts and whatnot.
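To make the "huge number of votes" argument concrete, here is a minimal sketch in Python. It assumes every voter independently likes the movie with the same probability, which is a simplification and not something IMDB is known to do. The function names and the 0.7 like rate are made up for illustration.

```python
import math
import random


def like_fraction(votes):
    """Share of 'liked' votes; votes is a list of booleans."""
    return sum(votes) / len(votes)


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent votes."""
    return math.sqrt(p * (1 - p) / n)


def simulate_votes(true_like_rate, n_voters, seed=0):
    """Hypothetical voters: each one likes the movie with the same probability."""
    rng = random.Random(seed)
    return [rng.random() < true_like_rate for _ in range(n_voters)]


# Compare a small and a large audience for the same (assumed) like rate.
for n in (100, 10_000):
    votes = simulate_votes(0.7, n)
    print(n, round(like_fraction(votes), 3), round(standard_error(0.7, n), 4))
```

Under this assumption the expected error shrinks with the square root of the number of voters, so a hundred times more votes gives roughly ten times less wobble in the like percentage.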

Another downside of such a system is that people give biased scores. The reasons vary; my personal excuse is that school grades in Finland traditionally go from four to ten, roughly representing a score of 40 % to 100 %. This means I rarely vote outside that familiar scale unless a movie has to be punished with the lowest rating possible, which is another example of the noisy input people provide.
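One possible way to compensate for this kind of personal scale is to stretch each user's scores from the range they actually use onto the full scale. This is only a sketch of one option, not something any particular site is known to do, and the function name and the one-to-ten default are assumptions.

```python
def rescale_user_scores(scores, scale_min=1, scale_max=10):
    """Stretch one user's scores from their personal range onto the full scale."""
    lo, hi = min(scores), max(scores)
    if lo == hi:
        # The user always gives the same score, so there is nothing to stretch;
        # fall back to the middle of the scale.
        mid = (scale_min + scale_max) / 2
        return [mid for _ in scores]
    span = scale_max - scale_min
    return [scale_min + (s - lo) * span / (hi - lo) for s in scores]


# A voter who only ever uses four to ten, like the Finnish school scale.
print(rescale_user_scores([4, 7, 10]))
```

Note that a single "punishment" vote at the bottom of the scale already makes the personal range equal to the full range, so this simple rescaling would leave such a voter's scores unchanged.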

Pros: Easy to get accurate data from a small user base
Cons: Very noisy, hard to decide between similar options

Thumbs

A slightly more modern way to rate things is, ironically, a very old method: thumbing things up or down, much like a Roman emperor. As stated above, the limited scale still provides accurate ratings thanks to the law of large numbers. Giving the user two options removes the statistical noise (excluding accidental votes): it is easier to extract the information by simply asking “Did you like this or not?”.

However, giving the user fewer options limits his or her ability to rank items on personal lists and favorites, if that is needed. This could be solved with additional questions such as “Did you like X more than Y?”, or by simply allowing the user to order items from best to worst in a list. In addition, an ordered-list approach would give the user more perspective if he or she wants to provide accurate input, since it forces them to constantly consider whether an item really is better than the one below it.
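As a rough illustration of the “Did you like X more than Y?” idea, the answers can be turned into a ranking by counting how often each item wins. This is a deliberately simple sketch; the function name and the item names are hypothetical, and real ranking schemes are usually more elaborate.

```python
from collections import defaultdict


def rank_by_wins(comparisons):
    """Rank items by win rate, best first.

    comparisons is an iterable of (winner, loser) pairs, one per answered
    "Did you like X more than Y?" question.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    # Sort by win rate (highest first); ties are broken alphabetically.
    return sorted(games, key=lambda item: (-wins[item] / games[item], item))


answers = [("A", "B"), ("A", "C"), ("B", "C")]
print(rank_by_wins(answers))
```

An ordered list from a single user can be fed into the same function by treating every adjacent pair in the list as one answered question.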

Pros: Easy to vote, less noise
Cons: Needs a bigger data set to provide accurate data

Obviously, all of the above is to be taken seriously only if you really want better input or want to avoid a bit of work: a fancy system doesn’t necessarily provide that much and could simply confuse the users. My personal choice would be the minimalistic approach, that is, thumbing up or down. Any comments?

6 thoughts on “Why most rating systems suck”

  1. http://forums.facepunchstudios.com/
    They have a unique rating system: you can rate the posts people make with these ratings: Gold star, Agree, Disagree, Awful spelling, Winner, Good idea, Useful, Funny, Friendly, Unfriendly, Zing, Thanks, Informative and Artistic.

    It’s a pretty cool forum made for the Half-Life 2 modification Garry’s Mod. But it’s also there to discuss other games and life’s problems.

  2. The thumbs (which clearly come from StumbleUpon) have more than two options. Although only two are promoted, you can use the tools menu to report the page as being in the wrong topic, not appropriate, inaccurate, in the wrong language or broken.

    I suspect this could be generalised quite well for a wide variety of voting applications. Present two options in an obvious way but supply other options that can be chosen if desired.

    Going with movies as an example, I might thumb up “Die Hard” because I liked it, but I would also want to say “Mindless action” about it. I would also thumb up “The Fifth Element”, but even though it’s an action flick starring the same actor, it’s not “Mindless action”, so hopefully there would be an appropriate option for me to choose. Maybe “Comedy action”. I could still just give the movie a thumbs up or a thumbs down, but if I want to supply more information to help others know how I rate that movie, I can do it.

    I think this would avoid both the noise and the bias of the first method and yet enable a smaller data set to provide more accurate results.

    On a side note, most movie rating systems are not on a scale between one and five… or rather they are, but they allow half-points (usually half-stars), which essentially puts them on a scale between one and ten. This has always bugged me. Why not just make the scale from one to ten? Bah!
