Why the polls got the Lib Dems wrong in 2010

Every time we get an ICM poll it produces a lot of comment on the big difference between the level of Lib Dem support ICM find and that shown by other companies. I intend to return to that at some point in the future, but as background for considering how the polls represent the Lib Dems, we really need to work out why all the polls got them wrong at the general election.

All of the companies that released polls within a few days of the general election overstated Lib Dem support, by between 2 and 5 points, but the reasons are still unclear, and will probably remain so until pollsters get to test their methods against the next general election. Broadly, the possible explanations boil down to people changing their intentions, pollsters misjudging their intentions, or pollsters polling unrepresentative groups of people.

To take the first explanation first, people may have genuinely intended to vote Liberal Democrat when interviewed by pollsters, but then changed their minds in the hours that followed, or even in the polling booth itself. There is plenty of anecdotal evidence of this late swing, but scant hard evidence. The only polling to really support it is Populus's final poll, which found much lower levels of Lib Dem support in the fieldwork conducted on Wednesday than in the Tuesday fieldwork - and even then this may be coincidence, as YouGov also compared their Tuesday and Wednesday fieldwork and didn't find a drop in Lib Dem support.

The final polls strongly suggested that Lib Dem support was soft - several companies included questions on whether people might still change their minds and found Lib Dem voters were still unsure. However, just because people are uncertain doesn't mean they necessarily will change their minds. If there had been a late swing away from the Lib Dems, it should have been picked up by recontact surveys after the election. Angus Reid, YouGov and ICM all recontacted people who were polled late in the campaign to see if those who said they would vote Lib Dem changed their minds after the final polls - all found negligible levels of late swing.

Polls conducted immediately after the election also argue against the late swing hypothesis. MORI don't use any past vote weighting, so their post-election polls would have used identical weighting to their pre-election polls - and their poll conducted immediately after the general election still found 27% of people claiming they had voted Liberal Democrat. The late swing hypothesis may make sense narratively, and probably does explain a small amount of the error, but the evidence simply doesn't support it being the whole reason.

Moving on, we come to whether pollsters correctly interpreted what people told them - here I include how pollsters worked out which people would vote, and what they did with don't knows. A common hypothesis is that the new support the Liberal Democrats picked up after the debates came from the people least likely to actually turn out to vote - young people, people who hadn't previously voted and so on - and that when election day finally came round those people did not actually vote. If the problem was down to incorrectly predicting turnout, it would also explain why the exit poll was right, since the exit poll only interviews people who have actually voted. Once again, this makes sense narratively, but the evidence from recontact surveys after the election doesn't back it up - ICM found no sign that Lib Dem supporters were less likely to actually turn out, and YouGov found only negligible differences. In this case we can at least look forward to a conclusive answer, though. We will never know for certain who survey respondents actually voted for, but we will know for certain whether they voted: the British Election Study validates its survey against the marked electoral register, which will show whether people who said they were going to vote Lib Dem before the election disproportionately ended up not voting.
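For readers who want a concrete picture of how turnout adjustments work, here is a rough sketch in Python. It shows two generic approaches - weighting each respondent by their stated 0-10 likelihood to vote, or filtering to the 10/10-certain voters - and is not any particular pollster's actual method; all the respondent figures are invented:

```python
# Toy sketch of turnout adjustment on a 0-10 likelihood-to-vote scale.
# Both approaches are generic illustrations, not any pollster's actual
# method, and all respondent data is invented.

def turnout_adjusted_shares(respondents, certain_only=False):
    """respondents: list of (vote, likelihood_0_to_10) tuples.
    Returns vote shares either weighted by likelihood/10, or
    filtered to 10/10-certain voters only."""
    tallies = {}
    for vote, likelihood in respondents:
        if certain_only:
            w = 1.0 if likelihood == 10 else 0.0
        else:
            w = likelihood / 10
        tallies[vote] = tallies.get(vote, 0.0) + w
    total = sum(tallies.values())
    return {v: round(100 * t / total, 1) for v, t in tallies.items()}

# Invented sample where Lib Dem supporters report lower certainty to vote
respondents = ([("LD", 10)] * 10 + [("LD", 6)] * 20 +
               [("Con", 10)] * 25 + [("Con", 8)] * 10 +
               [("Lab", 10)] * 25 + [("Lab", 8)] * 10)

print(turnout_adjusted_shares(respondents))
print(turnout_adjusted_shares(respondents, certain_only=True))
```

In this made-up sample the Lib Dems are 30% of raw respondents, but drop to 25% weighted and under 17% on a certain-to-vote filter - the kind of gap that, in practice, the recontact surveys failed to find.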

Then we get to how pollsters treat don't knows. The reason for the error could have been don't knows breaking disproportionately for Labour and the Conservatives. ICM's recontact survey found don't knows did break disproportionately for Labour, and their topline adjustment made their figures more accurate; MORI's squeeze question also boosted Labour - so it would seem the pollsters got this one right. YouGov don't use any reallocation of don't knows, but their recontact survey showed their don't knows split pretty evenly (there are likely to be different patterns of saying don't know when there is no human interviewer).
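To make the mechanism concrete, here is a toy sketch of an ICM-style reallocation of don't knows towards recalled past vote. The 50% fraction and all the counts are invented for illustration - they are not ICM's actual parameters or data:

```python
# Toy sketch of reallocating "don't knows" towards the party they recall
# voting for last time. The 0.5 fraction and all counts are invented for
# illustration - not ICM's actual parameters or data.

def reallocate_dont_knows(stated, dont_knows_by_past_vote, fraction=0.5):
    """stated: counts of respondents giving a voting intention, by party.
    dont_knows_by_past_vote: counts of don't-knows, by recalled past vote.
    Adds `fraction` of each don't-know group back to their past party,
    then returns percentage shares."""
    totals = dict(stated)
    for party, n in dont_knows_by_past_vote.items():
        totals[party] = totals.get(party, 0) + fraction * n
    grand = sum(totals.values())
    return {party: round(100 * t / grand, 1) for party, t in totals.items()}

stated = {"Con": 340, "Lab": 280, "LD": 280}
dont_knows_by_past_vote = {"Con": 30, "Lab": 50, "LD": 20}

print(reallocate_dont_knows(stated, dont_knows_by_past_vote))
```

Because more of the invented don't knows recall voting Labour, the adjustment nudges Labour up relative to the unadjusted shares - the direction ICM's recontact survey suggested was right in 2010.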

Finally we come to pollsters getting unrepresentative groups of voters in their samples - either through weighting problems, disproportionate response rates or sampling problems.

Weighting as a cause faces the problem of the wide variety of methods pollsters use - it would have to be a problem that affects both online and phone polling, and both past-vote-weighted and non-past-vote-weighted polling. For example, in their post-mortem ICM suggested they got their Lib Dem weighting wrong and weighted them too highly (the implication from the article is that ICM reduced the assumed level of false recall in order to reduce the level Labour were weighted to, with the side effect that the Lib Dems were weighted to a higher level). However, this cannot explain the whole problem, as companies who don't use past-vote weighting, like MORI, also got it wrong.
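For anyone unfamiliar with past-vote weighting, here is a toy sketch of the mechanism, including a false-recall adjustment. The blending approach is a generic illustration rather than ICM's actual formula, and all the shares are invented:

```python
# Toy sketch of past-vote weighting with a false-recall adjustment.
# recall_adjust=1 weights recalled past vote fully to the actual result
# (i.e. assumes no false recall at all); lower values assume more of the
# discrepancy is genuine false recall and correct for less of it.
# All shares are invented - this is not any pollster's actual formula.

def past_vote_weights(sample_recall, actual_result, recall_adjust=0.5):
    """Returns a weight for each past-vote group so that recalled past
    vote matches a blend of the actual result and the raw sample."""
    weights = {}
    for party, recalled in sample_recall.items():
        target = recall_adjust * actual_result[party] + (1 - recall_adjust) * recalled
        weights[party] = target / recalled
    return weights

# Invented sample that over-recalls Labour and under-recalls the Lib Dems
sample_recall = {"Con": 0.31, "Lab": 0.38, "LD": 0.20}
actual_2005 = {"Con": 0.33, "Lab": 0.36, "LD": 0.23}

print(past_vote_weights(sample_recall, actual_2005, recall_adjust=0.5))
print(past_vote_weights(sample_recall, actual_2005, recall_adjust=1.0))
```

Note the side effect the ICM post-mortem describes: reducing the assumed level of false recall (raising recall_adjust in this sketch) pushes the Labour weight down and the Lib Dem weight up at the same time.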

Disproportionate response rate is the hypothesis that at the height of "Cleggmania" people who were voting Lib Dem were much more likely to agree to take part in opinion polls, and so ended up over-represented. This seems like a plausible explanation to me, but the evidence to support it is lacking.

A possible test was the exit poll: if Lib Dems were disproportionately likely to take part in pre-election polls, they should also have been disproportionately likely to agree to take part in the exit poll. In the exit poll, interviewers estimate the political support of respondents before interviewing them as a way of picking up differential response rates. In 2010 people who the interviewers estimated would vote Lib Dem did indeed have a slightly higher response rate, but only around 84% compared to 80% for Conservatives - and, more importantly, interviewers were very poor at correctly identifying people who "looked Lib Dem". The evidence from the exit poll is therefore very much "not proven".

Another possible measure was YouGov's panel data. I thought perhaps response rates amongst people who identified with the Liberal Democrats might have risen disproportionately after the first debate. Having checked, they did rise, but no more than response rates amongst Conservative and Labour supporters. Of course this doesn't rule out the explanation: these were pre-existing Lib Dem identifiers, whereas the people experiencing the biggest surge of enthusiasm may have been new converts to Cleggism, and those people were not pre-identified on the panel.

A final explanation is that there was a sampling problem - that pollsters' samples were biased towards the Liberal Democrats. As with weighting, this would need to be something that somehow affected all pollsters, despite them using online panels, telephones or face-to-face quota sampling. One hypothesis concerns the young people pollsters contact. All types of sampling seem to face problems getting under-25s - phone pollsters in particular always seem to have to almost double their weight. I suspect the young people who are contacted by pollsters are the sort of well-educated, middle-class young people who are most likely to vote Lib Dem; the ones they don't contact are the ones most likely not to vote at all.
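The mechanics of why this matters are easy to sketch. In the toy example below, under-25s are 12% of an invented target population but only 6% of the sample, so each one gets a weight of about two - and if the few under-25s a pollster can actually reach lean Lib Dem, that lean gets doubled along with them. All the figures are made up:

```python
from collections import Counter

# Toy sketch of demographic weighting scaling up an unrepresentative
# subgroup. All figures are invented for illustration.

def weighted_shares(sample, targets):
    """sample: list of (age_group, vote). targets: population share of
    each age group. Weights each respondent by target share / sample
    share for their group, then returns weighted vote shares."""
    group_counts = Counter(group for group, _ in sample)
    n = len(sample)
    weights = {g: targets[g] / (group_counts[g] / n) for g in group_counts}
    vote_weight = Counter()
    for group, vote in sample:
        vote_weight[vote] += weights[group]
    total = sum(vote_weight.values())
    return {v: round(100 * w / total, 1) for v, w in vote_weight.items()}

# 6 under-25s in a sample of 100, all leaning Lib Dem here for illustration
sample = ([("under25", "LD")] * 6 + [("25plus", "LD")] * 20 +
          [("25plus", "Con")] * 38 + [("25plus", "Lab")] * 36)

print(weighted_shares(sample, {"under25": 0.12, "25plus": 0.88}))
```

The raw Lib Dem share here is 26%; after the under-25s are weighted up it comes out at around 30.7%. If those six under-25s were a representative cross-section of young voters the weighting would be harmless - the problem only arises if they weren't.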

Another hypothesis is that people who agree to take part in opinion polls are too interested in politics. Polls from various companies found 50% or so of people claiming they had watched the leaders' debates, which the viewing figures suggest is an overestimate, even accounting for people counting a clip of it on the news. It makes perfect sense that if the jump in Lib Dem support was concentrated amongst those who watched the debates, and polls included too many people who watched the debates, then polls would overestimate Lib Dem support. Sadly there is no firm measure pollsters could use to weight for interest in politics.

Strikingly, in 1992, when the response rate to polls was about 1 in 8, 52% of people told MORI they had been very or fairly interested in the general election campaign. In 2010, with response rates down to about 1 in 12, 75% of people told MORI the same. Now, perhaps that was because 2010 was a particularly interesting campaign with the first leaders' debates... or perhaps, as response rates fall, the group of people willing to take part in polls is increasingly disproportionately interested in politics.

So, taking all those into account, there is no clear reason for the Lib Dem overestimate in the pre-election polls. My guess is that a little bit was down to late swing, with other bits down to disproportionate response from Lib Dem supporters during the enthusiasm of Cleggmania, and pollsters having samples that are rather too well educated and interested in politics. As I've said above though, actually proving this is an entirely different matter!