Latest voting intention and the mystery of house effects
There have been several new polls with voting intention figures since the weekend, though all so far have been conducted before the government's defeat on their Brexit plan.
ComRes/Express (14th-15th) - CON 37%(nc), LAB 39%(nc), LDEM 8%(-1), UKIP 7%(+1)
YouGov/Times (13th-14th) - CON 39%(-2), LAB 34%(-1), LDEM 11%(nc), UKIP 6%(+2)
Kantar (10th-14th) - CON 35%(-3), LAB 38%(nc), LDEM 9%(nc), UKIP 6%(+2)
Looking across the polls as a whole, Conservative support appears to be dropping a little, though polls are still ultimately showing Labour and the Conservatives very close together in terms of voting intention. As ever there are some differences between companies - YouGov are still showing a small but consistent Tory lead; the most recent polls from BMG, Opinium and MORI had a tie (though Opinium and MORI haven't released any 2019 polls yet); and Kantar, ComRes and Survation all showed a small Labour lead in their most recent polls.
Several people have asked me about the reasons for the differences between polling companies' figures. There isn't an easy answer - there rarely is. The reality is that all polling companies want to be right and want to be accurate, so if there were easy explanations for the differences and it was easy to know what the right choices were, they would all rapidly come into line!
There are two main elements responsible for house effects between pollsters. The first is what they do to the voting intention data after it is collected and weighted - primarily, how they account for turnout (to what extent they weight down or filter out people who are unlikely to vote), and what they do with people who say they don't know how they'll vote (do they ignore them, or use squeeze questions or inference to try to estimate how they might end up voting). The good thing about these sorts of differences is that they are easily quantifiable - you can look up the polling tables, compare the figures with turnout weighting and without, and see exactly the impact they have.
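To make that concrete, here is a minimal Python sketch - using invented respondents and a made-up 0-10 likelihood-to-vote score, not any pollster's real model - of how a turnout weight or a hard turnout filter can move the topline figures:

```python
# Toy illustration (invented data, not any pollster's real model):
# respondents give a 0-10 likelihood-to-vote score, and we compare raw
# shares against score-weighted shares and a hard 10/10-only filter.
from collections import Counter

respondents = [  # (party, likelihood to vote on a 0-10 scale)
    ("CON", 10), ("CON", 9), ("CON", 10), ("CON", 8), ("CON", 10),
    ("LAB", 10), ("LAB", 7), ("LAB", 5), ("LAB", 9), ("LAB", 6),
    ("LDEM", 8), ("LDEM", 10),
]

def shares(weight_fn):
    """Return each party's percentage share of the weighted total."""
    totals = Counter()
    for party, ltv in respondents:
        totals[party] += weight_fn(ltv)
    grand = sum(totals.values())
    return {p: round(100 * t / grand, 1) for p, t in totals.items()}

print("Unweighted:      ", shares(lambda ltv: 1))           # everyone counts once
print("Turnout-weighted:", shares(lambda ltv: ltv / 10))    # weight = score / 10
print("10/10 filter:    ", shares(lambda ltv: 1 if ltv == 10 else 0))
```

In this invented sample Labour supporters report lower certainty to vote, so the turnout weight and the filter both push the topline towards the Conservatives - which is exactly the kind of effect you can read straight off a published polling table.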
At the time of the 2017 election these adjustments were responsible for a lot of the difference between polling companies. Some polls were using turnout models that really transformed their topline figures. However, those sorts of models also largely turned out to be wrong in 2017, so polling companies are now using much lighter-touch turnout models, and little in the way of reallocating don't knows. There are a few unusual cases (for example, I think ComRes still reallocate don't knows, which helps Labour at present, though most companies do not; and BMG no longer do any weighting or filtering by likelihood to vote, an adjustment which for other companies tends to reduce Labour support by a point or two). These small differences are not, by themselves, enough to explain the differences between polls.
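For readers unfamiliar with what "reallocating don't knows" means in practice, here is a hedged sketch - loosely in the spirit of ICM's old approach of assigning a fraction of don't knows back to their past vote, not ComRes's actual method, and with invented data:

```python
# Hypothetical don't-know reallocation (invented data): assume half a
# vote's worth of each "don't know" ends up with the party that person
# voted for at the previous election.

respondents = [  # (current intention, vote at last election)
    ("CON", "CON"), ("LAB", "LAB"), ("CON", "CON"), ("LAB", "CON"),
    ("DK", "LAB"), ("DK", "LAB"), ("DK", "CON"), ("LDEM", "LDEM"),
]
REALLOCATE = 0.5  # fraction of each don't-know credited to their past party

totals = {}
for intention, past in respondents:
    if intention == "DK":
        totals[past] = totals.get(past, 0) + REALLOCATE
    else:
        totals[intention] = totals.get(intention, 0) + 1

n = sum(totals.values())
for party, t in sorted(totals.items()):
    print(party, f"{100 * t / n:.0f}%")
```

Whichever party's former voters are currently parked in the don't-know column gains from the adjustment - which is why, with many 2017 Labour voters currently undecided, a reallocation of this sort tends to help Labour at present.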
The other big differences between polls are their samples and the weights and quotas they use to make them representative. It is far, far more difficult to quantify the impact of these differences (indeed, without access to raw samples it's pretty much impossible). Under BPC rules polling companies are supposed to be transparent about what they weight their samples by and to what targets, so we can tell what the differences are, but we can't with any confidence tell what the impact is.
I believe all the polling companies weight by age, gender and region. Every company except Ipsos MORI also weights by how people voted at the last election. Beyond that polling companies differ - most weight by EU referendum vote, and some weight by education (YouGov, Kantar, Survation), social class (YouGov, ComRes), income (BMG, Survation), working status (Kantar), level of interest in politics (YouGov), newspaper readership (Ipsos MORI) and so on.
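The weighting step itself is simple enough - each respondent in a demographic cell gets a weight of the target share divided by the sample share. A minimal sketch, with invented cells and targets:

```python
# Basic cell weighting (invented sample counts and targets): each
# respondent's weight is target share / sample share for their cell,
# so the weighted sample matches the population target.

sample_counts = {"18-34": 150, "35-54": 350, "55+": 500}     # who responded
target_shares = {"18-34": 0.28, "35-54": 0.34, "55+": 0.38}  # census-style targets

n = sum(sample_counts.values())
weights = {cell: target_shares[cell] / (count / n)
           for cell, count in sample_counts.items()}

for cell, w in sorted(weights.items()):
    print(f"{cell}: weight {w:.2f}")
# Under-represented groups (here the young) get weights above 1;
# over-represented groups get weights below 1.
```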
Even if polling companies weight by the same variables, there can be differences. For example, while almost everyone weights by how people voted at the last election, there are differences in the proportion of non-voters they weight to. It makes a difference whether targets are interlocked or not. Companies may use different bands for things like age, education or income weighting. On top of all this, there are questions about when the weighting data is collected: for things like past general election vote and past referendum vote there is a well-known phenomenon of "false recall", where people do not accurately report how they voted in an election a few years back. Hence weighting by past vote data collected at the time of the election, when it was fresh in people's minds, can be very different to weighting by past vote data collected now, at the time of the survey, when people may be less accurate.
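The interlocking point is perhaps the least intuitive, so here is a sketch with invented numbers. Interlocked weighting gives every joint age-by-gender cell its own target; non-interlocked weighting (raking, also known as iterative proportional fitting, as below) only matches each margin separately, so the implied joint distribution can come out differently:

```python
# Non-interlocked weighting via raking (invented numbers): alternately
# scale the age margin and the gender margin until both match their
# targets. The joint age-by-gender cells are then whatever the raking
# implies, rather than being set to explicit interlocked targets.

sample = {("young", "men"): 100, ("young", "women"): 200,
          ("old", "men"): 400, ("old", "women"): 300}
age_targets = {"young": 0.40, "old": 0.60}
gender_targets = {"men": 0.49, "women": 0.51}

def rake_step(cells, targets, axis):
    """Scale cells so the margin along `axis` matches its targets."""
    n = sum(cells.values())
    margin = {}
    for key, v in cells.items():
        margin[key[axis]] = margin.get(key[axis], 0) + v
    for key in cells:
        cells[key] *= targets[key[axis]] * n / margin[key[axis]]

cells = dict(sample)
for _ in range(20):  # alternate between the two margins until they converge
    rake_step(cells, age_targets, 0)
    rake_step(cells, gender_targets, 1)

n = sum(cells.values())
for key, v in sorted(cells.items()):
    print(key, round(v / n, 3))  # margins match targets; joint cells are implied
```

Two companies could weight to identical age and gender targets and still end up with different samples, depending on whether those targets were interlocked or applied separately like this.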
Given there isn't presently a huge impact from different approaches to turnout or don't knows, the difference between polling companies is likely to be down to some of these factors, which are - fairly evidently - extremely difficult to quantify. All you can really conclude is that the difference is probably down to the different sampling and weighting of the different companies, and that, short of a general election, there is no easy way for observers (or pollsters themselves!) to be sure what the right answer is. All I would advise is to avoid the temptation of (a) assuming that the polls you want to be true are correct - that's just wishful thinking - or (b) assuming that the majority are right. There are plenty of instances (ICM in 1997, or Survation and the YouGov MRP model in 2017) when the odd one out turned out to be the one that was right.