The results of Tuesday’s presidential election came as a surprise to nearly everyone who had been following the national and state election polling, which consistently projected Hillary Clinton as defeating Donald Trump. Relying largely on opinion polls, election forecasters put Clinton’s chance of winning at anywhere from 70% to as high as 99%, and pegged her as the heavy favorite to win a number of states such as Pennsylvania and Wisconsin that in the end were taken by Trump.
How could the polls have been so wrong about the state of the election?
There is a great deal of speculation, but as yet no clear answer, as to the cause of the disconnect. There is, however, one point of agreement: Across the board, polls underestimated Trump’s level of support. With few exceptions, the final round of public polling showed Clinton with a lead of 1 to 7 percentage points in the national popular vote. State-level polling was more variable, but there were few instances where polls overstated Trump’s support.
The fact that so many forecasts were off-target was particularly notable given the wide variety of methodologies being tested and reported via the mainstream media and other channels. The traditional telephone polls of recent decades have now been joined by a growing number of high-profile online surveys, using both probability and nonprobability samples, as well as prediction markets, all of which showed similar errors.
Pollsters don’t have a clear diagnosis yet for the misfires, and it will likely be some time before we know for sure what happened. There are, however, several possible explanations for the misstep that many in the polling community will be talking about in upcoming weeks.
One likely culprit is what pollsters refer to as nonresponse bias. This occurs when certain kinds of people systematically do not respond to surveys despite equal-opportunity outreach to all parts of the electorate. We know that some groups – including the less educated voters who were a key demographic for Trump on Election Day – are consistently hard for pollsters to reach. It is possible that the frustration and anti-institutional feelings that drove the Trump campaign also made his supporters less willing to respond to polls. The result would be a strongly pro-Trump segment of the population that simply did not show up in the polls in proportion to its actual share of the electorate.
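To make the mechanism concrete, here is a minimal simulation sketch in Python. The 50/50 electorate and the response rates are assumed purely for illustration (they are not estimates of any actual survey); the point is only that a modest gap in willingness to respond translates directly into an understated vote share.

```python
import random

random.seed(0)

# Hypothetical electorate: 50% Trump supporters, 50% Clinton supporters.
# Assume (for illustration only) that Trump supporters respond to the
# survey at a somewhat lower rate than Clinton supporters.
N = 100_000
TRUE_TRUMP_SHARE = 0.50
RESPONSE_RATE = {"Trump": 0.08, "Clinton": 0.10}  # assumed, not measured

sample = []
for _ in range(N):
    candidate = "Trump" if random.random() < TRUE_TRUMP_SHARE else "Clinton"
    if random.random() < RESPONSE_RATE[candidate]:
        sample.append(candidate)

observed_trump_share = sample.count("Trump") / len(sample)
print(f"True Trump share:   {TRUE_TRUMP_SHARE:.1%}")
print(f"Polled Trump share: {observed_trump_share:.1%}")
# With these assumed response rates the poll reads roughly 44-45% Trump,
# understating his true support by several points, even though every
# voter had an equal chance of being contacted.
```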
Some have also suggested that many of those who were polled simply were not honest about whom they intended to vote for. The idea of so-called “shy Trumpers” suggests that support for Trump was socially undesirable, and that his supporters were unwilling to admit their support to pollsters. This hypothesis is reminiscent of the supposed “Bradley effect,” when Democrat Tom Bradley, the black mayor of Los Angeles, lost the 1982 California gubernatorial election to Republican George Deukmejian despite having been ahead in the polls, supposedly because voters were reluctant to tell interviewers that they were not going to vote for a black candidate.
The “shy Trumper” hypothesis has received a fair amount of attention this year. If this were the case, we would expect to see Trump perform systematically better in online surveys, as research has found that people are less likely to report socially undesirable behavior when they are talking to a live interviewer. Politico and Morning Consult conducted an experiment to see if this was the case, and found little indication of such an effect overall, though they did find some suggestion that college-educated and higher-income voters might have been more likely to express support for Trump online than to a live interviewer.
A third possibility involves the way pollsters identify likely voters. Because we can’t know in advance who is actually going to vote, pollsters develop models predicting who is going to vote and what the electorate will look like on Election Day. This is a notoriously difficult task, and small differences in assumptions can produce sizable differences in election predictions. We may find that the voters pollsters were expecting, particularly in the Midwestern and Rust Belt states that so defied expectations, were not the ones who showed up. Because many traditional likely-voter models incorporate measures of enthusiasm into their calculus, 2016’s distinctly unenthused electorate – at least on the Democratic side – may also have wreaked havoc with this aspect of measurement.
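To illustrate how sensitive these models can be, the sketch below applies a single-question “enthusiasm” screen to a small invented sample. Every respondent, cutoff and number here is hypothetical, and real likely-voter models combine many more indicators (past vote history, stated intention, interest in the campaign), but it shows how simply tightening the screen can flip the projected margin.

```python
# Illustrative sketch of how a likely-voter screen can swing a poll's margin.
# The respondent data and enthusiasm cutoffs are invented for this example.

respondents = [
    # (candidate preference, enthusiasm on a 0-10 scale), hypothetical raw sample
    ("Clinton", 6), ("Clinton", 5), ("Clinton", 4), ("Clinton", 7),
    ("Clinton", 3), ("Clinton", 5), ("Clinton", 8),
    ("Trump", 9), ("Trump", 8), ("Trump", 7), ("Trump", 6), ("Trump", 8),
]

def margin(cutoff: int) -> float:
    """Clinton share minus Trump share among respondents who clear the
    enthusiasm cutoff (i.e., are classified as 'likely voters')."""
    likely = [cand for cand, enthusiasm in respondents if enthusiasm >= cutoff]
    clinton_share = likely.count("Clinton") / len(likely)
    trump_share = likely.count("Trump") / len(likely)
    return clinton_share - trump_share

for cutoff in (4, 6, 8):
    print(f"enthusiasm >= {cutoff}: Clinton margin = {margin(cutoff):+.0%}")
# A stricter screen drops the less enthusiastic respondents (here mostly
# Clinton supporters) and flips the projected result from a Clinton lead
# to a Trump lead, showing how much the estimate depends on the
# likely-voter assumptions.
```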
When the polls failed to accurately predict the British general election in May 2015, it took a blue ribbon panel and more than six months of work before the public had the results of a data-driven, independent inquiry in hand. It may take a similar amount of time to get to the bottom of this election as well. The survey industry’s leading standards association, the American Association for Public Opinion Research, already has an ad hoc committee in place to study the election and report back in May (Pew Research Center’s Director of Survey Research Courtney Kennedy is chairing the committee).
Pollsters are well aware that the profession faces serious challenges that this election has only served to highlight. But this is also a time of extensive experimentation and innovation in the field. The role of polling in a democracy goes far beyond simply predicting the horse race. At its best, polling provides an equal voice to everyone and helps to give expression to the public’s needs and wants in ways that elections may be too blunt to do. That is why restoring polling’s credibility is so important, and why we are committed to helping in the effort to do so.