Political journalism has become infatuated with opinion polls, and yet news organizations remain ill-equipped to make sense of the flood of data.
05 October 2017 | BENJAMIN TOFF | Politico
The near-year since Donald Trump’s surprise electoral victory has been filled with soul-searching and recriminations among those who research public opinion and those who write about it. A conversation around whether the polls failed has hardened into two main camps: one blaming the data, the other blaming the media.
But this version of the debate misses the point. The problem isn’t simply flawed data or the media’s misuse of it; these problems cannot be separated. Political journalism has become infatuated with opinion polls—what some have called a “Nate Silver Effect”—and yet news organizations remain ill-equipped to make sense of the flood of data.
Aggregators and forecasting websites such as RealClearPolitics and FiveThirtyEight, which attracted a combined 200 million visits in October 2016, have altered the way political reporters cover American politics, but among journalists and survey researchers, considerable ambivalence remains over whether these changes have, on balance, been for the better.
Several years before the 2016 election, I set out to better understand the changes in how news organizations were using opinion data by interviewing journalists, polling analysts and research practitioners across a range of institutions, including major national news outlets and private industry. Some of my findings were recently published online in the journal Journalism, and have considerable bearing on debates over what went wrong last year.
The shock results of 2016 were not an aberration. In talking with people who study and report on public opinion, it was apparent that there have been major shifts in how data are evaluated for quality and disseminated publicly. Even if the 2016 polls were not nearly as far off as their detractors sometimes assume (and they weren’t, at least if you compare national polling averages in the closing days of the 2016 race to Clinton’s margin of victory in the popular vote), methodologies are changing rapidly, newsroom resources are shrinking, and it has become easier than ever for anyone to sponsor their own junk survey, pass it off as social science and disseminate results to sympathetic audiences. (Much of the public already believes that this is what pollsters do: A Marist poll from March found that six in 10 registered voters do not trust opinion surveys.)
And this is where the Nate Silver Effect gets complicated. Polls are popular fodder, but discretion in handling them is all too rare. The quality of the data underlying aggregators’ models is, in fact, more questionable than in the past. Was a new survey conducted with acceptable methodological rigor?
Are assumptions about non-response and voting likelihoods defensible? Did question-wording or question-order put a thumb on the scale? Weighing these matters—let alone accounting for the vagaries of ordinary sampling error—requires a level of institutional knowledge and resources that most news organizations simply cannot afford.
Sites like FiveThirtyEight have drilled into readers the importance of averaging across polls as a corrective to people’s tendencies to “pick the poll numbers they like and disregard the rest,” as one reporter I interviewed put it.
This very concern over outlier polls led this same reporter’s news organization to avoid citing individual poll results (“I always tell everyone I want to see three polls before I’ll quote them”). But in doing so, there’s a risk of learning the wrong lessons. Averaging across polls helps guard against some sources of error, such as those due to ordinary sampling error, but it does little to address the underlying problem of poor-quality data.
It’s a garbage-in, garbage-out scenario: Averaging polls can be useful, but if the data being input are bad, then the averages will be tainted, too. In fact, in its postmortem on the performance of the 2016 polls, the American Association for Public Opinion Research found that the “large, problematic errors” observed in “key battleground states”—which fueled many forecasters’ overconfident models—were due in large part to a lack of high-quality state-level surveys in the final weeks of the race. Averaging using outdated or flawed data might have contributed to perceptions that Clinton’s lead was insurmountable.
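The garbage-in, garbage-out point is easy to see with a little arithmetic. The sketch below uses entirely hypothetical poll margins (not data from the article) to show how one junk survey with a skewed sample can drag an otherwise stable average:

```python
# Hypothetical illustration: how a single low-quality survey can
# distort a simple polling average. All numbers are invented.

def poll_average(polls):
    """Unweighted mean of candidate margins, in percentage points."""
    return sum(polls) / len(polls)

# Margins (candidate A minus candidate B) from five hypothetical quality polls:
quality_polls = [2.0, 3.0, 1.5, 2.5, 3.0]

# The same field plus one hypothetical junk survey with a badly skewed sample:
with_junk = quality_polls + [12.0]

print(poll_average(quality_polls))  # 2.4
print(poll_average(with_junk))      # 4.0
```

One outlier moves the average by 1.6 points, more than the entire spread among the quality polls; real aggregators weight and adjust polls rather than averaging naively, but no weighting scheme fully rescues an average built on bad inputs.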
Many reporters and editors have taken to heart the importance of treating polling data with appropriate care when handicapping races or describing candidates’ chances, but parsing and dissecting data are rarely as straightforward as plugging numbers into an algorithm.
It requires making judgment calls about a range of factors that are difficult to quantify. To their credit, Silver and his colleagues have tried to guide journalists with easily digestible tips for reading polls “like a pro” in an attempt to guard against the trap of false confidence, but the numbers themselves are often more compelling than the caveats.
Even in 2014 and 2015, I heard repeated concerns about whether the Nate Silver Effect on newsrooms might be causing some to embrace polling averages and forecasts as gospel, with election outcomes presumed to be preordained by the data weeks or months before votes are cast.
One editor I spoke with blamed the “incessant desire of social scientists to pretend they’re physicists” when human behavior is “never going to be that precise.” But the expectation of “pinpoint” precision also comes with the territory; as one survey researcher pointed out: “If it’s a number, it’s precise—it’s $1.39; it’s 34 percent.” Ultimately, elections themselves are precise counts, creating a demand for decimal-point accuracy that no amount of aggregated survey data can responsibly offer.
This Nate Silver Effect is not merely a failure of interpretation, innumeracy or a misreading of probability, as Silver himself emphasized in an 11-part post-election series. Newsrooms do struggle with all of these things, but journalists are in the business of communicating, and as it turns out, it’s hard to characterize degrees of uncertainty without confusing an average reader.
For example, one polling analyst I spoke with described having “fights with editors” over whether a “2-point lead” for one candidate constituted an actual lead, or a virtual dead heat due to normal polling error. Survey data are not newsworthy if all they ever suggest is that either candidate has a decent chance of winning.
To many of those I interviewed, a still more troubling development tied to the advent of the aggregators has been the media’s diminishing role as gatekeepers of opinion data. In an earlier era, leading media organizations established editorial standards intended to weed out shoddy polls from their coverage.
Critics charge that these policies contributed to myopic coverage that focused only on polls sponsored by news organizations themselves, but standards differentiating between firms of ill-repute and those using sound and transparent methods were meant to guard against the reporting of dubious data.
Now, forecasting and aggregator sites, with the aid of social media, have provided survey firms a powerful platform for reaching readers hungry for their results—often regardless of the firms’ rigor or reputation.
In effect, gatekeeping around opinion polls has quietly shifted away from legacy media newsrooms altogether and into the hands of the aggregators and forecasters. Even media organizations that continue to employ strict polling standards cited numerous examples in recent elections in which polls otherwise deemed unfit for coverage could not be ignored because they drove larger campaign news cycles.
The tendency of aggregator sites to “throw everything in” without distinguishing among firms, as Iowa pollster Ann Selzer pointed out in a 2015 interview with the Columbia Journalism Review, has contributed to a culture in which fewer people are passing judgment on data quality and saying, “This is a bad poll; we’re not going to mention it.”
My own interviews echoed Selzer’s lament that few reporters are “doing the work of looking at the methodology.” Many instead admitted to relying on “brand names” and personal relationships with pollsters as a proxy for data quality. One reporter at a leading national newspaper admitted, “If you wanted to hoodwink me and you had an institution and a trusted name behind it, you probably could.”