Polling is getting harder every cycle. For every hundred voters you ask to take a survey, only one or two actually do – which drives up costs and forces campaigns to keep surveys in the field longer. So the pitch behind AI "digital twins" is seductive: what if a model could answer questions like a real voter, at a fraction of the price? In this episode of the Campaign Trend Podcast, I sit down with Ben Leff, co-founder and CEO of Verasight, to test whether that promise holds up.
Ben and his colleagues G. Elliott Morris and Peter Enns ran a series of experiments building synthetic samples – drawing demographic profiles from the census, then asking a large language model to predict how each "twin" would answer. They pushed the idea hard, even preloading the model with a year of real survey attitudes from roughly 15,000 respondents on the theory that more data would mean better answers. The discouraging finding: more information didn't reliably help, and newer models sometimes performed worse. There's no clear trend showing synthetic data getting more accurate over time.
The headline conclusion is that these synthetic samples are predictions, not measurements. Across about 15 questions, the average gap between the AI's answers and a real survey was a little over six points – but on some questions it was under two, and on others above twelve. Without a real poll to check against, you can't tell which kind of question you're looking at. And where the model looked accurate, it was mostly restating what partisanship already tells you: a Kamala Harris voter disapproves of Trump. As Ben put it, you don't need an LLM for that.
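To make the "average gap" idea concrete, here is a minimal sketch of how that comparison could be computed. It assumes the gap is the absolute difference, in percentage points, between the real and synthetic topline on each question. The function name `gap_summary` and every number below are invented for illustration and are not taken from the Verasight experiments.

```python
from statistics import mean


def gap_summary(real, synthetic):
    """Return (average, smallest, largest) absolute gap in percentage points."""
    if real.keys() != synthetic.keys():
        raise ValueError("both sources must cover the same questions")
    gaps = [abs(real[q] - synthetic[q]) for q in real]
    return mean(gaps), min(gaps), max(gaps)


# Hypothetical toplines (percent choosing an answer), made up for this example.
real_poll = {"q1": 44.0, "q2": 61.0, "q3": 30.0}
synthetic_poll = {"q1": 45.5, "q2": 48.0, "q3": 34.0}

avg_gap, smallest, largest = gap_summary(real_poll, synthetic_poll)
print(f"average gap {avg_gap:.1f}, smallest {smallest:.1f}, largest {largest:.1f}")
```

Note that the calculation needs the real poll as an input. The average can look tolerable while a single question is off by double digits, and there is no way to spot that question from the synthetic numbers alone.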
My deeper worry: the human benchmark itself is getting shakier as media diets fragment and the "representative sample" strains. Ben agrees the whole field is harder than ever, but doesn't think synthetic data is the answer. The takeaway for practitioners is measured – there may be a narrow role for quick, directional estimates, but for the close races where precision actually matters, asking real humans is still the best tool we have.