An update on the status of superforecasting: What about artificial intelligence?
As we all know, superforecasters are people with an innate, extraordinary talent for predicting future events. Science shows that predictions by experts, politicians, pundits, propagandists, demagogues, blowhards and idiots are about the same as random guessing: on binary (right or wrong) predictions, they are wrong about 47% of the time. That includes experts with access to non-public information. Superforecasters just know how to predict, without any special training or knowledge. They do tend to share certain traits: 1) comfort with numbers and probabilities, 2) open-mindedness, 3) Bayesian thinking (a willingness to change opinions and predictions based on new information), and 4) being "dragonfly eyed", i.e., actively seeking out ways to see things from different points of view.
Those folks are the consummate pragmatic rationalists.
The data show that superforecasters who are not subject-matter experts, and who lack access to sources such as secret intelligence databases, outperform intelligence experts with decades of experience and access to those very databases.
But since AI (artificial intelligence) is new on the scene, what about the combination of superforecasters working with AI? As one might expect, the combination improves accuracy. I asked Pxy to comment:
Q: Is the combination of human superforecasters working in tandem with artificial intelligence better at prediction than superforecasters working alone?
A: The research shows that combining human superforecasters with artificial intelligence does improve prediction accuracy compared to superforecasters working alone, though the specific gains depend on the type of AI assistance and context.
A pivotal study by Schoenegger et al. (2024) found that human forecasters using LLM assistants improved their prediction accuracy by 24-28% compared to control groups. When accounting for outliers, the superforecasting-optimized AI assistant increased accuracy by 41% compared to 29% for a "noisy" assistant. Importantly, this improvement occurred across all skill levels: the LLM assistance didn't disproportionately benefit less skilled forecasters, suggesting genuine synergy between human judgment and AI capabilities.
Research by Hong, Lamberson, and Page provides the analytical foundation for why human-AI hybrids can outperform either component alone. They identify that humans excel with "thick data" (contextual, narrative information) while AI processes "big data" effectively. The key insight is that humans add the most value in atypical cases where historical data may not be relevant, while AI performs better on predictable patterns.
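To make the hybrid idea concrete, here is a minimal sketch in Python of one simple way a human forecast and an AI forecast could be blended: a weighted average that leans on the AI for typical, data-rich questions and on the human for atypical ones. The weighting rule and the numbers are my own illustrative assumptions, not the actual method used in the research cited above.

```python
def blend_forecasts(human_prob, ai_prob, atypical):
    """Blend a human and an AI probability forecast for a binary question.

    Assumption (illustrative only): weight the human more heavily on
    atypical questions, where historical data is less relevant, and the
    AI more heavily on typical, pattern-rich questions.
    """
    human_weight = 0.7 if atypical else 0.3
    return human_weight * human_prob + (1 - human_weight) * ai_prob

# A typical, data-rich question: lean on the AI's estimate.
print(blend_forecasts(human_prob=0.6, ai_prob=0.8, atypical=False))  # 0.74

# An unprecedented event: lean on human contextual judgment.
print(blend_forecasts(human_prob=0.3, ai_prob=0.8, atypical=True))   # 0.45
```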
For the wonks, Brier scores quantify forecasting accuracy another way. A perfect forecaster gets a Brier score of 0.0. No one has been perfect so far, and probably no one ever will be. Random chance (always guessing 50/50) produces a Brier score of 0.25. Random chance is the realm of experts, politicians, columnists, Fox News, blowhards, etc. The worst possible forecaster (always confidently wrong) scores 1.0. Most ordinary forecasters in Philip Tetlock's studies achieved Brier scores between 0.20 and 0.30, performance hovering around chance level. For comparison, superforecasters typically achieve scores of about 0.15-0.18.
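For readers who want to see the arithmetic, here is a minimal sketch in Python of how a Brier score is computed for binary questions. The forecasts and outcomes below are made-up illustrative numbers, not data from any study.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and binary outcomes.

    forecasts: probabilities assigned to the event occurring (0.0 to 1.0)
    outcomes:  what actually happened (1 = occurred, 0 = did not)
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Always guessing 50/50 yields the chance-level score of 0.25.
print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1]))   # 0.25

# A hypothetical confident, well-calibrated forecaster does much better.
print(brier_score([0.9, 0.2, 0.8, 0.7], [1, 0, 1, 1]))   # 0.045

# Always confidently wrong gives the worst possible score of 1.0.
print(brier_score([0.0, 1.0], [1, 0]))                   # 1.0
```

Lower is better: the score punishes confident misses far more than cautious ones, which is why superforecasters' 0.15-0.18 range meaningfully beats the 0.25 of pure guessing.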
By Germaine: A person very interested in getting a feel for the outer contours of human intelligence or cognition in view of the complexity and noise of the real world