How AI Validated 118 Exoplanets: RAVEN Pipeline and the Exoplanet Revolution

Somewhere in the data from 2.2 million stars, 31 worlds had gone unconfirmed. They were not hidden by distance or darkness. They were buried in noise: flagged as candidates but never validated, cycling through automated systems that could not separate signal from interference. Then a team at the University of Warwick pointed a machine learning pipeline called RAVEN at the problem and pulled those planets into the light [1][4].

The Transiting Exoplanet Survey Satellite (TESS) has transformed how astronomers hunt for worlds beyond our solar system. As of September 2025, TESS has observed nearly 6,000 candidate and confirmed exoplanets [4]. The satellite watches each star for the slight dimming that occurs when a planet crosses its face. That dimming can be subtle, and the volume of data is vast. Many signals that look promising turn out to be something else entirely: eclipsing binary stars, instrumental artifacts, or other false positives that fool automated screening tools.
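To give a sense of how subtle that dimming is, here is a minimal sketch of the standard geometric approximation, in which the fractional drop in brightness is roughly the square of the planet-to-star radius ratio. It assumes a central transit and ignores limb darkening. The function name and the example planet sizes are illustrative and are not taken from the RAVEN papers.

```python
# Nominal radii in kilometres (rounded reference values).
R_SUN_KM = 695_700.0
R_EARTH_KM = 6_378.1


def transit_depth(planet_radius_earth: float, star_radius_sun: float) -> float:
    """Fractional dimming for a central transit, ignoring limb darkening.

    Uses the geometric approximation depth ~ (R_planet / R_star) ** 2.
    """
    ratio = (planet_radius_earth * R_EARTH_KM) / (star_radius_sun * R_SUN_KM)
    return ratio ** 2


if __name__ == "__main__":
    # Illustrative planet sizes around a Sun-like star (assumed values).
    for name, radius in [("Earth-size", 1.0), ("Neptune-size", 3.9), ("Jupiter-size", 11.2)]:
        depth = transit_depth(radius, 1.0)
        print(f"{name}: {depth * 1e6:.0f} parts per million")
```

Even a Jupiter-size planet blocks only about one percent of a Sun-like star's light, and an Earth-size planet blocks far less. That is why noise and look-alike signals are such a problem.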

This is the bottleneck RAVEN was designed to break. Developed by researchers at the University of Warwick, the pipeline applies a Bayesian framework to determine the posterior probability that a candidate is actually a planet [1]. Rather than relying on a single diagnostic, it integrates multiple lines of evidence and produces a confidence score that can be directly compared across candidates.
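The idea of a posterior planet probability can be illustrated with a short sketch of Bayes' theorem applied to a set of competing scenarios. This is not RAVEN's implementation. The scenario names, the prior and likelihood values, and the 0.99 validation threshold are all assumptions chosen for illustration.

```python
def planet_posterior(priors: dict[str, float], likelihoods: dict[str, float]) -> float:
    """Posterior probability of the 'planet' scenario via Bayes' theorem.

    priors[s]      : assumed probability of scenario s before looking at the data
    likelihoods[s] : assumed probability of the observed signal under scenario s
    """
    evidence = sum(priors[s] * likelihoods[s] for s in priors)
    if evidence == 0:
        raise ValueError("no scenario explains the data")
    return priors["planet"] * likelihoods["planet"] / evidence


# Hypothetical numbers for one candidate (not taken from the papers).
priors = {
    "planet": 0.30,
    "eclipsing_binary": 0.40,
    "background_eclipsing_binary": 0.20,
    "instrumental": 0.10,
}
likelihoods = {
    "planet": 0.90,
    "eclipsing_binary": 0.02,
    "background_eclipsing_binary": 0.01,
    "instrumental": 0.005,
}

VALIDATION_THRESHOLD = 0.99  # assumed cutoff, for illustration only

if __name__ == "__main__":
    p = planet_posterior(priors, likelihoods)
    status = "validated" if p >= VALIDATION_THRESHOLD else "still a candidate"
    print(f"posterior planet probability: {p:.4f} ({status})")
```

Because every candidate ends up with a number on the same zero-to-one scale, candidates can be ranked against each other, which is the property the paragraph above describes.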

The system combines two classifiers: a Gradient Boosted Decision Tree and a Gaussian Process [1]. Both were trained on comprehensive synthetic training sets, simulations built to represent the full range of true planet signals alongside the various false positive scenarios that plague transit surveys. This training data is what allows RAVEN to distinguish between the real thing and imposters with high reliability.
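A synthetic training set of this kind can be pictured with a toy generator that produces labelled, phase-folded light curves. The sketch below is a deliberate simplification and not the simulation used by the Warwick team. The box-shaped dip, the parameter ranges, and the function names are all assumptions. Simulated planets get a shallow dip with no secondary eclipse, while simulated eclipsing binaries get a deeper dip plus a secondary eclipse half an orbit later.

```python
import random


def folded_light_curve(depth, duration_frac, secondary_depth=0.0,
                       noise=0.0, n_points=200, rng=None):
    """Return flux values over one orbit for a box-shaped dip.

    depth           : fractional depth of the primary dip (centred on phase 0)
    duration_frac   : dip duration as a fraction of the orbital period
    secondary_depth : depth of a second dip at phase 0.5 (0 for a planet here)
    noise           : standard deviation of Gaussian noise added to each point
    """
    if rng is None:
        rng = random.Random()
    half = duration_frac / 2.0
    flux = []
    for i in range(n_points):
        phase = i / n_points  # orbital phase in [0, 1)
        value = 1.0
        if phase < half or phase >= 1.0 - half:
            value -= depth
        elif abs(phase - 0.5) < half:
            value -= secondary_depth
        flux.append(value + rng.gauss(0.0, noise))
    return flux


def make_training_set(n_samples, seed=0):
    """Build (flux, label) pairs: label 1 = planet, label 0 = eclipsing binary."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        duration_frac = rng.uniform(0.01, 0.05)  # assumed range
        noise = rng.uniform(1e-4, 1e-3)          # assumed range
        if rng.random() < 0.5:
            depth = rng.uniform(5e-4, 2e-2)      # shallow, planet-like dip
            flux = folded_light_curve(depth, duration_frac, noise=noise, rng=rng)
            label = 1
        else:
            depth = rng.uniform(2e-2, 3e-1)      # deep, binary-like dip
            secondary = depth * rng.uniform(0.1, 0.9)
            flux = folded_light_curve(depth, duration_frac, secondary_depth=secondary,
                                      noise=noise, rng=rng)
            label = 0
        samples.append((flux, label))
    return samples


if __name__ == "__main__":
    data = make_training_set(1000, seed=42)
    print(len(data), "labelled light curves;", sum(label for _, label in data), "planets")
```

A real training set would cover many more false positive scenarios and use physically realistic transit shapes. The point here is only that simulation gives labels that are known with certainty.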

The results are striking. When the team applied RAVEN to light curves from the TESS Science Processing Operations Center full-frame images, they validated 118 exoplanets from the dataset [2]. Of those, 31 were entirely new discoveries, worlds that had appeared in the data but never passed the threshold for confirmation until RAVEN applied its analysis [1]. The pipeline also identified over 2,000 vetted candidates that warrant further investigation [2].

David Armstrong and colleagues at Warwick published their findings in 2025, with a follow-up study in April 2026 that expanded the validated sample [1][2]. NASA confirmed the results, highlighting the partnership between the AI system and the Warwick team as a landmark in the emerging era of machine-learning-assisted astronomy [4].

What makes RAVEN different from simpler screening tools is its interpretability. The Bayesian approach does not simply output a yes or no. It delivers a probability estimate and can identify which features in the light curve most strongly influenced that verdict. Astronomers can therefore review the assessment and understand why the pipeline rated a candidate as high confidence or as a likely false positive. This is not autonomous discovery. It is a collaboration in which the machine handles the overwhelming scale of candidates while human scientists retain oversight and can apply their judgment when the data warrants it.

TESS is identifying a wealth of short-period transiting planets, with orbital periods of approximately 30 days or less [2]. These are the systems most amenable to follow-up characterization with ground-based telescopes and future space observatories. Understanding their frequency, radii, and orbital properties is central to building a statistical picture of planetary systems across the galaxy. The demographic study published by the Warwick team in January 2026 examined occurrence rates for planets with radii between 2 and 20 Earth radii around FGK main-sequence stars, drawing on four years of TESS observations [3]. That kind of population-level analysis depends on having a reliably vetted sample of individual candidates. RAVEN delivers that sample at a scale that would be impossible through manual review alone.

The 31 new worlds in the validated set represent the most exciting finds. These are planets that appeared in the TESS data, passed initial candidate screening, and then stalled in the confirmation pipeline for months or years. Some had been flagged by multiple automated systems. None had enough evidence to cross the confirmation threshold on its own. RAVEN changed that calculus by looking at the full problem holistically rather than treating each candidate in isolation.

For the exoplanet field, the RAVEN results confirm something astronomers have suspected for years. The pipeline bottleneck was not a lack of real signals in the data. It was a lack of computational and analytical frameworks sophisticated enough to separate those signals from the noise. Machine learning, when properly trained and combined with human oversight, can break through that bottleneck. The partnership between AI systems and human researchers is becoming the standard model for large-scale astronomical surveys, and the Warwick team just demonstrated what that model looks like in practice.


What This Means for the Future of Exoplanet Science

References