AI-generated faces are getting hyperreal — you can train to spot the fakes

Lead

Artificial intelligence can now produce hyperrealistic human faces that routinely fool people, including so-called “super recognizers,” according to a 2025 study by Katie Gray and colleagues. In online experiments, participants had 10 seconds to judge whether a face was real or AI-made; untrained super recognizers detected only 41% of synthetic faces and typical recognizers only about 30%. A brief five-minute training that highlighted common rendering errors raised detection to 64% for super recognizers and 51% for typical recognizers. The authors suggest combining human expertise with automated detectors for stronger defenses against fake faces.

Key takeaways

  • Study source: Gray et al., Royal Society Open Science (2025); experiments were run online, with performance measured immediately after training.
  • Baseline detection: super recognizers identified 41% of AI faces; typical recognizers identified ~30%.
  • False alarms at baseline: super recognizers labeled real faces as fake in 39% of trials; typical recognizers did so in ~46%.
  • After a five-minute training session, detection rose to 64% for super recognizers and 51% for typical recognizers.
  • Post-training false-alarm rates were 37% for super recognizers and 49% for typical recognizers, broadly in line with baseline levels.
  • Decision window and behavior: participants had 10 seconds per image; trained participants took longer per decision (typical recognizers by ~1.9s, super recognizers by ~1.2s).
  • Technical context: many fake faces are produced by generative adversarial networks (GANs) and can reach “hyperrealism,” making them appear more lifelike than some real photos.

Background

AI-generated faces have advanced rapidly with improvements in generative models such as generative adversarial networks (GANs). These systems pit a generator that creates images against a discriminator that tries to tell fakes from real photos; over repeated training cycles, the generator learns to produce ever more convincing images. The result is a proliferation of deepfake-style faces across social media, advertising, and other online spaces, raising concerns about misinformation, identity misuse, and trust in visual evidence.
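
To make the adversarial cycle concrete, here is a minimal GAN training loop in PyTorch. It runs on toy random vectors rather than face images, and all sizes and hyperparameters are illustrative; it sketches the general generator-versus-discriminator technique, not the specific models behind the study's stimuli.

```python
# Minimal GAN training loop on toy data, illustrating the
# generator/discriminator cycle described above. A sketch only:
# not the face-generation models used in the study.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64  # toy sizes, not real image dimensions

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.randn(32, data_dim)   # stand-in for real photos
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator step: learn to score real images 1 and fakes 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: learn to make fakes the discriminator scores as real.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

Each cycle sharpens both networks: the discriminator gets better at catching fakes, which forces the generator to produce images that are harder to catch.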

Human ability to spot synthetic faces varies widely. “Super recognizers” are a small subset of people (often the top ~2% on standardized face memory and matching tests) who excel at remembering and matching unfamiliar faces. Researchers have proposed using their heightened perceptual skills in security and forensic settings, but few studies had tested whether that advantage extends to spotting AI-generated images. The new study recruited super recognizers from the Greenwich Face and Voice Recognition Laboratory volunteer pool to fill this gap.

Main event

Gray and colleagues ran a series of online experiments in which participants viewed images that were either real photographs or AI-generated faces from recent models. Each image appeared for up to 10 seconds while participants judged whether the face was real or synthetic. In the first experiment, without any training, super recognizers detected 41% of AI faces, a level the authors note is close to chance performance on this task; typical recognizers detected about 30%.

Participants also differed in how often they called real faces fake: super recognizers did so in 39% of trials, typical recognizers in about 46%. These high false-alarm rates suggest that many observers were simply uncertain, frequently mislabeling genuine photographs as synthetic even as real fakes slipped past them; the signal detection sketch below shows why this pattern amounts to near-chance discrimination.
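
Reading hits and false alarms together is what signal detection theory is for. The study's exact analysis is not reproduced here; the sketch below simply plugs the reported percentages into the standard sensitivity measure d′ to show why 41% hits against 39% false alarms is near-chance discrimination.

```python
# Sensitivity (d') from hit and false-alarm rates, per standard signal
# detection theory. Illustrative only: the paper's own statistical
# analysis may differ. d' near 0 means no real/fake discrimination.
from scipy.stats import norm

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """d' = z(hit rate) - z(false-alarm rate)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(d_prime(0.41, 0.39))  # super recognizers, baseline:        ~0.05 (chance-like)
print(d_prime(0.64, 0.37))  # super recognizers, post-training:   ~0.69
print(d_prime(0.30, 0.46))  # typical recognizers, baseline:      ~-0.42
print(d_prime(0.51, 0.49))  # typical recognizers, post-training: ~0.05
```

By this reading, training moved both groups in the right direction, but typical recognizers only climbed to roughly the chance-level baseline of the untrained super recognizers.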

In a parallel experiment, a separate set of participants completed a targeted five-minute training session before performing the same detection task. The training showed examples of recurring rendering errors in synthetic faces (such as inconsistent teeth, odd hairlines, and unnatural skin texture), then gave real-time feedback while participants judged 10 test images, and ended with a final recap. After training, super recognizers’ hit rate rose to 64% and typical recognizers’ to 51%.

The trained groups also took longer per decision: typical recognizers increased their inspection time by roughly 1.9 seconds, and super recognizers by about 1.2 seconds. The authors emphasize that slowing down and looking for specific clues improved performance in the short term.

Analysis & implications

The experiments indicate two key points: first, current state-of-the-art generated faces can routinely deceive even high-performing human observers; second, short, focused training materially improves detection accuracy. The boost—about a 23 percentage-point gain for super recognizers and a 21-point gain for typical recognizers—shows that targeted instruction on common artefacts can sharpen human scrutiny quickly.

However, gains came with trade-offs. False-alarm rates (calling real faces fake) stayed roughly where they were after training, so the improved sensitivity to fakes was not matched by fewer misclassifications of genuine images. In applied settings such as law enforcement, content moderation, or verification workflows, raising true-positive rates without inflating false alarms will be crucial to avoid unnecessary investigations or content takedowns.

Operationalizing these findings could mean a hybrid approach: automated detectors flag likely synthetic images, and trained human reviewers—potentially including super recognizers—perform the final judgment. The authors explicitly propose a human-in-the-loop model where trained experts complement algorithmic screening to catch subtleties machines miss or to audit false positives.
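
As a rough sketch of what that hybrid workflow could look like, the code below triages images by an automated detector's confidence score and routes the ambiguous middle band to trained human reviewers. The `automated_detector` callable and the thresholds are hypothetical; the paper proposes the human-in-the-loop idea without prescribing an implementation.

```python
# Hypothetical triage sketch for a human-in-the-loop pipeline.
# The detector and thresholds are illustrative assumptions, not
# anything specified by the study.
from dataclasses import dataclass

@dataclass
class Verdict:
    label: str    # "real", "synthetic", or "needs_human_review"
    score: float  # detector's estimated probability the image is synthetic

def triage(image, automated_detector, low=0.2, high=0.8):
    """Auto-clear confident cases; route ambiguous ones to reviewers."""
    score = automated_detector(image)
    if score >= high:
        return Verdict("synthetic", score)
    if score <= low:
        return Verdict("real", score)
    # Ambiguous band: hand off to a trained human reviewer,
    # potentially a super recognizer, for the final call.
    return Verdict("needs_human_review", score)

# Toy usage with a stub detector standing in for a real model:
print(triage("photo.jpg", lambda img: 0.55))
# -> Verdict(label='needs_human_review', score=0.55)
```

The thresholds control the trade-off the study highlights: widening the human-review band catches more subtle fakes at the cost of more reviewer workload and potential false alarms.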

Comparison & data

Group                  Baseline hits (AI faces)  Post-training hits  Baseline false alarms (real→fake)  Post-training false alarms
Super recognizers      41%                       64%                 39%                                 37%
Typical recognizers    ~30%                      51%                 ~46%                                49%

These figures come from the Gray et al. online experiments (2025). The immediate post-training gains show how quickly perceptual strategies can be taught, but the study did not measure how long the benefit persists. Comparison with previous work is limited because few prior studies have directly tested super recognizers on synthetic-image detection.

Reactions & quotes

“I think it was encouraging that our kind of quite short training procedure increased performance in both groups quite a lot,”

Katie Gray, University of Reading (study lead)

Gray framed the results as proof of concept: short, focused training can lift performance, and super recognizers might offer distinctive cues that complement algorithmic methods.

“The training cannot be considered a lasting, effective intervention, since it was not re-tested,”

Meike Ramon, Bern University of Applied Sciences (research commentary)

Ramon highlighted methodological limits: the study tested different participants across conditions, so within-subject learning effects and durability of improvement remain unmeasured.

Unconfirmed

  • Duration of training effect: the study measured performance immediately after training; it did not test retention over days or weeks.
  • Individual learning gains: because separate participants were used for baseline and training groups, it is unclear how much a single person’s performance would improve pre- versus post-training.
  • Generality across models: the experiments used specific AI-generated images; results may differ with other generation methods or higher-fidelity models released after 2025.

Bottom line

State-of-the-art AI can create faces that routinely deceive human observers, including those with exceptional face-processing skills. Yet a brief, targeted training that points out recurring rendering errors produces a measurable improvement in detection for both super recognizers and typical observers.

For real-world defense, a combined approach appears most promising: automated filters to flag suspicious images, followed by trained human reviewers—ideally with specific instruction on artefacts—to make final calls. Policymakers and platforms should prioritize continued training, evaluation of retention, and frequent updates as generative models evolve.
