Prompt-Guided Synthetic Data: A New Strategy for Robust Dental AI

VISTA Lab Alumni Spotlight: VISTA Lab is more than just a center for advanced research; it is an academy that trains visionary engineers and researchers who address global challenges in AI. This week, we are proud to feature Fatih Uysal, a distinguished alumnus, as he explores a practical strategy to mitigate "domain shift" in medical imaging—an important study co-authored by our lab members and recently published in PeerJ Computer Science.
Executive Summary: Deep learning systems for medical imaging often struggle when encountering data from real-world environments that differ from their training sets—a phenomenon known as domain shift. This article explores how prompt-guided synthetic data generation, combining expert knowledge with vision-language foundation models, improves the robustness of dental lesion detectors. By generating controllable synthetic lesions in panoramic radiographs, we demonstrate performance gains of 8–18%, especially for small and rare lesions.

The Hidden Challenge in Medical AI: Domain Shift

Deep learning has made remarkable progress in medical image analysis. Models can now detect diseases and highlight abnormalities in radiographs with impressive accuracy. However, one major obstacle remains: models often fail when deployed in environments that differ from their training data.

This issue, known as domain shift, is particularly common in medical imaging. Differences in imaging devices, acquisition protocols, or even patient populations can lead to substantial performance degradation. In dental radiology, this challenge is evident in the detection of periapical lesions. While panoramic X-rays are widely used for their low radiation dose and accessibility, their two-dimensional projection makes small lesions difficult to identify, leading to lower clinical sensitivity than advanced modalities such as cone beam CT.

Why Data Scarcity Makes the Problem Harder

Training robust models requires large, diverse datasets. In medical imaging, acquiring such data is expensive and time-consuming because of ethical guidelines and the need for expert annotation. Our study used 196 panoramic radiographs, of which 145 contained annotated lesions. While valuable, a dataset of this size rarely covers the full range of clinical variability, so detectors struggle with unusual lesion shapes or artifacts. To overcome this, we explored synthetic data augmentation.

Figure 1: Histograms of bounding box areas across different validation splits (Small, Medium, Large), showcasing the distribution of lesion sizes.

Two Approaches to Synthetic Lesion Generation

Our research compared two controllable pipelines for generating synthetic lesions:

1. Expert-Guided Synthesis (EGS)

In this approach, clinicians manually draw lesion masks on healthy radiographs. A procedural rendering pipeline then applies noise-based texture patterns, intensity modulation (to simulate radiolucency), and Gaussian edge blending to integrate the lesion realistically with the surrounding bone.
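The three rendering steps named above (noise-based texture, intensity modulation, Gaussian edge blending) can be illustrated with a minimal NumPy sketch. This is not the study's pipeline: the function names, the `darkening`, `noise_sigma`, and `edge_sigma` parameters, and their default values are all assumptions chosen for illustration, and the radiograph is assumed to be a 2-D float array with a binary mask of the same shape.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur with edge padding, using NumPy only."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def render_synthetic_lesion(
    radiograph: np.ndarray,
    mask: np.ndarray,
    darkening: float = 0.35,   # hypothetical strength of the radiolucency
    noise_sigma: float = 2.0,  # hypothetical texture scale, in pixels
    edge_sigma: float = 3.0,   # hypothetical edge softness, in pixels
    seed: int = 0,
) -> np.ndarray:
    """Blend a darkened, textured region into a healthy radiograph."""
    rng = np.random.default_rng(seed)
    image = radiograph.astype(np.float64)

    # 1. Noise-based texture: smoothed white noise rescaled to [0, 1].
    noise = gaussian_blur(rng.standard_normal(image.shape), noise_sigma)
    noise = (noise - noise.min()) / (noise.max() - noise.min() + 1e-12)

    # 2. Intensity modulation: darken the region to mimic radiolucency,
    #    letting the texture vary the amount of darkening.
    lesion = image * (1.0 - darkening * (0.75 + 0.25 * noise))

    # 3. Gaussian edge blending: a blurred mask acts as a soft alpha map.
    alpha = np.clip(gaussian_blur(mask.astype(np.float64), edge_sigma), 0.0, 1.0)
    return alpha * lesion + (1.0 - alpha) * image
```

Because the mask is known, a detection box for the synthetic lesion can be read directly from the mask's extent, which is what makes this kind of synthesis usable as labeled training data.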

Figure 2: The EGS Pipeline: From a healthy radiograph (A) and manual mask (B) to the final synthetic lesion with detection box overlay (E).

2. Prompt-Guided Lesion Synthesis (PGLS)

We introduce a novel idea: using natural language prompts to guide generation. Instead of manual masks, clinicians describe attributes—size, location, margin sharpness, and contrast. A vision-language foundation model interprets these prompts to perform image-to-image edits. This approach requires no task-specific generative model training and is highly scalable through prompt libraries.
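A prompt library of the kind described above could be organized along the following lines. This is a sketch under assumptions: the `LesionPrompt` class, the `build_prompt_library` helper, and the prompt wording are hypothetical and not taken from the study, and the call to the vision-language model is deliberately left out because the article does not name the model or its interface.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class LesionPrompt:
    """The four attributes a clinician describes (hypothetical structure)."""
    size: str
    location: str
    margin: str
    contrast: str

    def to_text(self) -> str:
        # Hypothetical wording; real prompts would be tuned by clinicians.
        return (
            f"Add a {self.size} periapical radiolucent lesion at {self.location}, "
            f"with {self.margin} margins and {self.contrast} contrast against "
            "the surrounding bone. Leave the rest of the panoramic radiograph "
            "unchanged."
        )


def build_prompt_library(sizes, locations, margins, contrasts):
    """Enumerate every combination of the attribute values."""
    return [
        LesionPrompt(*combo)
        for combo in product(sizes, locations, margins, contrasts)
    ]


library = build_prompt_library(
    sizes=["small", "medium"],
    locations=["the apex of a mandibular molar"],
    margins=["well-defined", "diffuse"],
    contrasts=["low", "high"],
)
for prompt in library[:2]:
    print(prompt.to_text())
```

Enumerating attribute combinations is one simple way to scale a library; each resulting text would then be paired with a healthy radiograph and passed to the image-editing model.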

Figure 4: PGLS lesion synthesis conditioned on a healthy panoramic radiograph, showing the synthesized lesion (A) vs. the original (B).

Training Robust Lesion Detectors

We evaluated these strategies using a YOLOv10 object detection model trained under three regimes: real-only data, real + EGS, and real + PGLS. The results were clear: both augmentation strategies improved performance, but prompt-guided synthesis consistently produced the strongest gains, improving major metrics such as mean Average Precision (mAP) and recall by approximately 8–18%.
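For readers less familiar with detection metrics, recall can be sketched as the fraction of annotated lesions that are matched by a predicted box with sufficient overlap. The example below is a simplified illustration, not the study's evaluation code: the function names are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, the 0.5 IoU threshold is an assumed default, and the matching is a plain greedy pass that ignores confidence scores, unlike full mAP evaluation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(ground_truth, predictions, threshold=0.5):
    """Share of ground-truth boxes matched by an unused prediction."""
    used = set()
    hits = 0
    for gt_box in ground_truth:
        best_index, best_iou = None, threshold
        for index, pred_box in enumerate(predictions):
            if index in used:
                continue  # each prediction may match only one lesion
            overlap = iou(gt_box, pred_box)
            if overlap >= best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None:
            used.add(best_index)
            hits += 1
    return hits / len(ground_truth) if ground_truth else 0.0
```

Running the same function on the predictions of detectors trained under each of the three regimes, against the same held-out annotations, gives directly comparable recall values.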

Figure 7: Comparison with StyleGAN2 baseline. Note the lack of radiographic fidelity in traditional GANs under data scarcity, whereas PGLS preserves anatomical integrity.

Why Prompt-Guided Augmentation Matters

Traditional generative approaches like GANs require massive datasets and complex tuning. Prompt-guided synthesis, however, uses existing foundation models for targeted edits. This allows for an iterative workflow: identify model failure cases (e.g., low-contrast lesions), describe them through prompts, generate targeted data, and retrain.
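The "identify failure cases" step of this workflow can be made concrete with a small sketch that groups missed lesions by size and splits a synthesis budget accordingly. Everything here is an assumption for illustration: the helper names are hypothetical, and the area thresholds follow the common COCO-style convention (32² and 96² pixels) rather than the size splits used in the study.

```python
from collections import Counter


def size_bin(box, small_max=32 * 32, medium_max=96 * 96):
    """Classify an (x1, y1, x2, y2) box by area (assumed thresholds)."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    if area <= small_max:
        return "small"
    if area <= medium_max:
        return "medium"
    return "large"


def allocate_synthesis_budget(missed_boxes, budget):
    """Split a number of synthetic images across size bins in proportion
    to how many real lesions the detector missed in each bin."""
    counts = Counter(size_bin(box) for box in missed_boxes)
    if not counts:
        return {}
    total = sum(counts.values())
    plan = {name: budget * n // total for name, n in counts.items()}
    # Integer division can leave a remainder; give it to the worst bin.
    worst = max(counts, key=counts.get)
    plan[worst] += budget - sum(plan.values())
    return plan
```

The resulting plan indicates how many prompts of each size category to draw before regenerating data and retraining; the same idea extends to other prompt attributes, such as contrast or margin sharpness, when those are annotated.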

Conclusion: Toward More Robust Clinical AI

Beyond dental imaging, prompt-guided synthesis has broader implications for radiology, pathology, and dermatology. The goal is not simply to build accurate models in the lab, but to ensure they remain reliable across diverse clinical environments. Sometimes, improving AI performance is not about collecting more data—but about generating the right data.

References