Actively Recruiting
Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models
Led by Peking Union Medical College Hospital · Updated on 2026-03-30
Participants Needed: 1380
Research Sites: 1
Total Duration: 50 weeks
Sponsors
Peking Union Medical College Hospital
Lead Sponsor
Chinese Academy of Medical Sciences
Collaborating Sponsor
What this Trial Is About
This single-center, retrospective, observational study aims to construct a standardized benchmark evaluation system for intelligent breast ultrasound image interpretation and to systematically assess the diagnostic performance of current mainstream multimodal artificial intelligence (AI) models.
Data: De-identified B-mode breast ultrasound images with confirmed pathological diagnoses will be retrospectively collected from the institutional archive (2018-2025) and supplemented with images from published open-access datasets.
Annotation: Expert radiologists with varying experience levels will independently annotate all images according to the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) v2025 criteria, including glandular tissue composition, lesion characterization (mass vs. non-mass lesion), morphological descriptors, and final BI-RADS classification.
Baselines: Baseline deep learning models (CNN-based ResNet-50 and Transformer-based USFM) will be trained to establish performance baselines and to stratify cases by diagnostic difficulty through cross-architecture consensus.
Evaluation: Multiple multimodal large language models (MLLMs), including both general-purpose and medical-domain models, will then be evaluated via standardized API calls using BI-RADS-guided chain-of-thought prompts at temperature 0 for reproducibility.
Endpoints: Primary endpoints include BI-RADS classification accuracy and diagnostic AUC for benign-malignant differentiation. Model robustness and safety will be assessed through out-of-distribution rejection testing, temperature-stability experiments, and thinking-mode ablation studies.
Reporting: This study adheres to the FLAIR and TRIPOD-LLM reporting guidelines.
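To make the primary endpoints and the difficulty stratification concrete, the following is a minimal Python sketch, not the study's actual analysis code. It assumes accuracy is the exact-match rate on the BI-RADS category, AUC is computed in its rank-based (Mann-Whitney) form from a per-image malignancy score, and difficulty is a hypothetical three-tier rule based on whether the two baseline models are correct. The listing does not specify any of these details; the function names and tier labels are illustrative only.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of images whose predicted BI-RADS category equals the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def auc(scores: Sequence[float], is_malignant: Sequence[bool]) -> float:
    """AUC as the probability that a malignant case outscores a benign one.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for s, m in zip(scores, is_malignant) if m]
    neg = [s for s, m in zip(scores, is_malignant) if not m]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def difficulty(resnet_correct: bool, usfm_correct: bool) -> str:
    """Assumed consensus rule: both baselines right, one right, or neither."""
    if resnet_correct and usfm_correct:
        return "easy"
    if resnet_correct or usfm_correct:
        return "medium"
    return "hard"
```

For example, `auc([0.9, 0.8, 0.3, 0.2], [True, False, True, False])` compares each malignant score against each benign score; three of the four pairs are ranked correctly. The pairwise form is quadratic in the number of cases, which is acceptable for a sketch but would normally be replaced by a sorted-rank implementation on a full dataset.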
Official Title
Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models
Who Can Participate
Eligibility Criteria
Images are eligible for inclusion if they meet the following criteria:
- B-mode breast ultrasound grayscale images from institutional PACS or published open-access datasets with original ethics approval
- Image quality suitable for clinical diagnosis with a clear region of interest
- Pathological diagnosis confirmed for benign and malignant lesions, or normal breast status confirmed by a senior expert radiologist
- Complete removal of all personally identifiable information from images
Images are excluded if any of the following apply:
- Severely degraded images that prevent meaningful BI-RADS assessment
- Duplicate images from the same patient (only the most representative image is kept per lesion)
- Images containing personally identifiable information after de-identification
- Cases with unclear, disputed, or missing pathological results
- Non-B-mode ultrasound images such as elastography, contrast-enhanced, or Doppler imaging
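The inclusion and exclusion criteria above can be read as a per-image filter. The sketch below is one hedged way to express that logic in Python; the `ImageRecord` fields and the `is_eligible` function are hypothetical names introduced for illustration and do not come from the study. Duplicate handling is left out because the listing does not define how the "most representative" image per lesion is chosen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical per-image metadata; field names are assumptions."""
    modality: str               # e.g. "B-mode", "Doppler", "elastography"
    diagnostic_quality: bool    # clear region of interest, BI-RADS assessable
    pathology: Optional[str]    # "benign", "malignant", or None if unclear/missing
    normal_confirmed: bool      # normal breast confirmed by a senior radiologist
    contains_pii: bool          # identifiers still present after de-identification


def is_eligible(rec: ImageRecord) -> bool:
    """Apply the listed criteria to one image record."""
    if rec.modality != "B-mode":
        return False  # elastography, contrast-enhanced, and Doppler are excluded
    if not rec.diagnostic_quality or rec.contains_pii:
        return False  # degraded images and residual identifiers are excluded
    # A reference standard is required: confirmed pathology or confirmed normal.
    return rec.pathology in ("benign", "malignant") or rec.normal_confirmed
```

A record such as `ImageRecord("B-mode", True, "malignant", False, False)` passes, while the same record with `modality="Doppler"` or `pathology=None` (and no confirmed normal status) does not.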
Trial Site Locations
Total: 1 location
1
Peking Union Medical College Hospital
Beijing, China, 100730
Actively Recruiting
Research Team
Qingli Zhu, MD
CONTACT
Yinglan Wu, MD
CONTACT
How is the study designed?
Study Type: OBSERVATIONAL
Masking: N/A
Allocation: N/A
Model: N/A
Primary Purpose: N/A
Number of Arms: 3