Actively Recruiting

Age: 18 Years - 75 Years
FEMALE
Healthy Volunteers
NCT07500428

Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models

Led by Peking Union Medical College Hospital · Updated on 2026-03-30

Participants Needed: 1380
Research Sites: 1
Total Duration: 50 weeks

Sponsors

Peking Union Medical College Hospital

Lead Sponsor

Chinese Academy of Medical Sciences

Collaborating Sponsor

AI-Summary

What this Trial Is About

This single-center, retrospective, observational study aims to construct a standardized benchmark evaluation system for intelligent breast ultrasound image interpretation and to systematically assess the diagnostic performance of current mainstream multimodal artificial intelligence (AI) models. De-identified B-mode breast ultrasound images with confirmed pathological diagnoses will be retrospectively collected from the institutional archive (2018-2025) and supplemented with images from published open-access datasets. Expert radiologists with varying experience levels will independently annotate all images according to the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) v2025 criteria, including glandular tissue composition, lesion characterization (mass vs. non-mass lesion), morphological descriptors, and final BI-RADS classification. Baseline deep learning models (CNN-based ResNet-50 and Transformer-based USFM) will be trained to establish performance baselines and to stratify cases by diagnostic difficulty through cross-architecture consensus. Multiple multimodal large language models (MLLMs), including both general-purpose and medical-domain models, will then be evaluated via standardized API calls using BI-RADS-guided chain-of-thought prompts at temperature 0 for reproducibility. Primary endpoints include BI-RADS classification accuracy and diagnostic AUC for benign-malignant differentiation. Model robustness and safety will be assessed through out-of-distribution rejection testing, temperature-stability experiments, and thinking-mode ablation studies. This study adheres to the FLAIR and TRIPOD-LLM reporting guidelines.
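The two primary endpoints and the consensus-based difficulty stratification described above can be illustrated with a minimal sketch. The function names, the string encoding of BI-RADS categories, and the easy/medium/hard rule are all assumptions made for illustration; the listing does not specify how the study computes these quantities or how consensus maps to difficulty levels.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of images whose predicted BI-RADS category equals the reference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for benign (0) vs. malignant (1) via the Mann-Whitney statistic.

    Equals the probability that a malignant case receives a higher score
    than a benign one, with ties counted as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def difficulty(label: int, cnn_pred: int, transformer_pred: int) -> str:
    """Assumed rule: count how many of the two baseline models are correct."""
    n_correct = int(cnn_pred == label) + int(transformer_pred == label)
    return {2: "easy", 1: "medium", 0: "hard"}[n_correct]


if __name__ == "__main__":
    print(accuracy(["4A", "3", "5"], ["4A", "3", "4C"]))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.4]))
    print(difficulty(label=1, cnn_pred=1, transformer_pred=0))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable at the scale of this study (1380 participants); a rank-based implementation would be preferable for much larger sets.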

CONDITIONS

Official Title

Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models

Who Can Participate

Age: 18 Years - 75 Years
FEMALE
Healthy Volunteers

Eligibility Criteria

Eligible

You may qualify if you...

  • B-mode breast ultrasound grayscale images from institutional PACS or published open-access datasets with original ethics approval
  • Image quality suitable for clinical diagnosis with clear region of interest
  • Pathological diagnosis confirmed for benign and malignant lesions, or normal breast status confirmed by senior expert radiologist
  • Complete removal of all personally identifiable information from images

Not Eligible

You will not qualify if you...

  • Severely degraded images that prevent meaningful BI-RADS assessment
  • Duplicate images from the same patient (only most representative kept per lesion)
  • Images that still contain personally identifiable information after de-identification
  • Cases with unclear, disputed, or missing pathological results
  • Non-B-mode ultrasound images such as elastography, contrast-enhanced, or Doppler imaging
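The inclusion and exclusion criteria above amount to a per-image filter. The following is a rough sketch of how such a filter might look; the `ImageRecord` fields and the `exclusion_reasons` function are hypothetical names introduced here, not part of the study's actual pipeline, and the sketch assumes each criterion has already been reduced to a simple flag by an upstream review step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageRecord:
    # All fields are hypothetical stand-ins for the listed criteria.
    mode: str                       # e.g. "B-mode", "Doppler", "elastography"
    quality_ok: bool                # quality suitable for BI-RADS assessment
    pathology: Optional[str]        # "benign", "malignant", or None if unclear/missing
    expert_confirmed_normal: bool   # normal status confirmed by a senior radiologist
    contains_pii: bool              # identifiable information remains after de-identification
    is_duplicate: bool              # not the representative image for its lesion


def exclusion_reasons(rec: ImageRecord) -> List[str]:
    """Return every reason the image fails the criteria; an empty list means eligible."""
    reasons: List[str] = []
    if rec.mode != "B-mode":
        reasons.append("non-B-mode image")
    if not rec.quality_ok:
        reasons.append("image quality prevents BI-RADS assessment")
    if rec.is_duplicate:
        reasons.append("duplicate image for the same lesion")
    if rec.contains_pii:
        reasons.append("residual personally identifiable information")
    if rec.pathology not in ("benign", "malignant") and not rec.expert_confirmed_normal:
        reasons.append("no confirmed pathology or expert-confirmed normal status")
    return reasons


if __name__ == "__main__":
    ok = ImageRecord("B-mode", True, "benign", False, False, False)
    bad = ImageRecord("Doppler", True, "malignant", False, True, False)
    print(exclusion_reasons(ok))
    print(exclusion_reasons(bad))
```

Returning the full list of reasons, rather than a single boolean, makes it easier to report how many images were excluded under each criterion.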

Trial Site Locations

Total: 1 location

1

Peking Union Medical College Hospital

Beijing, China, 100730

Actively Recruiting

Research Team

Qingli Zhu, MD

CONTACT

Yinglan Wu, MD

CONTACT

How is the study designed?

Study Type: OBSERVATIONAL
Masking: N/A
Allocation: N/A
Model: N/A
Primary Purpose: N/A
Number of Arms: 3
