Content area
Full text
Introduction
New screening recommendations for breast cancer are presently being introduced across Europe and the United States1,2. Previous guidelines focused on mammography as the primary tool for breast cancer detection3,4. The latest guidelines advocate the use of magnetic resonance imaging (MRI) as a screening method for a significant number of women, particularly those with extremely dense breast tissue. This recommendation is reflected in the recently published EUSOBI guideline1. With these modifications, the use of MRI as a screening tool will need to be exponentially scaled up, potentially involving millions of women scanned annually within the European Union. This substantial rise in imaging demand is currently unmatched by a proportional increase in trained specialty radiologists. This disparity underscores a growing medical necessity for computed-assisted approaches, in particular deep learning (DL) methods. These systems can assist radiologists in interpreting breast MRI data, thereby enabling general radiologists to achieve a level of proficiency comparable to that of experts5. While high-quality evidence shows a potential clinical benefit of DL in mammography6, similar advancements in MRI for breast cancer face significant challenges. Notably, studies that have achieved high-performance DL models using MRI data often rely on large, proprietary datasets that are not publicly accessible. The performance of DL systems for medical image analysis scales with the amount of training data. Hence, the lack of data accessibility hinders reproducibility and limits collaborative efforts within the research community. Therefore, training accurate, high-performance DL models for breast cancer detection in MRI is constrained by two primary limitations: access to a large number of examinations and the availability of ground truth labels.
Traditionally, DL models for tumor detection in three-dimensional (3D) radiological data are trained in a supervised way, often using manually drawn tumor annotations as a golden standard7,8. This process imposes a significant time and labor demand on expert and trained radiologists. Moreover, obtaining precise voxel-level boundaries of tumors in MRI data is not always feasible due to inherent imaging ambiguities and significant intra- and inter-reader variability, which can affect subsequent measurements and model performance. Similar limitations apply to other strongly supervised methods, including bounding-box annotations or centroid annotations, all of which require expert input and can be subjective and ambiguous