Objective: To pilot a process for the independent external validation of an AI tool to detect breast cancer using data from the NHS Breast Screening Programme (NHSBSP).
Methods: A representative dataset of mammography images from 26,000 women attending two NHS screening centres and an enriched dataset of 2,054 positive cases were drawn from the OPTIMAM image database. The use case of the AI tool was the replacement of the first or second human reader. The performance of the AI tool was compared with that of human readers in the NHSBSP.
Results: Recommendations for future external validations of AI tools to detect breast cancer are provided. The tool recalled different breast cancers from those recalled by the human readers. This study showed the importance of testing AI tools on all types of cases (including non-standard cases) and of ensuring that any warning messages are clear. The acceptable difference in sensitivity and specificity between the AI tool and human readers should be determined. Any information vital for the clinical application should be a required output of the AI tool. It is recommended that the interaction of radiologists with the AI tool, and the effect of the AI tool on arbitration, be investigated prior to clinical use.
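As an illustration of the recommendation on acceptable differences, the comparison could be sketched as below. This is a minimal sketch, not the method used in the pilot: the `Counts` class, the `within_margin` function, the 0.05 margin and all counts are hypothetical, and a real validation would also report confidence intervals rather than point estimates alone.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts for one reader (AI tool or human)."""
    tp: int  # cancers recalled
    fn: int  # cancers not recalled
    tn: int  # normal cases not recalled
    fp: int  # normal cases recalled

    @property
    def sensitivity(self) -> float:
        # Assumes at least one positive case is present.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Assumes at least one negative case is present.
        return self.tn / (self.tn + self.fp)


def within_margin(ai: Counts, human: Counts, margin: float) -> dict:
    """Check whether the AI tool is no worse than the human reader
    by more than `margin` on each metric (point estimates only)."""
    sens_diff = ai.sensitivity - human.sensitivity
    spec_diff = ai.specificity - human.specificity
    return {
        "sensitivity_diff": sens_diff,
        "specificity_diff": spec_diff,
        "sensitivity_ok": sens_diff >= -margin,
        "specificity_ok": spec_diff >= -margin,
    }


# Made-up counts for illustration only; not results from the study.
ai = Counts(tp=88, fn=12, tn=930, fp=70)
human = Counts(tp=90, fn=10, tn=950, fp=50)
result = within_margin(ai, human, margin=0.05)
```

The margin is passed in as a parameter so that it is agreed before the comparison is run, rather than chosen after the results are seen.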
Conclusion: This pilot demonstrated several lessons for future independent external validation of AI tools for breast cancer detection.
Advances in knowledge: This pilot contributes towards best-practice procedures for performing independent external validations of AI tools for the detection of breast cancer using data from the NHS Breast Screening Programme.