Efficient adversarial debiasing with Concept Activation Vectors: medical image case studies

Bias and discrimination in Artificial Intelligence (AI) systems have been studied in multiple domains, including healthcare applications such as melanoma detection [1] and mortality prediction [2]. Several studies have recently demonstrated significant racial disparities in medical imaging AI systems when model performance is stratified by self-reported race [3]. For example, state-of-the-art AI models produce significant differences in chest X-ray (CXR) under-diagnosis rates across racial and other demographic groups, leading to more Black and female patients being under-diagnosed compared to White and male patients [4]. Furthermore, it was found that the disparity was not simply due to under-representation of certain groups in the training data and persisted even when the training data were balanced [3]. It remains unclear which imaging features are associated with sensitive attributes and affect model performance. Other works have shown that subtle demographic differences make it possible to recover self-reported race from medical images [5].

To develop more equitable models, researchers have proposed techniques that either modify the training data to develop a ‘fair’ model or influence model training strategically to reduce the effect of sensitive attributes. We recently published a comprehensive review of ‘fair’ AI model development covering both generic and medical imaging case studies [6], in which we broadly divided the methodologies for developing ‘fair’ AI models for medical images into two categories: (i) ‘fair’ dataset curation for unbiased training and (ii) ‘fair’ representation learning. In the ‘fair’ dataset curation category, Larrazabal et al. demonstrated that using a dataset balanced with respect to gender could improve per-group performance for chest X-ray diagnosis [7]. However, it has been shown that available medical imaging datasets are often limited in geographic diversity and provide few sociodemographic details, creating data deserts that hamper health disparities research and the development of fair models [8]. Furthermore, curating a large, racially balanced training dataset is extremely challenging given that healthcare institutions often serve a specific geographic location and particular patient subgroups, and sharing data between institutions or crowdsourcing is often infeasible given HIPAA constraints.
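As a concrete illustration of dataset-level mitigation, the sketch below (PyTorch; the function and argument names are ours, not from [7]) approximates demographically balanced training by resampling the available data so each group is drawn with equal probability, a pragmatic stand-in when collecting a truly balanced dataset is infeasible.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def balanced_loader(dataset, group_labels, batch_size=32):
    """Resample so each demographic group is drawn with equal probability.

    group_labels: one integer group id (e.g., self-reported race or gender)
    per sample in `dataset`. Illustrative sketch, not the procedure of [7].
    """
    groups = torch.as_tensor(group_labels)
    counts = torch.bincount(groups).float()          # samples per group
    sample_weights = (1.0 / counts)[groups]          # inverse-frequency weight per sample
    sampler = WeightedRandomSampler(sample_weights,
                                    num_samples=len(dataset),
                                    replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```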

Disparities in performance have often been associated with models learning spurious correlations in the training data. Works such as ‘reading race’ have demonstrated that medical images contain indicators of race that models can learn, even though the responsible features remain unknown [5]. Other works have attempted to reduce a model’s ability to learn race-related features, often by adding a penalty term based on an auxiliary model’s ability to identify the demographic attribute of interest [9], [10]. Despite gains in fairness, these techniques are hindered by a slight reduction in performance. ‘Fair’ representation learning focuses on learning relevant representations from the data by extracting information useful for the targeted downstream classification/prediction task [11]. However, previous works [12], [13] have shown that deep learning models can achieve good overall performance by learning confounding features of the sensitive attributes, resulting in poor performance for minority classes. To mitigate such biases, adversarial debiasing [10], adapted from domain adaptation, has been used to penalize the model for learning demographic features such as age, gender, and/or race [9], [14], [15]. Another strategy uses autoencoders that learn two representations capturing intrinsic and biasing features; the representations are then augmented to increase the diversity of bias-conflicting samples, helping to debias the model [16]. However, such representation learning methods have rarely been evaluated on medical imaging datasets [14], [17], probably due to the complexity of the datasets and target tasks. These algorithms have also been found to yield lower performance than their original baseline models.
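For concreteness, the following is a minimal sketch of the adversarial debiasing idea referenced above (PyTorch; the module names, the gradient-reversal formulation, and the penalty weight `lambd` are illustrative assumptions, not the exact implementations of [9], [10], [14], [15]): a shared encoder feeds both a diagnostic head and an auxiliary demographic classifier, and reversed gradients from the latter penalize the encoder for retaining demographic information.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialDebiasNet(nn.Module):
    """Shared encoder with a task head and, through gradient reversal, a demographic adversary."""
    def __init__(self, encoder, feat_dim, num_labels, num_groups, lambd=1.0):
        super().__init__()
        self.encoder = encoder                      # e.g., a CNN backbone returning a feature vector
        self.task_head = nn.Linear(feat_dim, num_labels)
        self.adv_head = nn.Linear(feat_dim, num_groups)
        self.lambd = lambd                          # strength of the adversarial penalty (assumed hyperparameter)

    def forward(self, x):
        z = self.encoder(x)
        y_logits = self.task_head(z)                # diagnostic prediction
        # The adversary tries to predict the sensitive attribute; the reversed gradients
        # push the encoder toward features that are uninformative about that attribute.
        a_logits = self.adv_head(GradientReversal.apply(z, self.lambd))
        return y_logits, a_logits

def debias_loss(y_logits, y, a_logits, a):
    """Joint objective: task cross-entropy plus the adversary's cross-entropy on the attribute."""
    return nn.functional.cross_entropy(y_logits, y) + nn.functional.cross_entropy(a_logits, a)
```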

In this work, we propose a novel partial neural network learning strategy to train a ‘fair’ AI model that improves the performance of adversarial learning techniques while reducing the performance deficit of the models. The partial debiasing utilizes the Concept Activation Vector (CAV) method [18], originally proposed as a model interpretation technique, to identify the convolutional layers primarily responsible for learning race information and to target them for fine-tuning. We compared our approach against a DenseNet121 [19] baseline and a standard model fine-tuning strategy. We also performed a comparative analysis with full adversarial model training schemes and evaluated the effectiveness of the models on two medical image classification tasks: chest X-ray interpretation and breast density prediction. We release the code under an academic open-source license: https://github.com/ramon349/JBI2023_TCAV_debiasing.
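To illustrate how a CAV can guide the choice of layers for partial debiasing, the sketch below (Python with PyTorch and scikit-learn; the helper names, the logistic-regression concept classifier, and the DenseNet121 block iteration are illustrative assumptions rather than the exact procedure of [18] or of our released code) scores each convolutional block by how linearly separable the sensitive attribute is in its activations; the highest-scoring blocks would be the candidates for targeted fine-tuning.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def layer_activations(model, layer, images):
    """Collect flattened activations of `layer` (a torch.nn.Module) for a batch of images."""
    feats = []
    hook = layer.register_forward_hook(
        lambda m, inp, out: feats.append(out.detach().flatten(1).cpu().numpy())
    )
    with torch.no_grad():
        model(images)
    hook.remove()
    return np.concatenate(feats, axis=0)

def cav_separability(model, layer, concept_imgs, counter_imgs):
    """Fit a linear classifier (the CAV direction) separating concept vs. counterexample
    activations; return its cross-validated accuracy as a separability score."""
    X = np.concatenate([layer_activations(model, layer, concept_imgs),
                        layer_activations(model, layer, counter_imgs)])
    y = np.concatenate([np.ones(len(concept_imgs)), np.zeros(len(counter_imgs))])
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, y, cv=3).mean()

# Hypothetical usage: rank DenseNet121 dense blocks by how strongly their activations
# encode the sensitive attribute; the highest-scoring blocks become the targets of
# partial adversarial fine-tuning.
# scores = {name: cav_separability(model, layer, race_a_imgs, race_b_imgs)
#           for name, layer in model.features.named_children()
#           if name.startswith("denseblock")}
```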

Statement of significance

Several studies have recently demonstrated significant racial disparities in medical imaging AI systems.

Disparities in performance have often been associated with models learning spurious correlations in the training datasets.

Curating a large, racially balanced training dataset is extremely challenging given that healthcare institutions often serve a specific geographic location and particular patient subgroups. Sharing data between institutions is also often infeasible given HIPAA constraints.

We propose a novel partial neural network learning strategy to train a ‘fair’ AI model and perform a comparative analysis of two parallel model training schemes (full and partial) that improve the performance of adversarial learning techniques while reducing the performance deficit of the models.

We evaluate the effectiveness of the models on two medical image classification tasks: chest X-ray image interpretation and breast density prediction.
