RNFLT2Vec: Artifact-corrected representation learning for retinal nerve fiber layer thickness maps

Glaucoma, the leading cause of irreversible blindness globally, is a progressive optic neuropathy characterized by irreversible visual field loss due to degeneration of the retinal nerve fiber layer (RNFL) (Quigley and Broman, 2006). To diagnose and monitor glaucoma, clinicians rely on the two-dimensional (2D) retinal nerve fiber layer thickness (RNFLT) map obtained from spectral-domain optical coherence tomography (OCT) scans. However, clinicians often underutilize the rich information in the 2D RNFLT map, as humans have difficulty processing high-dimensional data for decision-making. This limitation makes large-scale manual glaucoma detection and monitoring labor-intensive and sometimes impractical. Alternatively, some clinicians opt to use the one-dimensional circle scan of RNFLT for glaucoma diagnosis and monitoring, but this approach may overlook valuable information present in the full RNFLT map. Therefore, there is a strong need to effectively utilize the information and features extracted from the entire RNFLT map in a practical manner.

The remarkable success of artificial intelligence (AI) applications in medicine has sparked increasing efforts in automated glaucoma detection and diagnosis, leveraging techniques such as deep learning (Orlando et al., 2020, Wang et al., 2020a, Nayak et al., 2021, Mirzania et al., 2021). Recent studies have focused on utilizing RNFLT maps to predict visual fields (Christopher et al., 2020, Lazaridis et al., 2022) and discover patterns of RNFL thinning (Wang et al., 2020b). In contrast to clinicians’ subjective manual assessment, these AI methods extract and quantify clinically significant features from RNFLT maps related to glaucoma and its progression in a more objective and consistent manner. These methods play a crucial role in unveiling the intricate relationships between changes in retinal structure (e.g., RNFL thinning) and visual function (e.g., vision loss) in glaucoma (Chu et al., 2018). The effectiveness of such automated methods hinges on how well the relevant features in RNFLT maps can be preserved and represented, a process known as feature representation learning or embedding learning. The success of machine learning algorithms is intrinsically tied to effective data representation or features (Bengio et al., 2013). Representation learning poses a fundamental challenge across diverse disciplines, including word embedding for natural language processing (Li and Yang, 2018), image representation learning for image understanding (Li et al., 2018), and network representation learning for graph mining (Zhang et al., 2018). The primary challenge in representation learning lies in extracting features from data that accurately capture both intra- and inter-data relationships in a latent space.

The goal of learning representations for RNFLT maps is to preserve clinically significant features while capturing the inherent relationships among the maps in a latent space. By representing the 2D RNFLT map using lower-dimensional features, it becomes possible to uncover potential novel biomarkers associated with visual field loss and progression in glaucoma. Previous studies have employed traditional unsupervised machine learning techniques, such as principal component analysis and non-negative matrix factorization, to reduce 2D RNFLT maps into pattern decomposition coefficients, identifying specific patterns linked to visual field loss and progression (Christopher et al., 2018, Wang et al., 2020b). More recently, deep autoencoder models have been utilized to extract features from segmented retinal structures surrounding the optic nerve head; these features have been employed in subsequent diagnostic tasks to identify new structural phenotypes in glaucoma (Panda et al., 2022). However, learning distinctive representations that are relevant to glaucoma poses significant challenges, primarily due to the following two obstacles:

Prevalent Artifact Noise: A substantial proportion of RNFLT maps are affected by artifacts caused by layer segmentation errors in the OCT manufacturer’s software (see Fig. 1 for details). These thickness artifacts are noise that corrupts the true retinal thickness distribution patterns, so representations learned from affected RNFLT maps fail to accurately reflect a patient’s true glaucoma stage.

Obscure RNFLT Patterns: Pathological RNFL thinning patterns are entangled with inter-individual physiological variation, so there is little prior knowledge to naturally guide representation learning in differentiating the various thickness patterns shown in RNFLT maps. In addition, the presence of artifact noise makes the relative affinities among RNFLT maps even less distinguishable.

The complex thickness patterns, combined with artifact noise, pose challenges for traditional representation learning methods, such as matrix factorization (Liu et al., 2011) and autoencoders (Tschannen et al., 2018), when directly applied to 2D RNFLT maps. Recent studies treat RNFLT maps as three-channel RGB images and utilize powerful convolutional neural networks such as ResNet (He et al., 2016) and VGGNet (Simonyan and Zisserman, 2014) to extract features for glaucoma prediction or visual field loss estimation (Panda et al., 2018, Asaoka et al., 2019, Christopher et al., 2020). However, these methods assume that RNFLT maps are free of artifact noise. Furthermore, these approaches are task-specific and require labeled data for end-to-end supervised training, which limits the generalizability of the learned representations to other relevant glaucoma tasks.
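To make the decomposition idea behind the matrix-factorization approaches above concrete, the following is a minimal illustrative sketch, not the exact method of any cited study: plain multiplicative-update NMF applied to synthetic non-negative stand-ins for flattened RNFLT maps, yielding basis thickness patterns and per-map coefficients. All data, dimensions, and parameter choices here are hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (n_maps x n_pixels) as V ~ W @ H using
    Lee-Seung multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # per-map decomposition coefficients
    H = rng.random((k, m)) + eps   # basis thickness patterns
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for flattened RNFLT maps (n_maps x n_pixels).
rng = np.random.default_rng(1)
V = rng.random((50, 16 * 16))
W, H = nmf(V, k=4)
recon_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape)  # rows of H are the learned patterns
```

In this framing, each row of `H` plays the role of a decomposed RNFL pattern and each row of `W` summarizes one map by its pattern coefficients, which is the low-dimensional representation those earlier studies analyze.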

In this paper, we propose a novel framework called RNFLT2Vec to address the previously mentioned challenges and learn effective representations of RNFLT maps. To the best of our knowledge, this is the first work that focuses on feature representation learning for RNFLT maps using a large-scale dataset of glaucoma patients. Our approach utilizes an artifact correction component as the core of RNFLT2Vec to reduce the impact of artifact noise. This component corrects the thickness artifacts and learns representations that capture the true RNFLT patterns (see Fig. 2 for an example). To tackle the issue of unclear RNFLT patterns, we introduce two additional components to regularize RNFLT2Vec: contrastive learning and consistency learning. These components aid in learning distinctive representations among RNFLT maps. Unlike many supervised feature learning methods (Asaoka et al., 2019, Christopher et al., 2020) for RNFLT maps, our method is label-free and task-agnostic. The learned representations are generic and can be applied to various analytical tasks in glaucoma research, such as RNFL thinning pattern discovery and identification of new biomarkers for glaucoma.

The major contributions of this work are summarized as follows:

1. We formally studied the fundamental problem of learning feature representations of 2D RNFLT maps, which can be readily used for various clinical analyses such as analyzing RNFL thinning patterns and discovering new biomarkers.

2. We proposed a novel deep learning model, RNFLT2Vec, to learn representations of RNFLT maps. It corrects RNFLT artifacts and derives representative features with a UNet-like autoencoder. A contrastive learning-based regularization is integrated to capture the relative affinities between RNFLT maps, and a consistency learning-based regularization is introduced so that pairwise distances between representations align with the maps’ thickness distributions.

3. We evaluated the effectiveness of representation learning by RNFLT2Vec on a real-world dataset of more than 42,000 RNFLT maps from glaucoma patients. The proposed method can also be adapted to learn representations of other two-dimensional medical maps, such as fundus images.
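As an illustration of how artifact-corrected reconstruction could combine with contrastive and consistency regularizers in one objective, the following is a minimal numpy sketch under loudly stated assumptions: toy arrays stand in for network inputs and outputs, the contrastive term uses a generic InfoNCE form, and the consistency term matches normalized pairwise distances. The actual losses of RNFLT2Vec are defined in Section 4 and may differ in form and weighting.

```python
import numpy as np

def masked_reconstruction_loss(pred, target, mask):
    """MSE computed only on artifact-free pixels (mask == 1)."""
    return np.sum(mask * (pred - target) ** 2) / np.sum(mask)

def contrastive_loss(z, z_pos, tau=0.5):
    """Generic InfoNCE: embedding z[i] should be most similar to its
    positive counterpart z_pos[i] among all candidates."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    z_pos = z_pos / np.linalg.norm(z_pos, axis=1, keepdims=True)
    sim = z @ z_pos.T / tau                  # cosine similarities
    sim -= sim.max(axis=1, keepdims=True)    # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def consistency_loss(z, maps):
    """Align pairwise embedding distances with pairwise map distances."""
    def pdist(X):
        d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
        return d / (d.max() + 1e-9)          # scale-free comparison
    return np.mean((pdist(z) - pdist(maps.reshape(len(maps), -1))) ** 2)

# Toy example: 8 "maps" of 16x16 pixels, 4-dim embeddings.
rng = np.random.default_rng(0)
maps = rng.random((8, 16, 16))
recon = maps + 0.05 * rng.standard_normal(maps.shape)
mask = (rng.random(maps.shape) > 0.2).astype(float)   # 1 = artifact-free
z = rng.standard_normal((8, 4))       # embeddings of the maps
z_pos = rng.standard_normal((8, 4))   # embeddings of positive views
total = (masked_reconstruction_loss(recon, maps, mask)
         + 0.1 * contrastive_loss(z, z_pos)
         + 0.1 * consistency_loss(z, maps))
```

The mask restricts reconstruction supervision to artifact-free pixels, the contrastive term pulls related maps together in the latent space, and the consistency term keeps latent geometry faithful to thickness-distribution geometry; the 0.1 weights are placeholders.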

The rest of this paper is organized as follows. Section 2 surveys the related work. Section 3 presents preliminaries, including the formulation of the representation learning problem and the neural network components used in our approach. The proposed model for representation learning of RNFLT maps is introduced in Section 4. Section 5 presents extensive experiments evaluating the proposed approach. Section 6 discusses the proposed method for representation learning of RNFLT maps. Finally, Section 7 concludes the paper.
