Editorial: don’t forget basic performance measures in the endoscopic assessment of ulcerative colitis

During the last decade, the therapeutic paradigm in ulcerative colitis (UC) has shifted from symptom control to steroid-free clinical remission associated with the disappearance of endoscopic inflammation. To that end, assessment of endoscopic disease activity has become the cornerstone of UC management in daily practice and in clinical trials, as required by the US Food and Drug Administration and the European Medicines Agency. Two UC endoscopic scoring systems were recommended by a recently updated expert consensus: the Mayo endoscopic subscore (MES) and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS).1,2 The MES was constructed more than 30 years ago in the context of the endoscopic techniques of the 1980s and has never been validated.1 It suffers from several weaknesses: only four categories, no discrimination between individual endoscopic lesions, and overlap between the friability and erythema items in subscores 1 and 2. The more recent UCEIS combines the three most reproducible UC endoscopic lesions, namely vascular pattern, bleeding, and erosions/ulcers. It has been validated and may have better intra- and inter-observer reproducibility than the MES.3 Importantly, cut-offs for defining endoscopic remission have not been validated for either score.

Endoscopic central reading has been implemented in clinical trials to overcome these limitations and obtain a more reproducible assessment of UC endoscopic activity. Experienced and properly trained central readers, blinded to any clinical or study data, are less prone to observational bias than local readers, who may be aware of a patient's history and sequence of treatments. Central reading has a substantial impact on results from controlled trials, as observed in a study comparing mesalazine to placebo in mild-to-moderate UC.4 In a recent post hoc analysis of the OCTAVE program, Feagan et al evaluated the agreement between central and local endoscopic reads.5 They confirmed that agreement was only moderate-to-substantial, with a kappa (κ) of 0.62 (95% confidence interval: 0.59-0.66). Moreover, when the reads disagreed, local reads reported lower scores than central reads. The next step to standardise reproducible endoscopic assessments in UC should be artificial intelligence. This promising tool has been evaluated in several recent studies, with performance as good as that of expert endoscopists.6-8
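For readers less familiar with the kappa statistic quoted above, the following minimal sketch shows how an unweighted Cohen's kappa can be computed for paired local and central reads. It is illustrative only: the function name `cohen_kappa` and the ten paired scores are hypothetical, the scores are not OCTAVE data, and the kappa variant actually used by Feagan et al (unweighted or weighted) is not stated here, so the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(reads_a, reads_b):
    """Unweighted Cohen's kappa for two readers scoring the same examinations."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both readers must score the same non-empty set of examinations")
    n = len(reads_a)

    # Observed agreement: proportion of examinations given identical scores.
    observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n

    # Chance agreement: for each score, the product of the two readers'
    # marginal proportions, summed over all scores.
    freq_a = Counter(reads_a)
    freq_b = Counter(reads_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single identical score")
    return (observed - expected) / (1 - expected)


# Hypothetical Mayo endoscopic subscores (0-3) for ten examinations.
# These values are invented for illustration and are not trial data.
local_reads = [0, 1, 1, 2, 2, 3, 3, 3, 2, 1]
central_reads = [0, 1, 2, 2, 3, 3, 3, 3, 2, 2]

kappa = cohen_kappa(local_reads, central_reads)
print(f"kappa = {kappa:.2f}")
```

Because kappa corrects the raw agreement rate for the agreement expected by chance, two readers who agree on most examinations can still obtain only a moderate value when a few scores dominate the sample.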

Whatever the method used to evaluate UC endoscopic activity, the key point remains image quality, especially the quality of bowel cleansing. Unfortunately, this important information was missing from the paper by Feagan et al, although the European Society of Gastrointestinal Endoscopy considers quality criteria of endoscopic conditions to be mandatory performance measures.9 This limitation has a significant impact on the level of agreement between readers, as recently observed in Crohn's disease,10 and should be taken into account when comparing the performance of experts and machine learning algorithms.

Declaration of personal interests: D Laharie declares counselling, boards, transports or fees from Abbvie, Biogaran, Biogen, Ferring, HAC-pharma, Janssen, MSD, Novartis, Pfizer, Prometheus, Roche, Takeda, Theradiag, Tillots. Pauline Rivière declares fees from Abbvie, Amgen and Janssen.
