Current neural networks demonstrate potential in automated cervical vertebral maturation stage classification based on lateral cephalograms

Selection Criteria

The authors searched four databases (MEDLINE, Scopus, Embase, and Web of Science) for diagnostic accuracy studies published in English. The search strategy was a combination of terms related to cervical vertebrae maturation (CVM) and neural networks (NNs), including “skeletal maturity”, “bone age”, “artificial intelligence”, “machine learning” and others. Experimental studies, prospective/retrospective observational studies, randomized controlled trials, case-control studies, and cohort studies were considered eligible. Abstracts, opinions, conference papers, reviews, and studies that did not use NNs were excluded. The study screening process was performed in two stages, each by two independent reviewers.

Key Study Factors

The review compared the diagnostic accuracy of NNs against the ground truth determined by human observers. Among the 8 included studies, 6 used training datasets in which the cervical vertebrae maturation stages (CVS) were equally distributed. Three studies used the radiograph image as the input data, while 5 used manually labeled datasets with measurements. The input measurements included linear measurements in both the vertical and horizontal directions, as well as ratios derived from them. In addition, 2 studies applied cross-validation to the training and testing data, and 6 studies used separate training and testing datasets. Three studies used pre-developed convolutional NNs and modified them to suit the input. Six studies created new NNs designed specifically for CVS classification; 1 of these used both a pre-developed NN and a newly developed NN. In all 8 included studies, human observers classified CVS according to the Hassel method¹ or the method modified by Baccetti et al.²
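The difference between the two evaluation designs mentioned above (a separate test dataset versus cross-validation) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 20% test fraction, and the choice of 5 folds are hypothetical and are not taken from any of the included studies.

```python
import random


def holdout_split(n_cases, test_fraction=0.2, seed=0):
    """Split case indices once into a training set and a separate test set."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_cases * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_cases, k=5, seed=0):
    """Yield k (train, test) index pairs; each case is tested exactly once."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 100 cephalograms (assumed size, for illustration only).
train_idx, test_idx = holdout_split(100)
cv_splits = list(k_fold_splits(100, k=5))
```

With a single hold-out split, the reported accuracy depends on which cases happen to land in the test set; cross-validation averages over several such splits but still needs the test fold kept out of training within each round.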

Main Outcome Measures

The main outcome of this review was the accuracy of CVS classification based on lateral cephalograms, reported as the level of agreement between NNs and the reference standard.
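As a rough illustration of this outcome measure, accuracy can be computed as the proportion of cases in which the NN assigns the same stage as the reference standard, either overall or separately for each stage. The sketch below assumes six stages coded 1 to 6; the function names and the label lists are hypothetical and do not represent data from any included study.

```python
from collections import defaultdict


def overall_accuracy(reference, predicted):
    """Fraction of cases where the NN stage equals the reference stage."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def per_stage_accuracy(reference, predicted):
    """Accuracy computed separately for each reference stage."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r, p in zip(reference, predicted):
        totals[r] += 1
        if r == p:
            correct[r] += 1
    return {stage: correct[stage] / totals[stage] for stage in sorted(totals)}


# Hypothetical illustrative labels (stages 1-6), not data from any study.
reference = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
predicted = [1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6, 5]

print(overall_accuracy(reference, predicted))    # 0.75
print(per_stage_accuracy(reference, predicted))
```

Reporting both figures matters because a single overall accuracy can hide stages that are classified much less reliably than others.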

Main Results

The findings of the included studies were presented descriptively, and no meta-analysis was carried out. The reported accuracy varied widely across studies, ranging from less than 50% to over 95%. Notably, 5 studies reported accuracy above 90%, while 1 study reported an accuracy of 58.3% and another reported 62.5%. Furthermore, 1 study reported accuracy only for the individual CVS (range, 47.4% to 93%) and did not provide an overall accuracy figure. Three studies compared NNs with other forms of artificial intelligence (AI), such as Bayes models, and all concluded that NNs were more accurate. Additionally, 1 study found that the NN had the most stable results compared with other AI algorithms. According to the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) assessment tool, 7 studies that lacked an evaluation of inter-observer or intra-observer agreement were considered to have some concerns for bias in the reference standard. Two studies without a separate test dataset were considered to have some concerns for bias in the index test. All 8 included studies presented low concerns regarding their applicability.
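The inter-observer agreement that most studies did not evaluate can be quantified in several ways. The summary does not say which statistic the review expected, so the sketch below uses Cohen's kappa purely as one common choice for two raters. The function name and the stage labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of cases")
    # Observed agreement: share of cases given the same stage by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's stage frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical stage throughout
    return (observed - expected) / (1 - expected)


# Hypothetical stage labels from two observers, for illustration only.
observer_1 = [1, 1, 2, 2, 3, 3]
observer_2 = [1, 1, 2, 3, 3, 3]
kappa = cohens_kappa(observer_1, observer_2)
```

Here the observers agree on 5 of 6 cases, and one third of agreement is expected by chance, giving a kappa of 0.75. A low value would indicate that the human-assigned reference labels are themselves unreliable, which is the source of the bias concern noted above.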

Conclusions

The authors concluded that NNs can successfully classify the various stages of CVM based on lateral cephalograms. However, the accuracy of NNs varied considerably across studies.
