Interrelated feature selection from health surveys using domain knowledge graph

Wang M, Yang Y, Liao Z. Diabetes and cancer: epidemiological and biological links. World J Diabetes. 2020;11(6):227.

Jing X-Y, Zhang X, Zhu X, Wu F, You X, Gao Y, Shan S, Yang J-Y. Multiset feature learning for highly imbalanced data classification. IEEE Trans Pattern Anal Mach Intell. 2019;43(1):139–56.

Zhang Yong, Sheng Ming, Liu Xingyue, Wang Ruoyu, Lin Weihang, Ren Peng, Wang Xia, Zhao Enlai, Song Wenchao. A heterogeneous multi-modal medical data fusion framework supporting hybrid data exploration. Health Inf Sci Syst. 2022;10(1):22.

Vimalachandran Pasupathy, Liu Hong, Lin Yongzheng, Ji Ke, Wang Hua, Zhang Yanchun. Improving accessibility of the Australian My Health Records while preserving privacy and security of the system. Health Inf Sci Syst. 2020;8:1–9.

Huang H, Liu H. Feature selection for hierarchical classification via joint semantic and structural information of labels. Knowl-Based Syst. 2020;195:105655.

Agrawal M. Towards scalable structured data from clinical text. PhD diss., Massachusetts Institute of Technology; 2023.

Deng F, Zhou H, Lin Y, Heim JA, Shen L, Li Y, Zhang L. Predict multicategory causes of death in lung cancer patients using clinicopathologic factors. Comput Biol Med. 2021;129:104161.

Parekh T, Fahim F. Building risk prediction models for daily use of marijuana using machine learning techniques. Drug Alcohol Depend. 2021;225:108789.

Chen Songjing, Wu Sizhu. Identifying lung cancer risk factors in the elderly using deep neural networks: quantitative analysis of web-based survey data. J Med Internet Res. 2020;22(3):e17695.

Pan Liangrui, Ji Boya, Wang Hetian, Wang Lian, Liu Mingting, Chongcheawchamnan Mitchai, Peng Shaoliang. MFDNN: multi-channel feature deep neural network algorithm to identify COVID19 chest X-ray images. Health Inf Sci Syst. 2022;10(1):4.

Ying X. An overview of overfitting and its solutions. J Phys Conf Ser. 2019;1168(2):022022.

Kim K. An improved semi-supervised dimensionality reduction using feature weighting: application to sentiment analysis. Expert Syst Appl. 2018;109:49–65.

Jaworsky M, Tao X, Yong J, Pan L, Zhang J, Pokhrel S. Automated knowledge graph construction for healthcare domain. In: Proceedings of the International Conference on Health Information Science, pp. 258–265, Springer; 2022.

Li X, Wang Y, Wang D, Yuan W, Peng D, Mei Q. Improving rare disease classification using imperfect knowledge graph. BMC Med Inform Decis Mak. 2019;19(5):1–10.

Tao X, Pham T, Zhang J, Yong J, Goh WP, Zhang W, Cai Y. Mining health knowledge graph for health risk prediction. World Wide Web. 2020;23:2341–62.

Pham Thuan, Tao Xiaohui, Zhang Ji, Yong Jianming, Li Yuefeng, Xie Haoran. Graph-based multi-label disease prediction model learning from medical data and domain knowledge. Knowl-Based Syst. 2022;235:107662.

Pham Thuan, Tao Xiaohui, Zhang Ji, Yong Jianming. Constructing a knowledge-based heterogeneous information graph for medical health status classification. Health Inf Sci Syst. 2020;8:1–14.

Howlader KC, Satu MS, Awal MA, Islam MR, Islam SMS, Quinn J, Moni MA. Machine learning models for classification and identification of significant attributes to detect type 2 diabetes. Health Inf Sci Syst. 2022;10(1):2.

Akram T, Lodhi HM, Naqvi SR, Naeem S, Alhaisoni M, Ali M, Haider SA, Qadri NN. A multilevel features selection framework for skin lesion classification. Human-centric Comput Inf Sci. 2020;10:1–26.

Nam J. Learning Label Structures with Neural Networks for Multilabel Classification. PhD thesis, Technische Universität; 2019.

Waegeman W, Dembczyński K, Hüllermeier E. Multi-target prediction: a unifying view on problems and methods. Data Min Knowl Disc. 2019;33(2):293–324.

Madjarov G, Kocev D, Gjorgjevikj D, Džeroski S. An extensive experimental comparison of methods for multi-label learning. Pattern Recogn. 2012;45(9):3084–104.

Hu Weihua, Fey Matthias, Zitnik Marinka, Dong Yuxiao, Ren Hongyu, Liu Bowen, Catasta Michele, Leskovec Jure. Open graph benchmark: Datasets for machine learning on graphs. Adv Neural Inf Process Syst. 2020;33:22118–33.

Reiser Patrick, Eberhard André, Friederich Pascal. Graph neural networks in TensorFlow-Keras with RaggedTensor representation (kgcnn). Software Impacts. 2021;9:100095.

Li Mufei, Zhou Jinjing, Hu Jiajing, Fan Wenxuan, Zhang Yangkang, Gu Yaxin, Karypis George. DGL-LifeSci: an open-source toolkit for deep learning on graphs in life science. ACS Omega. 2021;6(41):27233–8.

Pes B. Learning from high-dimensional and class-imbalanced datasets using random forests. Information. 2021;12(8):286.

Liu M, Xu C, Luo Y, Xu C, Wen Y, Tao D. Cost-sensitive feature selection via f-measure optimization reduction. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence; 2017.

Xu D, Shi Y, Tsang IW, Ong Y-S, Gong C, Shen X. Survey on multi-output learning. IEEE Trans Neural Netw Learn Syst. 2019;31(7):2409–29.

Berrar D. Cross-validation; 2019.

Gonzalez-Dias P, Lee EK, Sorgi S, de Lima DS, Urbanski AH, Silveira EL, Nakaya HI. Methods for predicting vaccine immunogenicity and reactogenicity. Hum Vacc Immunother. 2020;16(2):269–76.

Wang X, Zhai M, Ren Z, Ren H, Li M, Quan D, Chen L, Qiu L. Exploratory study on classification of diabetes mellitus through a combined random forest classifier. BMC Med Inform Decis Mak. 2021;21(1):1–14.

Fan S-KS, Hsu C-Y, Jen C-H, Chen K-L, Juan L-T. Defective wafer detection using a denoising autoencoder for semiconductor manufacturing processes. Adv Eng Inform. 2020;46:101166.

Song D, Vold A, Madan K, Schilder F. Multi-label legal document classification: A deep learning-based approach with label-attention and domain-specific pre-training. Information Systems, p. 101718; 2021.

Gupta N, Bohra S, Prabhu Y, Purohit S, Varma M. Generalized zero-shot extreme multi-label learning. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 527–535; 2021.

Wang J, Zhou F, Wen S, Liu X, Lin Y. Deep metric learning with angular loss. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2593–2601; 2017.

Zhu L, Yang Y. Inflated episodic memory with region self-attention for long-tailed visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4344–4353; 2020.

Ibrahim MA, Khan MUG, Mehmood F, Asim MN, Mahmood W. GHS-NET: a generic hybridized shallow neural network for multi-label biomedical text classification. J Biomed Inform. 2021;116:103699.

Melacci S, Ciravegna G, Sotgiu A, Demontis A, Biggio B, Gori M, Roli F. Domain knowledge alleviates adversarial attacks in multilabel classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence; 2021.

Zhu P, Wang H, Saligrama V. Learning classifiers for target domain with limited or no labels. In: Proceedings of the International Conference on Machine Learning, pp. 7643–7653, PMLR; 2019.

Ruas P, Neves A, Andrade VD, Couto FM, Aragon ME. LasigeBioTM at CANTEMIST: Named entity recognition and normalization of tumour morphology entities and clinical coding of Spanish health-related documents. In: IberLEF@SEPLN, pp. 422–437; 2020.

Bengio S, Dembczynski K, Joachims T, Kloft M, Varma M. Extreme classification (Dagstuhl Seminar 18291). In: Dagstuhl Reports, vol. 8, Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik; 2019.

Dinh A, Miertschin S, Young A, Mohanty SD. A data-driven approach to predicting diabetes and cardiovascular disease with machine learning. BMC Med Inform Decis Mak. 2019;19(1):1–15.

Fang M, Chen Y, Xue R, Wang H, Chakraborty N, Su T, Dai Y. A hybrid machine learning approach for hypertension risk prediction. Neural Computing and Applications, pp. 1–11; 2021.

Zgodic A, Zahnd WE, Miller DP Jr, Studts JL, Eberth JM. Predictors of lung cancer screening utilization in a population-based survey. J Am Coll Radiol. 2020;17(12):1591–601.

Jaworsky M, Tao X, Yong J, Pan L, Zhang J, Pokhrel SR. Knowledge-Based Nonlinear to Linear Dataset Transformation for Chronic Illness Classification. Health Information Science. Lecture Notes in Computer Science, vol 14305. Springer, Singapore. HIS 2023.

Zeng X, Tu X, Liu Y, Fu X, Su Y. Toward better drug discovery with knowledge graph. Curr Opin Struct Biol. 2022;72:114–26.

Nicholson DN, Greene CS. Constructing knowledge graphs and their biomedical applications. Comput Struct Biotechnol J. 2020;18:1414–28.

Bitew FH, Nyarko SH, Potter L, Sparks CS. Machine learning approach for predicting under-five mortality determinants in Ethiopia: evidence from the 2016 Ethiopian demographic and health survey. Genus. 2020;76(1):1–16.

Prashanth R, Roy SD. Novel and improved stage estimation in Parkinson’s disease using clinical scales and machine learning. Neurocomputing. 2018;305:78–103.

Ricciardi C, Valente AS, Edmund K, Cantoni V, Green R, Fiorillo A, Picone I, Santini S, Cesarelli M. Linear discriminant analysis and principal component analysis to predict coronary artery disease. Health Inform J. 2020;26(3):2181–92.

Georgakopoulos SV, Tasoulis SK, Vrahatis AG, Plagianakos VP. Convolutional neural networks for toxic comment classification. In: Proceedings of the 10th Hellenic conference on artificial intelligence, pp. 1–6; 2018.

Yeh A. More accurate tests for the statistical significance of result differences. arXiv preprint arXiv:cs/0008005; 2000.

Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A. RUSBoost: Improving classification performance when training data is skewed. In: Proceedings of the 2008 19th International Conference on Pattern Recognition, pp. 1–4, IEEE; 2008.

Carrasco J, Lison F, Weintraub A. RUSBoost: A suitable species distribution method for imbalanced records of presence and absence. A case study of twenty-five species of Iberian bats. bioRxiv; 2021.

Sahoo D, Liu C, Hoi SC. Malicious URL detection using machine learning: a survey. arXiv preprint arXiv:1701.07179; 2017.
