Interrelated feature selection from health surveys using domain knowledge graph

