A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation

Yükleniyor...
Küçük Resim

Tarih

2022

Dergi Başlığı

Dergi ISSN

Cilt Başlığı

Yayıncı

SPRINGER INDIA, 7TH FLOOR, VIJAYA BUILDING, 17, BARAKHAMBA ROAD, NEW DELHI 110 001, INDIA

Erişim Hakkı

info:eu-repo/semantics/openAccess
Attribution-NonCommercial-NoDerivs 3.0 United States

Özet

Background Nonalcoholic steatohepatitis (NASH), a severe form of nonalcoholic fatty liver disease, can lead to advanced liver damage and has become an increasingly prominent health problem worldwide. Predictive models for early identification of highrisk individuals could help identify preventive and interventional measures. Traditional epidemiological models with limited predictive power are based on statistical analysis. In the current study, a novel machine-learning approach was developed for individual NASH susceptibility prediction using candidate single nucleotide polymorphisms (SNPs). Methods A total of 245 NASH patients and 120 healthy individuals were included in the study. Single nucleotide polymorphism genotypes of candidate genes including two SNPs in the cytochrome P450 family 2 subfamily E member 1 (CYP2E1) gene (rs6413432, rs3813867), two SNPs in the glucokinase regulator (GCKR) gene (rs780094, rs1260326), rs738409 SNP in patatinlike phospholipase domain-containing 3 (PNPLA3), and gender parameters were used to develop models for identifying at-risk individuals. To predict the individual’s susceptibility to NASH, nine different machine-learning models were constructed. These models involved two different feature selections including Chi-square, and support vector machine recursive feature elimination (SVM-RFE) and three classification algorithms including k-nearest neighbor (KNN), multi-layer perceptron (MLP), and random forest (RF). All nine machine-learning models were trained using 80% of both the NASH patients and the healthy controls data. The nine machine-learning models were then tested on 20% of both groups. The model’s performance was compared for model accuracy, precision, sensitivity, and F measure. Results Among all nine machine-learning models, the KNN classifier with all features as input showed the highest performance with 86% F measure and 79% accuracy. Conclusions Machine learning based on genomic variety may be applicable for estimating an individual’s susceptibility for developing NASH among high-risk groups with a high degree of accuracy, precision, and sensitivity.

Açıklama

Anahtar Kelimeler

Algorithm, Artificial intelligence, Disease susceptibility, Fatty liver, Gene, Machine learning, Neural network model, Nonalcoholic fattyliver disease, Nonalcoholic steatohepatitis, Single nucleotide polymorphism, Support vectormachine

Kaynak

Indian Journal of Gastroenterology

WoS Q Değeri

N/A

Scopus Q Değeri

Q3

Cilt

41

Sayı

5

Künye