Title | A Machine Learning Based Computational Method for Prediction of Functional SNPs in Rice Genome |
---|---|
Parallel title | A Machine Learning Based Computational Method for Prediction of Functional SNPs in Rice Genome |
Authors | Rong Li, Zhi-e Lou |
Abstract | Single nucleotide polymorphisms (SNPs) are the most prevalent and stable class of genetic diversity in most organisms. Functional SNPs are the most commonly used genetic markers for diversity studies and molecular breeding in plants, and methods for identifying them quickly are urgently needed. In this work, we present a machine learning based computational approach for identifying functional SNPs in the rice genome. To characterize and prioritize variants, two categories of features are extracted: nucleotide-sequence based features and allele-specific features. In particular, a weighted Euclidean distance is used to measure the changes in transcription factor (TF) binding affinities caused by SNPs. To handle classification on unbalanced data, a support vector machine (SVM) is combined with an oversampling method. We use mRMR (minimum redundancy maximum relevance) to find the optimal feature set, and the results show that our method achieves a sensitivity of ~74.2% and a specificity of ~72.3% under 10-fold cross-validation. Furthermore, the data used to build the proposed prediction model are mainly the sequence context of each SNP and the TF profiles in the JASPAR database, all of which are easy to obtain. The prediction method can therefore be readily applied to other plant species. |
Pages | 317-328 |
Keywords | transcription factor binding affinity, position weight matrix, functional SNP, support vector machine |
Journal | 電腦學刊 (Journal of Computers) |
Issue | 2023-10 (Vol. 34, No. 5) |
DOI | |
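
The abstract describes measuring SNP-induced changes in TF binding affinity with a weighted Euclidean distance. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the two toy position weight matrices, the uniform background frequency, the best-window log-odds scoring, and the equal default weights are all hypothetical and are not taken from the paper. Real profiles would come from the JASPAR database.

```python
import math

# Hypothetical toy position weight matrices (PWMs): one dict of base
# probabilities per motif position. These are NOT real JASPAR profiles.
TOY_PWM_A = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
TOY_PWM_B = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
PWMS = [TOY_PWM_A, TOY_PWM_B]
BACKGROUND = 0.25  # assumed uniform background base frequency


def window_score(pwm, window):
    """Log-odds score of one window that is as long as the motif."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, window))


def best_score(pwm, seq):
    """Best log-odds score over all motif-length windows in seq."""
    width = len(pwm)
    return max(
        window_score(pwm, seq[i:i + width])
        for i in range(len(seq) - width + 1)
    )


def weighted_euclidean(ref_scores, alt_scores, weights):
    """Weighted Euclidean distance between two per-TF score vectors."""
    return math.sqrt(
        sum(w * (r - a) ** 2 for w, r, a in zip(weights, ref_scores, alt_scores))
    )


def affinity_change(pwms, ref_seq, alt_seq, weights=None):
    """Distance between the TF score profiles of the two alleles.

    Equal weights are an assumption made for this sketch only.
    """
    if weights is None:
        weights = [1.0] * len(pwms)
    ref_scores = [best_score(p, ref_seq) for p in pwms]
    alt_scores = [best_score(p, alt_seq) for p in pwms]
    return weighted_euclidean(ref_scores, alt_scores, weights)


# Reference and alternate alleles differ at one position (C > T).
print(affinity_change(PWMS, "TTACGTT", "TTATGTT"))
```

In a pipeline like the one the abstract outlines, a value of this kind would be one allele-specific feature per SNP, passed on to feature selection and the SVM classifier. How the weights are chosen is left open here because the record does not state it.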