Journal of Guangzhou University (Natural Science Edition). 2024, 23(2): 37-47.
Since late 2019, the widespread outbreak of the novel coronavirus has severely affected public health and social order. Machine-learning-based prediction methods can determine the infectivity phenotype and pandemic risk of coronaviruses. To date, six classes of coronaviruses that infect humans have been identified. Their genomic sequences differ significantly, and their continuous genetic variation degrades the performance of machine learning models and can cause them to forget previously learned knowledge. This study, built on an incremental learning framework, used a One-class SVM algorithm for continuous discrimination of novel coronavirus subgroups. In addition, a combined strategy of parameter sharing and knowledge distillation was used to adapt a backpropagation (BP) neural network for continual learning and prediction of the human-infecting phenotype of coronaviruses. The results indicate that the One-class SVM achieved its best classification performance for the six virus classes with the combination of balancing parameters ν = 0.92, 0.81, 0.24, 0.11, 0.55, and 0.2. The prediction model performed best when the number of hidden-layer nodes was increased to 6, with a maximum Index of Agreement (IAC) of 0.9035 and a maximum Bias Total (BT) of −0.0399. This effectively suppressed the forgetting trend in the network model, bringing its predictive performance close to that of joint training on all data (IAC: 0.9236). This performance was significantly better than that of a neural network without knowledge distillation (IAC: 0.7764). The approach also outperformed other incremental methods, including the sample-based ESRIL (IAC: 0.8662) and the model-parameter-based CCLL (IAC: 0.8853).
This research holds important implications for public health applications.
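The abstract does not give the loss used for knowledge distillation, so the following is only a minimal sketch of one common formulation: cross-entropy on the new label combined with a KL term that keeps the updated network close to the softened outputs of the previous network. The function names, the weight `alpha`, and the `temperature` value are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(new_logits, old_logits, label, alpha=0.5, temperature=2.0):
    """Weighted sum of a hard-label term and a distillation term.

    alpha and temperature are hypothetical defaults chosen for illustration.
    """
    # Hard-label term: cross-entropy of the updated network on the new sample.
    probs = softmax(new_logits)
    ce = -math.log(probs[label])

    # Soft-label term: KL divergence from the previous network's softened
    # outputs (teacher) to the updated network's softened outputs (student).
    teacher = softmax(old_logits, temperature)
    student = softmax(new_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

    # The temperature**2 factor is the usual rescaling of the soft-label term.
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl
```

When the updated network reproduces the previous network's outputs, the KL term vanishes and only the hard-label term remains; the further the two drift apart, the larger the penalty, which is the mechanism by which distillation is expected to limit forgetting.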