A Novel Framework for Predicting Phage-Host Interactions via Host Specificity-Aware Graph Autoencoder.
Owing to the overuse of antibiotics, some pathogenic bacteria have developed resistance to most antibiotics, leading to the emergence of antibiotic-resistant superbugs. Researchers have therefore turned to phage therapy to treat bacterial infections. A fundamental step in phage therapy is the accurate identification of phage-host interactions. Although various methods have been proposed, existing methods suffer from two shortcomings: 1) they fail to make full use of the genetic information of phages, which includes both genome and protein sequences; 2) they do not explicitly use the host specificity of phages when learning representations of phages and bacteria. In this paper, we present PHISGAE, an efficient computational method for predicting phage-host interactions that explicitly exploits host specificity. First, initial phage-phage connections are constructed efficiently from phage genome and protein sequences. Then, a refined heterogeneous network is derived by applying a K-nearest neighbor strategy, which retains the more meaningful local semantics among phages and bacteria. Finally, a host specificity-aware graph autoencoder is proposed to learn high-quality representations of phages and bacteria for predicting phage-host interactions. Experimental results show that PHISGAE outperforms state-of-the-art methods in predicting phage-host interactions at both the species level and the genus level (AUC values of 94.73% and 96.32%, respectively). Moreover, a case study demonstrates that PHISGAE can identify high-probability candidate hosts for previously unseen phages identified from metagenomic data, and can thus effectively predict potential phage-host interactions in real-world applications.
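
The K-nearest neighbor refinement step can be illustrated with a minimal sketch. It assumes a precomputed, symmetric phage-phage similarity matrix. The function name `knn_refine`, the rule of keeping an edge when either endpoint selects it, and the toy similarity values are all assumptions made for illustration; they are not taken from the PHISGAE implementation, which may differ in every one of these choices.

```python
import numpy as np


def knn_refine(similarity: np.ndarray, k: int) -> np.ndarray:
    """Keep only each node's k most similar neighbors (hypothetical helper).

    An edge survives if at least one of its endpoints ranks the other
    among its top-k neighbors, so the result stays symmetric.
    """
    n = similarity.shape[0]
    ranked = similarity.astype(float).copy()
    np.fill_diagonal(ranked, -np.inf)  # never rank a node as its own neighbor

    # Column indices of the k largest similarities in each row.
    top_k = np.argsort(-ranked, axis=1)[:, :k]

    mask = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), top_k.shape[1])
    mask[rows, top_k.ravel()] = True
    mask |= mask.T  # union rule: keep the edge if either side chose it

    refined = np.where(mask, similarity, 0.0)
    np.fill_diagonal(refined, 0.0)  # drop self-loops
    return refined


# Toy similarity values for four phages (illustrative only).
similarity = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.8, 0.3],
    [0.1, 0.8, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
])
refined = knn_refine(similarity, k=1)
print(refined)
```

With `k=1`, weak links such as the one between the first and fourth phage are removed, while each phage keeps its strongest connection. The same idea extends to a heterogeneous network by applying the filter to each block of the adjacency matrix separately, although how the paper treats the phage-bacterium block is not stated in the abstract.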