Threshold sensitivity analysis for HIV-1 transmission cluster detection using different genomic regions and subtypes.

Journal: Virology
Published:
Abstract

HIV-1 cluster analysis has been widely used to characterize HIV-1 transmission, and some countries have implemented such molecular epidemiology as part of their prevention strategy. However, HIV-1 sequences derive from varying genome regions, which affects phylogenetic clustering outputs. Here, we apply different tools to run a sensitivity analysis assessing which thresholds give the most cohesive clustering outputs for different data sources. We used a dataset of 174 full-length sequences of subtype B from the Swiss HIV Cohort Study and publicly available subtype C from South Africa. Each dataset was divided into sub-genomic sub-datasets covering gag, pol, and env. pol was further subdivided into regions commonly used in HIV-1 genotyping laboratories (pr-rt, rt-int, and pr-rt-int). Cluster analyses for each sub-genomic region were performed in ClusterPicker, specifying genetic distance thresholds of 0.5 %-4.5 % and tree branch support of 70 %, 90 % and 99 %. Tree topologies and clustering outputs were compared against each other to assess cluster similarity. Phylogenies using pol, pr-rt-int, or rt-int had more robust tree topologies than those using gag and env. Cluster composition changed with increasing genetic distance threshold but was not affected by branch support. Cluster identity was most similar around genetic distances of 2.5 (±0.5) % for all sub-genomic regions and for both subtype B and C. Our study demonstrates the value of performing a sensitivity analysis before setting a genetic distance threshold for clustering, and shows that the pol region is appropriate for cluster analysis and can be used for near real-time HIV-1 cluster detection.
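
The threshold sweep described in the abstract can be illustrated with a minimal sketch. This is not the ClusterPicker algorithm, which works on a phylogenetic tree and also applies branch support; it swaps in simple single-linkage clustering on pairwise genetic distances, and it assumes that cluster similarity between two regions is scored as the Jaccard index over co-clustered sequence pairs. All function names, sequence labels, and distance values below are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def cluster_at_threshold(distances, threshold):
    """Single-linkage clusters: join any pair with distance <= threshold.

    `distances` maps (seq_a, seq_b) tuples to a pairwise genetic distance.
    Only clusters with at least two sequences are returned.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        root_a, root_b = find(a), find(b)
        if d <= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return [g for g in groups.values() if len(g) >= 2]


def co_clustered_pairs(clusters):
    """All unordered sequence pairs that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}


def jaccard(clusters_a, clusters_b):
    """Jaccard index over co-clustered pairs (1.0 if neither has clusters)."""
    pairs_a = co_clustered_pairs(clusters_a)
    pairs_b = co_clustered_pairs(clusters_b)
    if not pairs_a and not pairs_b:
        return 1.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def sweep(dist_a, dist_b, thresholds):
    """Cluster similarity between two regions at each distance threshold."""
    return {
        t: jaccard(cluster_at_threshold(dist_a, t), cluster_at_threshold(dist_b, t))
        for t in thresholds
    }


# Hypothetical pairwise distances (substitutions/site) for two regions.
pol = {("s1", "s2"): 0.010, ("s1", "s3"): 0.030, ("s2", "s3"): 0.032,
       ("s1", "s4"): 0.080, ("s2", "s4"): 0.081, ("s3", "s4"): 0.079}
env = {("s1", "s2"): 0.020, ("s1", "s3"): 0.060, ("s2", "s3"): 0.058,
       ("s1", "s4"): 0.120, ("s2", "s4"): 0.118, ("s3", "s4"): 0.121}

# Thresholds spanning 0.5 % to 4.5 %, as in the abstract.
thresholds = [0.005, 0.015, 0.025, 0.035, 0.045]
results = sweep(pol, env, thresholds)
for t in thresholds:
    print(f"{t * 100:.1f} %: similarity = {results[t]:.2f}")
```

In this toy data the two regions agree at the 2.5 % threshold and diverge on either side of it, which mirrors the kind of pattern a sensitivity analysis is meant to reveal; the values carry no empirical meaning.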

Relevant Conditions

HIV/AIDS