Background Adjusted Alignment-Free Dissimilarity Measures Improve the Detection of Horizontal Gene Transfer

Horizontal gene transfer (HGT) plays an important role in the evolution of microbial organisms, including bacteria. Alignment-free methods based on the compositional information of a single genome have been used to detect HGT. Currently, Manhattan and Euclidean distances based on tetranucleotide frequencies are the most commonly used alignment-free dissimilarity measures for this purpose. By testing on simulated bacterial sequences and on real data sets with known horizontally transferred genomic regions, we found that more advanced alignment-free dissimilarity measures, such as CVTree and d2*, which take the background Markov sequences into account, detect HGT with significantly improved performance.
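As an illustration of the baseline approach described above, the following is a minimal sketch of a sliding-window scan that compares the tetranucleotide frequencies of each window with those of the whole genome using Manhattan distance. The function names, window size, and step size are hypothetical choices made for this example and are not taken from the study.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_freqs(seq, k=4):
    """Relative frequencies of all 4**k words over ACGT, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[w] for w in words)  # words with other symbols are ignored
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def manhattan(p, q):
    """Manhattan (L1) distance between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def window_scores(genome, window=5000, step=1000, k=4):
    """Score each window by its distance from the whole-genome profile.

    Window and step sizes are illustrative assumptions.
    """
    genome_profile = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        profile = kmer_freqs(genome[start:start + window], k)
        scores.append((start, manhattan(profile, genome_profile)))
    return scores
```

Windows with unusually high scores are candidates for horizontally transferred regions; how to choose the cutoff is left open in this sketch.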

We also studied how different factors, such as the evolutionary distance between host and donor sequences, the size of the sliding window, and the host genome composition, influence the performance of alignment-free methods for detecting HGT. Our study showed that alignment-free methods can predict HGT accurately when the host and donor genomes differ at the order level. Among all methods, CVTree with word length 3, d2* with word length 3 and Markov order 1, and d2* with word length 4 and Markov order 1 outperform the others, achieving the highest F1-scores and showing the greatest robustness to these factors.
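To make the idea of background adjustment concrete, here is a hedged sketch of one common formulation of the d2* dissimilarity with a first-order Markov background. Each word count is centered by its expected count under the background model and scaled by the square root of that expectation. The result is 0.5 times one minus the cosine between the two standardized count vectors. Estimating the Markov model from each sequence itself, the pseudocount, and all function names are assumptions of this example, not details confirmed by the study.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def word_counts(seq, k):
    """Count overlapping words of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov1_word_prob(seq, pseudocount=1.0):
    """Return a function giving a word's probability under a first-order
    Markov chain estimated from seq (the pseudocount is an assumption)."""
    single, pair = word_counts(seq, 1), word_counts(seq, 2)
    total = sum(single[a] + pseudocount for a in ALPHABET)
    init = {a: (single[a] + pseudocount) / total for a in ALPHABET}
    trans = {}
    for a in ALPHABET:
        row = sum(pair[a + b] + pseudocount for b in ALPHABET)
        for b in ALPHABET:
            trans[a + b] = (pair[a + b] + pseudocount) / row

    def prob(word):
        p = init[word[0]]
        for a, b in zip(word, word[1:]):
            p *= trans[a + b]
        return p

    return prob


def d2star(x, y, k=3):
    """d2* dissimilarity in [0, 1] between sequences x and y."""
    cx, cy = word_counts(x, k), word_counts(y, k)
    px, py = markov1_word_prob(x), markov1_word_prob(y)
    n, m = len(x) - k + 1, len(y) - k + 1
    num = sx = sy = 0.0
    for w in ("".join(p) for p in product(ALPHABET, repeat=k)):
        ex, ey = n * px(w), m * py(w)    # expected counts under the background
        xt, yt = cx[w] - ex, cy[w] - ey  # centered counts
        num += xt * yt / sqrt(ex * ey)
        sx += xt * xt / ex
        sy += yt * yt / ey
    if sx == 0 or sy == 0:
        raise ValueError("d2* is undefined when all centered counts are zero")
    return 0.5 * (1 - num / (sqrt(sx) * sqrt(sy)))
```

In a sliding-window setting, `x` would be a window and `y` the host genome, with `k=3` or `k=4` matching the word lengths reported above.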
