A new technique that combines genomics with the mathematical field of algebraic topology can rapidly reveal the complex evolution of organisms, capturing gene flows at different evolutionary scales. It should help scientists better understand the origin of pandemic viruses, gene transfer in bacteria, hybrid species in eukaryotes (organisms whose cells contain a membrane-enclosed nucleus), and migrations in humans.
The technique, developed by Raul Rabadan, PhD, assistant professor of systems biology and biomedical informatics; MD/PhD student Joseph Chan; and Stanford mathematician Gunnar Carlsson, captures many evolutionary processes that are missed by other methods. It was published in the Nov. 12 issue of the Proceedings of the National Academy of Sciences.
Genomes can be viewed as points in a very high dimensional space that represents evolution. Algebraic topology computes simple, robust objects of these spaces that can be mapped to standard evolutionary processes.
The simplest and most familiar of these spaces is a phylogenetic tree, the standard way of representing the evolutionary history of species. However, a tree fails to capture the exchange of genomic information between different organisms, a pervasive phenomenon in viruses, bacteria, and some eukaryotes. The new technique displays these events on a mathematical structure that captures three key ingredients: the type of exchange of genomic material, the frequency of the exchanges, and the evolutionary scale at which they take place.
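To make the idea concrete, here is a minimal Python sketch of how a topological summary can separate tree-like evolution from gene exchange. It is a toy under stated assumptions, not the researchers' pipeline. It assumes genomes are equal-length strings compared by Hamming distance, that two genomes are "connected" when their distance is at most a chosen scale eps, and that the quantities of interest are the number of connected clusters (b0) and the number of independent loops (b1). The function names and the four-letter sequences are invented for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # highest set bit -> row that owns that pivot
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti_numbers(seqs, eps):
    """Return (b0, b1) of the complex built at distance scale eps.

    b0 counts connected clusters; b1 counts loops that are not filled in
    by triangles. Illustrative assumption: a loop signals genetic exchange.
    """
    n = len(seqs)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if hamming(seqs[i], seqs[j]) <= eps]
    edge_index = {e: k for k, e in enumerate(edges)}
    # A triangle is included when all three of its sides are edges.
    triangles = [t for t in combinations(range(n), 3)
                 if all(p in edge_index for p in combinations(t, 2))]
    # Boundary maps mod 2: an edge maps to its two endpoints,
    # a triangle maps to its three sides.
    rank_d1 = gf2_rank([(1 << i) | (1 << j) for i, j in edges])
    rank_d2 = gf2_rank([sum(1 << edge_index[p] for p in combinations(t, 2))
                        for t in triangles])
    b0 = n - rank_d1
    b1 = (len(edges) - rank_d1) - rank_d2
    return b0, b1


# Hypothetical data: a chain of single mutations (tree-like history).
tree_like = ["AAAA", "CAAA", "CCAA", "CCGA"]
# Hypothetical data: two halves swapped independently, which no tree can fit.
with_exchange = ["AAAA", "CCAA", "CCGG", "AAGG"]

print(betti_numbers(tree_like, 1))      # (1, 0): one cluster, no loop
print(betti_numbers(with_exchange, 2))  # (1, 1): one cluster, one loop
print(betti_numbers(with_exchange, 4))  # (1, 0): the loop closes at a larger scale
```

Scanning eps from small to large shows at which scale a loop appears and at which scale it disappears, which is one simple way to read off the "evolutionary scale" of an exchange in this toy setting.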
The evolution of viruses is particularly difficult to study because different strains frequently swap and merge portions of their genomes. Applying the new technique to genomic data from influenza viruses, the researchers were able to uncover complex patterns of gene exchange that are invisible to standard phylogenetic analysis. In the case of the bird flu outbreak in China earlier this year, the researchers were able to rapidly uncover the origins of the responsible virus, which resulted from a complex mixing of three different viruses circulating in birds.
This work is the first step of an ambitious program that links evolution to algebraic topology based on genomic data. The work is being extended to many other species, including humans, with fast developments on both the mathematical and biological sides.