townsley helps find covid-19 vaccine solution one genome sequence at a time
lipscomb alumni are leading the way to develop tools to aid researchers in finding a vaccine to fight covid-19.
kim chaudoin
2022 update: the college of computing and technology is now the school of computing and the school of data analytics and technology.
while the world is feverishly working to develop a covid-19 vaccination, one lipscomb alumnus has been working with a group of scientists and researchers from around the world since march to build a pangenomic tool to aid vaccine research.
last week, thomas townsley, who graduated from lipscomb in december 2019 with a master of science degree in software engineering, was among a team of bioinformatics researchers who released pantograph™, a visual browser for graph genomes. graph genomes are a new way of capturing sequence data that helps researchers study the sequence diversity of the virus, which in turn supports successful vaccine development. the software is free and open source. townsley says his research team, led by josiah seaman, a ph.d. candidate at queen mary university of london, hopes to “get it into the hands of as many researchers as possible” to expedite the development of a covid-19 vaccination.
the problem that townsley and his team of researchers have been working to solve is how to give vaccine developers the greatest amount of true genetic diversity to generate a selection sweep against covid-19 when introducing a new vaccine.
“when we are talking about genomes, historically speaking we have worked with a type of data structure called a multiple sequence alignment,” explains townsley. “what you do in this process is take a lot of different sequences — like my dna sequence, your dna sequence and if we sequenced 60 other people’s dna — and stack all of these dna sequences on top of each other and then compare them and see how similar or different we are. the one individual we align the 60 sequences to is the linear reference genome which usually isn't a single person but more of a consensus.”
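the stacking and comparing that townsley describes can be illustrated with a minimal sketch. it assumes the sequences are already aligned to equal length and uses made-up toy sequences; the function names `consensus` and `differences` are hypothetical and not part of pantograph.

```python
from collections import Counter


def consensus(aligned):
    """return the column-wise majority base of already-aligned sequences."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    # zip(*aligned) walks the stack of sequences one column at a time
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def differences(seq, reference):
    """list (position, reference_base, sample_base) wherever seq differs."""
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, seq)) if r != s]


# toy, invented sequences (not real genomic data)
aligned = ["ACGTA", "ACGTA", "ACCTA", "TCGTA"]
reference = consensus(aligned)
print(reference)
print(differences("ACCTA", reference))
```

the consensus here matches two of the four toy sequences and hides the variants carried by the other two, which is the loss of information the next section describes.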
eliminating reference bias in genomes
townsley says reference bias arises when individuals are aligned to a reference genome. because the reference is a “consensus,” it does not capture the full genomic diversity of a population. as a result, sample reads that differ greatly from the reference do not map correctly: they are either mapped to the wrong position in the genome or remain unmapped altogether. incorrect read mapping in turn leads to false negative or false positive variant calls. this causes problems for researchers as they work to develop solutions or vaccines to serve a greater population.
what townsley’s group has focused on is identifying real structural variations in sars-cov-2. eliminating reference bias makes these variations easier to identify by removing false positives. a pangenome is the entire gene set of all strains of a species. it includes the core genome, meaning the genes present in all strains, and the variable genome, meaning the genes present only in some strains.
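a toy illustration of how reads that diverge from the reference can end up unmapped. this is a deliberately naive sketch that slides a read along the reference and counts mismatches; real read mappers work very differently, and the function name `map_read`, the mismatch threshold and the sequences are all assumptions made for illustration.

```python
def map_read(read, reference, max_mismatches=1):
    """return the best start position of read in reference, or None if unmapped.

    a position is accepted only if it has at most max_mismatches mismatches.
    """
    best_pos = None
    best_mismatches = max_mismatches + 1
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = start, mismatches
    return best_pos


# invented toy sequences, not real sars-cov-2 data
reference = "ACGTACGGTCA"
print(map_read("TACGG", reference))  # read identical to part of the reference
print(map_read("TTTGG", reference))  # divergent read from the same region
```

the second read differs from every window of the reference at two or more positions, so with this threshold it is left unmapped and its variants would never be reported.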
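the pangenome definition above maps directly onto set operations. the sketch below uses invented strain and gene names purely for illustration; `split_pangenome` is a hypothetical helper, not part of any tool mentioned in this article.

```python
def split_pangenome(strains):
    """return (pangenome, core, variable) gene sets for a dict of strains."""
    gene_sets = list(strains.values())
    if not gene_sets:
        return set(), set(), set()
    pangenome = set().union(*gene_sets)      # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    variable = pangenome - core              # genes present in only some strains
    return pangenome, core, variable


# hypothetical gene content for three made-up strains
strains = {
    "strain_a": {"gene1", "gene2", "gene3"},
    "strain_b": {"gene1", "gene2", "gene4"},
    "strain_c": {"gene1", "gene2", "gene3", "gene5"},
}

pangenome, core, variable = split_pangenome(strains)
print(sorted(core))
print(sorted(variable))
```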
“using our previous illustration, we don’t want to lose information from the 60 individuals’ genomes we sequenced. we want to see how each individual in this group's genes vary from each other but we aren’t losing that information because of reference bias,” says townsley. “pangenomes are powerful entities that are going to play a part, particularly for the next 20 years or so, in genetics research especially as we develop drugs that are more targeted than they are now because we have a more precise description of what we are targeting.”
in the case of coronavirus, townsley says, proteins in sars-cov-2, the virus that causes what has become known around the world as covid-19, are highly prone to mutation. he said there are clinical trials for vaccines targeting the spike protein on the outside of the virus that rely on a lack of genetic diversity. but if strains with a mutated form of the protein exist, the vaccine would be rendered ineffective against them and the virus would continue to spread. sars-cov-2 is an rna virus, and its capacity to mutate is amplified by the high number of people it has infected and continues to infect.
“we have already seen how coronavirus has mutated over time. so the more genetic information we have about coronavirus, the better our vaccines get,” he says. “but we also don’t want to make vaccines that have implicit reference bias in them. so that’s the gist of the problem we are trying to solve.”
unlocking the next level of population genetics
the pantograph project, which began in 2018 before covid-19 emerged as a way to unlock the next level of population genetics for researchers, quickly became more relevant to the current pandemic. according to the pantograph website, “the success or failure of our efforts to fight the covid-19 disease rely upon the sequence diversity of the virus itself. tests for infection rely on knowing the exact sequence being tested. a rearrangement in the order of genes, even if the content is the same, will return a false negative test.”
the project cites the common cold, which can also be caused by coronaviruses, as an example of an illness that is difficult to eradicate because of the number of people infected and thus the high number of mutations that exist around the world. “current sequencing techniques may be under-representing the full sequence diversity of the virus because they are reference based. eliminating reference bias and enabling species genetic diversity on thousands of individuals is the core goal of using a graph genome,” according to the project’s findings.
townsley said pantograph is a small piece of the worldwide effort to find a vaccine and eradicate covid-19, and that it could also have a wide range of disease applications in the future. the short-term goal of the project is to extend the pantograph tool with features that make it easier to study the sars-cov-2 mutants as the situation develops.
pantograph has developed the first graph genome browser design with the capability to scale to thousands of individuals and still show the individual’s nucleotide sequence. this means it’s uniquely suited to providing a global overview of species genetic diversity with the option to zoom in on small features.
genome graph solution
townsley explains that linear reference genome representations in current use cannot capture non-reference sequence variation when visualizing hundreds of individuals. genome graphs are an attractive solution to visualize and investigate the whole variation within large sets of individuals at a glance. structural variants are associated with many diseases, including childhood cancers and schizophrenia. in order to make sequencing data actionable for clinicians, structural variants need to be put in the larger context of all known genetic variation. understanding genetic variation in plants is equally important for identifying pest and pathogen resistance and increasing crop yields to support a growing population.
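to make the idea of a genome graph concrete, here is a minimal sketch of one common representation: nodes hold short sequence segments, and each individual is a path through those nodes. the node contents, sample names and helper functions are invented for illustration and do not reflect pantograph's actual data format.

```python
from collections import Counter

# nodes hold sequence segments; nodes 2 and 3 are alternative versions of one site
nodes = {1: "ACG", 2: "T", 3: "C", 4: "GGA"}

# each individual is a path of node ids (all names here are hypothetical)
paths = {
    "sample_1": [1, 2, 4],
    "sample_2": [1, 3, 4],
    "sample_3": [1, 4],      # skips the variable site entirely (a deletion)
    "sample_4": [1, 2, 4],
}


def spell(path, nodes):
    """reconstruct an individual's sequence by walking its path."""
    return "".join(nodes[node_id] for node_id in path)


def link_counts(paths):
    """count how many individuals traverse each link between two nodes."""
    counts = Counter()
    for path in paths.values():
        counts.update(zip(path, path[1:]))
    return counts


for name, path in paths.items():
    print(name, spell(path, nodes))
print(link_counts(paths))
```

no single sample is treated as the reference here: every individual's sequence can be recovered from its own path, and shared links show which variants are common.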
the pantograph project developed pangenome schematics, the first visualization with the scalability to render graph genomes from thousands of individuals over gigabase genome sizes with the ability to show both whole chromosome features as well as zoom into nucleotide sequence variation. townsley says that no tool to date has satisfied all these criteria. scalability is accomplished through graph sorting and binning adjacent sequences to create shared syntenic blocks called components. non-linear structural variants can be shared by many individuals as links across the pangenome.
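the general idea of binning can be sketched as follows. this is not pantograph's implementation; it is a simplified illustration that assumes positions have already been laid out along one linear coordinate, and the function name `bin_coverage` and its parameters are hypothetical.

```python
def bin_coverage(covered_positions, genome_length, bin_width):
    """return, per fixed-width bin, the fraction of positions one individual covers."""
    if genome_length <= 0 or bin_width <= 0:
        raise ValueError("genome_length and bin_width must be positive")
    n_bins = -(-genome_length // bin_width)  # ceiling division
    counts = [0] * n_bins
    for pos in set(covered_positions):       # ignore duplicate positions
        if 0 <= pos < genome_length:
            counts[pos // bin_width] += 1
    # the last bin may be narrower than bin_width
    sizes = [min(bin_width, genome_length - i * bin_width) for i in range(n_bins)]
    return [count / size for count, size in zip(counts, sizes)]


# toy example: a 10-position coordinate, bins of width 4, positions 0-5 covered
print(bin_coverage(range(0, 6), genome_length=10, bin_width=4))
```

summarizing each individual as one value per bin rather than one value per nucleotide is what lets a viewer show a whole chromosome at once and defer the nucleotide detail until the user zooms in.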
townsley joined the core project team in march as the point person responsible for maintaining the graph genome pipeline and as the release engineer to push the product out to researchers.
“our aim is to get this in the hands of comparative genome researchers, and that extends to vaccine research,” he says. “our hope is that anyone who has an interest in really studying the genetic variations of coronavirus can use our software to pick up on variations in coronavirus that they would not be able to see otherwise due to the limitation of other multiple sequence alignment technologies.”
the pantograph™ pangenome graph browser for sars-cov-2 may be viewed at graphgenome.org.
making a difference in the world
being part of a project that has the potential to have a positive impact on the lives of people around the world is very satisfying for townsley.
townsley, a sulphur, louisiana native who earned his undergraduate degree from loyola university in 2016, has a background in management and information systems. he says science has always been interesting to him. after getting a “good job” out of college in the corporate world, townsley says he didn’t feel like he was making a contribution to society. he decided to return to school and saw that lipscomb was one of the few institutions that had an interdisciplinary program around computational biology.
“so i reached out to lipscomb’s college of computing & technology,” he recalls. “they were super nice and i told them my story. i was fortunate to work for a couple of years in a good job. but i quit my job and basically dove right into my studies at lipscomb.”
in 2019, prior to graduation, townsley was a supported research assistant in the college of computing & technology with tim wallace, associate professor and chair of the department of computational sciences, engaged in a new research collaboration effort led by joe deweese, associate professor in the department of pharmaceutical sciences in the college of pharmacy.
after graduating from lipscomb in december 2019, townsley had the opportunity to remain at lipscomb as the team’s software developer. for the past year townsley has worked as part of deweese’s research group on a software program that will predict features of protein structure using the amino acid sequence of a protein. this tool will be helpful in the development of new drugs and in studying new proteins.
“thomas has worked with my research collaborators and i over the last year in developing a new software tool to study protein structure,” says deweese. “thomas is a very talented programmer, and he continually comes up with creative ways to solve our programming challenges. we are currently testing our software tool before we prepare to release it for use by researchers worldwide.”
deweese is friends with josiah seaman, who is overseeing the pantograph project. “when josiah put out the call for programmers, i knew thomas would be the right guy for the job,” recalls deweese.
it was a good fit for townsley who had been interested in “doing something in the covid space.” townsley has returned to the college of pharmacy research project now that he has completed his commitment to the pantograph project.
“i’m more of a computer person but i definitely enjoy working with biologists and scientists because they are so passionate about what they do,” he admits. “everything they work on has an immediate quality to it - like this could actually help us target cancer-causing entities. that is really motivating to me coming from a corporate background. i’d much rather be working on something like that where i feel like i’m making a meaningful difference.”
having been part of two major research projects in recent months, townsley discovered first-hand that “science can be really exciting and you don’t necessarily need to be a biologist in a lab to make a meaningful contribution,” he says.
“i don’t consider myself a biologist and i never thought in a million years that i would be working on a problem like this. i had this idea before that science was a niche thing and that only a select few people would actually be able to be a part of it. everyone actually does have a role that they can play.”
to learn more about the pantograph project, visit https://graph-genome.github.io/.
learn more about lipscomb’s college of computing & technology at www.fellaworld.com/computing.