Comparing genome alignment methods


One of my current projects in the Matthew Hahn Lab is to evaluate a few different whole-genome alignment methods. My mentor and I have been studying a new program called progressiveCactus and comparing its output to that of other alignment methods. Our yardstick is the number of indels (that is, insertions and deletions) that each method infers between different species: since every method is given the same genomes, differences in those counts point to differences in how the aligners behave. So far, though, most of my time has gone into figuring out how to get the programs to run and deciding on the best way to parse their output files.
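To make the indel comparison concrete, here is a minimal Python sketch of the counting step. It is not our actual pipeline: it assumes each aligner's output has already been reduced to a pair of equal-length gapped rows (with `-` marking a gap), and the function name `count_indels` and the toy sequences are made up for illustration. A run of consecutive gap columns is counted as one indel event.

```python
from itertools import groupby


def count_indels(ref_row: str, query_row: str) -> dict:
    """Count indel events in one pairwise alignment.

    Both rows must be the same length and use '-' for gaps.
    An insertion is a run of columns gapped in the reference row;
    a deletion is a run of columns gapped in the query row.
    """
    if len(ref_row) != len(query_row):
        raise ValueError("aligned rows must have the same length")

    # Label every column, skipping columns that are gapped in both rows
    # (these can appear when the pair is extracted from a multiple alignment).
    states = []
    for r, q in zip(ref_row, query_row):
        if r == "-" and q == "-":
            continue
        if r == "-":
            states.append("insertion")
        elif q == "-":
            states.append("deletion")
        else:
            states.append("aligned")

    # Collapse consecutive identical labels so each gap run counts once.
    counts = {"insertions": 0, "deletions": 0}
    for state, _run in groupby(states):
        if state == "insertion":
            counts["insertions"] += 1
        elif state == "deletion":
            counts["deletions"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical rows for the same two species from two different aligners.
    alignments = {
        "method_A": ("ACG--TTAGC", "ACGAATT--C"),
        "method_B": ("AC-G-TTAGC", "ACAGATTAGC"),
    }
    for method, (ref_row, query_row) in alignments.items():
        print(method, count_indels(ref_row, query_row))
```

Counting gap runs rather than gap columns is a design choice: one long insertion then counts once regardless of its length. If the lengths matter, the `_run` iterator can be consumed to record them.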

How does progressiveCactus work, you ask? When I first tried to answer that question, right after I began working in the Hahn Lab, I couldn't figure a thing out. Now that I have much more experience in bioinformatics and the analysis of complex systems, it makes a lot more sense to me.
Circular genome plot
To let multiple genomes align to one another in any possible way, we can arrange them in a circular pattern, as shown above. Each sequence is drawn as a thread of its own color. The ends of the boxes (A1 and A4) are the telomeres, that is, the ends of the chromosomes. In this layout it's easy to spot reverse complements, similarities, and other neat features. When we combine all of the different circular genome plots this way, we can create "cactus" graphs.
Pictured: a cactus
From these chains, we can build networks upon networks that give us fully aligned genomes.

One of the programs we're comparing it against is progressiveMauve, which has been shown to be very quick and effective with a small number of genomes, and which has a very attractive GUI as well. We're focusing on comparing its output to that of progressiveCactus.



Ever since I finished my work at Cornell, I've been much more confident and focused in my research at IU. I look forward to moving on to bigger and better things in research and elsewhere.



