Thursday, November 30, 2006

Sequencing technology and the Neanderthal Genome

You've probably read about the two recent papers in Science and Nature reporting the sequencing of portions of the Neanderthal genome. (Subscription required for full text of the Nature and Science papers. Check out Google for the many news stories on this.) This is exciting work, but I'm not really going to comment on the significance of the results - instead, I think it's worth understanding how new sequencing technology enabled researchers to sequence 1 million bases of 38,000-year-old DNA.

Each group used a different sequencing technology, and as a result, the amount of sequence they obtained differed widely: the Nature group got about 1 million base pairs of sequence, while the Science group got about 65,000 base pairs. (Recall that the human genome contains about 3 billion base pairs, and the Neanderthal genome was undoubtedly similar.) The Neanderthal genome is a challenge because, obviously, any DNA remaining in the 30,000- to 40,000-year-old bones we have is highly fragmented, and the amount of contaminating DNA from microbes (not to mention scientists) is significant. To eventually cover the entire genome, we need a method that can generate lots and lots of sequence without being prohibitively expensive.
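To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the round figures quoted above (about 3 billion base pairs for the genome, 1 million and 65,000 base pairs for the two groups). The function name is my own, and the calculation assumes every sequenced base is unique, non-overlapping Neanderthal DNA, which is an optimistic simplification.

```python
# Approximate genome size quoted in the post (the Neanderthal genome is
# assumed to be about the same size as the human genome).
GENOME_BP = 3_000_000_000

def fraction_covered(sequenced_bp, genome_bp=GENOME_BP):
    """Return the fraction of the genome represented by sequenced_bp bases.

    Assumes no overlap between reads and no contaminating DNA, so this is
    an upper bound on the real coverage.
    """
    return sequenced_bp / genome_bp

# Round figures for the two papers, as quoted in the post.
results = {"Nature group": 1_000_000, "Science group": 65_000}

for label, bp in results.items():
    print(f"{label}: {bp:,} bp = {fraction_covered(bp):.4%} of the genome")
```

Even the larger data set is only about three hundredths of one percent of the genome, which is why throughput and cost matter so much here.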

The Science group used a more traditional method, which works well for most large-scale genome sequencing efforts, but is not really well-suited for getting huge chunks of the Neanderthal genome. As a result, this group obtained only 65,000 base pairs of DNA sequence (which is still a significant accomplishment). The big problem with this method is that the fragments of DNA isolated from Neanderthal bones have to be cloned before they can be sequenced. (In layman's terms - the DNA fragments have to be placed inside circular pieces of DNA called plasmids, which can then be grown in large quantities inside bacteria.) Many fragments of Neanderthal DNA fail to be cloned at this point, meaning that you lose much of the sample that was painstakingly isolated from the ancient bones.

The Nature group used something called pyrosequencing, which is done on machines called 454 sequencers. Crucially, this technique does not require a cloning step, which means much more of the isolated sample gets sequenced. Pyrosequencing also produces lots and lots of sequencing data very quickly. (One major downside is that each sequence is much shorter than what you get using traditional Sanger sequencing, by a factor of ten at least. But in this case, the Neanderthal DNA fragments are so short that this doesn't matter.)

You can read a really nice explanation of how pyrosequencing works at 454's web site. (For more technical coverage, look here.) Without this technology, a Neanderthal genome project would not be feasible. With this technology, we can now consider all sorts of sequencing projects that would not have been financially or technically feasible before - not just Neanderthal sequencing, but also large-scale studies of gene variation in natural populations, including humans. Such large-scale sequencing could help us close in on the genes involved in complex diseases.

I said I was going to talk about sequencing, but I can't resist making a plug for completing the Neanderthal genome. It's interesting to learn about the changes in the genome that took place during evolution, but it's also extremely useful to have a more closely related genome as we try to find and understand the functional portions of the human genome. Having multiple genomes of close species has helped enormously in flies, worms, and yeast (for examples, check out this, this, and this). As is usual in almost any genome-level study of human biology, you can't get far without using an understanding of evolution.

1 comment:

genexs said...

Thanx for a good review of some of the recent Neanderthal research. You have a great blog.