This is an archived article.

November 18, 2004

Shotgun sequencing of genome may miss critical areas

A UW study comparing the two most favored techniques for sequencing the human genome has shown that one of the techniques, now the most widely used, gives an oversimplified view of the genome. That incomplete picture leaves out portions of the genome strand that are vital to understanding some genetic diseases, as well as the evolution of humans and other mammals. The study appears in the Oct. 21, 2004, edition of the journal Nature.

Researchers led by Dr. Evan Eichler, associate professor of genome sciences, compared two drafts of the human genome: one sequenced under the federally funded Human Genome Project, and another sequenced by Celera in a privately funded effort. The Human Genome Project relied on an approach called clone-ordered sequencing, in which the entire genome is broken down into 30,000 to 40,000 large projects, each with about 150,000 base pairs, the building blocks of DNA. Those projects are sequenced to completion, and then painstakingly reassembled. The process can be time-consuming and expensive.
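As a rough check on the scale of the clone-ordered approach, the figures above can be multiplied out. This is only a back-of-the-envelope sketch: the genome size of about 3 billion base pairs is an assumption added here, not a figure from the study, and the function name is illustrative.

```python
CLONE_SIZE_BP = 150_000          # "about 150,000 base pairs" per project (from the article)
GENOME_SIZE_BP = 3_000_000_000   # assumption: rough human genome size, not stated in the article


def total_clone_bases(n_projects, clone_size_bp=CLONE_SIZE_BP):
    """Total bases sequenced if every project covers clone_size_bp bases."""
    return n_projects * clone_size_bp


for n_projects in (30_000, 40_000):
    total = total_clone_bases(n_projects)
    ratio = total / GENOME_SIZE_BP
    print(f"{n_projects:,} projects: {total:,} bases, about {ratio:.1f}x the assumed genome size")
```

Under that assumption the projects add up to more bases than the genome contains, which would be consistent with neighboring projects overlapping so they can be reassembled in order.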

The privately funded effort, however, used a technique called whole-genome shotgun sequencing, which can be faster and less expensive. Shotgun sequencing involves breaking the genome into millions of smaller sections and sequencing them at random as part of a single project. The genome sequence is then assembled by computer programs that recognize where the sequenced sections overlap.
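The idea of assembling by overlap can be illustrated with a toy example. The sketch below is a simple greedy merge written for this article, not the software used by either sequencing effort; the 20-letter sequence, the read length and the function names are all made up for illustration.

```python
def overlap_len(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap_len(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical 20-letter "genome" cut into 8-letter reads, given out of order.
genome = "ATGGCGTACCTTGAAGTCAG"
reads = [genome[8:16], genome[0:8], genome[12:20], genome[4:12]]
print(greedy_assemble(reads) == [genome])
```

Real assemblers must also cope with sequencing errors, uneven coverage and far larger inputs, so this shows only the basic principle of joining reads where they overlap.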

Eichler and his colleagues found that shotgun sequencing, which is now used in both public and private efforts, runs into problems with parts of the genome where large segments of DNA are duplicated multiple times. The computer programs either discard those problematic parts of the genome or insert them into the final sequence only once, instead of the many times they should appear.
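A small example suggests why duplicated segments are hard for overlap-based assembly: reads taken from inside two copies of the same segment are identical, so overlap alone cannot tell how many copies exist or where each belongs. The sequences and names below are hypothetical and chosen only to make the effect visible; this is not the analysis performed in the study.

```python
from collections import Counter


def sliding_reads(sequence, read_len):
    """Every read of read_len letters, one starting at each position."""
    return [sequence[i:i + read_len] for i in range(len(sequence) - read_len + 1)]


# Hypothetical genome: one 12-letter segment that appears twice.
repeat = "GATTACAGATTC"
genome = "ACGTTGCA" + repeat + "TTCCGGAC" + repeat + "CATGCATG"

reads = sliding_reads(genome, 8)
counts = Counter(reads)

# Reads that occur at more than one position cannot be placed by overlap alone.
ambiguous = [read for read, n in counts.items() if n > 1]

print("reads taken:", len(reads))
print("distinct reads:", len(counts))
print("reads seen at more than one position:", ambiguous)
```

A program that treats identical reads as a single piece of sequence would place the duplicated segment once, which matches the collapse described above.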

Only about 4 percent of the genome sequence is lost in shotgun sequencing, Eichler found, while the other 96 percent is sequenced relatively accurately. However, the 4 percent that is lost, those large areas of gene duplication, is very important to genome research. Some genetic diseases are caused by deletions or rearrangements of the duplicated segments. Those segments are also important to the study of evolution.

“If we had relied on the simplified shotgun approach during the Human Genome Project, we would have missed those assemblies, which are very important,” said Eichler.

Eichler and his colleagues caution against using only the shotgun sequencing approach in the future. They suggest that genome scientists use the shotgun approach first, use that data to identify missing or duplicated regions of the genome, and then use the clone-ordered approach to fill in those areas.