Researchers at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), have produced the first end-to-end DNA sequence of a human chromosome. The results, published in Nature, show that generating a precise, base-by-base sequence of a human chromosome is now possible and will enable researchers to produce a complete sequence of the human genome.
Humans have two sets of chromosomes, one set from each parent. For example, biologically female humans inherit two X chromosomes, one from their mother and one from their father. However, those two X chromosomes are not identical and will contain many differences in their DNA sequences.
In this study, researchers did not sequence the X chromosome from a typical human cell. Instead, they used a special cell type – one that has two identical X chromosomes. Such a cell provides more DNA for sequencing than a male cell, which has only a single copy of an X chromosome. It also avoids the sequence differences that arise when analyzing the two X chromosomes of a typical female cell.
The authors and their colleagues capitalized on new technologies that can sequence long segments of DNA. Instead of preparing and analyzing small pieces of DNA, they used a method that leaves DNA molecules largely intact. These large DNA molecules were then analyzed by two different instruments, each of which generates very long DNA sequences – something previous instruments could not accomplish.
After analyzing the human X chromosome in this fashion, Adam Phillippy of NHGRI and his team used their newly developed computer program to assemble the many segments of generated sequence. Karen Miga’s group at the University of California, Santa Cruz, led the effort to close the largest remaining sequence gap on the X chromosome: the roughly 3 million bases of repetitive DNA at the centromere, the middle portion of the chromosome.
There is no “gold standard” for researchers to critically evaluate the accuracy of assembling such highly repetitive DNA sequences. To help confirm the validity of the generated sequence, Miga and her collaborators performed a number of validation steps.
“We have never actually seen these sequences before in our genome, and do not have many tools to test if the predictions we are making are correct. This is why it is so important to have specialists in the genomics community weigh in and ensure the final product is high-quality,” Miga said.
The effort is part of a broader initiative by the Telomere-to-Telomere (T2T) consortium, partially funded by NHGRI. The consortium aims to generate a more complete reference sequence of the human genome.
The T2T consortium is continuing its efforts with the remaining human chromosomes, aiming to generate a complete human genome sequence in 2020.
“We don’t yet know what we’ll find in the newly uncovered sequences. It is the exciting unknown of discovery. This is the era of complete genome sequences, and we are embracing it wholeheartedly,” Phillippy said.
Potential challenges remain. Chromosomes 1 and 9, for example, have repetitive DNA segments that are much larger than the ones encountered on the X chromosome.
“We know these previously uncharted sites in our genome are very different among individuals, but it is important to start figuring out how these differences contribute to human biology and disease,” Miga said. Both Phillippy and Miga agree that enhancing sequencing methods will continue to create new opportunities in human genetics and genomics.