As of August 2011, there have been over a thousand studies testing for association between single nucleotide changes across the human genome and disease. Thousands of single variants have been reported to be associated with over 200 diseases or traits.

 

So, just how do researchers follow up on a single variant to see if it really means anything related to a disease?

Greg Cooper, Ph.D., a faculty investigator at the HudsonAlpha Institute for Biotechnology, along with Jay Shendure, Ph.D., of the University of Washington, reviewed methods for finding these relationships in the September 2011 issue of the journal Nature Reviews Genetics.

“Researchers are drowning in sequence data and need ways to make sense of it all,” said Cooper. “We describe recent advances in genomics which may help to find answers to genetic disease associations.”

For example, explained Cooper, a group of scientists may conduct a study on genetic contributions to a complex disease such as autism. They might test over a million single nucleotide variations in large groups of children with autism and children without it. The scientists then use statistical tests to compare how often each variant appears in the two groups. If some single changes are more common in children with autism, then these positions in the genome might be related to the disease in some way.
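The comparison Cooper describes can be illustrated with a small sketch. The Python below is not the method of any particular study; it assumes the simplest possible setup, a Pearson chi-square test on a 2x2 table of allele counts for a single variant, and the function names and counts are hypothetical. Because a study of this kind may run about a million tests, the sketch also applies a Bonferroni-style cutoff, again as an assumption rather than a prescribed standard.

```python
import math


def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the variant (alt) and
    reference (ref) alleles. Returns (chi2, p_value).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one observation")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the chi-square tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def passes_bonferroni(p_value, n_tests=1_000_000, alpha=0.05):
    """Assumed multiple-testing rule: compare p to alpha divided by n_tests."""
    return p_value < alpha / n_tests


# Hypothetical counts: the variant allele is seen 60 times among 100 case
# alleles and 40 times among 100 control alleles.
chi2, p = allele_association(60, 40, 40, 60)
print(chi2, p, passes_bonferroni(p))
```

With these made-up counts the variant looks more common in cases, but the result does not survive the million-test correction, which is one reason such studies need large groups and still produce long lists of candidates that require follow-up.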

However, such studies often reveal lengthy lists of potentially relevant variants spread all across the genome, including many that are in reality unrelated to disease. Scientists are then at the “finding a needle in a stack of needles” stage. How can they distinguish those variants that are truly biologically relevant to disease?

Cooper and Shendure point to the power of evolution. “If a particular base in our genome has been maintained in the genomes of many organisms in the same state, then it is more likely to be important,” said Cooper. “Natural selection removes variants which cause harm to the organism carrying it in its genome.”

In other words, the scientists can compare the sequence at the positions of variants they suspect might be involved in autism with genomes from many people and from many other organisms, and estimate a rate of evolution for these positions. If the rate is below the rate for the rest of the genome, this suggests that changes at these genomic positions may be harmful and potentially related to disease. Cooper and Shendure review a number of similar methods, making this article a valuable resource for scientists in a number of fields who are conducting disease association studies.
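The idea of comparing a position's rate of change with the genome-wide rate can be shown with a deliberately crude sketch. The Python below is not one of the methods reviewed by Cooper and Shendure; it assumes a toy proxy for evolutionary rate (the fraction of aligned species whose base differs from the human reference), and the site names, alignment columns and background rate are all hypothetical. Real approaches model the relationships among species rather than simply counting mismatches.

```python
def mismatch_fraction(reference_base, column):
    """Fraction of aligned bases that differ from the reference base.

    `column` is a string with one base per species at this position;
    gaps ('-') are skipped. Returns None if no species has a base here.
    """
    observed = [base for base in column.upper() if base != "-"]
    if not observed:
        return None
    mismatches = sum(base != reference_base.upper() for base in observed)
    return mismatches / len(observed)


def flag_conserved(sites, background_rate):
    """Return names of sites that change less often than the assumed background.

    `sites` maps a site name to (reference_base, alignment_column).
    """
    flagged = []
    for name, (reference_base, column) in sites.items():
        rate = mismatch_fraction(reference_base, column)
        if rate is not None and rate < background_rate:
            flagged.append(name)
    return flagged


# Hypothetical candidate positions from an association study.
candidate_sites = {
    "site_a": ("A", "AAAAAAAA"),  # identical in every species
    "site_b": ("G", "GATCGTTA"),  # varies freely across species
}
print(flag_conserved(candidate_sites, background_rate=0.3))
```

In this made-up example only site_a is flagged, because it has stayed in the same state across all species, which is the pattern Cooper describes as a sign that a position is more likely to be important.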

The paper is the featured article for September 2011 and is freely available at this site.