The next generation of genome-wide association studies

BioArray News (subscription required) reports that genomic analysis technology provider Illumina has launched a new family of genotyping chips designed to simultaneously assay 4 million sites of variation in the human genome.

The chips are a major step up from the 1-million-feature chips that
currently represent the state of the art, and take advantage of several
public projects generating catalogues of human genetic variation (such
as the 1000 Genomes Project).
Illumina has also increased the density of markers in and around genes,
and fleshed out regions that have previously been associated with
complex traits and diseases.

The new chips are specifically designed to increase the coverage of two
types of variants that tended to be poorly captured by previous
generations of chips: rare variants, and structural variation.

Chip-based genotyping is very much a place-holder technology while we
wait for whole-genome sequencing to become cheap and accurate enough to
use for large-scale studies. Illumina clearly expects the market to
persist for at least another couple of years before sequencing takes
over completely:

There might be some customers who will hold off for the next generation
of arrays. "We think it will be a year to a year and a half until all
the content is out there and we arrive at a penultimate array that has
the content that everyone will want," [Illumina CEO Jay Flatley] said.

Of course, Illumina is well-placed to ride the wave regardless of when
the sequencing transition occurs; in addition to its genotyping
products it provides the most successful current second-generation
sequencing technology, the Genome Analyzer, and has secured an exclusive marketing contract for one of the most promising third-generation platforms, Oxford Nanopore.

Bigger is not always better
The BioArray News article also notes that the most recent generation of genotyping chips (the 1M series, with one million features) has "not seen adoption ... to the extent of other chips". There's a good reason for that, which is spelled out in an article in PLoS Genetics this week: despite the increased number of variants on the 1M chip, its value for money (in terms of power for a fixed study cost) is actually lower than that of earlier-generation chips.

Here's a table from that paper illustrating this point:

[Image: table comparing GWAS genotyping chips by cost and power (gwas_cost_comparison.jpg)]

The table assumes a fixed genotyping budget of $2 million. Despite having nearly twice as many markers, the Illumina 1M chip actually has substantially lower power than the earlier-generation 610K chip, for a simple reason: the 1M chip is almost twice as expensive, so researchers have to settle for genotyping far fewer individuals, and the increased power from the extra markers doesn't make up for this drop in sample size.
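
To make the trade-off concrete, here's a rough sketch in Python. Only the $2 million budget and the "almost twice as expensive" price ratio come from the discussion above; the per-sample prices, the tagging coverage (r²) values and the effect size are made-up illustrative numbers, not figures from the PLoS Genetics paper or from Illumina. The power calculation uses a common approximation in which the non-centrality parameter of the association test scales with sample size multiplied by the r² between the typed marker and the causal variant.

```python
from math import sqrt
from statistics import NormalDist

BUDGET = 2_000_000  # fixed genotyping budget in dollars (from the table)

def samples_for_budget(budget, cost_per_sample):
    """Number of individuals that can be genotyped for a given budget."""
    return int(budget // cost_per_sample)

def approx_power(n_samples, r2, effect_per_sample, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    Assumes the non-centrality parameter is n_samples * r2 * effect_per_sample,
    where effect_per_sample is a hypothetical per-individual contribution of
    the causal variant and r2 is how well the chip tags that variant.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)  # genome-wide significance threshold
    shift = sqrt(n_samples * r2 * effect_per_sample)
    return z.cdf(shift - critical) + z.cdf(-shift - critical)

# Hypothetical chip prices and coverage values, chosen only for illustration.
COST_610K, R2_610K = 400, 0.90
COST_1M, R2_1M = 750, 0.95
EFFECT = 0.008  # hypothetical per-sample effect size

n_610k = samples_for_budget(BUDGET, COST_610K)
n_1m = samples_for_budget(BUDGET, COST_1M)
power_610k = approx_power(n_610k, R2_610K, EFFECT)
power_1m = approx_power(n_1m, R2_1M, EFFECT)

print(f"610K chip: {n_610k} samples, approximate power {power_610k:.2f}")
print(f"1M chip:   {n_1m} samples, approximate power {power_1m:.2f}")
```

Under these assumed numbers the cheaper chip comes out well ahead, because its small loss in coverage is outweighed by nearly doubling the number of individuals genotyped. Changing the prices or r² values will shift the balance, which is exactly why the pricing of the new chips matters.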

The same economics may well apply to the new chips (depending on their pricing, of course). The addition of rare variants adds an extra element to the equation. However, genotyping studies have low power to detect rare disease-causing variants, so such studies will require very large sample sizes. If the new chips are too expensive, these studies may well be impractical for most research groups, encouraging them to lean towards targeted resequencing of candidate genes instead.

Interesting post; while the 1M chip may suffer (or already be suffering) from the fact that its high price isn't justified for common variation (since the smaller chips provide nearly equivalent power, and far greater power per dollar spent), the 4M or 10M or ... chips won't necessarily have that problem if new studies are designed to aggressively target rare variants, a challenge which requires a much larger number of SNPs in European populations.

Hi Jeff,

I completely agree, although I guess that the next generation of chips after this will be where the value of rare variants really starts to kick in (these new chips only contain ~100K variants from the 1KG project, so there are already two orders of magnitude more to draw on).

It will be interesting to see how the pricing works out, though - a 10M chip with loads of rare variants would be pretty powerful, but if it's too expensive I'm guessing people will go for other options instead: pulldown-reseq of a bunch of candidate genes, or maybe even low-coverage WGS with 1KG-style imputation. The chip manufacturers don't have much of a window left before large-scale WGS becomes feasible, so they'd better get their pricing structure right.

One final thought - as the chips dig lower and lower into the frequency spectrum it will be interesting to see if the companies start going for population-specific designs. Most of these super-rare variants are population-specific anyway, so it makes sense to create targeted chips rather than clutter up an East Asian GWAS with a whole bunch of European-specific variants.

Right, though we won't know what is worth typing in a population until it has been typed in a very large panel. Variants at low frequency in Europe and absent in East Asia in the 1000 Genomes data could simply be due to sampling noise. I suspect it'll be a while before we know how population-specific low-frequency alleles are (though clearly they will be more restricted in range).

You might be confusing the fact that the new Omni1-Quad chip will look at 4 million variants with the fact that 4 samples can be looked at on a chip. So it is actually just a reconfigured 1M chip, possibly cheaper, with new content swapped in.

That info is available here from Illumina:
http://www.illumina.com/downloads/HumanOmni1-Quad_DataSheet.pdf