A Deep Look Into Our Genes

Recent debates have focused on the degree of genetic variation and its impact upon health at the genomic level in humans (see the Perspective by Casals and Bertranpetit). Tennessen et al. (p. 64, published online 17 May), looking at all of the protein-coding genes in the human genome, and Nelson et al. (p. 100, published online 17 May), looking at genes that encode drug targets, address this question through deep sequencing efforts on samples from multiple individuals. The findings suggest that most human variation is rare, not shared between populations, and that rare variants are likely to play a role in human health.

Abstract

As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
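
The per-person figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the helper names (`minor_allele_frequency`, `is_rare`) and the example allele counts are hypothetical, and the calculation assumes biallelic sites in diploid individuals. Multiplying 13,595 SNVs by 2.3% gives roughly 313 variants, which lines up with the ~313 genes quoted above if each such variant falls in a different gene (that reading is an assumption).

```python
# Figures quoted in the abstract.
SNVS_PER_PERSON = 13_595      # average SNVs carried per individual
FUNCTIONAL_FRACTION = 0.023   # 2.3% predicted to affect protein function
RARE_MAF_THRESHOLD = 0.005    # "rare" = minor allele frequency below 0.5%


def minor_allele_frequency(alt_allele_count: int, n_individuals: int) -> float:
    """Minor allele frequency at a biallelic site, assuming diploid samples."""
    total_chromosomes = 2 * n_individuals
    alt_freq = alt_allele_count / total_chromosomes
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1.0 - alt_freq)


def is_rare(maf: float, threshold: float = RARE_MAF_THRESHOLD) -> bool:
    """Apply the abstract's rarity cutoff (MAF < 0.5%)."""
    return maf < threshold


# Expected number of predicted-functional SNVs per person.
functional_per_person = SNVS_PER_PERSON * FUNCTIONAL_FRACTION
print(f"functional SNVs per person: {functional_per_person:.1f}")

# Hypothetical example: an allele seen on 3 of the 2 * 2440 sampled chromosomes.
example_maf = minor_allele_frequency(3, 2440)
print(f"example MAF: {example_maf:.5f}, rare: {is_rare(example_maf)}")
```

With a cohort of 2440 people, an allele must appear on fewer than about 24 chromosomes (0.5% of 4880) to count as rare under this cutoff.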

Supplementary Material

Summary

Materials and Methods
Supplementary Text
Figs. S1 to S19
Tables S1 to S7
References (30–47)

References and Notes

1
Bamshad M. J., et al., Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12, 745 (2011).
2
Ajay S. S., Parker S. C., Abaan H. O., Fajardo K. V., Margulies E. H., Accurate and comprehensive sequencing of personal genomes. Genome Res. 21, 1498 (2011).
3
Sobreira N. L., et al., Whole-genome sequencing of a single proband together with linkage analysis identifies a Mendelian disease gene. PLoS Genet. 6, e1000991 (2010).
4
International HapMap Consortium, A haplotype map of the human genome. Nature 437, 1299 (2005).
5
Frazer K. A., et al., A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851 (2007).
6
Li J. Z., et al., Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100 (2008).
7
Fu Y. X., Statistical properties of segregating sites. Theor. Popul. Biol. 48, 172 (1995).
8
Marth G. T., et al., The functional spectrum of low-frequency coding variation. Genome Biol. 12, R84 (2011).
9
Ng S. B., et al., Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272 (2009).
10
O’Roak B. J., et al., Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat. Genet. 43, 585 (2011).
11
Ng S. B., et al., Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790 (2010).
12
Ng S. B., et al., Exome sequencing identifies the cause of a mendelian disorder. Nat. Genet. 42, 30 (2010).
13
Tennessen J. A., Madeoy J., Akey J. M., Signatures of positive selection apparent in a small sample of human exomes. Genome Res. 20, 1327 (2010).
14
Yi X., et al., Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75 (2010).
15
McClellan J., King M. C., Genetic heterogeneity in human disease. Cell 141, 210 (2010).
16
Manolio T. A., et al., Finding the missing heritability of complex diseases. Nature 461, 747 (2009).
17
Gibson G., Rare and common variants: twenty arguments. Nat. Rev. Genet. 13, 135 (2011).
18
Supplementary materials are available on Science Online.
19
Kimura M., Evolutionary rate at the molecular level. Nature 217, 624 (1968).
20
Ramírez-Soriano A., Nielsen R., Correcting estimators of theta and Tajima’s D for ascertainment biases caused by the single-nucleotide polymorphism discovery process. Genetics 181, 701 (2009).
21
Akey J. M., et al., Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2, e286 (2004).
22
Coventry A., et al., Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat. Commun. 1, 131 (2010).
23
Gravel S., et al., Demographic history and rare allele sharing among human populations. Proc. Natl. Acad. Sci. U.S.A. 108, 11983 (2011).
24
dos Reis M., Savva R., Wernisch L., Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 32, 5036 (2004).
25
McDonald J. H., Kreitman M., Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652 (1991).
26
Akey J. M., Constructing genomic maps of positive selection in humans: where do we go from here? Genome Res. 19, 711 (2009).
27
Asimit J., Zeggini E., Rare variant association analysis methods for complex traits. Annu. Rev. Genet. 44, 293 (2010).
28
Kryukov G. V., Shpunt A., Stamatoyannopoulos J. A., Sunyaev S. R., Power of deep, all-exon resequencing for discovery of human trait genes. Proc. Natl. Acad. Sci. U.S.A. 106, 3871 (2009).
29
Lohmueller K. E., et al., Proportionally more deleterious genetic variation in European than in African populations. Nature 451, 994 (2008).
30
Yu L., Martinez F. D., Klimecki W. T., Automated high-throughput sex-typing assay. Biotechniques 37, 662 (2004).
31
Li H., Durbin R., Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754 (2009).
32
DePristo M. A., et al., A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491 (2011).
33
Fisher S., et al., A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biol. 12, R1 (2011).
34
Li Y., Sidore C., Kang H. M., Boehnke M., Abecasis G. R., Low-coverage sequencing: implications for design of complex trait association studies. Genome Res. 21, 940 (2011).
35
Li H., Ruan J., Durbin R., Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18, 1851 (2008).
36
Durbin R. M., et al., A map of human genome variation from population-scale sequencing. Nature 467, 1061 (2010).
37
Manichaikul A., et al., Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867 (2010).
38
Adzhubei I. A., et al., A method and server for predicting damaging missense mutations. Nat. Methods 7, 248 (2010).
39
Kumar P., Henikoff S., Ng P. C., Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073 (2009).
40
Chun S., Fay J. C., Identification of deleterious mutations within three human genomes. Genome Res. 19, 1553 (2009).
41
Schwarz J. M., Rödelsperger C., Schuelke M., Seelow D., MutationTaster evaluates disease-causing potential of sequence alterations. Nat. Methods 7, 575 (2010).
42
Cooper G. M., et al., Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nat. Methods 7, 250 (2010).
43
Cooper G. M., et al., Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901 (2005).
44
Zwick M. E., Cutler D. J., Chakravarti A., Patterns of genetic variation in Mendelian and complex traits. Annu. Rev. Genomics Hum. Genet. 1, 387 (2000).
45
Gutenkunst R. N., Hernandez R. D., Williamson S. H., Bustamante C. D., Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 5, e1000695 (2009).
46
Gravel S., et al., Demographic history and rare allele sharing among human populations. Proc. Natl. Acad. Sci. U.S.A. 108, 11983 (2011).
47
Waldman Y. Y., Tuller T., Shlomi T., Sharan R., Ruppin E., Translation efficiency in humans: tissue specificity, global optimization and differences between developmental stages. Nucleic Acids Res. 38, 2964 (2010).

Information & Authors

Information

Published In

Science
Volume 337 | Issue 6090
6 July 2012

Submission history

Received: 17 January 2012
Accepted: 3 May 2012
Published in print: 6 July 2012

Acknowledgments

We acknowledge the support of the NHLBI and the contributions of the research institutions, study investigators, field staff, and study participants in creating this resource for biomedical research; and the Population Genetics Project Team. Funding for GO ESP was provided by NHLBI grants RC2 HL-103010 (HeartGO), RC2 HL-102923 (LungGO), and RC2 HL-102924 (Women’s Health Initiative Exome Sequencing Project, WHISP). The exome sequencing was performed through NHLBI grants RC2 HL-102925 (BroadGO) and RC2 HL-102926 (SeattleGO). Filtered sets of annotated variants and their allele frequencies are available at http://evs.gs.washington.edu/EVS/ and have been deposited in dbSNP (www.ncbi.nlm.nih.gov/snp; local batch ID, ESP2500). Genotypes and phenotypes from a large subset of individuals are also available via dbGaP (www.ncbi.nlm.nih.gov/gap) using the following accession information: NHLBI GO-ESP: Women’s Health Initiative Exome Sequencing Project (WHI) – WHISP, WHISP_Subject_Phenotypes, pht002246.v2.p2, phs000281.v2.p2; NHLBI GO-ESP: Heart Cohorts Exome Sequencing Project (JHS), ESP_HeartGO_JHS_LDLandEOMI_Subject_Phenotypes, pht002539.v1.p1, phs000402.v1.p1; NHLBI GO-ESP: Heart Cohorts Exome Sequencing Project (FHS), HeartGO_FHS_LDLandEOMI_PhenotypeDataFile, pht002476.v1.p1, phs000401.v1.p1; NHLBI GO-ESP: Heart Cohorts Exome Sequencing Project (CHS), HeartGO_CHS_LDL_PhenotypeDataFile, pht002536.v1.p1, phs000400.v1.p1; NHLBI GO-ESP: Heart Cohorts Exome Sequencing Project (ARIC), ESP_ARIC_LDLandEOMI_Sample, pht002466.v1.p1, phs000398.v1.p1; NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Cystic Fibrosis), ESP_LungGO_CF_PA_Culture_Data, pht002227.v1.p1, phs000254.v1.p1; NHLBI GO-ESP: Early-Onset Myocardial Infarction (Broad EOMI), ESP_Broad_EOMI_Subject_Phenotypes, pht001437.v1.p1, phs000279.v1.p1; NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Pulmonary Arterial Hypertension), PAH_Subject_Phenotypes_Baseline_Measures, pht002277.v1.p1, phs000290.v1.p1; NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Lung Health Study of Chronic Obstructive Pulmonary Disease), LHS_COPD_Subject_Phenotypes_Baseline_Measures, pht002272.v1.p1, phs000291.v1.p1. C.D.B. is on the scientific advisory board for Personalis, Incorporated; Mubadala Medical Holding Company; 23andme “Roots into the future” project; and Ancestry.com. M.J.R. owns stock in Illumina.

Authors

Affiliations

Jacob A. Tennessen*
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Abigail W. Bigham*
Department of Pediatrics, University of Washington, Seattle, WA 98195, USA.
Present address: Department of Anthropology, University of Michigan, Ann Arbor, MI 48109, USA.
Timothy D. O’Connor*
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Wenqing Fu
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Eimear E. Kenny
Department of Genetics, Stanford University, Stanford, CA 94305, USA.
Simon Gravel
Department of Genetics, Stanford University, Stanford, CA 94305, USA.
Sean McGee
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Ron Do
Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
The Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA.
Xiaoming Liu
Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA.
Goo Jun
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Hyun Min Kang
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Daniel Jordan
Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.
Suzanne M. Leal
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
Stacey Gabriel
Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Mark J. Rieder
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Goncalo Abecasis
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
David Altshuler
Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Deborah A. Nickerson
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Eric Boerwinkle
Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA.
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA.
Shamil Sunyaev
Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA.
Carlos D. Bustamante
Department of Genetics, Stanford University, Stanford, CA 94305, USA.
Michael J. Bamshad
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Department of Pediatrics, University of Washington, Seattle, WA 98195, USA.
Joshua M. Akey
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Broad GO
Seattle GO
on behalf of the NHLBI Exome Sequencing Project

Notes

* These authors contributed equally to this work.
To whom correspondence should be addressed: J.M.A. and M.J.B.
