Implications of using whole genome sequencing to test unselected populations for high risk breast cancer genes: a modelling study

Abstract

Background


The decision to test for high risk breast cancer gene mutations is traditionally based on risk scores derived from age, family and personal cancer history. Next generation sequencing technologies such as whole genome sequencing (WGS) make wider population testing more feasible. In the UK’s 100,000 Genomes Project, mutations in 16 genes including BRCA1 and BRCA2 are to be actively sought regardless of clinical presentation. The implications of deploying this approach at scale for patients and clinical services are unclear. In this study we aimed to model the effect of using WGS to test an unselected UK population for high risk BRCA1 and BRCA2 gene variants to inform the debate around approaches to secondary genomic findings.

Methods

We modelled the test performance of WGS for identifying pathogenic BRCA1 and BRCA2 mutations in an unselected hypothetical population of 100,000 UK women, using published literature to derive model input parameters. We calculated analytic and clinical validity, described potential health outcomes and highlighted current areas of uncertainty. We also performed a sensitivity analysis in which we re-ran the model 100,000 times to investigate the effect of varying input parameters.

Results

In our models WGS was predicted to identify correctly 93 pathogenic BRCA1 mutations and 151 BRCA2 mutations in 120 and 200 women respectively, resulting in an analytic sensitivity of 75.5-77.5 %. Of 244 women with identified pathogenic mutations, we estimated that 132 (range 121–198) would develop breast cancer, so could potentially be helped by intervention. We also predicted that breast cancer would occur in 41 women (range 36–62) incorrectly identified with no pathogenic mutations and in 12,460 women without BRCA1 or BRCA2 mutations. There was considerable uncertainty about the penetrance of mutations in people without a family history of disease and the appropriate threshold of absolute disease risk for clinical action, which impacts on judgements about the clinical utility of intervention.

Conclusions

This simple model demonstrates the need for robust processes to support the testing for secondary genomic findings in unselected populations that acknowledge levels of uncertainty about the clinical validity and clinical utility of testing positive for a cancer risk gene.


Background

Next generation sequencing technologies such as whole genome sequencing (WGS) have enhanced the speed and reduced the cost of seeking genetic mutations predisposing to conditions such as hereditary cancer syndromes [1]. There has, however, been intense debate about the appropriateness of using these technologies to screen individuals who do not meet traditional testing criteria. Arguments against population testing include uncertainty about the balance of benefit versus harm of interventions for high risk gene carriers without a strong family history of disease, as well as other ethical, technical and cost concerns [2–6].

Internationally, clinical approaches to testing for and reporting secondary genomic findings, defined as findings that are actively sought by a practitioner but are not the primary target of investigation [7], vary widely. In the United Kingdom, the approach taken to secondary findings in the 100,000 Genomes Project, a flagship research project that aims to sequence 100,000 whole genomes from NHS patients by 2017 [8], will act as a blueprint for future NHS practice. It is therefore crucial that the impact of using WGS for germline genetic testing on population health and on clinical and laboratory services is appropriately considered.

In the 100,000 Genomes Project, deleterious alleles in 16 genes detected on whole genome analysis will be reported back to participants regardless of test indication [8]. These include the genes BRCA1 and BRCA2, implicated in hereditary breast and ovarian cancer syndrome. In contrast, usual clinical practice is to undertake BRCA1 and BRCA2 testing based on results of risk scores calculated using factors such as age, family history and personal cancer history [9, 10]. Germline genetic testing for high risk cancer genes aims to provide the best possible estimate of an individual’s cancer risk to inform decisions about undergoing risk-lowering interventions. In the absence of a family history, the disease risk for mutations identified, and therefore the clinical utility of testing, is likely to differ from that seen in multi-case families and may be poorly estimated.

In this study we aimed to model the likely outcomes of testing for medically-actionable gene mutations in unselected populations undergoing WGS, using the example of BRCA1 and BRCA2. We considered the clinical validity of such testing and implications for individuals, laboratory and clinical services.

Methods

Model development

We built a simple model to calculate WGS test performance for identifying pathogenic BRCA1 and BRCA2 mutations in an unselected population of 100,000 UK women. Model input parameters were obtained by reviewing published literature on the population prevalence of pathogenic BRCA1 and BRCA2 mutations and on the range and frequency of different mutation types, including single nucleotide variants (SNVs), small insertions/deletions (indels) and copy number variants (CNVs). Where possible these were taken from studies in populations at low risk of breast cancer rather than multi-case families. We also used test performance literature for Illumina TruGenome Clinical Sequencing Services [11] and relevant laboratory standards [12, 13] to inform estimates of analytical validity (Table 1).

Table 1 Model input parameters (main analysis)

Analytic validity calculations

Calculations shown below for BRCA1 were repeated for BRCA2. We assumed that variants of uncertain significance (VUS) were not reported back to patients in line with common practice [14].

  • True positives = prevalence of pathogenic BRCA1 mutations × population size × (proportion of BRCA1 mutations that are small indels × sensitivity of WGS for detecting small indels + proportion of BRCA1 mutations that are SNVs × sensitivity of WGS for detecting SNVs + proportion of BRCA1 mutations that are CNVs × sensitivity of WGS for detecting CNVs) × horizontal gene coverage of WGS for BRCA1*

  • False negatives = (prevalence of pathogenic BRCA1 mutations × population size) − true positives

  • False positives = assumed to be 0 in the main model after a confirmatory step*. In the sensitivity analysis we modelled the effect of an extra 0–5 % of false positive results in addition to true positive results

  • True negatives (including VUS) = population size − (prevalence × population size) − false positives

Test performance measures of WGS for detecting pathogenic mutations were calculated as follows:

  • Analytical sensitivity = True positives detected/Total with a true pathogenic mutation

  • Analytical specificity = True negatives detected/Total without a pathogenic mutation

  • Analytical positive predictive value = True positives detected/Total with a variant on testing

  • Analytical negative predictive value = True negatives detected/Total without a variant on testing

*Note: in line with usual practice for next generation sequencing, we assumed that any positive results would be confirmed by an independent test from a new DNA dilution or by a secondary test, e.g. a SNP assay [13]. It is not usual practice to confirm all negative findings, but we assumed for the model that reporting standards for negative findings were met [13].
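The calculations above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the class and function names are hypothetical, and the detection sensitivities (0.97 for SNVs, 0.80 for small indels, 0 for CNVs) are assumptions taken from the lower bounds and fixed values quoted elsewhere in the text, since Table 1 is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class GeneInputs:
    """Input parameters for one gene (hypothetical container)."""
    prevalence: float  # carrier frequency of pathogenic mutations
    prop_indel: float  # proportion of mutations that are small indels
    prop_snv: float    # proportion of mutations that are SNVs
    prop_cnv: float    # proportion of mutations that are CNVs
    coverage: float    # horizontal coverage of WGS across the gene


# Assumed WGS sensitivities by mutation type (see lead-in).
SENS_INDEL, SENS_SNV, SENS_CNV = 0.80, 0.97, 0.0


def analytic_validity(g: GeneInputs, population: int = 100_000,
                      false_pos: int = 0) -> dict:
    """Apply the true/false positive/negative formulas to one gene."""
    carriers = round(g.prevalence * population)
    detect = (g.prop_indel * SENS_INDEL
              + g.prop_snv * SENS_SNV
              + g.prop_cnv * SENS_CNV)
    tp = round(carriers * detect * g.coverage)
    fn = carriers - tp
    tn = population - carriers - false_pos
    return {
        "tp": tp, "fn": fn, "fp": false_pos, "tn": tn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + false_pos),
        "ppv": tp / (tp + false_pos),
        "npv": tn / (tn + fn),
    }


brca1 = analytic_validity(GeneInputs(0.0012, 0.54, 0.36, 0.10, 0.9941))
brca2 = analytic_validity(GeneInputs(0.0020, 0.69, 0.21, 0.10, 0.9997))
print(brca1["tp"], brca1["fn"], round(brca1["sensitivity"], 3))
print(brca2["tp"], brca2["fn"], round(brca2["sensitivity"], 3))
```

Under these assumed inputs the sketch gives 93 detected and 27 missed BRCA1 carriers, and 151 detected and 49 missed BRCA2 carriers, out of 120 and 200 respectively.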

Sensitivity analysis

We also performed a sensitivity analysis to investigate the effect of varying model input parameters. The model was rerun 100,000 times with model input parameters being randomly selected from defined likely distributions using Stata’s random number generator function. The proportion of pathogenic mutations due to CNVs was assumed to be fixed at 0.1, but the proportion of small indels and SNVs varied according to an underlying normal distribution. Sensitivity of WGS for detecting CNVs was fixed at 0, based on current test performance literature, but sensitivity for detecting SNVs and small indels was selected from an underlying gamma distribution. We also assumed that false positives would occur at a rate of less than 10 % of the number of true positives, but heavily skewed towards 0. Calculations for true positives, false negatives and true negatives remained the same as for the main model. Underlying distributions for model input parameters are shown in Fig. 1.
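The resampling procedure can be illustrated with a short Python sketch. The study itself used Stata, and the distribution parameters are not given in the text, so every numeric shape, scale and standard deviation below is an assumption chosen only so that the sampled values centre on the main-analysis inputs. The function name `simulate_brca1` and the use of a scaled beta distribution for the false positive rate are likewise hypothetical.

```python
import random
import statistics


def clamp(x, lo=0.0, hi=1.0):
    """Keep a sampled value inside [lo, hi]."""
    return max(lo, min(hi, x))


def simulate_brca1(n_runs=10_000, population=100_000, seed=1):
    """Re-run the BRCA1 model with randomly drawn input parameters."""
    rng = random.Random(seed)
    carriers = 0.0012 * population
    coverage = 0.9941
    prop_cnv, sens_cnv = 0.10, 0.0  # both fixed, as in the text
    sens_runs, ppv_runs = [], []
    for _ in range(n_runs):
        # Indel share drawn from a normal distribution (assumed sd 0.03);
        # SNVs take whatever remains after indels and CNVs.
        prop_indel = clamp(rng.gauss(0.54, 0.03), 0.0, 1.0 - prop_cnv)
        prop_snv = 1.0 - prop_cnv - prop_indel
        # Miss rates drawn from gamma distributions (assumed shape/scale),
        # so that sensitivity = 1 - miss rate stays at or below 1.
        sens_snv = clamp(1.0 - rng.gammavariate(2.0, 0.015))
        sens_indel = clamp(1.0 - rng.gammavariate(2.0, 0.10))
        detect = (prop_indel * sens_indel + prop_snv * sens_snv
                  + prop_cnv * sens_cnv)
        tp = carriers * detect * coverage
        # False positives: below 10 % of true positives, skewed towards 0
        # (assumed Beta(1, 8) scaled by 0.10).
        fp = tp * 0.10 * rng.betavariate(1.0, 8.0)
        sens_runs.append(tp / carriers)
        ppv_runs.append(tp / (tp + fp))
    sens_runs.sort()
    return {
        "sens_median": statistics.median(sens_runs),
        "sens_low": sens_runs[int(0.025 * n_runs)],
        "sens_high": sens_runs[int(0.975 * n_runs)],
        "ppv_min": min(ppv_runs),
    }


summary = simulate_brca1()
print(summary)
```

Fixing the seed makes a run reproducible. Because CNVs are never detected in this setup, analytic sensitivity cannot exceed roughly 90 % whatever values are drawn for the other parameters.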

Fig. 1 Distribution of model input parameters used for sensitivity analysis using example of BRCA1

Assessing clinical validity

We assessed the performance of WGS for detecting future risk of breast cancer by applying population penetrance estimates from published literature to modelled numbers of women with each test outcome: true positives, false positives, true negatives and false negatives. We did not evaluate the clinical utility of testing for pathogenic BRCA1 and BRCA2 mutations because such testing is well-established internationally in the context of multi-case families. Evaluation of the complex ethical, legal and social implications was outside the scope of this paper.

Results

Prevalence and types of BRCA1 and BRCA2 mutations

In an unselected UK population we initially assumed a carrier frequency of 0.12 % for pathogenic BRCA1 mutations and 0.2 % for BRCA2 mutations [15], which equated to 120 BRCA1 and 200 BRCA2 mutations in 100,000 unselected women. In our main model we also assumed that pathogenic BRCA1 mutations comprised 54 % small indels, 36 % SNVs and 10 % CNVs; corresponding figures for pathogenic BRCA2 mutations were 69 % small indels, 21 % SNVs and 10 % CNVs. This was based on findings that around 88-90 % of pathogenic BRCA1 and BRCA2 mutations are due to missense, nonsense and splice site mutations as well as small indels, with the remaining 10-12 % due to large rearrangements/duplications [16]. A recent population series of 2,222 ovarian cancer cases and 1,528 controls characterised the spectrum of mutations further: 85 BRCA1 mutations were detected, comprising 51 frameshift indels and 34 missense, nonsense or splice site mutations; there were also 98 BRCA2 mutations, made up of 75 frameshift indels and 23 missense, nonsense or splice site mutations [17].
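The mutation-type proportions quoted above are consistent with scaling the observed indel and SNV counts from the population series [17] to the roughly 90 % of mutations that are not large rearrangements. The Python sketch below shows one way to reproduce them. This is an inferred reading of the figures rather than a calculation stated in the source, and the function name `mutation_mix` is hypothetical.

```python
def mutation_mix(indel_count, snv_count, cnv_share=0.10):
    """Scale observed indel/SNV counts to the non-CNV share of mutations."""
    total = indel_count + snv_count
    non_cnv = 1.0 - cnv_share
    return {
        "indel": non_cnv * indel_count / total,
        "snv": non_cnv * snv_count / total,
        "cnv": cnv_share,
    }


# Counts reported in the population series [17].
brca1_mix = mutation_mix(51, 34)
brca2_mix = mutation_mix(75, 23)
print({k: round(v, 2) for k, v in brca1_mix.items()})
print({k: round(v, 2) for k, v in brca2_mix.items()})
```

Rounded to whole percentages, this gives 54 % indels, 36 % SNVs and 10 % CNVs for BRCA1, and 69 %, 21 % and 10 % for BRCA2, matching the main model inputs.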

Penetrance of BRCA1 and BRCA2 mutations

In our main model we used penetrance figures of 59 % for BRCA1 and 51 % for BRCA2 based on average cumulative risk of female breast cancer modelled over all possible modifiers for carriers born after 1950 [15], but also included ranges of 57–71 % for BRCA1 [18–20] and 45–87.5 % for BRCA2 [21, 22] based on other literature findings. The penetrance of these mutations for breast cancer varies with age at testing, family history and mutation type, depending on a complex interplay between the genetic variant and other environmental and genetic modifiers. Consequently, published estimates of penetrance vary, reflecting both the effect of differing risk modifiers and methods of ascertainment of mutation carriers. Estimates from studies based on multi-case families are typically higher than those based on unselected breast cancer cases. There are no empirical estimates of the average penetrance for an unselected mutation carrier. However, estimates of the average risk have been derived from complex segregation analyses using data from both multi-case families and breast cancer case series by explicitly modelling the contribution of modifiers [15], which we considered the most appropriate estimates for our study.

Factors affecting WGS performance

Analytical validity of WGS for identifying and classifying genetic variants correctly depends on a range of factors including depth of coverage as well as technical accuracy for detecting different mutation types. In our model we assumed an adequate read depth but recognised that in practice the minimum depth of coverage varies depending upon the required sensitivity of the assay, the sequencing method and the type of mutation detected and should be established during the test validation process [13]. Algorithms in current clinical use allow >97 % of SNVs and >80 % of deletions to be identified by WGS but may be unable to detect larger insertion/deletion events of >7 bp, translocations, trinucleotide repeats or CNVs [11], although this area is developing rapidly. An additional factor is horizontal coverage across the genome, which is typically > 95 % for WGS, but reported as 99.41 % in the region of BRCA1 and 99.97 % for BRCA2 [11].

Test performance of WGS for detecting BRCA1 and BRCA2 pathogenic mutations

In our main model WGS successfully detected 93 women out of 120 with pathogenic BRCA1 variants and 151 women out of 200 with pathogenic BRCA2 variants in a hypothetical UK population of 100,000 women. This gave an analytic sensitivity of 77.5 % for BRCA1 variants and 75.5 % for BRCA2. Specificity was 100 %, with 100 % PPV and 99.9 % NPV for both genes. Sensitivity analysis gave similar results (Table 2).

Table 2 Numbers of pathogenic BRCA1 and BRCA2 mutations detected in an unselected population of 100,000 UK women using WGS

Clinical validity

The performance of WGS for predicting future risk of breast cancer depends both on test performance and on the association between genotype and disease. In our model, WGS correctly identified 132 women (range 121–198) with a pathogenic BRCA1 or BRCA2 mutation who would develop breast cancer. We also estimated that breast cancer would occur in 41 women (range 36–62) incorrectly identified with no pathogenic mutations and in 12,460 women who truly had no mutation, based on a background lifetime risk of female breast cancer of 12.5 % in the UK [23] (Fig. 2).

Fig. 2 Flowchart of expected breast cancer incidence in unselected women undergoing WGS
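The expected case counts can be reproduced by applying the penetrance figures to the modelled test outcomes. The Python sketch below is illustrative only: the variable and function names are hypothetical, and it assumes that the quoted range for detected carriers comes from applying the lower and upper penetrance bounds to both genes at once.

```python
# Penetrance per gene: (main estimate, lower bound, upper bound).
PENETRANCE = {"BRCA1": (0.59, 0.57, 0.71), "BRCA2": (0.51, 0.45, 0.875)}

DETECTED = {"BRCA1": 93, "BRCA2": 151}  # analytic true positives
MISSED = {"BRCA1": 27, "BRCA2": 49}     # analytic false negatives
TRUE_NEGATIVES = 100_000 - 120 - 200    # women without a pathogenic mutation
BACKGROUND_RISK = 0.125                 # UK lifetime risk of breast cancer


def expected_cases(counts, which=0):
    """Sum carriers x penetrance over genes (which: 0 main, 1 low, 2 high)."""
    return sum(counts[gene] * PENETRANCE[gene][which] for gene in counts)


detected_cases = round(expected_cases(DETECTED))
detected_low = round(expected_cases(DETECTED, 1))
detected_high = round(expected_cases(DETECTED, 2))
missed_cases = round(expected_cases(MISSED))
background_cases = round(TRUE_NEGATIVES * BACKGROUND_RISK)

print(detected_cases, detected_low, detected_high)
print(missed_cases, background_cases)
```

With these inputs the sketch returns 132 expected cases among detected carriers (121 to 198 across the penetrance bounds), 41 among missed carriers and 12,460 among women without a mutation.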

Potential health outcomes

Scenario 1: Analytic true positive women = 244/100,000

Positive health outcomes include reductions in breast cancer incidence and mortality for the 132 women predicted to develop breast cancer and their family members. The 112 other analytic true positive women would have the same preventive options but receive no benefit. Potential negative consequences include physical and psychological harms of chemoprevention or prophylactic surgery.

Scenario 2: Analytic false positive women = 0/100,000

There are no positive health outcomes for this group. All negative health outcomes identified for true positive women would apply, although numbers affected are likely to be extremely small.

Scenario 3: Analytic false negative women = 76/100,000

This group will have no positive health outcomes. There may be false reassurance associated with receiving a negative result. Opportunities to prevent breast cancer will be missed.

Scenario 4: Analytic true negative women (includes VUS) = 99,680/100,000

There are no direct positive or negative health outcomes for this group, although without careful consent and clear clinical feedback some may be falsely reassured that they are not at risk of breast cancer which might affect subsequent health behaviour.

Discussion

Using the example of pathogenic BRCA1 and BRCA2 mutations we demonstrate the type of process that should be undertaken when considering likely outcomes of testing for secondary genomic findings in unselected populations. We estimated that WGS would detect 75.5–77.5 % of pathogenic BRCA1 and BRCA2 mutations, with the majority of undetected mutations comprising CNVs. This is well below the 95 % sensitivity threshold recommended for clinical genetic diagnostic tests [13]. In a hypothetical UK population of 100,000 women, this would result in 244 women being identified for further interventions, potentially preventing around 132 cases of breast cancer. This would also result in unnecessary interventions for 112 women with mutations who are predicted not to develop cancer, although this is in line with current practice [10]. We note that outside the context of WGS, routine BRCA1 and BRCA2 testing is neither recommended nor advocated at population level [9, 10, 24].

Key areas of uncertainty include limitations to current knowledge of the prevalence, spectrum and penetrance of pathogenic mutations associated with a variety of hereditary diseases, including those currently recommended for routine examination on WGS by Genomics England. In hereditary breast cancer, estimates of penetrance have frequently been derived from studies conducted in multi-case families as population estimates do not exist, and it may not be appropriate to apply such estimates to unselected populations undergoing WGS. For mutations in other genes conferring a lower risk of disease, there is uncertainty about the threshold for clinical action and thus the level of absolute disease risk at which secondary findings should be fed back.

Care pathway factors should also be considered when implementing this approach. Currently WGS has sub-optimal sensitivity for detecting certain types of mutation, in particular CNVs and some indels. Concerns have been raised about potential inconsistencies between laboratories in assuring quality of data generated by WGS and its interpretation [25]. The risk of false positives, although low, would be increased if laboratories did not undertake a confirmatory step. The number of pre-symptomatic mutation carriers identified across a range of genes tested for secondary findings is unknown so there is a lack of assurance that clinical services will be able to manage the extra work volume generated. For BRCA1 and BRCA2 mutation carriers this would include enhanced radiologic surveillance, chemoprophylaxis and/or prophylactic surgery to mitigate risk. In addition the cost effectiveness of such interventions is uncertain although recent data suggest that screening of generally healthy individuals using next generation sequencing may not currently be cost effective [26].

Globally there is divergent policy around secondary genomic findings: in 2013 the American College of Medical Genetics and Genomics (ACMG) controversially recommended routine examination of 56 potentially actionable genes and types of variants whenever clinical exome or genome sequencing is undertaken [27] (although an opt out clause has since been added) [28]; the European Society of Human Genetics in contrast suggests using a targeted testing or reporting strategy where possible to minimise the risk of generating unsolicited findings [29]. In the UK the 16 genes recommended for routine examination in the 100,000 Genomes Project are based on the ACMG list plus ‘subsequent expert opinion’ [7]. In unselected populations, however, it is unclear how well this approach to secondary genomic findings allows quantification of an individual’s absolute disease risk, which is essential to making valid judgements about risks and benefits of clinical intervention.

Conclusions

In summary, we use a simple model to highlight issues that hinder the utility of actively seeking secondary findings using WGS, even for relatively well-characterised genes. Applying this method to other gene-disease combinations is likely to reveal further gaps. It is therefore imperative that robust processes are in place for managing and understanding these complex data and appreciating the levels of uncertainty around clinical validity and clinical utility of testing positive for a cancer risk gene. Detailed evaluation of developing practice and research will be essential to enable effective clinical implementation of this approach to secondary genomic findings in unselected populations.

References

  1. Luheshi L, Raza S. Clinical whole genome analysis: delivering the right diagnosis. PHG Foundation, 2014. Available from Accessed 29 April 2016.

  2. Allyse M, Michie M. Not-so-incidental findings: the ACMG recommendations on the reporting of incidental findings in clinical whole genome and whole exome sequencing. Trends Biotechnol. 2013;31(8):439–41.

  3. Rosenblatt DS. Who’s on first in exome and whole genome sequencing? Is it the patient or the incidental findings? Mol Genet Metab. 2013;110(1–2):1–2.

  4. Burke W, Matheny Antommaria AH, Bennett R, Botkin J, Clayton EW, Henderson GE, et al. Recommendations for returning genomic incidental findings? We need to talk! Genet Med. 2013;15(11):854–9.

  5. Ross LF, Rothstein MA, Clayton EW. Mandatory extended searches in all genome sequencing: “incidental findings”, patient autonomy, and shared decision making. JAMA. 2013;310(4):367–8.

  6. Wolf SM, Annas GJ, Elias S. Patient Autonomy and Incidental Findings in Clinical Genomics. Science. 2013;340(6136):1049–50.

  7. Presidential Commission for the Study of Bioethical Issues. Anticipate and communicate: ethical management of incidental and secondary findings in the clinical, research and direct-to-consumer contexts. Washington D.C: Presidential Commission for the Study of Bioethical Issues; 2013.

  8. Genomics England. The 100 000 Genomes Project protocol, 2015. Available at: Accessed 11 May 2015.

  9. Moyer VA, U.S. Preventive Services Task Force. Risk assessment, genetic counseling, and genetic testing for BRCA-related cancer in women: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2014;160:271–81.

  10. National Institute for Health and Care Excellence. Clinical Guideline 164: Familial breast cancer. 2013.

  11. Illumina TruGenome Clinical Sequencing Services Methods and Technology FAQs, 2014. Available at Accessed 11 March 2015.

  12. Rehm HL, Bale SJ, Bayrak-Toydemir P, Berg JS, Brown KK, Deignan JL, et al. ACMG clinical laboratory standards for next-generation sequencing. Genet Med. 2013;15:733–47.

  13. Ellard S, Lindsay H, Camm N, Watson C, Abbs S, Wallis Y, et al. Practice guidelines for targeted next generation sequencing analysis and interpretation, Association for Clinical Genetic Science, 2014. Available at: Accessed 11 March 2015.

  14. Hall A, Finnegan T, Alberg C. Realising Genomics in Clinical Practice. PHG Foundation. 2014. ISBN 978-1-907198-15-1.

  15. Antoniou AC, Cunningham AP, Peto J, Evans DG, Lalloo F, Narod SA, et al. The BOADICEA model of genetic susceptibility to breast and ovarian cancers: updates and extensions. Br J Cancer. 2008;98:1457–66.

  16. Petrucelli N, Daly M & Feldman G. BRCA1 and BRCA2 hereditary breast and ovarian cancer. In: Pagon RA, Adam MP, Ardinger HH, et al., editors. GeneReviews [internet]. Seattle (WA): University of Washington, Seattle, 1993–2015. Available at: Accessed 5 March 2015.

  17. Song H, Cicek MS, Dicks E, Harrington P, Ramus SJ, Cunningham JM, et al. The contribution of deleterious germline mutations in BRCA1, BRCA2 and the mismatch repair genes to ovarian cancer in the population. Hum Mol Genet. 2014;23:4703–9.

  18. Ford D, Easton DF, Stratton M, Narod S, Goldgar D, Devilee P, et al. Genetic heterogeneity and penetrance analysis of the BRCA1 and BRCA2 genes in breast cancer families. The Breast Cancer Linkage Consortium. Am J Hum Genet. 1998;62:676–89.

  19. Mavaddat N, Peock S, Frost D, Ellis S, Platte R, Fineberg E, et al. Cancer Risks for BRCA1 and BRCA2 Mutation Carriers: Results from Prospective Analysis of EMBRACE. J Natl Cancer Inst. 2013;105:812–22.

  20. Chen S, Parmigiani G. Meta-Analysis of BRCA1 and BRCA2 Penetrance. J Clin Oncol. 2007;25:1329–33.

  21. Antoniou A, Pharoah PDP, Narod S, Risch HA, Eyfjord JE, Hopper JL, et al. Average risks of breast and ovarian cancer associated with BRCA1 and BRCA2 mutations detected in case series unselected for family history: a combined analysis of 22 studies. Am J Hum Genet. 2003;72:1117–30.

  22. Van Der Kolk D, De Bock GH, Leegte BK, Schaapveld M, Mourits MJE, De Vries J, et al. Penetrance of breast cancer, ovarian cancer and contralateral breast cancer in BRCA1 and BRCA2 families: high cancer incidence at older age. Breast Cancer Res Treat. 2010;124:643–51.

  23. Cancer Research UK website: Breast Cancer Key Stats, 2014. Available at: Accessed 5 March 2015.

  24. Palomaki GE. Is it time for BRCA1/2 mutation screening in the general adult population?: impact of population characteristics. Genet Med. 2015;17:24–6.

  25. Dewey FE, Grove ME, Pan C, Goldstein BA, Bernstein JA, Chaib H, et al. Clinical Interpretation and Implications of Whole-Genome Sequencing. JAMA. 2014;311:1035–44.

  26. Bennette CS, Gallego CJ, Burke W, Jarvik GP, Veenstra DL. The cost-effectiveness of returning incidental findings from next-generation genomic sequencing. Genet Med 2014 Nov 13. doi:10.1038/gim.2014.156 [Epub ahead of print].

  27. Green RC, Berg JS, Grody WW, Kalia SS, Korf BR, Martin CL, et al. ACMG recommendation for reporting of incidental findings in clinical exome and genome sequencing. Genet Med. 2013;15:565–74.

  28. American College of Medical Genetics and Genomics. Incidental findings in clinical genomics: a clarification. Genet Med. 2013;15:664–6.

  29. Van El CG, Cornel MC, Borry P, Hastings RJ, Fellmann F, Hodgson SV, et al. Whole-genome sequencing in health care. Eur J Hum Genet. 2013;21:S1–5.


Acknowledgements

We thank members of the PHG Foundation for helpful comments and discussion on drafts of this work.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Charlotte Warren-Gash.

Additional information

Competing interests

All authors declare that they have no competing interests.

Authors’ contributions

CWG reviewed literature, developed models, analysed data and wrote the manuscript. MK, HB and PP conceived the study, provided academic supervision and critically revised the manuscript. All authors read and approved the final version.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Warren-Gash, C., Kroese, M., Burton, H. et al. Implications of using whole genome sequencing to test unselected populations for high risk breast cancer genes: a modelling study. Hered Cancer Clin Pract 14, 12 (2016).
