Genetic variants of prospectively demonstrated phenocopies in BRCA1/2 kindreds

Background In kindreds carrying path_BRCA1/2 variants, some women will develop cancer despite testing negative for the family’s pathogenic variant. These families may have additional genetic variants, which may not only increase the susceptibility associated with the families’ path_BRCA1/2 variants, but also be capable of causing cancer in the absence of the path_BRCA1/2 variants. We aimed to identify novel genetic variants in prospectively detected breast cancer (BC) or gynecological cancer cases who had tested negative for their families’ pathogenic BRCA1/2 variant (path_BRCA1 or path_BRCA2). Methods Women with BC or gynecological cancer who had tested negative for path_BRCA1 or path_BRCA2 variants were included. Forty-four cancer susceptibility genes were screened for genetic variation through a targeted amplicon-based sequencing assay. Protein- and RNA splicing-dedicated in silico analyses were performed for all variants of unknown significance (VUS). Variants predicted as the most likely to affect pre-mRNA splicing were experimentally analyzed in a minigene assay. Results We identified 48 women who had tested negative for their family’s path_BRCA1 (n = 13) or path_BRCA2 (n = 35) variants. Pathogenic variants in the ATM, BRCA2, MSH6 and MUTYH genes were found in 10% (5/48) of the cases, of whom 15% (2/13) were from path_BRCA1 and 9% (3/35) from path_BRCA2 families. Out of the 26 unique VUS, 3 (12%) were predicted to affect RNA splicing (APC c.721G > A, MAP3K1 c.764A > G and MSH2 c.815C > T). However, by using a minigene assay, we here show that APC c.721G > A does not cause a splicing defect, similarly to what has been recently reported for MAP3K1 c.764A > G. MSH2 c.815C > T was previously described as causing partial exon skipping, and it was identified in this work together with the path_BRCA2 c.9382C > T (p.R3128X).
Conclusion All women in breast or breast/ovarian cancer kindreds would benefit from being offered genetic testing irrespective of which causative genetic variants have been demonstrated in their relatives. Electronic supplementary material The online version of this article (10.1186/s13053-018-0086-0) contains supplementary material, which is available to authorized users.


Background
Breast cancer (BC) is one of the most common human malignancies, accounting for 22% of all cancers in women worldwide [1]. A significant proportion of BC cases can be explained by hereditary predisposition, and approximately 30% of this hereditary cancer risk is explained by the currently known high-penetrance susceptibility genes [2][3][4][5]. Notably, carriers of pathogenic BRCA1 or BRCA2 variants (path_BRCA1 or path_BRCA2) have an increased risk of developing BC (average lifetime risk of 35-85%) and ovarian cancer (average lifetime risk of 11-39%). Further, pathogenic variants of ATM, CHEK2, PALB2, NBS1 and RAD50 have been found to confer a two- to five-fold increased risk of developing BC [1,6]. It is also known that pathogenic variants in TP53, PTEN, STK11 and CDH1, resulting in Li-Fraumeni syndrome, Cowden syndrome, Peutz-Jeghers syndrome and hereditary diffuse gastric cancer, respectively, are associated with a high lifetime risk (> 40%) of BC. Moreover, pathogenic variants in RAD51 paralogs, e.g., RAD51C, confer an increased risk of ovarian cancer [7]. The frequency of pathogenic variants in BC-associated genes varies significantly among different populations, as exemplified by the frequently studied founder pathogenic variant c.1100delC in CHEK2 [6].
The identification of path_BRCA1 or path_BRCA2 in an affected BC individual enables access to evidence-based screening for family members, and thus facilitates the implementation of appropriate cancer prevention in these families [1,5,6]. However, some women in families with an identified pathogenic variant will develop cancer despite testing negative for the family's pathogenic variant; such cases are often denoted phenocopies [8]. In BC kindreds having a demonstrated path_BRCA2 variant, phenocopies are reportedly more frequent than expected by chance [8][9][10]. It has been proposed that these families may have additional genetic variants, which may not only increase the susceptibility associated with the families' path_BRCA1/2 variants, but also be capable of causing cancer in the absence of the path_BRCA1/2 variants demonstrated in the families [5][6][7].
Genetic counselling for women who do not carry the path_BRCA1/2 variants of their relatives is currently challenging, yet recognizing these women is crucial for applying appropriate diagnostic and therapeutic approaches in these families. To discover additional inherited disease-causing variants in path_BRCA1/2 kindreds, we examined all prospectively detected BC or gynecological cancer cases in these kindreds by next-generation sequencing (NGS) using a panel of 44 cancer susceptibility genes. All detected variants were analyzed by RNA splicing- and protein-dedicated in silico methods. Variants predicted as the most likely to affect splicing were experimentally analyzed by using a cell-based minigene splicing assay.

Study population
For more than 20 years, we (the Hereditary Cancer Biobank from the Norwegian Radium Hospital, Norway; and the Department of Genomic Medicine from the University of Manchester, United Kingdom) have ascertained BC and breast/ovarian cancer kindreds by family history. The sisters and daughters of cancer patients were initially subjected to follow-up by annual mammography and gynecological examinations as appropriate at that time, and later they were all subjected to genetic testing [11].
Both collaborating outpatient genetic centers together identified 48 women with prospectively detected BC or gynecological cancer at follow-up who had tested negative for their respective families' path_BRCA1/2 variants. Clinical data were obtained from pathology reports and clinical files.
Ethical approval for the prospective study was granted by the Norwegian Data Inspectorate and Ethical Review Board (ref 2015/2382). All examined patients had signed an informed consent for their participation in the study.

Targeted sequencing
Genomic DNA was isolated from peripheral blood samples and targeted sequencing was carried out using a TruSeq amplicon-based assay v.1.5 on a MiSeq apparatus, as previously described [12]. The 44-gene panel used in this work includes genes associated with cancer predisposition, as described in a prior study [12].

Sequencing data analysis
Paired-end sequence reads were aligned to the human reference genome (build GRCh37) using the BWA-mem algorithm (v.0.7.8-r55) [13]. The initial sequence alignments were converted to BAM format and subsequently sorted and indexed with SAMtools (v.1.1) [13]. Genotyping of single nucleotide variants (SNV) and short indels was performed by GATK's HaplotypeCaller. Filtering of raw genotype calls and assessment of callable regions/loci were done according to GATK's best practice procedures, as described in more detail previously [12].
Variants were annotated using ANNOVAR (version November 2015) [14] and were queried against a range of variant databases and protein resources (v29, December 2015), as previously described [12].

Validation by cycling temperature capillary electrophoresis
The pathogenic variants identified in this study were validated by cycling temperature capillary electrophoresis. The method is based on allele separation by cooperative melting equilibrium while cycling the temperature surrounding capillaries [15]. This approach has previously been described and extensively used to detect somatic mutations and single nucleotide polymorphisms (SNPs) [16][17][18][19]. The amplicon design was performed by the variant melting profile tool (https://hyperbrowser.uio.no/hb/?tool_id=hb_variant_melting_profiles/) [20]. Primer sequences, PCR reaction conditions and electrophoresis settings are described in Additional file 1.

Genetic variants nomenclature and classification
The nomenclature guidelines of the Human Genome Variation Society (HGVS) were used to describe the detected genetic variants [21]. The recurrence of the identified variants was established by interrogating six databases (in their latest releases as of November 2016): Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA), Breast Cancer Information Core Database (BIC), the International Society of Gastrointestinal Hereditary Tumors (InSiGHT) Database, the Leiden Open Variation Database (LOVD), ClinVar, and the Human Gene Mutation Database (HGMD).
Novel variants were considered pathogenic if any one of the following criteria was met: a) the variant introduced a premature stop codon in the protein sequence (nonsense or frameshift); b) it occurred at positions + 1/+ 2 or − 1/− 2 of donor or acceptor splice sites, respectively; or c) it represented a whole-exon deletion or duplication.
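The three criteria above amount to a simple decision rule. The Python sketch below is illustrative only: the `Variant` record, its field names and the `is_presumed_pathogenic` helper are hypothetical and are not part of the pipeline used in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical minimal record for a novel variant (illustrative only)."""
    consequence: str                     # e.g. "nonsense", "frameshift", "missense"
    splice_offset: Optional[int] = None  # intronic position: +1/+2 (donor), -1/-2 (acceptor)
    whole_exon_cnv: bool = False         # whole-exon deletion or duplication


def is_presumed_pathogenic(v: Variant) -> bool:
    # a) premature stop codon (nonsense or frameshift)
    if v.consequence in {"nonsense", "frameshift"}:
        return True
    # b) canonical splice-site positions +1/+2 or -1/-2
    if v.splice_offset in {1, 2, -1, -2}:
        return True
    # c) whole-exon deletion or duplication
    return v.whole_exon_cnv
```

Under these assumptions, `is_presumed_pathogenic(Variant("nonsense"))` is true, whereas a missense variant away from the canonical splice positions is not classified by this rule and would instead be handled as a VUS.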

In silico analyses of VUS
Two types of bioinformatics methods were used to predict the impact of selected variants on RNA splicing. First, we used MaxEntScan (MES) and SSF-like (SSFL) to predict variant-induced alterations in 3′ and 5′ splice site strength, as described by Houdayer et al. [22], except that here both algorithms were interrogated by using the integrated software tool Alamut Batch version 1.5 (Interactive Biosoftware, http://www.interactive-biosoftware.com). For prediction of variant-induced impact on exonic splicing regulatory elements (ESR), we resorted to ΔtESRseq- [23], ΔHZei- [24], and SPANR-based [25] approaches, as described by Soukarieh et al. [26]. Score differences (Δ) between variant and wild-type (WT) cases were taken as proxies for assessing the probability of a splicing defect. More precisely, we considered that a variant mapping at a splice site was likely to negatively impact exon inclusion if ΔMES ≥ 15% and ΔSSFL ≥ 5% [22], whereas an exonic variant located outside the splice sites was considered a probable inducer of exon skipping if negative Δ scores (below the thresholds described below) were provided by all three ESR-dedicated in silico tools. We chose the following thresholds: < − 0.5 for ΔtESRseq-, < − 10 for ΔHZei-, and < − 0.2 for SPANR-based scores. In addition, we evaluated the possibility of variant-induced de novo splice sites by taking into consideration local changes in MES and SSFL scores. In this case, we considered that variants located outside the splice sites were likely to create a competing splice site if local MES scores were equal to or greater than those of the corresponding reference splice site for the same exon.
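The thresholds above can be summarized as three independent checks. The following Python sketch is a minimal illustration under stated assumptions: the `SpliceScores` container and the function names are hypothetical, ΔMES and ΔSSFL are taken as percentage decreases relative to WT, and the scores themselves are assumed to have been computed beforehand by the respective tools.

```python
from dataclasses import dataclass


@dataclass
class SpliceScores:
    """Hypothetical per-variant in silico scores (illustrative only)."""
    at_splice_site: bool
    delta_mes_pct: float = 0.0   # assumed: % decrease in MES score vs. WT
    delta_ssfl_pct: float = 0.0  # assumed: % decrease in SSFL score vs. WT
    delta_tesrseq: float = 0.0   # ΔtESRseq-based score
    delta_hzei: float = 0.0      # ΔHZei-based score
    spanr: float = 0.0           # SPANR-based score


def weakens_splice_site(s: SpliceScores) -> bool:
    # Splice-site variant: ΔMES >= 15% and ΔSSFL >= 5%
    return s.at_splice_site and s.delta_mes_pct >= 15 and s.delta_ssfl_pct >= 5


def probable_exon_skipping(s: SpliceScores) -> bool:
    # Exonic variant outside the splice sites: all three ESR tools below threshold
    return (not s.at_splice_site
            and s.delta_tesrseq < -0.5
            and s.delta_hzei < -10
            and s.spanr < -0.2)


def may_create_competing_site(local_mes: float, reference_mes: float,
                              at_splice_site: bool) -> bool:
    # De novo site: local MES equal to or greater than the reference splice site
    return (not at_splice_site) and local_mes >= reference_mes
```

For example, with made-up scores, `probable_exon_skipping(SpliceScores(False, delta_tesrseq=-0.8, delta_hzei=-15.0, spanr=-0.3))` is true, while the same variant with `spanr=-0.1` fails the check because all three ESR-dedicated tools must agree.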

Cell-based minigene splicing assays
In order to determine the impact of APC c.721G > A on RNA splicing, we performed functional assays based on the comparative analysis of the splicing pattern of WT and mutant reporter minigenes [27], as follows. First, the genomic region containing APC exon 7 and at least 150 nucleotides of the flanking introns (c.646-169 to c.729 + 247) was amplified by PCR using patient #12470 DNA as template and the primers indicated in Additional file 2. Next, the PCR-amplified fragments were inserted into a previously linearized pCAS2 vector [26] to generate the pCAS2-APC exon 7 WT and c.721G > A minigenes. All constructs were sequenced to ensure that no unwanted mutations had been introduced into the inserted fragments during PCR or cloning. Then, WT and mutant minigenes were transfected in parallel into HeLa cells grown in 12-well plates (at 70% confluence) using the FuGENE 6 transfection reagent (Roche Applied Science). Twenty-four hours later, total RNA was extracted using the NucleoSpin RNA II kit (Macherey Nagel), and the minigene transcripts were analyzed by semi-quantitative RT-PCR using the OneStep RT-PCR kit (QIAGEN), as previously described [26]. The sequences of the RT-PCR primers are shown in Additional file 2. Then, RT-PCR products were separated by electrophoresis on 2.5% agarose gel containing EtBr and visualized by exposure to UV light under saturating conditions using the Gel Doc XR image acquisition system (Bio-Rad), followed by gel-purification and Sanger sequencing for proper identification of the minigenes' transcripts. Finally, splicing events were quantitated by performing equivalent fluorescent RT-PCR reactions followed by capillary electrophoresis on an automated sequencer (Applied Biosystems), and computational analysis by using the GeneMapper v5.0 software (Applied Biosystems).

Family history and clinical characteristics
In total, we identified 48 cases. Of these, 18 BC or gynecological cancer patients who did not carry their respective families' path_BRCA1 or path_BRCA2 variants (n = 13 and n = 5, respectively) came from the Hereditary Cancer Biobank from the Norwegian Radium Hospital, while the Department of Genomic Medicine from the University of Manchester identified a total of 30 BC patients, all non-carriers of their families' path_BRCA2 variants (Fig. 1). The median age at first cancer diagnosis was 53.5 years (range 31-79 years). BC was the most frequent diagnosis (92%), followed by ovarian cancer (4%) and endometrial and cervical cancer (2% each) (Table 1).
Interestingly, one case with a familial path_BRCA2 variant (c.6591_6592delTG) was found to carry another pathogenic variant in the same gene (BRCA2 c.9382C > T, p.Arg3128Ter), which introduces a premature stop at codon 3128 and is known to be a high-risk pathogenic variant (Table 1).
The pathogenic variants in BC-related genes (2 in ATM and 1 in BRCA2) were found in 3 women with BC or ovarian cancer, while the MSH6 variant and the heterozygous MUTYH p.Gly393Asp pathogenic variant were found in women diagnosed with endometrial cancer at 57 years and BC at 56 years, respectively (Table 1).

Validation of the cancer gene panel output
The presence of the five pathogenic variants detected by targeted NGS was confirmed by cycling temperature capillary electrophoresis, showing 100% correspondence between both methods.

Variants of unknown significance (VUS) and predicted protein alterations
In total, we found 26 unique VUS in 30 out of 48 patients (63%). Common polymorphisms (with an allele frequency ≥ 1% in the general population according to the ExAC database) and benign variants classified according to either ClinVar or the American College of Medical Genetics and Genomics (ACMG) guidelines were excluded from further analyses [41,58].
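The exclusion step described above can be expressed as a small filter. This Python sketch is illustrative only: the function name, its arguments and the `BENIGN_LABELS` set are hypothetical, and the allele frequency is assumed to be given as a fraction (so that 1% corresponds to 0.01).

```python
# Assumed label set; whether "likely benign" calls were also excluded is not
# stated in the text, so only "benign" is used here.
BENIGN_LABELS = {"benign"}


def keep_for_vus_analysis(allele_frequency, classification):
    """Return True if a variant is retained for further VUS analysis.

    allele_frequency: population allele frequency as a fraction, or None if absent.
    classification: ClinVar/ACMG-style label, or None if unclassified.
    """
    is_common = allele_frequency is not None and allele_frequency >= 0.01
    is_benign = classification is not None and classification.lower() in BENIGN_LABELS
    return not (is_common or is_benign)
```

With this rule, a variant with an allele frequency of 0.0005 and no benign classification is retained, while any variant at or above 0.01 is dropped regardless of its classification.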
The VUS were furthermore analyzed by using six in silico protein prediction tools with different underlying algorithms (Fig. 2). The [...] five out of six predictions suggested a potentially damaging effect (Fig. 2). Discrepancies in protein-related predictions were even more pronounced for the variants in APC, AXIN2, RAD51B, DVL2, RAD51D, CDH1 and MSH2 c.2164G > A. In contrast, none of the six prediction tools showed deleterious effects for the detected variants in the AXIN2, ATM, RAD51B and MAP3K1 genes (AXIN2 [...]) (Fig. 2).

Splicing-dedicated in silico analysis and minigene splicing assays
Out of the 26 unique VUS, two (APC c.721G > A and MAP3K1 c.764A > G) were bioinformatically predicted as the most likely to affect RNA splicing, either by potentially creating a new splice site or by altering putative exonic splicing regulatory elements, respectively (Table 2). Given that RNA data were not available for APC c.721G > A, we set out to experimentally evaluate the impact of this variant on RNA splicing by performing a cell-based minigene splicing assay. As shown in Fig. 3, we observed that c.721G > A did not affect the splicing pattern of APC exon 7 in our system. These results are reminiscent of those recently obtained for MAP3K1 c.764A > G by using a similar splicing assay, in which the variant did not cause an alteration in the minigene's splicing pattern (Dominguez-Valentin et al. under submission). It would be important in both cases to validate the minigene results by analyzing RNA from the variant carriers/patients as compared to RNA from healthy controls. However, we do not have such material in our biobank. To our knowledge, the only other VUS from our list for which RNA data are available is MSH2 c.815C > T (p.Ala272Val). Previous results from different minigene assays revealed that, albeit located outside the splice sites, MSH2 c.815C > T induces partial skipping of exon 5 [28]. These results agree, at least in part, with those obtained by analyzing RNA from a Lynch syndrome (LS) patient carrying this same variant [29]. Indeed, the latter study revealed aberrantly spliced MSH2 transcripts associated with the presence of c.815C > T, but the severity of the splicing defect was not addressed at the time. Of note, here we identified MSH2 c.815C > T together with another VUS (DVL2 c.596 T > C) and a path_BRCA2 c.9382C > T (different from the familial path_BRCA2) in a patient diagnosed with ductal carcinoma at 44 years of age (Patient 1100948) (Table 1).

Discussion
Among prospectively detected BC or gynecological cancer phenocopies in the path_BRCA1/2 families, we found that 4/48 have pathogenic variants in high-penetrance cancer genes: two BC-associated genes and one colorectal cancer (CRC)-associated gene (ATM, BRCA2 and MSH6, respectively). Our findings are in line with a previous study, which detected a likely pathogenic variant in a gene other than BRCA1/2 in a BC patient, i.e. MSH6 c.3848_3862del (p.(Ile1283_Tyr1287del)) [30]. In addition, we found the MUTYH c.1178G > A (p.Gly393Asp) variant in a BC case, which is one of the most common path_MUTYH variants. Pathogenic MUTYH variants may cause a recessively inherited colon cancer syndrome. Whether or not individuals who are heterozygous for MUTYH mutations may be at risk for cancer is debated [31]. Among the five cases found to carry pathogenic variants, 2/13 were identified from families with path_BRCA1 and 3/35 from families with path_BRCA2 variants.
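As a quick arithmetic check, the carrier proportions quoted for the whole series and for each family type can be recomputed from the counts given in the text. The Python snippet below is only an illustration; the dictionary labels and the `percent` helper are hypothetical names.

```python
# Carriers of pathogenic variants / women tested, as reported in the text
counts = {
    "all cases": (5, 48),
    "path_BRCA1 families": (2, 13),
    "path_BRCA2 families": (3, 35),
}


def percent(carriers, total):
    """Proportion of carriers as a whole-number percentage."""
    return round(100 * carriers / total)


summary = {label: percent(c, n) for label, (c, n) in counts.items()}
print(summary)
# {'all cases': 10, 'path_BRCA1 families': 15, 'path_BRCA2 families': 9}
```

The rounded values agree with the 10%, 15% and 9% reported for the series.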
Our results are in concordance with recently published NGS panel studies, which have demonstrated that besides high-risk genes, like BRCA1/2 and MMR genes, other genes may also contribute to familial cancer predisposition, thus providing a broader picture of the genetic heterogeneity of cancer syndromes [25,32,33]. In this regard, a molecular diagnostic yield of approximately 9% for pathogenic or likely pathogenic variants has been reported in BC, with yields of 13% in ovarian and 15% in colon/stomach cancer cases [25]. On the other hand, family history is currently used to identify high-risk patients. However, the use of family history fails to identify women without close female relatives who are carriers of pathogenic variants [9].
Despite the potential of NGS to identify genetic causes among families that tested negative for pathogenic variants in high-risk genes using traditional methods [25,32,33], a high number of VUS are also detected and constitute a major challenge in oncogenetics [34]. In this study, we subjected 26 VUS to RNA splicing and protein in silico evaluations, and the bioinformatics predictions indicated that two VUS (APC c.721G > A and MAP3K1 c.764A > G) were likely to affect RNA splicing. Our results from minigene splicing assays suggest, however, that this is not the case. Complementary analysis of patients' RNA will be important to verify the impact of these variants on splicing in vivo. Of note, none of the six protein in silico prediction tools showed a deleterious effect for the MAP3K1 c.764A > G missense variant, and inconsistencies were found for the APC c.721G > A variant. Bioinformatics prediction tools are widely used to aid the biological and clinical interpretation of sequence variants, although it is well recognized that they have their limitations. Co-segregation studies for further evaluation will be key for understanding whether some of the VUS detected in this work may have a causal effect. Some of the VUS may in the future be reclassified as deleterious or benign, but in the meantime, they cannot be used to make clinical decisions [30].
A polygenic model involving a combination of multiple genomic risk factors, including the effect of low- or moderate-penetrance susceptibility alleles, may explain the increased BC risk in women who tested negative for their family's path_BRCA1/2 variants [5]. In addition, heterozygous whole gene deletions (WGD) and intragenic microdeletions have been reported to account for a significant proportion of pathogenic variants underlying cancer predisposition syndromes, although WGD were not a common mechanism in any of the three high-risk BC genes, BRCA1, BRCA2 and TP53 [35].
The clinical utility of gene panels such as the one used in this study is not yet fully established and the appropriate routes for clinical deployment of such tests remain under discussion [36]. So far, the large patient datasets generated by NGS panels may be used to explore the specific penetrance of the genes included in these panels, and to assess the performance and implications of the use of NGS in clinical diagnostics [34].

Conclusions
In kindreds carrying path_BRCA1/2 variants, testing only for the already known path_BRCA1/2 variants in the family may not be sufficient to exclude an increased risk of BC, ovarian cancer or other cancers in the healthy female relatives. Our findings suggest that all women in BC or breast/ovarian cancer kindreds would benefit from being offered genetic testing irrespective of which causative genetic variants have been demonstrated in their relatives. In addition, we found a number of VUS in genes other than BRCA1/2, i.e., AXIN2, APC, DVL2, MAP3K1, RAD51B, NBN, POLE, CDH1, CDX2, MRE11A, MUTYH, NOTCH3, PTEN and RAD51D. All these may be suspected of being associated with cancer in the families studied and may be considered as candidates for inclusion in future gene panel testing to better understand why some families present aggregation of cancer cases.