Gene Expression Profiling in Hereditary, BRCA1-linked Breast Cancer: Preliminary Report

Global analysis of gene expression by DNA microarrays is nowadays a widely used tool, especially relevant for cancer research. It aids understanding of the complex biology of cancer tissue, allows identification of novel molecular markers, and reveals previously unknown molecular subtypes of cancer that differ in clinical features such as drug susceptibility or general prognosis. Our aim was to compare gene expression profiles in breast cancers that develop against a background of inherited predisposing mutations versus sporadic breast cancers. In this preliminary study we analysed seven hereditary, BRCA1 mutation-linked breast cancer tissues and seven sporadic cases that were carefully matched by histopathology and ER status. Additionally, we analysed six samples of normal breast tissue. We found that while the difference in gene expression profiles between tumour tissue and normal breast can be easily recognized by unsupervised algorithms, the difference between the two types of tumours is more subtle. However, by supervised methods of data analysis, we were able to select a set of genes that may differentiate between hereditary and sporadic tumours. The most significant differences concern genes that code for proteins engaged in regulation of transcription, cellular metabolism, signalling, proliferation and cell death. Microarray results for chosen genes (TOB1, SEPHS2) were validated by real-time RT-PCR.

Introduction
DNA microarrays have recently been widely employed in studies on breast cancer, encompassing research on breast cancer cell lines and resected tumour tissues. Two directions of these studies seem to be especially spectacular and promising: the studies of the Norwegian/Stanford group that led to the recognition of several distinct molecular classes of breast cancers [1,2] and the studies of a group at the Netherlands Cancer Institute, which brought identification of a 70-gene prognostic profile for patients with node-negative breast cancer [3,4]. The results of those studies indicate that gene expression analysis by DNA microarrays may help the understanding of the molecular background underlying development and progression of breast cancer as well as providing a clinically useful tool for more personalized treatment [reviewed in: 5]. It seems clear that a multi-gene approach will prove more useful and informative than currently used analysis of single markers.
Since the identification of two major predisposing genes, BRCA1 and BRCA2, and the broad application of genetic testing, significant numbers of mutation carriers have been identified worldwide among breast cancer patients. This allowed further studies estimating the clinical features of those specific breast cancer cases. Evidence is accumulating that mutation-linked breast cancer may be a clinically distinct entity from the majority of malignant breast tumours. Among the characteristics of BRCA1 tumours are: earlier age of manifestation, high tumour grade, low oestrogen receptor content and elevated lymphocyte infiltration. In addition, these cases are often characterized by high proliferative activity, resulting in tumours with pushing margins and high mitotic index [6][7][8][9][10][11][12]. The data concerning survival in BRCA1 mutation carriers are conflicting. There are intriguing observations that despite adverse prognostic indications, patients with BRCA1 mutations have survival rates similar to or even better than patients with sporadic breast cancer [13][14][15, own unpublished data]. The long-term aim of our study is to elucidate the molecular basis underlying the described discrepancies by comparing gene expression profiles of BRCA1-associated hereditary breast cancer and sporadic breast cancer cases. The first attempt to compare hereditary versus sporadic breast cancers by DNA microarray analysis was published by Hedenfalk et al., who used cDNA microarrays containing 6512 cDNA clones [16]. In our study we used the HG U133 Plus 2.0 Gene Chip (Affymetrix), allowing detection of over 47,000 transcripts. We also attempted to make a more careful selection of tumour specimens, which were chosen exclusively from among ER(-) cases. Our group of tumours was also more uniform with respect to histopathology; only ductal carcinomas and medullary carcinomas, all grade 3, were analysed.
Materials and methods
Tissue samples. Frozen surgical specimens of breast cancer and adjacent normal breast tissue were obtained from the Pomeranian Medical Academy in Szczecin. Only tissues from patients without preceding chemotherapy were used for microarray experiments. For this initial study we chose seven cancer tissues from women with germline mutation in the BRCA1 gene and seven samples of sporadic breast cancer. Three cases had mutation C61G in exon 5, one had 4153delA in exon 11, and three harboured the 5382insC mutation in exon 20. Sporadic cases were obtained from women without a family history of breast/ovarian cancer, in whom, additionally, the three most common BRCA1 mutations in Poland were excluded by genetic tests. Eight cases were diagnosed as grade 3 medullary or atypical medullary carcinoma, and 5 cases were grade 3 ductal carcinomas. All tumours were ER negative (immunohistochemistry on paraffin-embedded material). The percentage of cancer cells within tumour specimens was estimated by a pathologist; in the majority of samples it ranged from 70% to 90%, while in 3 samples it was approx. 50%. In addition, we analysed six samples of unchanged glandular tissue surrounding the tumour and obtained during mastectomy. The lack of tumour samples from BRCA2 mutation carriers in our study reflects the specificity of the mutational spectrum in BRCA genes in Central-Eastern Europe, where BRCA2 mutations are very rare (it was estimated by sequencing that BRCA2 mutations account for only about 5% of all BRCA1 and BRCA2 mutations found in Polish families with a strong familial history of breast/ovarian cancer [17]). In total we analysed 14 tumour samples and 6 normal breast samples.
RNA isolation. 20-40 mg of frozen tissue was placed in a lysing solution (4M guanidine thiocyanate, 25 mM sodium citrate, 0.5% sodium N-laurylsarcosinate, 0.1M β-mercaptoethanol) and homogenized with Lysing Matrix D in a FastPrep instrument (QBioGene). Total RNA was extracted from the supernatant according to [18]. RNA cleanup and simultaneous on-column digestion of DNA traces with DNase I (Qiagen) were done using the RNeasy Mini Kit (Qiagen), according to the manufacturer's instructions. RNA quantity was estimated with the ND-1000 Spectrophotometer (NanoDrop Technologies). RNA quality was controlled by microcapillary electrophoresis measurements in the Agilent 2100 Bioanalyzer using the RNA 6000 Nano LabChip Kit and analysed with RNA Integrity Number software (Agilent).
Oligonucleotide microarrays. We used HG U133 Plus 2.0 Gene Chip oligonucleotide arrays (Affymetrix). The hybridization target was prepared according to the recommendations of the microarray manufacturer. Briefly, 8 µg of total RNA was used for synthesis of double-stranded cDNA, and half the volume of cDNA was used for synthesis of biotinylated cRNA with the BioArray High Yield RNA Transcript Labeling Kit (Enzo Diagnostics). Both cDNA and cRNA were purified with the Gene Chip Sample Cleanup Module (Affymetrix). 16 µg of cRNA was fragmented and hybridized to the microarray for 16 h at 45°C. After washing and staining, microarrays were scanned with the GeneChip Scanner 3000 (Affymetrix).
Statistical analysis of microarray data. Data were obtained using GCOS 1.2 software (Affymetrix). The preprocessing was performed by Robust Multi-Array Analysis (RMA). Hierarchical clustering, Principal Component Analysis (PCA) and supervised comparisons were carried out using GeneSpring 7.2 software (Silicon Genetics). For selection of genes differentially expressed between breast cancer and normal breast tissue we used the parametric Welch test. The False Discovery Rate was estimated by the Benjamini-Hochberg algorithm. For selection of genes differentially expressed between hereditary and sporadic breast cancer we used the Bioconductor limma package, based on linear models with an empirical Bayesian approach. This method provides stable results even when the number of analysed arrays is small.
Quantitative RT-PCR. Quantitative RT-PCR analysis was performed using the ABI 7700 Sequence Detection System and dedicated software (Applied Biosystems). The reactions were performed using the MasterAmp Real-Time RT-PCR Kit (Epicentre), according to the manufacturer's recommendations. Primers for the SYBR Green system were designed using Primer3 online software (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). All results were normalised to the expression of the reference gene, eukaryotic translation initiation factor 4 gamma 2 (EIF4G2), which appeared to be equally transcribed in all tissues analysed by microarrays. Primer specificity was verified by sequencing of selected RT-PCR products for each gene. Sequences of the PCR primer pairs used for each gene are shown in Table 1.
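The normalisation of RT-PCR results to a reference gene can be illustrated with a minimal Python sketch. It assumes that the instrument reports threshold cycle (Ct) values and that both genes amplify with the same efficiency (perfect doubling by default); the text does not state which quantification formula was used, so the function name, the parameters and the Ct values below are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Expression of a target gene relative to a reference gene.

    Assumption: both genes amplify with the same efficiency
    (2.0 means perfect doubling of product in every cycle).
    """
    # A lower Ct means more template, so the sign of the difference is inverted.
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical Ct values for one sample (illustration only).
ct_tob1 = 24.0    # target gene, e.g. TOB1
ct_eif4g2 = 22.0  # reference gene, EIF4G2
tob1_relative = relative_expression(ct_tob1, ct_eif4g2)
```

Dividing the normalised values of two samples would then give a fold change between them, which is one possible way of comparing RT-PCR results with microarray results.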

Results
Unsupervised analysis of obtained data set. We performed Principal Component Analysis (PCA) to determine the major sources of variability in our data (Fig. 1). PCA is an unsupervised algorithm which, if performed 'on conditions', is able to detect intrinsic similarities and differences in the gene expression profiles of analysed samples. Results may be presented graphically, and the distances between the dots visualize the level of similarity/dissimilarity between particular samples. It can be seen that the difference in gene expression profile between breast cancer tissue and normal breast tissue is large and is easily recognized by that unsupervised algorithm (the normal-tumour difference was responsible for the sample subdivision by the first component, which accounted for 24.06% of total variance). Using a supervised method of data analysis we found that this precise separation of cancer versus normal tissues in PCA may be ascribed to the differential expression of 8,063 genes (Welch test, Benjamini-Hochberg correction for multiple comparisons, FDR<0.05). In contrast, the difference between hereditary and sporadic breast cancer samples cannot be recognized by PCA, suggesting that the difference in gene expression profile between those two types of breast cancer is not very pronounced. Samples obtained from normal glandular breast tissue clustered closely together, while tumours were much more dispersed. By unsupervised analysis we were unable to disclose any differences between hereditary and sporadic tumours; they were also not visible when only cancer samples were subjected to decomposition into principal components (data not shown).
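The Benjamini-Hochberg correction applied to the Welch test results in the tumour versus normal comparison can be sketched in a few lines of Python. This is a generic illustration of the step-up procedure and not the GeneSpring implementation; the function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_at_fdr(p_values, fdr=0.05):
    """Indices of tests whose adjusted p-value is below the FDR threshold."""
    q_values = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(q_values) if q < fdr]


# Hypothetical p-values for five probe sets (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
example_q = benjamini_hochberg(example_p)
selected = select_at_fdr(example_p)
```

In this toy input the third and fourth probe sets have raw p-values below 0.05 but are not selected after correction, which shows why the number of genes reported at FDR<0.05 is smaller than the number passing an uncorrected threshold.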
Genes differentiating between hereditary and sporadic breast tumours. As the unsupervised method showed that the distance between hereditary and sporadic tumours is not large, and taking into account the limited number of samples in our analysis, we chose a supervised algorithm with a good balance of sensitivity and specificity. We used the limma package [19], a Bayesian method based on linear modelling with a moderated t-statistic.
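The idea behind the moderated t-statistic can be sketched for a single gene in a two-group comparison. The sketch below is in Python rather than R and is not the limma code: limma estimates the prior degrees of freedom and the prior variance from all genes on the array, whereas here they are simply passed in as given (hypothetical) values, and the expression values are invented.

```python
import math


def moderated_t(group_a, group_b, prior_df, prior_var):
    """Moderated t-statistic for one gene in a two-group comparison.

    Assumption: prior_df and prior_var are supplied by the caller;
    an empirical Bayes method would estimate them from all genes.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled residual variance of the two-group linear model.
    resid_df = n_a + n_b - 2
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    sample_var = ss / resid_df
    # Shrink the gene-wise variance towards the prior variance.
    post_var = (prior_df * prior_var + resid_df * sample_var) / (prior_df + resid_df)
    # Standard error of the difference between the group means.
    std_err = math.sqrt(post_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, prior_df + resid_df


# Hypothetical log2 expression values for one probe set.
hereditary = [8.0, 8.2, 7.9]
sporadic = [7.0, 7.1, 6.9]
t_plain, df_plain = moderated_t(hereditary, sporadic, prior_df=0, prior_var=1.0)
t_mod, df_mod = moderated_t(hereditary, sporadic, prior_df=4, prior_var=0.1)
```

With prior_df set to zero the function reduces to the ordinary pooled-variance t-statistic. With a non-zero prior, a gene whose sample variance happens to be very small no longer receives an inflated statistic, which is the property that makes this approach attractive when only a few arrays per group are available.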
We selected the 100 genes best differentiating between the two analysed groups, ranked according to the statistical significance of the expression change (Table 2). About 3.5 times as many probe sets were down-regulated in hereditary tumours (78 probe sets) as were up-regulated (22 probe sets). Down-regulated genes showed an average decrease of 1.35 to 5-fold, while up-regulated genes were changed by a factor of 1.2-7.8. Among the first hundred genes selected by limma the most prominent classes consisted of genes connected with regulation of transcription (19 genes), metabolism (12 genes), protein synthesis and degradation (10 genes), cellular signalling (8 genes), cell proliferation and death (6 genes) and DNA and RNA replication and processing (5 genes). Figure 2 shows that use of the set of 100 genes for hierarchical clustering results in almost perfect classification of hereditary and sporadic tumour samples. Only one sporadic sample falls into the branch of hereditary cases.
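Fold changes such as those quoted above can be derived from log2-scale expression values, which is the scale produced by RMA preprocessing. The following Python sketch shows one common convention, assuming the fold change is taken from the difference of group means on the log2 scale; whether Table 2 was computed in exactly this way is not stated in the text, and the values are hypothetical.

```python
def signed_fold_change(log2_group_a, log2_group_b):
    """Fold change of group A versus group B from log2 expression values.

    Returns the magnitude (always >= 1) and the direction of change in A.
    """
    mean_a = sum(log2_group_a) / len(log2_group_a)
    mean_b = sum(log2_group_b) / len(log2_group_b)
    # A difference of 1 on the log2 scale corresponds to a 2-fold change.
    diff = mean_a - mean_b
    direction = "up" if diff > 0 else "down" if diff < 0 else "unchanged"
    return 2.0 ** abs(diff), direction


# Hypothetical log2 values for one probe set (illustration only).
fold_up = signed_fold_change([9.0, 9.0], [8.0, 8.0])
fold_down = signed_fold_change([7.0, 7.0], [9.0, 9.0])
```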
Within a set of 100 genes selected in our study, we found genes coding for proteins known to be interacting with the BRCA1 pathway, such as Fanconi Anemia complementation group A gene (FANCA, increased 2-fold) and XRCC5 (50% decrease), which are both engaged in double-strand break repair. FANCA protein is a component of the multi-subunit FA complex which takes part in sensing and/or regulation of the DNA damage response. The FA complex activates FANCD2 protein which is further targeted to the BRCA1 nuclear loci. Inactivation of the FA/BRCA pathway leads to chromosomal instability, due to impaired DNA repair [20]. DNA repair protein XRCC5 (80 kDa Ku autoantigen) is the DNA-binding component of the DNA-dependent protein kinase, and functions together with the DNA ligase IV-XRCC4 complex in the repair of DNA double-strand break by non-homologous end joining [21].
The TOB1 gene (transducer of ERBB2, decreased in hereditary tumours) encodes a member of the tob/btg1 family of anti-proliferative proteins that have the potential to regulate cell growth, and is probably involved in several human tumours (breast, lung, thyroid) [22][23][24][25]. This protein inhibits T cell proliferation and transcription of cytokines and cyclins. This is the only gene from Hedenfalk's list [15] that appears within the first 100 genes selected by limma.
Another interesting gene is selenophosphate synthetase 2 (SEPHS2, decreased in hereditary tumours). This gene encodes an enzyme that synthesizes selenophosphate from selenide and ATP. Selenophosphate is the selenium donor used to synthesize selenocysteine, which is co-translationally incorporated into selenoproteins at in-frame UGA codons. The SEPHS2 protein itself contains a selenocysteine residue in its predicted active site. It has been proposed that the effects of selenium in preventing cancer and neurological disorders may be mediated by selenium-binding proteins [26].
The most prominent group among genes up-regulated in hereditary carcinomas is that of immune response genes (5 genes). In contrast, no immunological genes are found in the list of down-regulated genes. This may reflect a common feature of BRCA1-linked breast cancer, i.e. the inflammatory state and lymphocyte infiltrate of a tumour. It is especially striking as all our tumour samples were inspected by a pathologist, and only pieces of tumour mass without a visible inflammatory state were taken for microarray experiments. Thus we conclude that this immunological imprint must be very prominent in BRCA1(+) breast cancer.
Validation of microarray results by QRT-PCR. We selected TOB1 and SEPHS2 genes (see Table 1) for further analysis by quantitative RT-PCR. We examined [...] (Fig. 3).
Table 2. Genes differentiating between hereditary, BRCA1-positive breast cancers and sporadic tumours, selected by limma. Genes are ordered according to the fold change value and grouped into functional classes (according to Gene Ontology annotation, http://www.geneontology.org/). Columns: Affy_ID; Gene Symbol; Gene Title; Fold change, BRCA1(+) vs. sporadic; limma rank.

Discussion
Our preliminary study was done on a relatively small number of cases; however, it may be informative, as the samples were chosen carefully. In all previous microarray studies on breast cancer it was observed that the most striking difference in the gene expression profile is connected with oestrogen receptor status. Initially it was stated that ER(+) and ER(-) breast cancers represent distinct categories of breast tumours [27,28]. More detailed studies performed on a larger set of samples revealed further subtypes, e.g. luminal A and luminal B subtypes among ER(+) tumours and at least two subtypes, ERBB2+ and 'basal', within the ER(-) group [Perou et al., 2000; Sorlie et al., 2001]. Alternatively, another subdivision of ER(-) cases was proposed by Farmer et al., who distinguish two subgroups: 'basal', characterized by oestrogen and androgen receptor negativity and expression of cytokeratins 5 and 17, and 'molecular apocrine', negative for ER and positive for AR [29]. It was also shown previously that BRCA1 mutation-linked breast cancer tissues cluster within the 'basal' subtype [30]. However, these specific cases account for only a minority of 'basal' tumours; thus the genes that are differentially expressed between basal and other subgroups do not necessarily reflect a difference in the biology of hereditary (BRCA1-linked) and sporadic carcinomas. Moreover, the set of genes differentiating between breast cancer subtypes in the studies of Perou and Sorlie was selected from genes that are stably expressed in biopsy specimens taken from the same tumour before and after chemotherapy. This allowed the construction of so-called 'molecular portraits' of each tumour, but may have eliminated some significant genes that could be suppressed by chemotherapy. In our study we analysed only tumours that had not been subjected to neoadjuvant chemotherapy.
To reduce, at least partially, the sources of variability that are not linked to BRCA1 status, we decided to analyse only ER(-) tumour samples.
The aim of our study was to reveal a set of genes differentially expressed between BRCA1 mutation-linked and sporadic breast cancer. A similar study was performed by Hedenfalk et al., who analysed seven sporadic, seven BRCA1-linked and eight BRCA2-linked breast cancers and published a set of 51 differentially expressed genes. We tried to retrieve information about the expression of those 51 genes from the HG U133 Plus 2.0 Gene Chip used in our study. However, due to the incompatibility of the two types of DNA microarrays, we were able to find expression data for only 40 genes (represented by 88 probe sets). This set of genes was unable to discriminate between sporadic and hereditary cases from our cohort (not shown). This may be caused by the shortening of the original list of genes and by the fact that the set of 51 genes was selected by comparison of BRCA1 and BRCA2 tumours with sporadic ones, all of them of rather mixed histopathology, grade and ER status. Of the genes from Hedenfalk's list only one (TOB1) is represented, at the 62nd position, in our list of 100 genes selected by limma.
It should be mentioned that Jazaeri et al., who analysed a cohort of hereditary and sporadic ovarian cancers, found by an unsupervised method of data analysis that BRCA1 and BRCA2 tumours localize separately from each other, while sporadic cases were placed by the algorithm between these two groups, some of them being closer to BRCA1 tumours, others to BRCA2 tumours [31]. This result confirms all previous observations that BRCA1 and BRCA2 mutation-linked carcinomas are different entities, and suggests that at least in some sporadic cases BRCA1 or BRCA2 pathways may be truncated, thus accounting for BRCA1-like and BRCA2-like sporadic tumours. It should be verified whether this hypothesis applies also to breast cancer. Interestingly, in our group of samples Jazaeri's set of genes performed better than the set published by Hedenfalk [15] (not shown).
Fig. 2. Hierarchical clustering of samples, based on 100 genes differentiating between hereditary and sporadic breast tumours. Colours in the bar on the right code for: regulation of transcription (orange), cell signalling (blue), cell proliferation and death (black), DNA and RNA replication, transcription and processing (red), cellular metabolism (yellow), protein synthesis and degradation (green), immune response (violet) and other (grey). It may be seen that the most prominent cluster of genes up-regulated in hereditary tumours consists of genes involved in the immune response, which probably reflects the lymphocyte infiltrate of those tumours.
Prediction of BRCA status in patients with breast cancer on the basis of clinical and molecular features would be very useful for genetic counselling and for cost-reduction of genetic screening [32,33]. It was shown recently by Lakhani et al., 2005, that ER status together with immunochemistry for some markers of 'basal' subtype (CK14, CK5/6, CK17 and osteonectin) is better able to predict BRCA1 mutation status in breast cancer patients than previously used criteria [34]. Although it must be experimentally proven, we believe that among 100 genes selected by limma some markers may appear to improve the specificity of prediction of BRCA1 status.