Glycosylation is a widely utilized chemical modification of proteins and lipids in mammalian cells during which saccharide units are covalently attached to the target structures and then sequentially elongated and branched – reactions facilitated by a large number of various glycosyltransferases. Based on the nature of this attachment and the type of saccharides comprising the mature structure, traditionally one has differentiated between four main types of glycans: The N-linked glycans, O-linked glycans, glycosaminoglycans (GAGs) and glycosphingolipids (GSLs).
The first three classes of glycans, the N-linked, O-linked and GAGs, are attached to polypeptide chains; N-linked glycosylation is considered a co-translational modification. N-glycans are linked to asparagine residues through a bond between N-acetylglucosamine (GlcNAc) and the amide nitrogen of the side chain, the latter giving rise to this glycan class's designation. O-glycans, on the other hand, are bound to proteins through a covalent bond between N-acetylgalactosamine (GalNAc) and either serine or threonine. GAGs may also be bound to serine, but through xylose (Xyl) rather than GalNAc (in the case of chondroitin, dermatan and heparan sulfate). Keratan sulfate is found as a branch on either N- or O-linked glycans, while hyaluronan is not attached to any protein. Glycosphingolipids are the last type of glycans and, as their name implies, are found associated with lipids.
All of the aforementioned glycan classes can be divided further into subgroups. The subgroups of O-glycans and GSLs are defined by the so-called core structures. A core structure is the portion of a glycan found closest to the bond linking the saccharide to a protein or lipid, respectively. There are eight O-glycan cores, denominated core 1 through 8. GSLs belong to one of four main groups: the globo-, ganglio-, lacto- and neolacto-series. N-glycans are subdivided based on how their common core structure is elongated and not on the core structure itself: it can be extended by mannose residues (high-mannose N-glycans), branched (complex N-glycans) or a combination of the two (hybrid N-glycans). Glycosaminoglycans utilize several different core structures and are therefore distinguished by the disaccharide repeat comprising their side chains. Hyaluronan, which is not covalently attached to any other structure, is also unique among the GAGs in that it is not sulfated.
The carbohydrate structures of these main glycan classes can be further elongated, trimmed or otherwise chemically modified depending on a number of factors, including specificity of glycosyltransferases and their localization in the Golgi apparatus. Although tightly regulated, glycosylation is not strictly reliant on a template-driven process, but rather based on a number of sequential enzymatic reactions – the main feature responsible for the unrivaled diversity of these oligosaccharides. As such, glycans are involved in a number of important intra- and extracellular functions (Figure 1).
Figure 1. Glycosylation in carcinogenesis. Glycosylation plays a principal role in a number of cellular processes of key importance for carcinogenesis. Two metastasizing carcinoma cells are illustrated, entering distant tissue through the blood stream. Six processes important for cancer development and progression that are influenced by various glycosylation types are indicated: growth factor receptors (especially EGFR and TβR) are influenced by N-glycosylation in concert with galectins; growth factors and other signaling molecules may be concentrated, filtered or sequestered by glycosaminoglycans and O-glycosylated mucins; cell-cell adhesion might be mediated either directly, for example by glycosynapses consisting mainly of glycosphingolipids, or, more importantly, indirectly through modulation of integrins and cadherins by N-linked glycosylation; O-glycosylated mucins, both secreted and membrane-bound, may constitute a physical barrier or act on specific leukocyte receptors, thereby modulating the immune response towards the malignant cells; N-linked glycosylation may enhance motility of transformed cells by regulating integrin functionality; adhesion to endothelium can be mediated by a number of mechanisms, including binding of Lewis antigens by endothelial selectins.
Alteration of glycosylation is intimately linked to the process of carcinogenesis in a variety of ways. Probably one of the most important aspects of this relationship is the modification of the cell surface carbohydrate structures which may affect both adhesion properties and cell–cell signaling as well as modulate the response elicited by the organism's immune system. This has indeed been demonstrated for various cancers and malignant cell lines (Brockhausen, 1999, 2006; Dube and Bertozzi, 2005; Hakomori and Zhang, 1997; Hakomori and Murakami, 1968; Lau and Dennis, 2008). Presentation of aberrant glycosylation is furthermore thought to happen early in the transformation process (Cho et al., 1994) and may therefore be highly important for both diagnosis and treatment of several types of cancer.
Although the role of glycosylation in disease and malignant transformation has been known for several decades (Hakomori and Murakami, 1968; Mora et al., 1969), the field of glycomics has been comparatively slow to develop. The importance of glycosylation in such processes has only been sporadically explored, leaving available knowledge scarce and far from complete.
The primary aim of this review is to present the most important pathways of glycosylation and their significance in the process of carcinogenesis in general, with a focus on breast carcinomas. For a more comprehensive view of the processes involved, significant results from a descriptive analysis of the expression of 419 glycan-associated genes (glycan gene list, GGL) in two cohorts are discussed. One cohort consisted of tumors from 64 stage I–IV breast cancer patients and normal breast tissue from 79 healthy women without signs of disease, and the second cohort of tumor and adjacent normal tissue from 26 breast cancer patients. For a detailed description of the gene selection and the data sets used, see Data Appendix.
Such an approach provides a unique insight into the full spectrum of changes in mRNA levels of genes coding for the various glycan-related proteins in malignant cells, and may aid in developing a more holistic picture of the alterations at hand.
N-Glycans: the diverse regulators of growth factors and adhesion signaling
Synthesis of the mature N-glycan precursor
N-linked glycans are remarkable for their use of a dolichol-based anchor for the sequential synthesis of a precursor structure which, in its finalized form (Glc3Man9GlcNAc2-P-P-Dol), is transferred to a protein (Figure 2A). The individual reactions in this pathway are catalyzed by transferases which are encoded by a gene family termed ALG (asparagine-linked glycosylation). Among its members are ALG14, encoding an enzyme responsible for transferring the second GlcNAc saccharide to GlcNAc-P-P-Dol; ALG3, coding for a transferase that attaches the second α1,3 mannose to the glycan precursor immediately after it is flipped to the inner leaflet of the ER membrane; and two transferases (encoded by ALG8 and ALG10) which add two terminal glucose residues to the precursor prior to its attachment to a polypeptide. All of these showed higher transcript levels in breast carcinomas in comparison to normal breast tissue in at least one of the investigated data sets (Tables 1 and 2B; Supplementary material: Table 3).
Figure 2. N-linked glycosylation. (For symbol explanation, see Figure 7.) A) Initial steps of the N-glycan synthesis pathway. First, two GlcNAcs are added to the Dol-PP anchor in the outer leaflet of the ER lipid membrane. Next, five mannose saccharides are attached to the structure. This precursor, which faces the cytosol, is then flipped to the inner leaflet by a mechanism that is not yet fully elucidated. For the rest of the synthesis process, the glycan structure is situated inside the ER lumen. Here, four more mannoses as well as three glucoses are added to create the mature N-glycan precursor. Genes encoding the transferases responsible for these reactions are designated ALG (asparagine-linked glycosylation). The mature precursor is then detached from its dolichol anchor and transferred co-translationally to a target polypeptide sequence by the large enzyme complex OST (modified from Varki et al., 2009). B) N-glycan branching. After being transferred to a protein, the N-glycan goes through glucose and mannose trimming, the former being involved in quality control of polypeptide folding. The resulting Man5GlcNAc2 structure may be branched, a process mediated by the Mgat family of GlcNAc-transferases. Up to four branches can be added by Mgat1, -2, -4 and -5, respectively, and further elongated (orange arrows). Of these, Mgat5 appears to be the most interesting in carcinogenesis; the branch it initiates is preferentially elongated by polylactosamine. In addition to the four previously mentioned branches, a so-called bisecting β1,4 branch may be added by Mgat3. This bisecting GlcNAc terminates all further branching, including that mediated by Mgat5. Thus, activity of Mgat3 might inhibit polylactosamine synthesis. The two key reactions, performed by Mgat3 and Mgat5, are highlighted in red. The Mgat transferases require UDP-GlcNAc, which is imported through a transporter (SLC35A3). Genes encoding relevant transferases are displayed in black italic font (modified from Lau and Dennis, 2008). C) Core fucosylation. This is one of the possible modifications made to the core structure of N-glycans.
Table 1. Summary of differentially expressed genes/pathways in normal versus malignant breast tissue. The most pronounced alterations in gene expression and the impact they may have on various glycosylation processes and prevalence of certain structures are presented. In each case relevant genes are listed. Up-regulation in tumors of both data sets in comparison to normal samples is indicated by an upward pointing arrow, ↑, and down-regulation by an arrow pointing down, ↓. If a significant change in expression was seen only in one of the data sets, the arrow is shown in parentheses: (↓) or (↑).
Table 2. Significantly differentially expressed genes in malignant versus non-malignant breast tissue. A) Top ten up- and down-regulated genes in tumors of data sets A and B in comparison to healthy tissue expression within the same data set. All genes on these lists had a q-value of 0. Genes found to be significantly differentially expressed (q < 0.05) in both data sets are underlined. B) All genes found to be significantly (q < 0.05) differentially expressed in both data sets, sorted by functional classes.
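The arrow notation used in the summary table can be expressed as a small rule. The sketch below is only an illustration of that convention, assuming each gene has a signed expression change and a q-value per data set; the function names and the numeric inputs in the usage note are hypothetical, and the handling of genes that change in opposite directions is an assumption, since the captions define no symbol for that case.

```python
from typing import Optional

Q_CUTOFF = 0.05  # significance threshold quoted in the table caption


def direction(change: float, q_value: float) -> int:
    """Return +1 (up), -1 (down) or 0 (not significant) for one data set."""
    if q_value >= Q_CUTOFF or change == 0:
        return 0
    return 1 if change > 0 else -1


def summary_arrow(dir_a: int, dir_b: int) -> Optional[str]:
    """Map the directions seen in data sets A and B to the arrow notation."""
    arrows = {1: "↑", -1: "↓"}
    if dir_a == dir_b:
        # Same call in both data sets; None when neither changed.
        return arrows.get(dir_a)
    if dir_a == 0 or dir_b == 0:
        # Significant in one data set only: arrow in parentheses.
        return f"({arrows[dir_a or dir_b]})"
    # Opposite directions: no symbol is defined (assumption: leave blank).
    return None
```

For example, `summary_arrow(direction(1.2, 0.001), direction(0.8, 0.01))` gives the plain upward arrow, while a gene that is significant in only one data set receives the parenthesized form.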
Additionally, several factors other than the transferases themselves may affect the efficiency of these reactions and thus progression along the pathway, including availability of substrate. During this early stage of the N-glycan synthesis pathway three substrates are used, namely UDP-Glc, UDP-GlcNAc and GDP-Man. Both the enzymes involved in their metabolism and the solute carriers that transport them are therefore of relevance. Examples of genes encoding these are PMM1, PMM2 and SLC35A3. The former two code for enzymes that convert mannose-6-P to mannose-1-P. Mannose-1-P is in turn converted to GDP-Man, which serves as substrate for the sequential addition of the 9 mannose residues to the N-glycan precursor. SLC35A3 is a solute transporter for another substrate, UDP-GlcNAc (Figure 2B). Our analysis demonstrated elevated expression levels of SLC35A3 and PMM2 in malignant samples compared to cancer-free tissue. PMM1, however, was down-regulated in breast carcinomas.
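The bookkeeping behind the precursor formula and its substrate demand can be checked with a short tally. This is a minimal sketch that follows the accounting given in the text (two GlcNAc and five mannoses on the cytosolic side, four mannoses and three glucoses in the lumen, with GDP-Man named as the source of all nine mannoses); the data structure and function names are illustrative assumptions.

```python
from collections import Counter

# (residue, number added, donor named in the text, membrane side), in order of addition.
ASSEMBLY_STEPS = [
    ("GlcNAc", 2, "UDP-GlcNAc", "cytosolic leaflet"),
    ("Man", 5, "GDP-Man", "cytosolic leaflet"),
    ("Man", 4, "GDP-Man", "ER lumen"),  # added after the precursor is flipped
    ("Glc", 3, "UDP-Glc", "ER lumen"),
]


def composition(steps):
    """Total number of each residue in the finished precursor."""
    counts = Counter()
    for residue, n, _donor, _side in steps:
        counts[residue] += n
    return counts


def donor_demand(steps):
    """Number of donor molecules consumed per precursor, by substrate."""
    demand = Counter()
    for _residue, n, donor, _side in steps:
        demand[donor] += n
    return demand


def formula(counts):
    """Write the composition in the conventional order, outermost residues first."""
    return "".join(f"{r}{counts[r]}" for r in ("Glc", "Man", "GlcNAc") if counts[r])
```

Calling `formula(composition(ASSEMBLY_STEPS))` reproduces the Glc3Man9GlcNAc2 composition of the mature precursor, and `donor_demand(ASSEMBLY_STEPS)` shows why mannose supply (and hence PMM1/PMM2 activity) dominates the substrate requirement at this stage.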
Core fucosylation of N-glycans
Cores of N-glycans may be decorated with various sugar moieties during their passage through the Golgi. The most common modification of the N-glycan core in mammals is the α1,6 fucosylation of the GlcNAc residue bound to asparagine (Figure 2C). Core fucosylation is of special interest due to its ability to modulate growth and development by altering functional properties of integrins (Figure 1). It has for example been shown that core decoration by α1,6 fucose of certain motifs of the α3β1 integrin complex is essential for its association and activity (Zhao et al., 2008). In breast cancer patients an increase in core fucosylation of N-glycans on the circulating alpha-1-proteinase inhibitor (API) has previously been suggested (Goodarzi and Turner, 1995). FucT 8 (FUT8) is the only core fucosyltransferase that has been described (Miyoshi et al., 2008; Wang et al., 2005). The mRNA transcript for this gene was up-regulated in carcinomas of both data sets, indicating a higher content of complex glycans in tumors (Tables 1 and 2B; Supplementary material: Table 3).
N-Glycan branching
After the saccharide part of the dolichol-P-P-glycan has been transferred to a protein (Figure 2A), a process of glucose trimming occurs in the ER. This serves as a quality control step and ensures proper folding of the glycosylated protein. If the folding succeeds, the glycan-bearing polypeptide is transported to the cis-Golgi, where a number of the mannose residues which constituted the mature precursor are removed by mannosidases. The number of mannoses removed may vary depending on the final destination and function of the glycoprotein. Usually, only 3 sugar units of this type are left, which allows branches to be added to the glycan structure (Figure 2B).
The key transferases in the formation of branched glycan structures are the mannosyl N-acetylglucosaminyltransferases. Several enzymes providing this functionality are known, each with its distinct and restricted specificity. The Mgat1 (MGAT1) transferase is the first member of this family to gain access to the core N-glycan structure after terminal mannose trimming. It adds the first β1,2 N-acetylglucosamine to the α1,3 mannose. Next, two mannose units may be removed from the α1,6 mannose by mannosidases encoded by the MAN2A1 and MAN2A2 genes. This trimming allows up to three more antennas to be added. The transferases responsible for addition of these branches are encoded by the genes MGAT2, MGAT4 and MGAT5, in that order (Figure 2B). However, not all four antennas are necessarily synthesized. A regulatory mechanism may permanently terminate this process, thus reducing the number of branches. MGAT3 codes for a member of the same mannosyl N-acetylglucosaminyltransferase family which, by adding a so-called bisecting (β1,4 GlcNAc) branch, may prevent further branching by Mgat4 and Mgat5. All of the mentioned genes were found to have altered expression levels in breast carcinomas in relation to normal breast tissue. MAN2A1, MGAT2, MGAT4A and MGAT5B were up-regulated, while MAN2A2 and MGAT3 showed lower mRNA levels in malignant samples (Tables 1 and 2; Supplementary material: Table 3).
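The interplay between sequential branching and termination by the bisecting GlcNAc can be illustrated with a toy model. The sketch below assumes a strictly ordered action of the four branching enzymes, as listed in the text, and treats the Mgat3 reaction as an absolute stop; both are simplifications, and the function name and its parameter are hypothetical.

```python
# Order in which the branching GlcNAc-transferases act, as listed in the text.
BRANCHING_ORDER = ["MGAT1", "MGAT2", "MGAT4", "MGAT5"]


def count_antennae(bisect_after=None):
    """Count the antennae formed on a toy N-glycan.

    bisect_after -- gene name of the branching enzyme after which Mgat3 adds
                    the bisecting GlcNAc; None means Mgat3 never acts.
    """
    if bisect_after is not None and bisect_after not in BRANCHING_ORDER:
        raise ValueError(f"unknown branching enzyme: {bisect_after}")
    antennae = 0
    for enzyme in BRANCHING_ORDER:
        antennae += 1  # this enzyme initiates one antenna
        if enzyme == bisect_after:
            break  # the bisecting GlcNAc blocks all later branching
    return antennae
```

Under these assumptions `count_antennae()` yields a tetra-antennary glycan, whereas `count_antennae("MGAT2")` stops at two antennae, so the β1,6 branch initiated by Mgat5 is never formed. This is the sense in which lower MGAT3 expression could favor more highly branched structures.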
Several reactions may be rate-limiting for the N-glycan branching process. The efficiency of Mgat1 depends on the concentration of the acceptor structure, while the enzymatic activities of Mgat4 and Mgat5 require high concentrations of UDP-GlcNAc as substrate (Lau and Dennis, 2008). The previously mentioned solute carrier for this nucleotide sugar, SLC35A3, might therefore be of interest in this case as well. Subsequent to the addition of GlcNAc, a family of galactosyltransferases may add a galactose to any of the antennas. The most important members of this family are encoded by three genes: B4GALT1, B4GALT2 and B4GALT3. The required substrate for the reactions they catalyze is UDP-galactose. Among the relevant solute carriers for this molecule is the one encoded by the SLC35B1 gene. Both solute transporter genes and the β4-galactosyltransferase genes were found to be more highly expressed in the carcinoma specimens than in healthy breast tissue.
An association between multi-antennary N-glycans and cancer was proposed over two decades ago (Yamashita et al., 1984) and has since been confirmed as one of the typical changes during carcinogenesis (Dube and Bertozzi, 2005). Traditionally the most studied transferase in this context has been Mgat5 (Lau and Dennis, 2008). The expression data presented in this study confirm this observation and further suggest that certain other parts of this pathway may also be up-regulated in breast cancer, including both the anabolic (MGAT2, MGAT4A, B4GALT3) and catabolic steps (MAN2A1), as well as solute carriers facilitating enhanced substrate availability (SLC35A3, SLC35B1) (Tables 1 and 2; Supplementary material: Table 3). Concordantly, one of the main inhibition mechanisms of antenna synthesis, namely the addition of a bisecting branch by Mgat3, appears to be significantly down-regulated in tumors.
A higher prevalence of branched N-glycans may have several implications (Figure 1). The β1,6 branch initiated by the Mgat5 transferase is often elongated by a polylactosamine chain of varying length. It has been well established that such structures are present on the epidermal growth factor receptor (EGFR) and the transforming growth factor-β receptor (TβR) and may, in concert with the galectin family of glycan-binding proteins, protect these receptors against endocytosis and thus facilitate their retention at the cell surface (Lau and Dennis, 2008; Partridge et al., 2004). This property of N-glycans might therefore be crucial for modulation of cytokine signaling involving these receptors, which in turn are key factors in epithelial–mesenchymal transition (EMT) (Partridge et al., 2004). Galectins are an important family of β-galactoside binding lectins which may be involved in a wider spectrum of processes of key importance in carcinogenesis (Yang et al., 2008). Besides the described impact on growth factor receptor modulation, such processes include tumor immune escape by T-cell specific induction of apoptosis, mediated for example by galectins encoded by the genes LGALS1, -2, -3 and -9 (Sturm et al., 2004; Wada et al., 1997; Zhu et al., 2005; reviewed in Yang et al., 2008), and cell cycle arrest by LGALS12 (Liu and Rabinovich, 2005; Yang et al., 2001). However, the biological mechanisms behind these processes are yet to be fully understood, and it is therefore in many cases uncertain whether these functions are due to interaction with N-glycans. In the studied cohorts mRNA transcripts of galectins 2, 8 and 9 (genes LGALS2, LGALS8 and LGALS9, respectively) were found to be significantly more highly expressed, while galectins 3, 4, 7 and 12 (LGALS3, LGALS4, LGALS7 and LGALS12) were found to have lower levels of expression in carcinomas compared to healthy tissue (Table 2B; Supplementary material: Table 3). Data on galectin 1 were conflicting, showing higher expression in one data set and lower in the other. Nonetheless, the number of differentially expressed members of this family suggests that galectins might be of importance in the malignant transformation of carcinomas of the breast.
N-glycans might also be of importance in the regulation of integrin- and cadherin-induced signaling (Gu and Taniguchi, 2004; Zhao et al., 2008). In this regard the β1,6 branch has been shown to have a crucial role in subunit association and ligand binding of α5β1 integrin. A higher prevalence of N-glycans presenting this branch seemed to induce migration of cells on fibronectin (Zheng et al., 1994; Pochec et al., 2003) and, in concert with galectin 3, to cause focal adhesion remodeling as well as downstream activation of the kinases FAK and PI3K (Lagana et al., 2006). In line with these observations, over-expression of MGAT3, which has already been described as responsible for addition of a bisecting branch and thus for competitively inhibiting Mgat5, resulted in suppressed cell migration and spread of neurites (Gu et al., 2004). Similar interactions have also been demonstrated for another integrin, α3β1, where Mgat5 promoted cell migration on laminin 5 while Mgat3 negated this effect (Zhao et al., 2006). In addition to modulation of EGFR, TβR and integrins, homotypic cadherin adhesion has been shown to be influenced by N-glycosylation. In this case Mgat3 appears to play the central role. Previous studies have demonstrated that there is a mutual regulation between the two: MGAT3 expression is induced under dense conditions closely resembling the state of cells in a terminally differentiated tissue (Iijima et al., 2006), while cells transfected with Mgat3 showed greater adhesion through E-cadherin (Yoshimura et al., 1996). Loss of E-cadherin is one of the well-established changes characterizing EMT (Thiery, 2003). It should be noted, however, that although MGAT3 over-expression and the enzymatic activity of the resulting Mgat3 transferase have in most cases been correlated with down-regulation of β1,6-branching, inhibition of motility and reduction in malignancy, Bhaumik et al. reported that homozygous knockout of Mgat3 in mice slowed down the progression of hepatocellular carcinomas (Bhaumik et al., 1998). This is a surprising and as yet unexplained finding, but it indicates that the bisecting branch is involved in a more elaborate network of interactions.
O-Linked glycosylation regulating adhesion and modulating immune responses
O-Linked glycans in general
In contrast to N-linked glycans, O-glycans do not utilize a dolichol-based core structure. The first step in the O-glycosylation pathway is instead a direct transfer of GalNAc to either a serine or threonine of a protein, creating a so-called Tn antigen (Figure 3A). This transfer is catalyzed by a family of GalNAc transferases (GALNT1-14 and GALNTL1, -2, -4 and -5; reviewed in Ten Hagen et al., 2003). These GalNAc transferases differ in specificity (Tarp and Clausen, 2008) and regulate the pattern of mucin glycosylation (Figure 3B) as well as O-glycans on other acceptor proteins such as fibronectin. With regard to breast cancer, the GALNT6 gene is perhaps the most interesting, since the transferase it encodes has previously been suggested to be a specific breast carcinoma marker (Berois et al., 2006). It has also been proposed to be the only member of the family besides GALNT3 with the ability to synthesize the O-glycosylated isoform of oncofetal fibronectin (Bennett et al., 1999b). Oncofetal fibronectin is not present in adult breast tissue or benign lesions of the breast, but is expressed in higher-grade invasive breast carcinomas (Loridon-Rosa et al., 1990). These reports are in line with our finding that GALNT6 is more highly transcribed in breast carcinomas than in normal tissue in both data sets. In addition, a number of other GalNAc transferase genes were significantly differentially expressed: GALNT2, -3, -5, -6, -7, -10 and GALNTL4 were up-regulated, while GALNT11, -12, GALNTL1 and -2 were down-regulated in breast tumors compared to control samples (Tables 1 and 2; Supplementary material: Table 3).
Figure 3. O-linked glycosylation. (For symbol explanation, see Figure 7.) A) Synthesis of O-linked glycans. Several different core O-glycan structures exist, but some are very rare. The initial synthesis of cores 1 and 2 as well as tumor-associated antigens (T antigens) is illustrated. Each of the cores can be extended and further modified, for example by fucosylation (not shown). Genes encoding relevant transferases are displayed in black italic font (modified from Varki et al., 2009; Tarp and Clausen, 2008). B) Initiation of O-glycosylation on MUC1. This illustration shows a single VNTR (orange) of MUC1 with its amino acid sequence, and the specificities of the well-studied GalNAcTs. The enzymes involved sometimes overlap in specificity, although it should be noted that some transferases require preceding activity of other GalNAcTs (not shown). It is also worth mentioning that other transferases from this family are known to act on MUC1, although their specificity is yet to be defined (Bennett et al., 1999a,b) (modified from Tarp and Clausen, 2008).
The Tn antigen can be further converted to either sialyl Tn or a core 1 structure (T antigen) (Figure 3A). The genes coding for the transferases responsible for the latter conversion are C1GALT1 and C1GALT1C1. The core 1 structure may be elongated either to core 2 or to sialyl T (ST) by GCNT1, -3 or -4 and ST3GAL1, respectively. The enzymes at this step compete with each other, and their activity levels determine which structure is synthesized (Dalziel et al., 2001). It has been shown that core 2 structures might be less frequent on breast carcinoma cells (Brockhausen, 2006) and are absent from certain breast cancer cell lines (Brockhausen, 1999). Furthermore, previous reports have indicated that up-regulation of ST3GAL1 in breast carcinomas can be linked to tumor grade and under-glycosylation of MUC1 (Burchell et al., 1999). Sialyl T may be additionally sialylated by ST6GALNAC1 to form the disialyl T (diST) antigen. This transferase may also convert Tn antigen to sialyl Tn (STn). In our investigation GCNT1 and ST3GAL1 transcripts were found to be up-regulated in breast tumors in both data sets, while GCNT3 was up-regulated only in data set A (Tables 1 and 2B; Supplementary material: Table 3). Since all three genes were significantly over-expressed in carcinomas, it is difficult to predict which of the competing transferases will dominate, although in both data sets ST3GAL1 had a higher mean mRNA expression value than GCNT1 and GCNT3 (data set A: 0.3 for GCNT1 and −0.9 for GCNT3 versus 0.7 for ST3GAL1; data set B: 7.8 for GCNT1 versus 11.6 for ST3GAL1). This might indicate that progression to sialyl T antigen is the most frequent outcome. In addition, the transcript of the ST6GALNAC1 gene was significantly down-regulated.
Several of the structures mentioned above, including ST, Tn and STn, are known to be tumor-associated epitopes in breast cancer (Brockhausen, 2006). Our results are largely consistent with this and suggest that most of the O-glycans might remain as either sialyl T or Tn in carcinomas, but that the pathway less frequently proceeds to synthesize the disialylated version of the T antigen or STn.
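The competing conversions described above can be written down as a small lookup of substrates, products and genes, together with the mean expression values quoted in the text. The following is a rough sketch only: it assumes that transcript level can stand in for enzyme activity, which the data do not establish, and the names `O_GLYCAN_STEPS`, `MEAN_EXPRESSION` and `favoured_product` are illustrative.

```python
# Conversions described in the text: substrate -> {product: genes encoding the transferase(s)}
O_GLYCAN_STEPS = {
    "Tn": {"T (core 1)": ["C1GALT1", "C1GALT1C1"], "sialyl Tn": ["ST6GALNAC1"]},
    "T (core 1)": {"core 2": ["GCNT1", "GCNT3", "GCNT4"], "sialyl T": ["ST3GAL1"]},
    "sialyl T": {"disialyl T": ["ST6GALNAC1"]},
}

# Mean mRNA expression values quoted in the text for the enzymes competing for core 1.
MEAN_EXPRESSION = {
    "A": {"GCNT1": 0.3, "GCNT3": -0.9, "ST3GAL1": 0.7},
    "B": {"GCNT1": 7.8, "ST3GAL1": 11.6},
}


def favoured_product(substrate, expression):
    """Pick the product whose best-expressed gene has the highest mean value.

    Assumption: transcript level is used as a proxy for enzyme activity;
    genes without a quoted value are ignored.
    """
    best_product, best_level = None, float("-inf")
    for product, genes in O_GLYCAN_STEPS[substrate].items():
        levels = [expression[g] for g in genes if g in expression]
        if levels and max(levels) > best_level:
            best_product, best_level = product, max(levels)
    return best_product
```

With the quoted values, `favoured_product("T (core 1)", MEAN_EXPRESSION["A"])` and the same call for data set B both select sialyl T, matching the interpretation given in the text. The comparison says nothing about enzyme kinetics or localization, so it should be read as a consistency check rather than a prediction.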
Mucins
Mucins are large glycoconjugates which, similarly to glycosaminoglycans, consist of a heavily glycosylated core protein. Mucin glycosylation is predominantly O-linked and occurs at specific sites throughout the VNTR domain (Figure 3B), a shared feature of all mucins (Hollingsworth and Swanson, 2004). These glycan structures might be one of the defining characteristics of mucin functionality. For instance, MUC1 has 5 potential glycosylation sites within each tandem repeat (Tarp and Clausen, 2008), as shown in Figure 3B. This illustration also presents the specificities of the best-studied GalNAcTs towards the MUC1 VNTR. It is nonetheless important to note that GalNAcTs other than those presented in the figure may act on MUC1 and other mucins, for example GalNAcTs-6 and -7 (Bennett et al., 1999a,b). The exact site specificity of these is however currently unknown. Of the genes shown, GALNT2, -3 and -11 were found to be differentially expressed in the studied data sets (see Section 3.1 above).
Mucins are known to play several important roles in carcinogenesis (Figure 1). These glycoproteins may be secreted or membrane-bound and, like glycosaminoglycans, they have the ability to filter substances present in the cells' immediate surroundings, to form a protective steric barrier in foreign environments such as distant metastatic sites, and to hinder passage of chemotherapeutics (Hollingsworth and Swanson, 2004). The retention ability of such mucin matrices has been described for a variety of interleukins (Cebo et al., 2001), although it is not clear whether the interactions are specific (Hollingsworth and Swanson, 2004). This type of sequestration is also hypothesized to occur for growth and differentiation factors (Hollingsworth and Swanson, 2004), potentially contributing both to malignant transformation and to modulation of the immune response. It has been demonstrated that mucin-bound glycan structures may also have a more direct role in control of the immune system through interaction with leukocyte receptors, for example Siglec-1 (Nath et al., 1999; Varki and Angata, 2006) and selectins (Kim et al., 1999). Secreted mucins may thus negatively affect the motility of leukocytes and activate them prematurely (Hollingsworth and Swanson, 2004).
Mucins attached to the outer cell membrane might also play an important role in carcinogenesis due to their proposed contribution to signaling through the MUC1 cytoplasmic tail (MUC1CT) (Hollingsworth and Swanson, 2004; Kohlgraf et al., 2003) and to cis-interactions with cell surface associated receptors. For example, several members of the ErbB tyrosine kinase receptor family, including ERBB2, have been found to interact with MUC1CT (Carraway et al., 2003; Jepson et al., 2002; Singh and Hollingsworth, 2006). At present it is unknown whether this process is directly affected by aberrant glycan structures. It is however known that internalization of this mucin is influenced by O-glycosylation, which might alter its ability to interact with surface receptors (Singh and Hollingsworth, 2006). Conversely, other autocrine interactions may be more intimately linked with glycan structures. Several adhesion-mediating receptors have been described to engage in both cis- and trans-interactions with the heavily glycosylated tandem repeats of mucins. Siglecs (sialic acid-binding immunoglobulin superfamily lectins) are a group of such receptors. Sialyl Tn antigen is a tumor-specific antigen correlated with poor prognosis in breast cancer patients (Brockhausen, 2006) and may be present on mucins. It can serve as a ligand for several siglecs (SIGLEC2, -3, -5 and -6), as has been shown by Brinkman-Van der Linden and Varki (2000). Our data show a trend towards over-expression of most siglecs in malignant tumors, including several of the aforementioned genes: SIGLEC5, -6, -7 and -9 were found to be up-regulated in tumors, while SIGLEC1 was the only gene of the family that was down-regulated in tumor samples in comparison to healthy breast tissue (Table 2; Supplementary material: Table 3).
Intercellular adhesion molecules (ICAMs) constitute another important class of adhesion mediators for which mucins are of importance. ICAM1, the best-characterized member, was initially identified as an LFA-1 ligand and a mediator of leukocyte adhesion, present both on leukocytes themselves and on endothelial cells (Hollingsworth and Swanson, 2004; Rothlein et al., 1986). Notably, ICAM1 is also known to interact with under-glycosylated MUC1, suggesting that both of these molecules play a role in immune system evasion and mediation of adhesion to endothelium (Hayashi et al., 2001). In our data sets the genes of two members of this family, namely ICAM1 and ICAM2, were found to be differentially expressed. ICAM1 showed higher and ICAM2 lower expression in tumor samples than in normal breast tissue (Table 2B; Supplementary material: Table 3).
Mucins are thus implicated in malignant transformation in several ways: cis-interactions modulate adhesion or mask potential adhesion-mediating receptors (Hollingsworth and Swanson, 2004); in trans-interactions, on the other hand, mucins displayed on the cell surface or deposited in the local environment act as ligands for some of the same receptors on the cells they come into contact with. This duality in mucin function is central to their proposed role in metastasis. Distant metastasis requires dissociation of normal adhesion at the primary site and establishment of new adhesive interactions at the distant site, and mucins may contribute to both.
Lewis antigens – functionally important terminal glycan epitopes
Lewis antigens are functionally important terminal glycan epitopes, which were first reported to be implicated in breast cancer development several decades ago (Fukushi et al., 1984a,b; Fukushima et al., 1984). These antigens are common to many types of glycans, including N- and O-linked glycans as well as glycosphingolipids. Lewis antigens and their aberrant expression have been viewed as one of the underlying mechanisms for metastasis in different carcinomas, partially due to the interactions between these epitopes and E-selectin presented by activated endothelial cells (Barthel et al., 2007) (Figure 1). Sialyl Lewis X antigen expression is a common feature of breast carcinomas (Kannagi, 1997), and transformed cells appear in many cases to interact strongly with stimulated endothelium (Tozeren et al., 1995). Ectopic expression of the non-sialylated Lewis X has also been suggested to play a role in the interaction between breast carcinoma cells and endothelial cells promoting metastases, albeit presumably through a different mechanism involving collectin 12 (Elola et al., 2007).
Structurally these epitopes are comprised of a fucosylated Gal-GlcNAc-β1,3/4-Gal backbone which can optionally be sialylated (Figure 4). Addition of GlcNAc to the inner galactose of the backbone is mediated by several transferases, for example those encoded by the B3GNT1, -2, -3 and -5 genes. Subsequently, either a β1,3 or a β1,4-galactose can be added to this sugar moiety. β-1,3-Galactosyltransferases encoded by genes B3GALT1, -2 and -5 predetermine the final structure to be a type 1 Lewis epitope, a group which includes Lea, SLea and Leb structures. On the other hand, β-1,4-galactosyltransferases 1, 2, 3 and 4 (B4GALT1, -2, -3, -4) will synthesize type 2 Lewis antigens (for example Lewis X and Y) by transferring a β-1,4-galactose to the GlcNAc saccharide (Figure 4). Backbone of both types is further modified. All type 1 structures contain an α1,4-Fuc residue on the GlcNAc added by fucosyltransferase 3 (FUT3). The same is the case for type 2 antigens but the linkage is α1,3 instead (catalyzed by products of FUT3 through -7 and FUT9). The terminal galactose may or may not be modified. Examples of unmodified Lewis antigens are Lea and Lex. Alternatively, this galactose may be either sialylated or fucosylated. Sialylation by transferases encoded from ST3GAL3 and -4 yields SLea, and from ST3GAL6 results in SLex, while addition of fucose forms Leb and Ley epitopes which are both synthesized by fucosyltransferases 1 and 2 (FUT1 and FUT2).
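The naming rules in the preceding paragraph can be summarized as a small lookup. The following Python sketch is only an illustration of those rules: the function names, the linkage strings ("beta1,3", "beta1,4") and the modification labels are hypothetical conventions chosen here, not an established nomenclature or API.

```python
# Illustrative lookup of the six main Lewis epitopes named in the text.
# Key: (linkage of the terminal Gal to GlcNAc, modification of that Gal).
# The string labels are assumptions made for this sketch.
LEWIS_EPITOPES = {
    ("beta1,3", None): "Lea",
    ("beta1,3", "sialic acid"): "SLea",
    ("beta1,3", "fucose"): "Leb",
    ("beta1,4", None): "Lex",
    ("beta1,4", "sialic acid"): "SLex",
    ("beta1,4", "fucose"): "Ley",
}


def lewis_type(gal_linkage):
    """Return 1 for a beta1,3-linked galactose (type 1), 2 for beta1,4 (type 2)."""
    if gal_linkage == "beta1,3":
        return 1
    if gal_linkage == "beta1,4":
        return 2
    raise ValueError(f"unknown galactose linkage: {gal_linkage!r}")


def classify_lewis(gal_linkage, terminal_mod=None):
    """Name the epitope from the Gal linkage and the terminal Gal modification."""
    lewis_type(gal_linkage)  # validates the linkage string
    try:
        return LEWIS_EPITOPES[(gal_linkage, terminal_mod)]
    except KeyError:
        raise ValueError(f"unknown terminal modification: {terminal_mod!r}")


if __name__ == "__main__":
    for (linkage, mod), name in LEWIS_EPITOPES.items():
        print(f"type {lewis_type(linkage)}: {linkage}, {mod or 'unmodified'} -> {name}")
```

The lookup deliberately leaves out the fucose on GlcNAc (α1,4 in type 1, α1,3 in type 2), since according to the text it is present in all structures of the respective type and therefore does not distinguish epitopes within a type.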
Among this group of terminal glycan structures, several type 2 antigens including Lewis X (also known as SSEA-1 or CD15), sialyl Lewis X and Lewis Y are considered to be tumor-associated markers (Soejima and Koda, 2005). Higher prevalence of SLex has been reported in breast cancer cells and appears to correlate with the expression of an α-1,3-fucosyltransferase, FUT6 (Matsuura et al., 1998). Our findings are for the most part in agreement with these established observations. Higher transcription level of three of the four relevant β-1,4-galactosyltransferases (B4GALT1, -2, -3) was found in malignant tissues, while no change was seen in β-1,3-galactosyltransferase genes (Tables 1 and 2; Supplementary material: Table 3). This is in agreement with higher prevalence of type 2 epitopes in breast carcinomas. The fucosylation and sialylation processes appeared to be altered during the malignant transformation as well with up-regulation of FUT5, FUT11 and ST3GAL4, and down-regulation of FUT4, FUT9, FUT10, ST3GAL3 and ST3GAL6 in breast carcinoma samples. In line with Matsuura et al. (1998) we observed higher expression of FUT6 in tumors, although borderline (q-value of 0.055 in data set B). Furthermore, B3GNT1, -3 and -5 genes were found to have lower mRNA level in carcinomas than in non-malignant samples, while B3GNT2 was higher transcribed (Table 2A; Supplementary material: Table 3). From these findings no firm conclusion can be drawn on the exact outcome of such alterations, except the higher prevalence of type 2 Lewis epitopes associated with breast carcinomas.
Lewis antigens. (For symbol explanation, see Figure 7) Lewis antigens are usually subdivided into two groups – type 1 and 2 – depending on whether the terminal galactose is bound to the preceding GlcNAc by a β3 or β4 bond (in red font). Epitopes in the latter category are considered as tumor-associated antigens. Both type 1 and 2 structures may appear on a variety of glycans (denoted as “R”). Therefore they are important in interaction with other cells like endothelial cells. The main Lewis antigens are shown, alongside genes encoding key transferases in their synthesis (modified from Varki et al., 2009).
Glycosphingolipids (GSLs) are a large family of glycan structures covalently associated with lipids. These glycans are often subdivided into several series, named the lacto-, neolacto-, ganglio- and globo-series. In this review we focus on the ganglio-series since most of the GSL-related genes found differentially expressed in our data sets belonged to the synthesis pathway of these glycan structures (Figure 5).
Ganglio-series glycosphingolipid synthesis pathway. (For symbol explanation, see Figure 7) Key genes involved in the pathway are indicated, and four have been denoted numerically to indicate that the sialyltransferases encoded by these genes catalyze several steps of the pathway (modified from Varki et al., 2009).
Ganglio-series glycosphingolipids are in most cases highly sialylated structures built around a Galβ1-3GalNAcβ1-4Galβ1-4GlcβCer core. The first step in the synthesis pathway is the addition of a glucose monosaccharide residue to a ceramide. A ceramide glucosyltransferase encoded by UGCG is responsible for catalyzing this reaction. As the synthesis progresses a β4-galactose is added to create lactosylceramide (LacCer). This galactose may hold a branch of one, two or three sialic acid residues (Figure 5). The three resulting GSLs and structures derived from these have been given specific designations: a-, b- and c-series, respectively. Stepwise addition of these residues is performed by two sialyltransferases. Synthesis of the GM3 structure (a-series precursor) by addition of the innermost α3 sialic acid to LacCer is done by a transferase encoded by the ST3GAL5 gene. The other transferase (gene ST8SIA1) may add up to two more sialic acids to create GD3 (b-series precursor) and GT3 (c-series precursor). Further, pathways progress in parallel by similar reactions to synthesize a complete core structure. The first step is addition of a GalNAc which results in a GA2, GM2, GD2 or GT2 structure. The transferase catalyzing this step is coded by B4GALNT1. Product of the B3GALT4 gene may subsequently act on the backbone to extend the core further with a galactose (creating GA1, GM1, GD1b and GT1c). Both the GalNAc and the terminal galactose residues can be sialylated by transferases encoded by the genes ST3GAL1/-2 (resulting in cisGM1, GD1a and GT1b structures) and ST6GALNAC6 (creating GD1α, GT1aα and GQ1bα) in all but the c-series. Several other sialyltransferases are known to mediate this step, including for example ST6GALNAC3 and ST6GALNAC5, although these enzymes have narrower specificity and only catalyze addition of sialic acid to cisGM1, creating GD1α. 
The latter gene is of interest since it has recently been reported to enhance metastasis of breast carcinomas to the brain (Bos et al., 2009). It is not presently known which sialyltransferases are responsible for the last two sialylation steps in the synthesis pathway of the c-series. Our findings demonstrate differential expression of many genes involved in these pathways. UGCG and ST3GAL1 were over-expressed, while ST3GAL5, ST8SIA1, ST6GALNAC3 and ST6GALNAC6 were among the down-regulated genes in malignant samples when compared to healthy breast tissues (Tables 1 and 2B; Supplementary material: Table 3). Observed changes in the mRNA expression of the B3GALT4 gene were inconclusive since it was found up-regulated in one data set and lower expressed in the other. Furthermore, results of our expression analysis were in line with observations of Bos et al. (2009) demonstrating higher transcription levels of ST6GALNAC5 in breast carcinomas.
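The first steps of the ganglio-series pathway described above follow a regular pattern: the number of sialic acids on the inner galactose decides the series, and the core is then extended in parallel in each series. A minimal Python sketch of that pattern is given below. The structure names and gene symbols are taken from the text; the table layout, the function names and the "asialo" label for the branch without sialic acid (GA2, GA1) are assumptions made for illustration.

```python
# Core intermediates per number of sialic acids on the inner galactose.
# Order of each tuple: precursor, after GalNAc addition (B4GALNT1),
# after Gal addition (B3GALT4). "asialo" is a label chosen for this sketch.
SERIES = {
    0: ("asialo", ("LacCer", "GA2", "GA1")),
    1: ("a", ("GM3", "GM2", "GM1")),
    2: ("b", ("GD3", "GD2", "GD1b")),
    3: ("c", ("GT3", "GT2", "GT1c")),
}

# Genes named in the text for building each precursor from LacCer.
PRECURSOR_GENES = {
    0: (),
    1: ("ST3GAL5",),
    2: ("ST3GAL5", "ST8SIA1"),
    3: ("ST3GAL5", "ST8SIA1"),
}


def series_of(n_sialic):
    """Return the series label for 0-3 sialic acids on the inner galactose."""
    if n_sialic not in SERIES:
        raise ValueError("expected 0, 1, 2 or 3 sialic acid residues")
    return SERIES[n_sialic][0]


def core_intermediates(n_sialic):
    """Return [precursor, GalNAc-extended, Gal-extended] structure names."""
    series_of(n_sialic)  # validates the argument
    return list(SERIES[n_sialic][1])


if __name__ == "__main__":
    for n in sorted(SERIES):
        genes = ", ".join(PRECURSOR_GENES[n]) or "none"
        print(f"{series_of(n)}-series: {' -> '.join(core_intermediates(n))} "
              f"(sialyltransferase genes: {genes})")
```

The later sialylation steps (for example by ST3GAL1/-2 or ST6GALNAC6) are not modeled, because their substrate specificities differ between series and, for the c-series, are reported in the text as unknown.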
Several gangliosides are relevant to carcinogenesis, perhaps most notably GM3 and GD3. Certain reports have shown that induction of the cell surface expression of the GM3 structure was associated with reversion of malignancy through an integrin- and CD9-dependent mechanism (Mitsuzuka et al., 2005; Miura et al., 2004). This ganglioside has also been found to be associated with intermediate filaments (Gillard et al., 1992), although the precise role of this interaction has not been fully elucidated. Some attention has also been devoted to interactions between glycosphingolipids and immune response. In this regard GM3 and GD3 have been proposed to reduce cytotoxicity of NK-cells and peripheral blood leukocytes, perhaps by influencing the arachidonic acid cascade (Bergelson, 1993).
Glycosynapse

At least some of the biological effects GSLs convey are mediated through a somewhat unusual mechanism, the emerging importance of which deserves a separate mention. It has been known for some time that GSLs have a tendency to aggregate in cell membranes forming clusters (Tillack et al., 1983) without recruitment of cholesterol. Such aggregates are necessary for glycosphingolipids to exert some of their biological activities. The term “glycosynapse” was introduced to describe such functional GSL aggregates (Hakomori Si, 2002). The aggregates have been subdivided into several subtypes depending on function and contents (Todeschini and Hakomori, 2008). Their importance is thought to relate to their ability to convey adhesion – between cells (Figure 1) as well as between cells and extracellular matrix – and potential to associate with and modulate a number of key receptors such as growth factor receptors and integrins (Todeschini and Hakomori, 2008).
Desialylation

As previously noted, glycosphingolipids are abundantly decorated by sialic acid residues where sialyltransferases are of crucial importance. The degradation process of these structures mediated by sialidases is equally interesting. Currently three sialidases are known, encoded by NEU1, -2 and -3, each of which has a distinct cellular localization and thus differs functionally (Miyagi et al., 2004). Some conflicting findings as to the effects of sialidase over-expression in malignancy have been published. Early work suggested that activity of these enzymes was higher in transformed cells (Bosmann and Hall, 1974; Schengrund et al., 1973; Usuki et al., 1988). A more recent study demonstrated that high level of the plasma membrane sialidase (NEU3) was linked to protection from apoptosis in colon cancer (Kakugawa et al., 2002). It has also been shown that activity of the lysosomal sialidase (NEU1) appears to be enhanced in hepatomas in comparison to normal liver tissue, while the cytosolic sialidase (NEU2) was generally less active in the malignant tissue (Miyagi et al., 1990, 1992). On the other hand, several studies have shown that sialidases might also possess anti-metastatic properties (Kato et al., 2001; Miyagi et al., 1994; Sawada et al., 2002; Tokuyama et al., 1997), leaving the precise role for sialidases in tumor progression still to be explained. Our expression analysis demonstrated that among the sialidase genes, NEU1 was higher transcribed in tumors of both cohorts, while NEU3 was only found to be up-regulated in carcinomas in data set A (Table 2B; Supplementary material: Table 3).
Glycosaminoglycans and the extracellular microenvironment

A growing body of evidence has emerged during the past decade indicating that tumor microenvironment plays an important role in most cancer types (Joyce and Pollard, 2009) including that of the mammary gland (Shekhar et al., 2001). Extracellular matrix (ECM) is an important tissue component synthesized by most cell types, and is mainly composed of various glycosaminoglycans (GAGs) (Figure 6). The GAGs are long linear polysaccharide chains comprising repeating disaccharide units attached to a core protein. GAGs which constitute the ECM are involved in a number of intra- and extracellular processes, ranging from structurally defining a tissue's distinct morphology to regulation of adhesion, motility and differentiation (Gandhi and Mancera, 2008; Iozzo, 1998) (Figure 1).
Structure of glycosaminoglycans (GAGs). (For symbol explanation, see Figure 7) The three main structural aspects of GAGs are illustrated; attachment to core proteins, the repeating disaccharide sequence, and the hallmark sulfation. Attachment to protein cores is mediated by different glycan structures, except for hyaluronan which is not covalently bound to any other molecule. Chondroitin and heparan sulfate share a common tetrasaccharide core structure while keratan sulfate may be attached to either N-linked or O-linked glycans (left). Further, GAGs consist of repeated disaccharide units which are unique for each glycosaminoglycan type (middle). Addition of sulfate is typical for all GAGs except hyaluronan. Possible sulfation patterns for one of the glycosaminoglycans, chondroitin sulfate, are shown to the right (modified from Varki et al., 2009; Sugahara et al., 2003).
Chondroitin sulfate (CS) is a sulfated glycosaminoglycan consisting of N-acetylgalactosamine–glucuronic acid (GalNAc–GlcA) disaccharide units (Figure 6).
Protein core

Protein cores provide a scaffold upon which a polysaccharide consisting of repeated disaccharide units of a GAG like CS can be attached. Each type of glycosaminoglycan usually has its own set of cores encoded by different genes. Versican is a well-known core protein for fibroblast chondroitin sulfate (Zimmermann and Ruoslahti, 1989) and belongs to the family of lecticans (Yamaguchi, 2000). This protein has been extensively studied and linked to atherosclerosis (Wight and Merrilees, 2004), ovarian cancer (Ricciardelli and Rodgers, 2006) and more recently, local invasiveness of breast carcinomas (Yee et al., 2007). Recent reports indicate that versican plays an important role in inflammation in advanced lung cancer by activating macrophages (Kim et al., 2009). Versican is likely to exercise a variety of biological functions related to carcinogenesis. These include inhibition of cell adhesion to extracellular matrix components such as fibronectin and type I collagen (Yamagata et al., 1986, 1989) thus facilitating motility (Ang et al., 1999), binding of inflammatory leukocytes through L-selectin, P-selectin and CD44 (Kawashima et al., 2000), and finally interaction with various chemokines (Hirose et al., 2001). Aggrecan (encoded by the ACAN gene) belongs to the same lectican family as versican (Yamaguchi, 2000). It is considered to play a role in chondrocyte differentiation (Chen et al., 1995) and has previously been implicated in some rare matrix-producing breast carcinomas (Kusafuka et al., 2008). For the majority of breast tumors, however, the expression has not been found to differ from normal breast tissue (Eshchenko et al., 2007). In our analysis both the VCAN and ACAN genes were found to be expressed at significantly higher levels in breast carcinomas than in normal tissues of the breast (Tables 1 and 2; Supplementary material: Table 3). The finding for ACAN thus contradicts the conclusions of Eshchenko et al. (2007), warranting further and larger studies.
Core tetrasaccharide

In order for the disaccharide repeat chain to be attached to the protein scaffold, a short glycan core must first be synthesized. This core is a tetrasaccharide (GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-) where the xylose is the innermost unit attaching the GAG chain to the protein, usually to the amino acid serine (Figure 6). Assembly of the core is a sequential process catalyzed by a number of transferases. Genes encoding these transferases are in the order of action: XYLT1 and -2 (xylosyltransferases); B4GALT7 (β4-galactosyltransferase); B3GALT6 (β3-galactosyltransferase); B3GAT1, -2 and -3 (glucuronyltransferases). Several of these genes were found to have an altered level of expression in breast carcinomas in comparison to normal tissue in the available data sets. Two of these, namely XYLT2 and B3GALT6, were up-regulated and one, B3GAT1, was down-regulated in tissues taken from cancer patients. XYLT1 was differentially expressed as well, but its transcript level showed contradictory trends in the two data sets (Tables 1 and 2B; Supplementary material: Table 3).
Polymerization

After the tetrasaccharide has been constructed, it is elongated with disaccharide structures specific to CS (GalNAcβ4GlcAβ1–3). This polymerization is performed by several enzymes. Chondroitin sulfate synthase 1 and 3 (CHSY1 and CHSY3) transfer both N-acetylgalactosamine and glucuronic acid, elongating the linear chain of chondroitin sulfate. Additionally, a specific GlcA transferase (CSGLCAT) is thought to participate in the synthesis along with a polymerizing factor (CHPF) which does not possess any transferase activity by itself but facilitates the reaction. These genes are currently only in the process of being characterized and are yet to be linked to carcinogenesis (Kitagawa et al., 2001, 2003; Nagase et al., 1999; Yada et al., 2003; Izumikawa et al., 2008). Glucuronic acid may also be converted to its C5 epimer iduronic acid by an epimerase coded by the DSE and DSEL genes to form dermatan sulfate (Pacheco et al., 2009). Degradation of the CS disaccharide occurs through the action of hyaluronoglucosaminidase 1 (HYAL1), a lysosomal enzyme which is able to degrade CS by hydrolyzing the β1–4 bond of the GalNAc residue in the disaccharide, and a glucuronidase (GUSB) which is responsible for breaking the β1–3 GlcA bond. These genes have been studied more thoroughly. HYAL1, for example, is known to have both cancer-promoting and anticancer properties (Stern, 2008). Elevated expression of this gene has recently been proposed as a predictor of development of invasive breast carcinoma (Poola et al., 2008). Our analysis revealed that CHSY1, CHPF and GUSB showed higher transcript levels in malignant tissues in comparison to cancer-free samples, while HYAL1 was down-regulated (Tables 1 and 2; Supplementary material: Table 3), in contrast to previous findings (Poola et al., 2008).
Sulfation

Glycosaminoglycans may be modified by addition of sulfate groups in a variety of patterns, a process mediated by a number of different sulfotransferases (Kusche-Gullberg and Kjellen, 2003). There is a growing body of evidence suggesting that such patterns are of utmost importance for the function of GAGs (Angulo et al., 2004; Bulow and Hobert, 2004; Deepa et al., 2002; Kitagawa et al., 1997). CS has been shown to interact with several heparin-binding proteins, including various growth factors (Sugahara et al., 2003). Such interactions are important since they may both modulate adhesion and adjust the reservoir of signaling molecules in close proximity to the cell membrane at all times. Sulfation might provide a mechanism of controlling such functionality: L- and P-selectins as well as chemokines have an affinity for versican-based CS GAGs. This interaction is negated by the presence of CS-E (Kawashima et al., 2002). One may speculate that this weakens L-/P-selectin dependent adhesion. Such change might be advantageous to malignant cells since loss of L-selectin adhesion can prevent attachment of leukocytes, while reduction in affinity for P-selectins present on platelets and endothelium may potentially aid metastasis. Deepa et al. demonstrated that CS-E appeared to mediate direct binding of all but one of the eight heparin-binding growth factors tested (Deepa et al., 2002). Over-expression of CS-E thus implies higher concentration of these factors in the immediate surroundings of the transformed cells, which in turn may promote angiogenesis, be anti-apoptotic and mitogenic (Grose and Dickson, 2005).
Sulfation in itself is a complex process in which several of the 35 currently known sulfotransferases are involved. Multiple sites on the chondroitin disaccharide are available for sulfation – although not all combinations of these are possible (Figure 6). In the case of chondroitin sulfate, each disaccharide can exist in five distinct sulfation configurations classified as CS-A through E, where CS-B is also known as dermatan sulfate (Sugahara et al., 2003). Two carbons, the fourth and the sixth, of N-acetylgalactosamine can potentially bear a sulfate group. C4 sulfation is mediated by an N-acetylgalactosamine 4-O sulfotransferase in CS-A, -B and -E. There are several genes encoding enzymes which demonstrate such activity, including CHST11, -12, -13 and -14. Their end products show however some minor differences in specificities. The second sulfation site located at the sixth carbon requires a GalNAc-6-O transferase encoded by CHST3 and CHST7. Again, these enzymes have restricted specificity, and facilitate sulfation of CS-C and -D only. To sulfate C6 of a CS-E, N-acetylgalactosamine 4-sulfate 6-O transferase (GALNAC4S-6ST) is needed since GalNAc of chondroitin E already has a sulfate group on the fourth carbon. Glucuronic acid has only one sulfation site (C2) and currently only one known transferase associated with it. It is encoded by the UST gene. Results from the investigated data sets indicated that several of the genes related to the process of chondroitin sulfation were differentially expressed: CHST11 and GALNAC4S-6ST were transcribed to a higher degree in breast carcinoma samples than was the case in healthy tissue, while CHST3 and UST showed expression in the opposite direction (Tables 1 and 2; Supplementary material: Table 3). Based on these findings it is tempting to suggest that 2-O and 6-O sulfation is less prevalent in transformed cells while 4-O sulfation of CS-A, -B and -E and the 6-O sulfation of the latter may be more extensive.
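The relation between sulfation sites and the CS-A to CS-E designations can be written down as a lookup table. The sketch below is illustrative only. The text states which units carry a 4-O or 6-O sulfate on GalNAc and that the uronic acid can be sulfated at C2; assigning the 2-O sulfate to the CS-D unit and modeling CS-B (dermatan sulfate) as the iduronic acid counterpart of CS-A are assumptions based on common nomenclature rather than statements made here. All identifiers in the code are hypothetical.

```python
# Disaccharide unit -> designation. Key: (uronic acid, set of sulfated sites).
# "IdoA" marks the C5-epimerized uronic acid of dermatan sulfate.
# The CS-D entry (uronic acid 2-O plus GalNAc 6-O) is an assumption.
CS_UNITS = {
    ("GlcA", frozenset({"GalNAc-4S"})): "CS-A",
    ("IdoA", frozenset({"GalNAc-4S"})): "CS-B",
    ("GlcA", frozenset({"GalNAc-6S"})): "CS-C",
    ("GlcA", frozenset({"UA-2S", "GalNAc-6S"})): "CS-D",
    ("GlcA", frozenset({"GalNAc-4S", "GalNAc-6S"})): "CS-E",
}

# Genes named in the text for each sulfation step.
SITE_GENES = {
    "GalNAc-4S": ("CHST11", "CHST12", "CHST13", "CHST14"),
    "GalNAc-6S": ("CHST3", "CHST7"),
    "GalNAc-6S on 4-sulfated GalNAc": ("GALNAC4S-6ST",),
    "UA-2S": ("UST",),
}


def classify_cs_unit(sulfated_sites, uronic_acid="GlcA"):
    """Return the CS designation, or None if the pattern is not in the table."""
    return CS_UNITS.get((uronic_acid, frozenset(sulfated_sites)))


if __name__ == "__main__":
    print(classify_cs_unit({"GalNAc-4S", "GalNAc-6S"}))  # CS-E
    print(classify_cs_unit({"GalNAc-4S"}, uronic_acid="IdoA"))  # CS-B
    print(classify_cs_unit(set()))  # None: unsulfated unit
```

Read against the expression results above, higher CHST11 and GALNAC4S-6ST together with lower CHST3 and UST would, in this simplified table, favor the CS-A, -B and -E entries over CS-C and -D.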
Keratan sulfate

Keratan sulfate (KS) is another sulfated glycosaminoglycan which does not utilize the core glycan structure of CS and HS (Figure 6). It consists of a sulfated poly-N-acetyllactosamine chain, and is subdivided into two types based on whether this chain is found on N-glycans (KS I) or O-linked glycans (KS II). KS may play a role in carcinogenesis of several cancer types, although findings have in some cases been conflicting (Nikitovic et al., 2008). In breast cancer, expression of several core proteins of KS, including lumican and decorin, has been studied and found to differ in neoplastic lesions in comparison to adjacent tissue. Lumican was found to be significantly up-regulated while decorin was lower expressed in the malignant tissue (Leygue et al., 2000). A more recent study showed on the other hand that lower expression of these two KS core proteins in early stage, node negative breast cancer correlates with poor prognosis (Troup et al., 2003). Glycan structures attached to these cores have not been investigated, leaving the role of glycosylation uncertain. Keratan sulfate may nonetheless potentially inherit some of the properties of polylactosamine and has recently been shown to constitute a suitable ligand for galectins (Iwaki et al., 2008) – a highly important interaction discussed earlier in relation to N-glycosylation. Further implications of the role of KS in carcinogenesis are yet to be investigated.
Our expression analysis showed that both synthesis and degradation of this GAG might be up-regulated. Genes B4GALT3 and GLB1, which respectively add and remove Gal in GlcNAc-Gal disaccharide structure of keratan, had higher expression levels in tumors of both data sets. Sulfation of the chain may be performed by transferases encoded by CHST1 which is responsible for sulfating both galactose and N-acetylglucosamine at sixth carbon, as well as specific GlcNAc-sulfating transferases CHST2, -4 and -6. Of these, CHST2 and -4 were down-regulated while CHST1 and -6 were up-regulated in tumor versus normal tissue (Tables 1 and 2; Supplementary material: Table 3).
Data appendix

Material for expression analysis

Data set A was derived from women undergoing mammography between 2002 and 2007, referred by general practitioner or as part of the Norwegian national mammography screening program. The women from the screening program had been referred to a breast diagnostic center for a second assessment due to suspicious findings in the initial mammogram. From women diagnosed with breast carcinoma tru-cut biopsies of the tumor were taken before surgery (n = 64). From women with no sign of disease a core biopsy from areas of high mammographic density was sampled (n = 79). In the majority of cases these were areas without any benign lesion.
Data set B comprised both tumor tissue and adjacent non-cancerous breast tissue from the same patients (n = 26). These represent T1 and T2 ductal tumors collected at Akershus University Hospital.
The studies were approved by the Regional Ethical Committee (references S-02036 and 429-04148) and all samples were taken after informed consent.
All tissues were kept frozen at −80 °C until RNA extraction and expression analyses were performed. For data set A Agilent 4 × 44 K two-channel and for data set B, single-channel Agilent 4 × 44 K arrays, were used (Haakensen, V.D. et al. and Lüders, T. et al., unpublished data).
(Details on clinical and histopathological parameters of the tumors can be found in Supplementary material: Table 1)
Gene selection

Three principal gene classes were chosen for the GGL: (a) genes that belong to one of the catabolic glycosylation pathways, (b) genes which encode anabolic enzymes related to glycosylation, and (c) genes that code for proteins which are involved in binding carbohydrate ligands.
At present resources in the field of glycomics are decentralized, and although there are several online portals and databases, none of them encompass all available information (Dublin, 2008). For compilation of the GGL a multi-resource approach was therefore chosen (Figure 8). This implied combining information from several sources, predominantly KEGG (http://www.genome.jp/kegg/glycan/), GGDB (http://riodb.ibase.aist.go.jp/rcmg/ggdb/) and a judgmental assortment of relevant literature. KEGG was chosen as main reference due to its functional approach to gene classification and readily available pathway information.
Symbols. The different symbols used to denote the various sugar moieties in Figures 2–6 are shown. The orange rectangle illustrates a portion of a polypeptide with the mid part representing the amino acid to which glycan structures are attached.
Summary of sources used to compile the glycan gene list. Bold arrow indicates the main source.
To verify appropriateness and completeness of this gene selection, the gene list was compared to an independently compiled list based on chemical annotation. For this purpose 660 GO-terms related to either glycosylation or glycan binding were chosen utilizing GeneOntology website search engine (http://amigo.geneontology.org/cgi-bin/amigo/search.cgi) and further used to retrieve a list of gene names associated with the given terms. For association of GO-terms to gene names Agilent's probe description file was employed. The resulting gene list was approximately twice as long as the first KEGG-derived list. This can be explained by some of the more general GO-terms also including genes not directly related to glycans, and most of these additional genes were found to be irrelevant upon a closer manual inspection and were therefore omitted from the final list. Nonetheless 60 previously not included but relevant genes were identified in this process and added to the KEGG-based list. Match between the original and the GO-defined list was otherwise satisfactory.
The final version of the GGL based on KEGG database with additional refining by literature search and GO-terms consisted of 419 unique gene symbols (Supplementary material: Table 2).
Statistical analysis

Data preprocessing

Data sets A and B were analyzed separately since two different platform versions had been used to generate the data (two-channel arrays for data set A, one-channel arrays for data set B).
For data set A an Agilent scanner was used and data were processed by Feature Extraction 9.1.3.1. Locally weighted scatterplot smoothing (lowess) normalization was applied. The log2-transformed, normalized data were stored in the Stanford Microarray Database (SMD). Data retrieved from SMD for further analysis were filtered by presence of expression values, keeping genes with at least 80% good data, and missing values were imputed using the standard KNN algorithm (n = 10) in R (version 2.8.1). Expression data for the selected glycan genes were then extracted from the filtered and imputed data file. The most up-to-date version of the Agilent probe description file available at the time (version 014850, 20.08.2007) was used for gene annotation. If a given gene was represented by several probes, mean expression was used. Genes displaying low variability over the total data set were removed using a minimal variability requirement defined as at least 3 samples with expression value over 1.3 standard deviation (SD) for each gene. This filtering as well as mean-centering for hierarchical clustering analysis were performed in MatLab (version R2008b).
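The filtering steps described for data set A can be sketched in a few lines of Python. This is not the R/MatLab code used in the study, only a minimal illustration under stated assumptions: missing values are represented as `None`, KNN imputation is omitted (it is assumed to have been run between the presence filter and probe averaging), and the SD used in the variability filter is passed in by the caller because the text does not specify whether it is computed per gene or over the whole data set. All function names are hypothetical.

```python
from statistics import mean


def fraction_present(values):
    """Fraction of samples with a measured (non-missing) value."""
    return sum(v is not None for v in values) / len(values)


def filter_by_presence(rows, min_fraction=0.8):
    """Keep probes with at least `min_fraction` good data (80% in data set A)."""
    return {probe: vals for probe, vals in rows.items()
            if fraction_present(vals) >= min_fraction}


def average_probes(rows, probe_to_gene):
    """Collapse probes to genes by taking the per-sample mean.

    Assumes missing values have already been imputed.
    """
    grouped = {}
    for probe, vals in rows.items():
        grouped.setdefault(probe_to_gene[probe], []).append(vals)
    return {gene: [mean(col) for col in zip(*probe_rows)]
            for gene, probe_rows in grouped.items()}


def passes_variability(values, sd, min_samples=3, n_sd=1.3):
    """True if at least `min_samples` samples deviate from the gene mean
    by more than `n_sd` * `sd` (an interpretation of the stated criterion)."""
    centre = mean(values)
    n_extreme = sum(abs(v - centre) > n_sd * sd for v in values)
    return n_extreme >= min_samples


if __name__ == "__main__":
    raw = {"p1": [1.0, None, 2.0, 3.0, 4.0], "p2": [1.0, None, None, 3.0, 4.0]}
    print(sorted(filter_by_presence(raw)))  # p2 has only 60% good data
```

For data set B the same functions would be called with `min_fraction=0.7` and `min_samples=2`, matching the thresholds given in the next paragraph.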
After scanning with an Agilent scanner, data for data set B was processed using Feature Extraction software version 9.5 and log2-transformed. Genes with at least 70% good data were kept and missing values for these genes were imputed using Least Square Adaptive method. Data constituting data set B was further preprocessed in the same way as in data set A, although with different filtering criteria for variation, namely 2 samples with expression value above 1.3 × SD, to compensate for the smaller cohort size.
Hierarchical clustering

Clustering of gene-centered data was performed in Eisen Cluster (Eisen et al., 1998) version 3.0 using centroid linkage with similarity metric set to Pearson's correlation. Visualization of the results was done in a self-developed program (available at
When normal and tumor samples from data sets A and B were clustered separately, a clear segregation of malignant and non-malignant samples in both cohorts was observed (Figure 9A and B, respectively). Interestingly, a small subset of normal biopsies from data set A clustered tightly together. These samples have previously been defined as “cluster 1” by Haakensen et al. using whole-genome expression analysis (unpublished data). It may therefore appear that these samples also have a unique glycan gene signature distinguishing them from the rest of the normal samples. The main characteristics of this signature are up-regulation of eight lectin genes as well as genes involved in synthesis of N-glycan precursor and various T antigens. On the other hand the initial step of O-glycan pathway and N-glycan branching were both suppressed in tissue from samples belonging to “cluster 1”. Genes encoding enzymes responsible for heparan sulfate synthesis and degradation were up-regulated, while expression of several sulfotransferases acting on this glycosaminoglycan was significantly down-regulated in this group when compared to other non-malignant breast tissue samples.
Hierarchical clustering of glycan gene expression. A) Data set A. Malignant tissue and samples from healthy women are marked red and green on the dendrogram, respectively. These two groups are well separated indicating that glycan gene expression is profoundly altered during carcinogenesis. A subgroup of normal samples appears to have a different glycan gene signature. This group is largely identical to the “cluster 1” group defined by whole-genome expression analysis by Haakensen et al. (unpublished data). Samples classified as “cluster 1” are marked blue. B) Data set B. Tumor tissue samples are marked red on the dendrogram – bright red if extracted from a tumor biopsy or dark red in case whole-tumor was used. Core biopsies of adjacent breast tissue are marked green. The malignant and the non-malignant samples are well separated in this data set as well. Note that the normal and malignant samples are equally well separated in both data sets (with only a few misplaced samples) despite that data set A contains samples from different individuals while data set B consists of tumor and adjacent normal tissue from the same patient. These heatmaps with sample names and gene symbols can be found in supplementary material Figure 1A and B.
Significance analysis of microarrays (SAM) (Tusher et al., 2001) was utilized to compare gene expression between groups of samples. The two-class variation of SAM as implemented by the SAM Excel Plugin v.3.02 was used, and standard two-class analysis performed. Data was supplied in log2-transformed format and permutation count was set to 300, while all other options were left at their default settings (T-statistic as test statistic, automatic estimate of s0 factor for denominator; built-in imputation and median centering turned off). Genes with a false discovery rate under 5% were considered significant.
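For orientation, the test statistic behind a two-class unpaired SAM analysis can be sketched as follows. This is a simplified illustration and not the SAM Excel Plugin itself: the plugin estimates the s0 factor automatically and derives q-values from permutations, whereas here s0 is simply a parameter and no permutation step is included. The function name and the sign convention (tumor minus normal) are choices made for this sketch.

```python
import math


def sam_d(tumor, normal, s0=0.0):
    """Relative difference d = (mean_tumor - mean_normal) / (s + s0).

    s is the pooled standard error used in the two-class unpaired case;
    s0 is a small positive constant that damps genes with very low variance.
    """
    n1, n2 = len(tumor), len(normal)
    if n1 + n2 < 3:
        raise ValueError("need at least three samples in total")
    m1 = sum(tumor) / n1
    m2 = sum(normal) / n2
    # Sum of squared deviations from each group's own mean.
    ss = sum((x - m1) ** 2 for x in tumor) + sum((x - m2) ** 2 for x in normal)
    a = (1.0 / n1 + 1.0 / n2) / (n1 + n2 - 2)
    s = math.sqrt(a * ss)
    if s + s0 == 0:
        raise ValueError("zero variance and s0 == 0: statistic undefined")
    return (m1 - m2) / (s + s0)


if __name__ == "__main__":
    tumor = [4.0, 5.0, 6.0]   # made-up log2 expression values
    normal = [1.0, 2.0, 3.0]
    print(round(sam_d(tumor, normal), 3))
    print(round(sam_d(tumor, normal, s0=1.0), 3))
```

With s0 = 0 the statistic reduces to an ordinary two-sample t-statistic with pooled variance; increasing s0 shrinks the score, most strongly for genes whose pooled standard error is small.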
In total, 156 and 124 genes with a q-value below 5% were identified in data sets A and B, respectively (Supplementary material: Table 3). The top ten up- and down-regulated genes for both data sets are shown in Table 2A. Interestingly, genes encoding glycan-binding receptors were over-represented, especially among the highest-ranking down-regulated genes in data set B, where they accounted for 6 of the 10 genes.
Seventy-three differentially expressed genes were common to both data sets (Table 2B), leaving 83 of 156 genes (53%) in data set A and 51 of 124 genes (41%) in data set B, or about 50%, as uniquely differentially expressed in each data set. This relatively low overlap may be explained by the fact that the two data sets are derived from cohorts with different biological backgrounds, especially for the non-cancerous tissue. The gene expression profiles of the controls in data set A may be influenced by the increased mammographic density in this cohort, a predisposing phenotype per se, and the gene expression profiles of the adjacent normal tissue in data set B may be altered by the presence of the nearby tumor. This type of influence exerted on the surrounding tissue, the so-called field effect, was first proposed as early as 1953 (Slaughter et al., 1953) and has since been confirmed in several cancer types, including breast carcinomas (Braakhuis et al., 2003).
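The overlap figures follow directly from the three counts reported above (156, 124 and 73 genes). As a small worked example, the Python sketch below derives the share of uniquely differentially expressed genes per data set; the function name and the additional Jaccard index are illustrative additions, not part of the original analysis.

```python
def overlap_summary(n_a, n_b, shared):
    """Summarize the overlap between two lists of significant genes."""
    unique_a = n_a - shared          # significant only in data set A
    unique_b = n_b - shared          # significant only in data set B
    union = n_a + n_b - shared       # significant in at least one data set
    return {
        "unique_a_fraction": unique_a / n_a,
        "unique_b_fraction": unique_b / n_b,
        "jaccard": shared / union,
    }

summary = overlap_summary(n_a=156, n_b=124, shared=73)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
# unique_a_fraction: 53.2%
# unique_b_fraction: 41.1%
# jaccard: 35.3%
```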
Conclusion
Glycobiology, being one of the less explored and more complex areas of modern biology, opens many possibilities and has potential in several areas, in particular the study of the carcinogenic process (Figure 1). Earlier diagnosis and potential new treatments are among the more encouraging prospects that research on glycobiology may lead to. In this review we have focused on several important pathways of glycosylation and explored their possible role in the malignant transformation of breast carcinomas. In parallel we presented the first comprehensive analyses of mRNA expression of all known glycan-related genes in both breast carcinomas and normal breast tissue. The results of these analyses reveal that mRNA levels for many of these genes differ significantly between normal and malignant breast tissue, indicating that synthesis, degradation and adhesion mediated by glycans may be altered drastically in breast carcinoma (Table 1). Simultaneous analysis of the expression of all glycan-related genes has the clear advantage of enabling a comprehensive view of the genetic background of the glycobiological changes in cancer cells. Nonetheless, how changes in mRNA levels of glycan genes influence a cell's glycome and the precise role of such altered glycan structures in disease remain to be elucidated. Therefore, efforts should be made to improve our understanding of the complexity of these processes in their entirety by employing a systems biology approach.
Acknowledgements
The authors would like to thank Kristian Prydz for valuable discussions. This work is supported by grants from the Norwegian Research Council (155218/V40, 175240/S10 to ALBD, FUGE-NFR 181600/V11 to VNK), and from South-Eastern Norwegian Regional Health Authority (VDH, AH and ALBD). IOP is an MD/PhD student supported with grants from the Faculty of Medicine, University of Oslo.
Supplementary data associated with this article can be found, in the online version, at
© 2010. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Glycosylation is the stepwise procedure of covalent attachment of oligosaccharide chains to proteins or lipids, and alterations in this process have been associated with malignant transformation. Simultaneous analysis of the expression of all glycan-related genes clearly gives the advantage of enabling a comprehensive view of the genetic background of the glycobiological changes in cancer cells. Studies focusing on the expression of the whole glycome have now become possible, which prompted us to review the present knowledge on glycosylation in relation to breast cancer diagnosis and progression, in the light of available expression data from tumors and breast tissue of healthy individuals. We used various data resources to select a set of 419 functionally relevant genes involved in synthesis, degradation and binding of N-linked and O-linked glycans, Lewis antigens, glycosaminoglycans (chondroitin, heparin and keratan sulfate in addition to hyaluronan) and glycosphingolipids. Such glycans are involved in a number of processes relevant to carcinogenesis, including regulation of growth factors/growth factor receptors, cell–cell adhesion and motility as well as immune system modulation. Expression analysis of these glycan-related genes revealed that mRNA levels for many of them differ significantly between normal and malignant breast tissue. An associative analysis of these genes in the context of current knowledge of their function in protein glycosylation and connection(s) to cancer indicated that synthesis, degradation and adhesion mediated by glycans may be altered drastically in mammary carcinomas. Although further analysis is needed to assess how changes in mRNA levels of glycan genes influence a cell's glycome and the precise role that such altered glycan structures play in the pathogenesis of the disease, lessons drawn from this study may help in determining directions for future research in the rapidly-developing field of glycobiology.
Details
1 Department of Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway; Faculty of Medicine, Oslo University, Norway
2 Institute for Clinical Epidemiology and Molecular Biology (Epi-Gen), Faculty Division Akershus University Hospital, Faculty of Medicine, Oslo, Norway
3 Department of Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway; Department of Oncology, Oslo University Hospital Radiumhospitalet, Oslo, Norway
4 Institute for Clinical Epidemiology and Molecular Biology (Epi-Gen), Faculty Division Akershus University Hospital, Faculty of Medicine, Oslo, Norway; Department of Surgery, Akershus University Hospital, Oslo, Norway
5 Department of Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway; Institute for Informatics, Faculty of Natural Sciences and Mathematics, University of Oslo, Norway
6 Department of Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway; Institute for Clinical Epidemiology and Molecular Biology (Epi-Gen), Faculty Division Akershus University Hospital, Faculty of Medicine, Oslo, Norway
7 Institute for Informatics, Faculty of Natural Sciences and Mathematics, University of Oslo, Norway