Introduction
A majority of the proteins synthesized in the endoplasmic reticulum (ER) are transferred to the Golgi cisternae for further chemical modification by glycosylation (Alberts, 2002), a process that sequentially and covalently attaches sugar moieties to proteins, catalyzed by a set of enzymatic reactions within the ER and the Golgi cisternae. These enzymes, called glycosyltransferases, are localized in the ER and
In this article, we will focus on the role of glycans as markers of cell identity. For the glycans to play this role, they must inevitably represent a molecular code (Gabius, 2018; Varki, 2017; Pothukuchi et al., 2019). While the functional consequences of glycan alterations have been well studied, the glycan code has remained an enigma (Gabius, 2018; Pothukuchi et al., 2019; Bard and Chia, 2016; D’Angelo et al., 2013). We study the
There are two aspects of the cell-type-specific glycan code and the code generation mechanism that have an important bearing on quantifying fidelity. The first is that extant glycan distributions have high
Here, we define
First, since an important function of the glycan spectrum is cell type/niche identification, it seems natural to relate
Constructing a high-fidelity representation of a
Within our synthesis model, an increase in the number of Golgi cisternae drives an increase in the glycan complexity, keeping everything else fixed.
We explore the geometry of the
For fixed number of enzymes and cisternae, there is an optimal level of specificity of enzymes that achieves the complex target distribution with high fidelity. Keeping the number of enzymes fixed, having low specificity or sloppy enzymes and larger cisternal number could give rise to a diverse repertoire of functional glycans, a strategy used in organisms such as plants and algae. Promiscuous enzymes bring in the potential for
Thus, our results imply that the pressure to produce the target glycan code for a given cell type with high fidelity places strong constraints on the cisternal number and enzyme specificity (Sengupta and Linstedt, 2011). Taken together, our quantitative analysis of the trade-offs has deep implications for the non-equilibrium self-assembly of Golgi cisternae and suggests that the control of cisternal number must involve a coupling of non-equilibrium self-assembly of cisternae with enzymatic chemical reaction kinetics (Glick and Malhotra, 1998). This combined dynamics of chemical processing with non-equilibrium membrane dynamics involving fission, fusion, and transport (Sachdeva et al., 2016; Sens and Rao, 2013) opens up a new direction for future research.
Complexity of glycan code
Since each cell type (in a niche) is identified with a distinct glycan profile (Gabius, 2018; Varki, 2017; Pothukuchi et al., 2019), and this glycan profile is noisy because of the stochastic noise associated with the synthesis and transport (Pothukuchi et al., 2019; Bard and Chia, 2016; D’Angelo et al., 2013), a large number of different cell types can be differentiated only if the cells are able to produce a large set of glycan profiles that are distinguishable in the presence of this noise. Our task is to identify a quantitative measure for the
To identify such a quantitative measure of complexity, we first need a consistent way of smoothing or coarse-graining the raw glycan profiles obtained from MSMS measurements to remove measurement and synthesis noise. Here, we denoise the glycan profile by approximating it by a Gaussian mixture model (GMM) with a specified number of components that are supported on a finite set of indices (Bacharoglou, 2010). Since the size of the set of all possible -component Gaussian densities is an increasing function of , we define the complexity of a mixture of Gaussians as the number of components . Figure 1 demonstrates that the value of at which the -component GMM approximation of the target profile saturates is a good measure of complexity. Using this definition we see that the complexity of the glycan profiles of various organisms correlates well with the number of cell types in an organism (details of the procedure are given in Appendix 1). We will now describe a general model of the cellular machinery that is capable of synthesizing glycans of any complexity. We expect that cells need a more elaborate mechanism to produce profiles from a more complex set.
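The denoising step described above can be illustrated with a minimal sketch. It assumes a one-dimensional glycan index and fits the mixture by a weighted expectation-maximization loop written directly in NumPy; the function name `fit_gmm_profile`, the initialization, and the variance floor are illustrative choices, not the procedure of Appendix 1.

```python
import numpy as np

def fit_gmm_profile(index, abundance, n_components, n_iter=200, seed=0):
    """Smooth a discrete abundance profile with a 1D Gaussian mixture.

    Hypothetical helper: weighted EM in which each index position counts
    in proportion to its relative abundance. Returns the normalized
    mixture evaluated on `index`.
    """
    index = np.asarray(index, dtype=float)
    w = np.asarray(abundance, dtype=float)
    w = w / w.sum()
    rng = np.random.default_rng(seed)
    # Initialize the means by sampling index positions according to abundance.
    means = rng.choice(index, size=n_components, replace=False, p=w)
    variances = np.full(n_components, np.var(index) / n_components + 1e-3)
    weights = np.full(n_components, 1.0 / n_components)

    def component_densities():
        diff = index[:, None] - means[None, :]
        norm = np.sqrt(2.0 * np.pi * variances)
        return weights * np.exp(-0.5 * diff**2 / variances) / norm

    for _ in range(n_iter):
        # E-step: responsibility of each component for each index position.
        dens = component_densities()
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: abundance-weighted updates of weights, means, variances.
        nk = (w[:, None] * resp).sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (w[:, None] * resp * index[:, None]).sum(axis=0) / nk
        diff = index[:, None] - means[None, :]
        variances = (w[:, None] * resp * diff**2).sum(axis=0) / nk + 1e-3

    smoothed = component_densities().sum(axis=1)
    return smoothed / smoothed.sum()
```

Scanning `n_components` upward and recording the approximation error gives a curve whose saturation point plays the role of the complexity measure discussed in the text.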
Figure 1.
Living cells display a complex glycan distribution.
(a) 3-Gaussian mixture model (GMM) and 20-GMM approximation for the relative abundance of glycans taken from mass spectrometry coupled with determination of molecular structure (MSMS) data of planaria
Synthesis of glycans in the Golgi cisternae
The glycan display at the cell surface is a result of proteins that flux through and undergo sequential chemical modification in the secretory pathway, comprising an array of Golgi cisternae situated between the ER and PM, as depicted in Figure 2. Glycan-binding proteins (GBPs) are delivered from the ER to the first cisterna, whereupon they are processed by the resident enzymes in a sequence of steps that constitute the N-glycosylation process (Varki, 2009). A generic enzymatic reaction in the cisterna involves the catalysis of a group transfer reaction in which the monosaccharide moiety of a simple sugar donor substrate, for example, UDP-Gal, is transferred to the acceptor substrate, by a Michaelis–Menten (MM)-type reaction (Varki, 2009)
(1)
Figure 2.
Enzymatic reaction and transport network in the secretory pathway.
Represented here is the array of Golgi cisternae (blue) indexed by situated between the endoplasmic reticulum (ER) and plasma membrane (PM). Glycan-binding proteins are injected from the ER to cisterna-1 at rate . Superimposed on the Golgi cisternae is the transition network of chemical reactions (column) – inter-cisternal transfer (rows), the latter with rates . denotes the acceptor substrate in compartment and the glycosyl donor is chemostated in each cisterna. This results in a distribution (relative abundance) of glycans displayed at the PM (red curve), which is representative of the cell type.
From the first cisterna, the proteins with attached sugars are delivered to the second cisterna at a given inter-cisternal transfer rate, where further chemical processing catalyzed by the enzymes resident in the second cisterna occurs. This chemical processing and inter-cisternal transfer continue until the last cisterna, whereupon the fully processed glycans are displayed at the PM (Varki, 2009). The network of chemical processing and inter-cisternal transfer forms the basis of the physical model that we will describe next.
Any physical model of such a network of enzymatic reactions and cisternal transfer needs to be augmented by reaction and transfer rates and chemical abundances. To obtain the range of allowed values for the reaction rates and chemical abundances, we use elaborate enzymatic reaction models, such as the KB2005 model (Umaña and Bailey, 1997; Krambeck et al., 2009; Krambeck and Betenbaugh, 2005) (with a network of 22,871 chemical reactions and 7565 oligosaccharide structures), that predict the N-glycan distribution based on the activities and levels of processing enzymes distributed in the Golgi cisternae of mammalian cells. For the allowed rates of cisternal transfer, we rely on the recent study by Ungar and coworkers (Fisher et al., 2019; Fisher and Ungar, 2016), which shows how the overall Golgi transit time and cisternal number can be tuned to engineer a homogeneous glycan distribution.
Model
Chemical reaction and transport network in cisternae
We consider an array of Golgi cisternae, labelled by , between the ER and PM (Figure 2). GBPs, denoted as , are delivered from the ER to cisterna-1 at an injection rate . It is well established that the concentration of the glycosyl donor in the
Let denote the maximum number of possible glycosylation reactions in each cisterna , catalyzed by enzymes labelled as , with , where is the total number of enzyme species in each cisterna. Since many substrates can compete for the substrate-binding site on each enzyme, one expects in general that . The configuration space of the network in Figure 2 is . For the N-glycosylation pathway in a typical mammalian cell, , = 10–20, and = 4–8 (Umaña and Bailey, 1997; Krambeck and Betenbaugh, 2005; Krambeck et al., 2009; Fisher and Ungar, 2016). We account for the fact that the enzymes have specific cisternal localization by setting their concentrations to zero in those cisternae where they are not present.
The action of enzyme on the substrate in cisterna is given by
(2)
where . In general, the forward, backward, and catalytic rates , , and , respectively, depend on the cisternal label , the reaction label , and the enzyme label , which parametrize the MM reactions (Price and Stevens, 1999). For instance, structural studies on glycosyltransferase-mediated synthesis of glycans (Moremen and Haltiwanger, 2019) would suggest that the forward rate depends on the binding energy of the enzyme to acceptor substrate and a
A potential candidate for such a cisternal variable is pH (Kellokumpu, 2019), whose value is maintained homeostatically in each cisterna (Casey et al., 2010); changes in pH can affect the shape of an enzyme (substrate) or its charge properties, and in general the reaction efficiency of an enzyme has a pH optimum (Price and Stevens, 1999). Another possible candidate for a cisternal variable is membrane bilayer thickness (Dmitrieff et al., 2013); indeed, both pH (Llopis et al., 1998) and membrane thickness are known to have a gradient across the Golgi cisternae. We take , where is the binding probability of enzyme with substrate , and define the binding probability using a biophysical model, similar in spirit to the Monod-Wyman-Changeux model of enzyme kinetics (Monod et al., 1965; Changeux and Edelstein, 2005) that depends on enzyme-substrate-induced fit.
Let and denote, respectively, the optimal ‘shape’ for enzyme and the substrate . We assume that the mismatch (or distortion) energy between the substrate and enzyme is , with a binding probability given by
(3)
where is a distance metric defined on the space of (e.g. the square of the -norm would be related to an elastic distortion model [Savir and Tlusty, 2007]) and the vector parametrizes enzyme
Our synthesis model is mean field, in that we ignore stochasticity in glycan synthesis that may arise from low copy numbers of substrates and enzymes, multiple substrates competing for the same enzymes, and kinetics of inter-cisternal transfer (Umaña and Bailey, 1997; Krambeck et al., 2009; Krambeck and Betenbaugh, 2005). Then the usual MM steady-state conditions for (2), which assume that the concentration of the intermediate enzyme-substrate complex does not change with time, imply that
where is the
Together with the constancy of the total enzyme concentration, , this immediately fixes the kinetics of product formation (not including inter-cisternal transport),
(4)
where
and
This reparametrization of the reaction rates in terms of is convenient since it relates to experimentally measurable parameters and MM constant , for each , which can be easily read out (see Appendix 2). As is the usual case, the maximum velocity is not an intrinsic property of the enzyme because it is dependent on the enzyme concentration ; while is an intrinsic parameter of the enzyme and the enzyme-substrate interaction. The enzyme catalytic efficiency, the so-called “” , is high for
We now add to this chemical reaction kinetics the rates of injection () and inter-cisternal transport from the cisterna to ; in Appendix 3, we display the complete set of equations that describe the changes in the substrate concentrations with time. These kinetic equations automatically obey the conservation law for the protein concentration (). At steady state, these kinetic equations lead to a set of nonlinear recursion equations (15)-(16) that are displayed in Appendix 3, which can be solved numerically to obtain the steady-state glycan concentrations, , as a function of the independent vectors , , and , the transport rates and specificity, .
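To make the interplay of reaction and transport concrete, the following sketch solves a deliberately simplified version of the steady-state problem. It is not the recursion of Appendix 3: it assumes linear (first-order) kinetics in place of MM kinetics, a single reaction rate `r` shared by all steps, and a single transfer rate `mu` shared by all cisternae. The function name and all parameter names are hypothetical.

```python
import numpy as np

def steady_state_chain(n_cisternae, n_species, q, r, mu):
    """Steady-state concentrations for a toy linear glycosylation chain.

    Assumptions (not from the paper): species k is converted to k+1 at
    first-order rate r in every cisterna, every species leaves a cisterna
    at rate mu, and only species 0 is injected (at rate q) into cisterna 1.
    Returns an array of shape (n_cisternae, n_species).
    """
    inflow = np.zeros(n_species)
    inflow[0] = q
    profiles = []
    for _ in range(n_cisternae):
        c = np.zeros(n_species)
        for k in range(n_species):
            # Gain: transport from the previous cisterna plus reaction k-1 -> k.
            gain = inflow[k] + (r * c[k - 1] if k > 0 else 0.0)
            # Loss: transport out, plus reaction k -> k+1 unless k is the last species.
            loss = mu + (r if k < n_species - 1 else 0.0)
            c[k] = gain / loss
        profiles.append(c)
        inflow = mu * c  # outflow of this cisterna feeds the next one
    return np.array(profiles)
```

In this toy version the total flux `mu * profiles[j].sum()` equals the injection rate `q` in every cisterna, which mirrors the conservation law mentioned above, and the displayed profile shifts toward larger glycan indices as cisternae are added.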
Optimization problem
Let denote the ‘target’ concentration distribution, normalized to the distribution so that , for a particular cell type, that is, the goal of the sequential synthesis mechanism described in the section ‘Chemical reaction and transport network in cisternae’ is to approximate . Let denote the normalized steady-state glycan concentration distribution displayed on the PM. Then Equation 16 implies that , . We measure the (5)
We divide the KL divergence by the entropy of the target distribution to enable comparison of the fidelity of the mechanism across target distributions of different complexity. Note that high fidelity corresponds to low values of this normalized divergence, and vice versa.
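A short sketch of this normalized error measure follows. The direction of the divergence (target relative to synthesized) and the small floor `eps` on the synthesized distribution are assumptions made for illustration; the function name `normalized_kl` is hypothetical.

```python
import numpy as np

def normalized_kl(target, synthesized, eps=1e-12):
    """KL divergence of `target` from `synthesized`, divided by the target entropy.

    Both inputs are renormalized to sum to one. The result is undefined for
    a target concentrated on a single glycan, whose entropy is zero.
    """
    t = np.asarray(target, dtype=float)
    s = np.asarray(synthesized, dtype=float)
    t = t / t.sum()
    s = s / s.sum()
    support = t > 0  # terms with zero target weight contribute nothing
    kl = np.sum(t[support] * np.log(t[support] / np.maximum(s[support], eps)))
    entropy = -np.sum(t[support] * np.log(t[support]))
    return kl / entropy
```

A value of zero means the synthesized profile reproduces the target exactly; larger values mean lower fidelity.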
Thus, the problem of designing a sequential synthesis mechanism that approximates for a given enzyme specificity , transport rate , number of enzymes , and number of cisternae is given by
(6)
where we emphasize that the optimum fidelity is a function of . Note that there is a separation of time scales implicit in Optimization A – the chemical kinetics of the production of glycans and their display on the PM happens over cellular time scales, while the issues of trade-offs and changes of parameters are related to evolutionary timescales.
Optimization A, though well-defined, is a hard problem since the steady-state concentrations (16) are not
Results
The dimension of the optimization search space is extremely large . To make the optimization search more manageable, we make the following simplifying assumptions:
We ignore the -dependence of the vectors , or alternatively of – see Appendix 4 for details.
The enzyme-substrate-binding probability is still dependent on the substrate . We assume that the shape function is a scalar (a length), that is, . It further simplifies the algebra to assume that the lengths of the substrates are integer multiples of a basic unit (which we take to be 1), that is, . The norm that appears in (3) is taken to be the absolute value difference . Other metrics, such as , corresponding to the elastic distortion model (Savir and Tlusty, 2007), do not pose any computational difficulties, and we see that the results of our optimization remain qualitatively unchanged.
We drop the dependence of the specificity on and , and take it to be a scalar .
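Under these simplifications the binding probability depends only on a scalar length mismatch and a scalar specificity. The sketch below assumes an exponential (Boltzmann-like) dependence on the mismatch, which is one natural reading of Equation 3 but is stated here as an assumption; `binding_probability`, `enzyme_length`, and `sigma` are illustrative names.

```python
import numpy as np

def binding_probability(substrate_lengths, enzyme_length, sigma):
    """Assumed form exp(-sigma * |l_substrate - l_enzyme|) with integer substrate lengths."""
    lengths = np.asarray(substrate_lengths, dtype=float)
    return np.exp(-sigma * np.abs(lengths - enzyme_length))

# Low specificity: the enzyme acts on many neighbouring substrates.
sloppy = binding_probability(np.arange(1, 11), enzyme_length=5, sigma=0.5)
# High specificity: binding is concentrated on the matching substrate.
specific = binding_probability(np.arange(1, 11), enzyme_length=5, sigma=5.0)
```

With this form, raising `sigma` narrows the set of substrates an enzyme acts on, which is the sense in which specificity controls the width of the synthesized distribution.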
These restrictions significantly reduce the dimension of the optimization search, so much so that in certain limits we can solve the problem analytically (in Appendix 6, we show that Equation 21 can be solved analytically in the limit since the glycan index can be approximated by a continuous variable, and the recursion relations for the steady-state glycan concentrations Equations 15–16 can be cast as a matrix differential equation. This allows us to obtain an
The calculations in Appendix 6 imply, as one might expect, that the synthesis model needs to be more elaborate, that is, needs a larger number of cisternae or a larger number of enzymes , in order to produce a more complex glycan distribution. For a real cell type in a niche, the specific elaboration of the synthesis machinery would depend on a variety of control costs associated with increasing and . While an increase in the number of enzymes would involve genetic and transcriptional costs, the costs involved in increasing the number of cisternae could be rather subtle.
Notwithstanding the relative control costs of increasing and , it is clear from the special case that increasing the number of cisternae achieves the goal of obtaining an accurate representation of the target distribution. Suppose the target distribution for a fixed , that is, when , and 0 otherwise, and that the enzymes that catalyze the reactions are highly specific. In this limit,
Target distribution from coarse-grained MSMS
As discussed in the section ‘Complexity of glycan code’, we obtain the target glycan distribution from glycan profiles for real cells using MSMS measurements (Cummings and Crocker, 2020). The raw MSMS data, however, are not suitable as a target distribution because they are very noisy, with chemical noise in the sample and Poisson noise associated with detecting discrete events being the most relevant (Du et al., 2008). This means that many of the small peaks in the raw data are not part of the signal, and one has to smooth the distribution to remove the impact of noise.
We use MSMS data from
This hierarchy allows us to study the trade-off between the complexity of the target distribution and the complexity of the synthesis model needed to generate the distribution as follows. Let denote the -component GMM approximation for the
Trade-offs between number of enzymes, number of cisternae, and enzyme specificity to achieve given complexity
We summarize the main results that follow from an optimization of the parameters of the glycan synthesis machinery to a given target distribution in Figure 3 and Figure 4.
Figure 3.
Trade-offs amongst the glycan synthesis parameters, enzyme specificity , cisternal number , and enzyme number to achieve a complex target distribution .
(a, b) Normalised Kullback–Leibler distance as a function of and (for fixed ), (c, d) as a function of and (for fixed ), with the target distribution set to the 3-Gaussian mixture model (GMM) (less complex) and 20-GMM (more complex) approximations for the
Figure 4.
Fidelity of glycan distribution and optimal enzyme properties to achieve a complex target distribution.
The target is taken from 3-Gaussian mixture model (GMM) (less complex) and 20-GMM (more complex) approximations of the
The optimal fidelity is a convex function of for fixed values for other parameters (see Figure 3), that is, it first decreases with and then increases beyond a critical value of .
The lower complexity distributions can be synthesized with high fidelity with small , whereas higher complexity distributions require significantly larger (see Figure 4a and b). For a typical mammalian cell, the number of enzymes in the N-glycosylation pathway is in the range (Umaña and Bailey, 1997; Krambeck and Betenbaugh, 2005; Krambeck et al., 2009; Fisher and Ungar, 2016); Figure 4b would then suggest that the optimal cisternal number would range from (Sengupta and Linstedt, 2011).
The fidelity is decreasing in and for fixed values of the other parameters, and increasing in the complexity of for fixed . The marginal contribution of and in improving fidelity is approximately equal (see Figure 4a and b). We discuss the origin of this symmetry later in this section.
The optimal enzyme specificity , which minimizes the error as function of with fixed at , is an increasing function of and the complexity of the target distribution (Figure 3a and b and Figure 4c and d). This is consistent with the results in Appendix 6 where we established that the width of the synthesized distribution is inversely dependent on the specificity : since a GMM approximation with fewer peaks has wider peaks, is low, and vice versa. Similar results hold when is fixed at , and is varied (see Figure 3c and d and Figure 4c and d).
Our results are consistent with those in Fisher et al., 2019. They optimize incoming glycan ratio, transport rate, and effective reaction rates in order to synthesize a narrow target distribution centred around the desired glycan. The ability to produce specific glycans without much heterogeneity is an important goal in the pharmaceutical industry. They define heterogeneity as the total number of glycans synthesized and show that increasing the number of compartments decreases heterogeneity and increases the concentration of the specific glycan. They also show that the effect of compartments in reducing heterogeneity cannot be compensated by changing the transport rate. Our results are entirely consistent with theirs – we have shown that decreases as we increase . Thus, if the target distribution has a single sharp peak, increasing will reduce the heterogeneity in the distribution.
We insert an important cautionary note here. It would seem that the results in Figure 4 imply that there is an approximate symmetry in the model, that is, increasing either or affects the fidelity, optimal enzyme specificity, and the sensitivity in approximately the same way. This would be an erroneous inference: the apparent symmetry is a consequence of the distortion model we have used for calculating the binding probabilities of substrates with enzymes. Its root cause is that we have allowed all enzymes to catalyze reactions in all cisternae (albeit with different efficiencies). The symmetry is broken simply by making the activity of the enzymes depend on the cisterna. A simple realization of this in terms of the distortion model is given in Appendix 7.
Optimal partitioning of enzymes in cisternae
Having studied the optimum to attain a given target distribution with high fidelity, we ask: what is the optimal partitioning of the enzymes in these cisternae? Answering this within the context of our chemical reaction model (section ‘Chemical reaction and transport network in cisternae’) requires some care since it incorporates the following enzymatic features: (a) enzymes with a finite specificity can catalyze several reactions, although with an efficiency that varies with both the substrate index and cisternal index , and (b) every enzyme appears in each cisterna; however, their reaction efficiencies depend on the enzyme levels, the enzymatic reaction rates, and the enzyme matching function , all of which depend on the cisternal index .
Therefore, instead of focusing on the cisternal partitioning of enzymes, we identify the chemical reactions that occur with high propensity in each cisterna. For this we define an effective reaction rate for in the (7)
According to our model presented in the section ‘Chemical reaction and transport network in cisternae’, the list of reactions with high effective reaction rates in each cisterna corresponds to a cisternal partitioning of the perfect enzymes. In a future study, we will consider a Boolean version of a more complex chemical model to address more clearly the optimal enzyme partitioning amongst cisternae.
Figure 5ai shows the heat map of the effective reaction rates in each cisterna for the optimal that minimizes the normalized KL distance to the 20-GMM target distribution (see Figure 5aii). The optimized glycan profile displayed in Figure 5aiii is very close to the target. An interesting observation from Figure 5ai is that the same reaction can occur in multiple cisternae.
Figure 5.
Optimal enzyme partitioning in cisternae.
(a) Heat map of the effective reaction rates in each cisterna (representing the optimal enzyme partitioning) and the steady-state concentration in the last compartment () for the 20-Gaussian mixture model (GMM) target distribution. Here, , , normalized . (b) Effective reaction rates after swapping the optimal enzymes of the fourth and second cisternae. The displayed glycan profile is considerably altered from the original profile.
Keeping everything else fixed at the optimal value, we ask whether simply repartitioning the optimal enzymes amongst the cisternae alters the displayed glycan distribution. In Figure 5bi, we have exchanged the enzymes of the fourth and second cisternae. The glycan profile after this repartitioning (see Figure 5biii) is now completely altered (compare Figure 5bii with Figure 5biii). Thus, one can generate different glycan profiles by repartitioning enzymes amongst the same number of cisternae (Jaiman and Thattai, 2018).
Geometry of the fidelity landscape
Here we show that the optimum solution is not unique, rather it is highly degenerate, with several equally good optimum solutions. Thus the multidimensional fidelity landscape in , , , and is typically rugged. We analyse the geometry of this fitness landscape by doing a local Hessian analysis about the optimal solutions.
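The local Hessian analysis mentioned above can be sketched as follows. The example uses central finite differences and a toy quadratic objective in place of the actual fidelity function; `numerical_hessian`, `stiff_and_sloppy`, and `toy_objective` are hypothetical names introduced only for illustration.

```python
import numpy as np

def numerical_hessian(f, x0, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (f(x0 + ei + ej) - f(x0 + ei - ej)
                          - f(x0 - ei + ej) + f(x0 - ei - ej)) / (4.0 * h * h)
    return 0.5 * (hess + hess.T)  # symmetrize against round-off

def stiff_and_sloppy(f, x_opt):
    """Eigen-decomposition of the Hessian, sorted from stiffest to sloppiest."""
    eigvals, eigvecs = np.linalg.eigh(numerical_hessian(f, x_opt))
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def toy_objective(x):
    # Stand-in for the fidelity error: stiff along x0 + x1, sloppy along x0 - x1.
    return 100.0 * (x[0] + x[1]) ** 2 + 0.01 * (x[0] - x[1]) ** 2
```

Eigenvectors with large eigenvalues are directions along which small parameter changes degrade fidelity quickly (stiff); eigenvectors with small eigenvalues are directions the objective barely notices (sloppy).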
Degeneracies in the synthesis model
The synthesis model is highly degenerate, in the sense that many combinations of parameters give rise to the same glycan profile. This makes the optimization non-convex as there are many equally good minima. These degeneracies are both discrete and continuous. The continuous degeneracies correspond to regions in reaction rate ()-transport rate () space along which the concentration profile does not change. The discrete degeneracies are disconnected regions in the parameter space that correspond to the same glycan profile. The number of discrete degeneracies increases exponentially with increase in (). We also find that the fraction of initial conditions converging to a solution close to the global minimum increases on increasing (). Technical details of these issues are discussed in Appendix 8.
Stiff and sloppy directions
We analyse the change in fidelity on small perturbations in , , , and around the optimal solution. This allows us to determine where the cell needs to develop a tighter control mechanism (
Figure 6.
Stiff and sloppy directions in the optimization parameters.
(a) Eigenvectors of the Hessian matrix for . The x-axis indexes the eigenvectors, the y-axis indexes the
Implications for robustness to parametric noise
Since the synthesized glycan distribution displayed by the cell marks its identity, it must be robust to noise intrinsic to the synthesis machinery. The degeneracy of solutions and sloppy directions in the fidelity landscape makes the glycan distribution robust to intrinsic noise in the synthesis and cell-to-cell variations in the kinetic parameters. We find that the number of degeneracies increases on increasing (), and the average stiffness of the optimized parameters decreases on increasing (), making the synthesis more robust to parameter fluctuations. Further, while the parameter space is high dimensional, the dimension of
Strategies to achieve high glycan diversity
So far we have studied how the complexity of the target glycan distribution places constraints on the evolution of Golgi cisternal number and enzyme specificity. We now take up another issue: how the physical properties of the Golgi cisternae, namely, cisternal number and inter-cisternal transport rate, may drive the diversity of glycans (Varki, 2011; Dennis et al., 2009). There is substantial correlative evidence to support the idea that cell types that carry out extensive glycan processing employ larger numbers of Golgi cisternae. For example, the salivary Brunner’s gland cells secrete mucus that contains heavily O-glycosylated mucin as its major component (Van Halbeek et al., 1983). The Golgi complex in these specialized cells contains 9–11 cisternae per stack. Additionally, several organisms such as plants and algae secrete a rather diverse repertoire of large, complex glycosylated proteins, for a variety of functions (McFarlane et al., 2014; Koch et al., 2015; O’Neill et al., 2004; Hayashi and Kaida, 2011; Kumar et al., 2011; Gow and Hube, 2012; Atmodjo et al., 2013; Free, 2013; Pauly et al., 2013; Burton and Fincher, 2014). These organisms possess enlarged Golgi complexes with multiple cisternae per stack (Becker and Melkonian, 1996; Mironov et al., 2017; Donohoe et al., 2007; Mogelsvang et al., 2003; Ladinsky et al., 2002).
We define
We use the sigmoid function as a differentiable approximation to the Heaviside function and define the following optimization to maximize diversity for a given set of parameter values, :
where, as before, /min, and /min, and is the threshold. See Appendix 2 for details on the parameter estimation.
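The role of the sigmoid can be illustrated with a small sketch. It assumes that diversity counts the glycan species whose relative abundance exceeds the threshold, with the hard count replaced by a smooth one; the function name `soft_diversity` and the `steepness` parameter are illustrative and not taken from the source.

```python
import numpy as np

def soft_diversity(concentrations, threshold, steepness=200.0):
    """Differentiable count of species whose relative abundance exceeds `threshold`.

    Each species contributes sigmoid(steepness * (c - threshold)), which tends
    to the Heaviside step function as `steepness` grows.
    """
    c = np.asarray(concentrations, dtype=float)
    c = c / c.sum()
    return float(np.sum(1.0 / (1.0 + np.exp(-steepness * (c - threshold)))))
```

Because the smooth count is differentiable in the concentrations, it can be passed to a gradient-based optimizer, which a hard threshold would not allow.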
The results displayed in Figure 7a show that for a fixed specificity the diversity at first increases with the number of cisternae , and then saturates at a value that depends on . For very-high-specificity enzymes, one can achieve very high diversity by appropriately increasing . This establishes the link between glycan diversity and cisternal number. However, this link is correlational at best since there are many ways to achieve high glycan diversity – notably by increasing the number of enzymes.
Figure 7.
Strategies for achieving high glycan diversity.
Diversity versus and transport rate at various values of specificity for fixed . (a) Diversity vs. at optimal transport rate . Diversity initially increases with , but eventually levels off. The levelling off starts at a higher when is increased. These curves are bounded by the curve. (b) Diversity vs. cisternal residence time () in units of the reaction time () at various values of , for fixed and .
On the other hand, one of the goals of glycoengineering is to produce a particular glycan profile with low heterogeneity (Fisher et al., 2019; Jaiman and Thattai, 2018). For low-specificity enzymes, the diversity remains unchanged upon increasing the cisternal residence time. For enzymes with high specificity, the diversity typically shows a non-monotonic variation with the cisternal residence time. At small cisternal residence time, the diversity decreases from the peak because of the early exit of incomplete oligomers. At large cisternal residence time, the diversity again decreases as more reactions are taken to completion. Note that the peak is generally very flat, which is consistent with the results in Fisher et al., 2019. To get a sharper peak, as advocated for instance by Jaiman and Thattai, 2018, one might need to increase the number of high-specificity enzymes further.
Discussion
The precision of the stereochemistry and enzymatic kinetics of these N-glycosylation reactions (Varki, 2009) has inspired a number of mathematical models (Umaña and Bailey, 1997; Krambeck et al., 2009; Krambeck and Betenbaugh, 2005) that predict the N-glycan distribution based on the activities and levels of processing enzymes distributed in the Golgi cisternae of mammalian cells and compare these predictions with N-glycan mass spectrum data. Models such as the KB2005 model (Umaña and Bailey, 1997; Krambeck and Betenbaugh, 2005; Krambeck et al., 2009) are extremely elaborate (with a network of 22,871 chemical reactions and 7565 oligosaccharide structures) and require many chemical input parameters. These models have an important practical role to play, that of being able to predict the impact of the various
Our focus is different. We are interested in the role of glycans as a marker or molecular code of cell identity (Gabius, 2018; Varki, 2017; Pothukuchi et al., 2019), and in particular, understanding enzymatic and transport processes located in the secretory apparatus of the cell that ensure that this code is generated with high
The glycan profile on the cell surface is a marker of
The glycans at the cell surface are the end product of a sequential chemical processing via a set of enzymes resident in the Golgi cisternae and transport across cisternae (Varki, 2017; Varki, 1998; Pothukuchi et al., 2019). We have proposed a general model for chemical synthesis and transport that, in principle, allows us to compute the
We define the
The results of the optimization over rates and enzyme configurations for a given value of and a target distribution of given complexity are given in Figure 3 and Figure 4. Here, we highlight some qualitative consequences of the model:
(a) Keeping the number of enzymes fixed, a more elaborate transport mechanism (via control of and ) is essential for synthesizing high-complexity target distributions with high fidelity, or equivalently, low error (Figure 4a and b). Fewer cisternae cannot be compensated for by optimizing the enzymatic synthesis via control of parameters , , and . An empirical verification of this would involve a coordinated analysis of the glycan profiles, ultrastructure of Golgi, and the number of glycosylation enzymes across many species.
(b) Thus, our study suggests that the requirement that a glycan code of a given complexity be synthesized with sufficiently high fidelity imposes functional control on the Golgi cisternal number. It also provides an argument for the evolutionary requirement of multiple compartments by demonstrating that the fidelity and robustness of the glycan code arising from a chemical synthesis that involves multiple cisternae are higher than those of one that involves a single cisterna, keeping everything else fixed (see Figure 4a and b and Figure 6). This feature, that with multiple cisternae and precise enzyme partitioning one may generically achieve a highly accurate representation of the target distribution, has been highlighted in an algorithmic model of glycan synthesis (Jaiman and Thattai, 2018).
Combining (a) and (b), our study quantitatively shows that constructing a high-fidelity representation of a
Organisms such as plants and algae have a diverse repertoire of glycans that are utilized in a variety of functions (McFarlane et al., 2014; Koch et al., 2015; O’Neill et al., 2004; Hayashi and Kaida, 2011; Kumar et al., 2011; Gow and Hube, 2012; Atmodjo et al., 2013; Free, 2013; Pauly et al., 2013; Burton and Fincher, 2014). Our study shows that it is optimal to use low-specificity enzymes to synthesize target distributions with high diversity (Figure 7). However, this compromises the complexity of the glycan distribution, revealing a tension between complexity and diversity. One way of relieving this tension is to have larger numbers of enzymes and cisternae.
Our study shows that for a fixed number of enzymes and cisternae, there is an optimal enzyme specificity that achieves the lowest distance from a given target distribution. As we see in Figure 4d, this optimal enzyme specificity can be very high for highly complex target distributions. Such high specificity can lower fitness when the environment, and hence the target glycan distribution, fluctuates rapidly, and the synthesis parameters cannot change rapidly enough to track the environment (Nam et al., 2012; Peracchi, 2018). A compromise between robustness to a changing environment and high fidelity in synthesizing high-complexity glycan profiles is achievable by sloppy enzymes coupled with error-correcting mechanisms (Nam et al., 2012; Peracchi, 2018). However, sloppy enzymes create ‘wrong’ glycans, and therefore ex-post error-correcting mechanisms must be in place to correct synthesis errors and ensure high fidelity of the glycan code. A task for the future is to understand the role of intracellular transport in providing non-equilibrium proofreading mechanisms that reduce such coding errors, and to identify its optimal adaptive strategies and plasticity in a time-varying environment.
Combining (c) and (d), we find that keeping the number of enzymes fixed, having low specificity or sloppy enzymes, and larger cisternal number could give rise to a diverse repertoire of functional glycans. Sloppy or promiscuous enzymes bring in the potential for
The model solution is degenerate, in the sense that there are many equally good global minima. These degeneracies are both continuous and discrete. The continuous degeneracies correspond to regions in the reaction rate – transport rate space, moving along which will not change the concentration profile, thus ensuring
Our model implies that, close to a local minimum, the inter-cisternal transport rate and the specificity of the enzymes are stiff directions; that is, the cell should exercise tighter control on these two parameters than on the other parameters. The reaction rates close to the local minimum are sloppy directions, and moving along these directions does not change the glycan profile much.
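The distinction between stiff and sloppy directions can be illustrated with a minimal numerical sketch. It does not use the synthesis model of this study: the quadratic cost `toy_cost`, its coefficients, the location of its minimum, and the parameter names are all assumptions chosen only so that the cost depends strongly on a transport rate and a specificity and weakly on two reaction rates. The sketch estimates the Hessian of the cost at the minimum by central finite differences and reads off stiff and sloppy directions from its eigenvalues and eigenvectors.

```python
import numpy as np

# Hypothetical parameter ordering (labels are illustrative, not the paper's symbols).
PARAM_NAMES = ["transport_rate", "specificity", "reaction_rate_1", "reaction_rate_2"]

def toy_cost(theta):
    """Assumed toy cost with a minimum at (1, 2, 1, 1).

    The coefficients are invented so that the cost is steep in the first
    two parameters and nearly flat in the last two.
    """
    mu, sigma, r1, r2 = theta
    return (50.0 * (mu - 1.0) ** 2
            + 20.0 * (sigma - 2.0) ** 2
            + 5.0 * (mu - 1.0) * (sigma - 2.0)
            + 0.01 * (r1 - 1.0) ** 2
            + 0.001 * (r2 - 1.0) ** 2)

def numerical_hessian(f, x, h=1e-3):
    """Central finite-difference estimate of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return 0.5 * (hess + hess.T)  # symmetrize against round-off

theta_min = np.array([1.0, 2.0, 1.0, 1.0])
H = numerical_hessian(toy_cost, theta_min)

# eigh returns eigenvalues in ascending order: sloppiest first, stiffest last.
eigvals, eigvecs = np.linalg.eigh(H)

for lam, vec in zip(eigvals, eigvecs.T):
    dominant = PARAM_NAMES[int(np.argmax(np.abs(vec)))]
    print(f"curvature {lam:10.4f}  dominated by {dominant}")
```

In this toy example, the eigenvectors with the largest curvature are dominated by the transport rate and the specificity, while those with the smallest curvature lie along the reaction rates. A large spread of Hessian eigenvalues is the usual signature of such a separation into stiff and sloppy directions.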
Taken together, our quantitative analysis of the trade-offs has deep implications for non-equilibrium self-assembly of the Golgi cisternae, and suggests that the non-equilibrium control of cisternal number must involve a coupling of non-equilibrium self-assembly of cisternae with enzymatic chemical reaction kinetics (Glick and Malhotra, 1998).
Admittedly the chemical network that we have considered here is much simpler than the chemical network associated with the possible protein modifications in the secretory pathway. For instance, typical N-glycosylation pathways would involve the glycosylation of a variety of GBPs. Further, apart from N-glycosylation, there are other glycoprotein, proteoglycan, and glycolipid synthesis pathways (Alberts, 2002; Varki, 2009; Pothukuchi et al., 2019). Our task has been to get at a qualitative understanding using quantitative methods and thereby to arrive at general principles. We believe our analysis is generalizable and that the qualitative results we have arrived at would still hold. To conclude, our work establishes the link between the cisternal machinery (chemical and transport) and high-fidelity synthesis of a complex glycan code. We find that the pressure to achieve the target glycan code for a given cell type places strong constraints on the cisternal number and enzyme specificity (Sengupta and Linstedt, 2011). An important implication is that a description of the non-equilibrium self-assembly of a fixed number of Golgi cisternae must combine the dynamics of chemical processing and membrane dynamics involving fission, fusion, and transport (Sengupta and Linstedt, 2011; Sachdeva et al., 2016; Sens and Rao, 2013). We believe that this is a promising direction for future research.
© 2022, Yadav et al. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Many proteins that undergo sequential enzymatic modification in the Golgi cisternae are displayed at the plasma membrane as cell identity markers. The modified proteins, called glycans, represent a molecular code. The fidelity of this