Nonlinear Modeling of Embryonic Salivary Gland Development
Michael Melnick, DDS, Ph.D. and Tina Jaskoll, Ph.D.

            It is axiomatic in developmental biology that organogenesis is the programmed expression of regulatory genes coupled to downstream structural genes and epigenetic events. This process is dependent on the combinatorial function of diverse signal transduction pathways composed of hundreds of cell signaling molecules that transmit information between and within cells (Noselli and Perrimon, 2000; Gilbert and Sarkar, 2000). To begin to chart this largely unexplored territory, the NIGMS has recently announced substantial support for a group of 50 scientists at 20 universities called the Alliance for Cellular Signaling (AFCS) (Science, 9/15/00, p. 1854). The AFCS plans to map interactions among signaling molecules and pathways to produce a model of how cardiac myocytes and B cells respond to stimuli. On a more modest scale, our laboratory is delineating the nonlinear, emergent dynamics of a focused network of signal transduction pathways by studying the molecular patterns and phenotypic outcomes of inhibitory perturbatory events during embryonic submandibular salivary gland (SMG) development.

            To sustain normal organogenesis, the proper balance between cellular proliferation, quiescence, and apoptosis must be maintained. An imbalance between these processes can result in organ aplasia, hypoplasia, hyperplasia, or dysplasia. The cells of a developing organ are, in an anthropomorphic sense, altruistic. They survive, multiply, and differentiate when needed and are suicidal when not. The latter appears to be the default state; sufficient apoptosis-suppressing signals potentiate the former (Raff, 1996). In studying the ontogeny of any organ, then, the key is understanding how and what signals are initiated and integrated to achieve morphogenetic homeostasis. Regarding SMG organogenesis, years of experimentation in our laboratory using a variety of strategies (immunoperturbation, peptide inhibitors, antisense, and transgenic mice) indicate that a wide range of growth factors, cytokines, and transcription factors are important to SMG developmental homeostasis (Jaskoll et al., 1994; Jaskoll and Melnick, 1999; Melnick and Jaskoll, 2000; Melnick et al., 2001a, b, c; Jaskoll et al., 2002). Based on our prior descriptive and functional studies, we postulate that specific growth factor- and/or cytokine-mediated signal transduction pathways differentially and combinatorially compensate for the dysfunction of any single pathway.

            These cellular and extracellular components may be visualized as a Connections Map which details the functional relationships within and between pathways (Fig. 1). The promotional and inhibitory, synergistic and antagonistic, molecular interactions noted are supported by an enormous experimental effort by numerous laboratories worldwide. To place this Connections Map in the context of 4-dimensional organogenesis, it is helpful to use Waddington’s (1957) “Epigenetic Landscape” (Fig. 2), a clever metaphor for the hierarchical nature of embryogenesis. The extra- and intracellular pathways (Fig. 1; 2B) turn out to be more analogous to the largely redundant, overlapping neural network of the brain than to traffic grids of intersecting streets and interacting vehicles. Understanding the nonlinear interactions between these pathways is intrinsic to understanding the regulation of SMG morphogenesis. This requires the integration of genomic, proteomic, and bioinformatic approaches, not least because development, in its most basic sense, is genes plus contexts.

Figure 1. Connections Map. This signaling map reflects the pathways investigated in our laboratory. Known and putative connections are based on published results of our laboratory and those of many others.

Differentiating tissues/organs are inherently organized; such organization emerges from within the “epigenetic landscape” rather than from without. Kauffman’s (1993, 1995) work at the Santa Fe Institute exemplifies the idea that complex networks of biological signaling pathways (Fig. 1) can arise from the interactions between simple pathways under local control. These networks exhibit emergent properties (Bhalla and Iyengar, 1999): there is integration of signals across multiple time scales; the generation of distinct outputs depends on input strength and duration [e.g. changes in cell fate induced by variable expression of EGF-R (Lillien, 1995)]; there are self-sustaining feedback loops [e.g. FGF and sprouty (Metzger and Krasnow, 1999)].

Figure 2. Waddington's "Epigenetic Landscape" (Waddington, 1957): A. "The path followed by the ball, as it rolls down towards the spectator, corresponds to the developmental history of a particular [organ]. There is first an alternative, towards the right or the left. Along the former path, a second alternative is offered; along the path to the left, the main channel continues leftwards, but there is an alternative path which, however, can only be reached over a threshold." B. Interacting network of signal transduction pathways. "The pegs in the ground represent genes; the strings leading from them the [pathways initiated by gene expression]. The modeling of the epigenetic landscape, which slopes down from above one's head towards the distance, is controlled by the pull of these numerous guy-ropes [pathways] which are ultimately anchored to the genes."
Emergence links an empirical idea to a conceptual one (Sterelny and Griffiths, 1999). The empirical idea is that complex system-level behaviors arise out of locally interacting simple units. The conceptual idea is methodological. Heretofore, we have studied the system components (Fig. 2) in relative isolation. While this experimental model has yielded important clues to SMG development, this yield is at a low level of information relative to the contents of the developmental system being interrogated, a system of mutually dependent causative factors (Szallasi, 1998). Understanding emergence as an empirical phenomenon requires new models of biologic explanation, namely analytical frameworks that can simultaneously evaluate the role of multiple interacting factors.

Figure 3. From Gulukota, 1998. Probabilistic Neural Network (PNN). See text for details.

We model genomic and proteomic data using Probabilistic Neural Networks (PNNs) (Alberts et al., 1994; Gulukota, 1998). A PNN (Fig. 3) consists of a set of processing units (nodes) which simulate neurons and are interconnected via a set of “weights” (analogous to synaptic connections in the nervous system) in a way which allows signals to travel through the network in parallel as well as serially (Cross et al., 1995). The nodes are very simple computing elements and are based on the observation that a neuron behaves like a switch. That is, when sufficient neurotransmitter has accumulated in the cell body, an action potential is generated. This is modeled mathematically as a weighted sum of all incoming signals to a node, which is compared with a threshold. If the threshold is exceeded the node fires, otherwise it remains quiescent. PNN computational power derives not from the complexity of each processing unit/node (e.g. a given growth factor receptor), but from the density and complexity of the interconnections.
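The switch-like node described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `threshold_node`, the weights, and the threshold are hypothetical values chosen for the example, not parameters from our models.

```python
from typing import Sequence


def threshold_node(signals: Sequence[float],
                   weights: Sequence[float],
                   threshold: float) -> bool:
    """Return True (the node fires) when the weighted input sum exceeds the threshold."""
    if len(signals) != len(weights):
        raise ValueError("signals and weights must have the same length")
    weighted_sum = sum(s * w for s, w in zip(signals, weights))
    return weighted_sum > threshold


# Hypothetical weights: two activating inputs and one inhibitory input.
weights = [0.6, 0.5, -0.8]

# Both activators present, inhibitor absent: the sum exceeds the threshold.
print(threshold_node([1.0, 1.0, 0.0], weights, threshold=1.0))

# Adding the inhibitory signal pulls the sum below the threshold.
print(threshold_node([1.0, 1.0, 1.0], weights, threshold=1.0))
```

Each node is trivial on its own; any interesting behavior comes from wiring many such nodes together, which is the point made in the text about interconnection density.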

We delineate the nonlinear, emergent dynamics of a focused signaling network (Fig. 1) by studying the molecular patterns and phenotypic outcomes of nodal “short circuits”. As Alberts et al. (1994) note in “Molecular Biology of the Cell” (pp. 778-782), PNNs can “illuminate the complex behaviors of the interacting signaling cascades that are found in cells.” To wit, the highly interactive architecture of PNNs mimics a network of signaling proteins (Fig. 4). PNNs and signaling networks both function as pattern recognition devices, responding optimally to selected combinations of input stimuli. PNNs are often more accurate than the data used to build them because they amplify the hidden patterns and minimize, if not discard, unwanted noise (Alberts et al., 1994; Gauch, 1993). Signaling networks are not dissimilar in that eliminating one pathway does not totally disable the network (Melnick et al., 2001a, b, c).

 

Figure 4. From Alberts et al., 1994. “A simple hypothetical signaling network. Each receptor activates (green arrows) or inhibits (black arrows) kinase 1 or 2 or both. Because signals converge onto kinase 3 (the output kinase), this network will be maximally active only when specific combinations of extracellular stimuli are present. Although this network is far simpler than likely to be found in a living cell, it could form part of a more complex signaling pathway.”
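The convergence logic in this caption can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the receptor names (A to D), the wiring, and the numeric thresholds are all hypothetical and do not reproduce the actual network drawn by Alberts et al. It only shows how an output kinase that integrates two upstream kinases ends up responding to a specific combination of stimuli.

```python
from itertools import product

RECEPTORS = ("A", "B", "C", "D")  # hypothetical receptor names


def output_kinase_active(stimuli):
    """stimuli maps receptor name -> True when its ligand is present."""
    r = {name: 1.0 if stimuli[name] else 0.0 for name in RECEPTORS}
    # Hypothetical wiring: receptor A activates kinase 1, receptor C inhibits it.
    kinase1 = (r["A"] - r["C"]) > 0.5
    # Hypothetical wiring: kinase 2 needs input from both receptors B and D.
    kinase2 = (0.6 * r["B"] + 0.6 * r["D"]) > 1.0
    # Kinase 3 (the output kinase) fires only when both upstream kinases are active.
    return (float(kinase1) + float(kinase2)) > 1.5


def activating_combinations():
    """Enumerate all 16 stimulus patterns and keep those that switch kinase 3 on."""
    hits = []
    for values in product([False, True], repeat=len(RECEPTORS)):
        stimuli = dict(zip(RECEPTORS, values))
        if output_kinase_active(stimuli):
            hits.append(stimuli)
    return hits


print(activating_combinations())
```

With this particular wiring, only one of the sixteen possible stimulus patterns activates the output, which is the pattern-recognition behavior the caption describes.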

The striking analogy between PNNs and signaling networks, and the sophistication of PNN modeling, have the added pragmatic advantage of fostering greater scientific accuracy in our knowledge of signaling networks with more cost-effective experimental design (Gauch, 1993). Such research is a substantial undertaking. Nevertheless, with feasible data sets, PNN modeling allows us to discriminate between emergent network behaviors (patterns) and system noise in a parsimonious manner.

Probabilistic Neural Network Analysis      

            A neural network, then, is a programmable dynamical system of countless interrelated differential equations that quickly equilibrates as the system recognizes or recalls a pattern (Kosko, 1990). Prior to the development of neural network methods, scientists could only estimate functions statistically, a largely intuitive process we call mathematical modeling. Such intuitive modeling is linear or modestly nonlinear. Neural networks use the same input/output data but dramatically reduce our reliance on intuition. As such, PNNs are hypothesis driven and reflect the entirely nonlinear mechanistic manipulation of the experimental system used. Using PNNs, then, is a productive and parsimonious way to model the way in which signaling networks (e.g. Fig. 1) enable cells to respond to complex patterns of extracellular signals during development (Alberts et al., 1994; Gulukota, 1998).

            PNNs (Fig. 5) are composed of several layers of interconnected units (nodes), namely an input layer, a hidden layer, and an output layer; the connections between units are analogous to synapses and have modifiable “connection weights” that control the strength with which one unit influences another (Alberts et al., 1994). The most pragmatic characteristic of PNNs in model building is that they “learn.” That is, they can be trained to recognize specific patterns of input and respond to each pattern with a specific output pattern.

            PNNs “learn” by first finding linear relationships between inputs and the outcome, and weight values are assigned to the links between input and output nodes. Next, units are added to the hidden layer so that nonlinear relationships can be found. Values in the input layer are multiplied by the weights and passed to the hidden layer which in turn produces values to pass to the output that are based upon the sum of weighted values passed to the hidden layer. The output layer produces the appropriate pattern recognition results (classifications). Thus, PNNs “learn” by adjusting the interconnection weights between layers, adding hidden layer units as necessary to capture the nonlinear features of the data set. Eventually, after many iterations, a stable set of weights evolves so as to optimize a model that recognizes and responds to input patterns.
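The flow of values from input layer to hidden layer to output layer can be sketched as a generic forward pass. This is an illustration under stated assumptions, not the training procedure of any particular software: the logistic activation, the layer sizes, and every weight and bias below are hypothetical, and no weight adjustment (learning) is performed.

```python
import math


def logistic(x: float) -> float:
    """Smooth nonlinear squashing function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(values, weights, biases):
    """Each unit sums its weighted inputs, adds a bias, and applies the nonlinearity."""
    return [
        logistic(sum(v * w for v, w in zip(values, row)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Hypothetical network: 3 inputs, 2 hidden units, 3 output units.
hidden_w = [[0.8, -0.4, 0.3], [-0.5, 0.9, 0.2]]
hidden_b = [0.1, -0.2]
output_w = [[1.2, -0.7], [-0.3, 0.8], [0.5, 0.5]]
output_b = [0.0, 0.1, -0.1]

outputs = forward([0.2, 0.7, 0.5], hidden_w, hidden_b, output_w, output_b)
print(outputs)
```

Training would consist of repeatedly adjusting `hidden_w` and `output_w` until the outputs match the known classifications of the training examples.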

            PNN “training” data includes many sets of input variables and a cognate outcome variable. That is, the inputs are the independent variables and the output (classification) is the dependent variable. These are utilized as noted above. In an analogous way, specific intracellular signaling molecules come to recognize a particular combinatorial pattern of extracellular signals and help to translate nonlinear relationships into an emergent cellular response (output). A PNN is only as good as the data with which you train it. If you do not input to the PNN a large number of variables that affect the classification (output), you will inaccurately classify and, thus, inaccurately determine the relative importance of the inputs to the output. Thus, one might vary the input combinations by investigating in vitro development with and without the interruption of several key signaling pathways and look at 3 different outputs: Developmental Stage (Pseudoglandular, Canalicular, Terminal Bud); Cell Proliferation (e.g. phospho-Rb); Apoptosis (e.g. activated PARP). If you do not present the PNN with a wide variety of examples covering a range of input and output combinations, the model you build will not accurately classify from any combination of inputs.

Figure 5. From Alberts et al., 1994. “A simple neural network. The activity of each neural unit (circles) is determined by the unit’s inputs. The output of each unit is usually a nonlinear function of the unit’s inputs. Each connection between units has a particular strength, or “weight,” which is indicated by differences in thickness of the connecting arrows.”

Figure 6. Adapted from Alberts et al. 1994. Figure 5 has been modified to reflect the factors identified in the PNN (Figure 7) as predictive of developmental stage; they are “ordered” according to their determined probabilistic weights.

The PNN software we use (Ward Systems Group, Frederick, MD) is based upon the work of Specht and colleagues (Specht, 1988, 1990; Specht and Shapiro, 1991; Chen, 1996). An example from our published data (Melnick et al., 2001a) can provide some insight into how this PNN builds a model that classifies developmental stage. Specifically, we were studying the TNF/TNF-R1 signaling pathway and its related superfamily Fas/Fas L pathway. The inputs included in vivo SMG mRNA levels for TNF, TNF-R1, TRADD, RIP, caspase 8, IL-6, Fas, Fas L and FAF (Fas associated factor). The output was developmental stage. The PNN-based algorithm built a model that was able to classify SMG developmental stage with 100% sensitivity and specificity (Table 1).
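For readers unfamiliar with Specht-style classifiers, the following is a minimal sketch of the general idea: each training pattern contributes a Gaussian kernel centered on itself, kernel values are averaged per class, and the class with the largest average is chosen. It is not the commercial algorithm, which may differ in important details. The function name `pnn_classify`, the smoothing parameter `sigma`, and the two-feature training values are hypothetical illustrations, not measured expression data.

```python
import math
from collections import defaultdict


def pnn_classify(x, training, sigma=0.3):
    """Classify x by the class whose training patterns give the highest mean Gaussian kernel."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pattern, label in training:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, pattern))
        sums[label] += math.exp(-squared_distance / (2.0 * sigma ** 2))
        counts[label] += 1
    scores = {label: sums[label] / counts[label] for label in sums}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical, made-up training patterns (two scaled input values per sample).
training = [
    ((0.1, 0.9), "Pseudoglandular"),
    ((0.2, 0.8), "Pseudoglandular"),
    ((0.5, 0.5), "Canalicular"),
    ((0.6, 0.4), "Canalicular"),
    ((0.9, 0.1), "Terminal Bud"),
    ((0.8, 0.2), "Terminal Bud"),
]

stage, scores = pnn_classify((0.15, 0.85), training)
print(stage, scores)
```

The smoothing parameter controls how far each training pattern's influence extends; in practice it would be tuned on held-out data rather than fixed by hand.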

The algorithm also found the “best set of importance” of input values on an arbitrary scale of 0 to 1 (Fig. 6). The importance of input values is a relative measure of how significant each of the inputs is in the predictive model. Values closer to 1 are more important inputs; values closer to 0 are less important inputs. In our example, TNF-R1, RIP and FAF belong to the former; TNF and TRADD belong to the latter. Zero values (e.g. Fas and Fas L) have no relative importance in building the predictive model. Since the sum of all importance values is approximately 1, these values may be thought of as the proportional contribution to the model of the respective input variables. However, since the algorithm builds nonlinear models, the concept of variable contribution is more vague than in linear models because the effect of the input variable on a model depends heavily on the settings of all other input variables. [As an analogy, consider a^5 + b = c. When a approaches 0, b has a greater effect on the model; when a is very large, b has comparatively little effect.] Thus, the importance calculations are only estimates, though useful ones.
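The bracketed analogy can be made concrete numerically. The sketch below reads the expression as a^5 + b = c (the exponent is an assumption about the intended notation) and computes what fraction of c is attributable to b for several values of a. The function name `share_of_b` and the chosen values are hypothetical, and positive a and b are assumed.

```python
def share_of_b(a: float, b: float) -> float:
    """Fraction of c = a**5 + b that comes from b (assumes a > 0 and b > 0)."""
    return b / (a ** 5 + b)


# Hold b fixed and vary a: the apparent "importance" of b changes dramatically.
for a in (0.1, 1.0, 3.0):
    print(f"a = {a}: b accounts for {share_of_b(a, 1.0):.4f} of c")
```

The same input b looks dominant, equal, or negligible depending on the setting of a, which is why a single importance number for an input in a nonlinear model should be treated as an estimate.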

Figure 7. The PNN algorithm computes the importance of the input values, a relative measure of how significant each of the inputs is in the predictive model (Fig. 6).

To get a sense of the heuristic value of this PNN analysis, we can combine the information in Figs. 5 and 7 as a new Fig. 6. Here we assign specific inputs to the nodes (units) of the input layer, namely the 4 of greatest importance to building the predictive model (TNF-R1, FAF, RIP and IL-6). The output is the developmental stage of the SMG. TNF-R1 signaling is primarily growth promoting, although it can be pro-apoptotic in special circumstances (Ashkenazi and Dixit, 1998). FAF is associated with the pro-apoptotic Fas/Fas L pathway. Our PNN model (Figs. 6-7) demonstrates that progressive differentiation is defined by the variation of specific growth promotion factors (TNF-R1, RIP, IL-6) as well as apoptosis (FAF), the former being relatively more important than the latter.

This is, of course, but a small slice of the total signaling network (Fig. 1). The task is to build predictive developmental models at both the gene expression level and the protein expression level under variable conditions (with and without specific perturbations). In this way we will come to know the relative developmental importance of the inputs representing the signaling network derived from our prior work (Fig. 1), as well as the emergent relationship between the component pathways. The first such study has now been completed and published (Melnick et al., 2001c). The reader is invited to review this paper carefully in order to appreciate the value of PNN model building in developmental biology.

Conclusions

            As shown in our Connections Map (Fig. 1), it is apparent that each growth factor, cytokine, or transcription factor is functionally integral to a genetic network with broadly related, rather than independent, components. It may be said to represent the collective dynamics of a “small-world” network such that the average number of factors in the shortest chain connecting any two factors is small (Watts and Strogatz, 1998). Such dynamical systems with small-world coupling display enhanced signal-propagation speed and synchronizability. Thus, if one focuses on the superimposition of the various layers of information, namely morphology, gene expression, protein expression, and protein activity, one can visualize a coordinated, multidimensional response to an inhibited pathway. This visualization, however, is necessarily impressionistic even though our assays have some precision. This is so because we cannot extrapolate from transcriptome to proteome to activated proteins with any accuracy (in the absence of actual steady-state measures), and because in these experiments time is necessarily cross-sectional, not longitudinal. Nevertheless, relative to understanding a complex genetic network and organogenesis, our results demonstrate the importance of contemporaneously evaluating the gene, protein, and activated protein expression of multiple components from multiple pathways within broad functional categories. Understanding the signal dynamics of these pathways will require expanded models that encompass more aspects of regulation (e.g. Asthagiri and Lauffenburger, 2001). Still, we will always be limited by the fact that phenotypes are complex, emergent phenomena (Kauffman, 1993).
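The small-world property mentioned above, a short average chain between any two factors, can be illustrated with a toy graph. The sketch below is a simplification under stated assumptions: it uses a ring lattice with a few hand-placed shortcut links rather than the random rewiring procedure of Watts and Strogatz, and the graph size, neighbor count, and shortcut positions are hypothetical. It assumes the graph is connected.

```python
from collections import deque


def ring_lattice(n, k):
    """Build a ring of n nodes, each linked to its k nearest neighbours on each side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph


def average_path_length(graph):
    """Mean shortest-path length over all ordered node pairs, via breadth-first search."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(40, 2)

# Same lattice plus three hypothetical long-range shortcuts.
with_shortcuts = ring_lattice(40, 2)
for a, b in [(0, 20), (10, 30), (5, 25)]:
    with_shortcuts[a].add(b)
    with_shortcuts[b].add(a)

print(average_path_length(lattice), average_path_length(with_shortcuts))
```

A handful of long-range links shortens the average chain between nodes while leaving local neighborhoods intact, which is the structural feature associated with faster signal propagation.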

LITERATURE CITED 
