Institut Curie, 26 rue d’Ulm, Paris, F-75248 France

INSERM, Paris, U900, F-75248 France

Mines ParisTech, Fontainebleau, F-77300 France

Sysra, Yerres, F-91330 France

Abstract

Mathematical modeling is used as a Systems Biology tool to answer biological questions, and more precisely, to validate a network that describes biological observations and predict the effect of perturbations. This article presents an algorithm for modeling biological networks in a discrete framework with continuous time.

Background

There exist two major types of mathematical modeling approaches: (1) quantitative modeling, which represents the concentrations of chemical species by real numbers and is mainly based on differential equations and the chemical kinetics formalism; and (2) qualitative modeling, which represents chemical species concentrations or activities by a finite set of discrete values. Both approaches answer particular (and often different) biological questions. The qualitative approach permits a simple and less detailed description of biological systems and identifies stable states efficiently, but it is inconvenient for describing the transient kinetics leading to these states. In this context, time is represented by discrete steps. Quantitative modeling, on the other hand, can describe the dynamical behavior of biological processes more accurately, as it follows the evolution of the concentrations or activities of chemical species as a function of time, but it requires a large amount of information on parameters that are difficult to find in the literature.

Results

Here, we propose a modeling framework based on a qualitative approach that is intrinsically continuous in time. The algorithm presented in this article fills the gap between qualitative and quantitative modeling. It is based on a continuous time Markov process applied to a Boolean state space. In order to describe the temporal evolution of the biological process we wish to model, we explicitly specify the transition rates for each node. For that purpose, we built a language that can be seen as a generalization of Boolean equations. Mathematically, this approach can be translated into a set of ordinary differential equations on probability distributions. We developed C++ software, MaBoSS, that simulates such a system by applying the Kinetic Monte-Carlo (or Gillespie) algorithm to the Boolean state space. This software, parallelized and optimized, computes the temporal evolution of probability distributions and estimates stationary distributions.

Conclusions

Applications of the Boolean Kinetic Monte-Carlo are demonstrated for three qualitative models: a toy model, a published model of p53/Mdm2 interaction and a published model of the mammalian cell cycle. Our approach makes it possible to describe kinetic phenomena that were difficult to handle in the original models. In particular, transient effects are represented by time dependent probability distributions, interpretable in terms of cell populations.

Background

Mathematical models of signaling pathways are tools that answer biological questions. The most commonly used mathematical formalisms to answer these questions are ordinary differential equations (ODEs) and Boolean modeling.

Ordinary differential equations (ODEs) have been widely used to model signaling pathways. They are the most natural formalism for translating detailed reaction networks into a mathematical model. Indeed, equations can be directly derived using mass action laws, Michaelis-Menten kinetics or Hill functions for each reaction according to the observed behaviors. This framework has limitations, though. The first concerns the difficulty of assigning values to the kinetic parameters of the model. Ideally, these parameters would be extracted from experimental data. However, they are often chosen by the modeler so as to fit the expected phenotypes qualitatively. The second limitation concerns cell population heterogeneity. In this case, ODEs are no longer appropriate, since the approach is deterministic and thus focuses on the average behavior. To include non-determinism, an ODE model needs to be transformed into a stochastic chemical model. In this formalism, a master equation is written on the probabilities of the number of molecules for each species. In the translation process, the same parameters used in ODEs (more particularly in ODEs written with mass action law) can be used in the master equation, but in this case, the number of initial conditions explodes along with the computation time.

Boolean (or logical) formalism is another formalism used to model signaling pathways where genes/proteins are parameterized by 0s and 1s only. It is the most natural formalism to translate an influence network into a mathematical model. In such networks, each node corresponds to a species and each arrow to an interaction or an influence (positive or negative). In a Boolean model, a logical rule linking the inputs is assigned to each node. As a result, there are no real parameter values to adjust besides choosing the appropriate logical rules that best describe the system. In this paper, we will refer to a state in which each node of the influence network has a Boolean value as a network state, and the set of all possible transitions between the network states as a transition graph. There are two types of transition graphs, one deduced from the synchronous update strategy
^{#nodes}.

Both the logical and the continuous frameworks have the advantages and disadvantages mentioned above. We propose here to combine some of the advantages of both approaches in an algorithm that we call the “Boolean Kinetic Monte-Carlo” algorithm (BKMC). It consists of a natural generalization of the asynchronous Boolean dynamics

BKMC is not intended to replace existing tools but rather to complement them. It is best suited to model signaling pathways in the following cases:

• The model is based on an influence network, because BKMC is a generalization of the asynchronous Boolean dynamics. See “Examples” section. Note that this is a common requirement for most Boolean software.

• The model describes processes for which information about the duration of a biological process is known, because in BKMC, time is parameterized by a real number. This is typically the case when studying developmental biology, where animal models provide time changes of gene/protein activities

• The model describes heterogeneous cell population behavior, because BKMC has a probabilistic interpretation. For example, modeling heterogeneous cell population can help understand tissue formation based on cell differentiation

• The model can contain many nodes (up to 64 in the present implementation), because BKMC is a simulation algorithm that converges fast. This can be useful for big models that have already been modeled with a discrete time Boolean method

Previous published works have also introduced a continuous time approach in the Boolean framework(

All abbreviations, definitions, algorithms and estimates used in this article can be found in Additional file

**Supplementary material.** Basic information on Markov process, abbreviations, definitions and algorithms.

Results and discussion

BKMC for continuous time Boolean model

Continuous time in Boolean modeling: past and present

In Boolean approaches for modeling networks, the state of each node of the network is defined by a Boolean value (node state) and the network state by the set of node states. Any dynamics in the transition graph is represented by sequences of network states. A node state is based on the sign of the input arrows and the logic that links them. The dynamics can be deterministic in the case of synchronized update

The difficulty of interpreting the dynamics in terms of biological time has led to several works that have generalized Boolean approaches. These approaches can be divided into two classes that we call explicit and implicit time for discrete steps.

The explicit time for discrete steps consists of adding a real parameter to each node state. These parameters correspond to the time associated to each node state before it flips to another one (

The implicit time for discrete steps consists of adding a probability to each transition of the transition graph in the case of non-deterministic transitions (asynchronous case). It is argued that these probabilities could be interpreted as specifying the duration of a biological process. As an illustration, let us assume a small network of two nodes, A and B. At time t, A and B are inactive: [AB] = [00]. In the transition graph, there exist two possible transitions at t+1: [00] → [01] and [00] → [10]. If the first transition has a significantly higher probability than the second one, then we can conclude that B will have a higher tendency to activate before A. Therefore, it is equivalent to say that the activation of B is faster than the activation of A. Thus, in this case, the notion of time is implicitly modeled by setting transition probabilities. In particular, priority rules, in the asynchronous strategy, consist of putting some of these probabilities to zero

As an alternative to these approaches, we propose the BKMC algorithm.

Properties of BKMC algorithm

The BKMC algorithm was built to satisfy the following principles:

• The state of each node is given by a Boolean number (0 or 1), referred to as node state;

• The state of the network is given by the set of node states, referred to as network state;

• The update of a node state is based on the signs of the incoming arrows of this node and the logic linking them;

• Time is represented by a real number;

• Evolution is stochastic.

We choose to describe the time evolution of network states by a Markov process with continuous time, applied to the asynchronous transition graph. Therefore, the dynamics is defined by transition rates inserted in a master equation (see Additional file

Markov process for Boolean model

Consider a network of $n$ nodes. The network state is described by a vector **S** of Boolean values, $S_i \in \{0,1\}$, $i = 1, \dots, n$, where $S_i$ is the state of node $i$.

A stochastic description of the state evolution is represented by a

Notice that for all

In order to simplify the stochastic process, Markov property is imposed. It can be expressed in the following way: “the conditional probabilities in the future, related to the present and the past, depend only on the present” (see Additional file

Any Markov process can be defined by (see Van Kampen

1. An initial condition:

2. Conditional probabilities (of a single condition):

Concerning time, two cases can be considered:

• If time is discrete, $t \in \{t_0, t_1, \cdots\}$, it can be shown that all possible conditional probabilities are functions of transition probabilities

• If time is continuous:

Notice that a discrete time Markov process can be derived from continuous time Markov process, and is called a

If the transition probabilities or transition rates are time independent, the Markov process is called a **S** and **S**′ if and only if

Asynchronous Boolean dynamics as a discrete time Markov process

Asynchronous Boolean dynamics

In the case of asynchronous Boolean dynamics, the system is given by $n$ nodes linked by directed arrows. For each node $i$, a Boolean logic $B_i(\mathbf{S})$ is specified and depends only on the nodes that have an arrow pointing to node $i$ (e.g. $B_1 = S_3$ AND NOT $S_4$, where $S_3$ and $S_4$ are the Boolean values of nodes 3 and 4 respectively, and $B_1$ is the Boolean logic of node 1). The notion of

To define a Markov process, the transition probabilities
**S** and **S**′, let **S**) be the number of asynchronous transitions from **S** to all possible states **S**′. Then

In this formalism, the asynchronous Boolean dynamics completely defines a discrete time Markov process when the initial condition is specified. Notice that here the transition probabilities are time independent, **S**).
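As a minimal sketch of this construction (not MaBoSS code), the snippet below enumerates the asynchronous transitions leaving a network state and gives each of them the same probability, one over the number of such transitions, which is how we read the definition above. The two-node mutual-inhibition rules in `RULES`, the helper names, and the choice to let a state without outgoing transitions stay put with probability 1 are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]
Rule = Callable[[State], bool]

# Hypothetical network [A, B]: each node is active only if the other is inactive.
RULES: Tuple[Rule, ...] = (
    lambda s: not s[1],  # logic of node A
    lambda s: not s[0],  # logic of node B
)


def async_successors(state: State, rules: Tuple[Rule, ...]) -> List[State]:
    """States reachable by flipping exactly one node that disagrees with its logic."""
    successors = []
    for i, rule in enumerate(rules):
        target = int(rule(state))
        if target != state[i]:
            successors.append(state[:i] + (target,) + state[i + 1:])
    return successors


def transition_probabilities(state: State, rules: Tuple[Rule, ...]) -> Dict[State, float]:
    """Uniform probability over the asynchronous transitions leaving `state`."""
    successors = async_successors(state, rules)
    if not successors:
        # Assumption: a state with no outgoing transition is treated as a fixed point.
        return {state: 1.0}
    p = 1.0 / len(successors)
    return {s: p for s in successors}
```

With these rules, the state [00] has two outgoing transitions of probability 0.5 each, while [10] and [01] are fixed points.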

Continuous time Markov process as a generalization of asynchronous Boolean dynamics

To transform the discrete time Markov process described above into a continuous time Markov process, transition probabilities should be replaced by transition rates

Because we want a generalization of the asynchronous Boolean dynamics, transition rates

- only if

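where $R_i^{\text{up}}$ corresponds to the activation rate of node $i$, and $R_i^{\text{down}}$ corresponds to the inactivation rate of node $i$. The continuous time Markov process is therefore completely defined by all these $R^{\text{up/down}}$ and an initial condition.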
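The following sketch shows one way to turn per-node activation and inactivation rates into the transition rates leaving a network state. It assumes, as in the asynchronous dynamics, that only transitions flipping a single node have a non-zero rate; the function name, the `RATE_UP`/`RATE_DOWN` tuples and the two-node example are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

State = Tuple[int, ...]
RateFn = Callable[[State], float]


def transition_rates(state: State,
                     rate_up: Sequence[RateFn],
                     rate_down: Sequence[RateFn]) -> Dict[State, float]:
    """Map each single-node flip leaving `state` to its (non-zero) rate."""
    rates: Dict[State, float] = {}
    for i, value in enumerate(state):
        # An inactive node can only activate; an active node can only inactivate.
        rate = rate_down[i](state) if value else rate_up[i](state)
        if rate > 0:
            flipped = state[:i] + (1 - value,) + state[i + 1:]
            rates[flipped] = rate
    return rates


# Hypothetical network [A, B] with mutual inhibition and unequal speeds.
RATE_UP = (
    lambda s: 2.0 if not s[1] else 0.0,  # A activates quickly while B is off
    lambda s: 1.0 if not s[0] else 0.0,  # B activates more slowly while A is off
)
RATE_DOWN = (
    lambda s: 1.0 if s[1] else 0.0,      # A is switched off by B
    lambda s: 1.0 if s[0] else 0.0,      # B is switched off by A
)
```

From [00], the transition to [10] has rate 2 and the transition to [01] has rate 1, so A tends to activate before B. Setting every non-zero rate to the same value gives back transitions that are all equally likely, as in the asynchronous Boolean dynamics.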

Asymptotic behavior of continuous time Markov process

In the case of continuous time Markov process, instantaneous probabilities always converge to a stationary distribution (see Additional file

Notice that instantaneous probabilities

The asymptotic behavior of a continuous time Markov process can be detailed by using the concept of

Oscillations and cycles

In order to describe a periodic behavior, the notion of cycle and oscillation for a continuous time Markov process is defined precisely.

A

The question is then to link the notion of cycle to that of periodic behavior of instantaneous probabilities. The set of instantaneous probabilities cannot be perfectly periodic. They can display a damped oscillating behavior, or none at all (see Additional file

According to theorems described in Additional file

BKMC: Kinetic Monte-Carlo (Gillespie algorithm) applied to continuous time asynchronous Boolean Dynamics

It has been previously stated that a continuous time Markov process is completely defined by its initial condition and its transition rates. For computing any conditional probability (and any joint probability), a set of linear differential equations has to be solved (the master equation). Theoretically, the master equation can be solved exactly by computing the exponential of the transition matrix (see Additional file
$2^{n} \times 2^{n}$, the computation soon becomes impossible if
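To make the cost argument concrete, here is a small sketch that integrates the master equation directly for an explicitly enumerated state space. It uses a plain explicit Euler scheme rather than the matrix exponential mentioned above, purely to stay within the Python standard library; the function name, the dictionary layout and the two-state example are assumptions. Every time step loops over all states, so the work grows with the $2^n$ network states.

```python
import math
from typing import Dict, Hashable


def integrate_master_equation(rates: Dict[Hashable, Dict[Hashable, float]],
                              p0: Dict[Hashable, float],
                              t_max: float,
                              dt: float = 1e-3) -> Dict[Hashable, float]:
    """Explicit Euler integration of dP(S)/dt = inflow(S) - outflow(S).

    `rates[S][S2]` is the rate of the transition S -> S2.
    `p0` must list every state of the system (with probability 0 if needed).
    """
    p = dict(p0)
    for _ in range(int(round(t_max / dt))):
        change = {s: 0.0 for s in p}
        for s, outgoing in rates.items():
            for s2, rho in outgoing.items():
                flux = rho * p[s] * dt  # probability moved from s to s2 in dt
                change[s] -= flux
                change[s2] += flux
        for s in p:
            p[s] += change[s]
    return p


# Hypothetical two-state system: "off" switches to "on" at rate 1, never back.
example = integrate_master_equation(
    {"off": {"on": 1.0}, "on": {}}, {"off": 1.0, "on": 0.0}, t_max=1.0
)
```

For this example the exact solution is $P_{\text{on}}(t) = 1 - e^{-t}$, which the Euler result approximates closely at $t = 1$.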

The Kinetic Monte-Carlo
_{max} to Σ. The set of stochastic trajectories represents the given Markov process in the sense that these trajectories can be used to compute probabilities. A finite set of these trajectories is produced, then, from this finite set, probabilities are estimated (as described in “Methods” section). The algorithm is based on an iterative step: from a state **S** at time _{0}(given two uniform random numbers), it produces a transition time _{0}
_{0} +

The exact iterative procedure is the following. Given **S** and two uniform random numbers $u, u' \in [0,1]$:

1. Compute the total rate of possible transitions for leaving state **S**:

2. Compute the time of the transition:

3. Order the possible new states

4. Compute the new state
^{(0)}=0).

This algorithm will be referred to as Boolean Kinetic Monte-Carlo (BKMC).
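The iterative step can be sketched as a standard Gillespie update. This is an illustration under stated assumptions rather than the MaBoSS implementation: `rates_of` is a hypothetical callback returning the outgoing transitions of a state as a dictionary, the two uniform numbers are named `u` and `u2` here, and the waiting time is drawn from an exponential distribution whose rate is the total rate of leaving the state.

```python
import math
import random
from typing import Callable, Dict, Hashable, List, Tuple

RatesOf = Callable[[Hashable], Dict[Hashable, float]]


def bkmc_step(state: Hashable, rates_of: RatesOf,
              rng: random.Random) -> Tuple[float, Hashable]:
    """One Gillespie step: return (waiting time, new state)."""
    rates = rates_of(state)
    total = sum(rates.values())          # step 1: total rate of leaving `state`
    if total == 0:
        return math.inf, state           # no possible transition: stay forever
    u, u2 = rng.random(), rng.random()
    dt = -math.log(1.0 - u) / total      # step 2: 1 - u lies in (0, 1], so log is finite
    threshold = u2 * total
    cumulative = 0.0
    new_state = state
    for new_state, rho in rates.items():  # steps 3-4: walk the ordered cumulative rates
        cumulative += rho
        if threshold < cumulative:
            break
    return dt, new_state


def simulate(initial: Hashable, rates_of: RatesOf, t_max: float,
             rng: random.Random) -> List[Tuple[float, Hashable]]:
    """Produce one stochastic trajectory as a list of (jump time, state)."""
    t, state = 0.0, initial
    trajectory = [(t, state)]
    while True:
        dt, new_state = bkmc_step(state, rates_of, rng)
        if t + dt > t_max:
            return trajectory
        t, state = t + dt, new_state
        trajectory.append((t, state))
```

Passing an explicit `random.Random(seed)` keeps trajectories reproducible, which helps when comparing runs with different numbers of trajectories.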

Practical use of BKMC, through MaBoSS tool

Biological data are translated into an influence network with logical rules associated to each node of the network. The value of one node depends on the value of the input nodes. For BKMC, another layer of information is provided when compared to the standard definition of Boolean models: transition rates are provided for all nodes, specifying the rates at which the node turns on and off. This refinement conserves the simplicity of the Boolean description but makes it possible to reproduce the observed biological dynamics more accurately. The parameters do not need to be exact, as is the case for nonlinear ordinary differential equation models, but they can be used to illustrate the relative speed of reactions. We developed a software tool, MaBoSS, that applies the BKMC algorithm. MaBoSS stands for Markov Boolean Stochastic Simulator.

How to build a mathematical model using MaBoSS

Once MaBoSS is installed (see webpage for instructions,

1. Create the model using MaBoSS language in a file (myfile.bnd, for instance): (a) write the logic for each node, and (b) assign values to each transition rate.

2. Create the configuration file (myfile.cfg, for instance) to define the simulation parameters.

3. Run MaBoSS (the order of the arguments does not matter):

(we assume that MaBoSS is accessible through your PATH). MaBoSS creates three output files:

• A file containing the network state probabilities on a time window, the entropy, the transition entropy and the Hamming distance distribution (see “Methods”).

• A file containing the stationary distribution characterization (see “Methods”).

• A file containing a summary of the MaBoSS simulation run.

4. Import the output csv files into Excel or R and generate your graphs.

Transition rates in MaBoSS

MaBoSS defines transition rates through the functions $R_i^{\text{up/down}}(\mathbf{S})$ (see equations 6). The functions can be written using all Boolean operators (AND, OR, NOT, XOR), arithmetic operators (+,-,*,/), comparison operators and the conditional operator (?:). Examples of the use of the language are given below to illustrate three different cases: (1) different speeds for different inputs, (2) buffering effect and (3) the translation of discrete variables (with three values: 0, 1 and 2) into a Boolean model.

1. Modeling different speeds for different inputs. Suppose that C is activated by A or B, but that B can activate C faster than A, and that C is inactivated when A and B are absent. In this case, we write:

}

When C is off (equal to 0), it is activated by B at a speed $kb. If B is absent, then C is activated by A at a speed $ka. If both are absent, C is not activated. Note that if both A and B are present, because of the way the logic is written in this particular case, C is activated at the highest speed, the speed $kb. When C is on (equal to 1), it is inactivated at a rate equal to 1 in the absence of both A and B.
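The behavior described in this paragraph can be paraphrased in Python as follows. This is not MaBoSS syntax: the function names are hypothetical, and `ka` and `kb` stand for the parameters written $ka and $kb in the text.

```python
def rate_up_C(a: bool, b: bool, ka: float, kb: float) -> float:
    """Activation rate of C: B takes precedence over A."""
    if b:
        return kb      # B present (with or without A): speed kb
    if a:
        return ka      # only A present: speed ka
    return 0.0         # neither input present: C is not activated


def rate_down_C(a: bool, b: bool) -> float:
    """Inactivation rate of C: rate 1 only when both inputs are absent."""
    return 0.0 if (a or b) else 1.0
```

Because B is tested first, the case where both A and B are present returns `kb`; this is the highest speed only as long as `kb` is larger than `ka`, which is the situation assumed in the text.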

To implement the synergistic effect of A and B,

}

2. Modeling buffering effect. Suppose that B is activated by A, but that B can remain active a long time after A has shut down. For that, it is enough to define different speeds of activation and inactivation:

}

B is activated by A at a rate equal to 2. When A is turned off, B is inactivated more slowly at a rate equal to 0.001.
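A Python paraphrase of this buffering rule is given below (again not MaBoSS syntax; the function names are hypothetical, and we assume that B is not inactivated while A is present). Since the rate of a transition is the inverse of its mean waiting time, the slow inactivation rate translates directly into a long mean persistence of B.

```python
def rate_up_B(a: bool) -> float:
    """B is activated by A at rate 2."""
    return 2.0 if a else 0.0


def rate_down_B(a: bool) -> float:
    """B is inactivated slowly (rate 0.001) once A is off."""
    return 0.0 if a else 0.001


# Mean waiting times implied by the rates (in the model's time unit).
mean_activation_time = 1.0 / rate_up_B(True)       # time for B to turn on under A
mean_persistence_time = 1.0 / rate_down_B(False)   # time B stays on after A shuts down
```

With these values B responds to A within about half a time unit on average, but persists for about a thousand time units after A disappears.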

3. Modeling different levels for a given node. Suppose that B is activated by A, but if the activity of A is maintained, B can reach a second level. For this, we define a second node B_h (for “B high”) with the following rules:

}

}

In this example, B is split into two variables: B, which corresponds to the first level of B, and B_h, which corresponds to the higher level of B. B is activated by A at a rate equal to 1. If A disappears before B has reached its second level B_h, then B is turned off at a rate equal to 1. If A is maintained and B is active, then B_h is activated at a rate equal to 1. When A is turned off, B_h is inactivated at a rate equal to 1.
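The two-variable encoding of a three-level node can be sketched in Python as follows. The function names are hypothetical, and one detail is our own reading of the text: B is only allowed to switch off when B_h is already off, so that the combination (B off, B_h on) is never reached.

```python
def rate_up_B(a: bool) -> float:
    """First level: B is activated by A at rate 1."""
    return 1.0 if a else 0.0


def rate_down_B(a: bool, b_h: bool) -> float:
    """B switches off at rate 1 once A is gone, unless the high level is still on."""
    return 0.0 if (a or b_h) else 1.0


def rate_up_Bh(a: bool, b: bool) -> float:
    """Second level: needs A maintained and B already active."""
    return 1.0 if (a and b) else 0.0


def rate_down_Bh(a: bool) -> float:
    """The high level is lost at rate 1 when A is turned off."""
    return 0.0 if a else 1.0


def level(b: bool, b_h: bool) -> int:
    """Translate the pair of Boolean nodes back into a level 0, 1 or 2."""
    if b_h:
        return 2
    return 1 if b else 0
```

When A is removed from the highest level, the pair of nodes steps down through level 1 before reaching level 0, each step happening at rate 1.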

Simulation parameters in MaBoSS

To simulate a model in MaBoSS, a set of parameters needs to be adjusted (see “Parameter list” in the reference card available on the webpage). MaBoSS assigns default values; however, they need to be tuned for each model to achieve optimal performance: the best balance between the convergence of estimates and the computation time needs to be found. Therefore, several simulations should be run with different sets of parameters to find the best tuning.

• Internal nodes:

• Time window for probabilities:

• Maximum time:

• Number of trajectories:

• Number of trajectories (

Comparison with biological data

Each node of the network should account for different levels of activity of the corresponding species (mRNA, protein, protein complex, etc.). It is possible to have more than two levels for one node, as shown in the example “Modeling different levels for a given node”.

It is possible to extract the transition rates from experimental data, using the following property: the rate of a given transition is the inverse of the mean time for this transition to happen. It should be noted that BKMC is an algorithm based on a linear equation (Additional file
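As a minimal illustration of the property that a rate is the inverse of the mean transition time, the sketch below estimates a rate from a list of observed waiting times. The function name and the numerical values are hypothetical; real measurements would need to be expressed in the time unit chosen for the model.

```python
from typing import Sequence


def rate_from_waiting_times(waiting_times: Sequence[float]) -> float:
    """Estimate a transition rate as the inverse of the mean observed waiting time."""
    if not waiting_times:
        raise ValueError("at least one observed waiting time is required")
    mean_time = sum(waiting_times) / len(waiting_times)
    return 1.0 / mean_time


# Hypothetical observations: activation took 2 h and 4 h in two experiments.
estimated_rate = rate_from_waiting_times([2.0, 4.0])  # mean time 3 h, rate 1/3 per hour
```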

The BKMC algorithm provides estimates of the network state probabilities over time. These probabilities can be interpreted in terms of a cell population. The asymptotic behavior of a model, represented by a linear combination of indecomposable stationary distributions, can be interpreted as a combination of cell sub-populations. Indeed, a sub-population can be defined by network states with non-zero probability in the indecomposable stationary distribution. Therefore, a cell in a sub-population can only evolve in this sub-population (Additional file

Comparison of MaBoSS with other existing tools for qualitative modeling

MaBoSS contributes to the effort of tool development for qualitative modeling of biological networks. We propose to compare MaBoSS to some existing tools. However, it is difficult to compare the performance of these tools since each of them achieves different purposes and provides different outputs. As an alternative, we recapitulate, in Figure

**Comparison of tools for discrete modeling, biological implication.** Comparison table of the following tools: MaBoSS, GINsim, CellNetAnalyzer, BoolNet, GNA, SQUAD. Technical aspects are provided, along with the inputs/outputs relations between a model and data. The last row illustrates graphically the typical outputs that can be obtained from each tool.

As an illustration, the third example of the “Examples” section below, the mammalian cell cycle, was implemented in three of the tools presented in Figure

**Model of the mammalian cell cycle with GINsim, BoolNet and MaBoSS.** The cell cycle presented in the “Examples” section has been modeled using three tools: GINsim, BoolNet, and MaBoSS. The results for each tool are presented: (1) GINsim provides steady state solutions and transition graphs for two different initial conditions: when CycD=0 and CycD=1. For the synchronous strategy, the transition graph can be visualized whereas for the asynchronous strategy, it is not easy to read or use; (2) BoolNet constructs two graphical representations of the trajectories based on the synchronous update strategy, for the case of CycD=0 (steady state) and CycD=1 (cycle); (3) MaBoSS estimates indecomposable stationary distributions for the case of CycD=0 (one fixed point, not shown) and CycD=1 (distribution of probabilities of different network states), and time-dependent activities of the cyclins showing damped oscillations. All results are consistent but are presented differently, with a different focus for each tool.

Examples

We have applied the BKMC algorithm to three models of different sizes. The first one is a toy model illustrating the dynamics of a single cycle; the second one is a published Boolean model of p53-Mdm2 response to DNA damage and illustrates a multi-level case; and the third one is a published Boolean model of mammalian cell cycle regulation. Note that MaBoSS has been used for these three examples, but the Markov process can be computed directly for the first two, without our BKMC algorithm, because these models are small enough (by computing the exponential of the transition matrix, see Additional file
^{10}). The first two examples were chosen for their simplicity, and because they illustrate how global characterizations (entropy and transition entropy, see “Entropies” in “Methods”) can be used. The third example shows the use of BKMC/MaBoSS for a larger and more complex model for which the analysis is not obvious.

For the purpose of this article, we built the transition graphs for the first two examples (with GINsim

All input files and results are given in the webpage of MaBoSS (

Toy model of a single cycle

We consider three species, A, B and C, where A is activated by C and inhibited by B, B is activated by A and C is activated by A or B (Figure

**Toy model.** Toy model of a single cycle. **(a)** Influence network. **(b)** Logical rules and transition rates of the model. **(c)** Simulation parameters.

The model is defined within the language of MaBoSS by a set of logical rules associated to each node (Figure

**Transition graph of the toy model.** Transition graph for the toy model (generated by GINsim). The node states should be read as [ABC] = [***]. [ABC]=[100] corresponds to a state in which only A is active. The nodes in green belong to a cycle, the node in red is the fixed point and the other nodes are in blue.

The only stationary distribution is the fixed point [ABC]=[000]. We study two cases: when the rate of the transition from state [001] to state [000] (corresponding to the inactivation of C) is fast and when this rate is slow. We will refer to this transition rate as the escape rate. In both cases, we follow the trajectories of the probabilities of [ABC]=[000] and [ABC]=[1**], where * can be either 1 or 0, along with the trajectories of the entropy and the transition entropy.

In the first case, when the escape rate is fast, we set the parameter for the transition to a high value (rate_down = 10). In Figure

**MaBoSS outputs of the toy model with fast escape rate.** BKMC algorithm is applied to the toy model, with a fast escape rate. Trajectory of the network state probabilities [ABC]=[000] and [ABC]=[1**] (where * can be either 0 or 1), the entropy (

In the second case, when the escape rate is slow, we set the parameter for the transition to a low value (rate_down = 10^{−5}). As illustrated in Figure
^{−4}, it can be anticipated that the cyclic behavior is not stable. We can conclude on stable cyclic behaviors only when the transition entropy is exactly 0.

**MaBoSS outputs of the toy model with slow escape rate.** BKMC algorithm is applied to the toy model, with a slow escape rate. Trajectory of the network state probabilities [ABC]=[000] and [ABC]=[1**], the entropy (

**MaBoSS outputs of toy model with slow escape rate, large time scale.** BKMC algorithm is applied to the toy model, with a slow escape rate, plotted on a larger time scale. Trajectory of probabilities ([ABC]=[000] and [ABC]=[1**]), the entropy (

By considering the spectrum of the transition matrix (see Additional file

p53-Mdm2 signaling

We consider a model of p53 response to DNA damage

**Model of p53 response to DNA damage.** Model of p53 response to DNA damage. **(a)** Influence network. **(b)** Logical rules and transition rates of the model. **(c)** Simulation parameters.

The model is written in MaBoSS, with two levels of p53 (Figure

**Transition graph of the model of p53 response to DNA damage.** Transition graph of the p53 model (generated by GINsim). The node states should be read as [p53 Mdm2C Mdm2N Dam] = [****] (where * can be either 0 or 1). For instance, [p53 Mdm2C Mdm2N Dam]=[1000] corresponds to a state in which only p53 (at its level 1) is active. The nodes in green and the nodes in light blue belong to two cycles, the node in red is the fixed point and the other nodes are in dark blue.

In order to represent the activity of p53, the trajectories of the probabilities of all network states with p53 equal to 1 and with p53 equal to 2 are plotted (Figure
*11] and for the situation when p53 is set to its highest value (2, equivalent to p53_h) and thus can promote Mdm2 cytoplasmic activity.

**MaBoSS outputs of the model of p53 response to DNA damage.** Trajectories of the network state probabilities of [p53 Mdm2C Mdm2N Dam] = [1***] and of [p53 Mdm2C Mdm2N Dam] = [2***], the entropy (

The qualitative results obtained with MaBoSS are similar to those of Abou-Jaoudé and colleagues. However, at the level of cell population, some discrepancies appear: in Figure

With MaBoSS, we clearly interpret the system as a population and not as a single cell. In addition, we can simulate different contexts, presented in the initial article as different models, within one single model that uses different simulation parameters to account for these contexts.

Note that the existence of transient cycles, as shown in the toy model, can be deduced from the trajectory of the entropy that is significantly higher than the trajectory of the transition entropy (which is non-zero, therefore the transient cycles are not stable) (Figure

Mammalian cell cycle

For the last example, we propose a model of the mammalian cell cycle initially published as an ODE model by Novák and Tyson

We implement the logical rules of the published model in MaBoSS and define two parameter values for the transition rates: a slow one (set to 1) and a fast one (set to 10). The choice between slow and fast rates for each transition is based on the choice made in the published Boolean model: different priority classes were used in mixed discrete a/synchronous simulation and corresponded to the differences in speed of cellular processes such as transcription, degradation and protein modification. We could, of course, refine the analysis by setting different rates for each transition. The network, the logical rules and the simulation parameters can be found on the webpage.

As mentioned before, MaBoSS can provide two types of outputs: the probabilities of different network states over time (along with the entropy and transition entropy) and the indecomposable stationary distributions.

We consider two biological cases, in the presence of growth factors where the cell enters its division cycle and in the absence of growth factors where the cell is stuck in a G1-like state (state preceding replication of DNA). In the model, the activity of CyclinD (CycD), a G1-cyclin, illustrates the presence of growth factors. In our simulations, we set an initial condition corresponding to a G1 state with two CDK/cyclin inhibitors, p27 and cdh1, on, and with CyclinD on in order to account for the external growth signal. We plot the trajectories of the probabilities of all the cyclins A, B and E (Figure

**MaBoSS outputs of the model of the mammalian cell cycle: trajectories of probabilities.** BKMC algorithm is applied to the mammalian cell cycle model, with an initial condition corresponding to a G1 state in the presence of growth factors (CyclinD is on). Trajectories of the cyclins probabilities, the entropy (

The indecomposable stationary distributions are identified by the clustering algorithm of MaBoSS and illustrated in Figure

**MaBoSS outputs of the model of the mammalian cell cycle: stationary distributions.** BKMC algorithm is applied to the mammalian cell cycle model, with random initial conditions. Results of the clustering algorithm that associates a cluster to each indecomposable stationary distribution. **(a)** Probability of reaching each identified cluster; these probabilities are estimated by the proportion of trajectories that belong to each cluster. **(b)** First estimated cluster that can be interpreted as a desynchronized population of cells that are dividing. **(c)** Second estimated cluster, corresponding to a fixed point, that can be interpreted as a G1 cell cycle arrest with no growth factors.

These two indecomposable stationary distributions correspond to the two attractors identified by discrete time modeling in Fauré

Conclusions

We have presented a new algorithm, Boolean Kinetic Monte-Carlo or BKMC, applicable to dynamical simulation of signaling networks based on continuous time in the Boolean framework. BKMC algorithm is a natural generalization of the asynchronous Boolean dynamics

We applied this algorithm to three different models: a toy model that illustrates a simple cyclic behavior, a published model of p53 response to DNA damage, and a published model of mammalian cell cycle dynamics.

This algorithm is provided within freely available software, MaBoSS, that can run the BKMC algorithm on networks of up to 64 nodes in the present version. The construction of a model uses a specific language that introduces logical rules and transition rates of node activation/inactivation in a flexible manner. The software provides global and semi-global outputs of the model dynamics that can be interpreted as signatures of the dynamical behaviors. These interpretations become particularly useful when the network state space is too large to be handled. The convergence of the BKMC algorithm can be controlled by tuning some simulation parameters: maximum time of the simulation, number of trajectories, length of a time window on which the average of probabilities is performed, and the threshold for the definition of stationary distribution clusters.

The next step is to apply BKMC algorithm with MaBoSS on other existing large signaling networks,

We also expect to implement MaBoSS in broadly used software environments for Boolean modeling, like GINsim

Methods

BKMC generates stochastic trajectories. In this section, we describe how we use and interpret these trajectories.
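As an illustration of what such a trajectory looks like, here is a minimal sketch of a kinetic Monte-Carlo (Gillespie-type) simulation on a Boolean state space. It is not the MaBoSS implementation: the function names `simulate` and `toy_rates`, the dictionary-based rate interface and the two-node toy rules are all assumptions made for this example.

```python
import random

def simulate(rates, state, max_time, rng):
    """Simulate one continuous-time trajectory on a Boolean state space.

    rates(state) returns {node_index: flip_rate} (hypothetical interface).
    Returns a list of (state, entry_time, exit_time) segments.
    """
    t, segments = 0.0, []
    while t < max_time:
        r = {i: v for i, v in rates(state).items() if v > 0}
        total = sum(r.values())
        if total == 0:
            # No transition is possible: the state is a fixed point.
            segments.append((state, t, max_time))
            break
        dt = rng.expovariate(total)  # exponentially distributed waiting time
        t_next = min(t + dt, max_time)
        segments.append((state, t, t_next))
        if t + dt >= max_time:
            break
        # Choose the node to flip with probability rate / total rate.
        nodes = list(r)
        i = rng.choices(nodes, weights=[r[n] for n in nodes])[0]
        state = state[:i] + (1 - state[i],) + state[i + 1:]
        t = t_next
    return segments

def toy_rates(state):
    """Toy two-node cycle (assumed rules): A tends to NOT B, B tends to A."""
    a, b = state
    out = {}
    if a != 1 - b:
        out[0] = 1.0
    if b != a:
        out[1] = 1.0
    return out

trajectory = simulate(toy_rates, (0, 0), 10.0, random.Random(0))
```

Each segment records a network state together with the times at which the trajectory enters and leaves it, which is the information the estimators described in this section need.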

Network state probabilities on a time window

To relate continuous time probabilities to real processes, an observable time window is defined.

BKMC is used for estimating the network state probabilities on a time window, as follows:

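1. **Estimate for one trajectory.** For each trajectory, compute the time for which the system is in state **S** in the window, and divide this time by the length of the window.

2. **Estimate for a set of trajectories.** Compute the average over all trajectories of the single-trajectory estimates.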
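The two steps can be sketched as follows. This is a minimal illustration rather than the MaBoSS code; it assumes that each trajectory is stored as a list of `(state, entry_time, exit_time)` segments and that windows have a fixed length `window` and are indexed by an integer `k`, both of which are conventions chosen for this example.

```python
from collections import defaultdict

def window_probabilities(trajectories, window, k):
    """Estimate state probabilities in the window [k*window, (k+1)*window]."""
    lo, hi = k * window, (k + 1) * window
    per_trajectory = []
    for segments in trajectories:
        # Step 1: fraction of the window that this trajectory spends in each state.
        p = defaultdict(float)
        for state, t0, t1 in segments:
            overlap = min(t1, hi) - max(t0, lo)
            if overlap > 0:
                p[state] += overlap / window
        per_trajectory.append(p)
    # Step 2: average the single-trajectory estimates over all trajectories.
    states = {s for p in per_trajectory for s in p}
    n = len(per_trajectory)
    return {s: sum(p.get(s, 0.0) for p in per_trajectory) / n for s in states}

example = [
    [((0,), 0.0, 1.0), ((1,), 1.0, 2.0)],  # trajectory that switches at t = 1
    [((0,), 0.0, 2.0)],                    # trajectory that never switches
]
probs = window_probabilities(example, window=2.0, k=0)
```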

Entropies

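Once the network state probabilities on a time window are estimated, the entropy $H(\tau)$ of that window can be computed from them.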

The entropy measures the disorder of the system. Maximum entropy means that all states have the same probability; a zero entropy means that one of the states has a probability of one. The estimation of the entropy can be seen as a global characterization of a full probability distribution by a single real number. The choice of a base-2 logarithm means that $2^{H(\tau)}$ is an estimate of the number of states that have a non-negligible probability in the time window indexed by $\tau$.
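The interpretation of the entropy can be checked numerically. The sketch below assumes the standard Shannon entropy with a base-2 logarithm, applied to a list of state probabilities; the function name `entropy` is chosen for this example.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; states with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # four equally likely states
certain = [1.0, 0.0, 0.0, 0.0]       # a single state with probability one

h_uniform = entropy(uniform)         # maximum entropy for four states
h_certain = entropy(certain)         # zero entropy
effective_states = 2 ** h_uniform    # number of states with non-negligible probability
```

For the uniform distribution over four states the entropy is two bits, so two raised to the entropy recovers the four states; for the distribution concentrated on one state the entropy vanishes.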

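The transition entropy characterizes the system at the level of a single state: for each state **S**, there exists a set of possible transitions to other states, each with its own probability.

By convention, these transition probabilities are zero if there is no transition from **S** to any other state.

Therefore, a transition entropy can be associated with each state **S**, computed from the probabilities of its possible transitions.

Similarly, the transition entropy of **S** is zero if there is no transition from **S** to any other state. The transition entropy on a time window is the average of these per-state transition entropies, weighted by the network state probabilities on that window.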

This transition entropy is estimated in the following way:

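1. **Estimate for one trajectory.** For each trajectory, compute the set of states **S** visited in the time window and the duration spent in each of them. The estimated transition entropy is the sum of the transition entropies of the visited states, each weighted by the fraction of the window spent in that state.

2. **Estimate for a set of trajectories.** Compute the average over all trajectories of the single-trajectory estimates.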

This transition entropy is a way to measure how deterministic the dynamics is. If the transition entropy is always zero, each visited state has at most one possible transition, so the next state of the system is always determined.
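A minimal sketch of these quantities is given below. It assumes that the probability of each possible transition out of a state is its rate divided by the sum of the rates of all transitions out of that state, which is the usual choice for a continuous time Markov process; the function names and the dictionary inputs are hypothetical.

```python
import math

def transition_entropy(rates):
    """Transition entropy (in bits) of one state, from its outgoing transition rates."""
    total = sum(rates)
    if total == 0:
        return 0.0  # convention: no transition out of this state
    return -sum((r / total) * math.log2(r / total) for r in rates if r > 0)

def window_transition_entropy(durations, th, window):
    """Single-trajectory estimate on one window.

    durations: {state: time spent in that state within the window}
    th:        {state: transition entropy of that state}
    """
    return sum(th[s] * d / window for s, d in durations.items())

th = {
    "one_exit": transition_entropy([2.0]),        # deterministic next state
    "two_exits": transition_entropy([1.0, 1.0]),  # two equally likely next states
}
estimate = window_transition_entropy({"one_exit": 3.0, "two_exits": 1.0}, th, window=4.0)
```

A state with a single outgoing transition contributes nothing, so the estimate only grows with the time spent in states where the next state is uncertain.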

If probability distributions on a time window tend to constant values (or tend to a stationary distribution), the entropy and the transition entropy can help characterize this stationary distribution such that:

• A fixed point has zero entropy and zero transition entropy,

• A cyclic stationary distribution has non-zero entropy and zero transition entropy.

Entropy and transition entropy can be considered as “global characterizations” of the model: for a given time window, they always consist of two real numbers, whatever the size of the network is.

Hamming distance distribution

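The Hamming distance between two states **S** and **S′** is the number of nodes that have different node states between **S** and **S′**:

$$HD(\mathbf{S}, \mathbf{S}') \equiv \sum_{i} \left(1 - \delta_{\mathbf{S}_i, \mathbf{S}'_i}\right)$$

where $\delta$ is the Kronecker delta ($\delta_{\mathbf{S}_i, \mathbf{S}'_i} = 1$ if $\mathbf{S}_i = \mathbf{S}'_i$, and $\delta_{\mathbf{S}_i, \mathbf{S}'_i} = 0$ if $\mathbf{S}_i \neq \mathbf{S}'_i$). Given a reference state $\mathbf{S}_{ref}$, the Hamming distance distribution (over time) is given by: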

The Hamming distance distribution on a time window is estimated in the same way as the network state probabilities on a time window.

The Hamming distance distribution is a useful characterization when the set of instantaneous probabilities is compared to a reference state ($\mathbf{S}_{ref}$). In that case, the Hamming distance distribution describes how far this set is from the reference state. The Hamming distance distribution can be considered as a “semi-global” characterization of time evolution: for a given time window, the size of this characterization is the number of nodes (to be compared with probabilities on a time window, whose size is $2^{\#nodes}$).
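The following sketch shows how a distribution over network states collapses into a Hamming distance distribution. States are represented as tuples of 0/1 values, and the function names are chosen for this example.

```python
def hamming(s, t):
    """Number of nodes whose values differ between states s and t."""
    return sum(a != b for a, b in zip(s, t))

def hamming_distribution(probs, ref):
    """Total probability of the states at each Hamming distance from ref.

    probs: {state: probability}; distances range from 0 to the number of nodes.
    """
    dist = [0.0] * (len(ref) + 1)
    for state, p in probs.items():
        dist[hamming(state, ref)] += p
    return dist

probs = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}
dist = hamming_distribution(probs, ref=(0, 0))
```

The output has one entry per possible distance, so its size grows linearly with the number of nodes, whereas the full state distribution grows exponentially.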

Input, internal, output and reference nodes

When internal nodes are defined, the transition entropy of a state **S** is computed as follows:

• If the only possible transitions from state **S** to any other state consist of flipping an internal node, the transition entropy is zero.

• If there is at least one transition from state **S** to another state that flips an output node, then only the output nodes will be considered for computing probabilities in equation 10. In particular,

Stationary distribution characterization

It can be shown (see Additional file

• If both the transition entropy and the entropy converge to zero, then the process converges to a fixed point.

• If the transition entropy converges to zero and the entropy does not, then the process converges to a cycle.
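These two rules can be written as a small decision function. The sketch below assumes that the converged values of the entropy and the transition entropy are already available, and it uses a numerical tolerance `tol` to decide what counts as zero; both the function name and the tolerance are assumptions of this example.

```python
def classify(entropy, transition_entropy, tol=1e-6):
    """Label the asymptotic behavior from converged entropy values.

    tol is an assumed numerical cutoff below which a value is treated as zero.
    """
    if transition_entropy > tol:
        return "other"  # neither of the two rules applies
    return "fixed point" if entropy <= tol else "cycle"

labels = [classify(0.0, 0.0), classify(1.58, 0.0), classify(2.0, 0.7)]
```

Cases with a non-zero transition entropy are outside the two rules and are only labeled as such here.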

More generally, the complete description of the Markov process asymptotic behavior can be expressed as a linear combination of the indecomposable stationary distributions.

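A set of finite trajectories, produced by BKMC, can be used to estimate the set of indecomposable stationary distributions. Consider one such trajectory. The associated indecomposable stationary distribution ($s_0$) is estimated by averaging over the whole trajectory, that is, by the fraction of the total simulated time that the trajectory spends in each state: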

Therefore, a set of indecomposable stationary distribution estimates can be obtained from a set of trajectories. These estimates should be clustered in groups, where each group consists of estimates of the same indecomposable stationary distribution. For that, we use the fact that two indecomposable stationary distributions are identical if they have the same support. A similarity coefficient between two stationary distribution estimates, $s_0^{(i)}$ and $s_0^{(j)}$, is defined:

where

Clusters can be constructed when a similarity threshold is provided.

For each cluster, a stationary distribution estimate is obtained by averaging the estimates that belong to it.

Errors on this estimate can be computed by:

Note that this clustering procedure is meaningless if the process is not Markovian; therefore, no nodes are treated as internal when it is applied.
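The overall procedure can be sketched as follows. This is an illustration under stated assumptions, not the MaBoSS implementation: the similarity used here is the product of the probability masses that two estimates place on their shared support, two estimates are linked when this similarity reaches the threshold `alpha`, and clusters are the groups of linked estimates. The exact coefficient and clustering rule of the original method may differ, and all function names are hypothetical.

```python
def time_average(segments):
    """Fraction of the total time spent in each state of one trajectory."""
    total = sum(t1 - t0 for _, t0, t1 in segments)
    est = {}
    for state, t0, t1 in segments:
        est[state] = est.get(state, 0.0) + (t1 - t0) / total
    return est

def similarity(p, q):
    """Assumed coefficient: product of the masses on the shared support."""
    shared = [s for s in p if p[s] > 0 and q.get(s, 0.0) > 0]
    return sum(p[s] for s in shared) * sum(q[s] for s in shared)

def cluster(estimates, alpha):
    """Group estimate indices; linked estimates end up in the same cluster."""
    clusters = []
    for idx, est in enumerate(estimates):
        linked = [c for c in clusters
                  if any(similarity(est, estimates[j]) >= alpha for j in c)]
        merged = sorted([idx] + [j for c in linked for j in c])
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters

def cluster_mean(estimates, members):
    """Average of the estimates that belong to one cluster."""
    states = {s for j in members for s in estimates[j]}
    return {s: sum(estimates[j].get(s, 0.0) for j in members) / len(members)
            for s in states}

estimates = [
    {"F": 1.0},            # trajectory trapped in a fixed point F
    {"X": 0.5, "Y": 0.5},  # trajectory cycling between X and Y
    {"X": 0.6, "Y": 0.4},
    {"F": 1.0},
]
groups = cluster(estimates, alpha=0.1)
```

The proportion of trajectories that fall in each group then gives the estimated probability of reaching the corresponding stationary distribution.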

Abbreviations

BKMC: Boolean Kinetic Monte-Carlo; AT: Asynchronous transition; ODEs: Ordinary differential equations; MaBoSS: Markov Boolean Stochastic Simulator.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

G. Stoll organized the project, set up the algorithms, participated in writing the software, set up the examples and wrote the article. E. Viara wrote the software and participated in setting up the algorithms. E. Barillot participated in discussions and corrected the manuscript. L. Calzone organized the project, set up the examples and wrote the article. All authors read and approved the final manuscript.

Acknowledgements

This project was supported by the Institut National du Cancer (SybEwing project) and the Agence Nationale de la Recherche (Calamar project). The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement nb HEALTH-F4-2007-200767 for APO-SYS and nb FP7-HEALTH-2010-259348 for ASSET. GS, EB and LC are members of the team “Computational Systems Biology of Cancer”, Equipe labellisée par la Ligue Nationale Contre le Cancer. We would like to thank Camille Sabbah, Jacques Rougemont, Denis Thieffry, Elisabeth Remy, Luca Grieco and Andrei Zinovyev.