• Access by Staats- und Universitaetsbibliothek Bremen

Statistical mechanics of multiedge networks

O. Sagarra, C. J. Pérez Vicente, and A. Díaz-Guilera

  • Departament de Física Fonamental, Universitat de Barcelona, E-08028 Barcelona, Spain

Phys. Rev. E 88, 062806 – Published 5 December, 2013

DOI: https://doi.org/10.1103/PhysRevE.88.062806

Abstract

Statistical properties of binary complex networks are well understood and recently many attempts have been made to extend this knowledge to weighted ones. There are, however, subtle yet important considerations to be made regarding the nature of the weights used in this generalization. Weights can be either continuous or discrete magnitudes, and in the latter case, they can additionally have undistinguishable or distinguishable nature. This fact has not been addressed in the literature insofar and has deep implications on the network statistics. In this work we face this problem introducing multiedge networks as graphs where multiple (distinguishable) connections between nodes are considered. We develop a statistical mechanics framework where it is possible to get information about the most relevant observables given a large spectrum of linear and nonlinear constraints including those depending both on the number of multiedges per link and their binary projection. The latter case is particularly interesting as we show that binary projections can be understood from multiedge processes. The implications of these results are important as many real-agent-based problems mapped onto graphs require this treatment for a proper characterization of their collective behavior.

Article Text

I. INTRODUCTION

The increasing and unprecedented quality and quantity of available data coming from very different areas is boosting the field of complex networks. Interdisciplinary science demands new efforts and new tools, addressed not only to develop more efficient computational strategies to analyze incoming data but also to span a theoretical framework where both more accurate and more tractable models can provide predictions closer to the reality one wants to face. In this context, a standard approach consists of representing in a graph the complex structure of interactions among the elements of a given system. Statistical mechanics is an extraordinary framework where such a complex structure can be appropriately modeled and with this aim a large amount of studies has appeared in recent years .

The simplest representation of a network assumes the existence of nodes and edges. The edges do not need to be necessarily symmetric and they can provide information about the relative influence (interaction) of a node onto another, i.e., they can be directed and have a certain intensity. However, the early studies in the field were essentially focused on a binary projection of the network on a graph where only the existence of an edge and its distribution were required to determine some of its properties, without considering the effect of weights. In this way one could compute the probability for two arbitrary sites (nodes) to be connected through an edge. It is well known that such probability keeps certain analogies with occupation numbers in quantum statistics, in particular, with fermionic systems . Further developments have extended these results to richer and more complex structures such as directed and weighted graphs finding analogy with bosonic systems .

A successful and complete description of any system amenable to be represented as a complex network through statistical mechanics requires an appropriate characterization regarding the detailed features of the microscopic components forming the network. Up to now, the actions represented by an edge have always been considered as indistinguishable events, but in some cases, the data represented by edges and their weights can be generated by independent, identifiable actions. This circumstance needs to be taken into account whenever using any statistical technique applied to complex networks. Its impact being important as can be seen in analogy to statistical mechanics where the distinguishability of particles leads to different descriptions in terms of Fermi-Dirac, Bose-Einstein or Maxwell-Boltzmann statistics.

Processes generated by single agents represent single events, for instance, a trip between two locations, a call between two cities or communications at a given time. A particularly clear example where this distinguishability of weight units is crucial are transportation networks , where human flows between different locations are used to build the so-called Origin Destination matrices which collect global information about their mobility or allocation. A naive approach based on a standard weighted description of a network is not satisfactory for this problem as it was already pointed out by Wilson who mapped transport systems to statistical physics using an entropy maximization approach. Hence, modern complex networks theory calls for the introduction of new general models taking into account the possible distinguishability of the edges forming the considered network.

The present work addresses this issue by presenting a general statistical mechanics approach to networks created from distinguishable single-unit events that can be grouped in edges, in analogy to the allocation of particles in discrete energy levels of classical statistical mechanics systems. In such networks, the edge weights encode information about independent processes performed by different agents at (possibly) different times, and hence those are necessarily represented by integer numbers and constitute quantized, distinguishable entities. The difference between this representation, which leads to what we call multiedge networks, and the already considered weighted ones is clear as schematically shown in Fig. . Multiedge networks assume the existence of a minimal weight of unit value (and hence a quantization) representing an indissociable event. In some cases, groups of these events connect the same pair of nodes, hence forming an edge with multiple connections, which has a different nature from a weighted one where no quantization is imposed. If one aims to apply the well-known ensemble theory used in statistical physics to networks , one needs to take into account this subtle yet important difference, which has profound implications on the associated statistics. The choice of one or the other representation will thus depend on the problem at hand and makes a big difference in terms of collective behavior of the whole network as will be shown along this paper. This study fills the existing gap in the field by introducing a complete and necessary discussion on distinguishability that is to be added to the already existing models and thus complements the complex networks body of theory.

FIG. 1

The distinguishability issue: Out of a set of three distinguishable nodes and three distinguishable (directed) events, one can see that the number of configurations (described in terms of edge names) compatible with a given occupation number sequence is degenerate, since all permutations of events generate different microstates.

Many examples of systems studied using network theory can be understood as multiedge networks. All those networks composed by the actions of independent agents (mobility networks , call networks ) and in general those whose edges are formed by the sum of independent actions developed during a certain time fall within this category . The present work deals with the challenge of introducing null models that provide exact analytical expectations for the main network observables under some given fixed constraints. This, in turn, allows one to assess whether any observed network characteristic from real data is due to the imposed constraints or to the contrary, to some unexpected phenomena worth exploring.

The paper is structured as follows: Sec. introduces the concept of multiedge networks together with some basic nomenclature and definitions used. Sections introduce the general derivation of the considered ensembles using different sets of constraints (mainly linear constraints on occupation numbers, binary-projection constraints, and both). In each section, some important examples are explicitly derived and formulas for the state statistics and the ensemble entropies are obtained. Finally some concluding remarks are given discussing the work done. The Appendices contain details, additional discussion, and some mathematical developments on mentioned examples in the text.

II. ENSEMBLE APPROACH TO MULTIEDGE NETWORKS: NOMENCLATURE AND DEFINITIONS

A multiedge network is a collection of nodes that may be connected by none, one or more than one element (event) taken from a set of independent, distinguishable entities. Such an object admits a coarse-grained representation in terms of a multiedge matrix T. The matrix entries are bounded integer stochastic variables denoting the number of events joining nodes and and represents the total number of events. In the context of statistical mechanics they play an analog role to occupation numbers. The matrix need not necessarily be symmetric so in general. Let be the number of nodes of the network which generate possible states where individual events can be allocated. represents the number of occupied states (regardless of their occupation number as long as ) in the network, representing the Heaviside step function,

Such a function ensures that accounts for the number of existing connections regardless of the number of events contained in each entry of the matrix, therefore creating a binary projection of the network.

Our goal is to construct an ensemble framework that allows one to treat multiedge networks with any given set of constraints defined in terms of the variables of the system. The constraints define a macrostate and restrict the available phase space to all the possible graphs compatible with such constraints. In this context, we assume two main starting hypotheses. First, all the configurations (microstates) compatible with the observed constraints have the same a priori probability of appearance. Secondly, we assume that being given (this means that the topological structure of the network, the number of available states, does not change) which allows a statistical treatment of the problem. We further assume that the distribution of occupation numbers is stationary and defines in turn a probability that fixes our thermodynamic limit. This probability indicates the asymptotic (relative) distribution of occupation numbers:

(1)

where denotes an average performed over the ensemble considered. Under these considerations it is possible to establish a complete mapping between the problem at hand and a classical statistical mechanics problem. The events correspond to distinguishable particles occupying any of the 𝑁(𝑁1) available energy states [the methodology is laid for directed graphs, but can be easily adapted to undirected ones, yielding 𝑁(𝑁1)/2 available states]. The main difference between the proposed system and the ones studied by classical statistical equilibrium mechanics regards the constraints used: They will not be in general of extensive, global nature but local at the level of nodes. The standard procedure, already used by different authors, consists of finding an expression for the probability of obtaining a macrostate defined by its multiedge adjacency matrix 𝐓,

𝒫(𝐓)=𝒫({𝑡𝑖𝑗}),

and then maximizing its associated entropy,

𝑆SH=𝛤𝒫(𝐓)ln𝒫(𝐓),
(2)

where the sum runs over all possible configurations of events (microstates) on the accessible phase space 𝛤 of the given ensemble. Since distinguishability plays an important role in our work, it is important to note that if one wishes to compute the sum over possible values of the occupation numbers {𝑡𝑖𝑗} (our canonical variables) rather than over configurations, a degeneracy term needs to be added to expression  (see Fig. ),

𝒟({𝑡𝑖𝑗})=(𝑖𝑗𝑡𝑖𝑗)!𝑖𝑗𝑡𝑖𝑗!.
(3)

In the remainder of the paper, we shall work on the occupation-number space 𝛺 (which is a coarsened representation of the 𝛤 configurational space), including the degeneracy term and hence considering an expression for the entropy of the form,

𝑆SH=𝛤𝒫({𝑡𝑖𝑗})ln𝒫({𝑡𝑖𝑗})=𝛺({𝑡𝑖𝑗}𝒟({𝑡𝑖𝑗})𝒫({𝑡𝑖𝑗})ln(𝒟({𝑡𝑖𝑗})𝒫({𝑡𝑖𝑗}))=𝛺({𝑡𝑖𝑗})𝑃({𝑡𝑖𝑗})ln𝑃({𝑡𝑖𝑗}),
(4)

where we changed the notation 𝒫𝑃 to denote the counting over occupation numbers in the (coarse-grained) 𝛺 space rather than event configurations in 𝛤 space.

A final comment deserves the seminal work by Wilson on transport theory . He followed a similar scheme and found expressions for the expected number of events connecting two arbitrary nodes according to some considered constraints. However, in the present paper we go further; we present a modern methodology able to handle more complicated constraints as compared to those analyzed by Wilson and also consider constraints which affect simultaneously the distribution of occupation numbers and its binary projection on the graph, for instance, those affecting the strength (weighted degree ) of a node and its degree.

III. MULTIEDGE NETWORK WITH GIVEN LINEAR CONSTRAINTS ON THE OCCUPATION NUMBERS 𝑡𝑖𝑗

In this work we proceed following a microcanonical scenario. In this ensemble, all the configurations are equally probable and the constraints are considered “hard,” i.e., the phase space accessible always fulfils strictly the constraints. Thus, we must only maximize the expression ln𝛺({𝑡𝑖𝑗}) with respect to 𝑡𝑖𝑗 which allows one to find the most probable value of some relevant observables, i.e., their statistically expected value in the ensemble of the maximally random graph which strictly fulfills the constraints:

max{𝛺({𝑡𝑖𝑗})|𝐶({𝑡𝑖𝑗})=𝐶}max{𝛺({𝑡𝑖𝑗})}.
(5)

We follow the same methodology developed in seminal works by Bianconi and write down the volume of the 𝛺 space introducing auxiliary fields 𝑖𝑗,

𝛺={𝑐}𝑒{𝑖,𝑗}𝑖𝑗𝑡𝑖𝑗𝑄𝑞𝛿(𝐶𝑞𝐶𝑞({𝑡𝑖𝑗}))={𝑐}𝑒{𝑖,𝑗}𝑖𝑗𝑡𝑖𝑗𝑄𝑞𝐷[𝜃𝑞]𝑒𝜃𝑞(𝐶𝑞({𝑡𝑖𝑗}𝐶𝑞)),
(6)

where {𝑐} denotes a sum over microscopic configurations ( 𝛤 space), 𝑄 is the total number of constraints, and 𝜃𝑞 are the related Lagrange multipliers. Note also that an integral representation of the Kronecker delta has been used. The introduction of auxiliary fields 𝑖𝑗 allows one to recover all the central moments of the distribution of 𝑡𝑖𝑗. In fact, one can see that  is closely related to the cumulant generating function of 𝑡𝑖𝑗 and hence all its cumulants can be recovered by differentiation. In particular,

𝑡𝑖𝑗=𝜕𝑖𝑗ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=0𝑖,𝑗,𝜎2𝑡𝑖𝑗=𝜕2𝑖𝑗ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=0𝑖,𝑗,𝜎2𝑡𝑖𝑗,𝑡𝑘𝑙=𝜕2𝑖𝑗𝑘𝑙ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=0𝑖,𝑗.
(7)

If we wish to perform the sum over occupation numbers ( 𝛺 space) rather than over configurations of the system, we need to take into account the degeneration given in ,

𝛺={𝑡𝑖𝑗}𝑇!𝑖𝑗𝑡𝑖𝑗!𝑒𝑖𝑗𝑖𝑗𝑡𝑖𝑗𝑞𝐷[𝜃𝑞]𝑒𝜃𝑞𝐶𝑞𝑒𝜃𝑞𝐶𝑞({𝑡𝑖𝑗}).
(8)

For this ensemble and for linear constraints on occupation numbers, one can write

𝐶𝑞({𝑡𝑖𝑗})=𝑖𝑗𝑐(𝑖𝑗)𝑞𝑡𝑖𝑗,
(9)

being 𝑐(𝑖𝑗)𝑞 a quantity that usually depends on a “property” of the edge between nodes 𝑖 and 𝑗 (a distance, for instance) or be a real number ( 𝑐(𝑖𝑗)𝑞=𝛿𝑖,𝑞 for outgoing strength sequence constraints, for example, as we shall see). In such a situation, the sum over {𝑡𝑖𝑗} sequences such that 𝑡𝑖𝑗=𝑇 can be exactly performed inside the integral in  yielding

𝛺=(𝑞𝐷[𝜃𝑞])𝑒𝑞𝜃𝑞𝐶𝑞𝑒𝑇ln𝑖𝑗exp{(𝑖𝑗+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞)}=(𝑞𝐷[𝜃𝑞])𝑒𝑓({𝜃𝑞,𝐶𝑞},{𝑖𝑗}).
(10)

The occupation number statistics can be shown to have multinomial nature (see Appendix ),

𝑡𝑖𝑗=𝑇𝑝𝑖𝑗;𝜎2𝑡𝑖𝑗=𝑇𝑝𝑖𝑗(1𝑝𝑖𝑗),𝜎2𝑡𝑖𝑗,𝑡𝑘𝑙=𝑇𝑝𝑖𝑗𝑝𝑘𝑙.
(11)

This fact assures that the relative fluctuations of occupation numbers vanish in the thermodynamic limit ( 𝑇) since

𝜎𝑡𝑖𝑗𝑡𝑖𝑗=1𝑝𝑖𝑗𝑝𝑖𝑗𝑇0,as𝑇.

We identified 𝑝𝑖𝑗 as the probability for an individual event to be assigned to state 𝑖𝑗. Explicitly,

𝑝𝑖𝑗=𝑒𝑞𝑐𝑖𝑗𝑞𝜃𝑞𝑖𝑗𝑝𝑖𝑗.
(12)

Concerning the entropy of the graphs in this ensemble, the integral in  can be approximated to first order by using the steepest descent methods,

𝑆BG=ln𝛺|𝑖𝑗𝑖,𝑗=0𝑓*=𝑞𝐶𝑞𝜃*𝑞+𝑇ln𝑒𝑞𝜃*𝑞𝑖𝑗𝑐(𝑖𝑗)𝑞,
(13)

where 𝜃*𝑞 are the solutions of the saddle point equations given by

𝜕𝜃𝑞𝑓|𝑖𝑗=0𝑖𝑗=0𝐶𝑞=𝐶𝑞(𝑡𝑖𝑗)=𝑇𝑖,𝑗𝑐(𝑖𝑗)𝑞𝑝𝑖𝑗.
(14)

Merging  one obtains the event-specific entropy of a given graph in this ensemble:

𝑆BG𝑇=(𝑞𝜃𝑞𝑖,𝑗𝑐(𝑖𝑗)𝑞𝑝𝑖𝑗ln𝑖𝑗𝑒𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞)=(𝑖𝑗𝑝𝑖𝑗𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞ln𝑖𝑗𝑒𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞)=𝑖𝑗𝑝𝑖𝑗ln𝑝𝑖𝑗=𝑆SH𝑇,
(15)

which has a final Shannon entropy form, equivalent to the Boltzmann-Gibbs entropy.

If one is able to exactly solve the saddle point equations obtained from the steepest descent approximation, the full distribution of occupation numbers is recovered. Despite being in a microcanonical framework, one could consider a canonical ensemble where the constraints are fulfilled on average, hence 𝐶𝑞(𝑡𝑖𝑗)=𝐶𝑞(𝑡𝑖𝑗). Having proven that the partition function  has a multinomial-cumulant form over the occupation numbers and considering only linear soft constraints, the requirement that those constraints need to be fulfilled only on average in the sampling over the phase space is automatically satisfied. Moreover, the constraints have vanishing relative fluctuations in the thermodynamic limit,

𝐶𝑞({𝑡𝑖𝑗})=𝑖,𝑗𝑐(𝑖𝑗)𝑞𝑡𝑖𝑗=𝑇𝑖,𝑗𝑐(𝑖𝑗)𝑞𝑝𝑖𝑗,𝜎2𝐶𝑞=𝑖,𝑗,𝑘,𝑙𝑐(𝑖𝑗)𝑞𝑐(𝑘,𝑙)𝑞𝜎2𝑖𝑗,𝑘𝑙𝑇,𝜎2𝐶𝑞𝐶𝑞21𝑇0as𝑇,

where we have used the properties of the multinomial distribution presented in . Although the theoretical basis for the generation of graphs in different ensembles is introduced in this paper (see Appendix ), the challenges for the exact and efficient generation of such ensembles will be shortly tackled and presented in future work.

In the following, we shall consider some explicit cases of linear constraints on 𝑡𝑖𝑗.

A. No constraints

This is the simplest case where we have a single hard constraint (apart from the number of nodes 𝑁) 𝑇=𝑖𝑗𝑡𝑖𝑗. Therefore  reads

𝛺=𝑡𝑖𝑗=𝑇𝑇!𝑖𝑗𝑡𝑖𝑗!𝑖𝑗(𝑒𝑖𝑗)𝑡𝑖𝑗=(𝑖𝑗𝑒𝑖𝑗)𝑇.

In this case it is straightforward to determine the average value of the occupation numbers (in this case 𝑡 denotes an average over a single graph realization):

𝑡𝑖𝑗=𝑇(𝑁(𝑁1))𝑡𝑇𝑝𝑖,𝑗.
(16)

Therefore, all the occupation numbers have constant probability 𝑝𝑖𝑗=𝑝𝑖,𝑗 of being chosen per event sorted. The result is according to what intuition would tell us: Under no constraints events are equally distributed among levels which reminds the high-temperature regime in classical systems where there is an arbitrary large (but finite) number of energy levels. It is also possible to compute the covariances on occupation numbers,

𝜎2𝑡𝑖𝑗,𝑡𝑘𝑙={𝑇𝑝2𝑖𝑗𝑘𝑙𝑇𝑝(1𝑝)𝑖𝑗=𝑘𝑙.

Other typical network magnitudes of interest such as the strengths (both incoming 𝑠(in)𝑗=𝑖𝑡𝑖𝑗 and outgoing 𝑠(out)𝑖=𝑗𝑡𝑖𝑗), which will be extensively used in this paper, can also be computed. They are random integer variables with fixed mean 𝑠𝑖𝑠=(𝑁1)𝑝𝑇𝑝𝑠𝑇 and variance 𝜎2𝑠=𝑇𝑝𝑠(1𝑝𝑠) 𝑖[1,𝑁] which are on average equal for each node and have also a multinomial character.

Concerning the Boltzmann-Gibbs entropy of the ensemble,

𝑆BG=𝑇ln(𝑁(𝑁1))=𝑇ln𝑝=𝑆SH.
(17)

Which recovers a Shannon form over events as expected, 𝑆SH=𝑗𝑖𝑝ln𝑝.

B. Fixed average event cost 𝑐=𝐶/𝑇

If additionally to the number of events 𝑇, we consider a cost matrix (symmetric, dense, and positive definite) 𝐃={𝑑𝑖𝑗} and fix the total cost 𝐶𝑇=𝑑𝑖𝑗𝑡𝑖𝑗, we trivially get

𝛺=𝐷[𝜃]exp{𝜃𝐶𝑇+𝑇ln𝑒𝑖𝑗+𝜃𝑑𝑖𝑗}=𝐷[𝜃]exp{𝑐(𝜃,{𝑖𝑗},{𝑡𝑖𝑗})},

which leads to the saddle point equation,

𝜕𝜃𝑐|𝑖𝑗=0𝑖𝑗=0𝐶𝑇𝑇𝑐=𝑑𝑖𝑗𝑒𝜃𝑑𝑖𝑗𝑒𝜃𝑑𝑖𝑗,

and finally,

𝑡𝑖𝑗=𝑇𝑒𝜃𝑑𝑖𝑗𝑒𝜃𝑑𝑖𝑗=𝑇𝑝(𝜃,𝑑𝑖𝑗),𝜎2𝑡𝑖𝑗=𝑇𝑝(𝜃𝑑𝑖𝑗)(1𝑝(𝜃,𝑑𝑖𝑗)).

And this leads to a weighted version of the Waxman graph . This reasoning can be extended to study the interesting case where the distribution of costs is also fixed, which is of particular interest in the field of O-D matrices used to analyze mobility, where the mobility of users using certain types of transports is assumed to follow particular statistical forms (). This case is analyzed in detail and solved in Appendix .

For any of the cases involving cost matrices, especially those related with distances, it is very important to remark that the allocation of occupation numbers is not independent in each state and hence it is not true that the probability 𝑊(𝑡𝑖𝑗,𝑑𝑖𝑗,𝜃) of having a state occupied by 𝑡𝑖𝑗 events at distance 𝑑𝑖𝑗 is 𝑊(𝑡𝑖𝑗,𝑑𝑖𝑗)=𝐾𝑓(𝜃,𝑑𝑖𝑗). Rather 𝑊(𝑡𝑖𝑗) represents a conditional probability of observing 𝑡𝑖𝑗 events (trips in this scenario) at distance 𝑑𝑖𝑗 given the distance matrix 𝐃 and the rest of the constraints (included in 𝐾). This means in particular that if a deference function 𝑓(𝑑𝑖𝑗) is proposed to explain observed flows between locations, as usually done in O-D matrix studies under a maximum entropy assumption, its predictive results need to be statistically tested against the full expected distribution of 𝑡𝑖𝑗, and not only against the (biased) statistic of observed or existing occupation numbers.

C. Fixed relative strength sequence 𝑠={(𝑠out,𝑠in)𝑖},𝑇

We consider now the case in which the only given constraint is the strength sequence 𝑠. In this case, the constraints read

𝐶𝑖(𝑡𝑖𝑗)=𝑖𝑡𝑖𝑗=𝑠out𝑖𝐶𝑗(𝑡𝑖𝑗)=𝑗𝑡𝑖𝑗=𝑠in𝑗,
(18)

and Eq.  becomes

𝛺=𝑒𝑖𝛼𝑖𝑠out𝑖𝑒𝑗𝛽𝑗𝑠in𝑗(𝑁𝑖𝐷[𝛼𝑖]𝐷[𝛽𝑖])×𝑡𝑖𝑗=𝑇𝑇!𝑖𝑗𝑡𝑖𝑗!𝑖𝑗(𝑒𝛼𝑖+𝛽𝑗+𝑖𝑗)𝑡𝑖𝑗=(𝑁𝑖𝐷[𝛼𝑖]𝐷[𝛽𝑖]𝑒𝛼𝑖𝑠out𝑖𝑒𝛽𝑖𝑠in𝑖)(𝑖𝑗𝑒𝛼𝑖+𝛽𝑗+𝑖𝑗)𝑇=𝐷[{𝛼𝑖,𝛽𝑗}]exp𝑖𝛼𝑖𝑠out𝑖𝑗𝛽𝑗𝑠in𝑗+𝑇ln(𝑖𝑗𝑒𝛼𝑖+𝛽𝑗+𝑖𝑗)=𝐷[{𝛼𝑖,𝛽𝑗}]exp𝑓({𝛼𝑖},{𝛽𝑖},{𝑖𝑗}).

Now we need to solve the 2𝑁 saddle point equations,

𝜕𝛼𝑖𝑓|𝑖𝑗=0𝑖,𝑗=0𝑠out𝑖=𝑇𝑗𝑖𝑒𝛼𝑖+𝛽𝑗𝑒𝛼𝑖+𝛽𝑗,𝜕𝛽𝑗𝑓|𝑖𝑗=0𝑖𝑗=0𝑠in𝑗=𝑇𝑖𝑗𝑒𝛼𝑖+𝛽𝑗𝑒𝛼𝑖+𝛽𝑗.
(19)

And we apply again  to get

𝑡𝑖𝑗=𝑇𝑥𝑖𝑦𝑗𝑖𝑗𝑥𝑖𝑦𝑗=𝑇𝑝𝑖𝑗,
(20)

where we have identified 𝑥𝑖=𝑒𝛼𝑖,𝑦𝑗=𝑒𝛽𝑗 and 𝑝𝑖𝑗 has the same multinomial structure and properties as in previous examples except for the fact that these probabilities are state dependent. In fact, the average number of multiedges between two nodes factorizes in an uncorrelated form.

For large 𝑁 we are led to

𝑠out𝑖=𝑇𝑥𝑖𝑗𝑦𝑗𝑖𝑗𝑥𝑖𝑗𝑦𝑗=𝑇𝑥𝑖𝑋𝑥𝑖𝑇𝑥𝑖𝑋,𝑋𝑖𝑥𝑖;𝑌𝑗𝑦𝑗;𝑠out𝑖𝑥𝑖;𝑠in𝑗𝑦𝑗,
(21)

and we recover the weighted configuration model ,

𝑡𝑖𝑗=𝑠out𝑖𝑠in𝑗𝑇.
(22)

This result recovers the expression in and is in accordance with a maximum likelihood principle (contrary to the case of weighted networks as explained in ).

Let us notice that these results allow a straightforward extension to the canonical ensemble. By identifying from ,

𝑝𝑠out𝑖=𝑥𝑖𝑋𝑝𝑠in𝑗=𝑦𝑗𝑌,
(23)

one can work with a strength sequence which is no longer fixed but a collection of fluctuating integer random values with a multinomial structure, since the partial grouping of multinomial random variables 𝑡𝑖𝑗 preserve their multinomial character and by construction 𝑗𝑝𝑠in𝑗=𝑖𝑝𝑠out𝑖=1. Therefore,

𝑠out𝑖=𝑇𝑝𝑠out𝑖,𝑠in𝑗=𝑇𝑝𝑠in𝑗,𝜎𝑠out𝑖=𝑇𝑝𝑠out𝑖(1𝑝𝑠out𝑖),𝜎𝑠in𝑗=𝑇𝑝𝑠in𝑗(1𝑝𝑠in𝑗).
(24)

Concerning the entropy in this canonical ensemble scenario, making use of  we further obtain a closed expression in terms of the constraints of the problem,

𝑆BG𝑖𝑠out𝑖ln𝑠out𝑖𝑇𝑗𝑠in𝑗ln𝑠in𝑗𝑇+𝑇ln𝑖𝑗𝑠out𝑖𝑠in𝑗𝑇2=𝑇(𝑖𝑝𝑠out𝑖ln𝑝𝑠out𝑖+𝑗𝑝𝑠in𝑗ln𝑝𝑠in𝑗),𝑆BG𝑇=𝑖𝑝𝑠out𝑖ln𝑝𝑠out𝑖𝑗𝑝𝑠in𝑗ln𝑝𝑠in𝑗=𝑆SH𝑇.
(25)

Let us remind that in this context 𝑝𝑠𝑖=𝑗𝑝𝑖𝑗 represents the probability of a certain node 𝑖 to accumulate 𝑠𝑖 (incoming or outgoing) events on average. We therefore recover a Shannon form for the entropy in both microcanonical and canonical ensembles, which scales with the total number of events 𝑇 and is node (but also state) specific [from Eq. ].

Further additional constraints can be added leading to different models. An example would be to merge the last two examples of a network living in a metric space (or any network where we can give a cost to the edges) with fixed “accessibility” of each node (distance-weighted strength). We could also fix the average cost per trip 𝑐 (global constraint) and the strength of each node (node local constraint). In this case one would obtain the popular doubly constrained gravity model in several forms , also Stouffers’ intervening opportunities model and even the newly proposed radiation model (by choosing an appropriate form for the cost function; see for extended discussion).

The case of the gravity law models deserves a closer look since an entropy maximization approach yields a form 𝑡𝑖𝑗=𝑥𝑖𝑦𝑗𝑓(𝛾,𝑑𝑖𝑗) which is not equivalent (in general) to 𝑡𝑖𝑗=𝑠𝛼𝑠𝛽𝑓(𝛾,𝑑𝑖𝑗) because the values of the multipliers depend on the particular spatial distribution of the considered nodes and their relative strengths. Hence, despite the success attained by these kinds of models to reproduce empirical data an entropy maximization approach could unify the different sets of exponents observed in each study (see ).

IV. MULTIEDGE NETWORK WITH GIVEN NONLINEAR CONSTRAINTS ON THE BINARY PROJECTION OF THE OCCUPATION NUMBERS 𝛩(𝑡𝑖𝑗)

Let us consider now more complex situations such as the case of nonlinear constraints on 𝑡𝑖𝑗. One of the most relevant objects to look at when dealing with complex networks is the degree distribution. It concerns only the existence of links between arbitrary nodes of the graph regardless the number of multiedges between them. In the framework presented in this paper it can be worked out as a function of the binary projection of the occupation numbers on the graph so, in general, such constraints can be expressed as

ˆ𝐶𝑞=𝑖𝑗̂𝑐(𝑖𝑗)𝑞𝛩(𝑡𝑖𝑗).
(26)

The main technical difficulty in dealing with these types of constraints is that they do not allow the summation in a multinomial form of the terms in {𝑡𝑖𝑗} inside the integral of the partition function for exactly fixed 𝑇. A workaround to perform the summation can be found, however: Instead of making the multinomial sum at once, we proceed in two steps. We first introduce a Kronecker delta in integral form inside the integral of the partition function  with associated Lagrange multiplier 𝜃 which fixes the total number of events. We secondly allow the sum inside the integral to cover all the available values of the phase space ( {𝑡𝑖𝑗[0,𝑇]𝑖,𝑗}. Finally, since the grouping of terms in the sum not fulfilling the constraints (including the constraint on the total expected number of events) will be penalized by the Kronecker deltas introduced earlier, we relax the limit on the sum over individual occupation number configurations from 𝑇 what reminds the standard approach to the grandcanonical ensemble (in Appendix the calculation with finite 𝑇 is also given).

Proceeding as explained we obtain a new version of Eq. :

𝛺=𝐷[𝜃]𝑒𝜃𝑇𝑒𝑞𝜆𝑞ˆ𝐶𝑞(𝑞𝐷[𝜆𝑞])𝑇!{𝑖,𝑗}𝑒(𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞)𝛩(𝑡𝑖𝑗)𝑒(𝑖𝑗+𝜃)𝑡𝑖𝑗𝑡𝑖𝑗!=𝐷[𝜃]𝑒𝜃𝑇𝑒𝑞𝜆𝑞ˆ𝐶𝑞(𝑞𝐷[𝜆𝑞])𝑇![𝑒(𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞)𝑇𝑡𝑖𝑗=1𝑒(𝑖𝑗+𝜃)𝑡𝑖𝑗𝑡𝑖𝑗!+10!]𝑡𝑖𝑗=𝑇𝑡𝑖𝑗(𝑒𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞)𝛩(𝑡𝑖𝑗)(𝑒𝑖𝑗+𝜃)𝑡𝑖𝑗𝑡𝑖𝑗!=𝐷[𝜃]𝑒𝜃𝑇𝑒𝑞𝜆𝑞ˆ𝐶𝑞(𝑞𝐷[𝜆𝑞])𝑇!exp{𝑖𝑗ln{𝑒𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝑖𝑗+𝜃1)+1}},
(27)

where we have introduced auxiliary fields 𝑏𝑖𝑗 for the binary projections of the occupation numbers. Both the complete set of cumulants of the occupation numbers and their binary projections can now be recovered by differentiation. For the occupation numbers using  and in the case of the binary projection,

𝛩(𝑡𝑖𝑗)=𝜕𝑏𝑖𝑗ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=𝑏𝑖𝑗=0𝑖,𝑗,𝜎2𝛩(𝑡𝑖𝑗)=𝜕𝑏2𝑖𝑗ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=𝑏𝑖𝑗=0𝑖,𝑗,𝜎2𝛩(𝑡𝑖𝑗),𝛩(𝑡𝑘𝑙)=𝜕𝑏𝑖𝑗𝑏𝑘𝑙ln𝛺({𝑡𝑖𝑗})|𝑖𝑗=𝑏𝑖𝑗=0𝑖,𝑗.
(28)

From expression  one can compute explicitly those values yielding

𝛩(𝑡𝑖𝑗)ˆ𝑝𝑖𝑗=𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝜃1)𝛥𝑖𝑗,𝑡𝑖𝑗=𝑒𝑒𝜃𝑒𝜃𝑒𝑒𝜃1ˆ𝑝𝑖𝑗𝑡+ˆ𝑝𝑖𝑗,𝜎2𝛩(𝑡𝑖𝑗)=ˆ𝑝𝑖𝑗(1ˆ𝑝𝑖𝑗);𝜎2𝑡𝑖𝑗=𝑡𝑖𝑗(1+𝑒𝜃𝑡𝑖𝑗),𝜎2𝑡𝑖𝑗,𝑡𝑘𝑙=𝜎2𝛩(𝑡𝑖𝑗),𝛩(𝑡𝑘𝑙)=0if𝑖𝑗𝑘𝑙,𝛥𝑖𝑗𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝜃1)+1.
(29)

For ease in notation, we perform the change 𝜌𝑒𝜃 and we identify 𝛩(𝑡𝑖𝑗) with the probability ˆ𝑝𝑖𝑗 of connection of nodes 𝑖 and 𝑗 and 𝑡+ with the graph-average occupation of existing links. In Appendix we prove that the resulting partition function is equal to the cumulant generating function of the outcome of 𝑁(𝑁1) independent zero-inflated Poisson processes (ZIP ) with individual associated probability,

𝑃(𝑡|ˆ𝑝,𝜌)=(1ˆ𝑝)1𝛩(𝑡)(ˆ𝑝𝑒𝜌1𝜌𝑡𝑡!)𝛩(𝑡),
(30)

which in turn, regarding only the binary projection, corresponds to the outcome of independent Bernoulli processes with probabilities ˆ𝑝,

𝑃(𝛩(𝑡)|𝑝)=ˆ𝑝𝛩(𝑡)(1ˆ𝑝)1𝛩(𝑡).
(31)

From  one sees that in this case the multiedge structure is completely determined by the binary constrained topology. Our coarse-grained description in terms of independent occupation numbers implies that for each state, two outcomes can be considered: Either the edge does not exist (obviously with 0 occupation) or it does exist, in which case the resulting (conditioned) statistics being Poisson with mean value 𝑡|𝑡1=𝜌𝑒𝜌𝑒𝜌1.

The constant relation of proportionality 𝑡𝑖𝑗ˆ𝑝𝑖𝑗 rapidly allows one to identify the graph-average occupation of the expected existing links 𝐸=𝑖𝑗ˆ𝑝𝑖𝑗,

𝑡+=𝑇𝐸=𝑒𝜌𝜌𝑒𝜌1>1;𝜌>0,
(32)

which can be inverted leading to

𝜌=𝑊(𝑒𝑡+𝑡+)+𝑡+,
(33)

where 𝑊(𝑥) is the Lambert 𝑊 function . Figure  shows a plot of Eq.  stressing the rapid asymptotical convergence 𝜌𝑡+ as 𝑡+ (in fact, the approximation is clearly good as soon as 𝑡+5).

FIG. 2

The result of Eq.  is shown together with the equality line 𝜌=𝑡+. One can see the rapid convergence: For 𝑡+=2.31 we obtain 𝜌𝑡+=0.9 and for 𝑡+=4.615 one finds 𝜌𝑡+=0.99.

With  and  we can compute the relative fluctuations of both occupation numbers and binary links in the thermodynamic limit,

lim𝑇𝜎2𝛩(𝑡𝑖𝑗)𝛩(𝑡𝑖𝑗)2=lim𝑇(1ˆ𝑝𝑖𝑗)ˆ𝑝𝑖𝑗=(1ˆ𝑝𝑖𝑗)ˆ𝑝𝑖𝑗,lim𝑇𝜎2𝑡𝑖𝑗𝑡𝑖𝑗2=lim𝑇(1+𝜌𝑡+ˆ𝑝𝑖𝑗)𝑡+ˆ𝑝𝑖𝑗=1+𝑡+(1ˆ𝑝𝑖𝑗)𝑡+ˆ𝑝𝑖𝑗,𝜎2𝑡𝑖𝑗𝑡𝑖𝑗2𝜎2𝛩(𝑡𝑖𝑗)𝛩(𝑡𝑖𝑗)2asˆ𝑝𝑖𝑗fixed,𝑡+=𝑇/𝑖𝑗ˆ𝑝𝑖𝑗.
(34)

These expressions reflect the bimodal structure of the state statistics and explains the nonvanishing nature of the relative fluctuations: The variance of the occupation numbers has a maximum for ˆ𝑝𝑖𝑗|max=12(1+𝑡+1)12, vanishes for the absence ( ˆ𝑝0) of an edge and converges to Poisson statistics for edges that always exist ( ˆ𝑝1). The existence of an edge is a binary event, hence the maximum variability corresponds to the draw situation ( 50% chance). In such a case, approximately half of the times a graph is created the considered edge will have (on average) occupation 𝑡+ and the other half occupation 0, generating vast fluctuations on the overall statistics which are caused by the constrained binary structure of the graph.

Concerning the entropy, performing the steepest descent approximation on  to first order as in  we have

𝑆BG𝑇ln𝜌𝑖𝑗ˆ𝑝𝑖𝑗𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞+ln(𝑇!)+𝑖𝑗ln𝛥𝑖𝑗.

Using 𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝜌1)=ˆ𝑝𝑖𝑗1ˆ𝑝𝑖𝑗, Eq. , one is lead to

𝑆BG={𝑖𝑗(ˆ𝑝𝑖𝑗lnˆ𝑝𝑖𝑗+(1ˆ𝑝𝑖𝑗)ln(1ˆ𝑝𝑖𝑗))}{𝑇ln𝜌ln𝑇!ln(𝑒𝜌1)𝐸},𝑆BG=𝑆bin+𝑆dist.
(35)

This expression has two clear contributions: The first term 𝑆bin is the entropy corresponding to the binary constrained topology while the second one 𝑆dist corresponds to the additional multiedge distinguishable structure. In other words, 𝑆bin counts all the possible ways to select 𝐸 states out of a total number 𝑁(𝑁1) while 𝑆dist refers to the possible ways to allocate the 𝑇 events on these 𝐸 surviving states.

The second term in  can be explicitly evaluated in two limiting cases: The dense case which corresponds to the thermodynamic limit and the sparse for which the binary and weighted structure are equal 𝑡+=𝑇/𝐸1 (hence 𝐸=𝐸 is fixed).

Considering the sparse case, one has from  𝑡+1 so 𝜌0 and hence,

lim𝜌0,𝑇𝐸𝑆dist=ln𝐸!,
(36)

which corresponds to the microcanonical counting of configurations coming from valid permutations of distinguishable multiedges over the fixed binary structure of 𝐸 surviving occupied states.

The dense case corresponds to 𝑇, which implies from  𝜌𝑡+=𝑇/𝐸 and ln(𝑇!)𝑇ln𝑇𝑇,

lim𝑇𝑆dist𝑇=ln𝐸,
(37)

which has a Shannon form if we consider 𝑝𝐸1 which would be the probability associated with a multinomial process of sorting 𝑇 events over the surviving 𝐸 binary links with identical probability 𝑝. In this limit, the difference between sorting 𝐸 independent Poisson processes with mean 𝑇/𝐸 and sorting 𝐸 Poisson processes excluding the zero-occupation events is negligible.

In the following, we present some examples to clarify the usage of this new methodology.

A. Fixed 𝐸,𝑇

We start by considering the most simple case where we only fix the total number of events 𝑇=𝑖𝑗𝑡𝑖𝑗 and the total number of existing binary links 𝐸=𝑖𝑗𝛩(𝑡𝑖𝑗) on the network, in analogy to the paradigmatic model of the Erdos-Renyi graph in binary networks .

In this case we introduce two Lagrange multipliers ( 𝜆,𝜃) to fix ( 𝐸,𝑇). Proceeding from Eq. , we readily obtain the saddle point equations,

𝐸=𝑖𝑗𝛩(𝑡𝑖𝑗)=𝑁(𝑁1)𝜒(𝑒𝜌1)𝜒(𝑒𝜌1)+1,𝑇=𝑖𝑗𝑡𝑖𝑗=𝑁(𝑁1)𝜒𝑒𝜌𝜌𝜒(𝑒𝜌1)+1,
(38)

where we identify 𝜒=𝑒𝜆,𝜌=𝑒𝜃. We see that the average occupation numbers and edge existence probability are constant and their average values proportional as expected. Using  we compute the relevant magnitudes,

𝛩(𝑡𝑖𝑗)ˆ𝑝=𝐸𝑁(𝑁1)=Cnt𝑡𝑖𝑗=𝑒𝜌𝜌𝑒𝜌1ˆ𝑝=𝑡+ˆ𝑝=𝑇𝐸ˆ𝑝=𝑇𝑁(𝑁1)=Cnt𝜎2𝑡=𝑇𝐸ˆ𝑝(1+𝜌𝑇𝐸ˆ𝑝)𝜎2𝛩=ˆ𝑝(1ˆ𝑝).
(39)

We recover the binary structure of the well-known Erdös-Renyi graph as a result of the binary projection of a nontrivial multiedge structure, which on average values fulfills 𝑡=𝑇𝐸𝑝=𝑡+𝑝.

Despite the average occupation numbers over the ensemble being equal to the cases in Sec. , the underlying statistic is not. All the nodes (and states) in this case are statistically equivalent and their associated strengths and degrees (incoming and outgoing) are proportional on average (since they are fluctuating quantities not being fixed by the constraints):

𝑠=𝑠=𝑇𝐸𝑘=𝑇𝐸𝑘=𝑇𝐸𝐸𝑁=𝑇𝑁;𝜎2𝑘𝑘2=𝑘1𝑗ˆ𝑝2𝑖𝑗𝑘2;𝜎2𝑠𝑠2=1𝑡+𝑘+𝑗ˆ𝑝(1ˆ𝑝)𝑘2𝜎2𝑘𝑘2as𝑇.
(40)

The entropy is readily computed from  yielding,

𝑆BG=𝑆𝐸𝑅+𝑆dist,𝑆𝐸𝑅=𝑁(𝑁1)(ˆ𝑝lnˆ𝑝+(1ˆ𝑝)ln(1ˆ𝑝)),𝑆dist=ln(𝑇!)𝑇ln(𝑡++𝑊(𝑡+𝑒𝑡+))+𝐸ln(𝑒𝑡++𝑊(𝑡+𝑒𝑡+)1).
(41)

Expression  can also be computed using combinatorial arguments: Consider a process in which one selects 𝐸 states out of 𝑁(𝑁1) and then populates each state with a single event chosen out of a set of 𝑇 distinguishable entities, finally, the rest of the 𝑇𝐸 events are sorted in the 𝐸 surviving states chosen in the first place. The counting of microstates reads

𝛤(𝐸,𝑇,𝑁)=(𝑁(𝑁1)𝐸)𝑇!(𝑇𝐸)!𝐸𝑇𝐸,
(42)

and hence the microcanonical entropy is 𝑆BG=ln𝛤(𝐸,𝑇,𝑁). Here again one recovers the equivalence with  in both the sparse and dense limits considered earlier.

B. Fixed degree sequence 𝑘={(𝑘out,𝑘in)𝑖},𝑇

The next important case to consider is the one where the node binary connectivity of the graph is fixed. Such a situation is specially interesting as our framework permits one to understand binary networks as the projection (for instance, due to partial information or limited resolution) of a process generated by independent agents.

In this case the constraints read

ˆ𝐶𝑖(𝑡𝑖𝑗)=𝑖𝛩(𝑡𝑖𝑗)=𝑘out𝑖,ˆ𝐶𝑗(𝑡𝑖𝑗)=𝑗𝛩(𝑡𝑖𝑗)=𝑘in𝑗.
(43)

So we introduce two sets of Lagrange multipliers ( {𝜆𝑖,𝛽𝑖}) and for expression  we have

𝛺=𝐷[{𝛼𝑖,𝛽𝑗}]exp{𝑖𝛼𝑖𝑘out𝑖𝑗𝛽𝑗𝑘in𝑗}×exp{𝜃𝑇+ln𝑇!+𝑖𝑗ln[𝑒𝛼𝑖+𝛽𝑗+𝑏𝑖𝑗(𝑒𝑒𝑖𝑗+𝜃1)+1]}=𝐷[{𝛼𝑖,𝛽𝑗}]exp𝑔({𝛼𝑖,𝛽𝑗,𝑖𝑗,𝑏𝑖𝑗},𝜃).

And we solve the saddle point equations,

𝜕𝜃𝑔|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑇=𝑖,𝑗𝑥𝑖𝑦𝑗𝑒𝜌𝜌𝑥𝑖𝑦𝑗(𝑒𝜌1)+1,𝜕𝛼𝑖𝑔|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑘out𝑖=𝑗𝑥𝑖𝑦𝑗(𝑒𝜌1)𝑥𝑖𝑦𝑗(𝑒𝜌1)+1,𝜕𝛽𝑗𝑔|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑘in𝑗=𝑖𝑥𝑖𝑦𝑗(𝑒𝜌1)𝑥𝑖𝑦𝑗(𝑒𝜌1)+1,
(44)

where we have identified 𝑥𝑖𝑒𝛼𝑖, 𝑦𝑗𝑒𝛽𝑗, and 𝜌𝑒𝜃. The occupation numbers and binary occupation probability, respectively, read

𝑡𝑖𝑗=𝑥𝑖𝑦𝑗𝑒𝜌𝜌𝑥𝑖𝑦𝑗(𝑒𝜌1)+1,ˆ𝑝𝑖𝑗=𝑥𝑖𝑦𝑗(𝑒𝜌1)𝑥𝑖𝑦𝑗(𝑒𝜌1)+1.
(45)

As expected we recover the proportionality of strengths and degrees for each node under this particular set of constraints.

𝑡𝑖𝑗=ˆ𝑝𝑖𝑗𝑒𝜌𝜌𝑒𝜌1𝑠𝑖=𝑡𝑘𝑖=𝑇𝐸𝑘𝑖.
(46)

Considering only the binary projection of the graph, one gets

ˆ𝑝𝑖𝑗=𝜇𝜅𝑖𝜆𝑗𝜇𝜅𝑖𝜆𝑗+1,
(47)

where we have identified 𝜅𝑖𝑥𝑖, 𝜆𝑗𝑦𝑗, and 𝜇(𝑒𝜌1).

The previous expression corresponds exactly with the expression for the so-called canonical ensemble of the random graph with any given degree distribution , where for fixed 𝑁, the 𝜇 parameter controls the edge density . And since 𝑥𝑖 is a quantity related with node 𝑖 (hence related with 𝑘𝑖), we obtain again the structural correlations of the configuration graph model .

V. MULTIEDGE NETWORK WITH GIVEN LINEAR CONSTRAINTS DEPENDING ON BOTH THE OCCUPATION NUMBERS 𝑡𝑖𝑗 AND THEIR BINARY PROJECTION 𝛩(𝑡𝑖𝑗)

We analyze for the sake of completeness the most general case where both types of considered constraints are fixed,  and .

We introduce two sets of Lagrangian multipliers for the multiedge constraints {𝜃𝑞} and the binary ones {𝜆𝑞}, plus an additional 𝜃 corresponding to the constraint on the total number of events. The procedure then is analogous to the one in the previous section yielding finally from ,

𝛺=𝐷[𝜃]𝑒𝜃𝑇𝑒𝑞𝜆𝑞ˆ𝐶𝑞𝑒𝑞𝜃𝑞𝐶𝑞(𝑞𝐷[𝜃𝑞])×(𝑞𝐷[𝜆𝑞])𝑇!exp{𝑖𝑗ln×{𝑒𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝑖𝑗+𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞1)+1}}.
(48)

Using  and  we obtain the statistic for the occupation numbers and their projections,

𝑡𝑖𝑗=1𝛥𝑖𝑗𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞exp(𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞)𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞,𝜎𝑡𝑖𝑗=𝑡𝑖𝑗(1+𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞𝑡𝑖𝑗),ˆ𝑝𝑖𝑗=1𝛥𝑖𝑗𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞),𝜎𝛩(𝑡𝑖𝑗)=ˆ𝑝𝑖𝑗(1ˆ𝑝𝑖𝑗),𝛥𝑖𝑗=𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞1)+1.
(49)

Assuming that the saddle point equations can be solved, which means that the imposed combination of binary and multiedge constraints is graphical, i.e., 𝑠𝑖𝑘𝑖𝑖[1,𝑁] for the case of fixed strength and degree sequence, for instance, then the probability to obtain a graph can still be written in terms of the Lagrange multipliers as a sum of independent ZIP processes with different parameters,

𝑃(𝐓|{ˆ𝑝𝑖𝑗,𝜇𝑖𝑗})=𝑖,𝑗(1ˆ𝑝𝑖𝑗)1𝛩(𝑡𝑖𝑗){ˆ𝑝𝑖𝑗𝑒𝜇𝑖𝑗1𝜇𝑡𝑖𝑗𝑖𝑗𝑡𝑖𝑗!}𝛩(𝑡𝑖𝑗),
(50)

where 𝜇𝑖𝑗=𝑒𝜃+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞 is a quantity related to the average value of the occupation number of the given link, conditioned that this link exists,

𝑡=1𝑡𝑖𝑗𝑃(𝑡𝑖𝑗|𝜇𝑖𝑗,ˆ𝑝𝑖𝑗,𝑡𝑖𝑗>0)𝑡=1𝑃(𝑡𝑖𝑗|𝜇𝑖𝑗,ˆ𝑝𝑖𝑗,𝑡𝑖𝑗>0)=𝜇𝑖𝑗𝑒𝜇𝑖𝑗𝑒𝜇𝑖𝑗1=𝑡+𝑖𝑗.
(51)

The relative fluctuations on occupation numbers do not vanish in the thermodynamic limit due to the strong constraints imposed by the binary structure. One can still express 𝜇𝑖𝑗(𝑡+𝑖𝑗) using ,

𝜇𝑖𝑗(𝑡+𝑖𝑗)=𝑡+𝑖𝑗+𝑊(𝑡+𝑖𝑗𝑒𝑡+𝑖𝑗).
(52)

In the thermodynamic limit ( 𝑇 which implies 𝑡+𝑖𝑗), expression  converges to 𝜇𝑖𝑗𝑡+𝑖𝑗.

Regarding the relation between expected occupation numbers and their binary projections, one has 𝑡𝑖𝑗=𝑡+𝑖𝑗ˆ𝑝𝑖𝑗 and the constant relation of proportionality is broken 𝑡𝑖𝑗ˆ𝑝𝑖𝑗Cnt. This extends the well-known result that it is impossible to generate uncorrelated networks both at the level of strengths and degrees for multiedge networks or weighted networks .

Concerning the entropy, approximating  by saddle point methods using  we obtain the general expression that includes all the previous cases considered,

𝑆BG=𝑆dist+𝑆bin,𝑆dist=ln𝑇!+𝑖𝑗ˆ𝑝𝑖𝑗ln(𝑒𝜇𝑖𝑗1)𝑖𝑗𝑡𝑖𝑗ln𝜇𝑖𝑗,
(53)

where 𝑆bin is still the binary contribution to the entropy [with the same form as in ].

The two limiting cases early considered can again be evaluated. The sparse case implies that 𝑡+𝑖𝑗=𝑇𝐸1𝑖,𝑗 which means 𝜇𝑖𝑗0 and 𝑡𝑖𝑗ˆ𝑝𝑖𝑗 obtaining

lim𝑇𝐸𝑆dist=ln𝐸!,
(54)

which is identical to the previous one (since it is equivalent to dropping the strength constraints).

For the thermodynamic limit (dense case), we have 𝑒𝜇𝑖𝑗1𝑒𝜇𝑖𝑗𝑒𝑡+𝑖𝑗 and then,

lim𝑇𝑆dist𝑇=𝑖𝑗𝑡𝑖𝑗𝑇ln𝑡+𝑖𝑗𝑇=𝑖𝑗𝑡𝑖𝑗𝑇ln𝑡𝑖𝑗𝑇ˆ𝑝𝑖𝑗,
(55)

for which the previous expressions encountered are limiting cases. On one hand, if we relax the constraints on the occupation numbers, then 𝑡+𝑖𝑗=𝑡+=𝑇/𝐸𝑖,𝑗 and we recover expression . On the other hand, not fixing any binary related quantity implies that as 𝑇, ˆ𝑝𝑖𝑗=1𝑒𝑡𝑖𝑗1 (fully connected topology) and we are lead to . Finally, not fixing any constraints, 𝑡+𝑖𝑗=𝑡𝑖𝑗=𝑡, we recover .

For simplicity, the only example we report in this section corresponds to the very relevant case in which both strength and degree sequences are fixed, the rest of the cases being easily derivable from the general theory exposed.

A. Fixed relative strength sequence and fixed degree sequence 𝑠={(𝑠out,𝑠in)𝑖},𝑘={(𝑘out,𝑘in)𝑖}

We analyze here a situation where the constraints imposed are the strength and degree sequence [ and ]. The calculations are analogous to the previous section obtaining

𝜕𝜃|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑇=𝑖𝑗1𝛥𝑖𝑗𝑥𝑖𝑦𝑗𝑒𝑧𝑖𝑤𝑗𝑒𝜃𝑒𝜃𝑧𝑖𝑤𝑗,𝜕𝛼𝑖|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑘out𝑖=𝑗1𝛥𝑖𝑗𝑥𝑖𝑦𝑗(𝑒𝑧𝑖𝑤𝑗𝑒𝜃1),𝜕𝛽𝑗|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑘in𝑗=𝑖1𝛥𝑖𝑗𝑥𝑖𝑦𝑗(𝑒𝑧𝑖𝑤𝑗𝑒𝜃1),𝜕𝛾𝑖|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑠out𝑖=𝑗1𝛥𝑖𝑗𝑥𝑖𝑦𝑗𝑧𝑖𝑤𝑗𝑒𝑧𝑖𝑤𝑗𝑒𝜃𝑒𝜃,𝜕𝜀𝑗|𝑖𝑗=𝑏𝑖𝑗=0𝑖𝑗=0𝑠in𝑗=𝑖1𝛥𝑖𝑗𝑥𝑖𝑦𝑗𝑧𝑖𝑤𝑗𝑒𝑧𝑖𝑤𝑗𝑒𝜃𝑒𝜃,𝛥𝑖𝑗𝑥𝑖𝑦𝑗{exp(𝑒𝜃𝑧𝑖𝑤𝑗)1}+1,

where we have identified 𝑥𝑖𝑒𝜆(𝑘out)𝑖,𝑦𝑗𝑒𝜆(𝑘in)𝑗,𝑧𝑖𝑒𝜃(𝑠out)𝑖,𝑤𝑗𝑒𝜃(𝑠in)𝑗 corresponding to the 4𝑁 Lagrange multipliers introduced. We hence obtain the saddle point equations,

𝑡𝑖𝑗=1𝛥𝑖𝑗𝑥𝑖𝑦𝑗exp(𝑒𝜃𝑧𝑖𝑤𝑗)𝑧𝑖𝑤𝑗𝑒𝜃,𝜎𝑡𝑖𝑗=𝑡𝑖𝑗(𝑧𝑖𝑤𝑗𝑒𝜃+1𝑡𝑖𝑗),ˆ𝑝𝑖𝑗=1𝛥𝑖𝑗𝑥𝑖𝑦𝑗(𝑒𝑧𝑖𝑤𝑗𝑒𝜃1),𝜎𝛩(𝑡𝑖𝑗)=ˆ𝑝𝑖𝑗(1ˆ𝑝𝑖𝑗).
(56)

Note that the expressions found in the previous cases are particular examples of this general problem and can be readily recovered by removing the appropriate constraints, i.e., making the Lagrange multipliers equal to zero, which in this case is equivalent to setting 𝑥𝑖=𝑦𝑗=1𝑖,𝑗 or 𝑧𝑖=𝑤𝑗=1𝑖,𝑗 or both.

We can revisit the case where only the strength sequence is fixed. Now, although the resulting statistics are Poisson and not multinomial, it can be proved that in the thermodynamic limit both descriptions are equivalent (see Appendix ). Additionally, in such a case one can obtain the statistics of the binary projection of the occupation numbers ˆ𝑝𝑖𝑗=1𝑒𝑡𝑖𝑗.

Unfortunately, the explicit form of the Lagrange multipliers for the degrees or the strengths cannot be solved, since the uncorrelated approximation is no longer valid,

𝑥𝑖𝑦𝑗(𝑒𝑒𝜃𝑧𝑖𝑤𝑗1)𝑒𝑒𝜃𝑧𝑖𝑤𝑗𝑥𝑖𝑦𝑗,if𝑥𝑖𝑦𝑗𝑒𝑒𝜃𝑧𝑖𝑤𝑗1ˆ𝑝𝑖𝑗𝑥𝑖𝑦𝑗𝑒𝑒𝜃𝑧𝑖𝑤𝑗.
(57)

In this last expression the factorization of the connection probability in two node-dependent magnitudes is impossible, despite the approximation assumed. Hence one sees again that there is no way of generating uncorrelated networks at the level of degrees under the strict set of constraints considered.

VI. CONCLUSIONS

The present work deals with the statistical framework of multiedge networks, which in contrast to weighted networks, are entities formed by distinguishable events that can be grouped in edges weighted by integer numbers. It introduces flexible analytical null models that provide expectations for the main variables of a system represented in terms of a complex network (the entries of its adjacency matrix) thus complementing the existing studies on weighted networks. The decision upon which model to take depends on the kind of (physical) process generating the network at study.

We have started by properly defining the differences between weighted and multiedge networks based on the distinguishability or not of the elements forming a network. We have then properly set up a framework of multiedge networks in a statistical mechanics approach by defining ensembles of networks fulfilling some given sets of general constraints. Such a framework has been used to obtain analytical expressions considering cases with general constraints depending linearly on the number of events allocated to each edge as well as their binary projections, considering only the existence or not of a given edge. Some common interesting cases have been explicitly developed as examples such as fixed strength sequence, degree sequence, cost, strength sequence plus cost, total number of binary edges, and both strength and degree fixed sequences. Previous results found in the literature have been recovered, specifically the correlations of the configurational model (for binary constraints on the degree sequence) and the absence of correlation between occupation numbers and degree once the degree sequence is fixed among others. Our treatment uncovers explicit relations between the binary occupation probability of an edge and its expected occupation number, which allows one to fully characterize binary networks as projections of multiedge graph instances. These results permit an extensive treatment of the finite size effects present in this kind of network, since they are fully valid both in a dense scenario and intermediate cases.

Furthermore, we have explicitly derived general forms for the probability of obtaining a given graph with given constraints in terms of its adjacency matrix elements statistics, which can be understood as the canonical and grand-canonical description of the considered ensembles. Such expressions permit explicit entropy measures, useful for the analysis of similar systems for data inference . Additionally, the analytical character of the presented methodology allows for the explicit calculation of some network metrics using probabilistic examples such a clustering and strength-degree correlations, for instance. As a complement we have also introduced the main ideas which can lead to the efficient generation of multiedge graphs though its details are left for development in future work.

The applications of the theory developed here can be used in a wide variety of fields, mainly all those network studies applied to data generated by agent-based systems and in particular in the very active transportation research area, where networks are the result of complex, time-dependent, mobility patterns . It also opens the door to extensions of this kind of network to multiple layer scenarios .

1 Please note that the framework we present permits self-edges if one desires; it is just a matter of extending the sums over available states over all values of 𝑖,𝑗 or excluding the term 𝑖𝑗. It also is valid for both directed and undirected networks: One needs to perform such sums only over states for which 𝑗<𝑖 in such cases.

2 In the remainder of the paper, the * signs will be omitted to simplify notation.

3 As done in for the binary case.

This work has been partially supported by the Spanish DGICYT Grant No. FIS2009-13730-C02-01 and by the Generalitat de Catalunya 2009-SGR-00838. O.S. has been supported by the Generalitat de Catalunya through the FI Program.

Appendix A: CUMULANT GENERATING FUNCTIONS

In this section we present the cumulant generating functions for the different models proposed (Bernoulli, multinomial, Poisson, and zero-inflated Poisson) and show that their close relationship to the expressions developed in the main text.

The cumulant generating function of the probability distribution 𝑃() of a variable is defined as 𝐾(,𝑥)=ln𝑀(,𝑥), being 𝑀(,𝑥)=𝑒𝑥 its moment generating function and 𝑥 and auxiliary field. Once 𝐾(,𝑥) is known, all central cumulants 𝜅𝑘 can be obtained by derivation, uniquely determining the distribution.

𝜅𝑘=𝜕𝑘𝐾(,𝑥)|𝑥=0.
(A1)

Note that if we consider the joint distribution of two (or more) independent variables 1,2, being it a product of the individual distributions 𝑃(1),𝑃(2), then 𝑀1,2(1,2,𝑥1,𝑥2)=𝑀1(1,𝑥1)𝑀2(2,𝑥2) and finally its joint cumulant generating function factorizes in the sum 𝐾12(1,2,𝑥1,𝑥2)=𝐾1(1,𝑥1)+𝐾2(2,𝑥2).

Having introduced that, if we start at the microcanonical level [Eq. ] and identify

𝑃({𝑡𝑖𝑗})=𝛺1𝑞𝛿(𝐶𝑞𝐶𝑞({𝑡𝑖𝑗}))={{𝑐}𝑄𝑞𝛿(𝐶𝑞𝐶𝑞({𝑡𝑖𝑗}))}1×𝑞𝛿(𝐶𝑞𝐶𝑞({𝑡𝑖𝑗}))𝐶1𝑞𝛿(𝐶𝑞𝐶𝑞({𝑡𝑖𝑗})),ln𝛺({𝑖𝑗})=ln{𝐶{𝑐}𝑃({𝑡𝑖𝑗})𝑒𝑖𝑗}=ln𝐶+𝐾({𝑡𝑖𝑗},{𝑖𝑗}),
(A2)

we clearly see that

𝜕𝑖𝑗ln𝛺({𝑡𝑖𝑗},{𝑖𝑗})=𝜕𝑖𝑗𝐾({𝑡𝑖𝑗},{𝑖𝑗}),
(A3)

and the relation between both objects is apparent.

Starting at the level where only the number of events 𝑇 is fixed (where all the distributions reduce to a multinomial form with associated probabilities {𝑝𝑖𝑗}), we take Eq.  and perform the saddle point approximation to obtain

ln𝛺𝑇ln𝑖𝑗exp(𝑖𝑗+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞)+𝐹({𝜃𝑞},{𝐶𝑞})=𝐻({𝑖𝑗})+𝐹({𝜃𝑞},{𝐶𝑞}).
(A4)

Despite having additional terms, with regards to differentiation with respect to 𝑖𝑗 (needed to recover the moments of the distribution of 𝑡𝑖𝑗), the form obtained is always of the type 𝐻({𝑖𝑗})=𝑇ln𝜋𝑖𝑗𝑒𝑖𝑗 (since 𝐹 is a function not depending on the auxiliary fields {𝑖𝑗}) and hence the underlying statistic is multinomial because 𝑖𝑗𝜋𝑖𝑗=1.

Considering now only the most general case in Eq.  we have again (adding external fields for the binary projection {𝑏𝑖𝑗}),

ln𝛺𝑖𝑗ln{𝑒𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝑖𝑗+𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞1)+1}+𝐹({𝜃𝑞},{𝜆𝑞},{𝐶𝑞},{𝐶𝑞}).
(A5)

Dropping the constraints on the binary projection, i.e., 𝜆𝑞=𝑏𝑖𝑗=0𝑞,𝑖𝑗 yields

𝐻({𝑖𝑗})=𝑖,𝑗𝑒𝑖𝑗𝜇𝑖𝑗,
(A6)

where we identified 𝜇𝑖𝑗=𝑒𝑞𝜃𝑞𝑐(𝑖𝑗)𝑞. This expression (up to derivation with respect to 𝑖𝑗) has the same form as a sum of independent Poisson cumulant generating functions ( 𝜇𝑒𝑖𝑗).

Dropping the constraints on the multilink nature of the network 𝜃𝑞=𝑖𝑗=0𝑞,𝑖𝑗 (except the one on the total number of multilinks, 𝜃), we have

𝐻({𝑞𝑖𝑗})=𝑖,𝑗{ln(ˆ𝑝𝑖𝑗𝑒𝑏𝑖𝑗+(1ˆ𝑝𝑖𝑗))ln(1ˆ𝑝𝑖𝑗)},
(A7)

where we identified ˆ𝑝𝑖𝑗 from Eq. . And we see that the prior expression is closely related to the cumulant generating function of independent Bernoulli processes (with respect again to derivation on {𝑞𝑖𝑗} terms).

Finally, the most general case can be mapped to a mixed zero-inflated Poisson process as we shall prove: Imagine the outcome of a process in which we sort 𝑁(𝑁1) independent Bernoulli processes and from the result of it, if the outcome is positive, we sort a Poisson process on top of it (discarding the no-occurrence event). Since the processes are independent, we shall consider a single one of them and then write the overall probability of the events as the product of the different probabilities 𝑃(𝑡). The associated probability of the event just described is

𝑃𝑏𝑝(𝑡)=(1𝑝)𝛩(𝑡)(𝑝𝑒𝜇1𝜇𝑡𝑡!)𝛩(𝑡),
(A8)

which represents a probability measure over an integer quantity. We can compute the mean and variance of 𝑡 yielding

𝑡=0+𝑝𝑒𝜇1𝑡=1𝑡𝜇𝑡𝑡!=𝑝𝜇𝑒𝜇𝑒𝜇1,𝜎2𝑡=𝑡(1+𝜇𝑡),𝛩(𝑡)=𝑝;𝜎2𝛩(𝑡)=𝑝(1𝑝).
(A9)

The obtained expressions need to be compared with , which allows one to identify 𝜇=𝑒𝜃𝑞𝑒𝜃𝑞𝑐(𝑖𝑗)𝑞 and 𝑝=ˆ𝑝𝑖𝑗. Moreover, concerning the cumulant generating function, one finds

ln𝑒𝑡=ln𝑡=0𝑃𝑏𝑝(𝑡)𝑒𝑡=ln{1𝑝+𝑝𝑒𝜇1(𝑒𝜇𝑒1)}=ln(1+𝑝)+ln{1+𝑝1𝑝(𝑒𝜇𝑒1)𝑒𝜇1},
(A10)

which is identical to the argument in the sum of Eq.  (except for a linear constant) and captures the more general case considered.

Appendix B: ENSEMBLE EQUIVALENCE AND GRAPH GENERATION

Throughout this paper we have uncovered the mathematical expressions allowing one to generate networks under different ensembles using a probabilistic framework over 𝑡𝑖𝑗. Explicitly they can be summarized:

  • (1) Canonical ensemble (linear constraints on 𝑡𝑖𝑗):

    𝑃(𝐓|𝑇,{𝜃𝑞})=𝑇!𝑖𝑗𝑡𝑖𝑗!𝑖𝑗𝑝𝑡𝑖𝑗𝑖𝑗.
    (B1)
  • (2) Grandcanonical ensemble [linear constraints on 𝛩(𝑡𝑖𝑗) and/or on 𝑡𝑖𝑗]:

    𝑃(𝐓|{ˆ𝑝𝑖𝑗,𝜇𝑖𝑗})=𝑖,𝑗(1ˆ𝑝𝑖𝑗)1𝛩(𝑡𝑖𝑗){ˆ𝑝𝑖𝑗𝑒𝜇𝑖𝑗1𝜇𝑡𝑖𝑗𝑖𝑗𝑡𝑖𝑗!}𝛩(𝑡𝑖𝑗).
    (B2)
  • (3) Microcanonical ensemble. This ensemble can be used by generating sequences of {𝑡𝑖𝑗} using the two above expressions and discarding those not corresponding exactly with the imposed constraints.

We have shown already that the relative fluctuations of the linear constraints on the occupation numbers 𝑡𝑖𝑗 vanish in the thermodynamic limit and that the binary depending constraints are nonvanishing in this limit. We finally prove here that the grand-canonical ensemble considering only linear constraints on occupation numbers is strictly equivalent to the canonical and the microcanonical in the thermodynamic limit.

To do so, we make use of the properties of the multinomial distribution to recover it under the cases where no constraints or only linear constraints on 𝑡𝑖𝑗 are imposed. In this case 𝑝𝑖𝑗=1𝑒𝜇𝑖𝑗  and  reduce to the product of Poisson distributions with different mean parameters 𝜇𝑖𝑗.

The outcome of a process in which we sort different independent Poisson variables of parameters {𝜇𝑖𝑗} can be equivalently expressed as a product of a multinomial process of 𝑇=𝜇𝑖𝑗 multinomial trials with associated probabilities {𝑝𝑖𝑗=𝜇𝑖𝑗𝑇}. Hence we have that

𝑃(𝐓)=Mult({𝑝𝑖𝑗}|𝑇)Pois(𝜆𝑇=𝜇𝑖𝑗).
(B3)

And since the resulting occupation numbers statistics derive to a Poisson distribution, which has vanishing fluctuations on the thermodynamic limit ( 𝜎2𝑡𝑖𝑗/𝜇2𝑖𝑗=𝜇1𝑖𝑗𝑇1, then the equivalence between the two presented calculations is completely proved.

Appendix C: FINITE T IN EQ. 

Let us assume that instead of having 𝑇 in  we consider 𝑇 finite. Therefore,

𝛺=𝑒𝜃𝑇𝑒𝑞𝜆𝑞𝐶𝑞𝐷[𝜃](𝑞𝐷[𝜆𝑞])×𝑇![𝑒𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞𝑇𝑡𝑖𝑗=1𝑒(𝑖𝑗+𝜃)𝑡𝑖𝑗𝑡𝑖𝑗!+10!],×𝑡𝑖𝑗=𝑇𝑡𝑖𝑗(𝑒(𝑏𝑖𝑗+𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞)𝛩(𝑡𝑖𝑗)+𝑖𝑗+𝜃)𝑡𝑖𝑗𝑡𝑖𝑗!,
(C1)

and we now perform the finite sum inside the integral,

𝑇1𝑧𝑡𝑡!=𝛤(𝑇+1,𝑧)𝑒𝑧𝛤(𝑇+1).
(C2)

And for the occupation numbers and edge probability obtain

𝑡𝑖𝑗=1𝛥𝑖𝑗𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞𝑒𝜃(𝑒𝑒𝜃𝛤(𝑇+1,𝑒𝜃)𝛤(𝑇+1)𝑒𝑇𝜃𝛤(𝑇+1)),ˆ𝑝𝑖𝑗=1𝛥𝑖𝑗𝑒𝑞𝜆𝑞̂𝑐(𝑖𝑗)𝑞(𝑒𝑒𝜃𝛤(𝑇+1,𝑒𝜃)𝛤(𝑇+1)1),𝛥𝑖𝑗=𝑒𝑞𝜆𝑞̂𝑐(𝑖,𝑗)𝑞(𝑒𝑒𝜃1)+1,
(C3)

which converge extremely quickly to the obtained results for 𝑇 (see Fig. ).

FIG. 3

𝑇1𝜌𝑡𝑡!𝑒𝜌=𝛤(𝑇+1,𝜌(𝑡+))𝛤(𝑇+1) for different values of 𝑡+ and 𝑇, we observe the extreme rapid convergence to unity as 𝑇 grows (note that 𝑇𝑡+=𝑇/𝐸).

Appendix D: ADDITIONAL EXAMPLES. FIXED BINNED DISTRIBUTION OF COSTS {𝑐𝑛,𝑁𝑐𝑛}

We here report an additional example which may be of interest for studies on transportation origin-destination matrices, where some forms of trip-cost distribution have been discussed .

Starting from Sec. (), it is a matter of considering the additional term,

exp(𝑛𝜅𝑛𝜉𝑛(𝑑𝑖𝑗)),

on the equations, where 𝜅𝑛 are additional Lagrange multipliers satisfying that

𝑁𝑛=𝑖𝑗𝜅𝑛𝜉𝑛(𝑑𝑖𝑗)𝑡𝑖𝑗,

where 𝑁𝑛 is the number of trips whose distance is in the interval [𝑑𝑛1,𝑑𝑛) and 𝜉𝑛 is the indicator function of such an event. The size of the bins needs to be chosen in an appropriate manner as to give consistency to the distribution obtained. Such an example is particularly important for its importance to assess whether an observed O-D is caused by a particular tendency of the agents that create it to move or conversely, the space where the move shapes the form of the obtained O-D.

The expressions of 𝑝𝑖𝑗 in this case read

𝑝𝑖𝑗=𝑒𝑛𝜅𝑛𝜉(𝑑𝑖𝑗)𝑖𝑗𝑒𝑛𝜅𝑛𝜉(𝑑𝑖𝑗).

And the prior considerations are also valid. Note also that keeping the multinomial framework, all the quantities considered are intensive, while the expected strength and average occupation numbers remain extensive variables. Additionally, in the thermodynamic limit these quantities have vanishing relative fluctuations.

References (40)

  1. R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
  2. M. E. J. Newman, Networks: An Introduction (Oxford University Press, Oxford, 2010), p. 720.
  3. S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, Oxford, 2003).
  4. J. Park and M. E. J. Newman, Phys. Rev. E 70, 066117 (2004).
  5. D. Garlaschelli and M. Loffredo, Phys. Rev. Lett. 102, 038701 (2009).
  6. G. Krings, F. Calabrese, C. Ratti, and V. D. Blondel, J. Stat. Mech.: Theory Exp. (2009) L07003.
  7. M. Barthélemy, Phys. Rep. 499, 1 (2011).
  8. P. Kaluza, A. Kölzsch, M. T. Gastner, and B. Blasius, J. R. Soc. Interface R. Soc. 7, 1093 (2010).
  9. C. Roth, S. M. Kang, M. Batty, and M. Barthélemy, PLoS One 6, e15923 (2011).
  10. A. G. Wilson, Transportation Research 1, 253 (1967).
  11. A. Annibale, A. C. C. Coolen, L. P. Fernandes, F. Fraternali, and J. Kleinjung, J. Phys. A. Math. Gen. 42, 485001 (2009).
  12. S. E. Ahnert, D. Garlaschelli, T. M. A. Fink, and G. Caldarelli, Phys. Rev. E 76, 016101 (2007).
  13. F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, Nature (London) 484, 96 (2012).
  14. P. Expert, T. S. Evans, V. D. Blondel, and R. Lambiotte, Proc. Natl. Acad. Sci. 108, 7663 (2011).
  15. P. Holme and J. Saramäki, Phys. Rep. 519, 97 (2012).
  16. D. Garlaschelli and M. I. Loffredo, Phys. Rev. E 78, 015101(R) (2008).
  17. A. Wilson, Geogr. Anal. 42, 364 (2010).
  18. M. Barthélemy, A. Barrat, R. Pastor-Satorras, and A. Vespignani, Physica A 346, 34 (2005).
  19. G. Bianconi, Europhys. Lett. 81, 28005 (2008).
  20. B. M. Waxman, Sel. Areas Commun. IEEE J. 6, 1617 (1988).
  21. A. Bazzani, B. Giorgini, S. Rambaldi, R. Gallotti, and L. Giovannini, J. Stat. Mech.: Theory Exp. (2010) P05001.
  22. X. Liang, X. Zheng, W. Lv, T. Zhu, and K. Xu, Physica A 391, 2135 (2012).
  23. M. A. Serrano and M. Boguna, AIP Conf. Proc. 776, 101 (2005).
  24. M. A. Serrano, M. Boguna, and R. Pastor-Satorras, Phys. Rev. E 74, 055101 (2006).
  25. T. Squartini and D. Garlaschelli, New J. Phys. 13, 083001 (2011).
  26. S. Erlander and N. F. Stewart, The Gravity Model in Transportation Analysis: Theory and Extensions (CRC Press, Boca Raton, 1990).
  27. S. Goh, K. Lee, J. S. Park, and M. Y. Choi, Phys. Rev. E 86, 026102 (2012).
  28. J. Park and M. E. J. Newman, Phys. Rev. E 70, 066117 (2004).
  29. D. Lambert, Technometrics 34, 1 (1992).
  30. R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, Advances in Computational Mathematics (Springer, New York, 1996), pp. 329–359.
  31. G. Bianconi, A. C. C. Coolen, and C. J. Perez Vicente, Phys. Rev. E 78, 016114 (2008).
  32. P. Erdős and A. Rényi, Publications of the Mathematical Institute of the Hungarian Academy of Sciences (Hungarian Academy of Sciences, Budapest, 1960), pp. 17–61.
  33. A. Barrat, M. Barthélemy, R. Pastor-Satorras, and A. Vespignani, Proc. Natl. Acad. Sci. 101, 3747 (2004).
  34. M. Boguñá and R. Pastor-Satorras, Phys. Rev. E 68, 036112 (2003).
  35. K. Anand, G. Bianconi, and S. Severini, Phys. Rev. E 83, 036109 (2011).
  36. Z. Burda and A. Krzywicki, Phys. Rev. E 67, 046118 (2003).
  37. G. Bianconi, P. Pin, and M. Marsili, Proc. Natl. Acad. Sci. 106, 11433 (2009).
  38. P. Colomer-de-Simon and M. Boguñá, Phys. Rev. E 86, 026120 (2012).
  39. G. Bianconi, Phys. Rev. E 87, 062806 (2013).
  40. A. Solé-Ribalta, M. De Domenico, N. E. Kouvaris, A. Díaz-Guilera, S. Gómez, and A. Arenas, Phys. Rev. E 88, 032807 (2013).