Screen reader users, click here to load entire articleThis page uses JavaScript to progressively load the article content as a user scrolls. Screen reader users, click the load entire article button to bypass dynamically loaded article content.
ScienceDirect is phasing out support for older versions of Internet Explorer on Jan 12, 2016. For the best product experience, we recommend you upgrade to a
newer version of IE
or use a different browser: Firefox or Chrome.
For additional information please see the ScienceDirect Blog page.
Close
Quantification of human group-behavior has so far defied an empirical, falsifiable approach. This is due to tremendous difficulties in data acquisition of social systems. Massive multiplayer online games (MMOG) provide a fascinating new way of observing hundreds of thousands of simultaneously socially interacting individuals engaged in virtual economic activities. We have compiled a data set consisting of practically all actions of all players over a period of 3 years from a MMOG played by 300,000 people. This large-scale data set of a socio-economic unit contains all social and economic data from a single and coherent source. Players have to generate a virtual income through economic activities to ‘survive’ and are typically engaged in a multitude of social activities offered within the game. Our analysis of high-frequency log files focuses on three types of social networks, and tests a series of social-dynamics hypotheses. In particular we study the structure and dynamics of friend-, enemy- and communication networks. We find striking differences in topological structure between positive (friend) and negative (enemy) tie networks. All networks confirm the recently observed phenomenon of network densification. We propose two approximate social laws in communication networks, the first expressing betweenness centrality as the inverse square of the overlap, the second relating communication strength to the cube of the overlap. These empirical laws provide strong quantitative evidence for the Weak ties hypothesis of Granovetter. Further, the analysis of triad significance profiles validates well-established assertions from social balance theory. We find overrepresentation (underrepresentation) of complete (incomplete) triads in networks of positive ties, and vice versa for networks of negative ties. Empirical transition probabilities between triad classes provide evidence for triadic closure with extraordinarily high precision. For the first time we provide empirical results for large-scale networks of negative social ties. Whenever possible we compare our findings with data from non-virtual human groups and provide further evidence that online game communities serve as a valid model for a wide class of human societies. With this setup we demonstrate the feasibility for establishing a ‘socio-economic laboratory’ which allows to operate at levels of precision approaching those of the natural sciences.
All data used in this study is fully anonymized; the authors have the written consent to publish from the legal department of the Medical University of Vienna.
Keywords
Social network analysis;
Network theory;
Massive multiplayer online game;
Social balance;
Triadic closure;
Quantitative social science
1. Introduction
Quantification of collective human behavior or social dynamics poses a unique, century old challenge. It is remarkable to some extent that mankind knows more about dynamics of subatomic particles than it knows about the dynamics of human groups. The reason for this situation is that the establishment of a fully experimental and falsifiable social science of group dynamics is tremendously complicated by two factors: First, unlike many problems in the natural sciences, dynamics of societies constitute a complex system, characterized by strong and long-range interactions, which are in general not treatable by traditional mathematical methods and physical concepts. Second, data is of comparably poor availability and quality ( Watts, 2007 and Lazer et al., 2009). Evidently it is much harder to obtain data from social systems than from repeatable experiments on (non-complex) physical systems. Despite these severe problems, it is nevertheless paramount to arrive at a better understanding of collective human behavior. Only recently it became most evident in the context of economics and finance, which costs are associated to misconceptions of human collective behavior. If the dynamics behind collective behavior are going to remain as poorly understood as they are today, without being able to generate statements with predictive value, any attempts of managing crises will turn out not a whit better than illusionary.
Many complex systems cannot be understood without their surroundings, contexts or boundaries, together with the interactions between these boundaries and the system itself. This is obviously necessary for measuring large-scale dynamics of human groups. Regarding data acquisition it is therefore essential not only to record decisions of individual humans but also the simultaneous state of their surroundings. Further, in any data-driven science the observed system should not be significantly perturbed through the act of measurement. In social science experiments subjects usually are fully aware of being observed—a fact that might strongly influence their behavior. Finally, data acquisition in the social sciences becomes especially tiresome on group levels, see, e.g. Newcomb (1961). Traditional methods of social science such as interviews and questionnaires do not only need a lot of time and resources to deliver statistically meaningful assertions, but may introduce well-known biases ( Carrington et al., 2005). To many it might seem clear that social sciences can not overcome these problems, and that therefore social sciences would always remain on a lower quantitative and qualitative level than the natural sciences.
Both issues, the availability of data, and the possibility to take simultaneous measurements on subjects and their surroundings, might appear in a radically more positive light when looking at massive multiplayer online games (MMOGs) (Castronova, 2005). Such computer games not only allow to conduct complete measurements of socially interacting humans, they also provide data at rates comparable to physical experiments. Remarkably, one of the largest collective human activities on the planet is the playing of online games. Currently more than a hundred million people worldwide play MMOGs—the well-known game World of Warcraft alone has more than 10 million subscribers as of today. MMOGs exhibit such an enormous success due to offering their players possibilities to experience alternative or second lifes, not only providing (virtual) economic opportunities, but also a huge variety of possible social interactions among players. Many MMOGs provide rich virtual environments facilitating socialization and interactions on group levels ( Yee, 2006a, Yee, 2006b and Castronova, 2005). Motivation of players to participate in MMOGs are highly heterogeneous, ranging from establishing friendships, gain of respect and status within the virtual society, to the fun of destroying the hard work of other players. Besides economical and social interactions, modern MMOGs also offer a component of exploration, e.g. players can explore their ‘physical’ environment, such as specific features of their universe, ‘biological’ details of space-monsters, etc., and share their findings within ‘specialist’ communities.
From a scientific point of view online games provide a tool for understanding collective human phenomena and social dynamics on an entirely different scale (Bainbridge, 2007 and Castronova, 2006). In these games all information about all actions taken by all players can be easily recorded and stored in log-files at practically no cost. This quantity of data has been unthinkable in the traditional social sciences where sample sizes often do not exceed several dozens of questionnaires, school classes or students in behavioral experiments. In MMOGs on the other hand, the number of subjects can reach several hundred thousands, with millions of recorded actions. These actions of individual players are known in conjunction with their surroundings, i.e. the circumstances under which particular actions or decisions were taken. This offers the unique opportunity to study a complex social system: conditions under which individuals take decisions can in principle be controlled, the specific outcomes of decisions can be measured. In this respect social science is on the verge of becoming a fully experimental science ( Lazer et al., 2009) which should increasingly become capable of making a great number of repeatable and eventually falsifiable statements about collective human behavior, both in a social and economical context.
Another advantage over traditional ways of data acquisition in the social sciences is that players of MMOGs do not consciously notice the measurement process.1 These ‘social experiments’ practically do not perturb or influence the sample. Moreover MMOGs not only open ways to explore sociological questions, but – if economic aspects are part of the game (as it is in many MMOGs) – also to study economical behavior of groups. Here again economical actions and decisions can be monitored for a huge number of individual players within their social and economical contexts. This means that MMOGs offer a natural environment to conduct behavioral economics experiments, which have been of great interest in numerous small-scale surveys, see, e.g. Gächter and Fehr (1999) and Henrich et al. (2005). It becomes possible to study the socio-economic unit of large online game societies.
In the past years we have recorded practically all actions of all players taken in the self-developed, proprietary MMOG Pardus which is online since 2004. Pardus is an open-ended game with a worldwide player base of more than 300,000 people. Players reside and act within a virtual, persistent futuristic universe and make up their own goals. Most players invent and develop their virtual social lifes without constraints by the game setup. The game’s environmental topology is given but can be manipulated by the players to some extent. Players self-organize within groups and subgroups, claim territories, decide to go to war, etc., completely on their own accounts. Players typically participate in the game for several weeks to several years.
Players of Pardus characteristically engage in various economic activities to increase their wealth (non-convertible game money): There are numerous possibilities for jobs, such as mining and processing basic resources from the environment, trade, production, assembly and consumption of commodities, etc. Economic life is embedded in a production tree which provides a basic framework for player-created industries. Trade occurs following simple ‘rules’ within dynamic and demand-oriented virtual markets constituted by groups of players. Social life within Pardus is based on means of communication with fellow players in various forms, such as chat, forum, private messages, which allow the establishment of e.g. friendships or hostile relations. There are a number of ways to publicly display one’s ‘status’ within the virtual society: Purchase of expensive status symbols, such as space ships, earning of medals of honor for war efforts or for defeating outlaws, etc. These possibilities are not only well used, but constitute an important psychological driving force for many players.
Given the complete data set from the Pardus game, here we follow two major directions of research.
1.1. Network analysis
It is possible to directly access the dynamics of several types of social networks such as dynamics of friend networks, networks of enemies, or communication networks. Especially the latter offer a fantastic way to directly relate findings in the game with real-world communication networks, such as a data set of cell phone calls which has been recently analyzed from a network perspective (Onnela et al., 2007 and Lambiotte et al., 2008). While there exists some insight into real-world friend networks in the literature, e.g. of the Facebook community (Golder et al., 2007), there is practically no knowledge of topology and dynamics of enemy networks (Labianca and Brass, 2006). Since the time resolution of our data is accurate to 1 s, it becomes possible to study time courses of global network properties. This way it can be understood if and how communities show aging effects, such as densification, i.e. shrinking diameters and growing average degrees. This phenomenon has been observed in societies and online communities ( Leskovec et al., 2007 and Leskovec et al., 2008), as well as in the evolution of scientific fields and cities ( Bettencourt et al., 2007 and Bettencourt et al., 2008). The vast majority of social network studies analyze single or at best small numbers of network snapshots. Important exceptions include time resolved studies of an internet dating community ( Holme et al., 2004), the analysis of a university email network ( Kossinets and Watts, 2006), of the web of scientific coauthorships ( Ravasz, 2004, Newman, 2001 and Newman, 2004), as well as several large-scale networks of various types ( Leskovec et al., 2007).
For quantification purposes we employ network measures such as betweenness centrality ( Freeman, 1977) and overlap which measures how often a given pair of nodes has links to other common nodes ( Onnela et al., 2007). To our knowledge, no longitudinal measurements of large-scale signed networks exist as of today. One well-known social network study on monks in a monastery can be found in the classic literature ( Sampson, 1968), as well as a modern long-time survey of social dynamics in classrooms ( Jordán, 2009). These are first attempts of systematic social balance experiments, however being far from conclusive due to limited data and small scales (10–100 nodes), and a low number of samples (about ≈10 observations). Further, the extent of reciprocity and Triad significance profiles ( Milo et al., 2004) together with their dynamics can be directly accessed from the game data. For the quantification of these concepts we use recent technology developed in the context of motif distributions. To understand microscopic changes in social network dynamics, transition rates between dyadic and triadic structures can be measured—yielding parameters needed, e.g. for calibrating agent-based models of social network dynamics, as e.g. in Antal et al. (2006). So far these transition rates could only be assumed by model builders and have never been measured in actual societies.
Further research directions may include economic analysis, e.g. comparison of the Pardus economy to real economies (Yakovenko, 2009, Dragulescu and Yakovenko, 2001, Chatterjee et al., 2007 and Cont, 2001), focus on co-evolving networks (Biely et al., 2007 and Biely et al., 2009), as well as group formation dynamics or gender and country aspects. In (Szell et al., 2010) multi-relational (multiplex) aspects of social networks were studied. It was shown that relations driven by aggression lead to markedly different systemic characteristics than relations of non-aggressive nature. Network–network interactions reveal a non-trivial structure of this multi-dimensionality, and how humans play very different roles in different relational networks.
It is not obvious a priori that a population of online players is a representative sample of real-world societies ( Williams et al., 2009). However, several recent studies are providing evidence that human behavior on a collective level is remarkably robust, meaning that statistical differences of real-world communities and game-societies are often marginal ( Johnson et al., 2009 and Jiang et al., 2009).
The paper is organized as follows. In Section 2 we present the game, describe the sample of players and explain their modes of communication. We introduce the three types of social networks studied. Section 3 contains the network measures used in our analyses. Results are presented in Section 4 and are discussed in Section 5. Finally we conclude in Section 6.
2. The game
2.1. Overview
Pardus (http://www.pardus.at) is a browser-based MMOG in a science-fiction setting, open to the public and played since September 2004. A browser-based MMOG is characterized by a substantial number of users playing together in the same virtual environment connected by an internet browser. For a detailed categorization of online games see Bartle (2004) and Castronova (2005).
In Pardus every player owns an account with one character per game universe; players are forbidden to operate multiple accounts. A character is a pilot owning a spacecraft with a certain cargo capacity, roaming the virtual universe trading commodities, socializing, and much more, ‘to gain wealth and fame in space’ (http://www.pardus.at/index.php?section=about). The main component of Pardus consists of trade simulation with a society of players heavily driven by social factors such as friendship, cooperation or competition. There is no explicit ‘winning’ in Pardus as there is no inherent set of goals nor allowed or forbidden ‘moves’ (with a few exceptions mainly concerning decent language and behavior towards fellow players). Pardus is a virtual world or synthetic world with a gameplay based on socializing and role-playing, with interaction of player characters with others and with non-player characters as its core elements ( Castronova, 2005).
There exist three separate game universes: Orion, Artemis, and Pegasus. Presently Pardus is actively played by ≈14,000 players, over 300,000 have registered so far. Orion and Artemis each inhabit ≈6500 active characters, Pegasus with its ≈1400 characters is for paying customers only. We count existing characters who have mastered the game’s tutorial environment as active (characters which are inactive for 120 days get deleted automatically, see Section 2.3). Fig. 1 depicts the evolution of universe populations for the time range where data is available (see Section 2.2). The majority plays the game for free, paying members receive Premium accounts which bestow them with additional features not available to users with Free account status (such as the possibility of character creation in the Pegasus game universe). Orion was opened on September 14th 2004, Artemis and Pegasus 1000 days later, on June 10th 2007. Between universes it is impossible to move, trade, or exchange game money. The universes are independent. 2
Fig. 1.
Evolution of number of active characters in the game universes. The large increase of players in Orion between days ≈800 and 1000 is due to ad campaigns after October 2006. At day 1000 (dotted line) Artemis and Pegasus were opened. Some thousand players abandoned their Orion characters focusing on their new Artemis characters. This explains the mirrored development in these two universes after day 1000.
Daily database backups recorded at 05:32 GMT3 are available from 2005-09-09 to 2008-09-01. The day 2005-09-09 is the 360th after 2004-09-14, i.e. the 360th day after Orion was opened. Backups from the following dates are not available due to unknown reasons: 2006-03-24 to 2006-04-23, 2006-04-28 to 2006-04-30, 2006-10-24 to 2006-10-26, 2007-03-20, 2007-05-10, 2007-09-21, 2008-02-09, 2008-06-09. Since we have complete data only for Artemis and Pegasus (with the exception of 3 days), and because Artemis has more active characters than Pegasus, in this work we refer only to Artemis data. Results are remarkably robust between universes. For clarity only weekly data points are shown in all time evolution plots, except for Fig. 8 and Fig. 12.
2.3. Players and census of characters
2.3.1. Age and nationality
In a poll taken in the Pardus forums in the beginning of 2005 the age of players was assessed. From a total of 255 votes, 5% reported their age to be less than 15 years, 18% between 15 and 19, 34% between 20 and 24, 23% between 25 and 29, and 20% are older than 29 years. The distribution of player nationalities can be estimated by technical means and reads approximately as follows: United States 40%, United Kingdom 14%, Canada 5%, Austria 4%, Germany 4%, Australia 4%, and other 29%.
2.3.2. Lifetimes of characters
Characters are automatically deleted after an inactivity (not logging in) period of 120 days. Additionally, every player has the option to delete her account or characters at any time. Rarely, it happens that accounts get deleted due to breaking of game rules, such as the operation of multiple accounts. We call all deletions which are not due to inactivity self-induced . Fig. 2 shows the cumulative distribution of character lifetimes (in days) of all 16,980 characters who existed for at least 1 day, but not on the last one. If a character’s lifetime lies before day 120 (dotted line), her deletion could have been self-induced only. If a character’s lifetime is longer than 120 days, her deletion was either self-induced or automatic due to inactivity. These two regimes are evident in the figure: the regime of only self-induced deletion follows a power-law with exponent γ=−0.063. In the second regime, the distribution is neither a power-law nor an exponential probably due to the overlay of the two deletion schemes.
Fig. 2.
Cumulative distribution of character lifetimes (in days) of all 16,980 characters who existed at least for 1 day but not on the last one. The dotted line marks day 120. The inset shows the distribution of lifetimes of <120 days in a double-logarithmic scale.
Of all characters 7.6% have a lifetime of 0 days, i.e. they delete themselves on the first day of their existence. At least 31.4% of all deletions are self-induced, and ≈13% of all characters become inactive after their first day.
2.3.3. Gender of characters
When signing up for the first time, players have to choose between a male and female character; this decision is irrevocable. Depending on gender a male or female avatar (profile-like picture) of characters is displayed in certain places in the game. In the Artemis universe, ≈90% of all characters are male.
2.4. Structure of the universe
Space in Pardus is two-dimensional. Each game universe is divided into 400 sectors , each sector consisting of 15×15fields on average. Fields are the smallest units of space and are displayed as quadratic images in-game. They form a square grid on which continuous ship movement is possible by clicking on the desired destination field within the space chart . This chart is a 7×7 fields cut-out of the universe visible to every player with their current position located on the central field, see Fig. 3. A sector’s boundary is impenetrable; moving between nearby sectors is possible by tunneling through field objects called wormholes . A collection of nearby ≈20 sectors is called a cluster. The typical spatial range of activities of a character is usually confined to one cluster for several weeks or longer.
Fig. 3.
Space chart. A cut-out of 7×7 fields of the universe (grid lines) visible to every player on his navigation screen; the current position is the central field. By clicking on another field, the ship moves to that location.
Every game action carried out by a player (trade, travel, etc.) costs a certain amount of so-called action points (APs). These points can not exceed a maximum of 6100 APs per character. For characters owning less APs than their maximum, every 6 min 24 APs are automatically regenerated, i.e. 5760 APs per day. Once a player’s character is out of APs, she has to wait for being able to play on. As a result the typical Pardus player logs in once a day to spend all her APs on several activities within a few minutes (for each character/universe). This makes APs, the game’s unit of time, the most valuable factor: Those players who use their APs most efficiently can experience the fastest progress or earn the highest profits. Social activities such as chatting (see Section 2.7) do not consume APs. Highly involved players usually spend a lot more real time on the game’s features of socialization as well as on planning and coordinating their future moves than on actually spending their APs.
2.6. Trade and industry
The currency of game money within Pardus is the so-called credit. This money is not convertible to real money. Every player starts her life with 5000 credits. Since most assets needed for making progress – such as ships, ship equipment, buildings – are traded in credits, it is of basic interest to earn money during the game (the richest players currently possess hundreds of millions of credits). There exist a number of possibilities to do this, usually through participation in the economy. Players find themselves strongly encouraged to take part in the struggles of economic life, as known from the real world: collaboration, competition, cartelization, fraud, etc. We will analyze the Pardus economy in detail in a separate work.
2.7. Social interaction
There are three ways for players to communicate inside the game, facilitating social activity. Players may use these facilities independently from game-mechanic states (such as their ship’s location within the universe, their wealth, etc.):
1.
Chat:Pardus offers several built-in chat channels per universe where players can simultaneously communicate with many others. Chat entries scroll up and disappear; thus the chat is well-suited only for temporary talks.
2.
Forum: In the forum, messages, called posts, can consist of several lines and stay for a long time. This enables more thorough discussions than in chat. Posts are organized within threads which correspond to a topic. There are universe-specific subforums as well as global subforums which can be accessed from all universes.
3.
Private message: Within a universe it is possible to send private messages (PMs) to any other player; this action and the PM’s content is only seen by sender and receiver—a system similar to email. When a PM is sent the receiver gets immediately notified in a status bar. PMs always have exactly one recipient. Presently a daily total of ≈10,000 PMs are exchanged within Pardus.
2.8. Friends and enemies
For a small amount of APs, players can mark others as friend or enemy. This can be done for any reason. The marked characters are added to the markers personal friends or enemies list. Additionally, every player has a personal friend of and enemy of list, displaying all players who have marked them as friend or enemy, respectively. When being marked or unmarked as friend or enemy the affected player immediately receives an informatory system message. It is only possible to mark someone either as friend or as enemy, but not both.
We stress that friend/enemy lists and friend of/enemy of lists are completely private, meaning that no one except the marking and marked players have information about ties between them. It is not possible to see second degree neighbors (e.g. friends of friends) or the number of ties another player has. 4 Note that this is in contrast to many online social networking services such as Facebook, where usually second degree neighbors and number of friends are visible. Thereby Pardus’ system does not introduce potentially strong biases concerning accumulation of friends (some users may tend to accumulate friends for the main purpose of increasing their publicly visible number of friends ( Golder et al., 2007)). Our data thus represents a more realistic social situation, in the sense that social ties are not immediately accessible but need to be found out by e.g. communication with or careful observation of others.
Besides character names and online status being displayed on every player’s personal PM contacts page for quick access, the friends and enemies lists serve game-mechanic purposes: friends/enemies are automatically or optionally included/excluded for certain actions. For example, enemies of building owners are not able to use the services offered in the respective places. Note that friend and enemy markings need not necessarily denote affective friendships or enmity, they rather indicate a certain degree of cooperative or uncooperative stance motivated by affective and/or cognitive incentives. However, we assume these two motives to coincide to a great extent, e.g. it seems highly unlikely that someone marked as enemy/friend due to rational considerations at the same time constitutes the affective opposite of friend/enemy within the game (and vice versa). PMs as well as friend and enemy relations can be displayed as networks, Fig. 4, see also Section 3.2.
Fig. 4.
(a) Accumulated PM communications over all 445 days between 78 randomly selected individuals who existed on the first and last day. Link shades of light gray (thin), gray, and black (thick) correspond to 1–10, 11–100 and 101–1000 PMs sent, respectively. (b) Friend (solid) and enemy (dashed) relations on day 445 between the same individuals. See our Youtube channel http://www.youtube.com/user/complexsystemsvienna for animated time evolutions of these networks.
Orion has the longest history but has some missing data at its beginning (see Section 2.2). Orion has converged to social structures and ties which are relatively constant over time, as seen in a number of measures (not shown). In contrast, Artemis and Pegasus can be observed from their start on, and measuring the evolution of social networks from the time of their birth is possible. Here we do not face the problem of the ‘missing past’ (Leskovec et al., 2007). Further, since character creation in Orion and Artemis is free but not so in Pegasus, the former universes display considerably higher fluctuations in player numbers and activity rates than the latter.
3. Networks
3.1. Definitions
3.1.1. Graph
In mathematical terms, networks are described by graphs ( Wasserman and Faust, 1994 and Dorogovtsev and Mendes, 2003). An undirected graph is defined as a pair of sets, the node set containing all nodes ni and the link set containing unordered pairs lij≔{ni,nj} denoting those nodes which are connected by an undirected link (edge). A directed graph (digraph ) has a link set which contains ordered pairs lij≔(ni,nj) marking nodes which are connected by a directed link (arc) going from ni to nj. The expressions N, L denote cardinalities of the respective sets. A graph is called complete if connections between all pairs of nodes exist.
3.1.2. Symmetrization
The symmetrization or reflexive closure of a digraph is constructed as follows: Start with , where is an empty link set, and for all pairs of nodes ni and nj add the undirected link lij to if the directed link or if .
3.1.3. Weighted graph
In unweighted graphs all links are treated equally. A weighted graph is a generalization in which the weight wij of a link lij may take any non-zero real value.
3.1.4. Dyad
A dyad is a (sub)graph consisting of two nodes. A directed dyad can be a null dyad (no links), asymmetric (one link, going in one direction), or mutual (two links, one going in one direction and the other going in the opposite one).
3.1.5. Signed graph
A signed digraph is a pair , where is a digraph and is a sign function assigning each directed link a binary value, e.g. in the context of social networks denoting positive or negative relationship ( Doreian and Mrvar, 1996). We write sij short for σ(lij), and set sij≔0 when lij does not exist.
Every signed digraph has a valency matrix V with entries vij defined as ( Harary et al., 1965):
These entries correspond to null (o) dyads, to dyads with only positive ties (p), to dyads with only negative ties (n), and to dyads with one positive and one negative tie (a for ambivalent relationship), respectively.
3.1.6. Degree
In an undirected graph the degree ki of a node ni is the number of links connecting to it. All ki nodes which are directly linked to ni are called (nearest) neighbors of ni. A node with degree 0 has no neighbors and is called isolated . We denote the average degree of all nodes in a network by . In a directed graph the in-degree of a node ni is the number of its incoming links, the out-degree the number of its outgoing links. We denote the average degree of all nearest neighbors of a node ni by . We denote the average degree of all nearest neighbors of all nodes as a function of degree k by knn(k).
3.1.7. Geodesic
In an undirected graph, the geodesic or shortest path gij of two nodes ni and nj is the smallest number of links one needs to get from ni to nj. If a graph is disconnected, i.e. there exist at least two non-empty sets of non-connected nodes (called components ), geodesics between all nodes of different components are set to ∞. The average geodesic of a random graph is ( Dorogovtsev and Mendes, 2003).
3.1.8. Clustering coefficient
The clustering coefficient Ci of node ni in an undirected graph is the ratio between the number yi of links between its ki neighbors and the number of all possible links ki(ki−1)/2 between them,
The network’s clustering coefficient C is the average over all clustering coefficients, C=(1/N)∑iCi. A random graph’s clustering coefficient Cr is given by ( Dorogovtsev and Mendes, 2003).
3.1.9. Efficiency
Global efficiency of an unweighted network with N nodes is defined as
Global efficiency Eglob can be thought of as a measure how efficiently information is exchanged over a network, given that all nodes are communicating with all other nodes concurrently. Local efficiency Eloc, as a measure of a system’s fault tolerance is defined as
where is the graph of all neighbors of node ni (not containing ni). Both values Eglob and Eloc are in the interval [0,1]. Note that global efficiency is a reasonable approximation for the inverse geodesic in unweighted graphs; local efficiency is a good approximation for the clustering coefficient when most local networks are not sparse (Latora and Marchiori, 2001).
3.1.10. Reciprocity
Reciprocity measures the tendency of individuals to reciprocate connections, i.e. the creation of mutual instead of asymmetric dyads ( Wasserman and Faust, 1994). Following Holme et al. (2004), a naive reciprocity index can be defined by
where L∗ is the number of undirected links in the reflexive closure of the digraph. 5 Values of R=0 and R=1 stand for no mutual dyads and mutual dyads only, respectively. Reciprocity may also be quantified by defining the fraction
where L↔≡2(L−L∗) counts the number of directed links in all mutual dyads of the digraph. Due to conceptual problems with r∗, Garlaschelli and Loffredo (2004), we use the following reciprocity index
with ā≔L/N(N−1) measuring the ratio of observed to possible directed links, and ρmin≔−(ā/1−ā)∈[−1,0] for ā≤1/2 (the expression ρmin makes sense for ā≤1/2. Otherwise it is not possible to have L↔=0). The index ρ allows to distinguish between reciprocal (ρ>0), areciprocal (ρ=0) and antireciprocal (ρ<0) networks. Further, ρ enables a clear ordering of networks independent of link density which is not possible with r∗ ( Garlaschelli and Loffredo, 2004).
3.1.11. Assortativity
Assortative mixing coefficients are the Pearson correlation coefficients of the degrees at either ends of a link (Newman, 2002):
Bars denote averages, kto and kfrom index the (in-, out- or undirected) degrees of nodes at the beginning and end of links, respectively. Following Holme et al. (2004) we measure assortativity rundir in the reflexive closures as well as coefficients for all four combinations of in- and out degrees in the directed networks: rinin, rinout, routin, routout. A positive degree–degree correlation coefficient indicates assortativity, i.e. the tendency of nodes with high (low) degrees connecting to nodes with high (low) degrees, a negative correlation means disassortativity, i.e. the tendency of nodes with high (low) degrees connecting to nodes with low (high) degrees.
3.1.12. Bridge
A bridge is a link which, when removed, increases the amount of disconnected components in the graph by one ( Wasserman and Faust, 1994). A link is a local bridge of degree i if its removal causes its endpoints to have geodesic i ( Granovetter, 1973).
3.1.13. Overlap
Overlap of two neighboring nodes measures the amount of neighbors common to both of them. We adopt the definition used in Onnela et al. (2007),
where mij is the number of neighbors common to both nodes ni and nj. A value of Oij=0(1) corresponds to an empty (identical) common neighborhood of nodes ni and nj.
3.1.14. Betweenness
Link betweenness centrality, short link betweenness or load , is defined for an undirected link lij by
where θef(lij) is the number of geodesics between ne and nf that contain lij, and θef is the total amount of geodesics between ne and nf ( Onnela et al., 2007). Betweenness can be viewed as a measure of traffic if e.g. all pairs of nodes exchange information at the same rate ( Dorogovtsev and Mendes, 2003).
3.1.15. Largest connected component
In graphs with infinitely many nodes one observes the emergence of a giant component when crossing a percolation threshold ( Dorogovtsev and Mendes, 2003). The emerging giant component is the only component holding infinitely many nodes. In finite graphs, we call the component having the highest number of nodes largest connected component . We denote the fraction of nodes being in the largest connected component by .
3.1.16. Triad
A triad is a (sub)graph consisting of three nodes. In a digraph there exist 16 isomorphism classes of triads ( Harary et al., 1965). We adopt the notation of Milo et al. (2002) for the 13 connected classes and label the unconnected ones by a, b and c, see Fig. 5. Within the group of connected triad classes, seven are complete. 6
Fig. 5.
The 16 isomorphism classes of triads and their ids.
The triad significance profile (TSP) is the vector of statistical significances of each connected triad class compared to random networks drawn from the U(X∗+,X+∗,M∗) distribution, i.e. of random networks having identical in/out degrees and equally likely numbers of mutual dyads for each node ( Roberts, 2000 and Milo et al., 2002). Statistical significance of a triad class i is measured by the Z score
where is the frequency of occurrence of the triad class in the considered network, and and are the average frequency of occurrence and the standard deviation in an ensemble of random networks drawn from U(X∗+,X+∗,M∗). The TSP is the normalized vector of all 13 Z scores,
Note that the TSP emphasizes the relative significances of triad classes, constituting an appropriate comparison parameter for networks of arbitrary sizes ( Milo et al., 2002).
3.2. Network extraction
We represent all measured networks as digraphs with nodes representing characters. Note that we do not consider isolated nodes, i.e. characters having no PM communication or friend/enemy relations. In the following we use ‘link’ short for ‘directed link’.
3.2.1. Private messages—communication networks
The first set of networks is extracted by considering all PM communications on a weekly timescale. Within the timeframes [d−6,d] for all days d>6 all PMs between all characters (who exist over these timeframes) were used to define the PM network at day d : a weighted link pointing from node ni to node nj is placed if character i has sent at least one PM to character j within a given week. Weights correspond to the total number of PMs sent within this week. Fig. 4(a) illustrates a subgraph of PM networks of accumulated PM communications over all 445 days between 78 randomly selected characters.
3.2.2. Friends and enemies
Friend and enemy markings constitute the second and third sets of extracted networks: A link is placed from ni to nj if character i has marked character j as friend/enemy. Note that friend/enemy markings exist until they are removed by players (or as long as the related players exist), while PM networks are constructed through an accumulating process. Friend and enemy networks are unweighted, since it is not possible mark friends/enemies more than once.
Since links of friend and enemy networks never coincide (it is not possible to mark someone as both friend and enemy), we can consider the union of friend and enemy networks as signed networks. Fig. 4(b) illustrates a subgraph of the friend/enemy network of day 445. Note the intense cliquishness/reciprocity of friends and the strong enemy in-hub. We show below that these features are typical for these networks.
4. Results
In this section we measure social networks as constituted by different relations between players, focusing on the evolution of network properties. We provide a round-up of the most important results in Section 5.
4.1. Testing preferential attachment
The model of preferential attachment (PA) asserts that nodes which link to a network for the first time tend to attach to nodes with high degrees, i.e. to ‘popular’ nodes (Barabási and Albert, 1999). In the extracted directed networks of friends (enemies) we assume this popularity (‘disdain’) being well expressed by the in-degree. We call characters who get connected to a network for the first time newcomers . To test whether evolutions of present networks display a PA bias we measure in-degrees of characters who are marked by newcomers as friend (enemy). Whenever there exists a link lij on day d+1 which has not existed on the previous day d we say that a (1-day) link event has taken place between ni and nj on day d ; we call ni the source and nj the destination of this event.
In the classic model of PA it is assumed that the probability P of a newcomer connecting to an existing node ni with in-degree is P(kin)∝(kin)α with α=1. Fig. 6 shows P(kin) versus kin for friend and enemy networks; all link events between newcomers and their destinations have been used from day 200 to 400. Least squares fits in double-logarithmic scale yield an exponent of α=0.62 for friend markings with kin<30, and α=0.90 for all enemy markings. We observe an increased upward bending for players having in-degrees larger than about 100, i.e. for very popular players. These findings are fully consistent with other game universes and other time ranges (not shown).
Fig. 6.
Empirical probability P(kin) for newcomers connecting to nodes with in-degree kin. Data is used between days 200 and 400. The black line depicts slope α=1 and indicates the linear dependence assumption needed in the PA model. Thick lines denote least squares fits. Values for enemies are vertically displaced by a factor 0.1 for better visibility.
A connection between PM networks and friend/enemy networks can be made visible by partitioning all pairs of characters {ni,nj} into four classes of friend and/or enemy relations, corresponding to the possible valency matrix entries vij of a signed digraph. These classes o,p,n,a correspond to dyads without friend/enemy ties (o ), dyads with asymmetric or mutual friend markings (p ), dyads with asymmetric or mutual enemy markings (n ), dyads with one friend and one enemy marking (a ). We measure the following: Relations of class a almost never appear. The majority (>95%) of PM partners consists of positively related characters (≈40% on the last day) and of characters having no friend or enemy relation (≈58% on the last day)—we therefore expect a much stronger correlation of PM networks with friend networks than with enemy networks.
4.3. Measurement of basic network properties
We measure the time evolution of the following basic network properties: number of nodes N , directed links L and average degree , relative size of largest connected component , average geodesic , clustering coefficient C , as well as the comparison values and C/Cr. Average degrees, geodesics and clustering coefficients are measured on the reflexive closures of the networks. Geodesics and clustering coefficients were calculated using the MatlabBGL package, 7 which efficiently implements standard procedures such as Johnson’s algorithm for finding all geodesics in sparse graphs. The measured network properties are displayed in Fig. 7, values of days 50, 150, and 445 are shown in Table 1. Cumulative distributions of in- and out-degrees of the last day’s networks are depicted in Fig. 9(a)–(c).
Fig. 7.
Network properties: (a) number of nodes N , (b) number of (directed) links L , (c) average degree , (d) clustering coefficient C , (e) clustering coefficient C divided by clustering coefficient of corresponding random graph Cr, (f) local efficiency Eloc, (g) average geodesic , (h) average geodesic divided by average geodesic of corresponding random graph , (i) global efficiency Eglob, (j) reciprocity ρ, (k) assortative mixing coefficient (of clustering in undirected network) rC(k), and (l) assortative mixing coefficient (in undirected network) rundir. Arrows mark the outbreak of an in-game war at day 422.
4.3.1. Growing average degrees, shrinking geodesics
Average degrees are growing, Fig. 7(c). Merely the enemy network reaches a steady state shortly before day 400. Geodesics decrease, Fig. 7(g).
4.3.2. Links versus nodes
The L versus N curves for the most part have slopes between 1 and 2, Fig. 8. Due to heavy fluctuations and limited extension over N, fits to power-laws are not reliable.
Fig. 8.
Links L versus nodes N for all timepoints. Values for enemies and PMs are shifted vertically for clarity.
Cumulative degree distribution of (a) PM, (b) friend and (c) enemy networks; clustering coefficient C as a function of degree for the (d) PM, (e) friend and (f) enemy networks; nearest neighbor degree knn versus degree of the (g) PM, (h) friend, and (i) enemy networks. All distributions were taken at the last day.