Situating agent-based modelling in population health research

In line with our claim that public health is an emergent property of a complex system, in this section we explain the key characteristics of complex systems, the ways in which human society exhibits these characteristics, and the resultant impact on research efforts related to human social systems.

Emergence

A fundamental property of complex systems is emergence, which philosopher Mark Bedau divided into strong and weak forms [19]. References to emergence in the natural sciences and philosophy frequently mean strong emergence, which describes properties of systems that are not deducible from the behaviour of their component parts. To paraphrase Bedau’s example, the inscrutable phenomenon of consciousness is clearly a consequence of neural activity, yet our knowledge of the behaviour of neurons does not provide us with any insight into the function of consciousness [19]. Consciousness is not a property exhibited by individual neurons, and appears distinct from any particular neural property or behaviour, and still it arises from neural activity. Consciousness can also change our neural activity, despite being distinct from it; this is downward causation, meaning that an emergent property can alter the behaviour of the component parts from which it emerges.

Strong emergence successfully captures the idea of a macro-level property that is distinct from and yet capable of influencing its own components (as opposed to a macro-property that is merely an interesting consequence of micro-level activity), but it is philosophically problematic. Strongly emergent properties appear to be essentially autonomous from their components, and yet are able to exert strong causal influence on those same components.

Consequently, Bedau’s notion of weak emergence has become an important concept for complex systems science:

Macrostate P of S [system composed of micro-level components] with microdynamic [micro-level behaviour] D is weakly emergent iff [if and only if] P can be derived from D and S’s external conditions, but only by simulation [19, p. 4]

In other words, the behaviour of a system composed of interacting micro-level components is ultimately derived from its micro-level behaviours and the influence of its environment. If we can simulate these interactions explicitly, we can simulate the dynamic that generates the emergent property, and thus we can replicate the emergent macrostate of that system. Weakly-emergent properties are accordingly more accessible to scientific study, in that their emergence can be replicated via step-by-step simulation of the interaction of their constituting components and the surrounding environment. Conversely, if we do not simulate this dynamic, we cannot replicate the weakly-emergent behaviour.

This latter point is particularly important, as the macro-level emergent outcomes of a system’s microdynamic cannot be straightforwardly predicted, even with perfect knowledge of a system’s initial state and the rules driving its microdynamic. Bedau illustrates this using the Game of Life, a famous computational system in which cells on a grid change state according to the states of their neighbours. Cells in Life have only two states, alive or dead, and their future states are determined by very simple rules according to how many of their neighbouring cells are alive or dead (see Footnote 1). Despite this simplicity, even very simple starting configurations in Life can produce remarkably complex behaviour (see e.g. Fig. 1), and Life can even play host to patterns capable of carrying out any possible computation (a property known as computational universality) [20, 21]. As Bedau notes, this has profound implications:

With few exceptions, it is impossible without simulation to derive the macrobehaviour of any state in a Life configuration even given complete knowledge of that configuration. In fact, since a universal Turing machine can be embedded in Life, the undecidability of the halting problem proves that in principle there can be no algorithm for determining whether the behaviour exhibited in an arbitrary Life world will ever stabilize. Yet all Life phenomena can be derived from the initial conditions and the birth-death rule. [19, p. 14]

Thus, the only way we can replicate the weakly-emergent macrostates of the Game of Life is to simulate its behaviour step-by-step. By extension, given that most complex systems will have significantly more complicated microdynamics than the Game of Life, replicating the macrostates of weakly-emergent systems requires the use of simulation to replicate their microdynamics.

Fig. 1

An example of the unexpected complexity of simple patterns in the Game of Life. This 7-cell pattern is called an ‘acorn’ and stabilises after 5206 steps with a population of 633 live cells
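
The step-by-step derivation that weak emergence demands can be sketched in a few lines of code. The block below is a minimal illustration, not code from the work discussed here: it assumes the standard Life rules (a dead cell with exactly three live neighbours becomes alive; a live cell with two or three live neighbours survives) and stores live cells as a set of coordinates on an unbounded grid. The names `step`, `run` and `ACORN` are our own.

```python
from collections import Counter

def step(live):
    """Advance one generation. `live` is a set of (x, y) live cells on an unbounded grid."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3 (assumed standard rules).
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, steps):
    """Apply `step` repeatedly; there is no shortcut to the final macrostate."""
    for _ in range(steps):
        live = step(live)
    return live

# The 7-cell 'acorn' pattern of Fig. 1, as (x, y) coordinates:
#   .O.....
#   ...O...
#   OO..OOO
ACORN = {(1, 0), (3, 1), (0, 2), (1, 2), (4, 2), (5, 2), (6, 2)}

if __name__ == "__main__":
    for n in (0, 10, 100):
        print(n, len(run(ACORN, n)))  # population after n generations
```

Calling `run(ACORN, 5206)` reproduces the setting described in the caption of Fig. 1. In pure Python this takes noticeably longer than the short runs above, which is itself a reminder that the macrostate is reached only by working through every intermediate generation.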

A well-known example of an agent-based model replicating a weakly-emergent phenomenon from the interaction of micro-level entities is given by Schelling’s residential segregation model [22, 23]. In this model, very simple agents live in a virtual grid-based world, and at each discrete step of the simulation they can choose to relocate. Their decision to move is based on a preference for the group composition of their neighbourhood; if the number of their neighbours belonging to a different group than themselves is above a certain threshold, the agent will move to a new random square on the grid. Here the agents’ segregation is the emergent property of the system, while the preference for in-group neighbours is the parameter driving the agents’ behaviour. Schelling showed that even a relatively low threshold generates a high degree of residential segregation, a result that is not predictable solely by knowing the agents’ behavioural rules (see Fig. 2). Thus we can describe Schelling’s model as weakly emergent, given that its macrostate is derivable only by simulating the system step-by-step, despite its known and very simple behavioural rules.

Fig. 2

Sample run of the Schelling segregation model
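
A Schelling-style model of the kind described above can be sketched as follows. This is a minimal illustration under our own assumptions rather than Schelling’s original specification: two equally sized groups, a square non-wrapping grid, an eight-cell neighbourhood, relocation to a uniformly random empty cell, and a `tolerance` parameter giving the largest share of unlike neighbours an agent accepts. All function names and default values are hypothetical.

```python
import random

def make_grid(size, empty_frac, rng):
    """Square grid holding two groups ('A', 'B') and empty cells (None)."""
    n_cells = size * size
    n_empty = int(n_cells * empty_frac)
    n_a = (n_cells - n_empty) // 2
    cells = [None] * n_empty + ["A"] * n_a + ["B"] * (n_cells - n_empty - n_a)
    rng.shuffle(cells)
    return [cells[i * size:(i + 1) * size] for i in range(size)]

def unlike_share(grid, r, c):
    """Share of the occupied neighbouring cells (up to 8) that hold the other group."""
    size = len(grid)
    occupied = unlike = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue
            if grid[rr][cc] is not None:
                occupied += 1
                unlike += grid[rr][cc] != grid[r][c]
    # An agent with no occupied neighbours is treated as having no unlike neighbours.
    return unlike / occupied if occupied else 0.0

def mean_like_share(grid):
    """Average share of same-group neighbours: a crude segregation index."""
    size = len(grid)
    shares = [1.0 - unlike_share(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not None]
    return sum(shares) / len(shares)

def sweep(grid, tolerance, rng):
    """Visit every agent once in random order; discontented agents relocate."""
    size = len(grid)
    agents = [(r, c) for r in range(size) for c in range(size) if grid[r][c] is not None]
    empties = [(r, c) for r in range(size) for c in range(size) if grid[r][c] is None]
    rng.shuffle(agents)
    moves = 0
    for r, c in agents:
        if empties and unlike_share(grid, r, c) > tolerance:
            i = rng.randrange(len(empties))
            er, ec = empties[i]
            grid[er][ec], grid[r][c] = grid[r][c], None
            empties[i] = (r, c)  # the vacated cell becomes available
            moves += 1
    return moves

def run_schelling(size=20, empty_frac=0.1, tolerance=0.5, sweeps=50, seed=0):
    """Return the segregation index before and after the run, plus the final grid."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_frac, rng)
    before = mean_like_share(grid)
    for _ in range(sweeps):
        if sweep(grid, tolerance, rng) == 0:
            break  # nobody wants to move: the configuration is stable
    return before, mean_like_share(grid), grid

if __name__ == "__main__":
    before, after, _ = run_schelling()
    print(f"mean like-neighbour share: {before:.2f} -> {after:.2f}")
```

Re-running `run_schelling` across a range of `tolerance` values is a simple way to explore how the aggregate pattern responds to the individual-level preference.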

Similarly, the phenomena of interest in public health research—from the development of health inequalities to the spread of obesity in certain communities—are consequences of complex interactions between individuals and their physical and social environments. Successful interventions may seek to influence the phenomena by altering individual, low-level behaviours through a number of different routes, in the hope that the population-level picture which emerges from those actions changes for the better—much like the simulation scientist tweaking the preferences of Schelling’s agents in the hope of reducing segregation. Following Axelrod and Tesfatsion, we might align ourselves to using ABMs for normative understanding, or ‘evaluating whether designs proposed for social policies, institutions, or processes will result in socially desirable system performance over time’ [24].

If we accept that population-level health patterns are weakly-emergent phenomena deriving from the interactions between individual, society and environment, then it follows that answering some questions about those properties will require simulating those interactions explicitly. Agent-based modelling, as a methodology tailored to the investigation of emergent properties, is well-placed to provide insight into these phenomena by explicitly modelling both individual behaviours in response to an intervention and interactions between individuals and with their environment.

Non-linearity

In the context of complex systems science, there are two extant definitions of ‘non-linearity’. The first, which we will refer to as causal non-linearity, encompasses the manner in which complex systems tend to be characterised by cyclic, relational and mutual causal relationships between the variables describing the system’s state. This characteristic of complex systems means that popular causal inference methods like directed acyclic graphs (DAGs) cannot be used to characterise complex systems, as DAGs cannot include feedback loops.

The Schelling model again provides a useful example of this concept. Schelling’s simple agents change the local environment of both their previous and current neighbours, thereby affecting the probability that those neighbours may move to another location. In this way, the agents are affecting the actions of other agents indirectly via their shared environment.

Applying this concept in the context of public health, we might imagine an ABM that examines how unhealthy behaviours propagate in a social network by explicitly representing agents with their characteristics, social relations and interactions. For example, a simulation can explore the spread of unhealthy habits (such as smoking, drinking, and drug use) among socially-related individuals, and evaluate the effectiveness of various strategies of network intervention that intend to induce desirable behavioural change across the social network.
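
To make the idea of such a network simulation concrete, the sketch below uses a deliberately simple, deterministic rule that is entirely our own assumption: agents sit on a ring, each tied to the `k` nearest agents on either side, and an agent takes up a habit once the share of its contacts who already have it reaches `threshold`. A hypothetical network intervention is represented by a set of `protected` agents who never adopt. None of these names or rules is drawn from a specific published model.

```python
def ring_network(n, k=2):
    """Each of n agents is tied to its k nearest neighbours on either side of a ring."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)}

def spread(network, seeds, threshold, protected=frozenset(), max_steps=1000):
    """Return the set of agents that end up adopting the habit.

    An agent adopts when the share of its contacts who have adopted is at
    least `threshold` (an assumed rule). Agents in `protected` never adopt,
    standing in for a targeted network intervention.
    """
    adopted = set(seeds) - set(protected)
    for _ in range(max_steps):
        newly = {
            agent
            for agent, contacts in network.items()
            if agent not in adopted
            and agent not in protected
            and sum(c in adopted for c in contacts) / len(contacts) >= threshold
        }
        if not newly:
            break  # no further change: the process has settled
        adopted |= newly  # all agents update simultaneously
    return adopted

if __name__ == "__main__":
    net = ring_network(20)
    print(len(spread(net, {0, 1}, threshold=0.5)))
    print(len(spread(net, {0, 1}, threshold=0.6)))
    print(len(spread(net, {0, 1}, threshold=0.5, protected={5, 15})))
```

Comparing runs with and without `protected` agents is the kind of ‘what if’ experiment such a model supports. A realistic ABM would replace the ring and the fixed threshold with empirically grounded networks and behavioural rules.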

Complex systems are characterised by a tangled web of non-linear causal relationships, which blurs the distinction between exogenous and endogenous variables. The non-linear causal relationship among the system’s components and between the components and their environment is the fundamental reason behind our inability to forecast the system’s dynamics from the components’ behavioural rules: the web of causal relationships is simply too tangled for our limited cognitive capabilities to take all of them into account when trying to run a mental simulation of the system.

The second definition of ‘non-linearity’ refers to the kind of relationship between one or more exogenous variables and one endogenous variable: in this case, saying that the relationship is not linear means that variations in the exogenous and the endogenous variables are not proportional. In the Schelling model, we may change the agents’ tolerance for different neighbours without noticing any significant change in residential segregation at the population level, as long as we are below or above a threshold level. Once we reach this level, however, a slight change in the agents’ tolerance changes the system from a mixed state to a segregated one. Thus, the relationship between the agents’ tolerance and the system’s level of residential segregation is a non-linear one.

A similar relationship is found in the dynamics of infectious disease. The spread of an infectious disease is characterised by the basic reproduction number, which is the average number of new cases caused by an infected individual during their infectious period in an otherwise fully susceptible population. The basic reproduction number acts as a threshold that dictates whether the infection will persist over time. When the reproduction number is lower than one, the infection cannot persist in the population, whereas if it is greater than one the infection can spread and persist. Non-linearity is significant in both the Schelling model and infectious disease modelling, since it demonstrates how past trends may abruptly change once a certain threshold is crossed, producing a qualitative change in the state of the system: from mixed to segregated and from a non-persistent infection to an epidemic.
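
The threshold at a reproduction number of one can be illustrated with a short numerical sketch. Note that this is a deterministic compartmental (SIR) model integrated with simple Euler steps, not an agent-based model; we use it only because it makes the threshold easy to see. The recovery rate `gamma`, the initial infected share `i0`, the step size `dt` and the time horizon `days` are illustrative assumptions.

```python
def final_size(r0, gamma=0.2, i0=1e-3, dt=0.1, days=1000.0):
    """Share of the population ever infected in a simple SIR model.

    r0    : basic reproduction number (transmission rate = r0 * gamma)
    gamma : recovery rate per day (assumed value)
    i0    : initial infected share (assumed value)
    """
    beta = r0 * gamma
    s, i = 1.0 - i0, i0
    for _ in range(round(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
    return 1.0 - s  # everyone who has left the susceptible compartment

if __name__ == "__main__":
    for r0 in (0.5, 0.8, 0.95, 1.05, 1.2, 1.5, 2.0):
        print(f"R0 = {r0:4.2f}  share ever infected = {final_size(r0):.3f}")
```

Sweeping `r0` across one shows the non-proportional response discussed above: below the threshold the outbreak stays close to its initial size, while above it a substantial share of the population is eventually infected.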

Of course, these two kinds of ‘non-linearity’ are closely related. Complex systems are characterised by non-linear causal relationships between their components, and thus we often observe a non-linear relationship between exogenous and endogenous variables at the aggregate level.

Adaptive behaviour

Adaptive behaviour refers to the capacity or propensity for an agent to change its state following a change in its environment, including the behaviour of its neighbours (see Footnote 2). This fundamental characteristic of complex systems allows for non-linear causality: an environmental variation prompts the agents’ behavioural responses, which then feed back into additional environmental variations, and so on. In other words, the system’s components may affect each other both directly and indirectly through changes in their common environment.

For example, consider the development of a new road through a neighbourhood, which will increase traffic, noise and air pollution in the area. As a result, residents who can afford to move may leave, and local housing prices may decline. The decline in housing prices may attract a less affluent population to the area. In this situation, individuals are adapting to new conditions in the environment, and the system (here, the neighbourhood) self-organises as a result. In the new order that emerges, an increasingly deprived population is located in a neighbourhood with poor environmental conditions and is exposed to greater health risks.

Given that the components are the ‘engines’ of the co-evolutionary process driving the system’s dynamics, the behavioural model of these components is the fundamental building block of any complex systems model. As these are weakly-emergent phenomena, we cannot replicate the dynamics of the system unless we simulate it as the result of the interaction between the system components and their environment.

Human societies are characterised by adaptive behaviour of the most complex kind, as human beings are able to recognise that they are in a complex system, identify the system’s emergent properties and develop models that take them into account to drive their own actions. This phenomenon of second-order emergence, or the fact that emergent social institutions become part of the agents’ models driving their behaviour, creates direct causal relationships between the components’ behaviour and the system’s dynamics, which further compounds the complexity of the system [25].

The complex systems challenge to traditional epidemiology

Having outlined the defining characteristics of complex systems, we can better understand why they pose a challenge to the statistical approach typically adopted by epidemiology and how ABM can help epidemiology to rise to this challenge. Public health problems can be seen as the emergent outcomes of the complex social system that is human society. As such, to understand their dynamics we need to develop models based on the explicit representation of the components of society—individual human beings.

Individuals’ adaptive behaviour and the resulting web of causal relationships between agents and their environment mean that non-linear relationships between system variables are pervasive in human social systems. This means that a very small variation of system inputs can generate a big variation in system outcomes, or vice versa. We can visualise a non-linear complex system as one where the space of possible outputs is very rough along the many input dimensions: because of the number of factors affecting the relationship between any two variables, points that are very near to each other in the space of any input can be far apart in the space of outcomes. While the traditional statistical approach can be used in principle to shed light on the causal mechanism through which variable X affects variable Y (and in fact much epidemiological research consists of the addition of confounders and mediators to the original theoretical model to enhance our understanding of how variable X affects variable Y), success relies on the availability of a ‘sufficient’ number of observations for the analysis to have enough statistical power, a threshold that increases with the number of confounding variables in the causal model. This represents a limit to the complexity of the theoretical model that can be statistically analysed.

With respect to complex systems like human society, this creates two major problems. First, a complex system may contain variables for which it is difficult or impossible to gather empirical values. Second, even if our theoretical model does not contain such variables, in complex systems the number of potentially conditioning variables is typically very large, so we may have too few observations to conduct a meaningful statistical analysis of the relationship between the variables of interest, or reach the limits of analytic tractability of a mathematical model with dozens of variables.

Thus, we see the reason why most causal models in traditional epidemiology are relatively simple compared to ABMs: the number of observations must be large enough for the analysis to have the desired statistical power, and the model itself must remain analytically solvable. In other words, our tools force us to assume that numerous variables which we may ideally want to include in our models do not affect the relationship between X and Y. We call this the stability assumption, in that it requires that the relationship of interest is unaffected by changes in contextual variables.

Statistical approaches suffer further when data is sparse, as is often the case in human social systems. Properly-specified theoretical models can still be applied in these cases as means for increasing our understanding of system behaviour; such models can form the basis for the examination of ‘what if’ scenarios and for probing system behaviour via sensitivity analysis. We propose that ABM can be very effective in this regard, as the approach requires us to formally codify our theoretical knowledge of a system in the form of an explicit computational model of the processes underlying it. Through simulations we can produce counterfactuals, allowing us to evaluate which contextual variables we may exclude as conditioning variables, and whether the stability assumption is tenable. If the stability assumption does not hold, we can examine the effects of the conditioning variables on the relationship of interest. All the while we are able to model system processes explicitly, including non-linearities and feedback loops.

In this context, ABM can be seen as a complementary tool to assess the limits of statistical approaches as applied to a complex system, and to investigate system behaviour when quantitative data is too scarce to perform robust statistical analyses. While the traditional modelling of epidemiology is statistical—hypotheses relating to the causes of a health outcome are tested in a mathematical framework against observed data—the modelling approach for an ABM involves taking theories and assumptions underlying population health research and instantiating them in a computational framework.
