Simulating a neighborhood network (not serious science)

Intro

Just to warn you: This is not an attempt at serious science. It is just something I made for fun, based on a project that two students are working on.

Background

Let me start by giving some background to what I’ve done. I’m involved in the European GLAMURS project as a postdoctoral researcher. Among other things, we investigate what role bottom-up citizen initiatives can play in sustainability transitions. In the Netherlands, one of the initiatives we work with is Vogelwijk Energie(k), a very interesting local energy initiative in The Hague. One of the things that makes the initiative special is that it is situated in a neighborhood with exceptionally strong investment power and relatively many connections to important professional networks. Since its early stages (it started in 2009), Vogelwijk Energie(k) has had around 250 members (there are over 2000 households in the neighborhood). One of the ambitions of the board of the initiative is to mobilize a ‘second wave’ of people within and outside the neighborhood, and to have them invest in sustainable development. This doesn’t mean that Vogelwijk Energie(k) wants to attract more members; the idea is to raise awareness about sustainability among a broader group of people, and to stimulate them to act on that awareness.

We had the opportunity to write an assignment on this problem for a course on Agent-Based Modeling (ABM). Two students have been working on that assignment for some time now. They are attempting to develop an ABM that models the process through which the mobilization of the ‘second wave’ could take place, allowing for the exploration of different scenarios. Within a few weeks, the two students had developed some very interesting ideas for the model. Their work so far seems to be based on the implicit theory that people can develop a stronger awareness about sustainability by talking about the subject with their neighbors, and that increased awareness may at some point trigger an exploratory process that may or may not lead to the decision to start investing in sustainability. Their conceptual model is actually more complex, but this is the gist of it. One of the reasons why I like their ideas is that, without knowing it, they included some kind of awareness-behavior gap in their model (increased awareness does not immediately lead to changes in behavior).

The problem

One of the operational problems that the students are currently facing is how to simulate the interactions between neighbors. Intuitively, we can already say that the likelihood of neighbors interacting with each other is not equal for all pairs of neighbors. Any person is likely to interact frequently with only a small number of his/her neighbors. Who these neighbors are will also depend (intuitively speaking) on how close they live to each other, and how many opportunities they have to meet (e.g., their children go to the same school, they go to the same supermarket, they are members of the same associations). You get the idea.

One approach to modeling the patterns of interactions between neighbors is the social networks approach. We can visualize neighbors as nodes, and we can visualize frequent interactions between neighbors by drawing edges between those neighbors. In this context, the students had a question for me: “What do you think this network should look like?” This is a rather difficult question to answer without any empirical observations on the actual interactions that take place in the neighborhood. Ideally, we would do a survey on this among the residents of the neighborhood, but that is well beyond the scope of the students’ assignment. I told the students I needed to think about this problem for a while.

Creating a random network based on distances

One of my initial thoughts on the problem described above is that the structure of a social network in a neighborhood will at least depend on the geographical proximity of the neighbors in the network. I realize that there are many other mechanisms that will influence the structure of the network, but it is the geographical dimension that I focused on this evening.

One of the things that I wanted to do is to place neighbors in a space that more or less corresponds with the geographical boundaries of the Vogelwijk neighborhood. I first took a look at a map of the neighborhood on Google Maps. In the visualization below, the neighborhood is marked by the red area.

Vogelwijk

As you can see, the neighborhood includes two large green areas that have no houses. I used Google Maps to mark the ‘inhabited’ areas, and make a rough calculation of their combined surface area (see below).

VogelwijkSurface

Google Maps tells us that the area with the black outlines is about 1.12 square kilometers, but to keep things simple, I decided to treat the area as a rectangle that is 2.5 kilometers wide and 0.5 kilometers high, which brings us to a surface area of 1.25 square kilometers.

I also decided to focus on households, rather than neighbors, and I found that there are over 2000 households in the neighborhood. I decided to round the number of households down to 2000.

I created a so-called nodes list in which I listed 2000 households that I simply numbered from 1 to 2000. I randomly assigned X-coordinates and Y-coordinates to each household. The X-coordinates are random numbers between -1250 and 1250, and the Y-coordinates are random numbers between -250 and 250. See a screenshot of part of the nodes list below.
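For readers who want to reproduce this step, here is a minimal sketch in Python. It is not the original nodes list. The function name `make_nodes`, its parameters, and the use of a uniform distribution are assumptions made for illustration; the coordinate ranges are the ones given above, which presumably are in meters given the 2.5 by 0.5 kilometer rectangle.

```python
import random


def make_nodes(n_households=2000, half_width=1250.0, half_height=250.0, seed=None):
    """Return a list of (id, x, y) tuples with random coordinates.

    Assumption: coordinates are drawn uniformly within the rectangle
    [-half_width, half_width] x [-half_height, half_height].
    """
    rng = random.Random(seed)  # a seed makes the nodes list reproducible
    return [
        (i, rng.uniform(-half_width, half_width), rng.uniform(-half_height, half_height))
        for i in range(1, n_households + 1)  # households are numbered 1..n
    ]


nodes = make_nodes(seed=42)
for household in nodes[:3]:
    print(household)  # (id, x, y)
```

Passing a fixed `seed` is optional, but it makes it possible to regenerate the same neighborhood when comparing different network parameters later on.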

Nodes

I wrote a script in R with several functions. One of the functions calculates the Euclidean distances between all the households in the neighborhood, based on the randomly assigned coordinates. The function returns a distance matrix that reports the distances between all pairs of households. Another function normalizes these distances (all distances are converted to proportional values between 0 and 1), and inverts them to create proximities (1-distance). The resulting proximity matrix became the basis for the simulation of the network.
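The R functions themselves are not shown here, so the following is only a Python sketch of the same idea. The names `distance_matrix` and `proximity_matrix` are hypothetical, and the normalization (dividing every distance by the largest distance in the matrix) is an assumption; the text above only says that distances are converted to values between 0 and 1.

```python
import math
import random


def distance_matrix(coords):
    """Return a symmetric matrix of Euclidean distances between all points."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            dist[i][j] = dist[j][i] = d
    return dist


def proximity_matrix(dist):
    """Normalize distances to [0, 1] and invert them (proximity = 1 - distance).

    Assumption: normalization divides by the largest distance in the matrix.
    """
    max_d = max(max(row) for row in dist)
    if max_d == 0:  # all points coincide, so everything is maximally proximate
        return [[1.0] * len(dist) for _ in dist]
    return [[1.0 - d / max_d for d in row] for row in dist]


# A small example with 200 households keeps the pure-Python loops fast.
rng = random.Random(1)
coords = [(rng.uniform(-1250, 1250), rng.uniform(-250, 250)) for _ in range(200)]
dist = distance_matrix(coords)
prox = proximity_matrix(dist)
print(prox[0][0], round(prox[0][1], 3))
```

With this normalization, every household has a proximity of 1 to itself, and the two households that live farthest apart have a proximity of 0.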

I wrote a simple function in R to simulate the network in the neighborhood, using the proximities of the households as a basis. The logic of the function is very simple: The function considers each pair of households in turn. The proximity of the households (a number between 0 and 1) is multiplied by a random number, for example a number between 0 and 1 (which is used to simulate other influences; I know it is very naive). The resulting number is then compared to a threshold. If the number is below that threshold, then there is no tie between the two households. If the number is equal to or above the threshold, then a tie between the households is created. I made sure that it is possible for the user of the function to set the threshold, as well as the boundaries for the random number that the proximities are multiplied by. Different parameters will lead to different networks. I ran the function numerous times, each time with different parameters. Below, I visualize two examples.
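The logic described above can be sketched in Python as follows. This is an illustration rather than the original R function: the name `simulate_network` and its parameters are hypothetical, and I assume here that ties are undirected and that each unordered pair of households is considered exactly once.

```python
import random


def simulate_network(prox, threshold, low=0.0, high=1.0, seed=None):
    """Return a 0/1 adjacency matrix simulated from a proximity matrix.

    For each pair, proximity is multiplied by a random number drawn
    uniformly from [low, high]; a tie is created when the product is
    equal to or above the threshold.
    Assumption: ties are undirected, so the matrix is symmetric.
    """
    rng = random.Random(seed)
    n = len(prox)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once, no self-ties
            score = prox[i][j] * rng.uniform(low, high)
            if score >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


# Toy proximity matrix: households 0 and 1 are close, household 2 is far away.
toy_prox = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

# Setting low == high == 1.0 removes the randomness, which makes the
# effect of the threshold easy to see: only the close pair gets a tie.
adj = simulate_network(toy_prox, threshold=0.5, low=1.0, high=1.0)
for row in adj:
    print(row)
```

Raising the threshold or lowering the upper bound of the random number makes ties rarer, which is how very differently shaped networks can be obtained from the same proximities.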

Both examples are visualized with Gephi. To visualize a network, I opened its adjacency matrix in Gephi, and I imported the nodes list with the household coordinates separately. I used the MDS Layout algorithm to lay out the households. Below is a first example. The size of the nodes is based on their degree. The colors indicate communities, which I identified using Gephi’s built-in modularity algorithm.

Example_3

In the picture you can see that ties between the households are relatively sparse. This network has an average degree of 2.063. In the degree distribution below you can see that most households have one or two connections with other households in the neighborhood. This may not seem like much, but from the few papers on neighborhood networks that I have scanned so far, I understood that ties within a neighborhood tend to be sparse (people are typically more strongly connected with people outside their neighborhood). The visualization also nicely shows the effect of the simple simulation function that I wrote: The connections only exist between households that are relatively proximate.
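The value of 2.063 quoted above reads most naturally as the average degree, that is, the mean number of ties per household. I took these statistics from Gephi, but as a rough sketch they can also be computed directly from an adjacency matrix. The function names below are hypothetical, and the toy matrix is made up for illustration; it is not one of the simulated networks.

```python
from collections import Counter


def degrees(adj):
    """Return the degree (number of ties) of every node in a 0/1 adjacency matrix."""
    return [sum(row) for row in adj]


def average_degree(adj):
    """Return the mean number of ties per node."""
    d = degrees(adj)
    return sum(d) / len(d)


def degree_distribution(adj):
    """Return a dict mapping each degree to the number of nodes that have it."""
    return dict(sorted(Counter(degrees(adj)).items()))


# Toy network: a chain 0 - 1 - 2 plus one isolated household (3).
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(average_degree(toy_adj))       # 1.0
print(degree_distribution(toy_adj))  # {0: 1, 1: 2, 2: 1}
```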

DegreeDistribution_Example3

The pictures below show another example, based on a network that I generated using other parameters. This network has more connections, which can also be seen in the graph of the degree distribution. In this case, most households seem to have connections with 17 other households in their neighborhood, which intuitively sounds unrealistic to me, but it sure creates a pretty picture. I also like how the neighborhood divides up nicely into different communities.

Example_4

DegreeDistribution_Example4

Closing

So, that was it. I know this is not a very serious simulation of neighborhood networks, and indeed it is based on intuitions, and not on good science. But it was fun to do!