Community Structure, Line Graphs and Edge Partitions, Clique Graphs

The giant connected component of the coauthorship network of scientists, taken from the data set of Mark Newman. The edges are each coloured using the Louvain method on the weighted line graph (no self-loops) but with the null model factor in the modularity scaled by 10.0. A subgraph centred on Mark Newman from the coauthorship network of scientists. The edges are each coloured via the weighted line graph method. . Zachary’s Karate Club. The edges are coloured via the weighted line graph method using the Louvain community detection method. The vertex colours correspond to the partition found by Zachary using the Ford-Fulkerson method.

There is no single definition of a community in a network. Roughly speaking, a community is a subgraph of closely related nodes and edges. The idea is that we should use the shape (the topology) of the network to tell us what these communities are. Usually they are tightly knit subgraphs within a graph. Most work has been done by defining communities as a partition of the set of vertices. This is equivalent to giving every node a colour, and all the nodes of one colour are in the same community.

When describing a network, there seems to be a natural tendency to put the emphasis on its nodes whereas a graph is a both a set of nodes and a set of links. R.Lambiotte and I have suggested that we could also consider a partition of edges, that is give each and every edge a colour. Link partitioning has been overlooked until now. This leads us to a well known construction in graph theory called the line graph. A line graph is derived from the original graph but every edge in the original graph is now represented by a vertex in the line graph while the nodes of the original graph are related to links of the line graph. One can apply any algorithm to the line graph and you are actually doing something to the original graph but where edges and nodes have been switched. So we applied vertex partitioning algorithms to the line graph, that is coloured its vertices, and thus derived an edge colouring, edge partitioning, of the original graph. Line graphs are normally have edges with no weights. These have been used previously in the context of networks and in their community detection. However as far as we are aware we are the first the suggest using weighted line graphs in any context. We found that the unweighted line graph version is unsuitable for most purposes we could think of.

Note that our method produced overlapping communities in terms of the vertices. Thus it can be compared against the Clique percolation method implemented in the CFinder programme which is discussed below.

Recently I have extended these ideas to create what I call Clique Graphs. I describe my ideas in “Clique Graphs and Overlapping Communities” and also in the slides from a talk entitled “Clique Graphs and Overlapping Communities” which I gave at NetSci 2010. This includes the graph used in clique percolation community detection method as a special case. Clique graph C++ code available from me. A revised version of the American College Football network of Girvan and Newman was used in this paper, see below for further comments on common errors in this data set.

Another aspect of my work on communities picks up on my interest on networks embedded in space. This overlaps with my interest in archaeology and network models. It also picks up on an important aspect of publication networks since the location of most authors is specified on the paper. Its clear than whenever you have additional information on the vertices (`metadata’) this information ought not to be overlooked when making an analysis. Looking for communities when the position of vertices is known is a perfect example of this problem. I have looked at one way to tackle this using modularity and gravity models.

Much of my work (and that of others) on community detection in networks can be considered from the point of view of random walks. This view was the focus of a talk given at USP, Sao Paulo in December 2011, Random Walks and Networks.