Dividing an organization

  using clustering of weighted graphs    

Modularity versus mincut

   

Centrality versus modularity and mincut



An example graph

Consider the problem of dividing a graph into two parts. The links in the graph have a weight. When the graph is split some links will be cut. We want to minimize the sum of the weights of the links that are cut. This is called the minimal cut of the graph.
The graph could model an organization. The nodes could be actors in the organization and the weights could represent the importance of their relations. When a reorganization requires a division, the minimal cut could be the best way to allocate the actors to the two new parts of the organization.

The tree of minimal cuts

The so called min-cut algorithm of Gomory-Hu [1] finds the minimal cut between all pairs of nodes in a graph. The result of this algoritm is a tree with the same nodes as the graph, but where each link represents a cut in the original graph. The Gomory-Hu tree of the example graph is shown in the figure on the left.
Example: When we remove the (red) link between A5 and A8 from this tree, two subtrees remain: A4,A5,A6,A7 on the right side and A8,A9,A1,A2,A3 on the left side. The weight of the link we just removed is 4. This happens to be precisely the sum of the weights of the links that are cut by this partitioning in the original graph. It is the minimal cut between the nodes A5 and A8.

The best split

This is the minimal cut we just described. The sum of the weights of the cut links is 4. In the Gomory Hu tree it is the (red) link between the nodes A5 and A8

One of the two next best alternatives

This is the next best minimal cut. In the Gomory Hu tree it is the (blue) link between nodes A8 and A9. The sum of weights of the cut links is 6.

Multi-cluster algorithm

Newman [2] defined a quantity called modularity. It ranges from 0 (bad) to 1 (best) and is a measure for the internal coherence of a given clustering of nodes in a graph. In a random graph the modularity is close to zero for all clusterings. In a graph with no links between clusters (only links within clusters) the modularity approaches 1. A search algorithm can look for the highest modularity over all possible clusterings.
For 2 clusters this algorithm happens to produce the same result as the min-cut algorithm in the case of the current example graph. For 3 clusters the optimal clustering is shown in the figure on the left. The modularity is 0.4

References

[1] R. E. Gomory, T. C. Hu
Multi-Terminal Network Flows
Journal of the Society for Industrial and Applied Mathematics, Vol. 9, No. 4 (Dec., 1961), pp. 551-570

[2] M. E. J. Newman
Fast algorithm for detecting community structure in networks
Phys. Rev. E 69, 066133 (2004)