Detecting Communities in Complex Unipartite and Bipartite Networks by Maximizing the Modularity

Date

2013-12

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

This dissertation develops and improves methods to detect the modular structure of complex unipartite and bipartite networks using the method of modularity maximization, in which one seeks to find the partition of nodes that maximizes a quality function known as the modularity. Finding partitions that maximize modularity has been proven to be NP-hard and therefore, fast, approximate algorithms are developed to find high quality partitions. First, novel methods are developed using an ensemble approach to detect and analyze multiple high quality modularity partitions to identify modular and hierarchical structure in networks. These methods are applied to the genetic regulatory network of Escherichia coli and a set of functional communities are identified and used to predict candidate regulatory interactions. The problem of community detection in bipartite networks is then studied and a new bipartite modularity, the Information Preserving (IP) modularity, is introduced. The IP modularity detects communities in bipartite networks using a unimode projection that includes more information about the structure of the bipartite network in the projection than previous methods, and is applicable when the links in the bipartite network are weighted. The IP modularity is shown to detect meaningful structure in both a model network and the real-world metabolic network of E. coli . Finally, the leading-eigenvalue method with final tuning is adapted to find the partition that maximizes bipartite modularity and an agglomerative step that merges communities is added to the algorithm. This step extends the effective applicability of the leading-eigenvalue method of modularity maximization to networks of tens of thousands of nodes in size. The methods developed in this dissertation significantly enhance the science of uncovering the structure in complex networks and are applicable to study a wide range of biological, medical, social and physical networks.

Description

Keywords

Community detection, Modularity, Complex networks, Networks, Clustering, Modules, Bipartite, Unipartite, Z-score, Genetics, Metabolism, E. coli

Citation