Comparative analysis of differential network modularity in tissue specific normal and cancer protein interaction networks

Islam, Md Fahmid; Hoque, Md Moinul; Banik, Rajat Suvra; Roy, Sanjoy; Sumi, Sharmin Sultana; Hassan, F M Nazmul; Tomal, Md Tauhid Siddiki; Ullah, Ahmad; Rahman, K M Taufiqur

doi:10.1186/2043-9113-3-19

Research
Open access
Published: 06 October 2013

Md Fahmid Islam¹,
Md Moinul Hoque¹,
Rajat Suvra Banik¹,
Sanjoy Roy²,
Sharmin Sultana Sumi¹,
F M Nazmul Hassan¹,
Md Tauhid Siddiki Tomal¹,
Ahmad Ullah¹ &
K M Taufiqur Rahman¹

Journal of Clinical Bioinformatics volume 3, Article number: 19 (2013)

Abstract

Background

Large-scale understanding of the complex and dynamic alterations that occur at cellular and subcellular levels in cancer, in contrast to the normal condition, has facilitated the emergence of sophisticated systemic approaches such as network biology. As most biological networks show modular properties, analysis of differential modularity between normal and cancer protein interaction networks can be a good way to understand cancer in greater depth. Two aspects of biological network modularity have been considered in this regard: detection of molecular complexes (potential modules or clusters) and identification of the crucial nodes that form overlapping modules.

Methods

In the current study, previously published protein interaction networks (PINs) were analyzed computationally to identify their molecular complexes and crucial nodes. The networks had been constructed from protein molecules involved in ten major cancer signal transduction pathways, based on expression data for five tissues (bone, breast, colon, kidney and liver) in both normal and cancer conditions. The MCODE (molecular complex detection) and ModuLand methods were used to identify the molecular complexes and the crucial nodes of the networks, respectively.

Results

In all tissues, cancer PINs show a higher level of clustering (formation of molecular complexes) than the normal ones. In contrast, a lower level of modular overlap is found in cancer PINs than in the normal ones. It can therefore be proposed that some giant nodes with very high degree form in the cancer networks, resulting in reduced overlap among the network modules even though the predicted numbers of molecular complexes are higher in cancer conditions.

Conclusion

The study predicts some major molecular complexes that might act as important regulators of cancer progression. The crucial nodes identified in this study may be potential drug targets to combat cancer.

Background

Reductionist philosophy has directed biological research for decades [1, 2]. A significant amount of information has been generated in the biological sciences, enriching the human knowledge base for understanding life [1]. Despite the enormous success of reductionism in decoding structural and functional attributes at the cellular and molecular levels of life's organization, it is becoming progressively clearer that biological functions can rarely be attributed to individual molecules considered in isolation. Instead, most biological phenomena emerge from the highly interactive complexity that derives from the functional integration of the cell’s numerous constituents [2]. Various recent approaches have been initiated to study biological systems in a more integrative and comprehensive way. A network model can play an important role in understanding a complex system based on multiple sets of interactions and in clearly analyzing the origin of observed network characteristics [3–7]. Network biology has thus emerged as a revolutionary approach for the empirical study of complex biological systems [3, 8–12].

In cancer, genomic instability results in alterations of downstream signal transduction pathways and protein-protein interactions. Current understanding of the dynamic changes at genomic and proteomic levels indicates that cancer can be considered a stochastic phenomenon rather than the result of some specific linear alterations [13]. An insightful understanding of the comparative regulatory patterns in normal and cancerous cells requires detailed study of molecular interactions [14], and network biology is potentially useful in this regard [15]. The concepts of network biology can be used to decipher the differential interaction patterns between normal and cancer conditions through construction of biomolecular networks and subsequent in-depth analysis of those networks.

Studying the modularity of biomolecular networks can be an efficient way to understand their inherent properties and to identify their crucial molecular sets and components, which is a basic challenge in the study of these networks [16]. In most cases biomolecular networks show modular organization, meaning that the network can be divided into modules according to the density of connections among its nodes. More specifically, modules are subsets of a network whose nodes are comparatively highly connected to one another through edges. Modules have many connections within themselves but only sparse connections between them [17, 18]. From a general point of view, delineating the modules is useful in understanding the structural and functional features of networks, and this has stimulated much empirical research as well as practical applications such as protein complex and drug target identification [19, 20].
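The idea that modules are densely connected internally but sparsely connected to each other can be illustrated with a minimal Python sketch. The toy network, the node names and the helper `edge_counts` are hypothetical and serve only as an illustration; they are not taken from the PINs analyzed in this study.

```python
def edge_counts(edges, module_of):
    """Count edges inside modules (intra) and between modules (inter)."""
    intra = sum(1 for u, v in edges if module_of[u] == module_of[v])
    inter = len(edges) - intra
    return intra, inter


# Hypothetical toy network: two triangles joined by a single bridging edge.
edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),   # module 1
    ("D", "E"), ("E", "F"), ("D", "F"),   # module 2
    ("C", "D"),                           # bridge between the modules
]
module_of = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}

intra, inter = edge_counts(edges, module_of)
print(intra, inter)  # 6 1
```

In this toy case six of the seven edges lie inside a module and only one runs between modules, which is the pattern of dense internal and sparse external connectivity described above.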

The main objective of this paper was to study the differential modularity patterns of normal and cancer protein interaction networks (PINs). The PINs were constructed for five tissues (bone, breast, colon, kidney and liver) in both normal and cancer conditions [21]. Network construction was based on expression data of protein molecules participating in ten major cancer signal transduction pathways. The MCODE (molecular complex detection) method [22] was used to identify and analyze potential molecular complexes (modules or clusters) of the networks. Another method, ModuLand [23, 24], was used to identify and subsequently analyze the crucial nodes forming overlapping modules of the networks.

Methods

The primary data were retrieved from the differential expression database GeneHub-GEPIS (an online bioinformatics tool for inferring gene expression patterns in a large panel of normal and cancer tissues; http://research-public.gene.com/Research/genentech/genehub-gepis/index.html) [25] and from the protein-protein interaction prediction tools PIPs (Human Protein-Protein Interaction Prediction; http://www.compbio.dundee.ac.uk/www-pips/) [26, 27] and STRING (a database of known and predicted protein interactions; http://string.embl.de/) [28–33]. The Cytoscape software package [34–36] was used to construct the protein interaction networks (PINs) (Additional files 1 and 2) [21]. For modularity analysis, two Cytoscape plugins, MCODE and ModuLand, were used. MCODE was used to identify and rank all possible molecular complexes of each network, and ModuLand was used to identify the crucial nodes forming the overlapping modules in those networks. MCODE detects densely connected regions in large protein interaction networks, which may represent molecular complexes [22]. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The ModuLand method provides an algorithm for determining extensively overlapping network modules [23, 24]. Additionally, it identifies several hierarchical layers of modules by representing the modules of a lower layer as meta-nodes of the next higher layer. By assigning module cores, the method predicts the function of the whole module and determines key nodes bridging two or more modules.
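The vertex weighting step of MCODE can be illustrated with a simplified Python sketch. It assumes that a vertex's weight is the core number k of the highest k-core of its immediate neighborhood multiplied by the density of that core. This is a reading of the published description of the method, not the Cytoscape plugin's code; the function names and the toy graph are hypothetical, and refinements such as the degree cutoff are omitted.

```python
def k_core(adj, k):
    """Return the k-core of an undirected graph given as {node: set(neighbors)}."""
    core = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        low = [u for u, vs in core.items() if len(vs) < k]
        changed = bool(low)
        for u in low:
            for v in core.pop(u):
                if v in core:
                    core[v].discard(u)
    return core


def highest_k_core(adj):
    """Return (k, core) for the largest k whose k-core is non-empty."""
    k, best = 0, {u: set(vs) for u, vs in adj.items()}
    while True:
        nxt = k_core(best, k + 1)
        if not nxt:
            return k, best
        k, best = k + 1, nxt


def density(adj):
    """Edges present divided by edges possible (self-loops excluded)."""
    n = len(adj)
    if n < 2:
        return 0.0
    n_edges = sum(len(vs) for vs in adj.values()) / 2
    return n_edges / (n * (n - 1) / 2)


def vertex_weight(graph, v):
    """Simplified MCODE-style weight: k times density of the highest k-core
    of the subgraph induced by v and its direct neighbors."""
    nbhd = graph[v] | {v}
    sub = {u: graph[u] & nbhd for u in nbhd}
    k, core = highest_k_core(sub)
    return k * density(core)


# Hypothetical toy graph: a 4-clique (A, B, C, D) with a pendant node E on D.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
weights = {v: vertex_weight(graph, v) for v in graph}
print(weights["A"], weights["E"])  # 3.0 1.0
```

Nodes inside the clique receive a high weight because their neighborhood contains a dense 3-core, whereas the pendant node receives a low weight. Such high-weight nodes are the kind of locally dense seeds from which the outward traversal starts.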

Default parameter values were used for both the MCODE and ModuLand analyses. The default MCODE setup was: Find Clusters: in Whole Network; Network Scoring (advanced options): (a) Include Loops: off, (b) Degree Cutoff: 2; Cluster Finding: (a) Haircut: on, (b) Fluff: off, (c) Node Score Cutoff: 0.2, (d) K-Core: 2, (e) Max. Depth: 100. For the ModuLand analysis, the unweighted network option was selected with the default value of 1. ModuLand was run to identify and visualize overlapping modules, and modules were merged with a threshold value of 1.0 to create the correlation matrix of the original modularization and the module correlation histogram. The Measures option of ModuLand was used to calculate the graph-related parameters of the overlapping modules.
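To show how the cluster-finding parameters listed above might act, the following Python sketch expands a cluster outward from a seed and then applies a haircut. It assumes that Node Score Cutoff admits a neighbor only if its weight exceeds (1 - cutoff) times the seed weight, that Max. Depth bounds the traversal, and that Haircut removes nodes with fewer than two connections inside the cluster. The function names, the toy graph and the node weights are hypothetical, and this is not the plugin's implementation.

```python
def expand_from_seed(graph, weights, seed, node_score_cutoff=0.2, max_depth=100):
    """Grow a cluster outward from `seed`, admitting neighbors whose weight
    exceeds (1 - node_score_cutoff) * weight of the seed."""
    threshold = weights[seed] * (1.0 - node_score_cutoff)
    cluster, frontier, depth = {seed}, [seed], 0
    while frontier and depth < max_depth:
        next_frontier = []
        for u in frontier:
            for v in graph[u]:
                if v not in cluster and weights[v] > threshold:
                    cluster.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
        depth += 1
    return cluster


def haircut(graph, cluster):
    """Repeatedly drop nodes with fewer than two neighbors inside the cluster."""
    kept = set(cluster)
    changed = True
    while changed:
        low = {u for u in kept if len(graph[u] & kept) < 2}
        changed = bool(low)
        kept -= low
    return kept


# Hypothetical toy graph: a 4-clique (A, B, C, D) with a pendant node E on D.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
# Illustrative node weights (assumed, not computed from real data).
weights = {"A": 3.0, "B": 3.0, "C": 3.0, "D": 3.0, "E": 1.0}

cluster = expand_from_seed(graph, weights, seed="A")
print(sorted(cluster))                 # ['A', 'B', 'C', 'D']
print(sorted(haircut(graph, graph)))   # ['A', 'B', 'C', 'D']
```

With a cutoff of 0.2 and a seed weight of 3.0, the admission threshold is 2.4, so the pendant node E is never added to the cluster. Applying the haircut to the whole toy graph removes E for a different reason: it has only one connection.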

Results and discussion

Molecular complex detection

The molecular complex detection (MCODE) method has previously been used to evaluate a compilation of yeast protein interactions against known molecular complex data from mass spectrometry of the proteome [19, 37]. This led to the observation that highly interconnected, or dense, regions of the network may represent molecular complexes [38]. The number of possible modules that can be regarded as molecular complexes differs between normal and cancer conditions in each of the five tissues (Figures 1–10). The numbers of ranked molecular complexes in the normal and cancer protein interaction networks are, respectively, 15 and 19 for bone, 22 and 28 for breast, 22 and 27 for colon, 21 and 30 for kidney, and 19 and 28 for liver (Figures 1–10). In all cases, the number of possible molecular complexes increases in the cancer condition. A statistical significance test supports this difference at the p ≤ 0.05 level, indicating that the molecular complex numbers of the cancer PINs are significantly higher than those of the normal PINs (p = 0.02) (Additional file 3).
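The per-tissue increase can be tabulated directly from the counts reported above. The short Python sketch below only restates that arithmetic. The variable names are illustrative, and it does not reproduce the significance test of Additional file 3, whose exact form is not described here.

```python
# Ranked molecular complex counts reported in the text: (normal, cancer).
complex_counts = {
    "bone": (15, 19),
    "breast": (22, 28),
    "colon": (22, 27),
    "kidney": (21, 30),
    "liver": (19, 28),
}

# Absolute increase from the normal to the cancer network for each tissue.
increments = {
    tissue: cancer - normal
    for tissue, (normal, cancer) in complex_counts.items()
}

for tissue, (normal, cancer) in complex_counts.items():
    relative = 100.0 * increments[tissue] / normal
    print(f"{tissue}: +{increments[tissue]} complexes ({relative:.1f}% over normal)")

# increments == {'bone': 4, 'breast': 6, 'colon': 5, 'kidney': 9, 'liver': 9}
```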

Kidney shows the largest increase in predicted molecular complex numbers from the normal to the cancer state (from 21 to 30; Figures 7 and 8), an increase matched in absolute terms by liver (from 19 to 28). Beyond the number of molecular complexes, all other parameters of the molecular complex networks (scores, nodes and edges) also differ between normal and cancer conditions for each tissue (Figures 1–10 and Additional file 4).

Because the numbers of edges and nodes in the cancer networks increase relative to the normal condition for all five tissues, overall clustering is also enhanced in the cancer networks. The normal and cancer networks were constructed mainly from expression and interaction data of protein molecules participating in major cancer signal transduction pathways, as described in our previous paper [21]. The increase in edges and nodes in cancer tissues compared with normal tissues can be explained as an enhancement of molecular interactions at the proteomic level in cancer states. It is worth noting that the graphical representations of these differences are based on previously validated experimental data on gene expression and protein interaction. The biological meaning of the observed differences appears clear: cancer tissue involves more proteins interacting with each other during cancer signaling.

A recent report indicates that disease genes tend to have higher degree and connectivity than non-disease genes in terms of expression and interaction of proteins [39]. Some studies also indicate that proteins encoded by cancer genes can interact strongly with other proteins and show higher connectivity than in the normal condition [40]. There is also evidence of overrepresentation of 10% of protein interaction clusters within the cancer interactome when compared with normal protein interaction networks [7].

Overlapping module and crucial node identification from the networks

In bone, an overlapping module is present in the normal condition but absent in cancer (Figure 11). The overlapping modules differ between normal and cancer states for all other tissues (Figures 12–19). In breast, kidney and liver, the numbers of edges and nodes decrease in cancer and most of the molecules forming the overlapping networks change (Figures 12, 13, 16, 17, 18 and 19). In colon, the numbers of edges and nodes remain constant but most of the molecules forming the overlapping modules are altered (Figures 14 and 15). The greatest change in the overlapping module, in terms of node and edge numbers and of the molecules forming the overlapping networks, occurs in kidney (Figures 16 and 17). The nodes of the overlapping module can be regarded as the crucial nodes of the respective network, having module centrality (that is, being the central nodes of the related modules formed by ModuLand) [41]. The important network properties of the overlapping modules are compared in Tables 1–9.