Comparative analysis of differential network modularity in tissue specific normal and cancer protein interaction networks
Journal of Clinical Bioinformatics volume 3, Article number: 19 (2013)
Large-scale study of the complex, dynamic alterations that occur at cellular and subcellular levels in cancer, as opposed to the normal condition, has driven the recent emergence of systems-level approaches such as network biology. Because most biological networks are modular, analyzing how modularity differs between normal and cancer protein interaction networks is a promising way to understand cancer. Two aspects of biological network modularity are considered here: detection of molecular complexes (potential modules or clusters) and identification of the crucial nodes that form overlapping modules.
In the current study, previously published protein interaction networks (PINs) were analyzed computationally to identify their molecular complexes and crucial nodes. The networks had been constructed from protein molecules involved in ten major cancer signal transduction pathways, using expression data for five tissues (bone, breast, colon, kidney and liver) in both normal and cancer conditions. The MCODE (molecular complex detection) and ModuLand methods were used to identify the molecular complexes and the crucial nodes of the networks, respectively.
In all five tissues, cancer PINs show a higher level of clustering (formation of molecular complexes) than the normal PINs. In contrast, cancer PINs show a lower level of modular overlap than the normal ones. We therefore propose that some giant nodes of very high degree form in the cancer networks and reduce the overlap among network modules, even though the number of predicted molecular complexes is higher in cancer.
The study predicts some major molecular complexes that might act as the important regulators in cancer progression. The crucial nodes identified in this study can be potential drug targets to combat cancer.
Reductionist philosophy has directed biological research for decades [1, 2]. It has generated a large body of information in the biological sciences that enriches our understanding of life. Despite the enormous success of reductionism in decoding the structural and functional attributes of life at the cellular and molecular levels, it is becoming increasingly clear that biological functions can rarely be attributed to individual molecules considered in isolation. Instead, most biological phenomena emerge from the highly interactive complexity of the cell’s numerous constituents. Several recent approaches study biological systems in a more integrative and comprehensive way. Network models can play an important role in understanding complex systems built from multiple sets of interactions and in clarifying the origin of observed network characteristics [3–7]. Network biology has thus emerged as a revolutionary approach for the empirical study of complex biological systems [3, 8–12].
In cancer, genomic instability alters downstream signal transduction pathways and protein-protein interactions. Current understanding of the dynamic changes at the genomic and proteomic levels indicates that cancer is better regarded as a stochastic phenomenon than as the result of a few specific linear alterations. A thorough understanding of the comparative regulatory patterns of normal and cancerous cells requires detailed study of molecular interactions, and network biology is potentially useful in this regard. Its concepts can be used to decipher the differential interaction patterns between normal and cancer conditions by constructing biomolecular networks and analyzing them in depth.
Studying the modularity of biomolecular networks can be an efficient way to understand their inherent properties and to identify their crucial molecular sets and components, which is a basic challenge in the study of these networks. In most cases biomolecular networks show modular organization, meaning that the network can be divided into modules according to the density of connections among its nodes. More specifically, modules are subsets of a network whose nodes are comparatively densely connected to one another: there are many edges within a module but only sparse connections between modules [17, 18]. More generally, delineating modules helps in understanding the structural and functional features of networks, and has stimulated much empirical research as well as practical applications such as protein complex and drug target identification [19, 20].
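The notion of a module as a densely connected subset with sparse links to the rest of the network can be made concrete with a small Python sketch. The toy network, the two module assignments and the function names are hypothetical and serve only to illustrate the within-module versus between-module contrast; they are not taken from the PINs analyzed here.

```python
def within_density(nodes, edges):
    """Fraction of possible node pairs inside `nodes` that are joined by an edge."""
    nodes = set(nodes)
    possible = len(nodes) * (len(nodes) - 1) // 2
    if possible == 0:
        return 0.0
    present = sum(1 for u, v in edges if u in nodes and v in nodes)
    return present / possible


def between_density(a, b, edges):
    """Fraction of possible cross pairs (one node in `a`, one in `b`) joined by an edge."""
    a, b = set(a), set(b)
    possible = len(a) * len(b)
    if possible == 0:
        return 0.0
    present = sum(
        1 for u, v in edges if (u in a and v in b) or (u in b and v in a)
    )
    return present / possible


# Hypothetical undirected network: two triangles joined by one bridging edge.
# Each edge is listed once.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("C", "D"),
]
module_1 = {"A", "B", "C"}
module_2 = {"D", "E", "F"}

print(within_density(module_1, edges))             # dense inside the module
print(between_density(module_1, module_2, edges))  # sparse between modules
```

In this toy case every pair inside a module is connected, while only one of the nine possible cross-module pairs is, which is the pattern that module detection methods look for.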
The main objective of this paper was to study the differential modularity patterns of normal and cancer protein interaction networks (PINs). The PINs had been constructed for five tissues (bone, breast, colon, kidney and liver) in both normal and cancer conditions. Network construction was based on expression data for the protein molecules participating in ten major cancer signal transduction pathways. The MCODE (molecular complex detection) method was used to identify and analyze potential molecular complexes (modules or clusters) of the networks. A second method, ModuLand [23, 24], was used to identify and analyze the crucial nodes that form overlapping modules of the networks.
The primary data were retrieved from the differential expression database GeneHub-GEPIS (an online bioinformatics tool for inferring gene expression patterns in a large panel of normal and cancer tissues; http://research-public.gene.com/Research/genentech/genehub-gepis/index.html) and from the protein-protein interaction prediction tools PIPs (Human Protein-Protein Interaction Prediction; http://www.compbio.dundee.ac.uk/www-pips/) [26, 27] and STRING (a database of known and predicted protein interactions; http://string.embl.de/) [28–33]. The Cytoscape software package [34–36] was used to construct the protein interaction networks (PINs) (Additional files 1 and 2). For modularity analysis, two Cytoscape plugins, MCODE and ModuLand, were used. MCODE was used to identify and rank all possible molecular complexes of each network, and ModuLand was used to identify the crucial nodes forming the overlapping modules in those networks. MCODE detects densely connected regions in large protein interaction networks, which may represent molecular complexes. The method weights each vertex by its local neighborhood density and then traverses outward from a locally dense seed protein to isolate dense regions according to given parameters. The ModuLand method provides an algorithm for determining extensively overlapping network modules [23, 24]. In addition, it identifies several hierarchical layers of modules by representing the modules of a lower layer as meta-nodes of the next higher layer. The method predicts the function of the whole module and determines key nodes bridging two or more modules by assigning module cores.
Default parameter values were used for both the MCODE and ModuLand analyses. The MCODE settings were: Find Clusters: in Whole Network; Network Scoring (Advanced Option): a) Include Loops: off, b) Degree Cutoff: 2; Cluster Finding: a) Haircut: on, b) Fluff: off, c) Node Score Cutoff: 0.2, d) K-Core: 2, e) Max. Depth: 100. For the ModuLand analysis, the unweighted network option was selected with the default value of 1. ModuLand was run to identify and visualize overlapping modules, and the merge option (for modules) was applied with a threshold value of 1.0 to create the correlation matrix of the original modularization and the module correlation histogram. The Measures option of ModuLand was used to calculate the graph-related parameters of the overlapping modules.
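Vertex weighting by local neighborhood density can be illustrated with a minimal Python sketch. It assumes the scoring rule given in the MCODE publication by Bader and Hogue, as we understand it: the weight of a vertex is the highest k-core level of its closed neighborhood multiplied by the density of that k-core. The function names and the toy network are hypothetical, the Degree Cutoff setting is not modeled, and this is not the plugin's own code.

```python
def build_adjacency(edges):
    """Undirected adjacency map {node: set(neighbors)}; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adj, k):
    """Return the k-core of a graph by repeatedly removing nodes of degree < k."""
    core = {n: set(nb) for n, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        for n in [n for n, nb in core.items() if len(nb) < k]:
            for m in core.pop(n):
                if m in core:
                    core[m].discard(n)
            changed = True
    return core


def density(sub):
    """Edges present divided by edges possible (no self-loops)."""
    n = len(sub)
    if n < 2:
        return 0.0
    m = sum(len(nb) for nb in sub.values()) / 2
    return m / (n * (n - 1) / 2)


def vertex_weight(adj, v):
    """Highest k-core level of v's closed neighborhood times that core's density."""
    nodes = adj[v] | {v}
    sub = {n: (adj[n] & nodes) - {n} for n in nodes}
    best_k, best_core = 0, {}
    k = 1
    while True:
        core = k_core(sub, k)
        if not core:
            break
        best_k, best_core = k, core
        k += 1
    return best_k * density(best_core)


# Hypothetical toy network: a four-node clique (A, B, C, D) with one pendant node E.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
weights = {v: vertex_weight(adj, v) for v in adj}
print(weights)
```

Clique members receive a high weight because their neighborhood is a dense 3-core, whereas the pendant node sits in a sparse neighborhood and scores low, so it would not be chosen as a seed.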
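The role of the Node Score Cutoff and Max. Depth settings can be sketched as follows. The sketch assumes, following our reading of the MCODE publication, that a complex is grown outward from a seed and that a neighbor is admitted when its weight lies within the cutoff fraction of the seed's weight. The function name, the toy network and the weights are hypothetical, and the Haircut, Fluff and K-Core post-processing steps are not modeled.

```python
def grow_complex(adj, weights, seed, node_score_cutoff=0.2, max_depth=100):
    """Breadth-first expansion from `seed`, admitting neighbors whose weight
    is at least (1 - node_score_cutoff) times the seed's weight."""
    threshold = weights[seed] * (1 - node_score_cutoff)
    members = {seed}
    frontier = [seed]
    depth = 0
    while frontier and depth < max_depth:
        next_frontier = []
        for node in frontier:
            for neighbor in adj[node]:
                if neighbor not in members and weights[neighbor] >= threshold:
                    members.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        depth += 1
    return members


# Hypothetical toy network: a four-node clique with one pendant node E.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
# Hypothetical vertex weights, standing in for the output of a weighting step.
weights = {"A": 3.0, "B": 3.0, "C": 3.0, "D": 3.0, "E": 1.0}

print(sorted(grow_complex(adj, weights, "A")))  # default cutoff of 0.2
```

With the default cutoff of 0.2 only nodes weighted at 80% or more of the seed are admitted, so the pendant node is left out; a looser cutoff admits progressively sparser neighbors.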
Results and discussion
Molecular complex detection
The molecular complex detection (MCODE) method has been used to evaluate a yeast protein interaction compilation against known molecular complex data from mass spectrometry of the proteome [19, 37]. This led to the observation that highly interconnected, or dense, regions of the network may represent molecular complexes. The number of possible modules that can be regarded as molecular complexes differs between normal and cancer conditions in each of the five tissues (Figures 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10). The numbers of ranked molecular complexes in the normal and cancer protein interaction networks are, respectively, 15 and 19 for bone, 22 and 28 for breast, 22 and 27 for colon, 21 and 30 for kidney, and 19 and 28 for liver (Figures 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10). In every tissue, the number of possible molecular complexes increases in cancer. A statistical significance test supports this difference: the number of molecular complexes is significantly higher in cancer PINs than in normal PINs (p = 0.02, at a significance level of p ≤ 0.05) (Additional file 3).
Kidney shows the largest absolute increase in predicted molecular complex number in cancer relative to the normal state (from 21 to 30; Figures 7 and 8), matched only by liver (from 19 to 28). Besides the number of molecular complexes, all other parameters of the molecular complex networks (scores, nodes and edges) differ between normal and cancer conditions for each tissue (Figures 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 and Additional file 4).
Because the numbers of edges and nodes are higher in the cancer networks than in the normal networks for all five tissues, overall clustering is also enhanced in the cancer networks. The normal and cancer networks were constructed mainly from expression and interaction data for protein molecules participating in major cancer signal transduction pathways, as described in our previous paper. The increase in edges and nodes in cancer tissues relative to normal tissues can be interpreted as an enhancement of molecular interactions at the proteomic level in cancer. It is worth noting that the graphical representations of these differences are based on previously validated experimental data on gene expression and protein interaction. The observed differences suggest that, in cancer tissue, more proteins interact with one another during cancer signaling.
A recent report indicates that disease genes tend to have higher degree and connectivity than non-disease genes in terms of protein expression and interaction. Some studies also indicate that proteins encoded by cancer genes can interact strongly with other proteins and show higher connectivity than in the normal condition. There is also evidence of overrepresentation of 10% of protein interaction clusters within the cancer interactome when compared with normal protein interaction networks.
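One way to examine a difference like this, using only the complex counts quoted above, is an exact two-sample permutation test. The Python sketch below is an illustration under that assumption: the text does not state which test was used for Additional file 3, so this is not necessarily the authors' procedure, and the function name is hypothetical.

```python
from itertools import combinations

# Ranked molecular complex counts per tissue, as reported in the text.
normal = {"bone": 15, "breast": 22, "colon": 22, "kidney": 21, "liver": 19}
cancer = {"bone": 19, "breast": 28, "colon": 27, "kidney": 30, "liver": 28}


def exact_permutation_counts(group_a, group_b):
    """Count relabelings in which the 'group_b' labels receive a total at least
    as large as observed. Group sizes are fixed, so comparing totals is
    equivalent to comparing the difference in means (one-sided test)."""
    pooled = list(group_a) + list(group_b)
    observed = sum(group_b)
    size = len(group_b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), size):
        total += 1
        if sum(pooled[i] for i in idx) >= observed:
            hits += 1
    return hits, total


hits, total = exact_permutation_counts(normal.values(), cancer.values())
p_one_sided = hits / total

# Per-tissue increase in complex number from normal to cancer.
increase = {tissue: cancer[tissue] - normal[tissue] for tissue in normal}

print(f"{hits} of {total} relabelings, one-sided p = {p_one_sided:.4f}")
print(increase)
```

For these ten counts, 5 of the 252 possible relabelings give the cancer group a total at least as large as the observed one, a one-sided p of about 0.02. With only five tissues per group, any such test has limited power and should be read as indicative.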
Overlapping module and crucial node identification from the networks
In bone, an overlapping module is present in the normal condition but absent in cancer (Figure 11). In all other tissues the overlapping modules differ between the normal and cancer states (Figures 12, 13, 14, 15, 16, 17, 18 and 19). In breast, kidney and liver, the numbers of edges and nodes decrease in cancer and most of the molecules forming the overlapping networks change (Figures 12, 13, 16, 17, 18 and 19). In colon, the numbers of edges and nodes remain constant but most of the molecules forming the overlapping modules are altered (Figures 14 and 15). The greatest change in the overlapping module, in terms of node and edge numbers and of the molecules forming the overlapping networks, occurs in kidney (Figures 16 and 17). The nodes of the overlapping module can be regarded as the crucial nodes of the respective network, that is, the nodes with module centrality (the central nodes of the related modules formed by ModuLand). The important network properties of the overlapping modules are compared in Tables 1, 2, 3, 4, 5, 6, 7, 8 and 9.
The correlation matrix and correlation histogram for each tissue, in both normal and cancer conditions, describe the correlation among the nodes of the overlapping modules (Tables 10, 11, 12, 13, 14, 15, 16, 17 and 18 and Figures 20, 21, 22, 23, 24, 25, 26, 27 and 28). The correlation matrix represents all possible interactions of the overlapping modules, whereas the correlation histogram represents only the interactions that are valid at a given threshold (here 1.0). Both show that the interactions among the nodes of overlapping modules differ between normal and cancer cases (Tables 10, 11, 12, 13, 14, 15, 16, 17 and 18 and Figures 20, 21, 22, 23, 24, 25, 26, 27 and 28). A statistical significance test supports this difference at the p ≤ 0.1 level: the numbers of valid interactions (at threshold 1.0) of overlapping modules differ between cancer and normal PINs (p = 0.08) (Additional file 3).
In bone, there is no correlation matrix or correlation histogram for cancer because there is no overlapping module (Table 10; Figure 20). In breast, kidney and liver, the correlation matrix and correlation histogram show a reduced number of interactions in cancer (Tables 11, 12, 15, 16, 17 and 18; Figures 21, 22, 25, 26, 27 and 28). In colon, the number of interactions remains the same (Tables 13 and 14; Figures 23 and 24). The correlation frequency in the histograms varies between the two conditions because the molecules representing the nodes of the overlapping modules differ (Figures 20, 21, 22, 23, 24, 25, 26, 27 and 28).
Crucial nodes identified from overlapping modules have been shown to be biologically significant in a recently reconstructed high-quality Staphylococcus aureus metabolic network model [41–43]. Identifying functional subgraphs of cancer protein interaction networks that represent the important modules and their components has been a key issue in several papers [44, 45].
The parameter values used for the MCODE and ModuLand analyses were the same for the normal and cancer networks and lay within the range suggested by the plugin developers. It can therefore be assumed that the parameter values have no significant effect on the conclusions. They may have some minor influence on the results, but this should not affect the qualitative comparison between normal and cancer PINs.
The MCODE analysis shows that network clustering increases in cancer in every tissue. The ModuLand analysis indicates that the number of crucial nodes with module centrality decreases in cancer (except breast cancer), reflecting a reduced level of module overlap in cancer networks. The degree distribution of the networks offers a possible explanation for these opposing trends in clustering and overlap (Figures 29, 30, 31, 32, 33, 34, 35, 36, 37 and 38). In all cancer PINs, a few nodes of much higher degree are found, unlike in the normal PINs. From this observation, we propose that some giant nodes form in cancer PINs, accounting for a large share of the total degree and leaving many nodes randomly dispersed. This reduces the number of nodes with module centrality, so that the overlapping modules that form have fewer nodes and edges.
The study gives a clear picture of the differential modular nature of normal and cancer protein interaction networks. Normal and cancer protein interaction networks (PINs) show observable differences in both molecular complex and crucial node identification. The cancer PINs show higher predicted clustering but lower overlap of network modules than the normal ones. The changes in predicted molecular complexes between normal and cancer PINs can be a useful tool for deciphering the conversion of normal cells to cancer cells. The major (higher ranked) molecular complexes resulting from this study can be combined with experimental evidence to identify the core regulators responsible for cancer. The identified crucial nodes can be recommended as potential drug targets against cancer and can be assessed further in experimental studies. The study can be extended by including whole proteomic networks for normal and cancer cells, derived from high-throughput proteomic methods, and analyzing them with comprehensive computational tools. The networks considered here are unweighted and static, which limits how well they reflect the real dynamic physical nature of living tissues. Further work is therefore required to capture these dynamics and to overcome the present limitations of network-level understanding of biological processes. Moreover, the protein interaction study has to be combined with the corresponding gene regulatory networks to draw firmer conclusions about the predicted modularity.
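The degree-based argument can be illustrated with a short Python sketch that tabulates a degree distribution from an edge list and flags unusually high-degree nodes. The edge list, the function names and the hub criterion (degree at least twice the mean) are assumptions made for illustration only; the source does not define a numerical threshold for a giant node.

```python
from collections import Counter


def node_degrees(edges):
    """Degree of each node in an undirected edge list (self-loops ignored)."""
    deg = Counter()
    for u, v in edges:
        if u != v:
            deg[u] += 1
            deg[v] += 1
    return deg


def degree_distribution(deg):
    """Map each degree value to the number of nodes having that degree."""
    return dict(sorted(Counter(deg.values()).items()))


def high_degree_nodes(deg, factor=2.0):
    """Nodes whose degree is at least `factor` times the mean degree
    (an illustrative criterion, not one taken from the source)."""
    mean = sum(deg.values()) / len(deg)
    return sorted(n for n, d in deg.items() if d >= factor * mean)


# Hypothetical toy network: one hub H linked to five nodes, plus one extra edge.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("H", "E"), ("A", "B")]

deg = node_degrees(edges)
print(degree_distribution(deg))
print(high_degree_nodes(deg))
```

In this toy network the hub holds most of the edges while the remaining nodes have few connections among themselves, the kind of pattern the giant-node proposition describes for cancer PINs.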
Kitano H: Computational systems biology. Nature. 2002, 420: 206-210.
Oltvai ZN, Barabasi AL: Life’s complexity pyramid. Science. 2002, 298: 763-764.
Barabasi AL, Oltvai ZN: Network biology: understanding the cell’s functional organization. Nat Rev Genet. 2004, 5: 101-113.
Kininmonth S, van Oppen M, Castine S, et al: The small genetic world of Seriatopora hystrix. Netw Biol. 2012, 2 (1): 1-15.
Zhang WJ: How to construct the statistic network? An association network of herbaceous plants constructed from field sampling. Netw Biol. 2012, 2 (2): 57-68.
Zhang WJ: Modeling community succession and assembly: A novel method for network evolution. Netw Biol. 2012, 2 (2): 69-78.
Ibrahim SS, Eldeeb MAR, Rady MAH, et al: The role of protein interaction domains in the human cancer network. Netw Biol. 2011, 1 (1): 59-71.
Newman MEJ: Networks: An Introduction. 2010, UK: Oxford University Press
Dormann CF: How to be a specialist? Quantifying specialisation in pollination networks. Netw Biol. 2011, 1 (1): 1-20.
Martinez-Antonio A: Escherichia coli transcriptional regulatory network. Netw Biol. 2011, 1 (1): 21-33.
Tacutu R, Budovsky A, Yanai H, et al: Immunoregulatory network and cancer-associated genes: Molecular links and relevance to aging. Netw Biol. 2011, 1 (2): 112-120.
Zhang WJ: Constructing ecological interaction networks by correlation analysis: hints from community sampling. Netw Biol. 2011, 1 (2): 81-98.
Mamun MA, Rahman MS, Islam MF, Honi U, Sobhani ME: Molecular biology and the riddle of cancer: The 'Tom and Jerry’ show. Oncol Rev. 2011, 5 (4): 215(8)-
Mirzarezaee M, Araabi BN, Sadeghi M: Comparison of hubs in effective normal and tumor protein interaction networks. Basic Clin Neurosci. 2010, 2 (10): 44-50.
Zhou TT: Network systems biology for targeted cancer therapies. Chin J Cancer. 2012, 31 (3): 134-141.
Junker BH, Koschutzki D, Schreiber F: Exploration of biological network centralities with CentiBiN. BMC Bioinforma. 2006, 7: 219-
Newman MEJ: Modularity and community structure in networks. Proc Natl Acad Sci USA. 2006, 103: 8577-8582.
Fortunato S: Community detection in graphs. Phys Rep. 2010, 486: 75-174.
Gavin AC, Aloy P, Grandi P, et al: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440: 631-636.
Krogan NJ, Cagney G, Yu HY, et al: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440: 637-643.
Rahman KMT, Islam MF, Banik RS, Honi U, Diba FS, Sumi SS, Kabir SMT, Akhter MS: Changes in protein interaction networks between normal and cancer conditions: Total chaos or ordered disorder?. Netw Biol. 2013, 3 (1): 15-28.
Bader GD, Hogue CWV: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinforma. 2003, 4 (2): 1-27.
Kovacs IA, Palotai R, Szalay-Beko M, Csermely P: Community landscapes: a novel, integrative approach for the determination of overlapping network modules. PLoS ONE. 2010, 7: e12528-
Szalay-Beko M, Palotai R, Szappanos B, Kovacs IA, Papp B, Csermely P: ModuLand plug-in for Cytoscape: Determination of hierarchical layers of overlapping network modules and community centrality. Bioinform. 2012, 28 (16): 2202-2204.
Zhang Y, Luoh SM, Hon LS, Baertsch R, Wood WI, Zhang Z: GeneHub-GEPIS: digital expression profiling for normal and cancer tissues based on an integrated gene database. Nucleic Acids Res. 2007, 35: W152-W158.
McDowall MD, Scott MS, Barton GJ: PIPs: Human protein-protein interactions prediction database. Nucleic Acids Res. 2009, 37: D651-D656.
Scott MS, Barton GJ: Probabilistic prediction and ranking of human protein-protein interactions. BMC Bioinforma. 2007, 8: 239-260.
Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, Jensen LJ, von Mering C: The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2010, 39: D561-D568.
Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, Von Mering C: STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 2009, 37: D412-D416.
Von Mering C, Jensen LJ, Kuhn M, Chaffron S, Doerks T, Krüger B, Snel B, Bork P: STRING 7--recent developments in the integration and prediction of protein interactions. Nucleic Acids Res. 2007, 35: D358-D362.
Von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 2005, 33: D433-D437.
von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B: STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 2003, 31 (1): 258-261.
Snel B, Lehmann G, Bork P, Huynen MA: STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Res. 2000, 28 (18): 3442-3444.
Smoot M, Ono K, Ruscheinski J, Wang PL, Ideker T: Cytoscape 2.8: new features for data integration and network visualization. Bioinform. 2011, 27 (3): 431-432.
Cline MS, Smoot M, Cerami E, Kuchinsky A, et al: Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007, 2: 2366-2382.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11): 2498-2504.
Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415: 141-147.
Tong AH, Drees B, Nardelli G, Bader GD, Brannetti B, Castagnoli L: A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science. 2002, 295: 321-324.
Jonsson PF, Bates PA: Global topological features of cancer proteins in the human interactome. Bioinform. 2006, 22 (18): 2291-2297.
Sun J, Zhao Z: A comparative study of cancer proteins in the human protein-protein interaction network. BMC Genomics. 2010, 11 (Suppl 3): S5-
Ding DW: Identification of crucial nodes in biological networks. Netw Biol. 2012, 2 (3): 118-120.
Ding DW, Liu T, Lu KZ: Centralization of complex networks: Application to metabolic networks. Comput Appl Chem. 2008, 25: 1508-1510.
Ding DW, Li LN: Why giant strong component is so important for metabolic networks?. Rivista di Biologia / Biol Forum. 2009, 102: 12-16.
Jonsson PF, Cavanna T, Zicha D, Bates PA: Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis. BMC Bioinforma. 2006, 7: 2-
Hu K, Chen F: Identification of significant pathways in gastric cancer based on protein-protein interaction networks and cluster analysis. Genet Mol Biol. 2012, 35 (3): 701-708.
The authors would like to acknowledge Mahbub-E-Sobhani and Md. Shaifur Rahman for their endless inspiration, support and guidance throughout the work. The authors would also like to acknowledge Mehdi Rahman, Hannan Hossain, Riasat Azim, Abdullah Mahmud-Al-Rafat, Apurba Majumder, Saimoon Rahman Imran and Shamim Reza Ronju for their help and assistance.
The authors declare that they have no competing interests.
MFI and KMTR have contributed to idea development, networks construction, computational analysis and interpretation. MMH, RSB, SR, SSS, FMNH, MTST and AU have contributed to data mining, maintenance and processing. All the authors have contributed equally to the writing of the paper. All authors provided critical feedback on the manuscript and read and approved the final manuscript.
Md Fahmid Islam, Md Moinul Hoque, Rajat Suvra Banik, Sanjoy Roy, Sharmin Sultana Sumi, F M Nazmul Hassan, Md Tauhid Siddiki Tomal, Ahmad Ullah and K M Taufiqur Rahman contributed equally to this work.