comoR: a software for disease comorbidity risk assessment

Moni, Mohammad Ali; Liò, Pietro

doi:10.1186/2043-9113-4-8

Methodology
Open access
Published: 23 May 2014

comoR: a software for disease comorbidity risk assessment

Mohammad Ali Moni^1,2 &
Pietro Liò¹

Journal of Clinical Bioinformatics volume 4, Article number: 8 (2014) Cite this article

7531 Accesses
46 Citations
1 Altmetric
Metrics details

Abstract

Background

The diagnosis of comorbidities, which refers to the coexistence of different acute and chronic diseases, is difficult due to the modern extreme specialisation of physicians. We envisage that a software dedicated to comorbidity diagnosis could result in an effective aid to the health practice.

Results

We have developed an R software comoR to compute novel estimators of the disease comorbidity associations. Starting from an initial diagnosis, genetic and clinical data of a patient the software identifies the risk of disease comorbidity. Then it provides a pipeline with different causal inference packages (e.g. pcalg, qtlnet etc) to predict the causal relationship of diseases. It also provides a pipeline with network regression and survival analysis tools (e.g. Net-Cox, rbsurv etc) to predict more accurate survival probability of patients. The input of this software is the initial diagnosis for a patient and the output provides evidences of disease comorbidity mapping.

Conclusions

The functions of the comoR offer flexibility for diagnostic applications to predict disease comorbidities, and can be easily integrated to high–throughput and clinical data analysis pipelines.

Introduction

The term “comorbidity” refers to the coexistence or presence of multiple diseases or disorders in relation to a primary disease or disorder in a patient [1]. Multimorbidity can be also defined as coexistence of two or more diseases, but no index disease is considered [2]. A comorbidity relationship between two diseases exists whenever they appear simultaneously in a patient more than chance alone. It represents the co–occurrence of diseases or presence of different medical conditions one after another in the same patient [3, 4]. Some diseases or infections can coexist in one person by coincidence, and there is no pathological association among them. However, in most of the cases, multiple diseases (acute or chronic events) occur together in a patient because of the associations among diseases. These associations can be due to direct or indirect causal relationships and the shared risk factors among diseases [5, 6]. For an instance, people with HIV-1 appear to have a markedly higher rate of end-stage renal disease (ESRD) than the healthy people [7]. It is because some of the risk factors associated with HIV-1 acquisition are the same as those that lead to kidney disease. Patients with chronic kidney disease increase risk of cardiovascular mortality [8]. Thus HIV-1 infections is associated with cardiovascular mortality.

One of the most challenging problems in biomedical research is to understand the complex correlation mechanisms of human diseases. Recent research has increasingly demonstrated that many seemingly dissimilar diseases have common molecular mechanisms. Exploring relations between genes and diseases at the molecular level could greatly facilitate our understanding of pathogenesis, and eventually lead to better diagnosis and treatment. Diseases are more likely to be comorbid if they share associated genes [3]. However, some diseases have direct positive association among them while other diseases may have indirect positive association among them through the biological pathways. The analysis of pathway-disease associations, in addition to gene-disease associations, could be used to clarify the molecular mechanism of a disease. Ashley, Butte, Wheeler, Chen, Klein, Dewey, Dudley, Ormond, Pavlovic, Morgan, Pushkarev, Neff, Hudgins, Gong, Hodges, Berlin, Thorn, Sangkuhl, Hebert, Woon, Sagreiya, Whaley, Knowles, Chou, Thakuria, Rosenbaum, Zaranek, Church, Greely and Quake et al. analysed personal genome, gene-environment interactions and conditionally dependent risks for the clinical assessment [9]. Population-based disease association is also useful in conjunction with molecular and genetic data to discover the molecular origins of disease and disease comorbidity [4]. Patient medical records contain important clarification regarding the co-occurrences of diseases affecting the same patient. To estimate the correlation starting from disease co-occurrence, we need to quantify the strength of the comorbidity risk. Disease Ontology (DO) is also helpful to promote the investigation of diseases and disease risk factors [10].

Comorbidity is an important factor for better risk stratification of patients and treatment planning. The more precise predictions can be made by taking comorbidity into account, the more accurate patient management could be possible. Comorbidity has a significant predictive value on overall survival [11]. Older persons’ survival is highly dependent on it. Comorbidities influence patients treatments and confound survival analysis [12]. For an instance, comorbidity has a major effect on survival in gynaecological cancer, particularly for cancer of the cervix [13]. Many researchers have developed survival analysis software for predicting outcomes of the disease [14–23]. However, all of them are based on the single disease. But survival of patient depends on the disease comorbidity, environment, patient age and treatment plan. Kan et al. performed survival analysis of elderly dialysis patients considering comorbidity risk [24]. They observed that the life expectancy decreases with increasing the number of comorbid diseases. So it is important to consider the comorbidity for more accurate survival prediction.

We have developed an R software comoR to compute statistically significant associations among diseases and to predict disease comorbidity risk by using diverse set of data. The input of this software is the initial diagnosis for a patient. To perform the computation of the comorbidity risk, this software uses clinical, gene expression, pathways and ontology data. It provides different comorbidity assessment; integration of genetic information with the comoR output data could be used to infer causal relationships among diseases and to predict more accurate survival probability of patients. The goal of this software is to assist a medical practitioner in decision making in potential treatment.

Implementation

The comoR provides a number of processing options to find comorbidity of a disease. R bioconductor annotation data packages “org.Hs.eg.db” and “DO.db” are used for the annotation and mapping between gene symbol, Entrez id, OMIM (Online Mendelian Inheritance in Man) id and DO (Disease Ontology) term [25]. comoR is also dependent on “DOSE” bioconductor package for the mapping of DO and DOLite [26]. A set of differential expressed gene symbols/Entrez ids/OMIM id/3 or 5 digit ICD-9-CM code of the disease can be used as input of comoR functions. Flow diagram of the comoR software is shown in Figure 1.

Comorbidity based on clinical information

Patient medical records contain important clarification regarding the co-occurrences of diseases affecting the same patient. Two diseases are connected if they are co-expressed in a significant number of patients in a population [4]. To estimate the correlation starting from disease co-occurrence, we need to quantify the strength of the comorbidity risk. We used two comorbidity measures to quantify the strength of comorbidity associations between two diseases: (i) the Relative Risk (fraction between the number of patients diagnosed with both diseases and random expectation based on disease prevalence) as the quantified measures of comorbidity tendency of two disease pairs; and (ii) ϕ-correlation (Pearsons correlation for binary variables) to measure the robustness of the comorbidity association. We used the relative risk R R_{i
j} and ϕ-correlation ϕ_{i
j} of observing a pair of diseases i and j affecting the same patient. The R R_{i
j} allows us to quantify the co-occurrence of disease pairs compared with the random expectation. When two diseases co-occur more frequently than expected by chance, we will get R R_{i
j}>1 and ϕ_{i
j}>0. The two comorbidity measures are not completely independent of each other. We included edges between disease pairs for which the co-occurrence is significantly greater than the random expectation based on population prevalence of the diseases. Clinical information is from the http://www.icd9data.com in the ICD-9-CM format and collected from [4]. The function comorbidityPatients of the comoR package is able to take input an OMIM id/3 or 5 digit ICD-9-CM code of a disease or a list of gene symbols/Entrez ids and provides comorbidity pattern of diseases based on the relative risk and ϕ-correlation between two diseases. comorbidityPatients requires two parameters id list and id type (see details in the Additional file 1). An example and its output (Figure 2) is as follows:

Gene–disease association

comoR makes use of OMIM [27] to explore the genetic association between diseases. Two diseases are connected if they share at least one gene that is statistically significant dysregulated [28]. comoR computes disease-disease association by adopting semantic similarity measures and hypergeometric test. OMIM diseases ids are mapped with ICD-9-CM codes based on the literature [3]. Neighbourhood based benchmark method is used to identify the comorbidity pattern among diseases [28]. We build the associated network as a bipartite graph; each common neighbour node is selected based on the Jaccard coefficient method [28]. comorOMIM function of comor takes as input any of these three options: a list of gene symbols, a list of Entrez gene ids or an OMIM id. This function provides disease comorbidity associations and network based on the disease-gene associations. comorOMIM requires two parameters id list and id type (see details in the Additional file 1). An example and its output (Figure 3) is as follows:

Pathway–disease association

The analysis of pathway-disease associations is important to investigate the molecular mechanism of a disease. We have used Kegg pathway and disease database (http://www.genome.jp/kegg/) and developed a function comorbidityPath to predict the comorbidity risk based on disease pathway association [29]. This software identifies the disease–disease associations using the associations among molecular pathways and their associated diseases. Hypergeometric test is used for extracting associations among pathways and diseases; graph topological structure is used to measure the similarity between diseases [30]. comorbidityPath function takes as input any of the following options: a list of gene symbols, a list of Entrez gene ids or an OMIM id. This function provides disease comorbidity associations and network based on the pathway-disease associations. comorbidityPath requires two parameters id list and id type (see details in the Additional file 1). An example and its output (Figure 4) is as follows:

Ontology and causal inference to evaluate comorbidity

DO provides an open source ontology for the integration of biomedical data that is associated with human diseases [10]. Terms in DO include disease names and disease-related concepts, which are organised in a directed acyclic graph (DAG) [31]. Disease Ontology Lite (DOLite) gives more interpretable results for gene-disease association tests [32]. DO and DOLite enable us to analyse disease association by adopting semantic similarity measures to expand our understanding of the relationships between different diseases. The semantic comparisons of DO provides quantitative ways to compute similarities between diseases [30]. So we have developed a function comorbidityDO for the computation of DO and DOLite based disease comorbidity in an ontology sense. It is a DO-based enrichment analysis function to measure association among diseases and to explore their functional associations from gene sets. Hypergeometric geometric test is used to compute whether the number of selected genes associated with the DO term is larger than expected. Gene set enrichment analysis are used for predicting the significance of gene–disease and disease–disease associations. comorbidityDO function operates by using either of the following input: DO id, a list of gene symbols or Entrez gene ids of the patient sample. This function provides disease comorbidity associations and network based on the DO and DOLite. comorbidityDO requires two parameters id list and id type (see details in the Additional file 1). An example and its output (Figure 5) is as follows:

Comorbidity associations among diseases, i.e. the output of comoR, could be a useful input for causal inference software, precisely pcalg to predict the causal inference relationships among the comorbidity diseases. In the comoR, we have included a function comorbidityCausality to predict the causality inference among the diseases using the PC, RFCI, and FCI algorithms of the pcalg [33]. The directed edges of the network show the direction of the cause-effect relationships among diseases. Finally a network disease analysis leads to a patient comorbidity map which is a powerful visualisation of the patient condition. Nodes of the comorbidity map represent diseases and edge between the nodes represents comorbidity risk. Noteworthy, if related molecular information is available, exercise and diet could be also incorporated and be used in the comorbidity map. comorbidityCausality requires two parameters: comorbidity associations of comoR output and preprocessed gene expression data (see details in the Additional file 1). An example and its output (Figure 6) is as follows:

Methods

We used two comorbidity measures to quantify the strength of comorbidity associations between two diseases - Relative Risk (R R_{i
j}) as the quantified measures of comorbidity tendency of two disease pairs and ϕ-correlation (ϕ_{i
j}) to measure the robustness of the comorbidity association, which are calculated by using following two equations:

R R_{ij} = \frac{C_{ij} / N}{(P_{i} P_{j} - C_{ij}) / N^{2}} = \frac{C_{ij} N}{P_{i} P_{j} - C_{ij}}

(1)

ϕ_{ij} = \frac{C_{ij} N - P_{i} P_{j}}{\sqrt{(P_{i} P_{j} - C_{ij}) (N - P_{i}) (N - P_{j})}}

(2)

where N is the total number of patients in the population, P_i and P_j are incidences/prevalences of diseases i and j respectively. C_{i
j} is the number of patients that have been diagnosed with both diseases i and j, and P_iP_j is the random expectation based on disease prevalence. The significance of the relative risk R R_{i
j} is calculated by using the Katz et al. method to estimate confidence intervals [34]. The 99% confidence interval for the R R_{i
j} between two diseases i and j is calculated by: Lower bounds of the confidence interval (L B)=R R_{i
j}∗e x p(−2.56∗σ_{i
j}) and Upper bounds of the confidence interval (U B)=R R_{i
j}∗e x p(2.56∗σ_{i
j}), where σ_{i
j} is given by: $σ_{ij} = \frac{1}{C_{ij}} + \frac{1}{P_{i} P_{j}} - \frac{1}{N} - \frac{1}{N^{2}}$ . Disease pairs within the 99% confidence interval are only considered if the LB value is larger than 1 when R R_{i
j} is larger than 1, or if the UB value is smaller than 1 when R R_{i
j} is smaller than 1. For ϕ_{i
j}>0 comorbidity is larger than expected by chance and for ϕ_{i
j}<0 comorbidity is smaller than expected by chance. We can determine the significance of ϕ≠0 by performing a t-test. This consists of calculating t according to the formula: $t = \frac{ϕ \sqrt{n - 2}}{\sqrt{1 - ϕ^{2}}}$ , where n is the number of observations used to calculate ϕ.

Diseases are connected when the diseases share at least one significant dysregulated gene or signaling pathway. Let a particular set of human diseases D and a set of human genes G, gene-disease associations attempt to find whether gene g∈G is associated with disease d∈D. If G_i and G_j, the sets of significant up and down dysregulated genes associated with diseases i and j respectively, then the number of shared dysregulated genes $(n_{ij}^{g})$ associated with both diseases i and j is as follows:

n_{ij}^{g} = N (G_{i} \cap G_{j})

(3)

The co-occurrence refers to the number of shared genes or pathways between two diseases. Each common neighbour is calculated based on the Jaccard Index method to measure the strength of co-occurrence, where association score for a node pair is as:

As s_{i, j} = \frac{N (G_{i} \cap G_{j})}{N (G_{i} \cup G_{j})}

(4)

Hypergeometric test is implemented for enrichment analysis [31]. It is used to assess whether the number of selected genes or pathways associated with disease is larger than expected. To determine whether any disease annotate a specified list of genes at frequency greater than that would be expected by chance, comoR calculates a p-value using the hypergeometric distribution. Significance of the enrichment analysis is assessed by the hypergeometric test and the p−v a l u e is adjusted by false discovery rate (FDR). The hypergeometric p-value is calculated using the following formula:

p - value = 1 - \sum_{i = 0}^{k - 1} \frac{(\binom{M}{i}) (\binom{N - M}{n - i})}{(\binom{N}{n})}

(5)

where N is the total number of reference genes, M is the number of genes that are associated to the disease of interest, n is the size of the list of genes of interest and k is the number of genes within that list which are associated to the disease.

Graph-based methods using the topology of DO graph structure is used to compute semantic similarity. We have adapted the method for measuring the functional similarity of protein-coding genes based on GO terms [30]. Semantic values of DO term or diseases were calculated based on the DAG of corresponding diseases. Semantic similarity for any pair of DO term or diseases between DA and DB is calculated based on disease semantic value. Formally, a DO term or a disease A can be represented as a graph D A G_A=(A,T_A,E_A), where T_A is the set of all diseases or DO terms in D A G_A, including term A itself and all of its ancestor terms in the DO graph, and E_A is the set of corresponding edges that connect the DO terms in D A G_A. To encode the semantic of a DO term in a measurable format to enable a quantitative comparison, Wang firstly defined the semantic value of term A as the aggregate contribution of all terms in D A G_A to the semantics of term A, terms closer to term A in D A G_A contribute more to its semantics [30]. Thus, we defined the contribution of a disease or DO term t in D A G_A to the semantics of DO term A as the D value of disease or term t related to disease or term A, D_A(t), which can be calculated as:

\{\begin{array}{l} D_{A} (A) = 1 \\ D_{A} (t) = max {w_{e} * D_{A} (t^{'}) | t^{'} \in children of (t)} if t \neq A \end{array}

(6)

where w_e is the semantic contribution factor for edge e (e∈E_A) linking term or disease t with its child term or disease $t^{^{'}}$ . It is assigned between 0 and 1 according to the types of associations. Term A contributes to its own is defined as one. Then the semantic value of DO term or disease A, D V(A) is calculated as:

DV (A) = \sum_{t \in T_{A}} D_{A} (t)

(7)

Thus given two DO terms or diseases A and B, the semantic similarity between these two terms or disease is defined as:

S_{sim} (A, B) = \sum_{t \in T_{A} \cap T_{B}} \frac{D_{A} (t) + D_{B} (t)}{DV (A) + DV (B)}

(8)

where D_A(t) is the semantic value of disease t related to DO term or disease A and D_B(t) is the semantic value of DO term or disease t associated to DO term or disease B.

Comparison with similar software

An R package “comorbidities” that has functions to categorize comorbidites into the Deyo-Charlson index, the original Elixhauser index of 30 comorbidities, and the AHRQ comorbidity index of 29 diagnoses [35, 36]. This package provides total comorbidity count or the total Charlson score. But comoR provides relative risk, ϕ-correlation, associated genes, pathway and p-value between the comorbidity diseases. It could provide comorbidity associations among all diseases. So comoR is more useful than “comorbidities”.

Most of the researchers have done the survival analysis and developed tools considering a single infection or disease. Cho et al. developed robust likelihood-based survival modeling for microarray data [18] and Zhang et al. developed Net-Cox model by integrating network information into the Cox’s proportional hazard model for the survival prediction [37]. However, these approaches for analysing the death and recurrence outcomes are based on the single disease (e.g. ovarian cancer). But the survival of a patient depends on the disease comorbidity, treatment plan and environmental effect [38]. To observe the association among diseases through the biomarker genes, we have compared the significance of genes for each disease using network-based Cox regression approach. We have calculated network (genes co-expression and functional linkage networks) based penalised regression coefficient (β) values of 5 genes in five diseases conditions(breast cancer, colon cancer, ovarian cancer, liver cancer and osteosarcoma) by using Net-Cox. For this comparative study we have considered five NCBI GEO data sets, accession numbers are GSE3494, GSE17536, GSE26712, GSE10141 and GSE21257 [39–43]. The comparative coefficient (β) values of five significant genes (BRCA1, BRCA2, PTEN, TGFB2 and TP53) in 5 diseases conditions are reported in the Table 1. It is observed that diseases may coexist in the same patient. Our software is able to predict occurrence of other diseases in relation to primary disease. So the comorbidity output of our software could be helpful for more accurate survival analysis. So, comoR could be integrate as a pipeline with the survival analysis softwares.

Table 1 Comparative values of genes co-expression and functional linkage network based penalised Cox regression coefficient ( β ) of five significant genes (BRCA1, BRCA2, PTEN, TGFB2 and TP53) in five diseases conditions (breast cancer, colon cancer, ovarian cancer, liver cancer and osteosarcoma)

Full size table

Discussion

Exploring associations among diseases at the molecular and clinical levels could greatly facilitate our understanding of pathogenesis, and eventually lead to better diagnosis and treatment. If two diseases have associated comorbidity, the occurrence of one of them in a patient may increase the likelihood of developing the other diseases. Development of methods integrating genetic and clinical data will assist clinical decision making and represent a large step towards individualised medicine. Hidalgo et al. analysed comorbidity associations using the medical records [4]. To our knowledge, there is no available R software package for the prediction of disease comorbidities. An R package “comorbodoties” is able to categorises ICD-9-CM codes based on published 30 comorbidity indices using Deyo adaptation of Charlson index and the Elixhauser index [35, 36]. We have developed comoR, an R package that implements different statistical approach for the prediction of disease comorbidity using divers set of data.

Advances in high-throughput molecular assay technologies in the fields of genomics, proteomics and other omics is increasing the diagnostic and therapeutic strategies, and systems-driven strategies for personalised treatment. In particular, the availability of these data sets for many different diseases presents a ripe opportunity to use data-driven approaches to advance our current knowledge of disease relationships in a systematic way. Patient’s genetic/genomic data is becoming important for clinical decision making, including disease risk assessment, disease diagnosis and subtyping, drug therapy and dose selection [44]. In the future, clinicians will have to consider genetic/genomic implications to patient care throughout their clinical workflow, including electronic prescribing of medications. The identified disease patterns can then be further investigated with regards to their diagnostic utility or help in the prediction of novel therapeutic targets. Therefore, comoR could be helpful for the personalised medicine system. This software will provide us to detect many diseases at the earliest detectable phase, weeks, months, and maybe years before symptoms appear. Thus it could be applicable in the personalised medicine and in clinical bioinformatics.

Conclusion

Doctors need to be kept updated on novel information on likely comorbidities of diseases. The comoR software provides a robust approach to study disease comorbidities, which can be easily integrated into pipelines for high-throughput and clinical data analysis and to predict causal inference of a disease. This software will help to gain a better understanding of the complex pathogenesis of disease risk phenotypes and the heterogeneity of disease comorbidity. Thus it could be applicable in the personalised medicine and in clinical bioinformatics.

Availability and requirements

The software package comoR has been written in the platform independent R programming language. It requires R version 3.0.1 or newer to run. The software is freely available at http://www.cl.cam.ac.uk/~mam211/comoR/ and will appear in Comprehensive R Archive Network (CRAN) at (http://cran.r-project.org/).

References

Capobianco E, Liò P: Comorbidity: a multidimensional approach. Trends Mol Med. 2013, 19 (9): 515-521. 10.1016/j.molmed.2013.07.004.
Article PubMed Google Scholar
Radner H, Yoshida K, Smolen JS, Solomon DH: Multimorbidity and rheumatic conditions —enhancing the concept of comorbidity. Nat Rev Rheumatol. 2014,
Google Scholar
Park J, Lee DS, Christakis NA, Barabási AL: The impact of cellular networks on disease comorbidity. Mol Syst Biol. 2009, 5: 1-
Article Google Scholar
Hidalgo CA, Blumm N, Barabási AL, Christakis NA: A dynamic network approach for the study of human phenotypes. PLoS Comput Biol. 2009, 5 (4): e1000353-10.1371/journal.pcbi.1000353.
Article PubMed Central PubMed Google Scholar
Tong B, Stevenson C: Comorbidity of cardiovascular disease, diabetes and chronic kidney disease in Australia. 2007, Australian Institute of Health & Welfare, Canberra
Google Scholar
Liò P, Paoletti N, Moni MA, Atwell K, Merelli E, Viceconti M: Modelling osteomyelitis. BMC bioinformatics. 2012, 13 (Suppl 14): S12-10.1186/1471-2105-13-S14-S12.
Article PubMed Central PubMed Google Scholar
Kumar MSA, Sierka DR, Damask AM, Fyfe B, Mcalack RF, Heifets M, Moritz MJ, Alvarez D, Kumar A: Safety and success of kidney transplantation and concomitant immunosuppression in HIV-positive patients. Kidney Int. 2005, 67 (4): 1622-1629. 10.1111/j.1523-1755.2005.00245.x.
Article PubMed Google Scholar
de Jager DJ, Vervloet MG, Dekker FW: Noncardiovascular mortality in CKD: an epidemiological perspective. Nat Rev Nephrol. 2014, 10 (4): 208-214. 10.1038/nrneph.2014.8.
Article CAS PubMed Google Scholar
Ashley EA, Butte AJ, Wheeler MT, Chen R, Klein TE, Dewey FE, Dudley JT, Ormond KE, Pavlovic A, Morgan AA: Clinical assessment incorporating a personal genome. The Lancet. 2010, 375 (9725): 1525-1535. 10.1016/S0140-6736(10)60452-7.
Article CAS Google Scholar
Schriml LM, Arze C, Nadendla S, Chang YWW, Mazaitis M, Felix V, Feng G, Kibbe WA: Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res. 2012, 40 (D1): D940-D946. 10.1093/nar/gkr972.
Article PubMed Central CAS PubMed Google Scholar
Lagro J, Melis RJ, Rikkert MGO: Importance of comorbidity in competing risks analysis in patients with localized renal cell carcinoma. J Clin Oncol. 2010, 28 (18): e298-e298. 10.1200/JCO.2009.27.3987.
Article PubMed Google Scholar
Hall SF, Rochon PA, Streiner DL, Paszat LF, Groome PA, Rohland SL: Measuring comorbidity in patients with head and neck cancer. The Laryngoscope. 2002, 112 (11): 1988-1996. 10.1097/00005537-200211000-00015.
Article PubMed Google Scholar
Ferrandina G, Lucidi A, Paglia A, Corrado G, Macchia G, Tagliaferri L, Fanfani F, Morganti AG, Valentini V, Scambia G: Role of comorbidities in locally advanced cervical cancer patients administered preoperative chemoradiation: impact on outcome and treatment-related complications. Eur J Surg Oncol (EJSO). 2012, 38 (3): 238-244. 10.1016/j.ejso.2011.12.001.
Article CAS Google Scholar
Lin Y, Wang S, Chappell RJ: Lasso tree for cancer staging with survival data. Biostatistics. 2013, 14 (2): 327-339. 10.1093/biostatistics/kxs044.
Article PubMed Central PubMed Google Scholar
Annest A, Bumgarner RE, Raftery AE, Yeung KY: The iterative bayesian model averaging algorithm for survival analysis: an improved method for gene selection and survival analysis on microarray data. 2010,
Google Scholar
Oberthuer A, Kaderali L, Kahlert Y, Hero B, Westermann F, Berthold F, Brors B, Eils R, Fischer M: Subclassification and individual survival time prediction from gene expression data of neuroblastoma patients by using CASPAR. Clin Cancer Res. 2008, 14 (20): 6590-6601. 10.1158/1078-0432.CCR-07-4377.
Article CAS PubMed Google Scholar
Haibe-Kains B, Schröder M, Olsen C, Sotiriou C, Bontempi G, Quackenbush J, de Montréal RC: Survcomp: a package for performance assessment and comparison for survival analysis. 2013, 27 (22): 3206-3208.
Google Scholar
Cho H, Yu A, Kim S, Kang J, Hong SM: Robust likelihood-based survival modeling for microarray data. J Stat Softw. 2009, 29 (i01): (American Statistical Association),
Google Scholar
Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS: Random survival forests. Ann Appl Stat. 2008, 841-860. (JSTOR),
Google Scholar
Therneau T: Package survival. R Project. 2013,
Google Scholar
Yasrebi H: SurvJamda: an R package to predict patients’ survival and risk assessment using joint analysis of microarray gene expression data. Bioinformatics. 2011, 27 (8): 1168-1169. 10.1093/bioinformatics/btr103.
Article CAS PubMed Google Scholar
Lopez-de Ullibarri I, Jácome MA: survPresmooth: an R package for presmoothed estimation in survival analysis. J Stat Softw. 2013, 54 (11): 1-26.
Article Google Scholar
Colchero F, Jones O, Rebke M, Colchero MF: Package BaSTA. Methods Ecol Evol. 2013, 3 (3): 466-470.
Article Google Scholar
Kan WC, Wang JJ, Wang SY, Sun YM, Hung CY, Chu CC, Lu CL, Weng SF, Chio CC, Chien CC: The new Comorbidity Index for predicting survival in elderly dialysis patients: a long-term population-based study. PloS one. 2013, 8 (8): e68748-10.1371/journal.pone.0068748.
Article PubMed Central CAS PubMed Google Scholar
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5 (10): R80-10.1186/gb-2004-5-10-r80.
Article PubMed Central PubMed Google Scholar
Yu G, Wang LG: Disease ontology semantic and enrichment analysis. 2012,
Google Scholar
McKusick VA: Mendelian inheritance in man and its online version, OMIM. Am J Human Genet. 2007, 80 (4): 588-10.1086/514346.
Article CAS Google Scholar
Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL: The human disease network. Proc Nat Acad Sci. 2007, 104 (21): 8685-8690. 10.1073/pnas.0701361104.
Article PubMed Central CAS PubMed Google Scholar
Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M: KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res. 2010, 38 (suppl 1): D355-D360.
Article PubMed Central CAS PubMed Google Scholar
Wang JZ, Du Z, Payattakool R, Philip SY, Chen CF: A new method to measure the semantic similarity of GO terms. Bioinformatics. 2007, 23 (10): 1274-1281. 10.1093/bioinformatics/btm087.
Article CAS PubMed Google Scholar
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Nat Acad Sci USA. 2005, 102 (43): 15545-15550. 10.1073/pnas.0506580102.
Article PubMed Central CAS PubMed Google Scholar
Du P, Feng G, Flatow J, Song J, Holko M, Kibbe WA, Lin SM: From disease ontology to disease-ontology lite: statistical methods to adapt a general-purpose ontology for the test of gene-ontology associations. Bioinformatics. 2009, 25 (12): i63-i68. 10.1093/bioinformatics/btp193.
Article PubMed Central CAS PubMed Google Scholar
Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P: Causal inference using graphical models with the R package pcalg. J Stat Softw. 2012, 47 (11): 1-26.
Article Google Scholar
Katz D, Baptista J, Azen S, Pike M: Obtaining confidence intervals for the risk ratio in cohort studies. Biometrics. 1978, 469-474. (JSTOR),
Google Scholar
Deyo RA, Cherkin DC, Ciol MA: Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992, 45 (6): 613-619. 10.1016/0895-4356(92)90133-8.
Article CAS PubMed Google Scholar
Elixhauser A, Steiner C, Harris DR, Coffey RM: Comorbidity measures for use with administrative data. Med Care. 1998, 36: 8-27. 10.1097/00005650-199801000-00004.
Article CAS PubMed Google Scholar
Zhang W, Ota T, Shridhar V, Chien J, Wu B, Kuang R: Network-based Survival Analysis Reveals Subnetwork Signatures for Predicting Outcomes of Ovarian Cancer Treatment. PLoS Comput Biol. 2013, 9 (3): e1002975-10.1371/journal.pcbi.1002975.
Article PubMed Central CAS PubMed Google Scholar
Bowker SL, Majumdar SR, Veugelers P, Johnson JA: Increased cancer-related mortality for patients with type 2 diabetes who use sulfonylureas or insulin. Diabetes Care. 2006, 29 (2): 254-258. 10.2337/diacare.29.02.06.dc05-1558.
Article PubMed Google Scholar
Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, Pawitan Y, Hall P, Klaar S, Liu ET: An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Nat Acad Sci USA. 2005, 102 (38): 13550-13555. 10.1073/pnas.0506230102.
Article PubMed Central CAS PubMed Google Scholar
Smith JJ, Deane NG, Wu F, Merchant NB, Zhang B, Jiang A, Lu P, Johnson JC, Schmidt C, Bailey CE: Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology. 2010, 138 (3): 958-968. 10.1053/j.gastro.2009.11.005.
Article PubMed Central CAS PubMed Google Scholar
Bonome T, Levine DA, Shih J, Randonovich M, Pise-Masison CA, Bogomolniy F, Ozbun L, Brady J, Barrett JC, Boyd J: A gene signature predicting for survival in suboptimally debulked patients with ovarian cancer. Cancer Res. 2008, 68 (13): 5478-5486. 10.1158/0008-5472.CAN-07-6595.
Article CAS PubMed Google Scholar
Villanueva A, Hoshida Y, Battiston C, Tovar V, Sia D, Alsinet C, Cornella H, Liberzon A, Kobayashi M, Kumada H: Combining clinical, pathology, and gene expression data to predict recurrence of hepatocellular carcinoma. Gastroenterology. 2011, 140 (5): 1501-1512. 10.1053/j.gastro.2011.02.006.
Article PubMed Central CAS PubMed Google Scholar
Buddingh EP, Kuijjer ML, Duim RA, Bürger H, Agelopoulos K, Myklebost O, Serra M, Mertens F, Hogendoorn PC, Lankester AC: Tumor-infiltrating macrophages are associated with metastasis suppression in high-grade osteosarcoma: a rationale for treatment with macrophage activating agents. Clinical Cancer Research. 2011, 17 (8): 2110-2119. 10.1158/1078-0432.CCR-10-2047.
Article CAS PubMed Google Scholar
Ullman-Cullere MH, Mathew JP: Emerging landscape of genomics in the electronic health record for personalized medicine. Hum Mutat. 2011, 32 (5): 512-516. 10.1002/humu.21456.
Article PubMed Google Scholar

Download references

Acknowledgements

This work is supported by the EU Mission T2D project.

Author information

Authors and Affiliations

Computer Laboratory, University of Cambridge, William Gates Building, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
Mohammad Ali Moni & Pietro Liò
Department of Computer Science & Engineering, Pabna University of Science & Technology, Pabna, Bangladesh
Mohammad Ali Moni

Authors

Mohammad Ali Moni
View author publications
You can also search for this author in PubMed Google Scholar
Pietro Liò
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Mohammad Ali Moni.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The software was developed by MAM under the supervision of PL. MAM and PL wrote the manuscript. All authors contributed to and approved the manuscript.

Electronic supplementary material

Additional file 1: comoRdocumentation.(PDF 93 KB)

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Moni, M.A., Liò, P. comoR: a software for disease comorbidity risk assessment. J Clin Bioinform 4, 8 (2014). https://doi.org/10.1186/2043-9113-4-8

Download citation

Received: 14 February 2014
Accepted: 17 April 2014
Published: 23 May 2014
DOI: https://doi.org/10.1186/2043-9113-4-8

comoR: a software for disease comorbidity risk assessment

Abstract

Background

Results

Conclusions

Introduction

Implementation

Comorbidity based on clinical information

Gene–disease association

Pathway–disease association

Ontology and causal inference to evaluate comorbidity

Methods

Comparison with similar software

Discussion

Conclusion

Availability and requirements

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interests

Authors’ contributions

Electronic supplementary material

Additional file 1: comoRdocumentation.(PDF 93 KB)

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Journal of Clinical Bioinformatics