consensusNetR is an R package for combining networks into a consensus network, based on the work of Laura Cantini and Andrei Zinovyev. In addition to identifying consensus communities from the correlation of community meta-genes (loadings or membership scores), we also implement methods based on overlap.

Installation

Install the development version from GitHub with:

remotes::install_github(
  repo = "Systems-Methods/consensusNetR"
)

or:

remotes::install_git(
  "https://github.com/Systems-Methods/consensusNetR"
)
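The install commands above rely on the remotes package, and the example workflow further down calls several other packages. A minimal sketch for checking which of them are missing before starting; the package list is simply collected from the calls in this README and may need adjusting:

```r
# Packages referenced by the installation and example code in this README
pkgs <- c(
  "remotes", "icWGCNA", "GEOquery", "UCSCXenaTools",
  "dplyr", "purrr", "stringr", "tidyr"
)

# requireNamespace() returns FALSE (instead of erroring) when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Nothing is installed by this snippet; it only reports what is missing.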

Example Workflow

This example creates a consensus network from three public datasets: GSE39582, TCGA COAD, and TCGA READ.

The workflow begins with icWGCNA results, but results from alternative methods (KNN, PCA, ICA, WGCNA) work as well. See the Example Appendix for the data download, the icWGCNA run details, and the code.

Consensus construction

# Create a list of community_membership objects, one per dataset
memb_list <- list(
  GSE39582 = GSE39582_icwgcna$community_membership,
  READ = read_icwgcna$community_membership,
  COAD = coad_icwgcna$community_membership
)

# Construct Meta Reciprocal Best Hits based on overlaps (318 communities found)
rbh <- construct_rbh_overlap_based(memb_list, top_n = 25)
nrow(rbh)
## 318

# RBH Heatmap Creation
plot_rbh(rbh = rbh, network_membership_list = memb_list)

# Detect Communities in Adjacency/Reciprocal Best Hits Matrix
consensus_comms <- detect_consensus_communities(rbh)
# Show the first 10 communities
# Note: community 1 is a miscellaneous community and will be removed
table(consensus_comms$Cluster)[1:10]
##    1   2   3   4   5   6   7   8   9  10 
##  200   3   3   3   3   3   3   3   3   3 

# Compute the average metagene across studies for each community
consensus_memb <- calc_consensus_memberships(consensus_comms, memb_list)
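To make the idea of overlap-based reciprocal best hits concrete, here is a toy base-R sketch on two made-up membership matrices. It is an illustration of the general technique only, not the implementation used by construct_rbh_overlap_based(); the data, the helper top_genes(), and the tie handling are all assumptions.

```r
# Toy membership matrices (genes x communities) for two hypothetical studies
genes <- paste0("g", 1:8)
study_a <- matrix(
  c(0.9, 0.8, 0.7, 0.1, 0.0, 0.1, 0.2, 0.0,   # community A1
    0.0, 0.1, 0.1, 0.2, 0.9, 0.8, 0.7, 0.1),  # community A2
  ncol = 2, dimnames = list(genes, c("A1", "A2"))
)
study_b <- matrix(
  c(0.1, 0.0, 0.2, 0.1, 0.8, 0.9, 0.7, 0.0,   # community B1
    0.8, 0.9, 0.6, 0.2, 0.1, 0.0, 0.1, 0.1),  # community B2
  ncol = 2, dimnames = list(genes, c("B1", "B2"))
)

# Hypothetical helper: top-n genes per community, ranked by membership score
top_genes <- function(memb, top_n) {
  lapply(
    setNames(colnames(memb), colnames(memb)),
    function(comm) names(sort(memb[, comm], decreasing = TRUE))[seq_len(top_n)]
  )
}
top_a <- top_genes(study_a, top_n = 3)
top_b <- top_genes(study_b, top_n = 3)

# Overlap matrix: shared top genes for every pair (rows = study A, cols = study B)
overlap <- sapply(top_b, function(b) {
  sapply(top_a, function(a) length(intersect(a, b)))
})

# A pair is a reciprocal best hit when each community is the other's best match
# (which.max() takes the first maximum, so ties are broken arbitrarily here)
best_b_for_a <- apply(overlap, 1, which.max)
best_a_for_b <- apply(overlap, 2, which.max)
is_rbh <- best_a_for_b[best_b_for_a] == seq_len(nrow(overlap))

rbh_pairs <- data.frame(
  study_a = rownames(overlap)[is_rbh],
  study_b = colnames(overlap)[best_b_for_a[is_rbh]],
  overlap = overlap[cbind(which(is_rbh), best_b_for_a[is_rbh])]
)
rbh_pairs
```

In this toy case A1 pairs with B2 and A2 pairs with B1. A real implementation would typically also require a minimum overlap so that communities with no shared genes are not paired.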

Downstream Analysis

consensus_genes <- get_gene_community_membership(consensus_comms, memb_list, 2)
head(consensus_genes)
##    gene_id cluster n_studies
##  1    A1BG      43         2
##  2     A2M       1         2
##  3    AACS      19         2
##  4   AAGAB      25         2
##  5   AASDH      40         2
##  6    AASS      42         2

# Need to use icWGCNA for individual eigengenes
GSE39582_eigen <- icWGCNA::compute_eigengene_matrix(
  ex = GSE39582_df,
  membership_matrix = consensus_memb
)
read_eigen <- icWGCNA::compute_eigengene_matrix(
  ex = read_df,
  membership_matrix = consensus_memb
)
coad_eigen <- icWGCNA::compute_eigengene_matrix(
  ex = coad_df,
  membership_matrix = consensus_memb
)

eigen_list <- list(GSE39582_eigen, read_eigen, coad_eigen)
plot_consensus_eig_dist(eigen_list)

Example Appendix

Downloading data

GSE39582 was profiled on the Affymetrix Human Genome U133 Plus 2.0 Array, so its probe IDs need to be converted to gene symbols using the icWGCNA::gene_mapping() function. This makes it consistent with the two TCGA datasets, which already use gene symbols.

library(icWGCNA)

# GSE39582
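# getGEO() returns a list of ExpressionSet objects for a GSE accession;
# take the first (here assumed to be the only) element
GSE39582 <- GEOquery::getGEO("GSE39582")[[1]]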

# TCGA READ
UCSCXenaTools::getTCGAdata(
  project = "READ",
  mRNASeq = TRUE,
  mRNASeqType = "normalized",
  clinical = TRUE,
  download = TRUE,
  destdir = "/MY_PATH/data/"
)

# TCGA COAD
UCSCXenaTools::getTCGAdata(
  project = "COAD",
  mRNASeq = TRUE,
  mRNASeqType = "normalized",
  clinical = TRUE,
  download = TRUE,
  destdir = "/MY_PATH/data/"
)

Preprocessing steps

All datasets must use the same gene annotation (e.g. gene symbols, Entrez IDs, or Ensembl IDs). In this example we convert GSE39582 to gene symbols using the icWGCNA::gene_mapping() function.

# creating annotation file for gene mapping to gene symbols
GSE39582_annotation <- GSE39582@featureData@data |>
  dplyr::select(ID, gene_symbol = `Gene Symbol`) |>
  dplyr::mutate(
    gene_symbol = purrr::map(
      gene_symbol, ~ stringr::str_split(.x, " /// ")[[1]]
    )
  ) |>
  tidyr::unnest(gene_symbol)

GSE39582_hugo <- icWGCNA::gene_mapping(
  GSE39582@assayData$exprs,
  GSE39582_annotation,
  compress_fun = "highest_mean",
  compress_trans = "log_exp"
)
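When several probes map to the same gene, they have to be collapsed into one row per gene. The sketch below shows one common way to do this on toy data, assuming that compress_fun = "highest_mean" means keeping, for each gene, the probe with the highest mean expression. That reading is an assumption; the icWGCNA::gene_mapping() documentation is the authority on what the function actually does.

```r
# Toy expression matrix: 4 probes x 3 samples (values are made up)
expr <- matrix(
  c(5.0, 5.2, 4.9,
    7.1, 7.0, 7.3,
    3.3, 3.1, 3.2,
    6.0, 6.1, 5.9),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2", "s3"))
)

# Probe-to-gene annotation; p1 and p2 both map to GENE_A
annotation <- data.frame(
  ID = c("p1", "p2", "p3", "p4"),
  gene_symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_C")
)

# Mean expression per probe, aligned to the annotation rows
annotation$probe_mean <- rowMeans(expr)[annotation$ID]

# For each gene, keep the probe with the highest mean expression
idx_by_gene <- split(seq_len(nrow(annotation)), annotation$gene_symbol)
best_probe <- vapply(
  idx_by_gene,
  function(idx) annotation$ID[idx[which.max(annotation$probe_mean[idx])]],
  character(1)
)

# Gene-level matrix: one row per gene symbol
gene_expr <- expr[best_probe, , drop = FALSE]
rownames(gene_expr) <- names(best_probe)
gene_expr
```

Here GENE_A is represented by probe p2, since its mean is higher than that of p1.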

All data should be normalized. In this example we downloaded data that were already normalized, so no further transformation is needed.
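Before running any network method, it can help to confirm that the datasets really share identifiers. The following sketch uses toy matrices as stand-ins for the three expression matrices; make_expr() and the gene lists are hypothetical and only serve to make the example runnable.

```r
# Hypothetical helper building a toy genes x samples matrix
make_expr <- function(genes) {
  matrix(0, nrow = length(genes), ncol = 2,
         dimnames = list(genes, c("s1", "s2")))
}

# Toy stand-ins for the three expression matrices
expr_list <- list(
  GSE39582 = make_expr(c("TP53", "KRAS", "APC", "MYC")),
  READ     = make_expr(c("KRAS", "APC", "MYC", "BRAF")),
  COAD     = make_expr(c("APC", "MYC", "KRAS", "SMAD4"))
)

# Genes present in every dataset
shared_genes <- Reduce(intersect, lapply(expr_list, rownames))

# Fraction of each dataset's genes that are shared; values near zero usually
# point to mismatched identifier types (e.g. probe IDs vs. gene symbols)
shared_fraction <- vapply(
  expr_list,
  function(x) mean(rownames(x) %in% shared_genes),
  numeric(1)
)
shared_fraction
```

With real data, the same two lines (Reduce() and vapply()) can be applied to a named list of the actual expression matrices.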

icWGCNA runs

The icWGCNA runs use the default settings, except that the maximum number of iterations is reduced to 5 for demonstration purposes. These runs benefit greatly from using multiple CPU cores.

# GSE39582
GSE39582_icwgcna <- icWGCNA::icwgcna(GSE39582_hugo, maxIt = 5)

# TCGA READ
read_icwgcna <- icWGCNA::icwgcna(read_df, maxIt = 5)

# TCGA COAD
coad_icwgcna <- icWGCNA::icwgcna(coad_df, maxIt = 5)

Code of Conduct

Please note that the consensusNetR project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.