Section 2 Materials
R and packages
At the time of writing, the most recent version of R is 3.6.2 (Dark and Stormy Night). Instructions for installing R on different platforms (Linux, OS X and Windows) can be found at https://www.r-project.org, where precompiled binaries are provided for download. Linux users without sudo privileges can build R from the source code and install it under the home directory (that is, $HOME):
wget https://cran.wu.ac.at/src/base/R-3/R-3.6.2.tar.gz
tar xvfz R-3.6.2.tar.gz
cd R-3.6.2
./configure --prefix=$HOME/R-3.6.2
make
make check
make install
$HOME/R-3.6.2/bin/R # start R
We highly recommend using the dedicated package BiocManager to install and update packages deposited in Bioconductor and CRAN, two mutually exclusive repositories (a package cannot be deposited in both). BiocManager should first be installed in the conventional way (i.e. using the function install.packages); it can then be used to install other packages in a single step. Once the additional package remotes is also installed, BiocManager can also install packages hosted on GitHub, which usually serves as a development repository prior to submission to Bioconductor or CRAN.
# first, install the package BiocManager
install.packages("BiocManager")
# then install packages from Bioconductor and CRAN
BiocManager::install(c("biobroom","dnet","ggrepel","gridExtra","limma","patchwork","remotes","tidyverse","osfr"), dependencies=TRUE)
# can also install packages from GitHub
BiocManager::install("hfang-bristol/XGR")
Genetic interactions
We extracted human genetic interactions from BioGRID (version 3.5.179), comprising 7856 interactions among 3102 genes (mapped to NCBI GeneID; the same hereinafter). This dataset was preprocessed into an igraph object (using the igraph package) and saved as an RData file, ig.BioGRID_genetic.RData, deposited at https://osf.io/gskpn.
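To illustrate how such an object is organised, the following is a minimal sketch (not the preprocessing code used for the deposited file) that builds a toy undirected igraph object with named vertices and an nPMID edge attribute. The four gene pairs are copied from the printed object; the nPMID values are made-up placeholders, and we assume here that nPMID counts the publications supporting an interaction.

```r
library(igraph)

# Toy edge list: gene pairs taken from the printed object;
# nPMID values are hypothetical placeholders, not BioGRID counts
edges <- data.frame(
  from  = c("A1BG", "A2M", "AAGAB", "AAMP"),
  to    = c("REV3L", "KRAS", "TP53", "KRAS"),
  nPMID = c(1, 1, 2, 1),
  stringsAsFactors = FALSE
)

# Vertex table: the first column becomes the vertex name,
# remaining columns become vertex attributes
genes <- unique(c(edges$from, edges$to))
nodes <- data.frame(name = genes, symbol = genes, stringsAsFactors = FALSE)

# Undirected graph with named vertices and an edge attribute
ig_toy <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

vcount(ig_toy)  # number of genes
ecount(ig_toy)  # number of interactions

# Interaction partners of a gene of interest
partners <- neighbors(ig_toy, "KRAS")$name

# Drop interactions with nPMID below an illustrative threshold of 2
ig_sub <- delete_edges(ig_toy, which(E(ig_toy)$nPMID < 2))
```

The same calls should apply unchanged to ig.BioGRID_genetic once the RData file has been read into the session with load().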
ig.BioGRID_genetic
## IGRAPH 0a7230a UN-- 3102 7856 -- 
## + attr: name (v/c), geneid (v/n), symbol (v/c), description (v/c),
## | nPMID (e/n)
## + edges from 0a7230a (vertex names):
##  [1] A1BG  --REV3L   A2M   --KRAS    AAGAB --TP53    AAMP  --KRAS   
##  [5] AANAT --P2RY6   AANAT --SPHK1   AANAT --SSTR5   AANAT --TOP1   
##  [9] AARS2 --LEO1    AARS2 --MRPS16  AARS2 --MRPS5   AARS2 --PSMB6  
## [13] AATF  --DONSON  AATF  --GFI1B   AATF  --MCM3AP  ABCB5 --CSK    
## [17] ABCB5 --KRAS    ABCB7 --AURKA   ABCB7 --HSCB    ABCB7 --LONP1  
## [21] ABCB7 --MBTPS2  ABCB7 --MED23   ABCB7 --NUBP1   ABCB7 --PITRM1 
## [25] ABCB7 --TAF2    ABCE1 --KRAS    ABCG2 --CSK     ABCG5 --FLT3   
## + ... omitted several edges
Gene expression
We obtained human tissue RNA-seq datasets (gene-centric expression levels quantified as transcripts per million [TPM]) from the GTEx study (version 8). This study recruited ~1000 postmortem donors, from whom 49 tissues (each with at least 70 donors/samples) were profiled using bulk RNA-seq. To aid in selecting genes expressed in a given tissue and in examining their expression distribution within that tissue, we precalculated a descriptive summary for each gene per tissue: ymin (the minimum TPM among samples of the same tissue), lower (25% quantile), middle (i.e. median), upper (75% quantile) and ymax (the maximum TPM). This per-tissue gene summary was represented as a tibble object (using the tibble package) and saved as an RData file, GTEx_V8_TPM_boxplot.RData. Summarised in this way, the dataset is much reduced in size yet remains informative, both for extracting genes expressed in a tissue (filtering by ymin >= 1) and for boxplot visualisation of expression distributions.
GTEx_V8_TPM_boxplot
## # A tibble: 1,709,316 x 9
##    ENSG       Symbol  SMTSD           SMTS      ymin lower middle   upper   ymax
##    <chr>      <chr>   <chr>           <chr>    <dbl> <dbl>  <dbl>   <dbl>  <dbl>
##  1 ENSG00000… DDX11L1 Adipose - Subc… Adipose…     0     0      0 0       0.166 
##  2 ENSG00000… DDX11L1 Muscle - Skele… Muscle       0     0      0 0.0150  0.116 
##  3 ENSG00000… DDX11L1 Artery - Tibial Blood V…     0     0      0 0       0.130 
##  4 ENSG00000… DDX11L1 Artery - Coron… Blood V…     0     0      0 0       0.0710
##  5 ENSG00000… DDX11L1 Heart - Atrial… Heart        0     0      0 0.0143  0.138 
##  6 ENSG00000… DDX11L1 Adipose - Visc… Adipose…     0     0      0 0       0.127 
##  7 ENSG00000… DDX11L1 Uterus          Uterus       0     0      0 0.0244  0.148 
##  8 ENSG00000… DDX11L1 Vagina          Vagina       0     0      0 0.0202  0.118 
##  9 ENSG00000… DDX11L1 Breast - Mamma… Breast       0     0      0 0.00493 0.0744
## 10 ENSG00000… DDX11L1 Skin - Not Sun… Skin         0     0      0 0       0.114 
## # … with 1,709,306 more rows
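The five summary columns match the aesthetics that ggplot2 expects when drawing boxplots from precomputed statistics. The following is a minimal sketch, assuming a toy table of per-sample TPM values: the gene and tissue labels and all numbers are hypothetical, and only the column names Symbol, SMTSD, ymin, lower, middle, upper and ymax are taken from the tibble above. It shows how such a summary could be calculated, filtered by ymin >= 1, and plotted.

```r
library(dplyr)    # also provides tibble() and %>%
library(ggplot2)

# Toy per-sample TPM values (hypothetical): two genes in two tissues
tpm <- tibble(
  Symbol = rep(c("GENE_A", "GENE_B"), each = 10),
  SMTSD  = rep(rep(c("Tissue 1", "Tissue 2"), each = 5), times = 2),
  TPM    = c(2, 4, 6, 8, 10,     # GENE_A, Tissue 1
             0.5, 1, 3, 5, 7,    # GENE_A, Tissue 2
             0, 0, 0, 0.1, 0.2,  # GENE_B, Tissue 1
             1, 2, 3, 4, 5)      # GENE_B, Tissue 2
)

# Per-gene, per-tissue descriptive summary
boxstats <- tpm %>%
  group_by(Symbol, SMTSD) %>%
  summarise(
    ymin   = min(TPM),
    lower  = unname(quantile(TPM, 0.25)),
    middle = median(TPM),
    upper  = unname(quantile(TPM, 0.75)),
    ymax   = max(TPM)
  ) %>%
  ungroup()

# Genes expressed in a tissue: minimum TPM of at least 1 across its samples
expressed <- boxstats %>% filter(ymin >= 1)

# Boxplots drawn directly from the precomputed statistics
p <- ggplot(expressed,
            aes(x = SMTSD, ymin = ymin, lower = lower, middle = middle,
                upper = upper, ymax = ymax)) +
  geom_boxplot(stat = "identity") +
  facet_wrap(~ Symbol) +
  labs(x = NULL, y = "TPM")
```

Printing p draws the figure. Because the whiskers here span ymin to ymax, such plots show the full range of expression rather than the 1.5 × IQR whiskers that geom_boxplot computes by default from raw data.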