Section 1 Overview

As part of Epistasis: Methods and Protocols (Methods in Molecular Biology, Springer), we describe three showcases that analyse human genetic interactions and/or gene expression data in human tissues (see Materials), in keeping with the concept of tidy data science (FIGURE 1.1 and FIGURE 1.2). All analyses are done exclusively using the R one-liner, defined as a sequential pipeline of elementary functions chained together to achieve a complex task. We guide users through step-by-step instructions on (Case 1) how to identify, visualise and interpret network modules of genetic interactions; (Case 2) how to identify and interpret tissue-specific genetic interactions; and (Case 3) how to carry out genetic interaction-based tissue clustering and differential interaction analysis. Each of the three showcases is reproducible on its own and completes in a relatively short runtime (~20 min for Case 1, ~15 min for Case 2, and ~25 min for Case 3). We encourage users to run through these showcases before analysing their own datasets.


FIGURE 1.1: Collection of the main packages for tidy data science, with their features and key functions briefly described.


FIGURE 1.2: The one-liner. Top: illustration of the data analytics one-liner, conceptually depicted as data flowing through a cascade of functions, whose defining characteristics are summarised as SICE (sequential, intuitive, combinatory and elementary). Bottom: the template of the one-liner, chained together through %>% (the pipe operator), with examples beneath that achieve common tasks.
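To make the idea of a one-liner concrete, here is a minimal sketch in R. It assumes the dplyr package is installed. The toy table `interactions`, its columns (`gene_a`, `gene_b`, `score`) and the 0.5 cut-off are hypothetical, invented purely for illustration and not taken from the showcases. Each step is an elementary function, and the steps are chained sequentially through %>%.

```r
library(dplyr)

# Hypothetical toy table of gene pairs with an interaction score
# (illustrative values only, not data from the showcases)
interactions <- tibble::tibble(
  gene_a = c("G1", "G1", "G2", "G3"),
  gene_b = c("G2", "G3", "G3", "G4"),
  score  = c(0.9, 0.2, 0.7, 0.4)
)

# The one-liner: data flows through a cascade of elementary functions
top_genes <- interactions %>%
  filter(score >= 0.5) %>%                  # keep pairs at or above the assumed cut-off
  count(gene_a, name = "n_partners") %>%    # number of retained partners per gene
  arrange(desc(n_partners), gene_a)         # most connected genes first

print(top_genes)
```

Because every step takes a data frame and returns a data frame, steps can be added, removed or reordered without restructuring the rest of the pipeline.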