Genetic Interaction Network Interpretation (GINI): A Tidy Data Science Perspective
Section 1 Overview
As part of Epistasis: Methods and Protocols (Methods in Molecular Biology, Springer), we describe three showcases that analyse human genetic interactions and/or gene expression data in human tissues (Materials), following the concept of tidy data science (FIGURE 1.1 and FIGURE 1.2). Every analysis is done exclusively with the R one-liner, defined as a sequential pipeline of elementary functions chained together to achieve a complex task. We guide users through step-by-step instructions on (Case 1) how to identify, visualise and interpret network modules of genetic interactions; (Case 2) how to identify and interpret tissue-specific genetic interactions; and (Case 3) how to carry out genetic interaction-based tissue clustering and differential interaction analysis. Each of the three showcases is reproducible on its own and completes in a relatively short runtime (~20 min for Case 1, ~15 min for Case 2, and ~25 min for Case 3). We encourage users to run through these showcases before analysing their own datasets.

FIGURE 1.1: Collection of the main packages for tidy data science, with their features and key functions briefly summarised.

FIGURE 1.2: The one-liner. Top: illustration of a data analytics one-liner, conceptually depicted as data flowing through a cascade of functions, with SICE (sequential, intuitive, combinatory and elementary) as its defining characteristics. Bottom: the template of the one-liner, chained together through the pipe operator %>%, with examples beneath that achieve common tasks.
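
To make the idea of a one-liner concrete, here is a minimal sketch in R. It is not taken from the showcases: the toy table `gi`, its column names (`gene_a`, `gene_b`, `score`) and the 0.5 cutoff are all hypothetical and chosen only for illustration. It assumes the dplyr package is installed, which also provides the `%>%` pipe.

```r
library(dplyr)  # provides %>% plus tibble(), filter(), count(), arrange()

# Hypothetical toy table of gene pairs; gene names and scores are invented
gi <- tibble(
  gene_a = c("G1", "G1", "G2", "G3", "G3"),
  gene_b = c("G2", "G3", "G3", "G4", "G5"),
  score  = c(0.9, 0.2, 0.7, 0.8, 0.6)
)

# One-liner: each step is an elementary function, chained sequentially by %>%
hubs <- gi %>%
  filter(score >= 0.5) %>%                  # keep pairs at or above the assumed cutoff
  count(gene_a, name = "n_partners") %>%    # number of retained partners per gene
  arrange(desc(n_partners), gene_a)         # most connected genes first, ties by name

print(hubs)
```

Each line after `gi %>%` does one simple thing, and the data flows from one function to the next, which is the sequential and elementary character that the SICE acronym refers to. With this toy input, G3 is listed first because it keeps two partners after filtering.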