A pipeline for differential expression analysis of RNA-seq data and the effect of filter cutoff on performance




Robert, Bonnie-Jean




RNA sequencing is a powerful new approach to analyzing differential expression of transcripts between treatments. Many statistical methods are now available to test for differential expression, and each reports its results differently. This thesis presents a workflow of five popular methods and discusses the results. A pipeline was built in the R language to evaluate four of these packages on a real RNA-seq dataset. At present, researchers must prepare RNA-seq data prior to analysis to achieve reliable results. Filtering is a necessary preparatory step in which transcripts with low expression levels are removed from further analysis. Yet little research is available to guide researchers in choosing the filter threshold. This thesis introduces a study designed to determine whether the choice of filter threshold has a significant effect on individual package performance. Increasing the filtering threshold was shown to decrease the sensitivity and increase the specificity of the four statistical methods studied.
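The filtering step and the two performance measures named above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the thesis's R pipeline. The filter rule (summed read count across samples), the function names, the toy counts, the `TRULY_DE` labels, and the `RAW_CALLS` set standing in for a method's output are all hypothetical. The sketch also assumes, as a simplification, that a filtered-out transcript is simply never called differentially expressed.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy read counts: transcript -> counts in four samples.
COUNTS: Dict[str, List[int]] = {
    "t1": [50, 60, 5, 8],
    "t2": [2, 1, 9, 7],
    "t3": [30, 28, 31, 29],
    "t4": [0, 1, 3, 0],
    "t5": [12, 10, 11, 13],
}
TRULY_DE: Set[str] = {"t1", "t2"}         # assumed ground truth (hypothetical)
RAW_CALLS: Set[str] = {"t1", "t2", "t4"}  # assumed calls of some DE method (hypothetical)


def filter_by_total_count(counts: Dict[str, List[int]], threshold: int) -> Dict[str, List[int]]:
    """Keep transcripts whose summed read count across samples is >= threshold."""
    return {t: c for t, c in counts.items() if sum(c) >= threshold}


def sensitivity_specificity(called: Set[str], truly_de: Set[str],
                            all_transcripts: Set[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a set of calls against known labels."""
    tp = len(called & truly_de)                    # true positives
    fn = len(truly_de - called)                    # false negatives
    fp = len(called - truly_de)                    # false positives
    tn = len(all_transcripts - called - truly_de)  # true negatives
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def evaluate(threshold: int) -> Tuple[float, float]:
    """Filter at the given threshold, then score the calls that survive the filter."""
    kept = set(filter_by_total_count(COUNTS, threshold))
    called = RAW_CALLS & kept  # simplification: filtered transcripts are never called
    return sensitivity_specificity(called, TRULY_DE, set(COUNTS))


if __name__ == "__main__":
    for threshold in (0, 10, 20):
        sens, spec = evaluate(threshold)
        print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy setting, raising the threshold first removes a low-count false positive and then removes a low-count true positive, which is the mechanism behind the trade-off the abstract reports. Real packages also re-estimate dispersion and adjust for multiple testing after filtering, which this sketch does not model.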



Bioinformatics, Statistics, RNA-Sequencing, Filter, Differential Expression