GO Trimming: systematically reducing redundancy in large Gene Ontology datasets
Date
2011-07-28
Authors
Jantzen, Stuart G.
Sutherland, Ben J.G.
Minkley, David R.
Koop, Benjamin F.
Publisher
BioMed Central
Abstract
Background: The increased accessibility of gene expression tools has enabled a wide variety of experiments
utilizing transcriptomic analyses. As these tools become more prevalent, so does the need for improved standardization in
processing and presentation of data, along with the need to guard against interpretation bias. Gene
Ontology (GO) analysis is a powerful method of interpreting and summarizing biological functions. However, while
there are many tools available to investigate GO enrichment, there remains a need for methods that directly
remove from enriched GO lists the redundant terms that often provide little, if any, additional information.
Findings: Here we present a simple yet novel method called GO Trimming that utilizes an algorithm designed to
reduce redundancy in lists of enriched GO categories. Depending on the needs of the user, this method can be
performed with variable stringency. In the example presented here, an initial list of 90 terms was reduced to 54,
eliminating 36 largely redundant terms. We also compare this method to existing methods and find that GO
Trimming, while simple, performs well at eliminating redundant terms in a large dataset throughout the depth of
the GO hierarchy.
Conclusions: The GO Trimming method provides an alternative to other procedures, some of which involve
removing large numbers of terms prior to enrichment analysis. This method should free the researcher from
analyzing overly large, redundant lists, and instead enable the concise presentation of manageable, informative GO
lists. The implementation of this tool is freely available at: http://lucy.ceh.uvic.ca/go_trimming/cbr_go_trimming.py
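The abstract does not spell out the trimming rule, so the following is only a minimal sketch of one plausible way to trim a list of enriched terms with adjustable stringency, not the authors' implementation. It assumes that an enriched ancestor term is redundant when a single enriched descendant accounts for at least a chosen fraction of the ancestor's genes. The function name `trim_go_terms`, the `threshold` parameter, its default value, and the placeholder term and gene names are all hypothetical.

```python
def trim_go_terms(term_genes, parents, threshold=0.4):
    """Return the enriched terms that survive trimming.

    term_genes: dict mapping an enriched term to its set of genes.
    parents:    dict mapping a term to the set of its direct parent terms.
    threshold:  assumed stringency knob; an ancestor is dropped when one
                enriched descendant covers at least this fraction of its genes.
    """
    def ancestors(term):
        # Walk up the hierarchy and collect every ancestor of `term`.
        seen = set()
        stack = list(parents.get(term, ()))
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.add(parent)
                stack.extend(parents.get(parent, ()))
        return seen

    removed = set()
    for child, child_genes in term_genes.items():
        for ancestor in ancestors(child):
            ancestor_genes = term_genes.get(ancestor)
            if not ancestor_genes:
                continue  # ancestor is not in the enriched list
            coverage = len(child_genes & ancestor_genes) / len(ancestor_genes)
            if coverage >= threshold:
                removed.add(ancestor)

    return {t: g for t, g in term_genes.items() if t not in removed}


# Hypothetical toy data: TERM_A is the parent of TERM_B and TERM_C.
term_genes = {
    "TERM_A": {"g1", "g2", "g3", "g4"},
    "TERM_B": {"g1", "g2", "g3"},
    "TERM_C": {"g4"},
}
parents = {"TERM_B": {"TERM_A"}, "TERM_C": {"TERM_A"}}

lenient = trim_go_terms(term_genes, parents, threshold=0.9)
strict = trim_go_terms(term_genes, parents, threshold=0.4)
```

In this toy case TERM_B covers three of the four genes in TERM_A, so TERM_A is dropped at a threshold of 0.4 but kept at 0.9. Lowering the threshold trims more aggressively, which is one way the "variable stringency" mentioned above could be exposed to a user.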
Description
BioMed Central
Keywords
Centre for Biomedical Research
Citation
Jantzen et al.: GO Trimming: Systematically reducing redundancy in large Gene Ontology datasets. BMC Research Notes 2011 4:267.