Algorithms for prediction of RNA secondary structure: coronavirus pseudoknots via Shapify & CParty

dc.contributor.author: Trinity, Luke
dc.contributor.supervisor: Jabbari, Hosna
dc.contributor.supervisor: Stege, Ulrike
dc.date.accessioned: 2024-01-30T17:52:31Z
dc.date.available: 2024-01-30T17:52:31Z
dc.date.copyright: 2024 [en_US]
dc.date.issued: 2024-01-30
dc.degree.department: Department of Computer Science [en_US]
dc.degree.level: Doctor of Philosophy Ph.D. [en_US]
dc.description.abstract: RNA molecules play a vital role in cellular processes, and many possess functional structures. Because experimental methods for detecting RNA structure are complex, computational tools that predict RNA structure formation are invaluable for building comprehensive knowledge. We seek to predict RNA structure algorithmically, focusing on three concepts from the literature: (1) Minimum Free Energy (MFE) methods, (2) the hierarchical folding hypothesis, and (3) partition function ensemble approaches. The MFE framework rests on the hypothesis that each RNA molecule folds into the structure with the minimum free energy. In conjunction with MFE, we employ the biologically motivated hierarchical folding hypothesis, which states that an RNA molecule first folds into an initial structure, after which a subsequent folding may occur that lowers the structure's free energy. The accuracy of MFE and hierarchical folding methods can be improved by effectively incorporating known RNA structure information such as experimental reactivity data. We introduce Shapify, an algorithm that incorporates experimental data within hierarchical RNA folding prediction. Shapify receives SHAPE data as input to guide RNA structure prediction, allowing multiple experimental results to be unified to determine structure-function patterns. Shapify runs in O(N^3) time, where N is the RNA sequence length, enabling faster prediction than other methods that also handle a complex RNA structure class. We then consider the partition function model, which builds on the same free energy model as the MFE approach: we compute the sum of Boltzmann weights, derived from the free energies, over all possible RNA structures in the ensemble at equilibrium. The likelihood of any particular RNA structure occurring can then be determined from the weight of that structure relative to the total over the ensemble.
Currently, partition function methods are restricted to a limited set of RNA structures, because existing algorithms that allow complex RNA structures are too slow, with O(N^5) time complexity at best. We introduce CParty, an O(N^3) time partition function algorithm that includes complex RNA structures in the ensemble. Developing CParty's recursive decomposition schemes and integrating them into the algorithmic implementation was non-trivial. By providing an input structure to CParty, we compute a 'conditional' partition function, enabling probabilistic calculations that advance understanding of RNA structure formation patterns. In this dissertation, we (1) incorporate partial RNA structure information into hierarchical secondary structure prediction via Shapify to understand important secondary structure motifs affecting viral function, (2) design and implement CParty, a conditional partition function algorithm that handles complex RNA structures, and (3) apply these and other related algorithms to provide RNA structural information for COVID-19 therapeutic targets. We pinpoint key secondary structure folding motifs in our quest to predict functional RNA structures. Our hierarchical folding algorithms push the frontier of prediction accuracy for functional RNA secondary structures, contributing to coronavirus treatments. [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/15913
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.subject: RNA structure prediction [en_US]
dc.subject: Free energy [en_US]
dc.subject: Partition function [en_US]
dc.subject: Pseudoknots [en_US]
dc.subject: RNA structure [en_US]
dc.subject: SARS-CoV-2 [en_US]
dc.subject: Coronaviruses [en_US]
dc.subject: Viral structure [en_US]
dc.subject: SARS coronavirus [en_US]
dc.title: Algorithms for prediction of RNA secondary structure: coronavirus pseudoknots via Shapify & CParty [en_US]
dc.type: Thesis [en_US]
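
Illustrative note (not part of the deposited record): the partition function and 'conditional' partition function ideas described in the abstract can be sketched on a toy ensemble. This is a minimal brute-force sketch, not CParty's O(N^3) recursion. The dot-bracket structures, their energies, and every function name below are hypothetical, chosen only to show how Boltzmann weights turn energies into probabilities and how fixing an input structure restricts the ensemble.

```python
import math

# Gas constant in kcal/(mol*K) and temperature in kelvin (37 degrees C).
# These are the usual conventions for RNA energy models, assumed here.
R_KCAL = 0.0019872
T_KELVIN = 310.15
RT = R_KCAL * T_KELVIN


def boltzmann_weight(energy_kcal):
    """Boltzmann weight exp(-E / RT) of a structure with free energy E."""
    return math.exp(-energy_kcal / RT)


def partition_function(ensemble):
    """Sum of Boltzmann weights over every structure in the ensemble."""
    return sum(boltzmann_weight(e) for e in ensemble.values())


def structure_probabilities(ensemble):
    """Equilibrium probability of each structure: its weight divided by Z."""
    z = partition_function(ensemble)
    return {s: boltzmann_weight(e) / z for s, e in ensemble.items()}


def is_compatible(structure, constraint):
    """True if `structure` keeps every paired position of `constraint`.

    Dots in the constraint are unconstrained positions.
    """
    return len(structure) == len(constraint) and all(
        c == "." or c == s for s, c in zip(structure, constraint)
    )


def conditional_probabilities(ensemble, constraint):
    """Probabilities over only those structures that contain `constraint`."""
    restricted = {
        s: e for s, e in ensemble.items() if is_compatible(s, constraint)
    }
    return structure_probabilities(restricted)


# Hypothetical 12-nt ensemble: structures and energies (kcal/mol) are made up.
TOY_ENSEMBLE = {
    "............": 0.0,
    "((........))": -0.8,
    ".(((....))).": -1.5,
    "((((....))))": -3.0,
}

probs = structure_probabilities(TOY_ENSEMBLE)
cond = conditional_probabilities(TOY_ENSEMBLE, "((........))")
```

Here `probs` covers all four toy structures, while `cond` covers only the two that contain the outer helix given as input. A real implementation would replace the explicit enumeration with dynamic programming over the sequence, since the number of candidate structures grows exponentially with sequence length.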

Files

Original bundle
Name: Trinity_Luke_PhD_2024.pdf
Size: 13.87 MB
Format: Adobe Portable Document Format