KnotAli: informed energy minimization through the use of evolutionary information

dc.contributor.author: Gray, Mateo
dc.contributor.supervisor: Jabbari, Hosna
dc.contributor.supervisor: Chester, Sean
dc.date.accessioned: 2021-08-31T18:31:01Z
dc.date.copyright: 2021 [en_US]
dc.date.issued: 2021-08-31
dc.degree.department: Department of Computer Science [en_US]
dc.degree.level: Master of Science M.Sc. [en_US]
dc.description.abstract: Motivation: Improving the prediction of RNA secondary structures, especially those containing pseudoknots (structures with crossing base pairs), is an ongoing challenge. Current alignment-based prediction algorithms find only the consensus structure. Their input alignments can come from structure-based alignment algorithms, which are more reliable but more costly than sequence-based alignment algorithms. The alignment step can be removed altogether; however, non-alignment-based algorithms neglect structural information that can be found within similar sequences. Results: We present a new method for predicting pseudoknotted RNA secondary structures that combines the strengths of minimum free energy (MFE) prediction and alignment-based methods. KnotAli takes an RNA sequence alignment and uses covariation and thermodynamic energy minimization to predict a secondary structure for each individual sequence in the alignment. We compared KnotAli's performance with that of three other alignment-based algorithms on a large data set of 10 families with pseudoknotted and pseudoknot-free reference structures. We produced sequence alignments for each family using two well-known sequence aligners, MUSCLE and MAFFT. KnotAli was superior in 6 of the 10 families with MUSCLE alignments and in 7 of the 10 with MAFFT alignments. We also find KnotAli's predictions to be less dependent on alignment quality: its predictions are more accurate than those of other leading methods as alignment quality deteriorates. Availability: The algorithm is available on GitHub at https://github.com/mateog4712/KnotAli [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/13342
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.subject: RNA secondary structure [en_US]
dc.subject: MFE [en_US]
dc.subject: Pseudoknot [en_US]
dc.subject: Thermodynamic energy minimization [en_US]
dc.subject: Sequence alignment [en_US]
dc.subject: Covariation [en_US]
dc.title: KnotAli: informed energy minimization through the use of evolutionary information [en_US]
dc.type: Thesis [en_US]
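
The abstract in this record defines pseudoknots as structures with crossing base pairs. As a minimal illustrative sketch of that definition only (it is not taken from the thesis or from the KnotAli code base, and the function name `has_pseudoknot` and the 0-based index convention are assumptions made here), two base pairs (i, j) and (k, l) cross when i < k < j < l:

```python
from typing import List, Tuple


def has_pseudoknot(pairs: List[Tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross, i.e. i < k < j < l.

    `pairs` holds 0-based positions of paired bases (an assumed convention).
    """
    # Normalise each pair so the smaller index comes first, then sort by it.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(ordered):
        # Every later pair opens at or after i, so only the crossing test remains.
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


nested = [(0, 9), (1, 8), (2, 7)]  # a plain stem: pairs are nested, none cross
crossing = [(0, 5), (3, 8)]        # second pair opens inside the first, closes outside

print(has_pseudoknot(nested))
print(has_pseudoknot(crossing))
```

The quadratic scan is chosen for readability; it says nothing about how KnotAli itself represents or predicts pseudoknotted structures.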

Files

Original bundle
Name: Gray_Mateo_MASc-2021.pdf
Size: 693.22 KB
Format: Adobe Portable Document Format
Description: Manuscript
License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission