Iimi: A novel automated workflow for plant virus diagnostics from high-throughput sequencing data




Ning, Haochen




Several workflows have been developed for the diagnostic testing of plant viruses using high-throughput sequencing methods. Most of these workflows require considerable expertise and input from the analyst, both to run the analysis and to interpret the results when deciding on a plant’s disease status. The most common detection methods use workflows based on de novo assembly, read mapping, or both. Existing virus detection software mainly uses simple deterministic rules for decision-making, so interpreting the results still requires a certain level of understanding of virology. This can lead to inconsistencies in data interpretation between analysts, which can have serious ramifications. To address these challenges, we developed an automated workflow using machine-learning methods that reduces human interaction while increasing recall, precision, and consistency. Our workflow involves sequence data mapping, feature extraction, and machine-learning model training. Using real data, we compared the performance of our method with that of other popular approaches and show that our approach increases recall and precision while decreasing detection time for most types of sequencing data.
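To make the feature-extraction step and the contrast with deterministic rules more concrete, the following is a minimal sketch and not the workflow's actual implementation. It assumes that mapping has already produced a per-position read-depth profile for one viral genome. The feature names, the `extract_features` and `rule_based_call` helpers, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CoverageFeatures:
    """Hypothetical summary features of a read-depth profile."""
    covered_fraction: float   # share of positions with depth >= min_depth
    mean_depth: float         # average read depth over all positions
    longest_covered_run: int  # longest stretch of consecutive covered positions


def extract_features(depth: Sequence[int], min_depth: int = 1) -> CoverageFeatures:
    """Summarise a per-position depth profile into a small feature vector."""
    if not depth:
        raise ValueError("depth profile is empty")
    covered = 0
    longest = 0
    current = 0
    for d in depth:
        if d >= min_depth:
            covered += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a gap ends the current covered run
    return CoverageFeatures(
        covered_fraction=covered / len(depth),
        mean_depth=sum(depth) / len(depth),
        longest_covered_run=longest,
    )


def rule_based_call(features: CoverageFeatures,
                    min_covered_fraction: float = 0.5) -> bool:
    """A simple deterministic rule (assumed threshold), shown for contrast."""
    return features.covered_fraction >= min_covered_fraction


# Toy depth profile for a 10-position genome (made-up numbers, not real data).
profile = [0, 0, 5, 7, 9, 4, 0, 0, 3, 2]
features = extract_features(profile)
detected = rule_based_call(features)
```

In a machine-learning setting, feature vectors like this one would be computed for many labelled samples and passed to a trained classifier instead of a fixed threshold, so the decision no longer depends on an analyst's choice of cut-off.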



Machine Learning, Virus diagnostics