Application of data augmentation techniques in metabolomics correlation network




Emadi, Mina




Motivation: Metabolomics is central to modern biological research, enabling a deep understanding of the interplay of small molecules within organisms. These small molecules play a pivotal role in disease detection, progression monitoring, and the tailoring of therapeutic strategies. Correlation networks, which depict the interdependencies between metabolites, form the backbone of many metabolomic studies. However, the precision of these networks is often limited by the lack of large, high-quality datasets, a recurring challenge in clinical metabolomics. While machine learning has transformed numerous disciplines by extracting patterns from vast datasets, its application to the typically smaller clinical metabolomics datasets remains suboptimal. This gap between the potential of machine learning and the constraints of available data is the focus of our study.

Results: We implemented two data augmentation techniques: pairwise mean augmentation and noise introduction. These techniques increased the size and variability of our datasets, enhancing the reliability of the resulting correlation networks. Furthermore, we introduced the "Strongly Correlated Network", a novel network construction algorithm. Our method simplifies network complexity while retaining critical interconnections and, compared with traditional correlation networks, showed superior reliability and robustness. Overall, the results underscore the potential of data augmentation techniques to strengthen correlation networks, especially when sample sizes are limited.
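
As an illustration only, the two augmentation ideas named above might be sketched as follows. The abstract does not give the exact procedures, so everything here is an assumption: the function names, the use of all sample pairs, Gaussian noise scaled to each metabolite's standard deviation, and the `scale` and `threshold` values are all hypothetical. The edge-building step is a plain absolute-correlation threshold and is not the "Strongly Correlated Network" algorithm, whose details are not described here.

```python
import itertools
import numpy as np

def pairwise_mean_augment(X):
    """Assumed form: one synthetic sample per pair of real samples (their mean)."""
    pairs = list(itertools.combinations(range(X.shape[0]), 2))
    i, j = np.array(pairs).T
    return (X[i] + X[j]) / 2.0

def noise_augment(X, scale=0.05, copies=1, rng=None):
    """Assumed form: Gaussian noise proportional to each metabolite's std."""
    rng = np.random.default_rng(rng)
    sd = X.std(axis=0, ddof=1)  # per-metabolite (column) standard deviation
    noisy = [X + rng.normal(0.0, 1.0, X.shape) * scale * sd for _ in range(copies)]
    return np.vstack(noisy)

def correlation_edges(X, threshold=0.8):
    """Generic correlation network: keep metabolite pairs with |Pearson r| >= threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns are metabolites
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    keep = np.abs(corr[rows, cols]) >= threshold
    return [(int(a), int(b), float(corr[a, b])) for a, b in zip(rows[keep], cols[keep])]

# Toy data: 6 samples x 4 metabolites, with metabolite 1 tracking metabolite 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X[:, 1] = 2.0 * X[:, 0] + rng.normal(scale=0.01, size=6)

X_aug = np.vstack([X, pairwise_mean_augment(X), noise_augment(X, rng=1)])
edges = correlation_edges(X_aug)
```

In this sketch the augmented matrix stacks the original samples with the pairwise means and one noisy copy, so the network is estimated from more rows than the original data provides. Whether the synthetic rows should be pooled with the originals in this way is a design choice the abstract does not settle.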



Machine Learning, Bioinformatics