Functional principal component analysis based machine learning algorithms for spectral analysis

Date

2021-09-07

Authors

Bie, Yifeng

Abstract

Optical absorption spectroscopy probes the electronic and vibrational structure of molecules, which makes it a reliable tool for molecular quantification and classification with high sensitivity, a low limit of detection (LoD), and immunity to electromagnetic noise. Because spectra are sensitive to slight variations in the analyte, they are often used to identify the components of a sample. This thesis proposes several methods for fast classification and quantification of analytes based on their absorbance spectra. Functional principal component analysis (fPCA) is employed for feature extraction and dimension reduction. For 1,000-pixel spectral data, fPCA can capture the majority of the variance with as few output scores as the number of expected analytes, which reduces the computation required by the downstream machine learning algorithms. The output scores are then fed into XGBoost and logistic regression for classification, and into XGBoost and linear regression for quantification. Our models were tested on both synthesized datasets and an experimentally acquired dataset. They demonstrated performance similar to that of deep learning, but with much faster processing speeds. For the synthesized 30 dB dataset, XGBoost with fPCA reached a micro-averaged F1 score of 0.9551 ± 0.0008, compared with 0.940 ± 0.001 for FNN-OT [1]. fPCA helped the algorithms extract the features of each analyte; furthermore, the output scores had a nearly linear relationship with the analyte concentrations. With fPCA, it was much easier for the algorithms to find the mapping function between inputs and outputs, which shortened both training and testing time.
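The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions: the spectra are synthetic mixtures of three hypothetical Gaussian absorption bands, ordinary PCA on the discretized 1,000-pixel grid stands in for fPCA (which would normally add a smooth basis expansion), and a least-squares linear model stands in for the XGBoost and regression models. All function names and parameter values are illustrative only.

```python
import numpy as np


def make_spectra(n_samples=200, n_pixels=1000, n_analytes=3, snr_db=30.0, seed=0):
    """Synthesize absorbance spectra as concentration-weighted sums of Gaussian bands."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_pixels)
    centers = np.linspace(0.25, 0.75, n_analytes)
    # One hypothetical absorption band per analyte, shape (n_analytes, n_pixels).
    bands = np.exp(-0.5 * ((x[None, :] - centers[:, None]) / 0.05) ** 2)
    conc = rng.uniform(0.0, 1.0, size=(n_samples, n_analytes))
    clean = conc @ bands  # absorbance is linear in concentration
    # Add white Gaussian noise at the requested signal-to-noise ratio.
    noise_power = np.mean(clean ** 2) / 10 ** (snr_db / 10)
    noisy = clean + rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return noisy, conc


def fit_pca(spectra, n_components):
    """Fit PCA via SVD; return the mean, the components, and explained variance ratios."""
    mean = spectra.mean(axis=0)
    _, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]


def transform(spectra, mean, components):
    """Project spectra onto the principal components to obtain the scores."""
    return (spectra - mean) @ components.T


def fit_linear(scores, conc):
    """Least-squares map (with intercept) from scores to concentrations."""
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
    return coef


def predict_linear(coef, scores):
    design = np.column_stack([np.ones(len(scores)), scores])
    return design @ coef


spectra, conc = make_spectra()
train, test = slice(0, 150), slice(150, None)

# Reduce 1,000 pixels to as many scores as there are analytes (three here).
mean, components, explained = fit_pca(spectra[train], n_components=3)
train_scores = transform(spectra[train], mean, components)
test_scores = transform(spectra[test], mean, components)

# Quantification: linear regression from scores to concentrations.
coef = fit_linear(train_scores, conc[train])
pred = predict_linear(coef, test_scores)
rmse = float(np.sqrt(np.mean((pred - conc[test]) ** 2)))

print("variance captured by 3 scores:", float(explained.sum()))
print("test RMSE of predicted concentrations:", rmse)
```

In this toy setting the three scores carry almost all of the variance, and a linear model on the scores recovers the concentrations, which mirrors the near-linear relationship noted in the abstract. For classification, the same scores could be passed to a classifier such as logistic regression or XGBoost in place of the linear model.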

Keywords

fPCA, Machine Learning, Spectral Analysis, Absorbance Spectrum
