Multi-label classification with optimal thresholding for multi-composition spectroscopic analysis

dc.contributor.author: Gan, Luyun
dc.contributor.supervisor: Lu, Tao
dc.date.accessioned: 2019-08-30T21:08:45Z
dc.date.available: 2019-08-30T21:08:45Z
dc.date.copyright: 2019 [en_US]
dc.date.issued: 2019-08-30
dc.degree.department: Department of Electrical and Computer Engineering [en_US]
dc.degree.level: Master of Applied Science M.A.Sc. [en_US]
dc.description.abstract: Spectroscopic analysis has applications in physics, chemistry, bioinformatics, geophysics, astronomy, and other fields. It has been widely used for detecting mineral samples, gas emissions, and food volatiles. Machine learning algorithms for spectroscopic analysis focus on either regression or single-label classification problems. Using multi-label classification to identify multiple chemical components from a spectrum has not been explored. In this thesis, we implement a Feed-forward Neural Network with Optimal Thresholding (FNN-OT) to identify gas species in a multi-gas mixture in a cluttered environment. Spectrum signals are first processed by a feed-forward neural network (FNN) model, which produces an individual prediction score for each gas. These scores are then passed to an optimal thresholding (OT) system. The prediction for each gas component in a testing sample is made by comparing its FNN output score against a threshold from the OT system. If the output score is larger than the threshold, the prediction is 1; otherwise it is 0, representing the presence or absence of that gas component in the spectrum. Applied to infrared absorption spectroscopy and tested on synthesized spectral datasets, our approach outperforms both the FNN alone and a conventional binary relevance method, Partial Least Squares with Binary Relevance (PLS-BR). All three models are trained and tested on 18 synthesized datasets covering 6 levels of signal-to-noise ratio and 3 types of gas correlation. They are evaluated and compared using micro-, macro-, and sample-averaged precision, recall, and F1 score. For mutually independent and randomly correlated gas data, FNN-OT performs better than the FNN alone or the conventional PLS-BR by significantly increasing recall without sacrificing much precision. For positively correlated gas data, FNN-OT captures positive label correlation from noisy datasets better than the other two models. [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/11095
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.subject: Multi-label Classification [en_US]
dc.subject: Infrared Spectroscopy [en_US]
dc.subject: Feed-forward Neural Network [en_US]
dc.subject: Machine Learning [en_US]
dc.title: Multi-label classification with optimal thresholding for multi-composition spectroscopic analysis [en_US]
dc.type: Thesis [en_US]
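
The thresholding step described in the abstract (a score above the per-gas threshold gives 1, otherwise 0) can be illustrated with a minimal Python sketch. The abstract does not state how the OT system chooses its thresholds, so the selection rule below (a grid search that maximizes per-label F1 on held-out data) is an assumption made for illustration only and may differ from the thesis. The function names, the candidate grid, and the toy scores are likewise hypothetical.

```python
import numpy as np

def f1_binary(y_true, y_pred):
    """F1 score for one binary label; returns 0.0 when it is undefined."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def fit_thresholds(scores, labels, candidates=None):
    """Pick one threshold per label (gas) from a candidate grid.

    ASSUMPTION: the criterion is per-label F1 on held-out data; the
    abstract does not specify the criterion used in the thesis.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)  # hypothetical grid
    n_labels = scores.shape[1]
    thresholds = np.empty(n_labels)
    for j in range(n_labels):
        f1s = [f1_binary(labels[:, j], (scores[:, j] > t).astype(int))
               for t in candidates]
        # argmax keeps the first (lowest) threshold among ties
        thresholds[j] = candidates[int(np.argmax(f1s))]
    return thresholds

def predict(scores, thresholds):
    """Decision rule from the abstract: 1 if score > threshold, else 0."""
    return (scores > thresholds).astype(int)

# Toy scores for 4 samples and 2 gases (made-up numbers, not thesis data).
demo_scores = np.array([[0.91, 0.22],
                        [0.42, 0.31],
                        [0.12, 0.04],
                        [0.37, 0.63]])
demo_labels = np.array([[1, 0],
                        [1, 1],
                        [0, 0],
                        [1, 1]])

demo_thresholds = fit_thresholds(demo_scores, demo_labels)
fixed_pred = predict(demo_scores, np.array([0.5, 0.5]))
tuned_pred = predict(demo_scores, demo_thresholds)
```

In this toy case a fixed 0.5 cut-off misses several present gases that the fitted per-gas thresholds recover, which is the kind of recall gain the abstract attributes to thresholding. In practice the thresholds would be fitted on data separate from the test set.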

Files

Original bundle
Name: Gan_Luyun_MASc_2019.pdf
Size: 4.82 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission