Clustering by Gaussian Mixture Model and Light Gradient Boosting Machine

dc.contributor.author: Yang, Feihan
dc.contributor.supervisor: Dong, Xiao-Dai
dc.date.accessioned: 2024-05-25T16:14:01Z
dc.date.available: 2024-05-25T16:14:01Z
dc.date.issued: 2024
dc.degree.department: Department of Electrical and Computer Engineering
dc.degree.level: Master of Engineering MEng
dc.description.abstract: This project studies clustering with a Gaussian mixture model (GMM) and a Bayesian Gaussian mixture model (BGMM), each combined with the light gradient boosting machine (LightGBM). One common unsupervised learning method for clustering, K-means, serves as the baseline for comparison. LightGBM is an ensemble supervised learning method that combines a number of weak learners to form a strong learner. In this project, LightGBM is combined with BGMM and GMM to improve the clustering performance. A Kaggle competition dataset is used to test these different learning algorithms. Performance evaluation is based on the Rand index, which assesses the similarity between the ground-truth clusters and the predicted clusters. Moreover, intracluster and intercluster distances, which indicate the aggregation of the clusters and the separation between different clusters, respectively, are calculated to generate other performance metrics. In particular, an intercluster distance named multi-cluster average centroid linkage distance is proposed to simplify the distance computation while retaining high precision. The evaluation results reveal that LightGBM with BGMM consistently outperforms the other methods, making it a preferred classification approach for the dataset.
dc.description.scholarlevel: Graduate
dc.identifier.uri: https://hdl.handle.net/1828/16559
dc.language.iso: en
dc.subject: unsupervised learning
dc.subject: LightGBM
dc.title: Clustering by Gaussian Mixture Model and Light Gradient Boosting Machine
dc.type: project
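
The abstract names the Rand index as its main evaluation measure. As a rough illustration only, and not the project's own code, the sketch below computes the standard (unadjusted) Rand index as the fraction of sample pairs on which two labelings agree. The function name `rand_index`, the toy labels, and the convention of returning 1.0 for fewer than two samples are assumptions made for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    samples in the same cluster, or both put them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        # Assumed convention for this sketch: no pairs means no disagreement.
        return 1.0

    agreements = 0
    for i, j in pairs:
        same_in_truth = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_truth == same_in_pred:
            agreements += 1
    return agreements / len(pairs)


# Toy labels (hypothetical, not taken from the Kaggle dataset).
truth = [0, 0, 1, 1]
pred = [1, 1, 0, 2]
score = rand_index(truth, pred)
```

Because only pairwise co-membership is compared, the score does not depend on how cluster labels are numbered, which is why it suits comparing predicted clusters against ground-truth clusters.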

Files

Original bundle
Name: Yang_Feihan_MEng_2024
Size: 669.76 KB
Format: Adobe Portable Document Format