StretchVADER – A Rule-based Technique to Improve Sentiment Intensity Detection using Stretched Words and Fine-Grained Sentiment Analysis

dc.contributor.author: Jokhio, Muhammad Naveed
dc.contributor.supervisor: Gulliver, Thomas Aaron
dc.date.accessioned: 2024-01-22T18:58:09Z
dc.date.available: 2024-01-22T18:58:09Z
dc.date.copyright: 2024 [en_US]
dc.date.issued: 2024-01-22
dc.degree.department: Department of Electrical and Computer Engineering
dc.degree.level: Master of Applied Science M.A.Sc. [en_US]
dc.description.abstract: When someone in a horror movie shouts “HEEEELLLPPPPPPPPP”, or someone replies to a joke with a huge “HAHAHAHAHAHAHAHAHAHAHA”, the speaker is using word stretching. Word stretching is not only an integral part of spoken language but is also found in many texts. Although it is very rare in formal writing, it is frequently used on social media. Word stretching emphasizes the meaning of the underlying word, changes the context, and affects the sentiment intensity of the sentence. In this work, a rule-based, fine-grained approach to sentiment analysis named StretchVADER is introduced that extends the capabilities of the rule-based approach called VADER. StretchVADER detects sentiment intensity using textual features such as stretched words and smileys by calculating a StretchVADER Score (SVS). This score is also used to label the dataset. Many tweets were observed to contain stretched words and smileys, e.g. 28.5% in a dataset randomly extracted from Twitter. A dataset containing detailed features related to stretched words and smileys is also generated and annotated using SVS. Finally, Machine Learning (ML) models are evaluated using two different data encoding techniques, namely TF-IDF and Word2Vec. The results show that the XGBoost algorithm with 1500 gradient-boosted trees and TF-IDF data encoding achieved higher accuracy, precision, recall and F1-score than the other ML models, namely 91.24%, 91.11%, 91.24% and 91.08%, respectively. [en_US]
dc.description.scholarlevel: Graduate [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/15836
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.title: StretchVADER – A Rule-based Technique to Improve Sentiment Intensity Detection using Stretched Words and Fine-Grained Sentiment Analysis [en_US]
dc.type: Thesis [en_US]
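
The abstract describes stretched words such as “HEEEELLLPPPPPPPPP” as a textual feature for sentiment intensity. The following is a minimal Python sketch of how such words might be detected and normalized. It is not the thesis's method and does not compute the StretchVADER Score (SVS). The function names, the rule that a run of three or more identical letters marks a stretched word, and the choice to collapse each run to a single letter are all assumptions made for illustration.

```python
import re

# Assumption (not from the thesis): a word is "stretched" when the same
# letter occurs three or more times in a row, ignoring case.
_STRETCH = re.compile(r"([^\W\d_])\1{2,}", re.IGNORECASE)


def is_stretched(word: str) -> bool:
    """Return True if the word contains a run of 3+ identical letters."""
    return _STRETCH.search(word) is not None


def collapse(word: str) -> str:
    """Reduce every run of 3+ identical letters to a single letter.

    This is a simplification: a word such as "cooool" becomes "col",
    so a real system would need a dictionary lookup or a gentler rule.
    """
    return _STRETCH.sub(r"\1", word)


def stretch_length(word: str) -> int:
    """Count the letters removed by collapse(), a rough emphasis measure."""
    return len(word) - len(collapse(word))


if __name__ == "__main__":
    for token in ["HEEEELLLPPPPPPPPP", "sooo", "good"]:
        print(token, is_stretched(token), collapse(token), stretch_length(token))
```

This character-based rule does not catch repeated syllables such as “HAHAHAHAHAHAHAHAHAHAHA”; handling those would need a separate pattern for repeated letter groups.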

Files

Original bundle
Name: Jokhio_MuhammadNaveed_MASc_2024.pdf
Size: 2.43 MB
Format: Adobe Portable Document Format
Description: MASc Thesis
License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission