UVicSpace

Detecting opinion spam and fake news using n-gram analysis and semantic similarity

dc.contributor.author Ahmed, Hadeer
dc.date.accessioned 2017-11-14T15:34:59Z
dc.date.available 2017-11-14T15:34:59Z
dc.date.copyright 2017 en_US
dc.date.issued 2017-11-14
dc.identifier.uri https://dspace.library.uvic.ca//handle/1828/8796
dc.description.abstract In recent years, deceptive content such as fake news and fake reviews, also known as opinion spam, has become an increasingly serious danger for online users. Fake reviews affect consumers and stores alike. Furthermore, the problem of fake news gained attention in 2016, especially in the aftermath of that year's US presidential election. Fake reviews and fake news are closely related phenomena, as both consist of writing and spreading false information or beliefs. The opinion spam problem was first formulated only a few years ago, but it has quickly become a growing research area due to the abundance of user-generated content. It is now easy for anyone to write fake reviews or fake news on the web. The biggest challenge is the lack of an efficient way to tell the difference between a real review and a fake one; even humans are often unable to tell the difference. In this thesis, we have developed an n-gram model to automatically detect fake content, with a focus on fake reviews and fake news. We studied and compared two different feature extraction techniques and six machine learning classification techniques. Furthermore, we investigated the impact of keystroke features on the accuracy of the n-gram model. We also applied semantic similarity metrics to detect near-duplicated content. Experimental evaluation of the proposed approach, using existing public datasets and a newly introduced fake news dataset, indicates improved performance compared to the state of the art. en_US
dc.language English eng
dc.language.iso en en_US
dc.rights Available to the World Wide Web en_US
dc.subject Classification en_US
dc.subject Fake content en_US
dc.subject Fake news en_US
dc.subject n-gram en_US
dc.subject Machine learning en_US
dc.subject Semantic Similarity en_US
dc.subject Keystrokes pattern en_US
dc.title Detecting opinion spam and fake news using n-gram analysis and semantic similarity en_US
dc.type Thesis en_US
dc.contributor.supervisor Traoré, Issa
dc.degree.department Department of Electrical and Computer Engineering en_US
dc.degree.level Master of Applied Science M.A.Sc. en_US
dc.description.scholarlevel Graduate en_US

