Intra-topic clustering for social media

Date

2020-08-28

Authors

Gondhi, Uttej Reddy

Abstract

Social media platforms lead the internet in user base and average time spent, and they generate a significant amount of data every day. This makes them a go-to place for understanding people's reviews, trends, and opinions. A regular search for a popular topic returns an abundance of information, and it is impractical to go through such large amounts of data manually to understand the trends. This thesis discusses techniques for the intra-topic clustering of social media data and examines how social media noise increases the redundancy of search results. Our goal is to reduce the amount of redundant information an end user must review after a regular social media search. The research proposes clustering models based on two string similarity measures: Jaccard word token and T-Information distance. Evaluation parameters are introduced, and the models are evaluated by clustering a set of current and historical topics to determine which techniques are the most effective.
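As a rough illustration of the first measure named in the abstract, the sketch below computes a Jaccard similarity over word tokens and uses it to group near-duplicate posts. Everything in it is an assumption made for illustration: the function names, the tokenization rule, the 0.5 threshold, and the greedy grouping strategy are hypothetical and are not taken from the thesis. T-Information distance is not shown.

```python
import re


def word_tokens(text):
    """Lowercase a post and return its set of word tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_similarity(a, b):
    """Jaccard similarity of two posts: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = word_tokens(a), word_tokens(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty posts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def greedy_cluster(posts, threshold=0.5):
    """Put each post in the first cluster whose first member is similar enough.

    The threshold and the greedy strategy are illustrative assumptions,
    not the clustering model proposed in the thesis.
    """
    clusters = []
    for post in posts:
        for cluster in clusters:
            if jaccard_similarity(post, cluster[0]) >= threshold:
                cluster.append(post)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([post])
    return clusters


posts = [
    "Big storm hits the coast tonight",
    "big storm hits the coast tonight!!",
    "New phone launch announced today",
]
clusters = greedy_cluster(posts)
```

Here the first two posts differ only in case and punctuation, so they share one cluster, and an end user would need to read only one representative from it.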

Keywords

Social media, clustering, intra-topic clustering, Tweet clustering
