Evaluation of intra-set clustering techniques for redundant social media content

Date

2018-12-19

Authors

Jubinville, Jason

Abstract

This thesis evaluates techniques for intra-set clustering of social media data from an industry perspective. The research goal was to establish methods for reducing the amount of redundant information an end user must review in the results of a standard social media search. The research evaluated both clustering algorithms and string similarity measures for their effectiveness in clustering a selection of real-world topic-based and location-based social media searches. In addition, the algorithms and similarity measures were tested in scenarios based on industry constraints such as rate limits. The results were evaluated using several practical measures to determine which techniques were effective.

Keywords

social media, Twitter, clustering, T Information, Jaccard, Hamming, T Codes
