Fast and scalable triangle counting in graph streams: the hybrid approach
Date
2020-12-14
Authors
Singh, Paramvir
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Triangle counting is a major graph problem with several applications in social network analysis, anomaly detection, etc. A considerable amount of work has contributed to approximately computing the global triangle counts using several computational models. One of the most popular streaming models considered is Edge Streaming in which the edges arrive in the form of a graph stream. We categorize the existing literature into two categories: Fixed Memory (FM) approach, and Fixed Probability (FP) approach. As the size of the graphs grows, several challenges arise such as memory space limitations, and prohibitively long running time. Therefore, both FM and FP categories exhibit some limitations. FP algorithms fail to scale for massive graphs. We identified a limitation of FM category $i.e.$ FM algorithms have higher computational time than their FP variants.
In this work, we present a new category called the Hybrid approach that overcomes the limitations of both FM and FP approaches. We present two new algorithms that belong to the hybrid category: Neighbourhood Hybrid Multisampling (NHMS) and Triest/ThinkD Hybrid Sampling (THS) for estimating the number of global triangles in graphs. These algorithms are highly scalable and have better running time than FM and FP variants. We experimentally show that both NHMS and THS outperform state-of-the-art algorithms in space-efficient environments.
Description
Keywords
Graph mining, Triangle counting, approximation algorithms, Edge Streaming