Triangle Enumeration in Massive Graphs using Map Reduce

Date

2018-05-16

Authors

Bhojwani, Pooja

Abstract

In this era of big data, graphs, which add the advantage of a structural representation of data, have gained extreme importance. Analyzing the graph structure of data provides deep, meaningful insights about it and is widely used in a vast number of applications. Enumerating triangles is one of the crucial pillars of complex graph analysis and lays the basis for two of the most fundamental measures of the stability of a network: the clustering coefficient and the transitivity ratio. Besides, triangle listing also has applications in a wide range of domains, such as spam detection, finding communities, fake account detection in social networks, and many more. Several internal-memory algorithms have been proposed to tackle this problem. However, these algorithms do not scale to the massive graphs generated from big data. One way to solve this is to use the power of parallel computation and distribute the work across multiple machines. Google's map-reduce model implements parallel computation and also manages data partitioning. In this project, our goal is to list triangles in massive directed and undirected graphs using map-reduce. For triangle enumeration in undirected graphs, we implement an existing map-reduce algorithmic solution. We also propose an extension of the algorithm for detecting directed cycle and trust triangles. Finally, we perform an extensive evaluation of the proposed map-reduce solution for both directed and undirected graphs on real-world datasets. Experimental results show that these algorithms are able to enumerate the triangles in very large graphs within a very short span of time.
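
The abstract does not say which existing map-reduce algorithm is implemented, so the following is only an illustrative sketch of one simple two-round scheme for undirected graphs, simulated in memory in plain Python rather than on Spark. The names run_round and enumerate_triangles are hypothetical, and node identifiers are assumed to be mutually comparable (for example, integers). Round 1 groups each edge under its smaller endpoint and emits every pair of that node's higher neighbours as an open wedge; round 2 keeps a wedge only if its closing edge exists in the graph.

```python
from collections import defaultdict
from itertools import combinations


def run_round(records, mapper, reducer):
    """Simulate one map-reduce round in memory: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


def enumerate_triangles(edges):
    """List each triangle of an undirected graph once, as a sorted tuple."""
    # Normalise: drop self-loops and duplicates, orient each edge low -> high.
    edges = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})

    # Round 1: key each edge by its smaller endpoint; the reducer emits every
    # pair of higher neighbours (v, w) as a wedge centred at u.
    def map_edge(edge):
        u, v = edge
        yield u, v

    def reduce_wedges(u, neighbours):
        for v, w in combinations(sorted(neighbours), 2):
            yield ("wedge", (v, w), u)

    wedges = run_round(edges, map_edge, reduce_wedges)

    # Round 2: join wedges with the edge list on the candidate closing edge.
    tagged_edges = [("edge", edge, None) for edge in edges]

    def map_join(record):
        tag, key, payload = record
        yield key, (tag, payload)

    def reduce_close(key, values):
        # A wedge becomes a triangle only if its closing edge is present.
        if any(tag == "edge" for tag, _ in values):
            v, w = key
            for tag, u in values:
                if tag == "wedge":
                    yield (u, v, w)

    return sorted(run_round(wedges + tagged_edges, map_join, reduce_close))
```

Because every wedge is centred at the smallest node of its candidate triangle, each triangle is reported exactly once. For example, calling enumerate_triangles on the edges (1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5) returns the two triangles (1, 2, 3) and (2, 3, 4).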

Keywords

Map Reduce, Spark, Graph Theory, Triangle Enumeration, Big Data