Triangle Enumeration in Massive Graphs using Map Reduce
Date
2018-05-16
Authors
Bhojwani, Pooja
Abstract
In this era of big data, graphs, which add the advantage of a structural representation
of data, have gained considerable importance. Analyzing the graph structure of the
data provides deep, meaningful insights and is used in a vast number of applications.
Enumerating triangles is one of the crucial pillars of complex graph analysis and lays
the basis for two of the most fundamental measures of the stability of a network: the
clustering coefficient and the transitivity ratio. Triangle listing also has applications
in a wide range of domains, such as spam detection, finding communities, and
fake account detection in social networks. Several internal-memory
algorithms have been proposed to tackle this problem. However, these algorithms do
not scale to the massive graphs generated from big data. One way to address this is
to use parallel computation and distribute the work across multiple machines.
Google's map-reduce model implements parallel computation and
also manages data partitioning.
In this project, our goal is to list triangles in massive directed and undirected
graphs using map-reduce. For triangle enumeration in undirected graphs, we implement
an existing map-reduce algorithm. We also propose an extension of that algorithm
that detects directed cycle triangles and trust triangles. Finally, we perform
an extensive evaluation of the map-reduce solutions for both directed and
undirected graphs on real-world datasets. Experimental results show that these
algorithms are able to enumerate the triangles in very large graphs within a very
short span of time.
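
To make the idea concrete, the following is a minimal sketch of how triangle enumeration in an undirected graph can be phrased as two map-reduce rounds. The abstract does not say which existing algorithm the project implements, so this should be read as one common two-round formulation rather than as the author's method. The helper `map_reduce` is a hypothetical in-memory stand-in for a real framework such as Spark or Hadoop, and the function names and the choice to order vertices by ID are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_reduce(records, mapper, reducer):
    """Hypothetical in-memory stand-in for one map-reduce round:
    apply the mapper, group values by key, then apply the reducer."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


def enumerate_triangles(edges):
    """Return each triangle of an undirected graph once, as (u, v, w) with u < v < w."""
    # Normalise the input: drop self-loops and duplicates, store edges as (low, high).
    edge_list = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})

    # Round 1: group every edge under its smaller endpoint u, then emit one
    # "wedge" (v, w) for each pair of larger neighbours of u.
    def map_edges(edge):
        u, v = edge
        yield u, v

    def reduce_wedges(u, neighbours):
        for v, w in combinations(sorted(neighbours), 2):
            yield ("wedge", (v, w), u)

    wedges = map_reduce(edge_list, map_edges, reduce_wedges)

    # Round 2: join the wedges with the original edges on the key (v, w).
    # A wedge centred at u closes into a triangle iff the edge (v, w) exists.
    def map_join(record):
        if record[0] == "wedge":
            _, pair, centre = record
            yield pair, ("wedge", centre)
        else:
            _, pair = record
            yield pair, ("edge", None)

    def reduce_join(pair, values):
        if any(tag == "edge" for tag, _ in values):
            for tag, centre in values:
                if tag == "wedge":
                    yield (centre, pair[0], pair[1])

    tagged_edges = [("edge", e) for e in edge_list]
    return sorted(map_reduce(wedges + tagged_edges, map_join, reduce_join))


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
    print(enumerate_triangles(sample))
```

Because every wedge is centred at the smallest vertex of the candidate triangle, each triangle is reported exactly once. Ordering vertices by degree instead of by ID is a common refinement that keeps high-degree vertices from generating most of the wedges; it is omitted here for brevity.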
Keywords
Map Reduce, Spark, Graph Theory, Triangle Enumeration, Big Data