Scalable analytics of massive graphs




Popova, Diana




Graphs are commonly chosen as a model of scientific information: they can represent imprecise, uncertain, and noisy data, and graph theory provides a well-developed mathematical apparatus that forms a sound foundation for graph research. The design and experimental validation of new, scalable, and practical analytics for massive graphs have been active research topics for decades. Our work concentrates on developing new, accurate, and efficient algorithms that compute the most influential nodes and communities in an arbitrary graph. Our algorithms for decomposing a graph into families of most influential communities run faster and with a smaller memory footprint than existing algorithms for the problem. Our algorithms for influence maximization in large graphs use much less memory than the existing state-of-the-art algorithms while providing solutions of equal accuracy. Our main contribution is the design of data structures and algorithms that drastically cut the memory footprint and scale the computation of influential communities and nodes up to massive modern graphs. The algorithms and their implementations can efficiently handle networks with billions of edges on a single consumer-grade machine. These claims are supported by extensive experiments on large real-world graphs of different types.



massive graphs, graph analytics, influence maximization, randomized algorithms, influential communities