Mining Small Subgraphs in Massive Graphs




Santoso, Yudi




Graph (or network) analysis can reveal insights that are not obvious through other methods. In a graph, entities are represented by nodes (vertices), and the relations or connections among them are represented by edges. Analysing a graph can show how a system works and how its parts relate to one another. Many graph analytics problems require finding all subgraphs that match specific patterns within a given graph. This task is nontrivial for massive graphs with millions or even billions of nodes and edges, and it becomes harder still when it must be completed within limited time and with limited computational resources. In this dissertation, we focus on building efficient algorithms and methods to enumerate graphlets, that is, small connected induced subgraphs, in both undirected and directed graphs. With our solutions, we are able to enumerate graphlets of up to 5 nodes in some massive graphs using only a single commodity machine, producing trillions of graphlets.



Graph analysis, Subgraph enumeration