Complex graph algorithms using relational database

Ahmed, Aly

dc.contributor.author	Ahmed, Aly
dc.contributor.supervisor	Thomo, Alex
dc.date.accessioned	2021-08-24T19:18:15Z
dc.date.available	2021-08-24T19:18:15Z
dc.date.copyright	2021	en_US
dc.date.issued	2021-08-24
dc.degree.department	Department of Computer Science
dc.degree.level	Doctor of Philosophy Ph.D.	en_US
dc.description.abstract	Data processing for Big Data plays a vital role for decision-makers in organizations and government, enhances the user experience, and provides quality results in prediction analysis. However, many modern data processing solutions, such as Hadoop and Spark, require a significant investment in hardware and maintenance, and often neglect the well-established and widely used relational database management systems (RDBMSs). In this dissertation, we study three fundamental graph problems in RDBMSs. The first problem we tackle is computing shortest paths (SP) from a source to a target in large network graphs. We explore SQL-based solutions and leverage the intelligent scheduling that an RDBMS performs when executing set-at-a-time expansions of graph vertices, in contrast to the vertex-at-a-time expansions of classical SP algorithms. Our algorithms perform orders of magnitude faster than baselines and outperform counterparts in native graph databases. Second, we study the PageRank problem, which is vital in Google Search and social network analysis for sorting search results and identifying important nodes in a graph. PageRank is an iterative algorithm, which poses challenges when implementing it over large graphs. We study computing PageRank in an RDBMS for very large graphs on a consumer-grade machine and compare the results to a dedicated graph database. We show that our RDBMS solution is able to process graphs of more than a billion edges in a few minutes, whereas native graph databases fail to handle graphs of much smaller sizes. Last, we present a carefully engineered RDBMS solution to the problem of triangle enumeration for very large graphs. We show that RDBMSs are suitable tools for enumerating billions of triangles in billion-scale networks on a consumer-grade machine. We also compare our RDBMS solution's performance to a native graph database and show that our solution outperforms it by orders of magnitude.	en_US
dc.description.scholarlevel	Graduate	en_US
dc.identifier.bibliographicCitation	Aly Ahmed, Keanelek Enns, and Alex Thomo. Triangle enumeration for billion-scale graphs in RDBMS. In AINA (2), pages 160–173, 2021.	en_US
dc.identifier.bibliographicCitation	Aly Ahmed and Alex Thomo. Computing source-to-target shortest paths for complex networks in RDBMS. Journal of Computer and System Sciences, 89:114–129, 2017.	en_US
dc.identifier.bibliographicCitation	Aly Ahmed and Alex Thomo. PageRank for billion-scale networks in RDBMS. In International Conference on Intelligent Networking and Collaborative Systems, pages 89–100. Springer International Publishing, 2020.	en_US
dc.identifier.uri	http://hdl.handle.net/1828/13306
dc.language	English	eng
dc.language.iso	en	en_US
dc.rights	Available to the World Wide Web	en_US
dc.subject	Shortest Path	en_US
dc.subject	pagerank	en_US
dc.subject	RDBMS	en_US
dc.subject	Matrix partitioning	en_US
dc.subject	Big Data	en_US
dc.subject	Triangle Enumeration	en_US
dc.subject	Graph Database	en_US
dc.subject	PTE	en_US
dc.subject	Compact Forward	en_US
dc.subject	Table partitioning	en_US
dc.subject	Billion Scale Graph	en_US
dc.title	Complex graph algorithms using relational database	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Ahmed_Aly_PhD_2021.pdf
Size: 1.74 MB
Format: Adobe Portable Document Format

Collections

Electronic Theses and Dissertations (ETD)