Computing Graph Metrics and Graph Properties with SQL Queries

Date

2019-08

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Within big data analytics, graph problems are as important as machine learning. There exist many algorithms to analyze large graphs, but they are limited by main memory. On the other hand, a lot of data stored on DBMSs needs to be analyzed as graphs. Moreover, DBMSs can work in parallel, and they do not have RAM limitations. In this paper, we propose several algorithms that produce metrics and show properties of the graph as well as help us to understand the graph structure specifically diameter and betweenness centrality. This work is a big step beyond transitive closure and recursive queries. We propose optimized SQL queries that work on a graph stored in relational form as triples which can compute diameter and betweenness centrality in a more flexible and efficient manner. We study how to optimize SQL queries combining demanding joins and aggregations that remove main memory limitation and also work in parallel. Finally, we provide an experimental evaluation to understand accuracy and performance. We compare our algorithms with popular platforms including Python and Spark. We experimentally show our that SQL algorithms are accurate and efficient.

Description

Keywords

Graph analysis, SQL queries, Betweenness centrality, Diameter, DBMS

Citation