ORCID ID

0000-0002-6199-346X

Date of Award

8-2022

Degree Type

Dissertation

Degree Name

Ph.D.

Degree Program

Engineering and Applied Science - Computer Science

Department

Computer Science

Major Professor

Shaikh Arifuzzaman

Second Advisor

Khaled Ibrahim

Third Advisor

Md Tamjidul Hoque

Fourth Advisor

Dimitrios Charalampidis

Fifth Advisor

Mahdi Abdelguerfi

Abstract

Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an active area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Graph mining addresses diverse problems drawn from many areas of daily life. With advances in data and computing technologies, graph data is growing at an enormous rate; for example, the number of links in social networks grows every millisecond. Machine learning and deep learning play a significant role in working with big data in the modern era. We work on a well-known graph problem, community detection (CD). We design parallel algorithms for the Louvain method on static networks and show around 12-fold speedup. The implementations use both shared-memory and distributed-memory parallelism. We also show how communities change in dynamic networks across different time phases by computing several graph metrics based on their temporal definitions. We detect temporal communities in dynamic networks representing social, brain, communication, and citation networks in a more concrete way. We present both shared-memory and distributed-memory parallel algorithms for CD in dynamic graphs using permanence, a vertex-based metric. To the best of our knowledge, our parallel CD algorithm for temporal graphs, implemented using the Message Passing Interface (MPI), is the first MPI-based algorithm for this problem. Our algorithm achieves 30× speedup for the largest network with billions of edges. We present a scalable method for CD based on a Graph Convolutional Network (GCN) via semi-supervised node classification, using PyTorch with CUDA in a GPU environment (4× performance gain). Our model achieves up to 86.9% accuracy and an F1 score of 0.85 on real-world datasets from diverse domains. We provide a scalable solution to the Sparse Deep Neural Network (DNN) Challenge by designing a data-parallel sparse DNN using TensorFlow on GPU (4.7× speedup). To portray the importance of graph mining in daily activities, we include several applications: webspam detection from webgraphs (billions of edges), sentiment analysis on the Twitter social network (1.2 million tweets) to reveal insights about COVID-19 vaccination awareness among the public, and time-series forecasting of the vaccinated population in the USA.

Rights

The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.
