Date of Award


Degree Type


Degree Name


Degree Program

Computer Science


Computer Science

Major Professor

Md Tamjidul Hoque


Noncoding RNAs (ncRNAs) play a significant role in several fundamental biological processes by binding to RNA-binding proteins (RBPs); hence, it is necessary to study ncRNA-protein interaction (RPI). Several classic and deep-learning machine learning models have been pro-posed to predict RPI. These models first need to collect features of RNA and protein, such as physicochemical properties, secondary and tertiary structure, et cetera, before feeding them into the model. More recently, after the advancement of high throughput sequenc-ing and the improvement in Natural Language Processing (NLP), transformer models like BERT-RBP and Evolutionary Scaling Model (ESM) can be trained to automatically extract feature representations, containing both low and high-level information, from RNA and pro-tein sequences directly. This method could make manual feature collection optional. Hence, in this study, we compare the performance of such language-based features against manually created features to predict the interaction probability between a protein and an RNA.


The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.