Date of Award

5-2026

Degree Type

Dissertation

Degree Name

Ph.D.

Degree Program

Engineering and Applied Science - Computer Science

Department

Computer Science

Major Professor

Md Tamjidul Hoque

Second Advisor

Christopher Summa

Third Advisor

Shreya Banerjee

Fourth Advisor

Dimitrios Charalampidis

Abstract

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) lack a stable three-dimensional structure. Despite this, they play important roles in many biological functions, including transcriptional regulation, signal transduction, cellular scaffolding, and development. Their structural plasticity enables functional versatility, but it also makes experimental characterization difficult, resource-intensive, and highly context-dependent. Existing structure prediction tools, including recent breakthroughs such as AlphaFold, are primarily optimized for folded proteins and often show limited accuracy for disordered regions. These limitations highlight the need for robust computational approaches to predict disordered regions and better understand their structural and functional properties.

This thesis presents five contributions toward that goal. First, it introduces a protein language model-based framework for predicting intrinsically disordered regions directly from sequence, achieving state-of-the-art performance and ranking first among all methods in the CAID2 benchmark. Second, it incorporates structural information into a transformer-based deep neural network to further improve disorder prediction accuracy, achieving the top ranking in the most recent CAID3 challenge. Third, it presents a genetic algorithm-optimized classifier that identifies DNA- and RNA-binding residues within disordered proteins by combining sequence-derived and disorder-based features. Fourth, it proposes a sequence-based method for capturing residue-level backbone flexibility through prediction of torsion-angle fluctuations. Finally, it develops a conformational ensemble generator that produces structurally diverse and experimentally validated ensembles of disordered proteins using statistical energy functions. These methods advance the computational characterization of protein disorder across multiple scales and provide a foundation for further studies of its biological roles and therapeutic relevance.

Rights

The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.

Available for download on Wednesday, April 09, 2031
