Date of Award

8-2022

Degree Type

Thesis

Degree Name

M.S.

Degree Program

Computer Science

Department

Computer Science

Major Professor

Dr. Md Tamjidul Hoque

Second Advisor

Dr. Christopher M Summa

Third Advisor

Dr. Atriya Sen

Abstract

Protein-protein interactions (PPIs) in a cell are essential to the characterization and performance of various fundamental biological processes. Because experimental methods for identifying them are tedious, resource-intensive, and time-consuming, computational prediction of protein pair interactions has emerged as an active research area in bioinformatics. This research seeks to develop a machine learning-based technique that predicts whether a protein pair interacts, using carefully selected input features and information-rich evolutionary information. We developed a protein-protein interaction predictor, PPILS, that leverages the evolutionary knowledge from a protein language model. We examined several distinct neural network architectures (CNN+LSTM, Transformer, Encoder-Decoder, and FNN) and found that the encoder-decoder architecture with light attention performs best. The method is straightforward; there are only four learnable weight matrices. The model receives protein representations from the language model, applies one convolution to them to obtain attention coefficients, and normalizes these along the length dimension with the softmax function to produce attention weights. A second convolution is applied to the input features to create values. The element-wise product of attention and values is then summed over the length dimension, which yields a fixed-size protein representation. This is then concatenated with the maximum over the length dimension of the data and fed to the decoder. The decoder is our classification engine for predicting protein interactions. We found that PPILS outperformed other cutting-edge techniques for PPI prediction. We believe the proposed method could serve as an essential tool in protein-protein interaction prediction, further accelerating the protein drug discovery process.
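
The attention pooling step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names, kernel size, embedding and output dimensions, and random weights are all hypothetical, NumPy stands in for whatever deep learning framework was actually used, and taking the maximum over the values (rather than over the raw inputs) is an assumed reading of the abstract. The decoder and the way the two proteins of a pair are combined are not shown, since the abstract does not specify them.

```python
import numpy as np

def conv1d_same(x, w, b):
    """1D convolution over the length axis with zero padding.

    x: (length, d_in), w: (kernel, d_in, d_out), b: (d_out,)
    Returns an array of shape (length, d_out).
    """
    kernel = w.shape[0]
    pad = kernel // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # zero-pad the length axis only
    length = x.shape[0]
    out = np.empty((length, w.shape[2]))
    for i in range(length):
        window = xp[i:i + kernel]  # (kernel, d_in)
        out[i] = np.einsum("kd,kdo->o", window, w) + b
    return out

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def light_attention_pool(x, w_att, b_att, w_val, b_val):
    """Map a variable-length embedding (length, d_in) to a fixed-size vector."""
    coeff = conv1d_same(x, w_att, b_att)        # attention coefficients
    attention = softmax(coeff, axis=0)          # normalize along length
    values = conv1d_same(x, w_val, b_val)       # second convolution: values
    summed = (attention * values).sum(axis=0)   # weighted sum over length
    maxed = values.max(axis=0)                  # assumption: max over values
    return np.concatenate([summed, maxed])      # shape (2 * d_out,)

# Hypothetical sizes: 8-dim residue embeddings, 4 output channels, kernel 3.
rng = np.random.default_rng(0)
d_in, d_out, kernel = 8, 4, 3
w_att = rng.normal(size=(kernel, d_in, d_out))
w_val = rng.normal(size=(kernel, d_in, d_out))
b_att = np.zeros(d_out)
b_val = np.zeros(d_out)

# Two proteins of different lengths yield vectors of the same size.
rep_a = light_attention_pool(rng.normal(size=(7, d_in)), w_att, b_att, w_val, b_val)
rep_b = light_attention_pool(rng.normal(size=(12, d_in)), w_att, b_att, w_val, b_val)
print(rep_a.shape, rep_b.shape)
```

Because the softmax and the sum both run over the length axis, the output size depends only on the number of output channels, which is what lets a fixed-size classifier consume proteins of arbitrary length.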

Rights

The University of New Orleans and its agents retain the non-exclusive license to archive and make accessible this dissertation or thesis in whole or in part in all forms of media, now or hereafter known. The author retains all other ownership rights to the copyright of the thesis or dissertation.

Recommended Citation

Howladar, Nayan, "Protein-Protein Interaction Prediction from Language of Biological Coding" (2022). University of New Orleans Theses and Dissertations. 3013.
https://scholarworks.uno.edu/td/3013

Included in

Amino Acids, Peptides, and Proteins Commons, Artificial Intelligence and Robotics Commons
