Event Title

An Effective Machine Learning Method to Predict Residues of DNA- and RNA-Binding Protein

Presenter Information

Aasish Rijal

College(s)

College of Sciences

Submission Type

Oral Presentation

Description

DNA- and RNA-binding proteins play diverse roles in biological processes, including the control of transcription and translation, DNA repair, splicing, apoptosis, and the mediation of stress responses. These proteins are important for biological research and for understanding the pathogenesis of many diseases, yet most of them have still to be identified. This study aims to develop a machine learning method that accurately predicts DNA- and RNA-binding residues. To build the model, we studied several properties of protein sequences: amino acid type, physicochemical properties, PSSM values of amino acids, structural properties, torsion angles, and disordered regions. We follow a pipeline for developing an optimized machine learning method that includes feature engineering, feature selection, parameter optimization, experimentation with different machine learning (ML) methods, and ensemble ML methods. To evaluate the proposed method, we used two independent test datasets. The experimental results show that the proposed method outperforms the state-of-the-art methods.
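
As an illustration of the feature engineering step, the following is a minimal sketch of how the amino acid type of a residue and its neighbours might be turned into a per-residue feature vector. It is not the method used in this study: the sliding-window scheme, the function names `one_hot` and `window_features`, and the `half_window` parameter are all assumptions introduced here for illustration.

```python
# Hypothetical sketch: window-based one-hot encoding of amino acid type.
# The window scheme and all names below are illustrative assumptions,
# not details taken from the abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    """Return a 21-dim vector: 20 amino acid slots plus one padding/unknown slot."""
    vec = [0] * (len(AMINO_ACIDS) + 1)
    vec[AA_INDEX.get(residue, len(AMINO_ACIDS))] = 1
    return vec


def window_features(sequence, half_window=2):
    """Build one feature vector per residue from the residue and its neighbours.

    half_window is an assumed hyperparameter. Positions that fall outside
    the sequence are encoded with the padding slot.
    """
    features = []
    for center in range(len(sequence)):
        row = []
        for pos in range(center - half_window, center + half_window + 1):
            residue = sequence[pos] if 0 <= pos < len(sequence) else "-"
            row.extend(one_hot(residue))
        features.append(row)
    return features
```

Each row produced by `window_features` has `(2 * half_window + 1) * 21` entries and could be concatenated with other per-residue properties, such as PSSM values or torsion angles, before feature selection and model training.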

Comments

Honorable Mention, Undergraduate Presentation

This document is currently not available here.
