Date of Award
A central problem in the bioinformatics is to find the binding sites for regulatory motifs. This is a challenging problem that leads us to a platform to apply a variety of data mining methods. In the efforts described here, a combined motif discovery method that uses mutual information and Gibbs sampling was developed. A new scoring schema was introduced with mutual information and joint information content involved. Simulated tempering was embedded into classic Gibbs sampling to avoid local optima. This method was applied to the 18 pieces DNA sequences containing CRP binding sites validated by Stormo and the results were compared with Bioprospector. Based on the results, the new scoring schema can get over the defect that the basic model PWM only contains single positioin information. Simulated tempering proved to be an adaptive adjustment of the search strategy and showed a much increased resistance to local optima.
Lu, Daming, "A Combined Motif Discovery Method" (2009). University of New Orleans Theses and Dissertations. 990.