Date

September 2010

Document Type

Dissertation

Degree Name

Ph.D.

Department

Dept. of Medical Informatics and Clinical Epidemiology

Institution

Oregon Health & Science University

Abstract

Identifying biologically relevant, high-occupancy transcription factor binding sites (TFBS) in silico has historically been a difficult problem with a high error rate. Methods that use information beyond the sequence of binding sites (e.g., chromatin information) have been shown to outperform strictly sequence-based methods; however, several questions about such methods remain unanswered: whether such models are suitable for multiple transcription factors, whether a general model or generalizable approach to the problem is possible, and how such prediction affects biological inference. In this work, I construct and evaluate a number of classifiers of position weight matrix-predicted TFBS (“occupancy classifiers”) based on four distinct transcription factors and demonstrate that such classifiers identify biochemically confirmed high-occupancy sites at a high rate. I compare and contrast the algorithms and predictors used by these classifi

Identifier

doi:10.6083/M4NC5Z6P

School

School of Medicine
