Dept. Bioinformatics and Epidemiology
Oregon Health & Science University
Mental health notes contain implicit and explicit concepts that are difficult to extract from the narrative note. This “free text” contains vital information needed to create comprehensive terminology for enhancing Natural Language Processing (NLP) tasks, and using it with advanced NLP tools remains an ongoing challenge. This project will build a gold standard corpus of narrative clinical text, manually annotated for key clinical data describing several concepts documented within the Veterans Affairs (VA) for Post-traumatic Stress Disorder (PTSD), most notably symptomatology and treatment modalities. We describe and discuss the annotation technique with regard to process and content, which includes defining the schema, creating guidelines (determining the level of concepts to be annotated), and using a double annotation method. More than 900 clinical documents from the VA electronic medical record (EMR) for patients known to have a diagnosis of PTSD are used. Protégé 3.4.8-Knowtator was used for annotation, and all clinical documents are analyzed and remain in the VA’s secure Informatics and Computing Infrastructure (VINCI). Annotator agreement is reported, and each document is adjudicated by an expert clinician. We report general results on the final dataset created. We also report inter-annotator agreement (IAA) and document batch differences, and provide descriptive statistics that query the data by type, frequency, and relationship to illustrate the main features of the resulting data. The annotated corpus of clinical notes will be used to facilitate NLP tasks in named entity recognition, automated categorization, and temporal analysis. The validated annotations and unique terms will also serve as a supporting resource of terminology and relationships for incorporation into an ontology.
School of Medicine
Zirkle, Maryan, "Developing a manually annotated corpus of VA electronic medical record notes for post-traumatic stress disorder natural language processing tasks" (2013). Scholar Archive. 987.