Clinical notes are often divided into sections, or segments, such as "history of present illness" or "past medical history." These sections often have subsections as well, such as the "cardiovascular exam" section of the "physical exam." Knowing the section in which a concept appears adds to the understanding of a clinical note. For instance, both the "past medical history" and the "family medical history" sections can contain a list of diseases, but the context gives those diseases very different import for the patient about whom the note was written. Section tagging is therefore an important early step in natural language processing (NLP) applications for clinical notes.
To improve recognition of section headers, we have developed SecTag. SecTag recognizes note section headers using a combination of NLP, spelling correction, and naive Bayesian scoring techniques. The algorithm can auto-train through multiple iterations on a single corpus.
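As a rough illustration of the normalization and spelling-correction steps, the following Python sketch matches candidate header lines against a small lookup table. It is not SecTag's implementation (SecTag is a Perl module): the `HEADER_TERMS` entries, the function names, and the similarity cutoff are hypothetical, the standard-library `difflib` matcher stands in for real spelling correction, and the naive Bayesian scoring step is omitted entirely.

```python
import difflib
import re

# Hypothetical mini-terminology: normalized header string -> section concept.
# These entries are illustrative only, not taken from the SecTag terminology.
HEADER_TERMS = {
    "history of present illness": "history_present_illness",
    "past medical history": "past_medical_history",
    "family medical history": "family_medical_history",
    "physical exam": "physical_exam",
}


def normalize(text):
    """Lowercase, replace non-letters with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(text.split())


def tag_header(line, cutoff=0.8):
    """Return the section concept for a candidate header line, or None."""
    # Only the text before the first colon is treated as the header string.
    candidate = normalize(line.split(":", 1)[0])
    if candidate in HEADER_TERMS:
        return HEADER_TERMS[candidate]
    # Crude stand-in for spelling correction: closest known header string.
    close = difflib.get_close_matches(
        candidate, list(HEADER_TERMS), n=1, cutoff=cutoff
    )
    return HEADER_TERMS[close[0]] if close else None


def tag_note(lines):
    """Pair each line with the most recently recognized section concept."""
    current = None
    tagged = []
    for line in lines:
        # Assumption: header candidates are lines that contain a colon.
        if ":" in line:
            concept = tag_header(line)
            if concept is not None:
                current = concept
        tagged.append((current, line))
    return tagged
```

With this sketch, a misspelled header such as "Past Medcal History:" still resolves to the past medical history concept, so the same disease name listed under it and under "Family Medical History:" ends up with different section labels.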
The following resources are available:
|SecTag||Application for recognizing clinical note section headers. It is a Perl-based module that applies normalization, spelling correction, and naive Bayesian scoring to label and predict sections. It outputs HL7 Clinical Document Architecture (CDA) XML documents.|
|SecTag section header terminology||This terminology is freely available in SQL or CSV format below.|
Because some codes are borrowed from Logical Observation Identifiers Names and Codes (LOINC®), users must hold either a valid LOINC license or a valid UMLS license:
- LOINC license - http://loinc.org/downloads
- UMLS license - http://www.nlm.nih.gov/databases/umls.html
- Denny JC, Spickard A 3rd, Johnson KB, Peterson NB, Peterson JF, Miller RA. Evaluation of a method to identify and categorize section headers in clinical documents. J Am Med Inform Assoc. 2009 Nov-Dec;16(6):806-15.
- Denny JC, Miller RA, Johnson KB, Spickard A 3rd. Development and evaluation of a clinical note section header terminology. AMIA Annu Symp Proc. 2008 Nov 6:156-60.
|Description of SecTag Terminology.rtf||11.12 KB|
|SecTag_Terminology.sql.txt (Save without the .txt extension)||424 KB|
|SecTag concepts.csv||835.62 KB|