A method for encoding clinical datasets with SNOMED CT




Lee, Dennis H.
Lau, Francis Y.
Quan, Hue

BioMed Central


Background: Over the past decade, a growing body of literature has described how the Systematised Nomenclature of Medicine Clinical Terms (SNOMED CT) can be implemented and used in different clinical settings. Yet those charged with incorporating SNOMED CT into their organisation’s clinical applications and vocabulary systems have few detailed encoding instructions and examples showing how this can be done and what issues are involved. This paper describes a heuristic method for encoding clinical terms in SNOMED CT and illustrates how it was applied to encode an existing palliative care dataset.

Methods: The encoding process involves four steps: identifying input data items; cleaning the data items; encoding the cleaned data items; and exporting the encoded terms as output term sets. Four outputs are produced: the SNOMED CT reference set, the interface terminology set, the SNOMED CT extension set and the unencodeable term set.

Results: The original palliative care database contained 211 data elements, 145 coded values and 37,248 free text values. We were able to encode ~84% of the terms; another ~8% require further encoding and verification, and terms with a frequency of fewer than five (~7%) were not encoded.

Conclusions: The pilot suggests that our SNOMED CT encoding method has the potential to become a general purpose terminology encoding approach that can be used in different clinical systems.
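
The four-step process outlined in the Methods (identify, clean, encode, export) can be pictured with a minimal sketch. This is not the authors' implementation: the lookup table, the placeholder identifiers (which are not real SNOMED CT concept IDs), the list of extension candidates and the rule for routing a term to each output set are all assumptions made for illustration. Only the four output sets and the frequency cut-off of five come from the abstract.

```python
from collections import Counter

# Hypothetical lookup table standing in for a terminology search.
# The identifiers are placeholders, NOT real SNOMED CT concept IDs.
TOY_TERMINOLOGY = {
    "pain": "X-0001",
    "nausea": "X-0002",
}

# Hypothetical local terms assumed to need a SNOMED CT extension.
TOY_EXTENSION_CANDIDATES = {"comfort care only"}

# The abstract reports that terms seen fewer than five times were not encoded.
MIN_FREQUENCY = 5


def clean(value):
    """Step 2: normalise case and collapse whitespace (assumed cleaning rules)."""
    return " ".join(value.lower().split())


def encode_dataset(raw_values,
                   terminology=TOY_TERMINOLOGY,
                   extension_candidates=TOY_EXTENSION_CANDIDATES,
                   min_frequency=MIN_FREQUENCY):
    """Run identify -> clean -> encode -> export over a list of raw values."""
    # Steps 1 and 2: keep non-empty items, clean them, count occurrences.
    counts = Counter(clean(v) for v in raw_values if v and v.strip())

    outputs = {
        "reference_set": set(),        # concept identifiers in use
        "interface_terms": {},         # local term -> concept identifier
        "extension_set": set(),        # terms proposed as extension content
        "unencodeable": set(),         # terms with no match
        "skipped_low_frequency": set() # terms below the frequency cut-off
    }

    # Steps 3 and 4: encode each cleaned term and route it to an output set.
    for term, frequency in counts.items():
        if frequency < min_frequency:
            outputs["skipped_low_frequency"].add(term)
        elif term in terminology:
            code = terminology[term]
            outputs["reference_set"].add(code)
            outputs["interface_terms"][term] = code
        elif term in extension_candidates:
            outputs["extension_set"].add(term)
        else:
            outputs["unencodeable"].add(term)
    return outputs
```

In practice the dictionary lookup would be replaced by a search against a SNOMED CT release or terminology service, followed by manual review, which is where most of the encoding effort is likely to lie.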





Lee et al.: A method for encoding clinical datasets with SNOMED CT. BMC Medical Informatics and Decision Making 2010 10:53.