Crowdsourcing techniques to create a fuzzy subset of SNOMED CT for semantic tagging of medical documents

  • Focus
  • Soft Computing

Abstract

Ontologies and other schemes are useful for the semantic tagging of documents in many semantic web applications. Representing uncertainty on the semantic web, using ontologies and other techniques, is becoming increasingly common. Ontologies and declarative tools allow computer systems to reason about documents that use the concepts these ontologies contain. Very large ontologies and vocabularies have been created; however, users may find it difficult to select the correct concept or term when large numbers of items appear, at face value, to represent the same idea. Creating subsets of ontologies is a popular approach to this problem, but it may not fit well with the need to deal with complex domains. Crowdsourcing techniques, which harness the power of large groups, may be more effective than document analysis or expert opinion. In crowdsourcing, large numbers of people collaborate by performing relatively simple tasks, usually through applications distributed via the World Wide Web. This approach is being tested in the medical domain using a very large clinical vocabulary, SNOMED CT.
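
The abstract does not say how crowd input is turned into a fuzzy subset, so the following is only a minimal sketch under stated assumptions. It supposes that each crowd member picks one concept for a document, that a concept's membership degree is its selection count divided by the highest count, and that a crisp working subset is obtained with an alpha-cut. The function names (`fuzzy_membership`, `alpha_cut`), the placeholder concept labels, and the normalisation rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def fuzzy_membership(votes):
    """Map each concept to a membership degree in [0, 1].

    votes: iterable of concept labels, one per crowd selection.
    Assumed rule: degree = count / highest count, so the most
    frequently chosen concept has degree 1.0.
    """
    counts = Counter(votes)
    if not counts:
        return {}
    top = max(counts.values())
    return {concept: n / top for concept, n in counts.items()}


def alpha_cut(membership, alpha):
    """Return the crisp set of concepts whose degree is at least alpha."""
    return {concept for concept, mu in membership.items() if mu >= alpha}


# Placeholder labels, not real SNOMED CT identifiers.
votes = [
    "concept-A", "concept-A", "concept-B",
    "concept-A", "concept-C", "concept-B",
]
membership = fuzzy_membership(votes)
subset = alpha_cut(membership, 0.5)
```

With these six selections, `concept-A` is chosen three times, `concept-B` twice and `concept-C` once, so an alpha-cut at 0.5 keeps the first two. Normalising by the highest count is one choice among several; dividing by the total number of selections would be an equally plausible assumption.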

Abbreviations

SNOMED: Systematized Nomenclature of Medicine

CT: Clinical Terms

Acknowledgments

The authors thank the women’s health ultrasound department at Auckland District Health Board, especially Kathy Dryden, Chief Sonographer. SNOMED CT use in New Zealand relies on the work of NZHIS, in particular Ted Cizadlo and his team.

Author information

Corresponding author

Correspondence to David T. Parry.

About this article

Cite this article

Parry, D.T., Tsai, TC. Crowdsourcing techniques to create a fuzzy subset of SNOMED CT for semantic tagging of medical documents. Soft Comput 16, 1119–1127 (2012). https://doi.org/10.1007/s00500-011-0787-z

  • DOI: https://doi.org/10.1007/s00500-011-0787-z
