Abstract
In many applications (e.g., information retrieval, filtering, or analysis), it is useful to detect similar terms and to make it possible to use them jointly. Term clustering is one method that can be used for this purpose. In this study, we test three term-clustering methods (hierarchical ascendant classification, Radius, and maximum), combine them with semantic distance algorithms, and compare the results they produce on terms from the pharmacovigilance domain. The comparison indicates that the non-disjoint clustering methods (Radius and maximum) outperform disjoint clustering by 10 to 20 points in all experiments.
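The difference between disjoint and non-disjoint clustering can be illustrated with a minimal sketch. The abstract does not define the Radius method or the linkage used for the hierarchical classification, so the code below rests on assumptions: `radius_clusters` builds one cluster per term containing every term within a fixed distance of it, and `single_link_clusters` performs bottom-up merging with single linkage and a distance cut-off. The terms and distance values are hypothetical toy data, not taken from the study.

```python
from itertools import combinations

TERMS = ["nausea", "vomiting", "headache", "rash", "urticaria"]

# Hypothetical semantic distances (smaller = more similar).
# Pairs not listed here are treated as far apart.
PAIR_DIST = {
    frozenset(("nausea", "vomiting")): 1,
    frozenset(("rash", "urticaria")): 1,
    frozenset(("nausea", "headache")): 2,
}
FAR = 4


def dist(a, b):
    """Distance between two terms; 0 for identical terms."""
    if a == b:
        return 0
    return PAIR_DIST.get(frozenset((a, b)), FAR)


def radius_clusters(terms, radius):
    """Non-disjoint clustering (assumed definition): each term acts as a
    centre and collects every term within `radius` of it."""
    return {
        frozenset(t for t in terms if dist(centre, t) <= radius)
        for centre in terms
    }


def link(c1, c2):
    """Single-linkage distance: the closest pair across two clusters."""
    return min(dist(x, y) for x in c1 for y in c2)


def single_link_clusters(terms, threshold):
    """Disjoint clustering: merge the two closest clusters repeatedly,
    stopping once the closest pair is farther apart than `threshold`."""
    clusters = [frozenset([t]) for t in terms]
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: link(*p))
        if link(a, b) > threshold:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


overlapping = radius_clusters(TERMS, radius=2)
disjoint = single_link_clusters(TERMS, threshold=2)

for name, result in (("radius", overlapping), ("hierarchical", disjoint)):
    print(name, sorted(sorted(c) for c in result))
```

In this toy setting, "nausea" appears in several radius-based clusters, while the hierarchical result assigns each term to exactly one cluster.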
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dupuch, M., Engström, C., Silvestrov, S., Hamon, T., Grabar, N. (2013). Comparison of Clustering Approaches through Their Application to Pharmacovigilance Terms. In: Peek, N., Marín Morales, R., Peleg, M. (eds) Artificial Intelligence in Medicine. AIME 2013. Lecture Notes in Computer Science, vol 7885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38326-7_9
DOI: https://doi.org/10.1007/978-3-642-38326-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38325-0
Online ISBN: 978-3-642-38326-7
eBook Packages: Computer Science, Computer Science (R0)