Abstract
The paper demonstrates that medical text content can be compared directly by representing unstructured document information as a term-frequency matrix. Dimensionality is reduced with the Latent Semantic Indexing method. Two common measures are used: cosine distance and the Jaccard metric. The cosine measure shows lower sensitivity in finding similar documents. The analysis was performed with SAS Text Analytics components on a set of 400 descriptions of abdominal radiological diagnostic images.
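The pipeline named in the abstract (term-frequency matrix, Latent Semantic Indexing, cosine and Jaccard comparison) can be illustrated with a minimal sketch. The study itself used SAS Text Analytics; the Python/NumPy code below is only an illustrative stand-in, not the authors' implementation. The three toy report snippets, the function names, the whitespace tokenisation, and the choice of k = 2 are all assumptions made for this example. The abstract does not state which matrix each measure was applied to; here the cosine measure is computed on the reduced vectors and the Jaccard index on binary term presence.

```python
import numpy as np

def term_frequency_matrix(docs):
    """Build a documents-by-terms count matrix from whitespace-tokenised text."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    index = {term: j for j, term in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenised):
        for tok in toks:
            tf[i, index[tok]] += 1
    return tf, vocab

def lsi_project(tf, k):
    """Document coordinates in the space of the k leading singular directions."""
    u, s, _ = np.linalg.svd(tf, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def jaccard_similarity(a, b):
    """Jaccard index on the sets of terms present in each document."""
    a_set, b_set = a > 0, b > 0
    union = np.logical_or(a_set, b_set).sum()
    return float(np.logical_and(a_set, b_set).sum() / union) if union else 0.0

# Hypothetical toy descriptions, invented for illustration only.
docs = [
    "liver enlarged with focal lesion",
    "liver enlarged no focal lesion",
    "kidneys normal size no stones",
]
tf, vocab = term_frequency_matrix(docs)
reduced = lsi_project(tf, k=2)

cos_01 = cosine_similarity(reduced[0], reduced[1])
cos_02 = cosine_similarity(reduced[0], reduced[2])
jac_01 = jaccard_similarity(tf[0], tf[1])
jac_02 = jaccard_similarity(tf[0], tf[2])
```

In this toy setting both measures rank the second description as closer to the first than the third is. Comparing how the two measures behave on real reports would require the kind of evaluation the paper carries out on its 400 cases.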
Acknowledgments
The study was supported by the National Science Center, Poland, Grant No. UMO-2012/05/B/ST7/02136.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Spinczyk, D., Dzieciątko, M. (2016). Similarity Search for the Content of Medical Records. In: Piętka, E., Badura, P., Kawa, J., Wieclawek, W. (eds) Information Technologies in Medicine. ITiB 2016. Advances in Intelligent Systems and Computing, vol 471. Springer, Cham. https://doi.org/10.1007/978-3-319-39796-2_40
DOI: https://doi.org/10.1007/978-3-319-39796-2_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39795-5
Online ISBN: 978-3-319-39796-2
eBook Packages: Engineering, Engineering (R0)