Abstract
This study compared the accuracy of three Singular Value Decomposition (SVD) based models developed for classifying injury narratives. Two SVD-Bayesian models and one SVD-Regression model were developed to classify bodies of free text. Injury narratives and the corresponding E-codes assigned by human experts, taken from the 1997 and 1998 US National Health Interview Survey (NHIS), were used with all three models. All three were evaluated against the E-code categories assigned by the experts. Further experiments showed that the performance of the equidistant Bayes model and the regression model improved as more SVD vectors were used as input. The regression model was also compared with a fuzzy Bayes model. It was concluded that all three models are capable of learning from human experts to accurately categorize cause-of-injury codes from injury narratives, with the regression-based model being the strongest, although all three were outperformed by the multiple-word fuzzy Bayes model.
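The general idea of an SVD-plus-regression text classifier can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the regression form, so ordinary least squares onto one-hot category indicators is assumed here, and the narratives, the labels, and the function names (`build_vocab`, `term_matrix`, `fit`, `predict`) are hypothetical toy material rather than NHIS data. The parameter `k` is the number of SVD vectors retained as input.

```python
import numpy as np

# Hypothetical toy narratives and categories (assumption: not NHIS data).
narratives = [
    "fell from ladder",
    "fell down stairs",
    "slipped and fell",
    "car hit truck",
    "car hit tree",
    "car crashed into wall",
]
labels = ["fall", "fall", "fall", "vehicle", "vehicle", "vehicle"]


def build_vocab(texts):
    """Map each distinct word to a column index."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}


def term_matrix(texts, vocab):
    """Narrative-by-term count matrix; words outside the vocabulary are ignored."""
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for w in text.split():
            if w in vocab:
                X[row, vocab[w]] += 1.0
    return X


def fit(texts, labels, k):
    """Project narratives onto the top-k right singular vectors, then
    regress one-hot category indicators on the projected coordinates
    (least squares without an intercept; an assumed, simplified form)."""
    vocab = build_vocab(texts)
    X = term_matrix(texts, vocab)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                      # shape: (n_terms, k)
    Z = X @ V_k                         # narratives in the k-dim SVD space
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])
    W, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return vocab, V_k, W, classes


def predict(text, model):
    """Return the category with the highest regression score."""
    vocab, V_k, W, classes = model
    scores = term_matrix([text], vocab) @ V_k @ W
    return classes[int(np.argmax(scores))]


model = fit(narratives, labels, k=2)
print(predict("fell off ladder", model))
print(predict("car hit wall", model))
```

Raising `k` passes more SVD vectors to the regression step, which is the setting the abstract reports as improving performance; on a toy corpus this small the effect cannot be demonstrated meaningfully.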
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Noorinaeini, A., Lehto, M.R., Wu, Sj. (2007). Hybrid Singular Value Decomposition: A Model of Human Text Classification. In: Smith, M.J., Salvendy, G. (eds) Human Interface and the Management of Information. Methods, Techniques and Tools in Information Design. Human Interface 2007. Lecture Notes in Computer Science, vol 4557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73345-4_59
DOI: https://doi.org/10.1007/978-3-540-73345-4_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73344-7
Online ISBN: 978-3-540-73345-4
eBook Packages: Computer Science (R0)