Abstract
This paper describes an application of information retrieval techniques to the automated classification of industry and occupation codes for Korean Census records. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes defined in the standard code book from the Census Bureau. The system was evaluated on 46,762 industry records and 36,286 occupation records using 10-fold cross-validation. In fully automated mode, it achieved production rates of 87.08% and 66.08% when classifying industry records into level 2 and level 5 codes, respectively. In semi-automated mode, it achieved production rates of 99.10% and 92.88% for level 2 and level 5 codes, respectively.
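The abstract does not say which retrieval model the system uses, so the following is only an illustrative sketch of the general idea: previously coded responses are treated as a document collection, a new response is treated as a query, and the codes of the most similar coded responses are ranked. The tf-idf weighting, cosine similarity, whitespace tokenizer, neighbor count `k`, class name `CodeIndex`, and the codes "A" and "B" are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization. Korean responses would
    # normally need a morphological analyzer or index-term extractor.
    return text.lower().split()


class CodeIndex:
    """Hypothetical index of already-coded survey responses."""

    def __init__(self, records):
        # records: list of (response_text, code) pairs
        term_counts = []
        doc_freq = Counter()
        for text, code in records:
            tf = Counter(tokenize(text))
            term_counts.append((tf, code))
            doc_freq.update(tf.keys())
        n = len(records)
        # Smoothed inverse document frequency for each known term
        self.idf = {t: math.log((1 + n) / (1 + d)) + 1.0
                    for t, d in doc_freq.items()}
        self.vectors = [(self._weight(tf), code) for tf, code in term_counts]

    def _weight(self, tf):
        # tf-idf weights, normalized to unit length; unknown terms get 0
        vec = {t: c * self.idf.get(t, 0.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def rank_codes(self, text, k=5):
        # Cosine similarity between the query and every coded response
        query = self._weight(Counter(tokenize(text)))
        sims = []
        for vec, code in self.vectors:
            s = sum(w * vec.get(t, 0.0) for t, w in query.items())
            if s > 0:
                sims.append((s, code))
        sims.sort(key=lambda pair: pair[0], reverse=True)
        # Sum the similarities of the k nearest responses per code
        scores = defaultdict(float)
        for s, code in sims[:k]:
            scores[code] += s
        return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Made-up records for illustration only
records = [
    ("rice farming", "A"),
    ("rice cultivation farm", "A"),
    ("software development", "B"),
    ("computer software programming", "B"),
]
index = CodeIndex(records)
ranked = index.rank_codes("rice farm")
best_code = ranked[0][0] if ranked else None
```

Under these assumptions, a fully automated mode would assign `best_code` directly, while a semi-automated mode could show the first few entries of `ranked` to a human coder for confirmation. Whether the system in the paper works this way is not stated in the abstract.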
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lim, H.S., Lee, S.H. (2005). An Application of Information Retrieval Technique to Automated Code Classification. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2005. Lecture Notes in Computer Science, vol 3681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11552413_14
DOI: https://doi.org/10.1007/11552413_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28894-7
Online ISBN: 978-3-540-31983-2
eBook Packages: Computer Science, Computer Science (R0)