Abstract
E-catalogs are semi-structured documents that consist of multiple attributes and values. Although conventional text classification techniques are applicable to e-catalog classification, they cannot use attribute information effectively to improve classification accuracy. In this paper, we propose an e-catalog classification algorithm that extends the Naïve Bayesian Classifier to use attribute information. Specifically, we focus on exploiting two e-catalog-specific characteristics: the attribute-wise distribution of keywords and category-dependent attributes. Experiments on real data validate the proposed method.
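The attribute-wise keyword distribution mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given on this page; it is a minimal illustration that assumes each (category, attribute) pair gets its own Laplace-smoothed word distribution, so the same keyword can carry different evidence depending on the attribute it appears in. The class name `AttributeWiseNB`, the attribute names `name` and `maker`, and the toy data are all hypothetical, and category-dependent attributes are not modeled here.

```python
import math
from collections import Counter, defaultdict


class AttributeWiseNB:
    """Naive Bayes sketch with one word distribution per (category, attribute).

    Hypothetical illustration only; not the method from the paper.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing constant
        self.cat_counts = Counter()               # catalogs seen per category
        self.word_counts = defaultdict(Counter)   # (category, attribute) -> word counts
        self.vocab = defaultdict(set)             # attribute -> words seen in it

    def fit(self, catalogs, labels):
        for catalog, label in zip(catalogs, labels):
            self.cat_counts[label] += 1
            for attr, text in catalog.items():
                tokens = text.lower().split()
                self.word_counts[(label, attr)].update(tokens)
                self.vocab[attr].update(tokens)
        return self

    def log_score(self, catalog, category):
        # log P(category) plus, for every attribute, the log-likelihood of
        # its tokens under that attribute's own distribution.
        total = sum(self.cat_counts.values())
        score = math.log(self.cat_counts[category] / total)
        for attr, text in catalog.items():
            counts = self.word_counts.get((category, attr), Counter())
            # The +1 reserves probability mass for words unseen in training.
            denom = sum(counts.values()) + self.alpha * (len(self.vocab[attr]) + 1)
            for token in text.lower().split():
                score += math.log((counts[token] + self.alpha) / denom)
        return score

    def predict(self, catalog):
        return max(self.cat_counts, key=lambda c: self.log_score(catalog, c))


# Toy data (made up): "apple" is a maker in one category and a product word in another.
train = [
    {"name": "laser printer", "maker": "apple"},
    {"name": "apple juice", "maker": "fresh farm"},
]
labels = ["printer", "beverage"]
model = AttributeWiseNB().fit(train, labels)

print(model.predict({"name": "apple", "maker": "unknown"}))
print(model.predict({"name": "device", "maker": "apple"}))
```

In this toy setup the word "apple" supports different categories depending on whether it occurs in the `name` or the `maker` attribute, which a flat bag-of-words classifier could not distinguish.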
This work was supported by the Postal Technology R&D program of MKE/IITA [2006-X-001-02, Development of Real-time Postal Logistics System].
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, Yg., Lee, T., Lee, Sg., Park, JH. (2008). Exploiting Attribute-Wise Distribution of Keywords and Category Dependent Attributes for E-Catalog Classification. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2008. Lecture Notes in Computer Science, vol 5226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87442-3_121
DOI: https://doi.org/10.1007/978-3-540-87442-3_121
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87440-9
Online ISBN: 978-3-540-87442-3
eBook Packages: Computer Science, Computer Science (R0)