Abstract
This paper presents an Advanced P-Tree based K-Nearest Neighbor (AP-KNN) algorithm for text categorization, designed to capture useful information from customers’ open-ended answers collected through an intelligent survey system. The “intelligence” of the survey system lies in its text categorization module, which classifies customer feedback on specific characteristics of the products concerned. A software prototype was developed based on the AP-KNN algorithm. The prototype supports online questionnaire design, online collection of customer feedback, digitisation of linguistic feedback, and customer preference reasoning and motivation analysis. The system could significantly shorten survey and analysis time and is therefore expected to reduce the design cycle time for new product development. To test AP-KNN, a case study was carried out in a survey for the development of portable audio products. The study shows that AP-KNN performed considerably better than the original P-Tree based KNN in both speed and accuracy.
Cite this article
Li, X., Shi, D., Charastrakul, V. et al. Advanced P-Tree based K-Nearest neighbors for customer preference reasoning analysis. J Intell Manuf 20, 569–579 (2009). https://doi.org/10.1007/s10845-008-0146-9
DOI: https://doi.org/10.1007/s10845-008-0146-9