Abstract
Most classification methods assume that enough training data are available to induce the prediction knowledge, and few studies address label prediction from a small body of knowledge. We therefore present a classification algorithm in which the prediction knowledge is induced from only a few training instances at the initial stage and is then expanded incrementally with subsequently verified instances. We show how to integrate the kNN and LARM methods into a multi-strategy classification algorithm. In experiments on the edoc collection, the proposed method improves accuracy by 4% on the low-confidence results of kNN prediction and by 8% on the results affected by the dominant-class bias of LARM prediction. The experiments also show that the proposed method achieves higher classification accuracy with acceptable efficiency.
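The abstract does not give the details of the combination, so the following is only a minimal sketch of one way such a multi-strategy classifier could be arranged, not the authors' method. It assumes that LARM denotes a lazy association-rule-mining classifier, that instances are sets of terms, that kNN uses Jaccard similarity, and that the rule-based strategy is consulted only when the kNN vote falls below a confidence threshold. The class name `MultiStrategyClassifier`, the parameters `k` and `min_knn_conf`, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


class MultiStrategyClassifier:
    """Hypothetical sketch: kNN first, lazily mined rules as a fallback."""

    def __init__(self, k=3, min_knn_conf=0.7):
        self.k = k                        # number of neighbours (assumed value)
        self.min_knn_conf = min_knn_conf  # assumed low-confidence threshold
        self.instances = []               # list of (frozenset of terms, label)

    def add_verified(self, terms, label):
        # Incremental expansion: a verified instance joins the knowledge base.
        self.instances.append((frozenset(terms), label))

    def _knn(self, terms):
        # Rank stored instances by Jaccard similarity and take a majority vote.
        def sim(a, b):
            union = a | b
            return len(a & b) / len(union) if union else 0.0

        ranked = sorted(self.instances,
                        key=lambda inst: sim(terms, inst[0]), reverse=True)
        votes = Counter(label for _, label in ranked[:self.k])
        label, count = votes.most_common(1)[0]
        return label, count / sum(votes.values())

    def _lazy_rules(self, terms):
        # Mine rules on demand, using only itemsets (size 1 or 2) drawn from
        # the test instance, and keep the rule with the highest confidence.
        best_label, best_conf = None, 0.0
        for size in (1, 2):
            for itemset in combinations(sorted(terms), size):
                items = set(itemset)
                covered = [lab for t, lab in self.instances if items <= t]
                if not covered:
                    continue
                label, count = Counter(covered).most_common(1)[0]
                conf = count / len(covered)
                if conf > best_conf:
                    best_label, best_conf = label, conf
        return best_label, best_conf

    def predict(self, terms):
        if not self.instances:
            raise ValueError("no training instances yet")
        terms = frozenset(terms)
        label, conf = self._knn(terms)
        if conf >= self.min_knn_conf:
            return label, "knn"
        rule_label, rule_conf = self._lazy_rules(terms)
        if rule_label is not None and rule_conf > conf:
            return rule_label, "larm"
        return label, "knn"


# Toy data (invented for illustration): a small initial knowledge base.
clf = MultiStrategyClassifier()
clf.add_verified({"invoice", "payment"}, "finance")
clf.add_verified({"invoice", "tax"}, "finance")
clf.add_verified({"meeting", "agenda"}, "admin")
clf.add_verified({"meeting", "payment"}, "admin")
clf.add_verified({"meeting", "minutes"}, "admin")

confident = clf.predict({"meeting", "agenda"})
uncertain = clf.predict({"invoice", "payment", "tax"})

# Once a prediction has been verified, it expands the knowledge base.
clf.add_verified({"invoice", "payment", "tax"}, uncertain[0])
```

In this toy run the second query gets a split kNN vote, so the sketch falls back to the lazily mined rules. A real system would add support thresholds and a policy for deciding which predictions count as verified.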
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, J., Lee, S. (2009). A Multi-Strategy Approach to KNN and LARM on Small and Incrementally Induced Prediction Knowledge. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science, vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_44
DOI: https://doi.org/10.1007/978-3-642-03348-3_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science, Computer Science (R0)