Abstract
This paper proposes a novel mining approach consisting of two new data mining algorithms for classification over quantitative data, based on two new kinds of pattern: MOUCLAS (MOUntain function based CLASsification) Patterns (MPs) and Jumping MOUCLAS Patterns (JMPs). The motivation of the study is to develop two classifiers for quantitative attributes by combining the concepts of association rules and clustering. The paper illustrates the approach on petroleum well logging data for oil/gas formation identification. MPs and JMPs are well suited to deriving the implicit relationship between measured values (well logging data) and the property to be predicted (oil/gas formation or not). As a hybrid of classification, clustering, and association rule mining, our approach has several advantages: (1) it has a solid mathematical foundation and a compact mathematical description of the classifiers; (2) it does not require discretization; and (3) it is robust when handling noisy or incomplete data in a high-dimensional data space.
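The abstract does not spell out the mountain function itself, so the following is only a minimal sketch of the general idea behind mountain-function density estimates, not the chapter's algorithm. It assumes the commonly used form in which the "height" at a point is the sum of exp(-alpha * distance) over all records; the names `mountain_value` and `densest_record`, the parameter `alpha`, and the sample readings are hypothetical and chosen for illustration.

```python
import math

def mountain_value(point, data, alpha=1.0):
    """Assumed mountain function: sum of exp(-alpha * Euclidean distance)
    from `point` to every record in `data`. Works directly on continuous
    values, so no discretization step is involved."""
    return sum(math.exp(-alpha * math.dist(point, x)) for x in data)

def densest_record(data, alpha=1.0):
    """Return the record with the highest mountain value, i.e. a
    candidate cluster center in the densest region of the data."""
    return max(data, key=lambda p: mountain_value(p, data, alpha))

# Hypothetical two-attribute readings: three close together, one outlier.
readings = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (5.0, 5.0)]
center = densest_record(readings)
print(center)
```

In this sketch an isolated record contributes little to the height of the dense region, which gives some intuition for why a density-based center is less sensitive to a noisy reading than a plain average would be.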
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hao, Y., Quirchmayr, G., Stumptner, M. (2006). Mining MOUCLAS Patterns and Jumping MOUCLAS Patterns to Construct Classifiers. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_10
DOI: https://doi.org/10.1007/11677437_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science, Computer Science (R0)