Abstract
Association rule mining carries the attendant challenge of focusing on appropriate data subsets so as to reduce the volume of association rules produced. The intent is to identify “interesting” rules heuristically and more efficiently, from less data. This challenge resembles that of identifying “high-value” attributes within the more general framework of machine learning, where early identification of key attributes can profoundly influence the learning outcome. In developing heuristics to improve the focus of association rule mining, there is also the question of where in the overall process such heuristics are applied. For example, many focusing methods are applied after a large number of rules has been generated, providing a kind of ranking or filtering. An alternative is to constrain the input data earlier in the data mining process, deploying the heuristics in advance in the hope that the early resource savings yield similar or even better mining results. In this paper we consider possible improvements in achieving focus in web mining, by investigating both the articulation and the deployment of rule constraints to help attain analysis convergence and reduce computational resource requirements.
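The difference between filtering after mining and constraining the input beforehand can be illustrated with a small sketch. It is not the algorithm developed in the paper: the session data, the page names, the `allowed_pages` constraint, and the brute-force `frequent_itemsets` helper are all hypothetical, chosen only to make the two placements of a constraint concrete. The constraint used here ("every page in the pattern belongs to a given set") is one for which both placements return the same patterns; that equivalence should not be assumed for arbitrary constraints.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force frequent itemset enumeration (toy data only).

    Returns (itemsets, candidates_tested), where itemsets maps each
    frequent frozenset to its support count.
    """
    items = sorted({item for t in transactions for item in t})
    itemsets = {}
    candidates_tested = 0
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            candidates_tested += 1
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                itemsets[frozenset(candidate)] = support
                found_any = True
        if not found_any:
            # Support is anti-monotone: no frequent set of this size
            # means no larger set can be frequent either.
            break
    return itemsets, candidates_tested


# Hypothetical web sessions: each is the set of pages visited.
sessions = [
    {"home", "products", "cart"},
    {"home", "products"},
    {"home", "blog"},
    {"products", "cart", "checkout"},
    {"home", "products", "cart", "checkout"},
]
allowed_pages = {"products", "cart", "checkout"}  # assumed constraint
min_support = 2

# Placement 1: mine everything, then filter the patterns afterwards.
all_sets, tested_full = frequent_itemsets(sessions, min_support)
post_filtered = {s: c for s, c in all_sets.items() if s <= allowed_pages}

# Placement 2: project the input onto the allowed pages, then mine.
projected = [t & allowed_pages for t in sessions]
pre_constrained, tested_constrained = frequent_itemsets(projected, min_support)

print(post_filtered == pre_constrained)
print(tested_full, tested_constrained)
```

In this toy setting the early projection examines fewer candidate itemsets while producing the same constrained patterns, which is the kind of resource saving the abstract refers to.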
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
El-Hajj, M., Chen, J., Zaïane, O.R., Goebel, R. (2007). Constraint-Based Mining of Web Page Associations. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science, vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_33
DOI: https://doi.org/10.1007/978-3-540-76928-6_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76926-2
Online ISBN: 978-3-540-76928-6
eBook Packages: Computer Science (R0)