Abstract
This paper presents a novel measure of the complexity of a case base. The measure is based on fractal dimension, a generalization of the ordinary notion of dimension. For a classification problem, fractal dimension is used to estimate the ruggedness of the space spanned by instances along the decision boundary. Experiments over collections of varying complexity show that the measure exhibits a strong negative correlation with classification accuracy across several classifiers. We also present empirical findings from experiments over non-textual datasets.
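The abstract does not say which estimator of fractal dimension the paper uses. As an illustration only, the following is a minimal sketch of box counting, one standard estimator: count the grid cells of side eps that contain at least one point, then take the slope of log N(eps) against log(1/eps). The function names `box_count` and `box_counting_dimension` and the choice of scales are assumptions made for this sketch, not the authors' method, and the sketch does not select instances along a decision boundary.

```python
import math

def box_count(points, eps):
    """Count the grid cells of side eps that contain at least one point."""
    return len({tuple(math.floor(x / eps) for x in p) for p in points})

def box_counting_dimension(points, scales):
    """Least-squares slope of log N(eps) against log(1/eps) over the given scales."""
    xs = [math.log(1.0 / eps) for eps in scales]
    ys = [math.log(box_count(points, eps)) for eps in scales]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical toy data: points on a diagonal segment and on a filled square.
line = [(i / 1024, i / 1024) for i in range(1024)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

# The segment should give a slope close to 1, the filled square close to 2.
print(box_counting_dimension(line, [2.0 ** -k for k in range(1, 7)]))
print(box_counting_dimension(square, [2.0 ** -k for k in range(1, 5)]))
```

A smooth curve yields an estimate near 1 and a filled region near 2; a rugged set falls in between, which is the intuition behind using such a quantity as a ruggedness score. On real data the estimate depends on the range of scales chosen, so the scales above should be treated as placeholders.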
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dileep, K.V.S., Chakraborti, S. (2014). Estimating Case Base Complexity Using Fractal Dimension. In: Lamontagne, L., Plaza, E. (eds) Case-Based Reasoning Research and Development. ICCBR 2014. Lecture Notes in Computer Science, vol 8765. Springer, Cham. https://doi.org/10.1007/978-3-319-11209-1_17
DOI: https://doi.org/10.1007/978-3-319-11209-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11208-4
Online ISBN: 978-3-319-11209-1
eBook Packages: Computer Science (R0)