Abstract
Case-Based Reasoning (CBR) is a lazy learning method; as such, when a new query is made to a CBR system, the speed of its retrieval phase is critical to overall system performance. The ubiquity of data today is an opportunity for CBR systems, as it implies more cases to reason with. However, it also poses a challenge for CBR retrieval, since distance calculations become computationally expensive. A good example of a domain where the case base grows substantially over time is patient health records, where a query is typically an incremental update to prior cases. To address the retrieval performance challenge in such domains, where cases are sequentially related, we introduce a novel method that significantly reduces the number of cases assessed in the search for exact nearest neighbors (NNs). In particular, distance measures that are metrics satisfy the triangle inequality, and our method leverages this property as a cutoff in NN search. Specifically, retrieval is conducted in a lazy manner in which only the cases that are true NN candidates for a query are evaluated. We demonstrate how a considerable number of unnecessary distance calculations are avoided in synthetically built domains that exhibit different problem feature characteristics and different cluster diversity.
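The triangle-inequality cutoff described above can be illustrated with a minimal sketch. It is not the authors' algorithm. It only assumes that the distances from a previous query to every case are already known and that the new query is a small update of the previous one. The function name `lazy_nn`, its parameters, and the choice of Euclidean distance are all hypothetical and introduced here for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance, a metric that satisfies the triangle inequality."""
    return math.dist(a, b)


def lazy_nn(query, prev_query, prev_dists, cases, dist=euclidean):
    """Exact 1-NN of `query`, reusing distances known for `prev_query`.

    Assumption: prev_dists[i] == dist(prev_query, cases[i]) for every i.
    Returns (index of the NN, its distance, number of distance calls made).
    """
    delta = dist(prev_query, query)  # how far the query moved
    calls = 1
    # Triangle inequality: dist(query, c) >= |dist(prev_query, c) - delta|.
    # Visit cases in ascending order of this lower bound.
    bounds = sorted((abs(d - delta), i) for i, d in enumerate(prev_dists))
    best_i, best_d = -1, math.inf
    for lower, i in bounds:
        if lower >= best_d:
            break  # no remaining case can be strictly closer than the best
        d = dist(query, cases[i])
        calls += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, calls


# Hypothetical usage: the new query is a small update of the previous one.
cases = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
prev_query, query = (1.0, 1.0), (1.5, 1.0)
prev_dists = [euclidean(prev_query, c) for c in cases]
nn_index, nn_dist, n_calls = lazy_nn(query, prev_query, prev_dists, cases)
```

Because the lower bound never overestimates the true distance, stopping at the first case whose bound reaches the best distance found so far still yields an exact nearest neighbor. The saving depends on how small the update is relative to the spread of the case base.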
Notes
1. Following Fig. 1, you can think of \(P\) and \(P^{\prime}\) as the problem parts of the cases \(C_{4}^{0}\) and \(C_{4}^{1}\) respectively, and \(P^{\prime\prime}\) as \(P_{4}^{2}\), and \(C\) as any case \(C_{x}^{y}\) of any Sequence \(x\) where \(x \ne 4\).
2. As we will discuss in a following section, this calculus can be improved by introducing some proposals already existing in the CBR literature.
3. Dataset generation code can be found at http://www.iiia.csic.es/~oguz/lazy/.
4. A document with supplementary material with figures summarizing all datasets can be found at http://www.iiia.csic.es/~oguz/lazy/.
Acknowledgements
This work has been partially funded by project Innobrain, COMRDI-151-0017 (RIS3CAT comunitats), and Feder funds. Mehmet Oğuz Mülâyim is a PhD Student of the doctoral program in Computer Science at the Universitat Autònoma de Barcelona.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Mülâyim, M.O., Arcos, J.L. (2018). Perks of Being Lazy: Boosting Retrieval Performance. In: Cox, M., Funk, P., Begum, S. (eds) Case-Based Reasoning Research and Development. ICCBR 2018. Lecture Notes in Computer Science, vol 11156. Springer, Cham. https://doi.org/10.1007/978-3-030-01081-2_21
DOI: https://doi.org/10.1007/978-3-030-01081-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01080-5
Online ISBN: 978-3-030-01081-2
eBook Packages: Computer Science (R0)