DOI: 10.1145/3543507.3587431

CAM: A Large Language Model-based Creative Analogy Mining Framework

Published: 30 April 2023


Analogies inspire creative solutions to problems, and facilitate the creative expression of ideas and the explanation of complex concepts. They have widespread applications in scientific innovation, creative writing, and education. The ability to discover creative analogies that are not explicitly mentioned but can be inferred from the web is highly desirable to power all such applications dynamically and augment human creativity. Recently, Large Pre-trained Language Models (PLMs), trained on massive Web data, have shown great promise in generating mostly known analogies that are explicitly mentioned on the Web. However, it is unclear how they could be leveraged for mining creative analogies not explicitly mentioned on the Web. We address this challenge and propose Creative Analogy Mining (CAM), a novel framework for mining creative analogies, which consists of the following three main steps: 1) Generate analogies using PLMs with effectively designed prompts, 2) Evaluate their quality using scoring functions, and 3) Refine the low-quality analogies by another round of prompt-based generation. We propose both unsupervised and supervised instantiations of the framework so that it can be used even without any annotated data. Based on human evaluation using Amazon Mechanical Turk, we find that our unsupervised framework can mine 13.7% highly-creative and 56.37% somewhat-creative analogies. Moreover, our supervised scores are generally better than the unsupervised ones and correlate moderately with human evaluators, indicating that they would be even more effective at mining creative analogies. These findings also shed light on the creativity of PLMs.
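The three steps named in the abstract (generate, evaluate, refine) can be sketched as a small control loop. This is only an illustrative sketch, not the authors' implementation: `mine_analogies`, `Candidate`, and the `toy_*` functions are hypothetical names, the PLM calls are replaced by deterministic stubs, and the distinct-word score is a placeholder that stands in for the paper's scoring functions, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    concept: str
    analogy: str
    score: float
    refined: bool = False


def mine_analogies(
    concepts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str], float],
    refine: Callable[[str, str], str],
    threshold: float = 0.5,
) -> List[Candidate]:
    """Generate one analogy per concept, score it, and refine low scorers."""
    results = []
    for concept in concepts:
        analogy = generate(concept)          # step 1: prompt-based generation
        quality = score(analogy)             # step 2: quality evaluation
        refined = False
        if quality < threshold:              # step 3: one refinement round
            new_analogy = refine(concept, analogy)
            new_quality = score(new_analogy)
            if new_quality > quality:        # keep the refinement only if it helps
                analogy, quality, refined = new_analogy, new_quality, True
        results.append(Candidate(concept, analogy, quality, refined))
    # Highest-scoring analogies first.
    return sorted(results, key=lambda c: c.score, reverse=True)


# --- Hypothetical stand-ins for the PLM and the scoring function ---

def toy_generate(concept: str) -> str:
    return f"{concept} is like a thing."


def toy_score(analogy: str) -> float:
    # Placeholder: distinct-word count scaled to [0, 1]; NOT a creativity metric.
    words = analogy.lower().split()
    return min(1.0, len(set(words)) / 10)


def toy_refine(concept: str, analogy: str) -> str:
    return (f"{concept} is like a river because both carry material "
            "from a source to many destinations.")


if __name__ == "__main__":
    for c in mine_analogies(["a cell"], toy_generate, toy_score, toy_refine, 0.6):
        print(c)
```

In a real setting, `generate` and `refine` would wrap prompts sent to a language model and `score` would be an unsupervised or supervised scorer; passing them in as callables keeps the loop independent of any particular model or metric.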







    • Published in

      WWW '23: Proceedings of the ACM Web Conference 2023
      April 2023
      4293 pages

      Copyright © 2023 ACM



      Association for Computing Machinery

      New York, NY, United States





      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
