DOI: 10.1145/3543507.3587431
research-article

CAM: A Large Language Model-based Creative Analogy Mining Framework

Published: 30 April 2023

Abstract

Analogies inspire creative solutions to problems and facilitate both the creative expression of ideas and the explanation of complex concepts. They have widespread applications in scientific innovation, creative writing, and education. The ability to discover creative analogies that are not explicitly mentioned on the Web but can be inferred from it is highly desirable, both to power such applications dynamically and to augment human creativity. Recently, large Pre-trained Language Models (PLMs), trained on massive Web data, have shown great promise in generating analogies, but these are mostly known analogies that are explicitly mentioned on the Web. It remains unclear how PLMs could be leveraged for mining creative analogies that are not explicitly mentioned on the Web. We address this challenge and propose Creative Analogy Mining (CAM), a novel framework for mining creative analogies, which consists of three main steps: 1) generate analogies using PLMs with effectively designed prompts, 2) evaluate their quality using scoring functions, and 3) refine the low-quality analogies through another round of prompt-based generation. We propose both unsupervised and supervised instantiations of the framework so that it can be used even without any annotated data. Based on human evaluation using Amazon Mechanical Turk, we find that our unsupervised framework can mine 13.7% highly creative and 56.37% somewhat creative analogies. Moreover, our supervised scores are generally better than the unsupervised ones and correlate moderately with human evaluators, indicating that they would be even more effective at mining creative analogies. These findings also shed light on the creativity of PLMs.
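The three steps named in the abstract (generate, evaluate, refine) can be read as a simple control loop. The sketch below is only an illustration of that loop under stated assumptions, not the authors' implementation: the abstract does not give the prompts, the scoring functions, or any thresholds, so `mine_analogies`, `threshold`, `max_rounds`, and the three `toy_*` stand-ins are all hypothetical. In a real setting, `generate` and `refine` would call a PLM with a prompt, and `score` would be one of the unsupervised or supervised scoring functions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    concept: str
    analogy: str
    score: float


def mine_analogies(
    concepts: List[str],
    generate: Callable[[str], str],      # step 1: concept -> analogy text
    score: Callable[[str, str], float],  # step 2: (concept, analogy) -> quality
    refine: Callable[[str, str], str],   # step 3: (concept, analogy) -> revised analogy
    threshold: float = 0.5,              # hypothetical cutoff for "low quality"
    max_rounds: int = 1,                 # hypothetical cap on refinement rounds
) -> List[Candidate]:
    results = []
    for concept in concepts:
        analogy = generate(concept)
        quality = score(concept, analogy)
        rounds = 0
        # Only analogies scored below the threshold are sent back for refinement.
        while quality < threshold and rounds < max_rounds:
            revised = refine(concept, analogy)
            revised_quality = score(concept, revised)
            # Keep the revision only if the scorer prefers it.
            if revised_quality > quality:
                analogy, quality = revised, revised_quality
            rounds += 1
        results.append(Candidate(concept, analogy, quality))
    return results


# Toy stand-ins (assumptions for illustration; a real system would prompt a PLM).
def toy_generate(concept: str) -> str:
    return f"{concept} is like a library."


def toy_score(concept: str, analogy: str) -> float:
    # Placeholder heuristic: reward analogies that state a rationale.
    return 1.0 if "because" in analogy else 0.2


def toy_refine(concept: str, analogy: str) -> str:
    return analogy.rstrip(".") + " because both organize items for later retrieval."
```

Calling `mine_analogies(["memory"], toy_generate, toy_score, toy_refine)` runs one generation, finds the score below the threshold, and applies a single refinement round. Passing the generator, scorer, and refiner as plain callables keeps the loop independent of whether the scorer is unsupervised or supervised.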

    Published In

    WWW '23: Proceedings of the ACM Web Conference 2023
    April 2023
    4293 pages
    ISBN: 9781450394161
    DOI: 10.1145/3543507

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. analogy mining
    2. creativity
    3. large language model

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '23: The ACM Web Conference 2023
    April 30 - May 4, 2023
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
