DOI: 10.1145/3543507.3587431
Research article

CAM: A Large Language Model-based Creative Analogy Mining Framework

Published: 30 April 2023

ABSTRACT

Analogies inspire creative solutions to problems, and facilitate the creative expression of ideas and the explanation of complex concepts. They have widespread applications in scientific innovation, creative writing, and education. The ability to discover creative analogies that are not explicitly mentioned but can be inferred from the Web is highly desirable to power all such applications dynamically and augment human creativity. Recently, Large Pre-trained Language Models (PLMs), trained on massive Web data, have shown great promise in generating mostly known analogies that are explicitly mentioned on the Web. However, it is unclear how they could be leveraged for mining creative analogies not explicitly mentioned on the Web. We address this challenge and propose Creative Analogy Mining (CAM), a novel framework for mining creative analogies, which consists of the following three main steps: 1) Generate analogies using PLMs with effectively designed prompts, 2) Evaluate their quality using scoring functions, and 3) Refine the low-quality analogies by another round of prompt-based generation. We propose both unsupervised and supervised instantiations of the framework so that it can be used even without any annotated data. Based on human evaluation using Amazon Mechanical Turk, we find that our unsupervised framework can mine 13.7% highly-creative and 56.37% somewhat-creative analogies. Moreover, our supervised scores are generally better than the unsupervised ones and correlate moderately with human evaluators, indicating that they would be even more effective at mining creative analogies. These findings also shed light on the creativity of PLMs.
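The abstract describes CAM as a three-step loop: prompt-based generation with a PLM, quality scoring, and a second round of prompting to refine low-scoring analogies. The sketch below is only a schematic illustration of that loop, not the authors' implementation: the function names llm_complete and creativity_score, the prompt wording, and the 0.5 threshold are all illustrative assumptions.

    from typing import Callable, List, Tuple

    def mine_creative_analogies(
        concepts: List[str],
        llm_complete: Callable[[str], str],        # prompted PLM call (assumed; e.g., an InstructGPT-style model)
        creativity_score: Callable[[str], float],  # unsupervised or supervised scoring function, assumed to return a value in [0, 1]
        threshold: float = 0.5,                    # assumed cutoff separating low-quality analogies
    ) -> List[Tuple[str, float]]:
        """Generate, evaluate, and refine one analogy per target concept."""
        mined: List[Tuple[str, float]] = []
        for concept in concepts:
            # Step 1: prompt-based generation.
            analogy = llm_complete(f"Explain {concept} using a creative analogy.")

            # Step 2: evaluate quality with a scoring function.
            score = creativity_score(analogy)

            # Step 3: refine low-quality analogies with another round of prompting.
            if score < threshold:
                analogy = llm_complete(
                    f"The following analogy for {concept} is not creative enough:\n"
                    f"{analogy}\n"
                    f"Write a more creative analogy for {concept}."
                )
                score = creativity_score(analogy)

            mined.append((analogy, score))
        return mined

In this reading, the supervised and unsupervised instantiations differ only in how creativity_score is defined, so the same generate-score-refine loop applies to both.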



    • Published in

      WWW '23: Proceedings of the ACM Web Conference 2023
      April 2023
      4293 pages
      ISBN: 9781450394161
      DOI: 10.1145/3543507

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 30 April 2023


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

