ABSTRACT
Analogies inspire creative solutions to problems and facilitate both the creative expression of ideas and the explanation of complex concepts. They have widespread applications in scientific innovation, creative writing, and education. The ability to discover creative analogies that are not explicitly mentioned on the Web but can be inferred from it is highly desirable, both to power such applications dynamically and to augment human creativity. Recently, large Pre-trained Language Models (PLMs), trained on massive Web data, have shown great promise in generating analogies, but mostly known ones that are explicitly mentioned on the Web. It is unclear how PLMs could be leveraged for mining creative analogies that are not explicitly mentioned on the Web. We address this challenge and propose Creative Analogy Mining (CAM), a novel framework for mining creative analogies, which consists of three main steps: 1) generate analogies using PLMs with effectively designed prompts, 2) evaluate their quality using scoring functions, and 3) refine the low-quality analogies through another round of prompt-based generation. We propose both unsupervised and supervised instantiations of the framework so that it can be used even without any annotated data. Based on human evaluation using Amazon Mechanical Turk, we find that our unsupervised framework can mine 13.7% highly creative and 56.37% somewhat creative analogies. Moreover, our supervised scores are generally better than the unsupervised ones and correlate moderately with human evaluations, indicating that they would be even more effective at mining creative analogies. These findings also shed light on the creativity of PLMs.
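The three steps described in the abstract (generate, evaluate, refine) can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `toy_generate`, `toy_score`, and `toy_refine` are hypothetical stand-ins for PLM prompting and for the paper's scoring functions, and the threshold value and the single refinement round are illustrative choices that do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Analogy:
    concept: str
    text: str
    score: float


def mine_analogies(
    concepts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    refine: Callable[[str, str], str],
    threshold: float,
    max_rounds: int = 1,
) -> List[Analogy]:
    """Generate an analogy per concept, score it, and refine low scorers."""
    mined = []
    for concept in concepts:
        text = generate(concept)              # step 1: generate
        best = score(concept, text)           # step 2: evaluate
        rounds = 0
        while best < threshold and rounds < max_rounds:
            candidate = refine(concept, text)  # step 3: refine
            candidate_score = score(concept, candidate)
            if candidate_score > best:         # keep only improvements
                text, best = candidate, candidate_score
            rounds += 1
        mined.append(Analogy(concept, text, best))
    return mined


# Hypothetical placeholders: a real system would prompt a PLM here.
def toy_generate(concept: str) -> str:
    return f"{concept} is like a factory."


def toy_score(concept: str, text: str) -> float:
    # Toy proxy only: more distinct words gives a higher score, capped at 1.
    return min(len(set(text.lower().split())) / 10.0, 1.0)


def toy_refine(concept: str, text: str) -> str:
    return text.rstrip(".") + " because both turn raw inputs into useful outputs."


if __name__ == "__main__":
    for analogy in mine_analogies(
        ["A cell"], toy_generate, toy_score, toy_refine, threshold=0.8
    ):
        print(analogy)
```

Passing the generator, scorer, and refiner as callables mirrors the idea that the framework admits different instantiations: an unsupervised or a supervised scoring function could be swapped in without changing the loop.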
Index Terms
- CAM: A Large Language Model-based Creative Analogy Mining Framework