DOI: 10.1145/3664647.3681339
Research article · MM Conference Proceedings

Miko: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery

Published: 28 October 2024

Abstract

Social media has become ubiquitous for connecting with others, staying updated with news, expressing opinions, and finding entertainment. However, understanding the intention behind social media posts remains challenging due to the implicit and commonsense nature of these intentions, the need for cross-modality understanding of both text and images, and the presence of noisy information such as hashtags, misspelled words, and complicated abbreviations. To address these challenges, we present MIKO, a Multimodal Intention Knowledge DistillatiOn framework that collaboratively leverages a Large Language Model (LLM) and a Multimodal Large Language Model (MLLM) to uncover users' intentions. Specifically, our approach uses an MLLM to interpret the image, an LLM to extract key information from the text, and another LLM to generate intentions. By applying MIKO to publicly available social media datasets, we construct an intention knowledge base featuring 1,372K intentions rooted in 137,287 posts. Moreover, we conduct a two-stage annotation to verify the quality of the generated knowledge and benchmark the performance of widely used LLMs for intention generation. We further apply MIKO to a sarcasm detection dataset and distill a student model to demonstrate the downstream benefits of applying intention knowledge.
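The three-stage pipeline described above (MLLM image description, then LLM key-information extraction, then LLM intention generation) can be sketched as follows. This is an illustrative outline only: every function here is a hypothetical stub, not the paper's implementation, and a real deployment would replace the stubs with calls to an MLLM such as LLaVA and an LLM such as ChatGPT.

```python
# Hypothetical sketch of the three-stage MIKO pipeline. All functions are
# stand-in stubs for model calls; names and signatures are assumptions.

def describe_image(image_path: str) -> str:
    """Stage 1 (MLLM): produce a natural-language description of the image."""
    # Stub: a real system would query an MLLM (e.g. LLaVA) here.
    return f"a photo loaded from {image_path}"

def extract_key_info(post_text: str) -> str:
    """Stage 2 (LLM): distill key content from noisy post text."""
    # Stub: crudely drops hashtags to mimic denoising by an LLM.
    return " ".join(t for t in post_text.split() if not t.startswith("#"))

def generate_intentions(description: str, key_info: str, k: int = 3) -> list[str]:
    """Stage 3 (LLM): infer k plausible user intentions from both modalities."""
    # Stub: a real system would prompt an LLM with both inputs.
    return [f"intention {i + 1} given '{key_info}' and '{description}'"
            for i in range(k)]

def miko_pipeline(image_path: str, post_text: str) -> list[str]:
    """Chain the three stages into one intention-discovery pass."""
    description = describe_image(image_path)
    key_info = extract_key_info(post_text)
    return generate_intentions(description, key_info)
```

The design point the sketch illustrates is the division of labor: the image and the noisy text are normalized into clean natural language by separate models before a final LLM reasons about intentions over both.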

Supplemental Material

MP4 File - MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
Social media is a double-edged sword: used well, it can benefit individuals, groups, and organizations; misused, it can endanger society. Why do we care about intention on social media? Because user intention reflects the most direct social need behind publishing a post. Motivated by this, we propose a framework with three core components: multi-information reasoning, intention distillation, and multi-view intention effectiveness evaluation. We leverage the LLaVA and ChatGPT models with a novel hierarchical prompt guidance approach to extract image descriptions, key textual information, and intentions from user posts. We then annotate the derived intentions for rationality and credibility, construct a benchmark, and assess both the performance of various LLMs at intention generation and the benefit of intentions on a sarcasm detection task.


Cited By

  • (2024) Structured Intention Generation with Multimodal Graph Transformers: The MMIntent-LLM Framework. 2024 IEEE International Conference on Big Data (BigData), pp. 8512-8519. DOI: 10.1109/BigData62323.2024.10826116. Online publication date: 15-Dec-2024.

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. intention knowledge distillation
    2. large language model
    3. large vision language model
    4. social media

    Qualifiers

    • Research-article

    Funding Sources

    • NSFC
    • RIF
    • GRF
    • UGC Research Matching Grants

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne, VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
