Research Article
DOI: 10.1145/3639856.3639872

Improved Sequence Predictions using Knowledge Graph Embedding for Large Language Models

Published: 17 May 2024

Abstract

Large Language Models (LLMs) have recently gained enormous popularity owing to their problem-solving capability across multiple domains. Technically, an LLM can be viewed as a combination of vast amounts of training data, careful and exhaustive prompt engineering, and a word prediction model, trained through both supervised and reinforcement learning. Word prediction models are at the core of any Large Language Model. The prevailing word prediction techniques are sequential models and Transformers; Transformers overcome most of the drawbacks of sequential models while relying on similar embedding knowledge. Our literature survey shows little to no recent improvement in the embedding techniques themselves. In this paper, we examine existing word prediction models by replacing their embedding layers with an auto-engineered Knowledge Graph Embedding. This auto-engineered data representation yields substantial improvements in prediction quality. It also accelerates prediction by providing the models with more contextual information than a general embedding mechanism does. Standard evaluation strategies are used to compare model behavior.
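
To make the central idea concrete, below is a minimal sketch in PyTorch (an illustrative assumption; the paper does not specify an implementation) of swapping a randomly initialized embedding table for pre-trained knowledge-graph entity vectors in a Transformer next-word predictor. The KGEmbeddingPredictor class, all names, and all dimensions are hypothetical, and random vectors stand in for embeddings that would in practice come from a KG embedding model such as TransE trained on a graph aligned with the vocabulary.

```python
# Hypothetical sketch, not the authors' implementation: a Transformer
# next-token predictor whose token embeddings are initialized from
# pre-trained knowledge-graph entity vectors instead of being learned
# from scratch.
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM, CONTEXT_LEN = 10_000, 256, 128

# Random stand-in for pre-trained KG entity embeddings aligned to the
# vocabulary (in practice, e.g., TransE vectors).
kg_vectors = torch.randn(VOCAB_SIZE, EMBED_DIM)

class KGEmbeddingPredictor(nn.Module):
    """Next-word predictor with a KG-initialized embedding table."""

    def __init__(self, kg_vectors: torch.Tensor, freeze: bool = False):
        super().__init__()
        vocab_size, embed_dim = kg_vectors.shape
        # Replace the usual randomly initialized embedding layer with
        # the KG vectors; optionally freeze them during training.
        self.embed = nn.Embedding.from_pretrained(kg_vectors, freeze=freeze)
        self.pos = nn.Embedding(CONTEXT_LEN, embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.lm_head = nn.Linear(embed_dim, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        seq_len = token_ids.size(1)
        positions = torch.arange(seq_len, device=token_ids.device)
        x = self.embed(token_ids) + self.pos(positions)
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
        h = self.encoder(x, mask=mask)
        return self.lm_head(h)  # logits over the vocabulary

model = KGEmbeddingPredictor(kg_vectors)
logits = model(torch.randint(0, VOCAB_SIZE, (2, 32)))  # batch=2, seq=32
print(logits.shape)  # torch.Size([2, 32, 10000])
```

Freezing the KG vectors (freeze=True) would isolate how much signal the graph-derived representation contributes on its own, while leaving them trainable lets the language-modeling objective fine-tune that representation.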


Published In

AIMLSystems '23: Proceedings of the Third International Conference on AI-ML Systems
October 2023
381 pages
ISBN: 9798400716492
DOI: 10.1145/3639856
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. attention mechanisms
  2. generative models
  3. neural networks
  4. text prediction


Conference

AIMLSystems 2023
