poster

Multi-language Reverse Dictionary Model Based on Improved mBERT

Authors:
Qi Han
Department of Computer Science and Technology, Ocean University of China, China
ORCID: 0009-0008-6972-6842

Yingjian Liu
Department of Computer Science and Technology, Ocean University of China, China
ORCID: 0000-0002-3401-2566
ACM TURC '23: Proceedings of the ACM Turing Award Celebration Conference - China 2023, July 2023, Pages 114–115, https://doi.org/10.1145/3603165.3607426

Published: 25 September 2023

ABSTRACT

A reverse dictionary takes a description as input and returns a ranked list of vocabulary words whose definitions match it. Although reverse dictionaries have wide practical value, they have received little research attention, particularly in the multilingual setting. To address this gap and improve reverse-dictionary accuracy across languages, this paper proposes a multilingual reverse dictionary model based on mBERT. The proposed model improves on the original by incorporating features such as the part of speech of words. Its effectiveness has been validated on both English and Chinese datasets, and experimental results show that it outperforms the baseline models on most metrics.
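
To make the task concrete, the following is a minimal sketch of a reverse dictionary lookup, not the model proposed in the paper. It assumes a tiny made-up vocabulary (`VOCAB`), replaces the mBERT encoder with a simple bag-of-words vector (`embed`), and uses part of speech only as an optional hard filter. The abstract does not say how part-of-speech features enter the model, so that choice, like every name in the code, is an illustrative assumption.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary: word -> (part of speech, definition).
# These entries are invented for illustration and are not from the paper's datasets.
VOCAB = {
    "river": ("noun", "a large natural stream of water flowing to the sea"),
    "swim": ("verb", "to move through water using the arms and legs"),
    "ocean": ("noun", "a very large expanse of salt water"),
    "run": ("verb", "to move quickly on foot"),
}


def embed(text):
    """Stand-in encoder: bag-of-words token counts.

    A real system would use a neural encoder such as mBERT here.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reverse_lookup(description, pos=None, top_k=3):
    """Return up to top_k vocabulary words ranked by similarity to the description.

    If pos is given, only words with that part of speech are considered
    (an assumed, simplified use of part-of-speech information).
    """
    query = embed(description)
    scored = [
        (cosine(query, embed(definition)), word)
        for word, (word_pos, definition) in VOCAB.items()
        if pos is None or word_pos == pos
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_k]]


if __name__ == "__main__":
    print(reverse_lookup("a large stream of water", pos="noun"))
```

Restricting candidates by part of speech narrows the ranked list before scoring. A learned model could instead treat part of speech as a soft signal that adjusts the scores.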

Published in

ACM TURC '23: Proceedings of the ACM Turing Award Celebration Conference - China 2023
July 2023
173 pages
ISBN: 9798400702334
DOI: 10.1145/3603165

Copyright © 2023 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Language model
Natural language processing
Reverse dictionary
Qualifiers
- poster
- Research
- Refereed limited