Abstract
Machine Reading Comprehension (MRC) is the task of teaching machines to understand a passage of text, more technically called a context. Like humans, machines are evaluated on this understanding through question answering. MRC is one of the formidable sub-domains of Natural Language Processing (NLP) and has seen considerable progress over the years. In recent years, many novel datasets have challenged MRC models with inference-based question answering. With advances in NLP, many models have surpassed human-level performance on these datasets, although this ignores the obvious disparity between genuine human comprehension and state-of-the-art benchmark performance. This highlights the need for collective improvement of existing datasets, metrics, and models towards “real” comprehension. Addressing this gap, this paper performs a comparative study of various transformer-based models and highlights the success factors of each. Subsequently, we discuss an MRC model that performs comparatively better, if not the best, on question answering and give directions for future research.
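For readers less familiar with the extractive question-answering setup the abstract describes, the following minimal sketch (not taken from the paper) shows how a transformer-based MRC model reads a context and predicts an answer span. The library call and the model checkpoint (deepset/roberta-base-squad2) are illustrative assumptions, not the system evaluated by the authors.

    # Minimal extractive QA sketch using the Hugging Face transformers library.
    # The checkpoint below is an assumed example, not the paper's model.
    from transformers import pipeline

    qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

    context = (
        "Machine Reading Comprehension (MRC) evaluates whether a model has "
        "understood a passage by asking it questions about that passage."
    )
    question = "How is a model's understanding evaluated in MRC?"

    # The model returns the most likely answer span from the context,
    # together with a confidence score.
    result = qa(question=question, context=context)
    print(result["answer"], result["score"])

In a comparative study of this kind, each candidate model can be substituted via the model argument and its answers scored with standard MRC metrics such as Exact Match and F1.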
Cite this paper
Sankar, A., Dhanalakshmi, R. (2022). Comparative Study of Transformer Models. In: Hua, W., Wang, H., Li, L. (eds) Databases Theory and Applications. ADC 2022. Lecture Notes in Computer Science, vol 13459. Springer, Cham. https://doi.org/10.1007/978-3-031-15512-3_17