A Template-Based Approach for Generating Vietnamese References from Flat MR Dataset in Restaurant Domain

  • Conference paper
Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications (FDSE 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1306)

Abstract

In recent years, research in natural language generation (NLG) has focused on corpus-based systems, either for a specific domain or across domains. Training such systems requires meaning representations (MRs) paired with natural language (NL) references. In the first part of this paper, we introduce a Vietnamese Flat MR dataset, the first Vietnamese dataset for training end-to-end, data-driven NLG systems in the restaurant domain. We then establish a method for generating references on this dataset. The method consists of two main stages: (i) sentence planning, which determines the semantic template of the output text; and (ii) surface realization, which selects appropriate Vietnamese phrases to replace the corresponding predicates (slot-value pairs) of the Flat MR in the semantic template. The evaluation results indicate that the dataset and the proposed generation method contribute to the development of NLG research.
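The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bracketed `slot[value]` MR syntax, the slot names, the templates, and the Vietnamese phrase table are all assumptions chosen for illustration, since the abstract does not specify them. Sentence planning is reduced here to looking up a template by the set of slots present, and surface realization to substituting a Vietnamese phrase for each slot-value pair.

```python
import re

# Hypothetical phrase table: maps a (slot, value) pair of the flat MR
# to a Vietnamese phrase. The entries are illustrative only.
PHRASES = {
    ("eatType", "coffee shop"): "quán cà phê",
    ("food", "French"): "món Pháp",
}

# Hypothetical semantic templates, keyed by the set of slots in the MR.
TEMPLATES = {
    frozenset({"name", "eatType", "food"}): "{name} là một {eatType} phục vụ {food}.",
    frozenset({"name", "eatType"}): "{name} là một {eatType}.",
}


def parse_flat_mr(mr: str) -> dict:
    """Parse an assumed 'slot[value], slot[value]' string into a dict."""
    return dict(re.findall(r"(\w+)\[([^\]]*)\]", mr))


def plan_sentence(slots: dict) -> str:
    """Sentence planning: pick the template matching the slots present."""
    return TEMPLATES[frozenset(slots)]


def realize(template: str, slots: dict) -> str:
    """Surface realization: replace each slot with a Vietnamese phrase,
    falling back to the raw value (e.g. a proper name) when none is listed."""
    filled = {slot: PHRASES.get((slot, value), value) for slot, value in slots.items()}
    return template.format(**filled)


def generate(mr: str) -> str:
    """Run both stages on one flat MR string."""
    slots = parse_flat_mr(mr)
    return realize(plan_sentence(slots), slots)


print(generate("name[The Eagle], eatType[coffee shop], food[French]"))
```

Keeping template selection separate from phrase substitution mirrors the split between the two stages: new templates or new phrase entries can be added without touching the other table. An MR whose slot set has no matching template raises a `KeyError` in this sketch; a fuller system would need a fallback.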

Author information

Corresponding author

Correspondence to Dang Tuan Nguyen.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nguyen, D.T., Tran, T. (2020). A Template-Based Approach for Generating Vietnamese References from Flat MR Dataset in Restaurant Domain. In: Dang, T.K., Küng, J., Takizawa, M., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2020. Communications in Computer and Information Science, vol 1306. Springer, Singapore. https://doi.org/10.1007/978-981-33-4370-2_17

  • DOI: https://doi.org/10.1007/978-981-33-4370-2_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-33-4369-6

  • Online ISBN: 978-981-33-4370-2

  • eBook Packages: Computer Science, Computer Science (R0)
