Automatic Construction of Tamil UNL Dictionary

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9468)

Abstract

In this paper, we propose an automatic tool for creating dictionary entries of Tamil words for the Universal Networking Language (UNL). Dictionaries play a crucial role in many NLP applications, especially in machine translation (MT) systems. However, creating dictionary entries manually is a time-consuming process. Moreover, a UNL dictionary carries additional features such as semantic constraints and attributes. To address this complex task, we propose a domain-specific approach in which the dictionary entries are created automatically using other word-based resources such as WordNet, bilingual dictionaries, and the UNL ontology. As the source of domain-specific words, we use domain-specific documents from the web. Three resources are used to extract meaningful words from the documents: a morphological analyzer, to extract the grammatical information of a given word; WordNet, to identify the semantics of the word; and the UNL KB (Knowledge Base), to obtain the semantic constraints of the word. Semantic constraints help to determine the tense, mood, and aspect of the given word. Sometimes the automatic process does not determine these semantic constraints correctly. In such cases, a semantic-similarity-based filtering method built on the UNL ontology is used to remove the incorrect dictionary entries. Thus, the automatic dictionary tool handles words semantically and also improves the correctness of the dictionary.
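The filtering step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which similarity measure or threshold the tool uses, so everything below is an assumption: a toy ontology fragment (`PARENT`), Wu-Palmer similarity as a stand-in measure, a hypothetical entry layout of (headword, automatically assigned constraint, reference concept), and an arbitrary threshold of 0.6.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy ontology fragment (child -> parent); NOT the real UNL ontology.
PARENT: Dict[str, Optional[str]] = {
    "uw": None,
    "thing": "uw",
    "living_thing": "thing",
    "animal": "living_thing",
    "plant": "living_thing",
    "dog": "animal",
    "tree": "plant",
    "place": "thing",
    "city": "place",
}


def path_to_root(concept: str) -> List[str]:
    """Return the chain of concepts from `concept` up to the root, inclusive."""
    path = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(concept: str) -> int:
    """Depth of a concept, counting the root as depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first node on a's upward path that also subsumes b is the
    # lowest common subsumer (LCS).
    lcs = next(node for node in path_to_root(a) if node in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


# Assumed entry layout: (headword, assigned constraint, reference concept).
Entry = Tuple[str, str, str]


def filter_entries(entries: List[Entry], threshold: float = 0.6) -> List[Entry]:
    """Keep entries whose assigned constraint is close to the reference concept."""
    return [e for e in entries if wu_palmer(e[1], e[2]) >= threshold]


if __name__ == "__main__":
    candidates = [
        ("naay", "animal", "dog"),  # illustrative transliterated headword
        ("naay", "city", "dog"),    # deliberately wrong constraint
    ]
    for entry in filter_entries(candidates):
        print(entry)
```

In this sketch the entry whose constraint lies far from the reference concept in the hierarchy falls below the threshold and is dropped. A real system would replace the toy hierarchy with the UNL ontology and tune the threshold on validated entries.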

Author information

Corresponding authors

Correspondence to Ganesh J, Ranjani Parthasarathi or Geetha T. V.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

J, G., Parthasarathi, R., V, G.T. (2015). Automatic Construction of Tamil UNL Dictionary. In: Prasath, R., Vuppala, A., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2015. Lecture Notes in Computer Science, vol 9468. Springer, Cham. https://doi.org/10.1007/978-3-319-26832-3_58

  • DOI: https://doi.org/10.1007/978-3-319-26832-3_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26831-6

  • Online ISBN: 978-3-319-26832-3

  • eBook Packages: Computer Science, Computer Science (R0)
