LIT: transcription, annotation, search and visualization tools for the Lexicon of the Italian Television

Multimedia Tools and Applications

Abstract

LIT (Lexicon of the Italian Television) is a project conceived by the Accademia della Crusca, the leading research institution on the Italian language, in collaboration with CLIEO (Center for theoretical and historical Linguistics: Italian, European and Oriental languages), with the aim of studying the frequencies of the Italian vocabulary used in television. Approximately 170 hours of random television recordings acquired from the national broadcaster RAI (Italian Radio Television) during the year 2006 have been used to create the corpus of transcriptions. The principal outcome of the project is the design and implementation of an interactive system that combines a web-based video transcription and annotation tool, a full-featured search engine, and a web application for data visualization with text-video syncing. Furthermore, the project is currently being deployed as a module of the larger nationally funded research project FIRB 2009 VIVIT (Fondo di Investimento per la Ricerca di Base, Vivi l'Italiano), which will integrate its achievements and results within a semantic web infrastructure.

References

  1. A complete list is available at http://annotation.exmaralda.org/index.php/Tools

  2. Agile software development refers to a group of software development methodologies based on iterative development, where requirements and solutions evolve through collaboration between self-organizing cross-functional teams. http://agilemanifesto.org/

  3. Amaral R, Meinedo H, Caseiro D, Trancoso I, Neto JP (2006) Automatic vs. Manual Topic Segmentation and Indexation in Broadcast News. In: Proc. of the IV Jornadas en Tecnologia del Habla

  4. Anvil: http://www.anvil-software.de/ for more details

  5. At the time of this writing, the TEI standard DTD definitions are those of rev. 5, released Nov. 2007: http://www.tei-c.org/Guidelines/P5/

  6. Bender EM, Langendoen DT (2010) Computational linguistics in support of Linguistic Theory. Linguistic Issues in Language Technology 3(2):1–31

  7. Bertini M, Cucchiara R, Del Bimbo A, Grana C, Serra G, Torniai C, Vezzani R (2009) Dynamic pictorially enriched ontologies for video digital libraries. IEEE Multimedia (MMUL)

  8. Bits of memory for the ATLAS project definition: http://xml.coverpages.org/atlasAnnotation.html and the official website of the Multimodal Information Group, the research board currently hosting parts of the project: http://www.nist.gov/itl/iad/mig/

  9. Dybkjaer L, Berman S, Kipp M, Olsen MW, Pirelli V, Reithinger N, Soria C (2001) Survey of existing tools, standards and user needs for annotation of natural interaction and multimodal data. Technical report, January

  10. Garrett J (2002) Elements of user experience: user-centered design for the web. New Riders Press, USA

  11. Hauptmann AG, Jin R, Ng TD (2003) Video retrieval using speech and image information. In Storage and retrieval for multimedia databases 2003. EI’03 Electronic Imaging, pp 148–159

  12. Moniz H, Batista F, Meinedo H, Abad A, Trancoso I, Mata da Silva AI, Mamede NJ (2010) Prosodically-based automatic segmentation and punctuation. In: Speech Prosody 2010, ISCA, Chicago, USA, May

  13. ILSP: http://www.ilsp.gr for more details

  14. In terms of diffusion, Flash Player has a market penetration of over 95%: http://www.statowl.com/custom_ria_market_penetration.php

  15. It is fairly difficult to find a complete and up-to-date reference of annotation tools and applications. The following, dated 2001, is somewhat old but almost complete and can be a useful starting point: http://www.ldc.upenn.edu/annotation/

  16. Kristoffersen S (2008) Learnability and robustness of user interfaces: towards a formal analysis of usability design principles. In: Proceedings of the ICSOFT 2008: 3rd International Conference on Software and Data Technologies. vol. SE/GSDCA/MUSE: Institute for Systems and Technologies of Information, Control and Communication, pp 261–268

  17. Kuniavsky M (2003) Observing the user experience: A practitioner’s guide to user research. Morgan Kaufmann Publishers, Elsevier Science, USA

  18. NITE: http://www.dfki.de/nite/ for more details

  19. Praat: http://www.fon.hum.uva.nl/praat/ for more details

  20. RDF standard official website: http://www.w3.org/RDF/ and SPARQL recommendation for RDF queries: http://www.w3.org/TR/rdf-sparql-query/

  21. SOAP protocol standard definition at W3C: http://www.w3.org/TR/soap/ and definition of REST web services: http://en.wikipedia.org/wiki/Representational_State_Transfer

  22. The MPEG-7 standard definition and overview: http://mpeg.chiariglione.org/standards/mpeg-7/mpeg-7.htm

  23. The VIVIT (Vivi l’italiano) project is funded by the FIRB, the Investment Fund for Basic Research of the Italian Ministry of University and Research and has the objective of creating an integrated digital archive of teaching materials, texts and iconographic documents and media for knowledge dissemination of Italian language and cultural history

  24. Transcriber: http://trans.sourceforge.net/en/presentation.php for more details

Acknowledgments

The LIT project was initially funded by CLIEO, the “Center for theoretical and historical Linguistics: Italian, European and Oriental languages”, in collaboration with the Accademia della Crusca, the leading research institution on the Italian language, and we owe a debt of gratitude to Prof. Nicoletta Maraschio, who made it possible to kick off this research. Most of the computer engineering work done at the Media Integration and Communication Center received continuous feedback from researchers of the Accademia and would not have been possible without the precious support of Marco Biffi and Vera Gheno. Luckily enough, the initial financial support of CLIEO has been extended by a three-year project funded by the Italian Ministry of Education, University and Research, which will allow the integration of semantic web functionalities.

Besmir Bregasi, Ervin Hoxha and Tiberio Uricchio are three M.Eng. students who hardly knew how complicated things could get when projects are delivered by the Media Integration and Communication Center, but they had the chance to learn a lot while working on their B.Eng. dissertation assignment. The freshness of young minds is always a plus when efforts can be focused and combined with senior experience.

Author information

Corresponding author

Correspondence to Thomas M. Alisi.

About this article

Cite this article

Alisi, T.M., Del Bimbo, A., Ferracani, A. et al. LIT: transcription, annotation, search and visualization tools for the Lexicon of the Italian Television. Multimed Tools Appl 60, 327–346 (2012). https://doi.org/10.1007/s11042-010-0610-3

  • DOI: https://doi.org/10.1007/s11042-010-0610-3
