Abstract
LIT (Lexicon of the Italian Television) is a project conceived by the Accademia della Crusca, the leading research institution on the Italian language, in collaboration with CLIEO (Center for theoretical and historical Linguistics: Italian, European and Oriental languages), with the aim of studying the frequency of Italian vocabulary used on television. The corpus of transcriptions was created from approximately 170 hours of randomly selected television recordings acquired from the national broadcaster RAI (Italian Radio Television) during 2006. The principal outcome of the project is the design and implementation of an interactive system that combines a web-based video transcription and annotation tool, a full-featured search engine, and a web application for data visualization with text-video syncing. Furthermore, the project is currently being deployed as a module of the larger nationally funded research project FIRB 2009 VIVIT (Fondo di Investimento per la Ricerca di Base, Vivi l'Italiano), which will integrate its achievements and results within a semantic web infrastructure.









Acknowledgments
The LIT project was initially funded by CLIEO, the “Center for theoretical and historical Linguistics: Italian, European and Oriental languages”, in collaboration with the Accademia della Crusca, the leading research institution on the Italian language, and we owe a debt of gratitude to Prof. Nicoletta Maraschio, who made it possible to kick off this research. Most of the computer engineering work done at the Media Integration and Communication Center benefited from continuous feedback from researchers of the Accademia and would not have been possible without the precious support of Marco Biffi and Vera Gheno. Fortunately, the initial financial support of CLIEO has been extended into a three-year project funded by the Italian Ministry of Education, University and Research, which will allow the integration of semantic web functionalities.
Besmir Bregasi, Ervin Hoxha and Tiberio Uricchio are three M.Eng. students who hardly knew how complicated things can get when projects are delivered by the Media Integration and Communication Center, but they had the chance to learn a lot while working on their B.Eng. dissertation assignment. The freshness of young minds is always a plus when their efforts can be focused and combined with senior experience.
Cite this article
Alisi, T.M., Del Bimbo, A., Ferracani, A. et al. LIT: transcription, annotation, search and visualization tools for the Lexicon of the Italian Television. Multimed Tools Appl 60, 327–346 (2012). https://doi.org/10.1007/s11042-010-0610-3
DOI: https://doi.org/10.1007/s11042-010-0610-3