Thesaurus Construction and Use: a Practical Manual, 4th ed.

Marianne Lykke Nielsen (Royal School of Library and Information Science, Denmark)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 June 2002

Citation

Lykke Nielsen, M. (2002), "Thesaurus Construction and Use: a Practical Manual, 4th ed.", Journal of Documentation, Vol. 58 No. 3, pp. 350-353. https://doi.org/10.1108/jd.2002.58.3.350.15

Publisher: Emerald Group Publishing Limited

Copyright © 2002, MCB UP Limited


This book is the fourth edition of a classic which has served as a thorough guideline for thesaurus compilers for the last 30 years. Since the first edition appeared in 1972, the IR scene has changed dramatically from information retrieval based on bibliographic, intellectual indexing to full‐text retrieval, and in the introduction the authors raise the question of whether a new edition is needed and timely. However, both research and practice show that thesauri still have a role in information retrieval. In fact, one could say that the thesaurus has been reborn during the last ten years in the context of corporate information systems available via intranets and systems providing access to huge amounts of networked information via the Internet (see e.g. Soergel, 1999; Gilchrist and Kibby, 2000; Hunter, 2001; Clarke and Yancey, 2001). The purpose and structure of the information systems using a thesaurus have changed radically, and the variety of systems is enormous, ranging from traditional manually indexed systems registering documents to content management systems registering data and information objects using computer‐based indexing and advanced searching techniques. Thesauri also vary and serve many functions, for example aiding in the general understanding of a subject area, acting as terminological standards, supporting automatic indexing and supporting query expansion. However, the primary role of the thesaurus has remained stable: it is a tool to map and structure the semantic relationships used to describe and provide access to information objects. Although thesauri serve different purposes, they overlap greatly, following very similar principles and methods of construction. Thus, there is no doubt that there is still a place for manuals on thesaurus construction in the information retrieval scene. The question is whether this book has moved with the times and provides timely guidelines on the topic.

Basically, this book describes the concept of thesauri and how to design and compile them. Section A describes the nature, purpose and use of thesauri, dividing the role into two: thesauri for indexing and thesauri for searching. It thereby omits the function of assisting researchers, writers and readers in exploring the conceptual framework and context of a certain knowledge domain. The recent development of “corporate taxonomies” is introduced in the book, but the new role for the thesaurus as a teaching aid or knowledge base in its own right, supporting language learning and understanding, is not discussed thoroughly. Nor is the broader role of providing semantic knowledge to metadata in general mentioned (Hunter, 2001).

Section B describes why a thorough understanding of an information system is needed when designing a thesaurus in order to meet the requirements of the system and its users. Useful recommendations on which aspects to consider are given, but the manual does not suggest any methodology for gathering the necessary information. The introduction states that there are “problems to be faced in keeping a book on this topic as short as possible”. With all due respect to this constraint, one suggestion would be to include references to works about the necessary system analysis (e.g. Pejtersen, 1980; Soergel, 1985; Lykke Nielsen, 2001).

Section C, which refers to international and national standards for thesaurus construction, is followed by sections dealing with the very important and very fully covered topics of choice and form of terms, specificity and compound terms, explaining and remodelling the recommendations of the standards for vocabulary control and structuring. Students tend to find these sections very detailed at first, but they realise later, when practising thesaurus construction, that they are very useful in real life.

These sections are followed by G “Auxiliary retrieval devices”, H “Thesaurus displays” and I “Multilingual thesauri”, before section J on “Construction techniques”. When the book is used in practice, this structure does not work very well. The sections about construction methodology (planning, techniques, control and structure) belong together and form a progression or flow for thesaurus construction. The process starts with an analysis of the information environment, followed by determination of purpose and design, then collection of terms and concepts, control of form and structuring. Thus, the sections about control and structure are strongly connected to section J, explaining the best way to fulfil the task of steps B, C and D in Figure 35 (pp. 154‐5), and a restructuring of the book might be useful.

The recommendations about construction techniques are clear, constructive and very useful for the thesaurus compiler. However, it might be helpful to have a description of how to use and integrate thesaurus management software in the construction process. These tools provide advantages, but might also put some constraints on both methodology and design. Section H on thesaurus displays provides a rich picture of possibilities for layout and display. Only two pages are devoted to screen displays, and even though conventional displays apply to electronic versions, it would be useful to have more advice on the special opportunities and constraints related to screen display. Many of today’s thesauri exist only in an electronic version. The topics of multilingual thesauri (section I) and thesaurus integration (section L) are very important in relation to networked information retrieval, facilitating retrieval from many different sources and across many different languages and environments. The book provides an overview of projects and research within the field, but does not discuss in detail how to handle problems concerning morphological differences and the fluid, fuzzy meanings of terms caused by socio‐cultural and disciplinary differences.

The expansion between the third and fourth editions is only six pages, of which the bibliography contributes three, and the index another one. Only a few changes differentiate the two editions: the presentation of the corporate taxonomy and its connection to networking, full‐text systems and the use of automatic, computer‐based techniques. During its lifetime the book has continuously been updated and new related topics have been introduced. Thus it is a very valuable work providing basic knowledge about thesaurus construction with reference to new, appropriate literature. This fourth edition is not a must for the reader familiar with the third edition, but for new readers it is recommended as a comprehensive reference work to consult when using or developing thesauri. However, both old and new readers, experts and novices would profit from a fifth, rewritten edition discussing in detail the new era of thesauri and putting it into the context of new technologies.

References

Clarke, D. and Yancey, T. (2001), “Twenty‐first century tools for vocabulary management and indexing”, paper presented at the 2001 Annual Meeting of the American Society for Information Science and Technology, Washington, DC, available at: www.synaptica.com/asis_2001.asp (accessed 19 December 2001).

Gilchrist, A. and Kibby, P. (2000), Taxonomies for Business: Access and Connectivity in a Wired World, TFPL, London.

Hunter, J. (2001), “MetaNet – a metadata term thesaurus to enable semantic interoperability between metadata domains”, Journal of Digital Information, special issue on Networked Knowledge Organization Systems, Vol. 1 No. 8, available at: http://jodi.ecs.soton.ac.uk/Articles/v01/i108/Hunter (accessed 23 January 2002).

Lykke Nielsen, M. (2001), “A framework for work task based thesaurus design”, Journal of Documentation, Vol. 57 No. 6, pp. 774‐97.

Pejtersen, A.M. (1980), “Design of a classification scheme for fiction based on an analysis of actual user‐librarian communication, and use of the scheme of control of librarians’ search strategies”, in Harbo, O. and Kajberg, L. (Eds), Theory and Application of Information Research, Proceedings of the 2nd International Research Forum on Information Science, Mansell, London, pp. 146‐9.

Soergel, D. (1985), Organizing Information, Academic Press, San Diego, CA.

Soergel, D. (1999), “The rise of ontologies or the reinvention of classification”, Journal of the American Society for Information Science, Vol. 50 No. 12, pp. 1119‐20.
