ABSTRACT
This paper presents a Bayesian inference network model for automatic indexing with index terms (descriptors) from a prescribed vocabulary. The model requires two resources: an indexing dictionary whose rules map terms of the subject field onto descriptors, and inverted lists for both the terms occurring in a set of documents from that subject field and the descriptors manually assigned to those documents. The indexing dictionary can be derived automatically from a set of manually indexed documents. An application of the network model is described, followed by an indexing example and experimental results on the model's indexing performance.
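To make the dictionary-derivation step concrete, the following is a minimal sketch, not the paper's actual procedure. It assumes each term-to-descriptor rule is weighted by the relative frequency with which the descriptor was manually assigned to documents containing the term, computed from the two kinds of inverted lists. The function names, the `min_support` parameter, and the toy documents are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_inverted_lists(docs):
    """Build inverted lists: term -> doc ids, and descriptor -> doc ids.

    `docs` maps a document id to a pair (terms, manually assigned descriptors).
    """
    term_docs = defaultdict(set)
    desc_docs = defaultdict(set)
    for doc_id, (terms, descriptors) in docs.items():
        for term in terms:
            term_docs[term].add(doc_id)
        for descriptor in descriptors:
            desc_docs[descriptor].add(doc_id)
    return term_docs, desc_docs


def derive_dictionary(term_docs, desc_docs, min_support=1):
    """Derive rules (term, descriptor) -> weight.

    Assumption (not taken from the paper): the weight is the fraction of
    documents containing the term that were assigned the descriptor.
    Pairs co-occurring in fewer than `min_support` documents are dropped.
    """
    rules = {}
    for term, term_ids in term_docs.items():
        for descriptor, desc_ids in desc_docs.items():
            joint = len(term_ids & desc_ids)
            if joint >= min_support:
                rules[(term, descriptor)] = joint / len(term_ids)
    return rules


# Hypothetical toy collection of manually indexed documents.
docs = {
    1: ({"neutron", "scattering"}, {"NEUTRON SCATTERING"}),
    2: ({"neutron", "reactor"}, {"REACTORS"}),
    3: ({"scattering", "laser"}, {"LIGHT SCATTERING"}),
}
term_docs, desc_docs = build_inverted_lists(docs)
rules = derive_dictionary(term_docs, desc_docs)
```

In this toy collection, "neutron" appears in two documents and only one of them carries the descriptor NEUTRON SCATTERING, so that rule receives weight 0.5. In an inference network, such weights could plausibly serve as the conditional probabilities on the links from term nodes to descriptor nodes, although the abstract does not say how the model estimates them.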