
Exploiting Background Information in Knowledge Discovery from Text

Journal of Intelligent Information Systems

Abstract

This paper describes the FACT system for knowledge discovery from text. It discovers associations, that is, patterns of co-occurrence, among the keywords labeling the items in a collection of textual documents. In addition, when background knowledge is available about the keywords labeling the documents, FACT is able to use this information in its discovery process. FACT takes a query-centered view of knowledge discovery, in which a discovery request is viewed as a query over the implicit set of possible results supported by a collection of documents, and where background knowledge is used to specify constraints on the desired results of this query process. Execution of a knowledge-discovery query is structured so that these background-knowledge constraints can be exploited in the search for possible results. Finally, rather than requiring a user to specify an explicit query expression in the knowledge-discovery query language, FACT presents the user with a simple-to-use graphical interface to the query language, with the language providing a well-defined semantics for the discovery actions performed by a user through the interface.
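To make the idea of constrained association discovery concrete, the following is a minimal sketch, not FACT's actual algorithm or query language. It assumes documents are represented as sets of keywords, that background knowledge is a hypothetical mapping from category names to keyword sets, and that associations are scored with the usual support and confidence measures. The function name `find_associations`, the sample documents, and the thresholds are all invented for illustration. The constraints are applied while counting, so keyword pairs that cannot satisfy them are never enumerated.

```python
from collections import Counter

def find_associations(docs, lhs_allowed, rhs_allowed, min_support, min_confidence):
    """Find rules x => y with x in lhs_allowed and y in rhs_allowed.

    Support is the number of documents labeled with both x and y;
    confidence is support divided by the number of documents labeled with x.
    """
    keyword_count = Counter()  # x -> number of documents containing x
    pair_count = Counter()     # (x, y) -> number of documents containing both
    for keywords in docs:
        # Apply the background-knowledge constraints before pairing,
        # so only admissible keywords are ever combined.
        lhs = keywords & lhs_allowed
        rhs = keywords & rhs_allowed
        for x in lhs:
            keyword_count[x] += 1
            for y in rhs:
                if x != y:
                    pair_count[(x, y)] += 1
    rules = []
    for (x, y), support in pair_count.items():
        confidence = support / keyword_count[x]
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return sorted(rules)

# Hypothetical keyword-labeled documents.
docs = [
    {"iran", "usa", "oil"},
    {"iran", "oil"},
    {"usa", "wheat"},
    {"canada", "wheat"},
    {"canada", "usa", "wheat"},
]

# Hypothetical background knowledge: which keywords belong to which category.
background = {
    "country": {"iran", "usa", "canada"},
    "commodity": {"oil", "wheat"},
}

# Query: associations from a country to a commodity.
rules = find_associations(
    docs,
    lhs_allowed=background["country"],
    rhs_allowed=background["commodity"],
    min_support=2,
    min_confidence=0.6,
)
for x, y, support, confidence in rules:
    print(f"{x} => {y} (support={support}, confidence={confidence:.2f})")
```

Raising `min_confidence` to 0.7 in this toy data drops the `usa => wheat` rule, since only two of the three documents labeled `usa` are also labeled `wheat`.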




Cite this article

Feldman, R., Hirsh, H. Exploiting Background Information in Knowledge Discovery from Text. Journal of Intelligent Information Systems 9, 83–97 (1997). https://doi.org/10.1023/A:1008693204338


  • DOI: https://doi.org/10.1023/A:1008693204338
