Prioritizing Information for the Discovery of Phenomena

Journal of Intelligent Information Systems

Abstract

We consider the problem of prioritizing a collection of discrete pieces of information, or transactions. The goal is to rank the transactions so that the user can pursue a subset of them with the best chance of discovering those that were generated by an interesting source. The problem is shown to differ from traditional classification in several fundamental ways. Ranking algorithms are divided into classes, depending on the amount of information they may utilize. We demonstrate that while ranking by the least constrained algorithm class is consistent with classification, this is not the case for a more constrained class of algorithms. We also demonstrate that while optimal ranking by the former class is “easy”, optimal ranking by the latter class is NP-hard. Finally, we present detectors that optimally solve restricted versions of the ranking problem, including symmetric anomaly detection.

Cite this article

Helman, P., Gore, R. Prioritizing Information for the Discovery of Phenomena. Journal of Intelligent Information Systems 11, 99–138 (1998). https://doi.org/10.1023/A:1008628802726

  • DOI: https://doi.org/10.1023/A:1008628802726
