Abstract
We consider the problem of prioritizing a collection of discrete pieces of information, or transactions. The goal is to rank the transactions so that the user can best pursue a subset of them in hopes of discovering those generated by an interesting source. The problem is shown to differ from traditional classification in several fundamental ways. Ranking algorithms are divided into classes, depending on the amount of information they may utilize. We demonstrate that while ranking by the least constrained algorithm class is consistent with classification, the same does not hold for a more constrained class of algorithms. We demonstrate also that while optimal ranking by the former class is “easy”, optimal ranking by the latter class is NP-hard. Finally, we present detectors that optimally solve restricted versions of the ranking problem, including symmetric anomaly detection.
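To make the prioritization task concrete, the following sketch ranks transactions by a hypothetical suspicion score. This is an illustration only, not the paper's algorithm: the likelihood-ratio score, the two source models, and all names here are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    ident: str
    p_interesting: float  # assumed probability under an "interesting source" model
    p_background: float   # assumed probability under a background model

def rank(transactions):
    """Sort transactions so the most promising come first, scoring each
    by the likelihood ratio p_interesting / p_background (illustrative only)."""
    return sorted(transactions,
                  key=lambda t: t.p_interesting / t.p_background,
                  reverse=True)

txs = [Transaction("a", 0.10, 0.50),
       Transaction("b", 0.40, 0.10),
       Transaction("c", 0.20, 0.20)]
ranked = rank(txs)  # "b" ranks first: its ratio (4.0) is highest
```

A user would then examine transactions from the top of this ordering, pursuing only as many as their budget allows; the paper's contribution concerns what orderings are achievable and how hard they are to compute under different information constraints.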
Helman, P., Gore, R. Prioritizing Information for the Discovery of Phenomena. Journal of Intelligent Information Systems 11, 99–138 (1998). https://doi.org/10.1023/A:1008628802726