Supervised learning in the wild: Text classification for critical technologies | IEEE Conference Publication | IEEE Xplore

Abstract:

We explore the problem of locating documents pertaining to critical technologies (e.g., restricted, proprietary, or sensitive technical information) within a massive and highly heterogeneous collection of largely unimportant files. We present a system that uses supervised machine learning (i.e., pattern recognition) to detect such critical documents. To address difficult or ambiguous instances, we supplement the text classifier with an automated keyword search: we automatically extract discriminative terms (i.e., keywords) from the training set and match them against documents during classification. We demonstrate the effectiveness of this hybrid approach through a series of validation tests and case studies.
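
The hybrid idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the classifier, the keyword-scoring method, or the rule for deciding when an instance is ambiguous, so everything below is an assumption made for illustration: a multinomial naive Bayes classifier with add-one smoothing, keywords ranked by smoothed log-odds toward the critical class, and a hypothetical `margin` parameter that marks a score as ambiguous. The class name `HybridClassifier`, the parameter `n_keywords`, and the toy training documents are likewise invented here and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


class HybridClassifier:
    """Naive Bayes text classifier with a keyword-match fallback (illustrative only)."""

    def __init__(self, n_keywords=3, margin=1.0):
        self.n_keywords = n_keywords  # assumption: number of keywords to extract
        self.margin = margin          # assumption: |log-odds| below this is "ambiguous"

    def fit(self, docs, labels):
        # Labels: 1 = critical, 0 = unimportant.
        self.counts = {0: Counter(), 1: Counter()}
        self.n_docs = Counter(labels)
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        self.totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        # Keywords: terms with the highest smoothed log-odds toward the critical
        # class; ties are broken alphabetically so the result is deterministic.
        ranked = sorted(self.vocab, key=lambda t: (-self._term_log_odds(t), t))
        self.keywords = set(ranked[: self.n_keywords])
        return self

    def _log_prob(self, term, label):
        # Add-one (Laplace) smoothed estimate of log P(term | label).
        return math.log(
            (self.counts[label][term] + 1) / (self.totals[label] + len(self.vocab))
        )

    def _term_log_odds(self, term):
        return self._log_prob(term, 1) - self._log_prob(term, 0)

    def log_odds(self, doc):
        # Class prior plus per-term evidence; out-of-vocabulary terms are ignored.
        score = math.log(self.n_docs[1] / self.n_docs[0])
        for term in tokenize(doc):
            if term in self.vocab:
                score += self._term_log_odds(term)
        return score

    def predict(self, doc):
        score = self.log_odds(doc)
        if abs(score) >= self.margin:
            return int(score > 0)  # the classifier is confident enough on its own
        # Ambiguous instance: fall back to matching the extracted keywords.
        return int(any(term in self.keywords for term in tokenize(doc)))


# Toy training set, invented for this sketch.
train_docs = [
    "turbine blade alloy specification restricted",
    "alloy coating process for turbine blade",
    "lunch menu for the cafeteria",
    "parking schedule and cafeteria hours",
]
train_labels = [1, 1, 0, 0]

clf = HybridClassifier().fit(train_docs, train_labels)
```

With this toy data, a document such as "lunch near the turbine" scores inside the ambiguous band, so the decision is handed to the keyword match rather than to the classifier alone. In a realistic setting the margin and the number of keywords would need to be tuned on held-out data.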
Date of Conference: 29 October 2012 - 01 November 2012
Date Added to IEEE Xplore: 28 January 2013
Conference Location: Orlando, FL, USA
