ABSTRACT
Information filtering includes monitoring text streams to detect patterns that are more complex than those handled by search engines. Text stream monitoring and pattern detection have far-reaching applications, such as tracking information flow among terrorist outfits, web parental control, and business intelligence. The pattern characterization requirements of these applications call for a more expressive pattern specification language than what is currently provided by Information Retrieval Query Languages (IRQLs) and current information filtering systems. Pattern specification alone does not suffice; detecting these complex patterns is equally important if these systems are to be used in real-world applications. InfoFilter, the content-based information filtering system presented in this paper, allows users to specify complex patterns and detects them in incoming text streams from various sources such as news feeds, emails, web pages, and caption text from streaming videos. Complex patterns, such as combinations of sequential and structural patterns, wild cards, word frequencies, proximity, Boolean operators, and synonyms, are formulated using the expressive pattern specification language, PSL, proposed in this paper. Once specified, these complex patterns are detected using a data flow paradigm over Pattern Detection Graphs (PDGs).
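To make the data flow idea concrete, the following is a minimal sketch of how a proximity pattern might be detected over a small graph of nodes. It is not InfoFilter's implementation and does not use PSL syntax; the class names `Word` and `Near`, the `run` helper, and the `max_gap` parameter are all hypothetical and chosen only for illustration. Leaf nodes watch the token stream for a word and push each occurrence upward; an operator node combines the occurrences it receives from its children.

```python
# Illustrative sketch only: hypothetical names, not InfoFilter's API or PSL.

class Node:
    """Base graph node: forwards detected occurrences to its parents."""

    def __init__(self):
        self.parents = []  # list of (parent_node, input_slot)

    def emit(self, occurrence):
        for parent, slot in self.parents:
            parent.receive(slot, occurrence)


class Word(Node):
    """Leaf node: emits (start, end) token positions when its word occurs."""

    def __init__(self, word):
        super().__init__()
        self.word = word.lower()

    def feed(self, position, token):
        if token.lower() == self.word:
            self.emit((position, position))


class Near(Node):
    """Operator node: left child followed by right child within max_gap tokens."""

    def __init__(self, left, right, max_gap):
        super().__init__()
        self.max_gap = max_gap
        self.left_occurrences = []
        self.matches = []
        left.parents.append((self, 0))
        right.parents.append((self, 1))

    def receive(self, slot, occurrence):
        if slot == 0:
            # Remember left occurrences until a right occurrence arrives.
            self.left_occurrences.append(occurrence)
            return
        for start, end in self.left_occurrences:
            if 0 < occurrence[0] - end <= self.max_gap:
                match = (start, occurrence[1])
                self.matches.append(match)
                self.emit(match)  # a parent operator could consume this


def run(leaves, root, text):
    """Stream whitespace-separated tokens through the leaves; return root matches."""
    for position, token in enumerate(text.split()):
        for leaf in leaves:
            leaf.feed(position, token)
    return root.matches


merger, acquisition = Word("merger"), Word("acquisition")
pattern = Near(merger, acquisition, max_gap=5)
print(run([merger, acquisition], pattern,
          "merger talks stalled before the acquisition closed"))
```

Because each operator node emits its own matches upward, the same structure could in principle be nested to express combinations such as a Boolean condition over two proximity patterns; how the actual system composes operators is described in the paper itself, not in this sketch.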