ABSTRACT
Deploying a classifier to large-scale systems such as the web requires careful feature design and performance evaluation. Evaluation is particularly challenging because these large collections change frequently. In this paper we adapt stratified sampling techniques to evaluate the precision of classifiers deployed in large-scale systems. We investigate different stratification strategies and then derive a new online sampling algorithm that incrementally approximates the theoretically optimal disproportionate sampling strategy. In experiments, the proposed algorithm significantly outperforms both simple random sampling and other types of stratified sampling, reducing the labeling effort needed to reach the same confidence level and interval bounds on precision by about 20% on average.
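To make the idea of disproportionate stratified sampling concrete, the following is a minimal sketch, not the paper's online algorithm. It assumes that the "optimal" strategy corresponds to the classic Neyman allocation, in which each stratum receives a share of the labeling budget proportional to its size times the standard deviation of the label within it. The function names, the per-stratum precision guesses, and the omission of the finite population correction are all assumptions made for illustration.

```python
import math


def neyman_allocation(stratum_sizes, guessed_precisions, budget):
    """Split a labeling budget across strata in proportion to N_h * sigma_h.

    guessed_precisions are assumed prior guesses of per-stratum precision;
    sigma_h is the Bernoulli standard deviation sqrt(p_h * (1 - p_h)).
    """
    weights = [
        size * math.sqrt(p * (1.0 - p))
        for size, p in zip(stratum_sizes, guessed_precisions)
    ]
    total_weight = sum(weights)
    if total_weight == 0:
        # Every stratum looks pure, so fall back to proportional allocation.
        total_size = sum(stratum_sizes)
        return [budget * size / total_size for size in stratum_sizes]
    return [budget * w / total_weight for w in weights]


def stratified_precision(stratum_sizes, labeled):
    """Estimate overall precision and its variance from labeled samples.

    labeled holds one (num_sampled, num_correct) pair per stratum.
    The variance is a plug-in estimate that ignores the finite
    population correction (an assumption that fits very large strata).
    """
    total_size = sum(stratum_sizes)
    estimate = 0.0
    variance = 0.0
    for size, (num_sampled, num_correct) in zip(stratum_sizes, labeled):
        stratum_weight = size / total_size
        p_hat = num_correct / num_sampled
        estimate += stratum_weight * p_hat
        variance += stratum_weight ** 2 * p_hat * (1.0 - p_hat) / num_sampled
    return estimate, variance


if __name__ == "__main__":
    sizes = [1000, 1000]
    allocation = neyman_allocation(sizes, [0.5, 0.9], budget=100)
    print("labels per stratum:", allocation)
    estimate, variance = stratified_precision(sizes, [(50, 25), (50, 45)])
    half_width = 1.96 * math.sqrt(variance)
    print("precision estimate:", estimate, "+/-", half_width)
```

In this sketch the stratum whose precision is closer to 0.5 receives more labels, because its labels are more variable. An online variant would presumably re-estimate the per-stratum precisions as labels arrive and update the allocation accordingly.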
Index Terms
- Online stratified sampling: evaluating classifiers at web-scale