ABSTRACT
Intelligent desktop environments allow the desktop user to define a set of projects or activities that characterize the user's desktop work. These environments then attempt to identify the user's current activity in order to provide various kinds of assistance. They take a hybrid approach: the user can declare the current activity, and learned classifiers predict it to cover the cases where the user forgets to do so. The classifiers must be trained on the very noisy data obtained from the user's activity declarations. Instead of asking the user to review and relabel the data manually, we employ an active EM algorithm that combines the EM algorithm with active learning. EM can be viewed as retraining on its own predictions; to make it more robust, we retrain only on those predictions that are made with high confidence. For active learning, we ask the user a small number of queries about the most uncertain instances. Experimental results on real users show that this active EM algorithm can significantly improve prediction precision and that it performs better than either EM or active learning alone.
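The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the softmax confidence score, the 0.9 threshold, and the names `train_centroids`, `predict_proba`, `active_em`, and `ask_user` are all assumptions chosen to keep the example small. It only shows the two ingredients working together: retraining on high-confidence predictions, and querying the user about the least confident instances.

```python
import math


def train_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each centroid."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(x, mu))
              for c, mu in centroids.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def active_em(X, noisy_labels, ask_user, n_rounds=3, n_queries=2,
              threshold=0.9):
    """Hypothetical active-EM-style loop over noisily labeled data.

    ask_user(i) stands in for a query to the user and returns the
    label the user gives for instance i.
    """
    labels = list(noisy_labels)
    verified = set()                      # instances the user has confirmed
    train_idx = list(range(len(X)))       # start from all noisy labels
    for _ in range(n_rounds):
        model = train_centroids([X[i] for i in train_idx],
                                [labels[i] for i in train_idx])
        probs = [predict_proba(model, x) for x in X]
        conf = [max(p.values()) for p in probs]

        # Active learning step: query the least confident unverified items.
        unverified = sorted((i for i in range(len(X)) if i not in verified),
                            key=lambda i: conf[i])
        for i in unverified[:n_queries]:
            labels[i] = ask_user(i)
            verified.add(i)

        # EM-style step: retrain only on verified labels and on the
        # model's own predictions that clear the confidence threshold.
        kept = []
        for i in range(len(X)):
            if i in verified:
                kept.append(i)
            elif conf[i] >= threshold:
                labels[i] = max(probs[i], key=probs[i].get)
                kept.append(i)
        train_idx = kept or train_idx     # never train on an empty set

    model = train_centroids([X[i] for i in train_idx],
                            [labels[i] for i in train_idx])
    return labels, model
```

In this sketch, low-confidence instances that the user has not verified are simply left out of the next training round rather than trusted, which is one simple way to read "retrain only on high-confidence predictions". The number of user queries is bounded by `n_rounds * n_queries`.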