DOI: 10.1145/1216295.1216323
Article

Active EM to reduce noise in activity recognition

Published: 28 January 2007

ABSTRACT

Intelligent desktop environments allow the desktop user to define a set of projects or activities that characterize the user's desktop work. These environments then attempt to identify the current activity of the user in order to provide various kinds of assistance. These systems take a hybrid approach: they let the user declare the current activity, but they also employ learned classifiers to predict it in those cases where the user forgets to do so. The classifiers must be trained on the very noisy data obtained from the user's activity declarations. Instead of asking the user to review and relabel the data manually, we employ an active EM algorithm that combines the EM algorithm and active learning. EM can be viewed as retraining on its own predictions. To make it more robust, we only retrain on those predictions that are made with high confidence. For active learning, we make a small number of queries to the user based on the most uncertain instances. Experimental results on real users show that this active EM algorithm can significantly improve prediction precision and that it performs better than either EM or active learning alone.
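
The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, the features, the confidence threshold, or the query budget, so the nearest-centroid classifier on one-dimensional points, the `confidence=0.9` threshold, the `queries_per_round` budget, and the `oracle` callback standing in for the user are all assumptions made for illustration. The EM part is approximated by hard relabeling (retraining on the model's own confident predictions), as the abstract's informal description suggests.

```python
import math


def train_centroids(points, labels):
    """Fit one mean per class (a stand-in classifier, assumed for this sketch)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {y: -(x - c) ** 2 for y, c in centroids.items()}
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def active_em(points, noisy_labels, oracle, rounds=5,
              confidence=0.9, queries_per_round=1):
    """Alternate confident self-relabeling with a few queries to the user."""
    labels = list(noisy_labels)
    verified = set()  # indices labeled by the oracle; never overwritten
    for _ in range(rounds):
        centroids = train_centroids(points, labels)
        probs = [predict_proba(centroids, x) for x in points]

        # EM-like step: adopt the model's own prediction only when it is
        # made with high confidence; low-confidence labels stay unchanged.
        for i, p in enumerate(probs):
            if i in verified:
                continue
            best = max(p, key=p.get)
            if p[best] >= confidence:
                labels[i] = best

        # Active-learning step: ask the user about the most uncertain
        # instances (lowest top-class probability) not yet verified.
        candidates = sorted(
            (i for i in range(len(points)) if i not in verified),
            key=lambda i: max(probs[i].values()),
        )
        for i in candidates[:queries_per_round]:
            labels[i] = oracle(i)
            verified.add(i)
    return labels


# Toy data (hypothetical): two activities, with two declarations mislabeled.
points = [0.0, 1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 10.0]
true_labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
noisy_labels = ["a", "b", "a", "a", "b", "b", "a", "b"]

queried = []


def oracle(i):
    """Simulated user who answers a query with the true activity."""
    queried.append(i)
    return true_labels[i]


cleaned = active_em(points, noisy_labels, oracle)
```

In this toy run the user is asked about one instance per round, so the query count stays small relative to the data set, which is the trade-off the abstract describes. Raising `confidence` makes the self-relabeling step more conservative and shifts more of the correction work onto the queries.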


Published in

IUI '07: Proceedings of the 12th international conference on Intelligent user interfaces
January 2007
388 pages
ISBN: 1595934812
DOI: 10.1145/1216295

Copyright © 2007 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 746 of 2,811 submissions, 27%
