Abstract
In real applications of inductive learning, labeled instances are often scarce. Two common remedies are active learning, which asks experts to label informative instances, and transfer learning, which borrows useful information from abundant labeled instances in a source domain. Because querying experts is costly, it is promising to integrate the two methodologies into a more robust and reliable classification framework that compensates for the disadvantages of each. Recently, a few studies have investigated such an integration, known as transfer active learning. However, when unrelated domains with different distributions or label assignments are involved, negative transfer inevitably occurs and degrades performance. Moreover, how to avoid selecting irrelevant samples to query remains an open question. To tackle these issues, we propose a hybrid algorithm for active learning aided by transfer learning, which adopts a divergence measure to quantify the similarity between domains so that negative effects can be alleviated. To avoid querying irrelevant instances, we also present an adaptive strategy that eliminates unnecessary instances from the input space and unnecessary models from the model space. Extensive experiments on both synthetic and real data sets show that our algorithm queries fewer instances and converges faster than state-of-the-art methods.
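The abstract does not specify which divergence measure is used to compare domains; the cited literature suggests a Shannon-entropy-based measure such as the Jensen-Shannon divergence. As a minimal sketch (not the paper's actual method), the snippet below estimates empirical feature distributions for a source and a target domain and computes their Jensen-Shannon divergence with log base 2, which is bounded in [0, 1]. All function names here are illustrative assumptions.

```python
import math
from collections import Counter

def distribution(samples):
    """Empirical distribution of a list of discrete feature values."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base-2 logs, so the value lies in [0, 1])."""
    # Mixture distribution m = (p + q) / 2 over the union of supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in set(p) | set(q)}
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a[k] == 0 contribute nothing.
        return sum(v * math.log2(v / b[k]) for k, v in a.items() if v > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

source = distribution(["a", "a", "b", "b", "c"])
target_close = distribution(["a", "b", "b", "c", "c"])   # similar domain
target_far = distribution(["x", "x", "y", "z", "z"])     # disjoint support

# A related domain yields a small divergence; a fully unrelated one
# (disjoint support) yields the maximum value of 1.
print(js_divergence(source, target_close))
print(js_divergence(source, target_far))
```

In a transfer active learning setting, such a score could down-weight or discard source domains whose divergence from the target exceeds a threshold, which is one plausible way the "negative transfer" effect described above can be alleviated.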
© 2012 Springer-Verlag Berlin Heidelberg
Shao, H., Tong, B., Suzuki, E. (2012). Query by Committee in a Heterogeneous Environment. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science, vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_16
DOI: https://doi.org/10.1007/978-3-642-35527-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35526-4
Online ISBN: 978-3-642-35527-1