Selective Sampling with a Hierarchical Latent Variable Model

Mamitsuka, Hiroshi

doi:10.1007/978-3-540-45231-7_33

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2810)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

We present a new method that combines a hierarchical stochastic latent variable model with a selective sampling strategy for learning from co-occurrence events, a fundamental problem in intelligent data analysis. The hierarchical stochastic latent variable model we employ enables us to use existing background knowledge of observable co-occurrence events as a latent variable. The selective sampling strategy alternates between selecting plausible non-noise examples from a given data set and training a component stochastic model, thereby improving the predictive performance of the component model. Combining the model and the strategy is expected to enhance the performance of learning from real-world co-occurrence events. We have empirically tested our method on a real data set of protein-protein interactions, a typical data set of co-occurrence events. The experimental results show that the presented method significantly outperformed an existing approach and the other machine learning methods compared, and that it is highly effective for unsupervised learning from co-occurrence events.
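
The alternation between selecting examples and retraining can be illustrated with a minimal sketch. It is not the chapter's algorithm: it assumes a flat latent class (aspect) model P(x, y) = Σ_z P(z) P(x|z) P(y|z) trained by EM in place of the hierarchical model, and it assumes that "plausible non-noise examples" are simply the pairs with the highest probability under the current model. All function names and the parameters `keep_fraction` and `rounds` are hypothetical.

```python
import random


def train_aspect_model(pairs, n_x, n_y, n_z=2, n_iter=30, seed=0):
    """EM for P(x, y) = sum_z P(z) P(x|z) P(y|z) on a list of (x, y) index pairs."""
    rng = random.Random(seed)

    def normalized(n):
        weights = [rng.random() + 0.1 for _ in range(n)]
        total = sum(weights)
        return [w / total for w in weights]

    pz = normalized(n_z)
    px = [normalized(n_x) for _ in range(n_z)]
    py = [normalized(n_y) for _ in range(n_z)]
    eps = 1e-6  # smoothing so that unseen events keep a non-zero probability

    for _ in range(n_iter):
        cz = [eps] * n_z
        cx = [[eps] * n_x for _ in range(n_z)]
        cy = [[eps] * n_y for _ in range(n_z)]
        for x, y in pairs:
            # E-step: posterior over the latent class for this pair
            joint = [pz[z] * px[z][x] * py[z][y] for z in range(n_z)]
            total = sum(joint)
            for z in range(n_z):
                r = joint[z] / total
                cz[z] += r
                cx[z][x] += r
                cy[z][y] += r
        # M-step: renormalize the expected counts
        sz = sum(cz)
        pz = [c / sz for c in cz]
        px = [[c / sum(row) for c in row] for row in cx]
        py = [[c / sum(row) for c in row] for row in cy]
    return pz, px, py


def pair_probability(model, x, y):
    """Probability of the co-occurrence (x, y) under the aspect model."""
    pz, px, py = model
    return sum(pz[z] * px[z][x] * py[z][y] for z in range(len(pz)))


def selective_sampling(pairs, n_x, n_y, keep_fraction=0.8, rounds=3, **em_args):
    """Alternate between keeping the most plausible pairs and retraining."""
    selected = list(pairs)
    model = train_aspect_model(selected, n_x, n_y, **em_args)
    for _ in range(rounds):
        # Assumed selection rule: rank every original pair by model probability
        ranked = sorted(pairs, key=lambda p: pair_probability(model, *p), reverse=True)
        selected = ranked[:max(1, int(keep_fraction * len(pairs)))]
        model = train_aspect_model(selected, n_x, n_y, **em_args)
    return model, selected


# Toy data: two blocks of co-occurring items plus two cross-block pairs.
toy = [(x, y) for x in range(3) for y in range(3)] * 5
toy += [(x, y) for x in range(3, 6) for y in range(3, 6)] * 5
toy += [(0, 5), (4, 1)]
toy_model, toy_selected = selective_sampling(toy, n_x=6, n_y=6, rounds=2, n_iter=10)
print(len(toy), "pairs given,", len(toy_selected), "pairs kept")
```

The sketch re-ranks the full data set in every round rather than shrinking it cumulatively, so a pair discarded early can be selected again once the model improves. Whether the chapter's strategy behaves this way is not stated in the abstract.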

Author information

Authors and Affiliations

Institute for Chemical Research, Kyoto University, Gokasho Uji, 611-0011, Japan
Hiroshi Mamitsuka

Editor information

Editors and Affiliations

Berkeley Initiative in Soft Computing (BISC), University of California at Berkeley, USA
Michael R. Berthold
Freie Universität Berlin, Garystr. 21, 14195, Berlin, Germany
Hans-Joachim Lenz
Department of Computer Science, University of Colorado, Boulder, Colorado, USA
Elizabeth Bradley
Otto-von-Guericke-University of Magdeburg, Germany
Rudolf Kruse
Department of Knowledge Processing and Language Engineering, University of Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Christian Borgelt

About this paper

Cite this paper

Mamitsuka, H. (2003). Selective Sampling with a Hierarchical Latent Variable Model. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_33

DOI: https://doi.org/10.1007/978-3-540-45231-7_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40813-0
Online ISBN: 978-3-540-45231-7
eBook Packages: Springer Book Archive
