Associating Items with Scenes Identified in Social Q&A Data

Sato, Shin-ya; Takahashi, Masami; Matsuo, Masato

doi:10.1007/978-3-642-38333-5_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7652)

Abstract

We discuss the problem of discovering associations between typical situations (scenes) in our daily lives and their characteristic items, which refer to anything from real objects to imaginary beings or abstract concepts. Once scenes are associated with items, the scenes can be further computationally analyzed (e.g., compared, tracked) on the basis of their associated items. In our approach for mining such associations, a list L of items and a set D of Web documents, in which scenes are identified, are first prepared. Next, D is divided using latent Dirichlet allocation (LDA) into clusters, each of which can be regarded as corresponding to a distinct characteristic scene. Then, the relevance between the scenes and items in L is estimated on the basis of the statistical significance of the occurrence of items in the clusters. We developed two simple techniques for improving the quality (consistency) of the clustering result obtained using LDA, with the expectation that the improved clustering result yields better performance in revealing item-scene associations. The more effective of the two techniques, PACA, purifies original clusters (i.e., eliminates unwanted elements in each cluster) created using a clustering algorithm by using the outcome from another clustering algorithm. Through an experiment using pages in a social Q&A site, we verified the effectiveness of the cluster purification techniques and the total effectiveness of our approach of associating items with scenes.
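The relevance-estimation step can be illustrated with a small sketch. The abstract does not say which significance test is used, so the one-sided hypergeometric test below is an assumption, not the authors' method. The function names (`hypergeom_tail`, `associate_items`), the threshold `alpha`, and the input layout (one item list and one cluster label per document, with labels assumed to come from an earlier clustering step such as LDA) are all hypothetical.

```python
from math import comb


def hypergeom_tail(k, n_cluster, n_item, n_total):
    """P(X >= k): chance that at least k of the n_cluster documents in a
    cluster mention an item that occurs in n_item of n_total documents,
    if cluster membership were unrelated to the item."""
    upper = min(n_cluster, n_item)
    total = comb(n_total, n_cluster)
    hits = sum(
        comb(n_item, i) * comb(n_total - n_item, n_cluster - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def associate_items(doc_items, doc_cluster, alpha=0.05):
    """Map each cluster (scene) to the items that are significantly
    over-represented in it, sorted by ascending p-value.

    doc_items   -- list of item lists, one per document (items drawn from L)
    doc_cluster -- list of cluster labels, one per document (assumed given)
    alpha       -- hypothetical significance threshold
    """
    n_total = len(doc_items)
    cluster_size = {}
    for c in doc_cluster:
        cluster_size[c] = cluster_size.get(c, 0) + 1

    item_df = {}   # number of documents mentioning each item
    joint = {}     # number of documents in cluster c mentioning the item
    for items, c in zip(doc_items, doc_cluster):
        for item in set(items):  # count each item once per document
            item_df[item] = item_df.get(item, 0) + 1
            joint[(c, item)] = joint.get((c, item), 0) + 1

    result = {c: [] for c in cluster_size}
    for (c, item), k in joint.items():
        p = hypergeom_tail(k, cluster_size[c], item_df[item], n_total)
        if p < alpha:
            result[c].append((item, p))
    for c in result:
        result[c].sort(key=lambda pair: pair[1])
    return result
```

Under these assumptions, an item that appears in every document regardless of cluster gets a p-value of 1 and is never associated with a scene, while an item concentrated in one cluster gets a small p-value there.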

Author information

Authors and Affiliations

NTT Network Innovation Laboratories, 3-9-11, Midori-cho, Musashino-shi, Tokyo, Japan
Shin-ya Sato, Masami Takahashi & Masato Matsuo

Editor information

Editors and Affiliations

Information Engineering Laboratory, CSIRO ICT Centre, Australia
Armin Haller
Victoria University, Melbourne, Australia
Guangyan Huang
Department of Computer Science, Vrije University, Amsterdam, The Netherlands
Zhisheng Huang
The University of New South Wales, Sydney, NSW, Australia
Hye-young Paik
Department of Computer Science, Adelaide University, 5005, Adelaide, SA, Australia
Quan Z. Sheng

Cite this paper

Sato, Sy., Takahashi, M., Matsuo, M. (2013). Associating Items with Scenes Identified in Social Q&A Data. In: Haller, A., Huang, G., Huang, Z., Paik, Hy., Sheng, Q.Z. (eds) Web Information Systems Engineering – WISE 2011 and 2012 Workshops. WISE 2011, WISE 2012. Lecture Notes in Computer Science, vol 7652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38333-5_20

DOI: https://doi.org/10.1007/978-3-642-38333-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38332-8
Online ISBN: 978-3-642-38333-5
eBook Packages: Computer Science, Computer Science (R0)
