DOI: 10.1145/2505515.2505676
research-article

Community question topic categorization via hierarchical kernelized classification

Published: 27 October 2013

ABSTRACT

We present a hierarchical kernelized classification model that automatically classifies general questions into their corresponding topic categories in community Question Answering (cQA) services. Automatic classification could save much of the effort of manual classification and make questions in cQA archives easier to browse and retrieve. To cope with the short text of questions, we explore various cQA features and combine them optimally by introducing a multiple kernel learning strategy into the hierarchical classification framework. We propose a hybrid regularization approach that combines an orthogonality constraint with L1 sparsity, both to improve discriminative power on similar topics and to sparsify the model parameters. Experimental results on a real-world dataset from Yahoo! Answers demonstrate the effectiveness of the proposed model compared with state-of-the-art methods and strong baselines.
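The abstract does not give the model's equations, so the following is only a minimal sketch of the two generic ideas it names, not the authors' formulation. It assumes that multiple kernel learning can be illustrated by a convex combination of base kernels with fixed weights (a real MKL method would learn them), and that the hybrid regularizer can be illustrated as an orthogonality term between a parent and a child weight vector plus an L1 term. All function names, the choice of base kernels, and the parameters `gamma`, `lam_orth`, and `lam_l1` are hypothetical.

```python
import numpy as np


def linear_kernel(X, Y):
    """Gram matrix of plain dot products between rows of X and rows of Y."""
    return X @ Y.T


def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative, hypothetical value."""
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * (X @ Y.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def combine_kernels(kernels, weights):
    """Convex combination of base kernels (weights are fixed here, not learned)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return sum(w * K for w, K in zip(weights, kernels))


def hybrid_penalty(w_parent, w_child, lam_orth=1.0, lam_l1=0.1):
    """Assumed hybrid regularizer: orthogonality term plus L1 sparsity term."""
    orth = abs(float(w_parent @ w_child))  # zero when the vectors are orthogonal
    l1 = float(np.abs(w_child).sum())      # encourages sparse child weights
    return lam_orth * orth + lam_l1 * l1
```

In this sketch, each base kernel would be computed from a different cQA feature view of the same questions, and the combined Gram matrix could then be passed to any kernel classifier that accepts a precomputed kernel. The penalty is smallest when a child category's weight vector is both sparse and orthogonal to its parent's, which is one way to read the abstract's goal of separating similar topics.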


Published in

CIKM '13: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

CIKM '13 paper acceptance rate: 143 of 848 submissions, 17%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
