DOI: 10.1145/2009916.2010090
poster

Towards effective short text deep classification

Published: 24 July 2011

Abstract

Short texts such as ads and tweets are increasingly common on the Web. Classifying short texts into a large taxonomy such as ODP or the Wikipedia category system has become an important mining task for improving the performance of many applications, such as contextual advertising and topic detection for micro-blogging. In this paper, we propose a novel multi-stage classification approach to this problem. First, explicit semantic analysis is used to add more features to both short texts and categories. Second, we leverage information retrieval technologies to fetch the most relevant categories for an input short text from thousands of candidates. Finally, an SVM classifier is applied to only the few selected categories to return the final answer. Our experimental results show that the proposed method achieves significant improvements in classification accuracy over several existing state-of-the-art approaches.
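The three stages described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the concept index `CONCEPTS`, the training set `TRAIN`, the cosine-similarity retrieval, the subgradient-descent SVM, and all function names and parameter values are assumptions chosen only to keep the example runnable with the Python standard library. Real explicit semantic analysis derives concept weights from Wikipedia, and a real system would retrieve from thousands of categories with a proper search index.

```python
import math
from collections import Counter

# Hypothetical toy concept index: word -> {concept: weight}.
# Real ESA derives such weights from Wikipedia; these values are made up.
CONCEPTS = {
    "laptop": {"Computing": 1.0},
    "gpu": {"Computing": 0.9},
    "goal": {"Football": 1.0},
    "striker": {"Football": 0.8},
    "recipe": {"Cooking": 1.0},
    "oven": {"Cooking": 0.7},
}

# Hypothetical labelled short texts (text, category).
TRAIN = [
    ("cheap laptop deals", "Computers"),
    ("new gpu benchmark", "Computers"),
    ("late goal wins match", "Sports"),
    ("striker signs contract", "Sports"),
    ("easy pasta recipe", "Food"),
    ("preheat the oven", "Food"),
]

def expand(text):
    """Stage 1: bag of words plus concept features (prefixed with 'C:')."""
    vec = Counter(text.lower().split())
    for word, count in list(vec.items()):
        for concept, weight in CONCEPTS.get(word, {}).items():
            vec["C:" + concept] += weight * count
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, category_vecs, k):
    """Stage 2: rank all categories by similarity and keep the top k."""
    ranked = sorted(category_vecs,
                    key=lambda c: cosine(query_vec, category_vecs[c]),
                    reverse=True)
    return ranked[:k]

def train_svm(examples, positive, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM (no bias term) trained by subgradient descent on hinge loss."""
    w = {}
    for _ in range(epochs):
        for vec, label in examples:
            y = 1.0 if label == positive else -1.0
            margin = y * sum(w.get(f, 0.0) * v for f, v in vec.items())
            for f in w:                      # L2 regularization shrink
                w[f] *= 1.0 - lr * lam
            if margin < 1.0:                 # hinge loss is active
                for f, v in vec.items():
                    w[f] = w.get(f, 0.0) + lr * y * v
    return w

def classify(text, train, k=2):
    """Run the three stages and return the predicted category."""
    query = expand(text)
    # Represent each category by the sum of its expanded training texts.
    category_vecs = {}
    for t, label in train:
        category_vecs.setdefault(label, Counter()).update(expand(t))
    candidates = retrieve(query, category_vecs, k)
    # Stage 3: train one-vs-rest SVMs on the shortlisted categories only.
    subset = [(expand(t), label) for t, label in train if label in candidates]
    scores = {}
    for c in candidates:
        w = train_svm(subset, c)
        scores[c] = sum(w.get(f, 0.0) * v for f, v in query.items())
    return max(scores, key=scores.get)

print(classify("gpu for gaming", TRAIN))  # prints "Computers"
```

The point of the shortlist in stage 2 is cost: the SVM stage only ever sees the `k` retrieved categories, so its training and scoring work does not grow with the size of the full taxonomy.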

    Published In

    SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
    July 2011
    1374 pages
    ISBN:9781450307574
    DOI:10.1145/2009916

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. large scale hierarchy
    3. short text

    Qualifiers

    • Poster

    Conference

    SIGIR '11

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
