DOI: 10.1145/3308558.3313636
research-article

Large Scale Semantic Indexing with Deep Level-wise Extreme Multi-label Learning

Published: 13 May 2019

ABSTRACT

Domain ontologies are widely used to index literature and make it easier to retrieve. Because manually curating key aspects of the scientific literature is costly, automated methods are needed to assist the process of semantic indexing. The task is challenging, however, because a domain ontology involves a very large number of terms and complex hierarchical relations among them. In this paper, to mitigate the curse of dimensionality and improve training efficiency, we propose an approach named Deep Level-wise Extreme Multi-label Learning and Classification (Deep Level-wise XMLC) to facilitate the semantic indexing of literature. Deep Level-wise XMLC is composed of two sequential modules. The first module, deep level-wise multi-label learning, decomposes the terms of a domain ontology into multiple levels and builds a special convolutional neural network for each level, with category-dependent dynamic max pooling and macro F-measure based weight tuning. The second module, a hierarchical pointer generation model, merges the level-wise outputs into a final summarized semantic indexing. We demonstrate the effectiveness of Deep Level-wise XMLC by comparing it with several state-of-the-art methods on two tasks: automatic MeSH labeling of literature from PubMed MEDLINE and automatic labeling of AmazonCat13K.
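As a rough illustration of the level-wise decomposition step described in the abstract, the sketch below assigns each term of a toy hierarchy to a level and then splits one document's labels by level, so that each level could be handed to its own classifier. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy hierarchy, and the rule that a term with several parents takes its shallowest depth are all hypothetical choices made for this example.

```python
from collections import deque


def term_levels(parents):
    """Map each term to a level; terms with no parents are level 1.

    `parents` maps every term to a list of its parent terms (assumption:
    every term appears as a key). A term reachable by several paths gets
    its shallowest depth, because breadth-first search visits it first
    along the shortest path.
    """
    children = {}
    for term, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(term)

    levels = {term: 1 for term, term_parents in parents.items() if not term_parents}
    queue = deque(sorted(levels))
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in levels:
                levels[child] = levels[term] + 1
                queue.append(child)
    return levels


def split_labels_by_level(doc_labels, levels):
    """Group one document's labels by ontology level."""
    by_level = {}
    for label in doc_labels:
        by_level.setdefault(levels[label], []).append(label)
    return {level: sorted(labels) for level, labels in by_level.items()}


# Toy hierarchy for illustration only; it is not the real MeSH tree.
TOY_PARENTS = {
    "Diseases": [],
    "Neoplasms": ["Diseases"],
    "Infections": ["Diseases"],
    "Carcinoma": ["Neoplasms"],
}

LEVELS = term_levels(TOY_PARENTS)
PER_LEVEL = split_labels_by_level(["Carcinoma", "Diseases", "Infections"], LEVELS)
print(LEVELS)
print(PER_LEVEL)
```

Splitting the label space this way means each per-level model only has to score the terms at its own level rather than the whole ontology, which is the dimensionality reduction the abstract alludes to.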

  • Published in

    ACM Other conferences
    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%