
Learning a Joint Representation for Classification of Networked Documents


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11305)

Abstract

Recently, several researchers have incorporated network information to enhance document classification. However, these methods are tied to specific network representations and cannot exploit different representations to take advantage of data-specific properties. Moreover, they do not use the complementary information that each source provides to the other, and they do not fully leverage the label information. In this paper, we propose CrossTL, a novel representation model, to find better representations for classification. CrossTL improves learning at three levels: (1) at the input level, it is a general framework that can accommodate any useful text or graph embeddings; (2) at the structure level, it learns a text-to-link and link-to-text representation to comprehensively describe the data; (3) at the objective level, it bounds the error rate by incorporating four types of losses, i.e., on the text, on the links, and on the combination and disagreement of text and link, into the loss function. Extensive experimental results demonstrate that CrossTL significantly outperforms state-of-the-art representations on datasets with either rich or poor texts and links.
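
The objective level described in the abstract combines four loss terms: a text loss, a link loss, a loss on their combination, and a disagreement penalty between the two views. The sketch below (PyTorch) is a minimal illustration of how such a four-part objective could be assembled; the function name, the use of cross-entropy for the classification terms, the mean-squared-error disagreement penalty, and the weights are assumptions for illustration, not the paper's actual formulation.

    import torch
    import torch.nn.functional as F

    def combined_loss(text_logits, link_logits, joint_logits, labels,
                      text_repr, link_repr, weights=(1.0, 1.0, 1.0, 1.0)):
        # Classification losses from the text view, the link view,
        # and the combined (joint) view of each document.
        loss_text = F.cross_entropy(text_logits, labels)
        loss_link = F.cross_entropy(link_logits, labels)
        loss_joint = F.cross_entropy(joint_logits, labels)
        # Disagreement term: penalize the two views for producing
        # inconsistent representations of the same document.
        loss_disagree = F.mse_loss(text_repr, link_repr)
        w_t, w_l, w_j, w_d = weights
        return (w_t * loss_text + w_l * loss_link
                + w_j * loss_joint + w_d * loss_disagree)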



Acknowledgments

The work described in this paper has been supported in part by the NSFC project (61572376).

Author information


Corresponding author

Correspondence to Tieyun Qian.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

You, Z., Qian, T. (2018). Learning a Joint Representation for Classification of Networked Documents. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11305. Springer, Cham. https://doi.org/10.1007/978-3-030-04221-9_18


  • DOI: https://doi.org/10.1007/978-3-030-04221-9_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04220-2

  • Online ISBN: 978-3-030-04221-9

  • eBook Packages: Computer Science, Computer Science (R0)
