Abstract
Recently, several researchers have incorporated network information to enhance document classification. However, these methods are tied to specific network representations and cannot exploit alternative representations to take advantage of data-specific properties. Moreover, they neither utilize the complementary information between text and links nor fully leverage the label information. In this paper, we propose CrossTL, a novel representation model that finds better representations for classification. CrossTL improves learning at three levels: (1) at the input level, it is a general framework that can accommodate any useful text or graph embeddings; (2) at the structure level, it learns text-to-link and link-to-text representations to comprehensively describe the data; (3) at the objective level, it bounds the error rate by incorporating four types of losses, i.e., the text loss, the link loss, and the combination and disagreement of text and link, into the loss function. Extensive experimental results demonstrate that CrossTL significantly outperforms state-of-the-art representations on datasets with either rich or poor texts and links.
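The abstract's four-term objective (text loss, link loss, and the combination and disagreement of text and link predictions) can be illustrated with a minimal sketch. The paper's exact formulation is not given here, so the choices below are assumptions: cross-entropy for the supervised terms, logit addition for the combined prediction, and a squared difference between the two predictive distributions as the disagreement penalty. The function name `crosstl_loss` and the weighting scheme `lambdas` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class.
    p = softmax(logits)
    n = len(labels)
    return -np.log(p[np.arange(n), labels] + 1e-12).mean()

def crosstl_loss(text_logits, link_logits, labels, lambdas=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical four-term objective: text, link, combination, disagreement."""
    l_text = cross_entropy(text_logits, labels)            # text-only loss
    l_link = cross_entropy(link_logits, labels)            # link-only loss
    l_comb = cross_entropy(text_logits + link_logits, labels)  # joint prediction
    # Disagreement: how far apart the two views' predictive distributions are.
    l_dis = np.mean((softmax(text_logits) - softmax(link_logits)) ** 2)
    a, b, c, d = lambdas
    return a * l_text + b * l_link + c * l_comb + d * l_dis
```

Minimizing the disagreement term pushes the text-based and link-based classifiers toward consistent predictions, which is one common way to exploit complementary views of the same document.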
Acknowledgments
The work described in this paper has been supported in part by the NSFC project (61572376).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
You, Z., Qian, T. (2018). Learning a Joint Representation for Classification of Networked Documents. In: Cheng, L., Leung, A., Ozawa, S. (eds.) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol. 11305. Springer, Cham. https://doi.org/10.1007/978-3-030-04221-9_18
Print ISBN: 978-3-030-04220-2
Online ISBN: 978-3-030-04221-9