short-paper

Semantic Gated Network for Efficient News Representation

Authors:
Xuxiao Bu
Xi'an Jiaotong University, Xi'an, Shaanxi, China

Bingfeng Li
Tencent, Beijing, China

Yaxiong Wang
Xi'an Jiaotong University, Xi'an, Shaanxi, China

Jihua Zhu
Xi'an Jiaotong University, Xi'an, Shaanxi, China

Xueming Qian
Xi'an Jiaotong University, Xi'an, Shaanxi, China

Marco Zhao
Tencent, Beijing, China

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval, June 2020, Pages 251–255. https://doi.org/10.1145/3372278.3390719

Published: 08 June 2020


ABSTRACT

Learning an efficient news representation is a fundamental problem for many tasks. Most existing news-related methods use only textual information and ignore the visual cues in the illustrations. We argue that the textual title and tags, together with the visual illustrations, carry the main content of a piece of news and express it more efficiently. In this paper, we develop a novel framework, the Semantic Gated Network (SGN), which integrates the news title, tags, and visual illustrations into an efficient joint textual-visual feature for the news, with which we can directly measure the relevance between two pieces of news. Specifically, we first obtain the tag embeddings from the proposed self-supervised classification model. In addition, the news title is fed into a sentence encoder pretrained on pairs of semantically relevant news to learn efficient contextualized word vectors. The feature of the news title is then extracted from the learned vectors and combined with the tag features to obtain the textual feature. Finally, we design a novel mechanism, named the semantic gate, to adaptively fuse the textual feature and the image feature. Extensive experiments on a benchmark dataset demonstrate the effectiveness of our approach.
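
The abstract does not give the equations of the semantic gate, so the following is only a minimal sketch of a generic sigmoid-gated fusion, not the authors' formulation. It assumes that the textual and image features are vectors of the same length, that the gate is computed from both features through two hypothetical weight matrices (`w_text`, `w_image`) and a bias, and that relevance between two news items is scored with cosine similarity. All function and parameter names here are illustrative.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def semantic_gate(text_feat, image_feat, w_text, w_image, bias):
    """Fuse two same-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        gate  = sigmoid(w_text @ text_feat + w_image @ image_feat + bias)
        fused = gate * text_feat + (1 - gate) * image_feat
    """
    pre_activation = [
        a + b + c
        for a, b, c in zip(matvec(w_text, text_feat),
                           matvec(w_image, image_feat),
                           bias)
    ]
    gate = [sigmoid(p) for p in pre_activation]
    return [g * t + (1.0 - g) * v
            for g, t, v in zip(gate, text_feat, image_feat)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy usage: with all-zero weights and bias the gate is 0.5 everywhere,
# so the fused feature is the plain average of the two modalities.
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
news_a = semantic_gate([1.0, 3.0], [3.0, 1.0],
                       zero_weights, zero_weights, [0.0, 0.0])
news_b = semantic_gate([2.0, 2.0], [2.0, 2.0],
                       zero_weights, zero_weights, [0.0, 0.0])
relevance = cosine_similarity(news_a, news_b)
```

In a trained model the weights would be learned, so that the gate can lean toward the textual feature when the illustration is uninformative and toward the image feature otherwise; that is the sense in which such a fusion is adaptive.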


Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation


Published in
ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN: 9781450370875
DOI: 10.1145/3372278

General Chairs:
Cathal Gurrin
Dublin City University, Ireland

Björn Þór Jónsson
IT University of Copenhagen, Denmark

Noriko Kando
National Institute of Informatics, Tokyo

Program Chairs:
Klaus Schoeffmann
Klagenfurt University, Austria

Phoebe Chen
La Trobe University, Australia

Noel E. O'Connor
Dublin City University, Ireland

Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
contextualized word vector
multimodal fusion
self-supervised classification model
semantic gate
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 254 of 830 submissions, 31%