DOI: 10.1145/3503161.3548203
Research article

DOMFN: A Divergence-Orientated Multi-Modal Fusion Network for Resume Assessment

Published: 10 October 2022

ABSTRACT

In talent management, resume assessment aims to analyze the quality of a job seeker's resume, which helps recruiters discover suitable candidates and, in return, helps job seekers improve their resumes. Recent machine learning methods trained on large-scale public resume datasets have made automatic assessment feasible, reducing manual cost. However, most existing approaches remain content-dominated and ignore other valuable information. Inspired by practical resume evaluation, which considers both content and layout, we construct multiple modalities from resumes but face a new challenge: the performance of multi-modal fusion is sometimes even worse than that of the best uni-modality. In this paper, we experimentally find that this phenomenon is caused by cross-modal divergence, which raises the question: when is it appropriate to perform multi-modal fusion? To address this problem, we design an instance-aware fusion method, the Divergence-Orientated Multi-Modal Fusion Network (DOMFN), which adaptively fuses the uni-modal predictions and the multi-modal prediction based on cross-modal divergence. Specifically, DOMFN computes a penalty score that measures the divergence between cross-modal predictions. The learned divergence is then used both to decide whether to conduct multi-modal fusion and as a term in an amended loss for reliable training. Consequently, DOMFN rejects the multi-modal prediction when the cross-modal divergence is too large, avoiding overall performance degradation and thus achieving better performance than the uni-modalities alone. In experiments, qualitative comparison with baselines on a real-world dataset demonstrates the superiority and explainability of the proposed DOMFN; for example, we find that multi-modal fusion benefits the assessment of resumes for UI Designer and Enterprise Service positions, whereas it hurts the assessment of Technology and Product Operation positions.
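The gating idea described in the abstract can be illustrated with a toy sketch. This is not the paper's actual DOMFN formulation (the abstract does not specify the penalty function); it assumes a symmetrised KL divergence as the cross-modal disagreement score, a hypothetical `threshold` hyper-parameter, and simple averaging as the fusion rule, purely to show the "reject fusion when divergence is too large" mechanism.

```python
import math


def divergence_gated_fusion(p_text, p_layout, threshold=0.5):
    """Toy divergence-gated fusion sketch (NOT the exact DOMFN method).

    p_text, p_layout: lists of class probabilities from hypothetical
    content and layout branches; `threshold` is illustrative only.
    Returns (prediction, divergence).
    """
    eps = 1e-12  # avoid log(0)

    def kl(p, q):
        # Kullback-Leibler divergence KL(p || q)
        return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

    # Symmetrised KL as a simple cross-modal disagreement score.
    divergence = 0.5 * (kl(p_text, p_layout) + kl(p_layout, p_text))

    if divergence > threshold:
        # Modalities disagree too much: reject fusion and keep the
        # more confident uni-modal prediction.
        winner = p_text if max(p_text) >= max(p_layout) else p_layout
        return winner, divergence

    # Otherwise fuse by late averaging of the two predictions.
    fused = [0.5 * (pi + qi) for pi, qi in zip(p_text, p_layout)]
    return fused, divergence
```

When the two branches broadly agree (e.g. `[0.9, 0.1]` vs `[0.85, 0.15]`), the divergence is small and the averaged fusion is returned; when they conflict (e.g. `[0.9, 0.1]` vs `[0.05, 0.95]`), fusion is rejected in favor of the more confident branch, mirroring the instance-aware behavior the paper describes.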

Published in

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022, 7537 pages
ISBN: 978-1-4503-9203-7
DOI: 10.1145/3503161
Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

MM overall acceptance rate: 995 of 4,171 submissions, 24%
