Email Importance Evaluation in Mailing List Discussions

Jiang, Kun; Hu, Chunming; Sun, Jie; Shen, Qi; Jiang, Xiaohan

doi:10.1007/978-3-030-19143-6_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11235)

Included in the following conference series:

International Workshop on Data Quality and Trust in Big Data

Abstract

Mailing lists are widely used in teamwork for discussion and consultation. Identifying important emails in mailing list discussions could significantly benefit content summarization and opinion leader recognition. However, previous studies focus only on importance evaluation for personal emails, and there is no consensus on the definition of an important email. In this paper, we therefore consider the characteristics of mailing lists and study how to evaluate email importance in mailing list discussions. Our contributions are as follows. First, we propose ER-Match, an email conversation thread reconstruction algorithm that takes nested quotation relationships into account when constructing the email relationship network. Based on this network, we formulate the importance of emails in mailing list discussions. Second, we propose a feature-rich learning method to predict the importance of new emails. Furthermore, we characterize various factors affecting email importance in mailing list discussions. Experiments with publicly available mailing lists show that our prediction model outperforms the baselines by a large margin.

Notes

1. https://lkml.org
2. https://lists.w3.org/
3. The body here refers to contents without header, signature and quotation.
4. https://github.com/sloria/TextBlob
5. http://scikit-learn.org
6. For email importance prediction with XGBoost, we set learning_rate = 0.1, n_estimators = 1000, max_depth = 5, min_child_weight = 1, gamma = 0, subsample = 0.8, colsample_bytree = 0.8, objective = 'binary:logistic', scale_pos_weight = 1, seed = 27.
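
The settings in Note 6 can be written down as a configuration, which may help when trying to reproduce the setup. The following is a minimal sketch, not the authors' code. It assumes the scikit-learn style wrapper `XGBClassifier` from the `xgboost` package, where the random seed is passed as `random_state`. The names `NOTE6_PARAMS`, `to_sklearn_kwargs`, and `build_classifier` are hypothetical and introduced only for illustration.

```python
# Hyperparameter values copied from Note 6.
NOTE6_PARAMS = {
    "learning_rate": 0.1,
    "n_estimators": 1000,
    "max_depth": 5,
    "min_child_weight": 1,
    "gamma": 0,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "objective": "binary:logistic",
    "scale_pos_weight": 1,
    "seed": 27,
}


def to_sklearn_kwargs(params):
    """Return a copy of params with 'seed' renamed to 'random_state'.

    Assumption: the scikit-learn style wrapper is used, which takes the
    random seed under the name random_state.
    """
    kwargs = dict(params)
    if "seed" in kwargs:
        kwargs["random_state"] = kwargs.pop("seed")
    return kwargs


def build_classifier(params=NOTE6_PARAMS):
    """Build an untrained classifier (hypothetical helper, needs xgboost)."""
    # Imported lazily so the configuration can be inspected without xgboost.
    from xgboost import XGBClassifier
    return XGBClassifier(**to_sklearn_kwargs(params))
```

Calling `build_classifier().fit(X, y)` on a feature matrix `X` and binary importance labels `y` would train a model under these settings. The feature set itself is not listed on this page, so `X` and `y` are left to the reader.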

Acknowledgement

This work is supported by the National Key Research & Development Program (2016YFB1000503).

Author information

Authors and Affiliations

Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, 100191, China
Kun Jiang, Chunming Hu, Jie Sun & Qi Shen
Beijing University of Technology, Beijing, 100124, China
Xiaohan Jiang

Corresponding author

Correspondence to Chunming Hu.

Editor information

Editors and Affiliations

Zayed University, Dubai, United Arab Emirates
Hakim Hacid
Macquarie University, Sydney, NSW, Australia
Quan Z. Sheng
Nara Women’s University, Nara, Japan
Tetsuya Yoshida
Dalarna University, Borlänge, Sweden
Azadeh Sarkheyli
Department of Computer Science and Software Engineering, Swinburne University of Technology, Hawthorn, VIC, Australia
Rui Zhou

Cite this paper

Jiang, K., Hu, C., Sun, J., Shen, Q., Jiang, X. (2019). Email Importance Evaluation in Mailing List Discussions. In: Hacid, H., Sheng, Q., Yoshida, T., Sarkheyli, A., Zhou, R. (eds) Data Quality and Trust in Big Data. QUAT 2018. Lecture Notes in Computer Science, vol 11235. Springer, Cham. https://doi.org/10.1007/978-3-030-19143-6_3

DOI: https://doi.org/10.1007/978-3-030-19143-6_3
Published: 25 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19142-9
Online ISBN: 978-3-030-19143-6
eBook Packages: Computer Science, Computer Science (R0)
