DaFNeGE: Dataset of French Newsletters with Graph Representation and Embedding

  • Conference paper
Text, Speech, and Dialogue (TSD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13502)

Abstract

Natural language resources are essential for integrating linguistic engineering components into information processing suites. However, the resources available in French are scarce and do not cover all possible tasks, especially for specific business applications. In this context, we present a dataset of French newsletters and show how it can be used to predict their impact, good or bad, on readers. We propose an original representation of newsletters as graphs that take the layout of each newsletter into account. We then evaluate how useful this representation is for predicting a newsletter’s performance, measured by open and click rates, using graph convolutional network models.

Notes

  1. Kosmopolead is a trademark of UNEEK that offers services such as CRM (https://www.kosmopolead.com/).

  2. K is set to 5, which here represents the number of colors to detect. It is rare to find more than 5 colors in the same portion of the image; when that does happen, we focus only on the dominant color.

  3. As defined in [25].
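
The color detection described in note 2 can be illustrated with a minimal sketch. It assumes that K is the number of clusters in a k-means style quantization of the pixels in an image region, which the note does not state explicitly; the function name `dominant_color` and its parameters are hypothetical, and this is not the authors' implementation.

```python
import random


def dominant_color(pixels, k=5, iterations=10, seed=0):
    """Return the rounded mean color of the largest of (at most) k clusters.

    `pixels` is a list of (R, G, B) tuples taken from one image region.
    Assumption: plain k-means on RGB values; the paper may use another method.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    rng = random.Random(seed)
    # Seed the centroids with distinct colors; use fewer than k if the
    # region contains fewer than k different colors.
    distinct = sorted(set(pixels))
    centroids = rng.sample(distinct, min(k, len(distinct)))
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in pixels:
            # Assign the pixel to its nearest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(channel) / len(c) for channel in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    # The dominant color is the centroid of the most populated cluster.
    largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
    return tuple(round(value) for value in centroids[largest])


# Toy region: mostly red, some blue, one green pixel.
region = [(255, 0, 0)] * 6 + [(0, 0, 255)] * 3 + [(0, 255, 0)]
print(dominant_color(region))  # (255, 0, 0)
```

Capping the number of centroids at the number of distinct colors keeps the sketch well defined for flat regions, where fewer than 5 colors are present.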

References

  1. Abdaoui, A., Azé, J., Bringay, S., Poncelet, P.: FEEL: a French expanded emotion Lexicon. Lang. Resources Eval. 51(3), 833–855 (2017). https://doi.org/10.1007/s10579-016-9364-5. https://hal-lirmm.ccsd.cnrs.fr/lirmm-01348016

  2. Blandin, A., Saïd, F., Villaneau, J., Marteau, P.F.: Automatic emotions analysis for French email campaigns optimization. In: CENTRIC 2021, Barcelona, Spain, October 2021. https://hal.archives-ouvertes.fr/hal-03424725

  3. Bonfrer, A., Drèze, X.: Real-time evaluation of e-mail campaign performance. Marketing Science (2009)

  4. d’Hoffschmidt, M., Belblidia, W., Brendlé, T., Heinrich, Q., Vidal, M.: FQuAD: French question answering dataset (2020)

  5. Duvenaud, D., et al.: Convolutional networks on graphs for learning molecular fingerprints. arXiv preprint arXiv:1509.09292 (2015)

  6. Ekman, P.: Basic Emotions, chap. 3, pp. 45–60. John Wiley and Sons, Ltd (1999). https://doi.org/10.1002/0470013494.ch3. https://onlinelibrary.wiley.com/doi/abs/10.1002/0470013494.ch3

  7. Guenoune, H., Cousot, K., Lafourcade, M., Mekaoui, M., Lopez, C.: A dataset for anaphora analysis in French emails. In: Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference, pp. 165–175. Association for Computational Linguistics, Barcelona, Spain (online), December 2020. https://aclanthology.org/2020.crac-1.17

  8. Honnibal, M., Montani, I.: spaCy 2: natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing (2017), to appear

  9. Ipsen, N., Mattei, P.A., Frellsen, J.: How to deal with missing data in supervised deep learning? In: ICML Workshop on the Art of Learning with Missing Values (Artemiss) (2020)

  10. Kalitvianski, R.: Traitements formels et sémantiques des échanges et des documents textuels liés à des activités collaboratives. Theses, Université Grenoble Alpes, March 2018. https://tel.archives-ouvertes.fr/tel-01893348

  11. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)

  12. Klimt, B., Yang, Y.: The Enron corpus: a new dataset for email classification research. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 217–226. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30115-8_22

  13. Kumar, A.: An empirical examination of the effects of design elements of email newsletters on consumers’ email responses and their purchase. J. Retailing Consumer Serv. 58, 102349 (2021). https://doi.org/10.1016/j.jretconser.2020.102349. https://www.sciencedirect.com/science/article/pii/S0969698920313576

  14. Loria, S.: textblob documentation. Release 0.15 2 (2018)

  15. Mandivarapu, J.K., Bunch, E., You, Q., Fung, G.: Efficient document image classification using region-based graph neural network. CoRR abs/2106.13802 (2021). https://arxiv.org/abs/2106.13802

  16. Miller, R., Charles, E.: A psychological based analysis of marketing email subject lines. In: 2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer), pp. 58–65 (2016). https://doi.org/10.1109/ICTER.2016.7829899

  17. Mohammad, S.M., Turney, P.D.: Crowdsourcing a word-emotion association lexicon. Comput. Intell. 29(3), 436–465 (2013)

  18. Olive, T., Barbier, M.L.: Processing time and cognitive effort of longhand note taking when reading and summarizing a structured or linear text. Writ. Commun. 34(2), 224–246 (2017)

  19. Oono, K., Suzuki, T.: Graph neural networks exponentially lose expressive power for node classification. arXiv preprint arXiv:1905.10947 (2019)

  20. Rajpurkar, P., Jia, R., Liang, P.: Know what you don’t know: unanswerable questions for SQuAD (2018)

  21. Salloum, S., Gaber, T., Vadera, S., Shaalan, K.: Phishing email detection using natural language processing techniques: a literature survey. Procedia Comput. Sci. 189, 19–28 (2021). https://doi.org/10.1016/j.procs.2021.05.077

  22. Schlichtkrull, M., Kipf, T.N., Bloem, P., van den Berg, R., Titov, I., Welling, M.: Modeling relational data with graph convolutional networks. In: Gangemi, A., Navigli, R., Vidal, M.-E., Hitzler, P., Troncy, R., Hollink, L., Tordai, A., Alam, M. (eds.) ESWC 2018. LNCS, vol. 10843, pp. 593–607. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93417-4_38

  23. Seth, S., Biswas, S.: Multimodal spam classification using deep learning techniques. In: 2017 13th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 346–349. IEEE (2017)

  24. Shen, Z., Zhang, R., Dell, M., Lee, B.C.G., Carlson, J., Li, W.: LayoutParser: a unified toolkit for deep learning based document image analysis. arXiv preprint arXiv:2103.15348 (2021)

  25. Wang, M., et al.: Deep graph library: A graph-centric, highly-performant package for graph neural networks. arXiv preprint arXiv:1909.01315 (2019)

  26. Wright, P.: The psychology of layout: Consequences of the visual structure of documents. American Association for Artificial Intelligence Technical Report FS-99-04, pp. 1–9 (1999)

  27. Wu, Y., Kirillov, A., Massa, F., Lo, W.Y., Girshick, R.: Detectron2. https://github.com/facebookresearch/detectron2 (2019)

  28. Yang, H., Liu, Q., Zhou, S., Luo, Y.: A spam filtering method based on multi-modal fusion. Appl. Sci. 9(6), 1152 (2019)

  29. Yesilada, Y., Jay, C., Stevens, R., Harper, S.: Validating the use and role of visual elements of web pages in navigation with an eye-tracking study. In: Proceedings of the 17th International Conference on World Wide Web, pp. 11–20 (2008)

Author information

Corresponding author

Correspondence to Alexis Blandin.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Blandin, A., Saïd, F., Villaneau, J., Marteau, PF. (2022). DaFNeGE: Dataset of French Newsletters with Graph Representation and Embedding. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2022. Lecture Notes in Computer Science, vol 13502. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_2

  • DOI: https://doi.org/10.1007/978-3-031-16270-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16269-5

  • Online ISBN: 978-3-031-16270-1

  • eBook Packages: Computer Science, Computer Science (R0)
