Identifying and Classifying Influencers in Twitter only with Textual Information

Conference paper
Natural Language Processing and Information Systems (NLDB 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10859)

Abstract

Online Reputation Management systems aim to identify and classify Twitter influencers because of their importance for brands. Current methods rely mainly on metrics provided by Twitter, such as followers and retweets. In this work we continue the research initiated at RepLab 2014, but rely only on the textual content of tweets. We also propose a workflow that identifies influencers and classifies them into an interest group from a reputation point of view, in addition to the classification proposed at RepLab. We evaluate two families of classifiers that do not require feature engineering: deep learning classifiers and traditional classifiers with embeddings. We also use two baselines: a simple language model classifier and the “majority class” classifier. Experiments show that most of our methods outperform the results reported at RepLab 2014, especially the proposed Low Dimensionality Statistical Embedding.
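
To make the “majority class” baseline mentioned in the abstract concrete, the following is a minimal sketch of how such a baseline is commonly built: it always predicts the most frequent label seen in training. The function names, label strings, and toy data are hypothetical and chosen only for illustration; they are not taken from the paper or from the RepLab 2014 data.

```python
from collections import Counter


def fit_majority_class(train_labels):
    """Return the most frequent label in the training set."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(majority_label, n_samples):
    """Predict the same (majority) label for every test sample."""
    return [majority_label] * n_samples


def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical toy labels (assumption: not from the paper's dataset).
train = ["non-influencer", "non-influencer", "influencer", "non-influencer"]
test = ["influencer", "non-influencer", "non-influencer", "influencer"]

majority = fit_majority_class(train)
predictions = predict_majority(majority, len(test))
baseline_accuracy = accuracy(test, predictions)
print(majority, baseline_accuracy)
```

A baseline of this kind ignores the tweet text entirely, so any text-based classifier is expected to beat it before its results are considered meaningful.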

Notes

  1. An influencer is a user (person or brand) who may influence a large number of other people.

  2. In ancient Rome, auctoritas was the general level of prestige a person had. Accordingly, authority in this context means prestige.

  3. http://www.clef-initiative.eu/track/replab.

  4. We did not include the Stockholder and Investor classes because they had no data for Automotive, and in Banking they did not have either training or test data.

  5. https://nlp.stanford.edu/projects/glove/.

  6. Sum has empirically given better results than average or concatenation.

  7. Sum has empirically given better results than average.

  8. Previously used in other tasks under the name Low Dimensionality Representation (LDR).

  9. We tested several machine learning algorithms and report only the ones with the best results.

  10. As the Undecidable class only appears in the training set, we removed it to avoid noise in the training phase.

Acknowledgements

The work of the last author was funded by the SomEMBED TIN2015-71147-C2-1-P MINECO research project. Authors from Universitat Jaume I have been funded by the MINECO R&D project TIN2017-88805-R.

Corresponding author

Correspondence to Victoria Nebot.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this paper

Nebot, V., Rangel, F., Berlanga, R., Rosso, P. (2018). Identifying and Classifying Influencers in Twitter only with Textual Information. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_3

  • DOI: https://doi.org/10.1007/978-3-319-91947-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91946-1

  • Online ISBN: 978-3-319-91947-8

  • eBook Packages: Computer Science (R0)
