Abstract
Online Reputation Management systems aim to identify and classify Twitter influencers because of their importance for brands. Current methods rely mainly on metrics provided by Twitter, such as followers and retweets. In this work we follow the research initiated at RepLab 2014, but rely only on the textual content of tweets. Moreover, we propose a workflow to identify influencers and classify them into an interest group from a reputation point of view, in addition to the classification proposed at RepLab. We evaluate two families of classifiers that do not require feature engineering: deep learning classifiers and traditional classifiers with embeddings. We also use two baselines: a simple language model classifier and the “majority class” classifier. Experiments show that most of our methods outperform the results reported at RepLab 2014, especially the proposed Low Dimensionality Statistical Embedding.
Notes
- 1. Influencer is a user (person or brand) who may influence a high number of other people.
- 2. In ancient Rome, auctoritas was the general level of prestige a person had. Accordingly, authority in this context means prestige.
- 3.
- 4. We did not include the Stockholder and Investor classes because they did not have data for Automotive, and in Banking they did not have either training or test data.
- 5.
- 6. Sum has empirically given better results than average or concatenation.
- 7. Sum has empirically given better results than average.
- 8. Previously used in other tasks under the name Low Dimensionality Representation (LDR).
- 9. We tested several machine learning algorithms and report only the ones with the best results.
- 10. As the Undecidable class appears only in the training set, we removed it to avoid noise in the training phase.
Acknowledgements
The work of the last author was funded by the SomEMBED TIN2015-71147-C2-1-P MINECO research project. Authors from Universitat Jaume I have been funded by the MINECO R&D project TIN2017-88805-R.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Nebot, V., Rangel, F., Berlanga, R., Rosso, P. (2018). Identifying and Classifying Influencers in Twitter only with Textual Information. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_3
DOI: https://doi.org/10.1007/978-3-319-91947-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91946-1
Online ISBN: 978-3-319-91947-8
eBook Packages: Computer Science, Computer Science (R0)