Abstract
Motivated by recent work that uses sparse principal component analysis to analyse social media, we propose an improvement that alters the input data matrices according to the relationships they represent. We demonstrate the approach on Twitter data from London in 2012. Various alterations are made to the data matrix obtained from this data, and the resulting matrices are passed through a sparse principal component analysis algorithm. Analysis of the outputs shows that the results do differ, with one particular variation consistently outperforming the rest. Our results are of particular interest when the data to be analysed can be represented by a binary matrix, e.g. in document analysis.
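The pipeline described in the abstract (alter a binary data matrix, then run sparse PCA on the result) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy tweets-by-words matrix `X` and its vocabulary are invented, the word co-occurrence matrix is only one conceivable alteration and is not necessarily one the paper evaluates, and the truncated power iteration is a simple stand-in for whichever sparse PCA algorithm the authors actually used.

```python
import math

def cooccurrence(X):
    """Word-by-word co-occurrence matrix X^T X of a binary tweets-by-words matrix."""
    n_words = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(n_words)]
            for i in range(n_words)]

def truncated_power_method(A, k, iters=100):
    """Approximate a leading eigenvector of symmetric A with at most k non-zeros."""
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n  # uniform starting vector
    for _ in range(iters):
        # Power step: multiply by A.
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Truncation step: keep only the k entries largest in magnitude.
        keep = set(sorted(range(n), key=lambda i: abs(w[i]), reverse=True)[:k])
        w = [w[i] if i in keep else 0.0 for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

# Hypothetical toy data: rows are tweets, columns are words (1 = word present).
words = ["terry", "retires", "england", "rain", "lunch"]
X = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]

C = cooccurrence(X)                  # one assumed "alteration" of the input
v = truncated_power_method(C, k=2)   # 2-sparse leading component
support = [w for w, x in zip(words, v) if x != 0.0]
print(support)  # ['terry', 'retires']
```

The non-zero entries of the sparse component pick out a small group of words that occur together, which is the sense in which a sparse component can be read as a candidate event.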
Notes
1. This is the same data which was used in [5].
2. Using a rank 2 approximation.
3. Confirmed by the Guardian: http://www.theguardian.com/football/2012/sep/23/john-terry-retires-international-football.
4. Confirmed by the BBC: http://www.bbc.co.uk/news/uk-england-19634164.
5. Confirmed by the BBC: http://www.bbc.co.uk/sport/0/golf/19780678.
References
H. Shen, J.Z. Huang, Sparse principal component analysis via regularized low rank matrix approximation. J. Multivar. Anal. 99, 1015–1034 (2007)
A. d’Aspremont, L.E. Ghaoui, M.I. Jordan, G.R.G. Lanckriet, A direct formulation for sparse PCA using semidefinite programming, in NIPS (2004)
H. Zou, T. Hastie, R. Tibshirani, Sparse principal component analysis. J. Comput. Graph. Stat. 15, 265–286 (2006)
X.-T. Yuan, T. Zhang, Truncated power method for sparse eigenvalue problems. arXiv e-prints (2011)
M.-D. Albakour, C. Macdonald, I. Ounis, Identifying local events by using microblogs as social sensors, in Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, OAIR ’13 (Le Centre de Hautes Etudes Internationales d’Informatique Documentaire, Paris, France, 2013), pp. 173–180
D.S. Papailiopoulos, A.G. Dimakis, Sparse PCA through low-rank approximations (2013)
L. Mackey, Deflation methods for sparse PCA, in NIPS, ed. by D. Koller, D. Schuurmans, Y. Bengio, L. Bottou (Curran Associates Inc, New York, 2008), pp. 1017–1024
M. Journée, Y. Nesterov, P. Richtárik, R. Sepulchre, Generalized power method for sparse principal component analysis. J. Mach. Learn. Res. 11, 517–553 (2010)
D.A. Spielman, Algorithms, graph theory, and the solution of Laplacian linear equations. ICALP 2, 24–26 (2012)
Acknowledgments
This work has been carried out in the scope of the EC funded project SMART (FP7-287583).
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Pavlakou, T., Babaee, A., Draief, M. (2014). Improving Event Recognition Using Sparse PCA in the Context of London Twitter Data. In: Czachórski, T., Gelenbe, E., Lent, R. (eds) Information Sciences and Systems 2014. Springer, Cham. https://doi.org/10.1007/978-3-319-09465-6_37
DOI: https://doi.org/10.1007/978-3-319-09465-6_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09464-9
Online ISBN: 978-3-319-09465-6
eBook Packages: Computer Science, Computer Science (R0)