Abstract
Vast amounts of unstructured data are captured and analyzed on social media. This paper highlights the lack of a standard quality control approach that can be applied across all social media sites, a gap that stems from the variety of big data formats these sites accept. The problem poses a challenge not only for capturing big data but also for analyzing it and extracting valuable data, which in turn affects decision-making. The paper reviews a collection of archived documents in the field of big data and social media. It presents a framework that identifies the issues in quality analysis of big data on social media, examines the techniques social media companies currently use to capture and analyze big data, and maps social media sites and the appropriate combinations of big data capture and analysis techniques to the data quality control requirements.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Al-Hajjar, D., Jaafar, N., Al-Jadaan, M., Alnutaifi, R. (2015). Framework for Social Media Big Data Quality Analysis. In: Bassiliades, N., et al. New Trends in Database and Information Systems II. Advances in Intelligent Systems and Computing, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-319-10518-5_23
DOI: https://doi.org/10.1007/978-3-319-10518-5_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10517-8
Online ISBN: 978-3-319-10518-5
eBook Packages: Engineering (R0)