Abstract
With the advent of social media technology for sharing commonly asked questions and answers among end users, there is rapidly growing interest in understanding the characteristics of social question and answer (QA) data and in putting such data to use. Not only SQL (NoSQL) is a popular technical topic on social QA Web sites, and its popularity is growing with the emerging demand for scalable databases for big data. Despite users' great interest in NoSQL technology, no attempt has yet been made to analyze how actual users react to it. Thus, in the present work, we use QA data acquired from Stack Overflow (a QA Web site that serves as a large knowledge repository) to understand how people perceive NoSQL technology. To this end, latent Dirichlet allocation (LDA) topic modeling techniques are used to discover trends in NoSQL databases. In addition, we examine a weighted LDA model that reflects the influence of answers, and finally we propose a topic discrimination value to find the topics that distinguish each NoSQL database.
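As a rough illustration of the kind of LDA topic modeling referred to above, the following is a minimal sketch of plain LDA fitted by collapsed Gibbs sampling. It is not the authors' implementation: the function names (`fit_lda`, `top_words`), the hyperparameter values, and the toy posts are all assumptions made for illustration, and the weighted LDA model and the topic discrimination value are not covered because the abstract does not define them.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit plain LDA by collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns per-document topic counts,
    per-topic word counts, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most frequent words of each topic."""
    result = []
    for row in topic_word:
        order = sorted(range(len(vocab)), key=lambda i: -row[i])
        result.append([vocab[i] for i in order[:n]])
    return result


# Hypothetical, hand-written posts; not data from the study.
toy_posts = [
    "mongodb query index document collection".split(),
    "mongodb aggregate document query".split(),
    "cassandra cluster node replication".split(),
    "cassandra node cluster consistency".split(),
]
doc_topic, topic_word, vocab = fit_lda(toy_posts, num_topics=2)
for words in top_words(topic_word, vocab):
    print(words)
```

Each printed list holds the most frequent words assigned to one topic. With a corpus this small the topics are unstable, so the sketch is useful only for seeing the mechanics of the sampler.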




Acknowledgements
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2015S1A3A2046711).
Cite this article
Lee, M., Jeon, S. & Song, M. Characterizing user interest in NoSQL databases of social question and answer data. J Supercomput 76, 3866–3881 (2020). https://doi.org/10.1007/s11227-018-2293-x
DOI: https://doi.org/10.1007/s11227-018-2293-x