Characterizing user interest in NoSQL databases of social question and answer data

The Journal of Supercomputing

Abstract

With the advent of social media technology for sharing commonly asked questions and answers among end users, interest is growing rapidly in both understanding and utilizing social question and answer (QA) data. Not only SQL (NoSQL) is a popular technical topic on social QA Web sites, and its popularity is rising with the emerging demand for scalable databases for big data. Despite users' great interest in NoSQL technology, no attempt has yet been made to analyze how actual users react to it. In the present work, we therefore use QA data acquired from Stack Overflow (a QA Web site that serves as a large knowledge repository) to understand how people perceive NoSQL technology. To this end, we apply latent Dirichlet allocation (LDA) topic modeling to discover trends in NoSQL databases. In addition, we examine a weighted LDA model that reflects the influence of answers and, finally, propose a topic discrimination value to find the topics that distinguish each NoSQL database.
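
As an illustration of the LDA step named in the abstract, the following is a minimal sketch of plain LDA fitted by collapsed Gibbs sampling. It is not the authors' implementation: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all assumptions made for this example, and the weighted LDA model and the topic discrimination value are not sketched because the abstract does not define them.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (illustrative)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Up to three most frequent words assigned to each topic.
    top_words = [
        [w for w, c in counts.most_common(3) if c > 0] for counts in topic_word
    ]
    return theta, top_words


# Hypothetical toy "posts"; real input would be preprocessed Stack Overflow text.
docs = [
    "mongodb query index collection".split(),
    "mongodb aggregate collection index".split(),
    "redis cache key expire".split(),
    "redis key cache pubsub".split(),
]
theta, top_words = fit_lda(docs, num_topics=2)
print(theta)
print(top_words)
```

Each row of `theta` gives one document's topic proportions, and `top_words` lists the words most often assigned to each topic. On a corpus this small the result depends heavily on the seed and hyperparameters, so the output is only indicative.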

Acknowledgements

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2015S1A3A2046711).

Author information

Corresponding author

Correspondence to Min Song.

About this article

Cite this article

Lee, M., Jeon, S. & Song, M. Characterizing user interest in NoSQL databases of social question and answer data. J Supercomput 76, 3866–3881 (2020). https://doi.org/10.1007/s11227-018-2293-x

  • DOI: https://doi.org/10.1007/s11227-018-2293-x
