A social voting approach for scientific domain vocabularies construction

Scientometrics

Abstract

Scientific domain vocabularies play an important role in academic communication and lean research management. Given the dramatic increase in new keywords, a domain vocabulary must be developed continuously for the domain to remain viable in the scientific context. Current automatic methods, based on either statistical or linguistic approaches, can generate vocabularies that consist of popular keywords, but they fail to capture high-quality standardized terms because they lack human intervention. Manual methods make use of human knowledge, but they are both time-consuming and expensive. To overcome these deficiencies, this research proposes a novel social voting approach to constructing scientific domain vocabularies. The approach integrates an automatic system with human knowledge, based on the theory of linguistic arbitrariness, and selects a widely accepted, standardized set of keywords through social voting. A social voting system has been implemented to aid scientific domain vocabulary construction at the National Natural Science Foundation of China. Two experiments were conducted to demonstrate the effectiveness and validity of the system. The results show that the domain vocabulary constructed with this system covers a wide range of areas within a discipline and facilitates the standardization of scientific terminology.
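The abstract does not state the voting rule used by the system, so the following is only a minimal sketch of how a standardized term might be selected by voting among competing keyword variants. The function name `select_standard_terms`, the `min_share` threshold, the concept identifiers, and the sample ballots are all hypothetical and are not taken from the article.

```python
from collections import Counter


def select_standard_terms(votes, min_share=0.5):
    """Pick one term per concept when enough voters agree on it.

    votes: dict mapping a concept id to the list of term variants
           chosen by individual voters (one entry per ballot).
    min_share: assumed acceptance threshold; a term is adopted only
               if its share of the ballots reaches this value.
    """
    selected = {}
    for concept, ballots in votes.items():
        if not ballots:
            continue  # no votes cast, nothing to standardize
        tally = Counter(ballots)
        term, count = tally.most_common(1)[0]
        if count / len(ballots) >= min_share:
            selected[concept] = term
    return selected


# Hypothetical ballots: voters choose among variants of the same concept.
ballots = {
    "concept-1": ["data mining", "data mining", "knowledge discovery", "data mining"],
    "concept-2": ["e-commerce", "electronic commerce", "ecommerce"],
}
print(select_standard_terms(ballots))
```

Under these assumptions, a variant is adopted only when it wins a sufficient share of the ballots; concepts without a clear winner stay unresolved and could be put to a further round of voting.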

Notes

  1. The readers can refer to http://www.niso.org/schemas/iso25964.

  2. http://www.researchgate.net.

  3. http://www.scholarmate.com.

  4. http://www.academia.edu.

  5. http://webofknowledge.com.

  6. The readers can refer to “http://www.nsfc.gov.cn/nsfc/cen/daima/index.html”.

Acknowledgments

This research was partially supported by the General Research Fund of the Hong Kong Research Grant Council (CityU 119611, CityU 148012), the National Natural Science Foundation of China (71371164) and City University of Hong Kong Teaching Development Grant (6000201).

Author information

Corresponding author

Correspondence to Chen Yang.

About this article

Cite this article

Jiang, H., Yang, C., Ma, J. et al. A social voting approach for scientific domain vocabularies construction. Scientometrics 108, 803–820 (2016). https://doi.org/10.1007/s11192-016-1990-6

  • DOI: https://doi.org/10.1007/s11192-016-1990-6
