Abstract
Scientometric evaluation of nanoscience and nanotechnology requires complex search strategies and lengthy queries that retrieve massive amounts of information. To offer some insight based on the most frequently occurring terms, our research focused on a limited amount of data collected on uniform principles. The prefix nano occurs in many different compound words, which makes such an assessment possible. The aim is to identify the scatter of nanoconcepts among and within journals, as well as more generally in the Web of Science (WOS). Ten principal journals were identified, along with all unique nanoterms in article titles. Such terms occur, on average, in half of all titles. Terms were thoroughly investigated and mapped by lemmatization or stemming to the appropriate roots (nanoconcepts). The scatter of concepts follows the characteristics of power laws, especially Zipf’s law, exhibiting a clear inversely proportional relationship between rank and frequency. The same three nanoconcepts are the most frequent in as many as seven journals. Two concepts occupy the first and second ranks in six journals. The same six concepts are the most frequent in the ten journals as well as in the full WOS database, representing almost two thirds of all nanotitled articles in both instances. Subject categories do not play a decisive role. Frequency falls progressively, quickly producing a long tail of rare concepts. The drop is almost linear on the log scale. The existence of hundreds of different closed-form compound nanoterms has consequences for retrieval in Internet search engines (e.g. Google Scholar) that do not permit truncation.
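The rank-frequency analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the sample titles are invented, `crude_concept` is a hypothetical stand-in for the lemmatization or stemming step (it only strips a plural "s"), and the slope is estimated by an ordinary least-squares fit on log rank versus log frequency, which is only one of several ways to check for Zipf-like behaviour. A slope close to -1 would correspond to the inversely proportional relationship between rank and frequency.

```python
import math
import re
from collections import Counter

# Closed-form compounds only: "nano" followed directly by letters.
# Hyphenated forms such as "nano-sized" are deliberately not matched.
NANO_TERM = re.compile(r"\bnano[a-z]+", re.IGNORECASE)


def nano_terms(title):
    """Return all closed-form nano-prefixed words in a title, lowercased."""
    return [match.lower() for match in NANO_TERM.findall(title)]


def crude_concept(term):
    """Hypothetical stand-in for lemmatization: strip a plural 's'."""
    if term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term


def concept_counts(titles):
    """Count, for each concept, the number of titles in which it occurs."""
    counts = Counter()
    for title in titles:
        # A set ensures a concept is counted once per title.
        counts.update({crude_concept(term) for term in nano_terms(title)})
    return counts


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two frequencies, all greater than zero.
    """
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Invented example titles, not data from the study.
    sample_titles = [
        "Gold nanoparticles in carbon nanotubes",
        "Magnetic nanoparticles for imaging",
        "A nanoparticle sensor on silicon nanowires",
        "Growth of a single nanotube",
        "Polymer nanocomposites with silver nanoparticles",
    ]
    counts = concept_counts(sample_titles)
    for rank, (concept, freq) in enumerate(counts.most_common(), start=1):
        print(rank, concept, freq)
    print("log-log slope:", zipf_slope(counts.values()))
```

With a sample this small the fitted slope carries no statistical weight; the sketch only shows the mechanics of extracting terms, merging variants into concepts, and ranking them by frequency.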
Acknowledgments
This work was supported by the Slovenian Research Agency, Research Programme P4-0085 (D).
Cite this article
Bartol, T., Stopar, K. Nano language and distribution of article title terms according to power laws. Scientometrics 103, 435–451 (2015). https://doi.org/10.1007/s11192-015-1546-1