Non-alphanumeric characters in titles of scientific publications: An analysis of their occurrence and correlation with citation impact
Highlights
► Over two-thirds of the titles of scientific publications have at least one non-alphanumeric character, mostly the hyphen, colon, comma, or parentheses.
► In general, publications with non-alphanumeric characters in their titles are associated with larger citation impact than publications that have only alphanumeric characters in their titles, but this may differ in specific disciplines.
► The relative amount of non-alphanumeric characters did not increase over the last 10 years, contrasting with earlier results.
► Thematically related fields show comparable numbers of specific non-alphanumeric characters.
Introduction
Every day, the inbox of a modern researcher readily fills up with emails from friends, colleagues, and even complete strangers. Moreover, at set intervals, emails arrive that contain titles of interesting publications which have recently been added to databases such as PubMed, Scopus, or the Web of Science. Furthermore, personal messages, electronic forums, web sites, and social networks all require attention and time. Evidently, new scientific literature is only one stream of information that nowadays flows towards a researcher, albeit a rather pivotal one for the profession at hand. Some time ago, Meadows (1974) estimated that an average researcher had to scan through roughly 3000 titles per year. We assume that this number has only grown, and that the increased information burden leaves even less time to deal with each title. Clearly, to get the attention of potential readers, it is crucial that a publication is presented effectively to a researcher. In many cases, the title is the way to accomplish this (Soler, 2007). Of course, an author could try a tactic employed by writers of certain emails BEGGING FOR attention. Yet, there is a good chance that this will annoy and subsequently put off potential readers, and since being read is an important factor in the professional success of authors, this is evidently not desirable. As writing and publishing is a communal effort, readers are used to certain topics and styles. Authors can use this to their benefit by phrasing a title in familiar ways to facilitate quick reading and by using signal words that are expected to trigger the interest of an audience. Yet, phrasing a title too generally can bore: a title has to stand out too. Standing out can be accomplished by phrasing differently, for example by using a well-known (but within science uncommon) literary template such as “to X or not to X” (with the particular topic of interest filled in for X).
Alternatively, it could be as simple as using particular, non-alphanumeric characters in a title.
Specific non-alphanumeric characters and title characteristics have been the subject of previous research. Early studies by Dillon (1981, 1982) showed that the colon (“:”) has become a standard character in titles of scientific publications. Lewison and Hartley (2005) also studied the colon and found differences in title length and colon usage, both over time and over disciplines. Hartley (2007) combined a meta-analysis with new results and showed that colons are preferred by students because they improve the structure of a title, but are not necessarily appreciated by their fellow academics, who make up the intended audience of most scientific publications. However, studies cited by Hartley (2007) failed to find significant differences between the number of citations for publications with and without colons in their title, although the scope of this result was limited to a single journal. Besides the colon, Ball (2009) showed that the question mark has become a frequently appearing character in titles in Medicine and (to a lesser extent) in Physics. We generalize these previous studies on specific aspects of titles and investigate both the use of specific characters in publication titles and their correlation with impact in a broad and extensive sense. By this, we mean that we neither focus on a particular (non-alphanumeric) character nor limit our investigation to specific journals or science fields.
Our main research question is: given the importance of readership in the success of scientific publications, could something as simple as using a particular type of character “boost” the success of a publication? Our hypothesis is that the effect of non-alphanumeric characters on the success of publications is constrained by conventions regarding readability and form. Consequently, if such characters occur and exhibit a positive correlation with the success of publications, those characters usually have a known function or are accepted elements. We investigate this by posing the following research questions. First, what non-alphanumeric characters exist in the titles of scientific publications? Then, can we see a difference in the success of publications with and without such characters? Also, are such effects global, or can we see differences over disciplines? Additionally, what is the effect of frequently occurring characters? Finally, how do the use and impact of characters compare across fields?
Method
To investigate non-alphanumeric characters in titles, we extracted publications from all research fields available in the Web of Science (WoS) database published in the period 1999–2008. However, the number of publications available in the WoS for that period is large (almost 13 million), which makes exhaustive analyses too time-consuming and we
Occurrence of non-alphanumeric characters
Our 5% random sample consisted of 642,807 WoS publications, all published between 1999 and 2008. Table 1 lists the 29 non-alphanumeric characters we encountered in the titles of these publications. Next to rank (#) and character (C), this table also shows the number of publications (N) associated with a character, as well as the percentage (%) relative to all publications (in the sample); a point estimate of the impact (I); and the number of publications (articles, letters, notes, reviews) used
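A tabulation of this kind can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the procedure used for Table 1: the function names `non_alphanumeric_chars` and `tabulate` are hypothetical, "non-alphanumeric" is assumed to mean any character that is neither a Unicode letter or digit nor whitespace, and the three sample titles are taken from the reference list rather than from the WoS sample. It covers only the rank (#), character (C), N, and % columns; the impact estimate (I) is not addressed.

```python
from collections import Counter


def non_alphanumeric_chars(title):
    """Return the distinct characters in a title that are neither
    alphanumeric nor whitespace (assumed definition, see lead-in)."""
    return {ch for ch in title if not ch.isalnum() and not ch.isspace()}


def tabulate(titles):
    """Build rows of (rank, character, N, percentage).

    N is the number of titles that contain the character at least once;
    the percentage is relative to all titles passed in.
    """
    counts = Counter()
    for title in titles:
        # A set is used so each title counts once per character.
        counts.update(non_alphanumeric_chars(title))
    total = len(titles)
    # Sort by descending N; ties are broken by the character itself.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, ch, n, 100.0 * n / total)
        for rank, (ch, n) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical toy input, not the 5% WoS sample.
    sample_titles = [
        "ISOL@: An Italian SOLAnaceae genomics resource",
        "An introduction to the bootstrap",
        "Writing titles in science: An exploratory study",
    ]
    for rank, ch, n, pct in tabulate(sample_titles):
        print(f"{rank}\t{ch}\t{n}\t{pct:.1f}%")
```

Counting each title once per character, rather than counting every occurrence, matches the description of N as the number of publications associated with a character.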
Conclusions
We started this publication by pointing out that nowadays there are many sources of information which require the attention of a scientific researcher. As a result, the search for new, potentially interesting scientific publications has to compete with the other sources of information that a researcher has to scan every day. We then hypothesized that the title of a publication which wants to capture an audience therefore needs to strike a balance between conforming and
Acknowledgements
We kindly thank the anonymous reviewers for sharpening our arguments, as well as for their suggestions for future research.
References (26)
- et al.
Design of [email protected]–Al2O3 nanocomposite for ethanol steam reforming
Journal of Alloys and Compounds
(2008)
Planning that title: Practices and preferences for titles with colons in academic articles
Library & Information Science Research
(2007)
- et al.
Voltammetry of tetraalkylammonium picrates at water|nitrobenzene and water|dichloroethane microinterfaces: Influence of partition phenomena
Journal of Electroanalytical Chemistry
(2004)
Writing titles in science: An exploratory study
English for Specific Purposes
(2007)
Algorithms for finding patterns in strings
Handbook of theoretical computer science (Vol. A): Algorithms and complexity
(1990)
Scholarly communication in transition: The use of question marks in the titles of scientific articles in medicine, life sciences and physics 1966–2005
Scientometrics
(2009)
- et al.
Analysis of cost data in randomized trials: An application of the non-parametric bootstrap
Statistics in Medicine
(2000)
- et al.
Bootstrap confidence intervals: When, which, what? A practical guide for medical statisticians
Statistics in Medicine
(2000)
- et al.
Estimating confidence intervals for cost-effectiveness ratios: An example from a randomized trial
Statistics in Medicine
(1996)
- et al.
ISOL@: An Italian SOLAnaceae genomics resource
BMC Bioinformatics
(2008)
The emergence of the colon: An empirical correlate of scholarship
American Psychologist
In pursuit of the colon: A century of scholarly progress: 1880–1980
The Journal of Higher Education
An introduction to the bootstrap