Introduction

Rheumatology is considered a subspecialty of internal medicine and pediatrics. It covers clinical problems of the joints and soft tissues, autoimmune diseases, vasculitis and heritable connective tissue disorders (Cheng and Zhang 2013). Musculoskeletal diseases of this kind are prevalent and have significant consequences for the individual and society, being one of the major causes of disease burden around the world (Brooks 2006).

Due to this high social importance, the present study aims to identify the Highly Cited Papers (HCP) in the Rheumatology scientific production. Furthermore, the conceptual evolution of the detected HCP is examined and a co-word analysis is performed. HCP can be considered relevant to the development of a research field because they have attracted the interest of the research community. In this sense, the concept of citation classic consists of characterizing the HCP of a scientific discipline (Garfield 1977). Citation classics help to discover potentially valuable information for the development of a discipline and to understand its past, present and future scientific structure.

According to the research literature, a series of papers have been published focusing on the bibliometric impact of the Rheumatology research field. Cheng and Zhang (2013) analyzed the articles published in 39 rheumatology journals from 1996 to 2010 using the Scopus database; the number of articles, citations, h-index, and international collaborations were determined by country or region. Chen et al. (2011) analyzed the Impact Factors of rheumatology journals from 1999 to 2008 and compared them with those of other fields. Cheng and Zhang (2010) evaluated the scientific production in the rheumatology field in the 3 major regions of China (Mainland, Hong Kong, and Taiwan) during the period 2000–2009. Batlle-Gualda et al. (1998) analyzed the magnitude, evolution, and characteristics of the Spanish scientific production in Rheumatology from 1990 to 1996. Redondo et al. (2016) performed a bibliometric study of the scientific publications on patient-reported outcomes in Rheumatology. However, the HCP of the Rheumatology research field have not been analyzed.

In line with the aim stated above, identifying the set of HCP allows several aspects to be analyzed: (I) the HCP distribution during the period studied; (II) the most productive journals, authors, institutions and countries; and (III) the main topics covered by the papers detected.

Finally, we should highlight that this paper is a further extension of our contribution presented at ISSI 2017 (Perez-Cabezas et al. 2017).

Methods

In order to discover and analyze the HCP in the Rheumatology scientific output, two different methodologies have been applied: the H-Classics approach and co-word analysis. Thus, this section is divided into three parts: (i) a description of the sample used to discover the HCP is provided, (ii) the HCP identification approach is presented, and (iii) the conceptual evolution and co-word analysis are described.

Sample

The set of documents used for the bibliometric analysis is based on the journals indexed in the Rheumatology category of the Journal Citation Reports (JCR-2016), which provides an adequate list of Rheumatology journals. It has been suggested that the JCR contains the most important research documents of the different scientific disciplines, since the journals it indexes are considered a very important criterion in tenure, promotion and other professional decisions (Hodge and Lacasse 2011; Seipel 2003).

Therefore, to develop the HCP analysis, the documents published by the 30 journals indexed in the JCR Rheumatology category were obtained. The search was performed in December 2017. A total of 103,175 documents (articles and reviews) from the period 1945–2016 were retrieved, containing the following information: authors, affiliations, title, year of publication, citations, sources, abstract and keywords. It is important to remark that the full list of HCP was reviewed in order to exclude those documents based on consensus diagnostic criteria, such as clinical guidelines.

Highly cited papers identification

A common step in the study of HCP is to select a criterion that establishes a threshold value for deciding whether or not a paper is considered highly cited (Garfield 1977, 1987). In the present study, the concept of H-Classics (Martínez et al. 2014), based on the popular H-index (Gutiérrez-Salcedo et al. 2017; Hirsch 2005), is applied to identify the highly cited documents in the Rheumatology research field. This approach provides an unbiased and fair criterion for constructing a systematic search procedure for HCP. Furthermore, H-Classics provides a rigorous and scientific method to discover the most relevant papers in a field.

In order to better understand the procedure, it is useful to recall the following concepts: (i) The H-index can be defined as follows (Hirsch 2005): “A scientist has index h if h of his or her Np papers have at least h citations each, and the other (Np–h) papers have ≤ h citations each”. (ii) The concept of H-Classics can be defined as follows (Martínez et al. 2014): “H-Classics of a research area A could be defined as the H-core (a group of high-performance publications) of A that is composed of the H highly cited papers with more than H citations received.”

Therefore, the following steps are applied to identify the highly cited papers using the H-Classics concept (Moral-Munoz et al. 2016):

  (a) Bibliographic database selection to retrieve the study sample. There are various bibliographic databases available to perform bibliographic studies, with the three most important ones being: Web of Science (WoS), Scopus, and Google Scholar. For the present study, as mentioned before, WoS was used as it indexes the most reliable research information and offers a high number of analysis tools to process it.

  (b) Delimit the research area. It is necessary to identify the leading research publications related to the area of study, so the set of journals that are traditionally used to disseminate scientific advances in the area needs to be established. In the case of WoS, if the area matches one of the scientific areas within JCR, then it would be easy to get the set of journals of interest in that area. In the present study, the Rheumatology category was used to retrieve the set of documents.

  (c) Computation of the H-index of the research area. To compute the H-index of a research field, it is necessary to rank the papers according to their citations, that is, the set of documents must be sorted by citation count in decreasing order. The aim is to locate the first document whose citation count is lower than its ranking position, because the H-index is the ranking position of the paper immediately above it. Although the H-index can be computed manually, different tools available in WoS facilitate its computation for a given set of research documents. In the present study, the H-index of the Rheumatology research field is 317.

  (d) Retrieve the H-core of the research area. At this step, the first H highly cited papers in the previous ranking are included in the H-core of the research area. The H-core contains the H-Classics of the area, so the H-index of a research area is the cardinality of its H-core.
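Steps (c) and (d) can be illustrated with a minimal Python sketch. It is not the WoS tooling used in the study; the function names and the toy records are hypothetical and serve only to show how a ranked citation list yields the H-index and the H-core.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # decreasing citation order
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            # First paper whose citation count is lower than its rank:
            # the H-index is the rank of the paper immediately above it.
            break
        h = rank
    return h


def h_classics(papers):
    """Return the H-core: the first h papers of the citation ranking.

    `papers` is a list of (paper_id, citation_count) pairs (hypothetical layout).
    """
    ranked = sorted(papers, key=lambda p: p[1], reverse=True)
    h = h_index([cites for _, cites in ranked])
    return ranked[:h]


if __name__ == "__main__":
    # Toy records, not data from the study.
    toy = [("p1", 3), ("p2", 10), ("p3", 1), ("p4", 5)]
    print(h_index([cites for _, cites in toy]))
    print(h_classics(toy))
```

Applied to the full set of retrieved documents, the same ranking procedure would return the H-core of the field, whose size equals the H-index reported in step (c).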

Conceptual evolution and co-word analysis

Once the H-Classics were identified, two different analyses were performed. First, the documents were classified by decade and, based on the central theme or clinical condition of each document, the most prominent concepts are presented. Second, a co-word analysis using SciMAT (Cobo et al. 2012) was performed for the whole period (1945–2016), following these phases:

  (a) Research themes detection. In this phase, the clusters obtained correspond to centers of interest and/or research problems that attract the researchers’ attention.

  (b) Low dimensional space layout of research themes. A spatial layout of research themes is achieved by plotting each detected cluster into a two-dimensional strategic diagram (Callon et al. 1991). Once the research themes are mapped into a two-dimensional space, they can be classified into four groups (Cobo et al. 2011): (I) Motor themes appear in the upper-right quadrant and are considered well developed and important for the structuring of a research field. (II) Basic and transversal themes appear in the lower-right quadrant and are considered important for a research field but are not yet developed. (III) Emerging or declining themes appear in the lower-left quadrant and are considered weakly or marginally developed. (IV) Highly developed and isolated themes appear in the upper-left quadrant and are considered to be well developed but of marginal importance for the field.

  (c) Performance analysis. The relative contribution of research themes to the whole research field is measured quantitatively and qualitatively. In this way, the most prominent, productive and highest-impact subfields may be identified. To do this, a set of bibliometric indicators could be applied to the different detected themes and thematic areas: number of published documents, number of received citations, and h-index (Alonso et al. 2009; Hirsch 2005; Martínez et al. 2014).
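The four-group classification in phase (b) can be sketched in Python. The sketch is not SciMAT code. It assumes the usual strategic-diagram convention, in which the horizontal axis is centrality and the vertical axis is density, and it assumes that the quadrant boundaries lie at the median of each axis. The theme names and values are hypothetical.

```python
from statistics import median


def classify_themes(themes):
    """Assign each theme to a strategic-diagram quadrant.

    `themes` maps a theme name to a (centrality, density) pair.
    Quadrant boundaries are assumed to be the medians of each axis.
    """
    c_med = median(c for c, _ in themes.values())
    d_med = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        right = centrality >= c_med  # high centrality: right half
        upper = density >= d_med     # high density: upper half
        if right and upper:
            labels[name] = "motor"
        elif right:
            labels[name] = "basic and transversal"
        elif upper:
            labels[name] = "highly developed and isolated"
        else:
            labels[name] = "emerging or declining"
    return labels


if __name__ == "__main__":
    # Hypothetical values, not results from the study.
    toy = {
        "theme_A": (0.9, 0.8),
        "theme_B": (0.8, 0.2),
        "theme_C": (0.1, 0.1),
        "theme_D": (0.2, 0.9),
    }
    for theme, label in classify_themes(toy).items():
        print(theme, "->", label)
```

Placing the boundaries at the medians is one possible design choice. Themes lying exactly on a boundary are assigned here to the upper or right side.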

Before applying these phases, the documents’ keywords were preprocessed. A de-duplicating process was applied to group synonyms, as well as the plural and singular forms of the same word, so that a single word represents each concept. Finally, keywords considered meaningless in this context, such as stop words or words with very broad and general meanings (e.g., disease, outcomes), were removed.
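The keyword preprocessing described above can be sketched as follows. This is only an illustration of the idea, not SciMAT's own procedure. The synonym map, the stop-word set and the function name are hypothetical and would have to be curated by hand for a real dataset.

```python
def normalize_keywords(keywords, synonyms, stop_words):
    """Map variants to one canonical form, then drop stop words and duplicates.

    `synonyms` maps a lower-cased variant (plural, synonym) to its canonical
    form; `stop_words` holds lower-cased terms judged too general to keep.
    """
    result = []
    for keyword in keywords:
        key = keyword.strip().lower()
        key = synonyms.get(key, key)  # group synonyms and plural/singular forms
        if key in stop_words or key in result:
            continue                  # skip general terms and repeats
        result.append(key)
    return result


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    raw = ["Chondrocytes", "chondrocyte", "Disease",
           "Monoclonal antibodies", "outcomes"]
    synonym_map = {
        "chondrocytes": "chondrocyte",
        "monoclonal antibodies": "monoclonal antibody",
    }
    too_general = {"disease", "outcomes"}
    print(normalize_keywords(raw, synonym_map, too_general))
```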

Results

In the following sections, the 317 HCP identified (Table 1) in the Rheumatology research field are analyzed: (I) the distribution of HCP per year of publication is studied, (II) the journals, authors, institutions and countries producing the highest number of HCP are identified, and (III) a content analysis is provided.

Table 1 Top 10 highly cited papers

Distribution of HCP

As mentioned above, the study was performed with a collection of 103,175 documents, 317 of which were classified as HCP. The distribution of the HCP per year is shown in Fig. 1. These documents were published between 1950 and 2012, covering an extended period, with production concentrated in the period 1997–2008. The year with the most HCP published is 2005. The first document was published by Forestier and Rotes-Querol (1950). It is worth noting the ranking position of the document published by Helmick et al. (2008), ranked 6/317 in a short period. Overall, although HCP are usually identified at the beginning of the period, the Rheumatology research area has been attracting the attention of the research community in relatively recent years.

Fig. 1 Distribution of Rheumatology H-Classics documents per year of publication

Most productive journals, authors, institutions, and countries

In this section, different social units (journals, authors, institutions and countries) are analyzed. It is important to highlight that each document was counted for the institutions and countries of all its authors, not only the first or corresponding author. First, the most productive journals were identified. Table 2 shows the journals with 5 or more H-Classics in the Rheumatology research field. Arthritis and Rheumatism, with 196 documents, is the most productive journal, followed by Annals of the Rheumatic Diseases and Journal of Rheumatology, with 43 and 32 respectively.

Table 2 Most productive highly cited journals

Regarding authors, those with 9 or more HCP in the Rheumatology research field stand out. Emery, P., affiliated with the University of Leeds (UK), and Felson, D.T., affiliated with Boston University (USA), are the authors with the highest number of HCP, with 13 each. They are followed by van der Heijde, D., affiliated with Leiden University Medical Center (Netherlands), with 11 HCP, and Dougados, M., affiliated with Paris Descartes University (France), with 9 HCP.

In addition, the corresponding institutions and countries can be identified from the author addresses contained in the research papers. Table 3 shows the ranking of institutions with 10 or more HCP. The two most productive institutions are the University of California (USA) and Stanford University (USA), with 23 and 17 documents, respectively. Nevertheless, when the ratio of the total number of articles published in the research field to the number of HCP is calculated, the Lainz Hospital (Austria) stands out, with 1 HCP for every 7.60 published articles.

Table 3 Most productive highly cited institutions

Finally, Table 4 shows the ranking of countries that produce the HCP in the Rheumatology research field. It is ordered by an Adjustment Index (AI) (Zyoud et al. 2015) based on the GDP (Gross Domestic Product) per capita (The World Bank 2016). The AI was calculated with the formula: AI = (total number of HCP / GDP per capita of the country) × 100. Taking these results into account, the positions reached by the People's Republic of China, Mexico, and South Africa, with 4, 4 and 2 documents respectively, are remarkable. On the other hand, if the total number of HCP is taken into account, the country with the highest number is the USA with 151, about half of the total number of documents. It is followed by the UK and the Netherlands, with 76 and 50 documents respectively. The predominance of the USA in the Rheumatology HCP production is evident.
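The AI formula above can be expressed as a short Python sketch. The function names are hypothetical, and the example figures are invented placeholders rather than the HCP counts or GDP values used in the study.

```python
def adjustment_index(hcp_count, gdp_per_capita):
    """AI = (total number of HCP / GDP per capita of the country) * 100."""
    if gdp_per_capita <= 0:
        raise ValueError("GDP per capita must be positive")
    return hcp_count / gdp_per_capita * 100


def rank_by_ai(countries):
    """Sort (name, hcp_count, gdp_per_capita) records by decreasing AI."""
    scored = [(name, adjustment_index(hcp, gdp)) for name, hcp, gdp in countries]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder records: (country, number of HCP, GDP per capita in USD).
    toy = [
        ("Country X", 150, 50000.0),
        ("Country Y", 4, 1000.0),
    ]
    for name, ai in rank_by_ai(toy):
        print(f"{name}: AI = {ai:.2f}")
```

With these placeholder values, the country with few HCP but a low GDP per capita ranks first, which is the kind of effect the adjusted ranking is meant to expose.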

Table 4 Highly cited countries according to an adjustment index based on GDP per capita

In view of these results, it could be stated that the Rheumatology research area is now in a consolidated stage of development (Vargas-Quesada et al. 2017). The HCP are mainly concentrated in the early years of the area's development, and relatively recent publications are also attracting scientific interest. In addition, different research producers (authors, institutions and countries) have become consolidated as primary knowledge generators.

Content analysis and co-word analysis

Given the above information, an analysis of the content of the Rheumatology HCP can be carried out. The first HCP is from 1950, “Senile ankylosing hyperostosis of the spine” by Forestier and Rotes-Querol (1950). They performed a clinical and radiological study of nine patients and an anatomopathological analysis of two deceased subjects. This allowed the authors to isolate an ankylosing condition of the spine that differs from spondylitis. In the same decade, Kellgren and Lawrence (1957, 1958) wrote two papers, the first on the radiological diagnosis of osteoarthritis (1st ranked HCP) and the second on the frequency of degenerative joint diseases in the urban population. Finally, Pearson and Wood (1959) observed non-joint associated lesions induced by injection of mycobacterial adjuvant and apparent hypersensitivity in rats with polyarthritis.

From the 60s, only three documents are considered HCP. The first, both chronologically and by citations, is Lawrence et al. (1966), who observed the incidence of rheumatic diseases, collecting information about occupation, weather and other factors that may be related to these clinical conditions. Balazs et al. (1967) focused on the parameters of hyaluronic acid in the synovial fluid of patients with arthritis. Mason and Barnes (1969) analyzed the diagnostic criteria for Behçet’s syndrome.

From the 70s, several HCP appear covering a wide range of topics, such as Lyme arthritis (Steere et al. 1977), Still’s disease in the adult (Bywaters 1971), articular mobility in the African population (Beighton et al. 1973), or a correlational study of four scales for pain measurement (Downie et al. 1978). In the following decade, the 80s, a total of 42 HCP were detected. Bellamy et al. (1988) (2nd ranked HCP) carried out the validation of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), a multidimensional, self-administered health assessment instrument for patients with osteoarthritis of the hip or knee. According to our findings, in this decade there was great interest in studies centered on different measurement scales (Lorig et al. 1989) and on questionnaires about health status, symptoms, and disabilities in rheumatic patients (Fries et al. 1980, 1982).

From the 90s, documents are focused on two main topics. On the one hand, there are documents focused on the different rheumatic diseases, such as the prevalence and characteristics of fibromyalgia in the general population (Wolfe et al. 1995) or the mortality of rheumatoid arthritis (Wolfe et al. 1994). On the other hand, there are HCP about the validation of questionnaires, indexes and scales for measuring the impact and activity of these diseases. Among the top thirteen HCP, four works from this decade deal with this topic: a document about the development and validation of Modified Disease Activity Scores (Prevoo et al. 1995); another about the disease activity index for lupus patients (Gladman et al. 1997); and two about the Bath Ankylosing Spondylitis Index (Calin et al. 1994; Garrett et al. 1994).

From 2000, most of the articles that appear as HCP report research on treatments and on the side effects of different therapies and drugs. The largest number of HCP appeared in 2005. The most cited documents from that year are a clinical trial of four different treatment strategies in patients with early rheumatoid arthritis (Goekoop-Ruiterman et al. 2005) (33rd ranked HCP), and a study that compares the properties of human mesenchymal stem cells from bone marrow, synovium, periosteum, skeletal muscle, and adipose tissue as a potential source for clinical applications (Sakaguchi et al. 2005) (38th ranked HCP). In 2008, the two parts of a study appeared: the first one (Helmick et al. 2008) is focused on the USA prevalence of rheumatoid arthritis, juvenile arthritis, the spondylarthritides, systemic lupus erythematosus, systemic sclerosis, and Sjögren’s syndrome; the second one (Lawrence et al. 2008) studies the USA prevalence of osteoarthritis, polymyalgia rheumatica and giant cell arteritis, gout, fibromyalgia, and carpal tunnel syndrome, as well as the symptoms of neck and back pain. It should also be noted that, although both documents are relatively recent, they hold high positions, the first one 21st ranked, and the second 7th ranked.

On the other hand, a strategic diagram was built using SciMAT, in order to analyze the most notable themes for the HCP detected in the period 1945–2016. In these spatial representations, the spheres’ volume is proportional to the number of documents associated with each theme. In brackets, the number of citations associated with each theme is also depicted.

According to the strategic diagram shown in Fig. 2, some observations can be made. Two motor themes, Monoclonal Antibody and Osteo-arthritis, reflect the principal topics of the set of documents that make up the HCP. Monoclonal Antibody seems to be the most frequent topic among the HCP. It is related to several applications, such as diagnostic tests, biosensors and treatment (autoimmune diseases). Osteo-arthritis is related to the documents about different aspects of this rheumatologic pathology, such as prevalence, treatments and diagnosis. On the other hand, a basic and transversal theme, Messenger RNA, is focused on a topic related to different clinical conditions. It is used for the treatment of different rheumatic pathologies, such as rheumatoid arthritis. Furthermore, an emerging or declining theme, Chondrocytes, is related to cartilage degeneration due to the immune response directed against this tissue. In autoimmune diseases, the inflammatory process causes antibodies and T cells to target the chondrocytes. Lastly, Spondylarthropathy appears as a highly developed and isolated theme. This topic is focused on aspects related to this clinical condition, such as its treatment (anti-tumor necrosis factor, infliximab, etc.) and diagnosis (radiography, scintigraphy, etc.).

Fig. 2 Strategic diagram for the HCP detected in the period 1985–2000

Conclusions

In the present study, Rheumatology HCP have been identified and consequently analyzed using the concept of H-Classics. The analysis of the HCP allows us to highlight the following remarkable findings:

  • 317 Rheumatology HCP were identified in the period 1945–2016, with citation counts ranging from 317 to 4772. The HCP with the highest citation count, about the radiological assessment of osteoarthritis, is authored by Kellgren and Lawrence (1957).

  • The first HCP was published by Forestier and Rotes-Querol (1950) about the senile ankylosing hyperostosis of the spine. Furthermore, it is worth noting the paper of Helmick et al. (2008), ranked 6/317 in a short period.

  • Arthritis and Rheumatism is the most productive journal with 196 HCP.

  • Professor Emery, from the University of Leeds (UK), and Professor Felson, from Boston University (USA), are the authors with the highest number of HCP, 13 each.

  • The University of California (USA) and Stanford University (USA) are the main institutional contributors of HCP, with 23 and 17 documents respectively.

  • The predominance of the USA in producing HCP is remarkable. Its production represents almost half of the HCP identified (151 of 317). Nevertheless, it is interesting to note the positions reached by the People's Republic of China, Mexico, and South Africa, with 4, 4 and 2 documents respectively, when an adjustment index based on the GDP per capita is applied.

  • Regarding the conceptual evolution, the topics change from basic ones, such as the characteristics of rheumatic diseases, to more complex ones, such as advanced kinds of treatments. In addition, the co-word analysis reveals that osteo-arthritis and monoclonal antibody are the leading topics of this set of HCP.

It is worth mentioning the practical application of the present study: it provides potentially relevant information for understanding the past, present and future scientific structure of the Rheumatology field, which could support its future research development. It also helps to identify the different actors on which scientific attention is focused.